Prof. Dr. K. R. Rao, IEEE Fellow, University of Texas Arlington, USA
Abstract
Taking a human visual system (HVS) based approach, Digital Video Image
Quality and Perceptual Coding (DVIQPC) outlines the principles, metrics
and standards associated with perceptual coding, as well as the latest
techniques and applications. It discusses the latest developments in
vision research as they relate to HVS based video and image coding. It
covers subjective and objective assessment methods, quantitative
quality metrics including vision model based digital video impairment
metrics, and test criteria and procedures. It examines in detail the
post-filtering and reconstruction issues associated with color
bleeding, blocking, ringing and temporal fluctuation artifacts, along
with methods to reduce or eliminate them. It also focuses on picture
quality assessment criteria. It poses new challenges to vision research
and asks how to transfer vision science to imaging and visual
communication systems engineering. It also poses an obvious theoretical
and practical challenge: how to define psychovisual redundancy
quantitatively, and how to establish a theoretical bound for
perceptually lossless coding.
Bio
K. R. Rao received the Ph.D. degree in electrical engineering from The
University of New Mexico, Albuquerque, in 1966. Since 1966, he has been
with the University of Texas at Arlington, where he is currently a
professor of electrical engineering. He, along with two other
researchers, introduced the Discrete Cosine Transform in 1975, which
has since become very popular in digital signal processing. He is the
co-author of the books “Orthogonal Transforms for Digital Signal
Processing” (Springer-Verlag, 1975), also recorded for the blind in
Braille by the Royal National Institute for the Blind; “Fast
Transforms: Analyses and Applications” (Academic Press, 1982); and
“Discrete Cosine Transform: Algorithms, Advantages, Applications”
(Academic Press, 1990). He has edited a benchmark volume, “Discrete
Transforms and Their Applications” (Van Nostrand Reinhold, 1985). He
has coedited a benchmark volume, “Teleconferencing” (Van Nostrand
Reinhold, 1985).
He is co-author of the books “Techniques and Standards for
Image/Video/Audio Coding” (Prentice Hall, 1996), “Packet Video
Communications over ATM Networks” (Prentice Hall, 2000) and “Multimedia
Communication Systems” (Prentice Hall, 2002). He has coedited a
handbook, “The Transform and Data Compression Handbook” (CRC Press,
2001). His other books include “Digital Video Image Quality and
Perceptual Coding” (with H. R. Wu, Taylor and Francis, 2006) and
“Introduction to Multimedia Communications: Applications, Middleware,
Networking” (with Z. S. Bojkovic and D. A. Milovanovic, Wiley, 2006).
He has also published a book, “Discrete Cosine and Sine Transforms”,
with V. Britanak and P. Yip (Elsevier, 2007). Some of his books have
been translated into Japanese, Chinese, Korean and Russian and also
published as Asian (paperback) editions. He has been an external
examiner for graduate students from universities in Australia, Canada,
Hong Kong, India, Singapore, Thailand and Taiwan.
He was a visiting professor at several universities, for periods of 3
weeks to 7 1/2 months (Australia, Japan, Korea, Singapore and
Thailand). He has conducted workshops/tutorials on video/audio
coding/standards worldwide. He has supervised several students at the
Masters (59) and Doctoral (29) levels. He has published extensively in
refereed journals and has been a consultant to industry, research
institutes, law firms and academia. He is a Fellow of the IEEE.
Prof. Dr. M. Rupp, Vienna University of Technology, Austria
Abstract
The deployment of third generation mobile networks enabled new
real-time multimedia services like video call, conferencing and
streaming. The real-time nature of these services excludes the
possibility of end-to-end retransmissions. Therefore, errors affecting
the quality of the received service are inevitable. The aim of error
resilience methods is to minimize the impact of errors on the end-user
quality.
In this talk, the effect of errors at different positions in the video
stream and the possibility of their detection will be discussed.
Typically, within an IP video stream, the presence of errors in one IP
packet can be detected by means of a simple checksum. Thus, the IP
packet size determines the resolution of the error detection. In order
to reduce the rate increase due to packet headers, the IP packets are
rather large and their loss results in a loss of a considerable part of
a picture. Currently, the erroneous IP packets are discarded at the
receiver and the corresponding missing parts of the video sequence are
concealed. However, the discarded IP packets may still contain
correctly received information. If this information is used
additionally, an essential improvement in the end-user quality can be
obtained.
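The link between packet size and error-detection resolution can be
illustrated with a minimal sketch. It assumes a 16-bit one's-complement
checksum in the style of RFC 1071 computed per packet payload; the
helper names (`packetize`, `discarded_bytes`), the 1200-byte stream and
the error position are hypothetical and chosen only for illustration.
The checksum tells the receiver that a packet is damaged, but not where,
so the whole payload is discarded.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def packetize(stream: bytes, payload_size: int):
    """Split a byte stream into (payload, checksum) pairs."""
    chunks = [stream[i:i + payload_size]
              for i in range(0, len(stream), payload_size)]
    return [(chunk, internet_checksum(chunk)) for chunk in chunks]


def discarded_bytes(stream: bytes, payload_size: int, error_pos: int) -> int:
    """Flip one bit at error_pos and count the bytes the receiver discards."""
    packets = packetize(stream, payload_size)
    idx, offset = divmod(error_pos, payload_size)
    payload, checksum = packets[idx]
    damaged = bytearray(payload)
    damaged[offset] ^= 0x01  # single bit error on the channel
    packets[idx] = (bytes(damaged), checksum)
    # The receiver can only tell which packets fail the check, not where.
    return sum(len(p) for p, c in packets if internet_checksum(p) != c)


# Hypothetical 1200-byte stream with one bit error at byte 700.
stream = bytes(range(256)) * 4 + bytes(176)
for size in (600, 100):
    print(size, discarded_bytes(stream, size, 700))
```

With the same single bit error, the 600-byte packets lose 600 bytes and
the 100-byte packets lose 100 bytes, at the cost of six times as many
headers. This is the trade-off that motivates using the correctly
received part of a damaged packet.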
If the access network technology is known, an appropriate cross-layer
design enables easier error detection and allows for further
improvements of error resilience. This talk focuses on the UMTS access
network. Error resilience methods can be further improved by an
appropriate scheduling of the video stream. Here, link-error aware and
distortion-aware concepts and their combinations will be discussed and
their performance demonstrated.
Bio
Markus Rupp received his Dipl.-Ing. degree in 1988 at the University of
Saarbruecken, Germany and his Dr.-Ing. degree in 1993 at the Technische
Universitaet Darmstadt, Germany, where he worked with Eberhardt
Haensler on designing new algorithms for acoustical and electrical echo
compensation.
From November 1993 until July 1995 he had a postdoctoral position at
the University of California, Santa Barbara, with Sanjit Mitra, where he
worked with Ali H. Sayed on a robustness description of adaptive
filters with impacts on neural networks and active noise control. From
October 1995 until August 2001 he was a member of the Technical Staff
in the Wireless Technology Research Department of Bell-Labs where he
was working on various topics related to adaptive equalization and
rapid implementation for IS-136, 802.11 and UMTS. He is presently a
full professor for Digital Signal Processing in Mobile Communications
at the Technical University of Vienna. He was associate editor of IEEE
Transactions on Signal Processing from 2002 to 2005, is currently
associate editor of JASP (EURASIP Journal on Applied Signal Processing)
and of JES (EURASIP Journal on Embedded Systems), and is an elected
AdCom member of EURASIP. He has authored and co-authored more than 200
papers and patents on adaptive filtering, wireless communications and
rapid prototyping as well as automatic design methods.
Prof. Dr. L. Onural, Bilkent University, Ankara, Turkey
Abstract
A consortium of 19 European institutions, led by Bilkent University,
has been focusing on all technical aspects of 3DTV: 3D scene capture,
representation, compression, transmission and display are the main
technical building blocks. Fundamental signal processing issues
associated with scalar wave propagation, diffraction and holography are
also of prime interest. Advanced versions of stereoscopic viewing,
integral imaging based systems, volumetric displays and holographic
displays are among the candidate techniques for presenting 3DTV to the
viewer. Current 3D monitors are mostly stereoscopic; however,
experimental holographic displays have also been demonstrated. It is
envisioned that future 3DTV systems will decouple the capture and
display steps: 3D scenes will be captured by some means, like
multi-camera systems, and this data will then be converted to an
abstract 3D representation using computer graphics techniques. The
display will then access this abstract data to generate the 3D video
for the observer.
Bio
Levent Onural received his Ph.D. in ECE from SUNY at Buffalo in 1985
(BS, MS from METU). He was a Fulbright scholar. After a research
assistant professor position at the ECE Department of SUNYAB, he joined
the EEE Department of Bilkent University, where he is a full Professor
at present. His current research interests are in image and video
processing, 3DTV, holographic TV, and signal processing aspects of
optical wave propagation. Dr. Onural received a TUBITAK award in 1995,
and an IEEE Third Millennium Medal in 2000. He served IEEE as the
Director of IEEE Region 8 (Europe, Middle East and Africa), and as the
Secretary of IEEE. He was a member of IEEE Board of Directors, IEEE
Executive Committee and IEEE Assembly. He is an Associate Editor of
IEEE TCSVT. Currently, he is leading the EC funded 3DTV project as the
coordinator; 19 institutions (about 200 researchers) are contributing
to 3DTV.
Prof. Dr. H. Wechsler, George Mason University, Fairfax, Virginia, USA
Abstract
The ability to recognize objects, in general, and living creatures, in
particular, in photographs or video clips, is a critical enabling
technology for a wide range of applications including health care,
human-computer intelligent interaction, search engines for image
retrieval and data mining, industrial and personal robotics,
surveillance and security, and transportation. Despite almost 50 years
of research, however, today’s object recognition systems are
still largely unable to handle the extraordinarily wide range of
appearances assumed by common objects [including human faces] in
typical images. Some of the challenges for modern pattern recognition
that have to be addressed in order to advance and make practical both
detection and categorization include open set recognition, occlusion
and masking, change detection and time-varying imagery, lack of enough
data for training, and proper performance evaluation and error
analysis. Open set recognition operates under the assumption that not
all the test (unknown) probes have mates in the gallery (training set),
occlusion and masking hide and disguise parts of the input, image
contents vary across both the spatial and temporal dimensions, the
amount of data available for learning and adaptation is limited, and
errors are not uniformly distributed across patterns. The
recognition-by-parts approach proposed here to address the challenges
listed above is driven by transduction and boosting. Transduction
employs local estimation and inference to find a compatible labeling of
joined training and test data. Active learning further promotes the
recognition process by making incremental choices about what is best to
learn and when in order to accumulate the evidence needed to
disambiguate among alternative interpretations. The interplay between
labeled (“training”) and unlabeled (“test”) data points mediates
between semi-supervised learning and transduction. The additional
information coming from the unlabeled data points includes constraints
and hints about the meaningful relations and regularities affecting
their very discrimination. Boosting combines in an iterative fashion
part-based, model-free, and non-parametric simple weak classifiers,
whose contents and relative ranking are driven by their “strangeness”
characteristics. The scope of the proposed approach also covers
stream-based data points and includes change detection. The benefits of
the proposed discriminative
recognition-by-parts approach include a priori setting of rejection
thresholds, no need for image segmentation, and robustness to occlusion,
clutter, and disguise. Examples drawn from biometrics illustrate the
proposed approach and show its feasibility and utility.
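The open set setting mentioned above, where a probe may have no mate in
the gallery, can be sketched in a few lines. This is only a simple
nearest-neighbor stand-in with a distance-based rejection threshold, not
the transduction and boosting approach proposed in the talk; the
gallery entries, feature vectors and threshold value are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, gallery, threshold):
    """Return the label of the closest gallery entry, or None to reject.

    Rejection (None) is what distinguishes open set from closed set
    recognition: a probe that is far from every gallery entry is treated
    as unknown instead of being forced onto the nearest identity.
    """
    label, dist = min(
        ((name, euclidean(probe, feat)) for name, feat in gallery.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else None


# Hypothetical two-dimensional features for two enrolled identities.
gallery = {"alice": (0.0, 0.0), "bob": (4.0, 0.0)}

print(identify((0.5, 0.2), gallery, threshold=1.0))  # close to "alice"
print(identify((2.0, 5.0), gallery, threshold=1.0))  # no mate: rejected
```

In this sketch the threshold is fixed by hand; choosing it in a
principled way ahead of time is one of the benefits the abstract claims
for the proposed approach.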
Bio
Harry Wechsler received the PhD degree in information and computer
science from the University of California, Irvine. Currently, he is a
professor of computer science and director of the Center for
Distributed and Intelligent Computation at George Mason University
(GMU). His research in the field of intelligent systems focuses on
computational vision, image and signal processing, data mining, and
machine learning and pattern recognition, with applications to
biometrics / face recognition / gait analysis / performance evaluation,
augmented cognition and HCI, change detection and link analysis, and
video processing and surveillance. He has published more than 250
scientific papers, serves on the editorial board of major scientific
publications, and is the author of Computational Vision (Academic
Press, 1990). As a leading researcher in face recognition, he organized
and directed in 1997 the seminal NATO Advanced Study Institute (ASI) on
“Face Recognition: From Theory to Applications”, whose proceedings were
published by Springer in 1998. His book on Reliable Face Recognition
Methods, which breaks new ground in applied modern pattern recognition
and biometrics, was published by Springer in the fall of 2006. Dr.
Wechsler has directed at GMU the design and development of FERET, which
has become the standard facial database for benchmark studies and
experimentation. He was elected an IEEE Fellow in 1992 for
“contributions to spatial/spectral image representations and neural
networks and their theoretical integration and application to human and
machine perception” and an IAPR (International Association for Pattern
Recognition) Fellow in 1998. He was granted (together with his former
doctoral students) two patents by the USPTO in 2004 on fractal image
compression using quad-q-learning (licensed in 2006) and feature based
classification (for face recognition).
Two additional patents (together with his former doctoral students) on
open set (face) recognition (and intrusion / outlier detection) and
change detection using martingale are now pending with the USPTO.