TUTORIALS
- The H.264 / MPEG-4 AVC Standard: Core Coding Technology and Recent Extensions.
Dr Detlev Marpe
Fraunhofer Inst. HHI Berlin
- Recent developments in audio coding
Dr. Maciej Bartkowiak
Poznan University of Technology
- Multiple Description Coding for Scalable Multimedia Coding
Prof. Peter Schelkens
Vrije Universiteit Brussel (VUB)
DETAILS
- The H.264 / MPEG-4 AVC Standard: Core Coding Technology and Recent Extensions.
Dr Detlev Marpe
Fraunhofer Inst. HHI Berlin
Abstract
This tutorial will present and discuss the main recent advances in standardized video coding technology, as developed collaboratively by members of both the ITU-T VCEG and ISO/IEC MPEG organizations during the standardization of the new H.264/MPEG-4 Advanced Video Coding (AVC) standard. Designed as a generic video coding standard for a broad range of applications, H.264/MPEG-4 AVC has already received a great deal of attention from industry. Application areas range from videoconferencing, mobile TV, and broadcasting of standard- and high-definition TV content up to very high-quality video applications such as professional digital video recording and digital cinema or large-screen digital imagery.
The first part of the tutorial is devoted to the core technology of the H.264/MPEG-4 AVC video coding layer, as specified in the 2003 version of the standard. We will focus on the key innovations: enhanced motion-compensated prediction capabilities, low-complexity integer transforms, a content-adaptive in-loop deblocking filter, and enhanced entropy coding methods. We will then highlight the technical features of the so-called Fidelity Range Extensions (FRExt) of H.264/MPEG-4 AVC, which address the specific needs of rapidly growing higher-fidelity video applications. Finally, we will provide an understanding of the basic concepts of the Scalable Video Coding (SVC) extensions, the most recent and still ongoing work on extending the capabilities of H.264/MPEG-4 AVC. We will show how the present design of those SVC extensions supports spatial scalability, SNR scalability, and temporal scalability, as well as their combinations, in a manner that is maximally consistent with the current syntax and decoding process of H.264/MPEG-4 AVC.
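As a small illustration of the low-complexity integer transforms mentioned above, the following sketch applies the 4x4 forward core transform matrix commonly given for H.264/AVC to a block of samples. It is not taken from the tutorial material: the function names are hypothetical, and the scaling that the standard folds into quantization is deliberately left out.

```python
# Forward 4x4 core transform matrix as commonly given for H.264/AVC.
# All entries are small integers, so the transform needs no floating point.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a nested-list matrix."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T (hypothetical helper; scaling is omitted)."""
    return matmul(matmul(CF, block), transpose(CF))


# A flat block has energy only in the top-left (DC) coefficient.
flat = [[5] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)
print(coeffs[0][0])  # 80, i.e. 16 * 5
```

Because every matrix entry is 1 or 2 in magnitude, the transform can be carried out with additions and shifts only, and an integer definition avoids mismatch between encoder and decoder arithmetic.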
This tutorial is intended for researchers, students and engineers who are interested in gaining an understanding of the recent advances in standardized video coding technology.
About the Presenter
Dr.-Ing. Detlev Marpe is a Scientific Project Manager in the Image Processing Department of the Fraunhofer Institute for Telecommunications – Heinrich-Hertz-Institute (HHI), Berlin, Germany. He is the author or co-author of more than 100 technical publications in the area of image and video processing, and he holds several international patents in this field. From 2001 to 2003, as an Ad-hoc Group Chairman in the Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, he was responsible for the development of the CABAC entropy coding scheme within the H.264/MPEG-4 AVC standardization project. He also served as the Co-Editor of the H.264/MPEG-4 AVC FRExt Amendment. In 2004, he received the Fraunhofer Prize for outstanding scientific achievements and the ITG Best Paper Award of the German Society for Information Technology. His current research interests include still image and video coding, image and video communication, computer vision, and information theory.
- Recent developments in audio coding
Dr. Maciej Bartkowiak
Poznan University of Technology
Abstract
Tremendous progress has taken place in audio coding technologies in recent years. The perceptual coding paradigm has become ubiquitous among standard and proprietary techniques for compressed audio storage and delivery, in applications ranging from narrowband internet streaming of music and speech to high-definition multichannel home theatre. Coding schemes make heavy use of psychoacoustics, especially masking phenomena that render some components of the sound inaudible in the presence of other strong components with similar spectra.
Departing from the basic 1st-generation subband coding scheme with perceptually controlled uniform scalar quantization, modern algorithms of the 2nd and 3rd generations have evolved into sophisticated forms of transform coding with switched time and frequency resolution and temporal shaping of quantization noise. Several additional tools have been introduced to model and remove the redundant and perceptually irrelevant components of the audio signal, thus increasing coding efficiency. These tools include variants of prediction in the time and frequency domains, perceptual substitution of noise components, diverse quantization schemes (including vector quantization), and refined entropy coding. Although the original 1st- and 2nd-generation techniques were aimed at near-CD or broadcast quality, there was strong demand for codecs offering decent quality at reduced bit rates. At very low bit rates, however, traditional waveform coding methods are no longer able to hide the quantization noise below the threshold of audibility, and coding artefacts become apparent because the masking conditions of the perceptual models are heavily violated. The 4th-generation audio codec recently standardized by ISO uses two new model-based tools to describe the high-frequency content and spatial information parametrically instead of coding them. Spectral band replication and parametric stereo encoding reduce the bit rate required for good-quality audio to a range previously associated with speech coding. Several sound examples will be presented to demonstrate the progress in compression efficiency.
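The idea of perceptually controlled uniform scalar quantization can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the standard approximation that a uniform quantizer with step size Δ produces noise power of about Δ²/12, and it picks the largest step whose noise stays at or below a given masking threshold. The threshold value and all function names are hypothetical; a real codec derives per-band thresholds from a psychoacoustic model.

```python
import math


def step_for_threshold(mask_power):
    """Largest step whose approximate noise power (step**2 / 12) stays at or
    below the masking threshold (hypothetical helper)."""
    return math.sqrt(12.0 * mask_power)


def quantize(x, step):
    """Uniform scalar quantizer: map a sample to an integer index."""
    return round(x / step)


def dequantize(q, step):
    """Reconstruct the sample value from its integer index."""
    return q * step


# Assumed masking threshold for one subband (illustrative value only).
mask_power = 0.003
step = step_for_threshold(mask_power)

samples = [0.82, -0.15, 0.40, -0.66]
indices = [quantize(s, step) for s in samples]
decoded = [dequantize(q, step) for q in indices]
errors = [d - s for d, s in zip(decoded, samples)]
```

A stronger masker permits a larger step and therefore fewer bits per sample. When the available bit rate forces the step above this limit, the noise exceeds the threshold, which is the failure mode described above for very low bit rates.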
Alongside transform coding, which attempts to approximate the waveform with precisely controlled error, parametric coding techniques have been studied. These describe the signal in terms of the parameters of a model that is used to resynthesize a similar signal. Parametric coding offers the prospect of data reduction down to a very low number of bits representing only the semantic content; however, currently available techniques are only slightly more efficient than state-of-the-art waveform coding. A new standard technique for high-quality parametric coding has recently been recommended by the MPEG committee. Again, a brief audio demonstration of this approach will be given.
In response to demand, many new flavours of coding techniques have recently been developed or are under development. These include low-delay, error-resilient, and scalable variants of previous techniques, as well as new lossless and spatial audio coding for multichannel programmes. These will be briefly covered in the tutorial.
- Multiple Description Coding for Scalable Multimedia Coding
Prof. Peter Schelkens
Vrije Universiteit Brussel (VUB)
Abstract
Real-time delivery of multimedia content involves heterogeneous links with different reliability and bandwidth provision (e.g. wireless and wired). As a result, data packets sent through best-effort networks may be lost, and retransmission is often impractical because it increases congestion and introduces excessive delay. In such scenarios, it is essential to develop image and video coding techniques that provide resiliency against network erasures. This problem is efficiently addressed by multiple description coding (MDC). MDC techniques generate two (or more) complementary representations of the input, called descriptions, which are typically sent to the receiver over different links. In case of channel impairments, the decoder approximates the original signal by using the available descriptions. Distortion in the reconstructed signal decreases upon reception of any additional description and is lower-bounded by the distortion attained by single description coding (SDC) operating at the same overall bit rate in an error-free transmission scenario. Techniques such as data partitioning, transform coding (e.g. pairwise correlating transforms and frame expansions), multiple description scalar quantization, and forward error correction provide practical instantiations of the MDC concept.
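The MDC principle described above can be illustrated with its simplest data-partitioning form: splitting a signal into even- and odd-indexed samples. This is a minimal sketch, not the scheme presented in the tutorial. The function names are hypothetical, no quantization is applied, the signal is assumed to have at least two samples, and a lost description is concealed by averaging the neighbouring received samples.

```python
def make_descriptions(signal):
    """Split a signal into two descriptions: even- and odd-indexed samples."""
    return signal[0::2], signal[1::2]


def reconstruct(even, odd, length):
    """Rebuild the signal from whichever descriptions arrived (None = lost)."""
    if even is None and odd is None:
        raise ValueError("no description received")
    out = [None] * length
    if even is not None:
        out[0::2] = even
    if odd is not None:
        out[1::2] = odd
    # Conceal missing samples by averaging the received neighbours.
    for i in range(length):
        if out[i] is None:
            near = [out[j] for j in (i - 1, i + 1) if 0 <= j < length]
            out[i] = sum(near) / len(near)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 7.0]
even, odd = make_descriptions(signal)

both = reconstruct(even, odd, len(signal))        # both links delivered
even_only = reconstruct(even, None, len(signal))  # odd description lost
odd_only = reconstruct(None, odd, len(signal))    # even description lost
```

With both descriptions the signal is recovered exactly; with only one, the decoder still produces a usable approximation whose distortion is higher. Practical schemes add controlled redundancy so that each description on its own is more useful, at the cost of some efficiency when nothing is lost.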
Within the broad scope of MDC applications, ranging from still image transmission to multimedia streaming, its application in scalable video coding (SVC) will be illustrated. A recently proposed SVC scheme, coupling the compression efficiency of the open-loop architecture with the robustness of MDC, features a novel framework that selects the amount of resiliency after compression has been performed. This approach adjusts robustness to transmission errors without additional encoding operations. As a result, given the channel statistics, robustness to data losses is traded for better visual quality when transmission occurs over reliable channels (i.e. MDC approaches the SDC coding performance), while error resilience is introduced when noisy links are involved.
About the Presenter
Peter Schelkens was born in Willebroek, Belgium in 1969. He received his Electronic Engineering degree in VLSI-design from the Industriële Hogeschool Antwerpen-Mechelen (IHAM), Campus Mechelen in 1991. Thereafter, he obtained the Electrical Engineering degree (M.Sc.) in applied physics in 1994, the Biomedical Engineering degree (medical physics) in 1995, and the Ph.D. degree in Applied Sciences in 2001 from the Vrije Universiteit Brussel (VUB).
Since October 1994, he has been a member of the Department of Electronics and Information Processing (ETRO) at VUB, and he currently holds a professorship at the Vrije Universiteit Brussel and a post-doctoral fellowship with the Fund for Scientific Research – Flanders (FWO), Belgium. Peter Schelkens is director of the VUB node of the Interdisciplinary Institute for Broadband Technology (IBBT) and a member of the Directors committee of the same institute. Additionally, he is affiliated as scientific advisor to the DESICS division of the Interuniversity Microelectronics Institute (IMEC).
Since 2000, he has been coordinating a research team in the field of image and video compression and multimedia technologies (e.g. audiovisual content analysis) that is part of the Interdisciplinary Institute for Broadband Technology (IBBT). This team participates in the ISO/IEC JTC1/SC29/WG1 (JPEG2000) and WG11 (MPEG-21) standardization activities.
Peter Schelkens is the Belgian head of delegation for the ISO/IEC JPEG standardization committee, editor/chair of part 10 of JPEG2000: “Extensions for Three-Dimensional and Floating Point Data”, and member of the MPEG-committee.
Peter Schelkens is the author/co-author of multiple scientific publications and patents or patent applications and has participated in several national and international projects (IWT, FWO-FNRS, DWTC-SSTC, EC, ESA).
He is a member of IEEE (also a board member of the IEEE CAS Benelux Chapter), SPIE and SMPTE, and is active in several Belgian research networks (Scientific Research Networks of the Fund for Scientific Research: “Image Processing Systems (IPS)”, “Program acceleration by Application-driven & architecture-driven Code Transformations (PACT)”, “Broadband communication and multimedia services for mobile users (MOBILE)”) and industrial networks (DSP Valley, Multimedia Valley).