Video
Coding




             Video Compression

            MIT 6.344, Spring 2004


             John G. Apostolopoulos
         Streaming Media Systems Group
          Hewlett-Packard Laboratories
               japos@hpl.hp.com



                                         John G. Apostolopoulos
                                         April 22, 2004           Page 1
          Overview of Next Three Lectures

Today     • Video Compression (Thurs, 4/22)
              – Principles and practice of video coding
              – Basics behind MPEG compression algorithms
              – Current image & video compression standards

          • Video Communication & Video Streaming I (Tues, 4/27)
              – Video application contexts & examples: DVD and Digital TV
              – Challenges in video streaming over the Internet
              – Techniques for overcoming these challenges

          • Video Communication & Video Streaming II (Thurs, 4/29)
              – Video over lossy packet networks and wireless links →
                Error-resilient video communications
             Outline of Today’s Lecture

         • Motivation for compression
         • Brief review of generic compression system (from prior lecture)
         • Brief review of image compression (from last lecture)
         • Video compression
             – Exploit temporal dimension of video signal
             – Motion-compensated prediction
             – Generic (MPEG-type) video coder architecture
             – Scalable video coding
         • Overview of current video compression standards
             – What do the standards specify?
             – Frame-based video coding: MPEG-1/2/4, H.261/3/4
             – Object-based video coding: MPEG-4

         Motivation for Compression: Example of HDTV Video Signal

         • Problem:
             – Raw video contains an immense amount of data
             – Communication and storage capabilities are limited
               and expensive
         • Example HDTV video signal:
             – 720x1280 pixels/frame, progressive scanning at
               60 frames/s:
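               $$\left(\frac{720 \times 1280\ \text{pixels}}{\text{frame}}\right)\left(\frac{60\ \text{frames}}{\text{s}}\right)\left(\frac{3\ \text{colors}}{\text{pixel}}\right)\left(\frac{8\ \text{bits}}{\text{color}}\right) \approx 1.3\ \text{Gb/s}$$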
             – 20 Mb/s HDTV channel bandwidth
             → Requires compression by a factor of 70 (equivalent
               to .35 bits/pixel)
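
The arithmetic on this slide can be checked with a short script. This is only a sketch of the calculation; the variable names are illustrative and not from the lecture. Exact arithmetic gives a factor of about 66 and about 0.36 bits/pixel, which the slide rounds to 70 and .35.

```python
# Raw bit rate of the HDTV example and the compression needed for the channel.
WIDTH, HEIGHT = 1280, 720      # pixels per frame (from the slide)
FRAME_RATE = 60                # frames per second, progressive
COLORS = 3                     # color components per pixel
BITS_PER_COLOR = 8
CHANNEL_BPS = 20e6             # 20 Mb/s HDTV channel

pixels_per_second = WIDTH * HEIGHT * FRAME_RATE
raw_bps = pixels_per_second * COLORS * BITS_PER_COLOR
compression_factor = raw_bps / CHANNEL_BPS
bits_per_pixel = CHANNEL_BPS / pixels_per_second

print(f"raw rate: {raw_bps / 1e9:.2f} Gb/s")
print(f"compression factor: {compression_factor:.1f}")
print(f"bits per pixel: {bits_per_pixel:.2f}")
```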

         Achieving Compression

         • Reduce redundancy and irrelevancy
         • Sources of redundancy
              – Temporal: Adjacent frames highly correlated
              – Spatial: Nearby pixels are often correlated with
                each other
              – Color space: RGB components are correlated
                among themselves
              → Relatively straightforward to exploit
         • Irrelevancy
              – Perceptually unimportant information
              → Difficult to model and exploit



         Spatial and Temporal Redundancy




         • Why can video be compressed?
           – Video contains much spatial and temporal redundancy.

         • Spatial redundancy: Neighboring pixels are similar
         • Temporal redundancy: Adjacent frames are similar

         Compression is achieved by exploiting the spatial and temporal
                          redundancy inherent to video.


           Generic Compression System

    Original Signal → Representation (Analysis) → Quantization → Binary Encoding → Compressed Bitstream


          A compression system is composed of three key building blocks:
          • Representation
              – Concentrates important information into a few parameters
          • Quantization
              – Discretizes parameters
          • Binary encoding
              – Exploits non-uniform statistics of quantized parameters
              – Creates bitstream for transmission


           Generic Compression System (cont.)

    Original Signal → Representation (Analysis; generally lossless) → Quantization (lossy)
                    → Binary Encoding (lossless) → Compressed Bitstream


           • Generally, the only operation that is lossy is the
             quantization stage
           • The fact that all the loss (distortion) is localized to a
             single operation greatly simplifies system design
           • Can design loss to exploit human visual system (HVS)
             properties

           Generic Compression System (cont.)
    Source Encoder: Original Signal → Representation (Analysis) → Quantization → Binary Encoding
                    → Compressed Bitstream → Channel
    Source Decoder: Channel → Binary Decoding → Inverse Quantization → Representation (Synthesis)
                    → Reconstructed Signal

           • Source decoder performs the inverse of each of the three
             operations
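
The three building blocks and their inverses can be sketched as a small round trip. This is a toy illustration under stated assumptions, not the lecture's codec: the pairwise sum/difference representation, the step size `STEP`, and the use of `zlib` as a stand-in for an entropy coder are all choices made for the example.

```python
import struct
import zlib

STEP = 8  # quantizer step size; an illustrative choice, not from the lecture


def analyze(x):
    """Representation (analysis): pairwise sums and differences, invertible."""
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x), 2)]
    return [a + b for a, b in pairs] + [a - b for a, b in pairs]


def synthesize(c):
    """Representation (synthesis): exact inverse of analyze()."""
    half = len(c) // 2
    out = []
    for s, d in zip(c[:half], c[half:]):
        out += [(s + d) / 2, (s - d) / 2]
    return out


def quantize(c):
    """Quantization: the only lossy step."""
    return [round(v / STEP) for v in c]


def dequantize(q):
    """Inverse quantization: map indices back to reconstruction levels."""
    return [v * STEP for v in q]


def binary_encode(q):
    """Binary encoding: lossless; zlib stands in for an entropy coder."""
    return zlib.compress(struct.pack(f"<{len(q)}h", *q))


def binary_decode(bitstream):
    """Binary decoding: exact inverse of binary_encode()."""
    raw = zlib.decompress(bitstream)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


signal = [100, 102, 104, 107, 110, 110, 109, 108]  # even length assumed
bitstream = binary_encode(quantize(analyze(signal)))
reconstructed = synthesize(dequantize(binary_decode(bitstream)))
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
print(reconstructed, max_error)
```

Skipping `quantize`/`dequantize` reproduces the input exactly, which is the sense in which all of the loss is localized to one stage; with them, the error per sample is bounded by half the step size.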

             Review of Image Compression
    Original Image → RGB to YUV → Block DCT → Quantization → Runlength & Huffman Coding
                   → Compressed Bitstream


            • Coding an image (single frame):
                 – RGB to YUV color-space conversion
                 – Partition image into 8x8-pixel blocks
                 – 2-D DCT of each block
                 – Quantize each DCT coefficient
                 – Runlength and Huffman code the nonzero quantized DCT
                   coefficients
            → Basis for the JPEG Image Compression Standard
            → JPEG-2000 uses wavelet transform and arithmetic coding
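
The block-DCT and quantization steps can be sketched as follows. This is a simplified illustration: it uses a direct (slow) orthonormal DCT and a single uniform step size `STEP` for every coefficient, whereas JPEG uses a per-coefficient quantization table; the function names are made up for the example.

```python
import math

N = 8      # block size
STEP = 16  # one uniform quantizer step for all coefficients (a simplification)


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(n, k):
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    return [[_scale(u) * _scale(v)
             * sum(block[i][j] * _basis(i, u) * _basis(j, v)
                   for i in range(N) for j in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coef):
    """Inverse of dct_2d()."""
    return [[sum(_scale(u) * _scale(v) * coef[u][v] * _basis(i, u) * _basis(j, v)
                 for u in range(N) for v in range(N))
             for j in range(N)] for i in range(N)]


def quantize(coef):
    return [[round(c / STEP) for c in row] for row in coef]


def count_nonzero(q):
    return sum(1 for row in q for v in row if v != 0)


# A flat block needs only the DC coefficient; a smooth ramp needs a few more.
flat = [[128] * N for _ in range(N)]
ramp = [[100 + 4 * j for j in range(N)] for _ in range(N)]
q_flat = quantize(dct_2d(flat))
q_ramp = quantize(dct_2d(ramp))
print(count_nonzero(q_flat), count_nonzero(q_ramp))
```

For smooth blocks most quantized coefficients are zero, which is what makes the subsequent runlength and Huffman coding effective.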
         Video Compression

         • Video: Sequence of frames (images) that are related
         • Related along the temporal dimension
             – Therefore temporal redundancy exists
         • Main addition over image compression
             – Temporal redundancy
             → Video coder must exploit the temporal redundancy




         Temporal Processing

         • Usually high frame rate: Significant temporal redundancy
         • Possible representations along temporal dimension:
             – Transform/subband methods
                – Good for textbook case of constant velocity uniform
                  global motion
                – Inefficient for nonuniform motion, i.e., real-world motion
                – Requires large number of frame stores
                     – Leads to delay (Memory cost may also be an issue)
             – Predictive methods
                – Good performance using only 2 frame stores
                – However, simple frame differencing is not enough…



           Video Compression

           • Goal: Exploit the temporal redundancy
           • Predict current frame based on previously coded frames
           • Three types of coded frames:
               – I-frame: Intra-coded frame, coded independently of all
                 other frames
               – P-frame: Predictively coded frame, coded based on
                 previously coded frame
               – B-frame: Bi-directionally predicted frame, coded based
                 on both previous and future coded frames




         [Figure: example I-frame, P-frame, and B-frame]

         Temporal Processing: Motion-Compensated Prediction

         • Simple frame differencing fails when there is motion
         • Must account for motion
             → Motion-compensated (MC) prediction
         • MC-prediction generally provides significant improvements
         • Questions:
             – How can we estimate motion?
             – How can we form MC-prediction?




         Temporal Processing: Motion Estimation

         • Ideal situation:
             – Partition video into moving objects
             – Describe object motion
             → Generally very difficult
         • Practical approach: Block-Matching Motion Estimation
             – Partition each frame into blocks, e.g. 16x16 pixels
             – Describe motion of each block
             → No object identification required
             → Good, robust performance




         Block-Matching Motion Estimation

         [Figure: reference frame and current frame, each divided into 16 numbered blocks;
          a block in the current frame is matched to a displaced block in the reference
          frame, and the displacement is the motion vector (mv1, mv2)]

   • Assumptions:
     – Translational motion within block:
              $$f(n_1, n_2, k_{cur}) = f(n_1 - mv_1,\ n_2 - mv_2,\ k_{ref})$$
     – All pixels within each block have the same motion
   • ME Algorithm:
     1) Divide current frame into non-overlapping N1xN2 blocks
     2) For each block, find the best matching block in reference frame
   • MC-Prediction Algorithm:
     – Use best matching blocks of reference frame as prediction of
         blocks in current frame
          Block Matching: Determining the Best Matching Block
     • For each block in the current frame search for best matching
       block in the reference frame
         – Metrics for determining “best match”:

           $$\mathrm{MSE} = \sum_{(n_1, n_2) \in \text{Block}} \left[ f(n_1, n_2, k_{cur}) - f(n_1 - mv_1, n_2 - mv_2, k_{ref}) \right]^2$$
           $$\mathrm{MAE} = \sum_{(n_1, n_2) \in \text{Block}} \left| f(n_1, n_2, k_{cur}) - f(n_1 - mv_1, n_2 - mv_2, k_{ref}) \right|$$
         – Candidate blocks:      All blocks in, e.g., (± 32,±32) pixel area
         – Strategies for searching candidate blocks for best match
             – Full search: Examine all candidate blocks
             – Partial (fast) search: Examine a carefully selected subset
     • Estimate of motion for best matching block: “motion vector”
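
A minimal sketch of full-search block matching with the MAE-style metric above. Frames are plain lists of rows indexed as `frame[n1][n2]`; the function names, the synthetic random frame, and the choice to skip candidates that fall outside the reference frame are assumptions made for the example.

```python
import random


def sad(cur, ref, top, left, size, mv1, mv2):
    """Sum of absolute errors between a current block and the reference
    block displaced by (mv1, mv2), following f(n1 - mv1, n2 - mv2, k_ref)."""
    return sum(abs(cur[n1][n2] - ref[n1 - mv1][n2 - mv2])
               for n1 in range(top, top + size)
               for n2 in range(left, left + size))


def full_search(cur, ref, top, left, size, search_range):
    """Examine every candidate within +/-search_range and keep the best match."""
    rows, cols = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for mv1 in range(-search_range, search_range + 1):
        for mv2 in range(-search_range, search_range + 1):
            r0, c0 = top - mv1, left - mv2
            if r0 < 0 or c0 < 0 or r0 + size > rows or c0 + size > cols:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, top, left, size, mv1, mv2)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mv1, mv2), cost
    return best_mv, best_cost


# Synthetic check: the current frame is the reference displaced by (2, -3).
rng = random.Random(0)
ROWS = COLS = 24
ref = [[rng.randrange(256) for _ in range(COLS)] for _ in range(ROWS)]
cur = [[ref[(n1 - 2) % ROWS][(n2 + 3) % COLS] for n2 in range(COLS)]
       for n1 in range(ROWS)]
mv, cost = full_search(cur, ref, top=8, left=8, size=8, search_range=4)
print(mv, cost)
```

Swapping the absolute difference for a squared difference gives the MSE-style metric; the search loop is unchanged.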

         Motion Vectors and Motion Vector Field

         • Motion vector
             – Expresses the relative horizontal and vertical offsets
               (mv1,mv2), or motion, of a given block from one
               frame to another
             – Each block has its own motion vector
         • Motion vector field
             – Collection of motion vectors for all the blocks in a
               frame




         Example of Fast Motion Estimation Search: 3-Step (Log) Search
                            • Goal: Reduce number of search
                              points
                            • Example: (± 7,±7 ) search area
                            • Dots represent search points
                            • Search performed in 3 steps
                              (coarse-to-fine):
                                Step 1:      (±4 pixels)
                                Step 2:      (±2 pixels)
                                Step 3:      (±1 pixel)
                            • Best match is found at each step
                            • Next step: Search is centered
                              around the best match of prior step


                            • Speedup increases for larger
                              search areas
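
The coarse-to-fine procedure can be sketched as below. To keep the example short, the search takes a hypothetical `cost(mv1, mv2)` callable, which in a coder would be the block-matching error (e.g. MAE) for that candidate; here a bowl-shaped toy cost stands in for it. For a (±7, ±7) area this evaluates 25 candidates instead of the 225 a full search would examine.

```python
def three_step_search(cost, first_step=4):
    """Coarse-to-fine search: steps of 4, 2, then 1 pixel around the best match.

    `cost` is any callable returning the matching error for a candidate
    motion vector (an assumption of this sketch; bounds checks are left to it).
    """
    center = (0, 0)
    best_cost = cost(0, 0)
    evaluations = 1
    step = first_step
    while step >= 1:
        best = center
        for d1 in (-step, 0, step):
            for d2 in (-step, 0, step):
                if d1 == 0 and d2 == 0:
                    continue  # center was already evaluated
                candidate = (center[0] + d1, center[1] + d2)
                c = cost(*candidate)
                evaluations += 1
                if c < best_cost:
                    best_cost, best = c, candidate
        center = best      # next step is centered on the best match so far
        step //= 2
    return center, best_cost, evaluations


def toy_cost(mv1, mv2):
    """Toy error surface with a single minimum at (5, -3)."""
    return (mv1 - 5) ** 2 + (mv2 + 3) ** 2


result = three_step_search(toy_cost)
print(result)
```

The search is only guaranteed to find the global best match when the error surface decreases steadily toward its minimum, which is why it is a "partial" search rather than a replacement for full search.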
         Motion Vector Precision?

         • Motivation:
            – Motion is not limited to integer-pixel offsets
            – However, video only known at discrete pixel locations
            – To estimate sub-pixel motion, frames must be spatially
              interpolated
         • Fractional MVs are used to represent the sub-pixel motion
         • Improved performance (extra complexity is worthwhile)
         • Half-pixel ME used in most standards: MPEG-1/2/4
         • Why are half-pixel motion vectors better?
            – Can capture half-pixel motion
            – Averaging effect (from spatial interpolation) reduces
              prediction error → Improved prediction
            – For noisy sequences, averaging effect reduces noise →
              Improved compression

          Practical Half-Pixel Motion Estimation Algorithm

     • Half-pixel ME (coarse-fine) algorithm:
         1) Coarse step: Perform integer motion estimation on blocks; find
            best integer-pixel MV
         2) Fine step: Refine estimate to find best half-pixel MV
             a) Spatially interpolate the selected region in reference frame
             b) Compare current block to interpolated reference frame
                block
             c) Choose the integer or half-pixel offset that provides best
                match
     • Typically, bilinear interpolation is used for spatial interpolation
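
A sketch of the fine step, assuming the coarse step has already produced an integer-pixel motion vector. Motion vectors are handled in half-pixel units internally; the function names, the synthetic frames, and the lack of border handling are simplifications made for this example.

```python
import random


def sample_half_pel(ref, h1, h2):
    """Reference value at (h1/2, h2/2). On the half-pixel grid, bilinear
    interpolation reduces to averaging the 1, 2, or 4 nearest pixels."""
    r0, c0 = h1 // 2, h2 // 2
    r1, c1 = r0 + h1 % 2, c0 + h2 % 2
    return (ref[r0][c0] + ref[r0][c1] + ref[r1][c0] + ref[r1][c1]) / 4


def block_cost(cur, ref, top, left, size, hmv):
    """Sum of absolute errors for a motion vector given in half-pixel units."""
    return sum(abs(cur[n1][n2]
                   - sample_half_pel(ref, 2 * n1 - hmv[0], 2 * n2 - hmv[1]))
               for n1 in range(top, top + size)
               for n2 in range(left, left + size))


def refine_half_pel(cur, ref, top, left, size, int_mv):
    """Fine step: compare the integer MV with its 8 half-pixel neighbours."""
    candidates = [(2 * int_mv[0] + d1, 2 * int_mv[1] + d2)
                  for d1 in (-1, 0, 1) for d2 in (-1, 0, 1)]
    best = min(candidates,
               key=lambda h: block_cost(cur, ref, top, left, size, h))
    cost = block_cost(cur, ref, top, left, size, best)
    return (best[0] / 2, best[1] / 2), cost


# Synthetic check: current frame = reference displaced by (1, 0.5) pixels.
rng = random.Random(1)
SIZE = 12
ref = [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]
cur = [[sample_half_pel(ref, 2 * n1 - 2, 2 * n2 - 1) if n1 >= 1 and n2 >= 1 else 0
        for n2 in range(SIZE)] for n1 in range(SIZE)]
mv, cost = refine_half_pel(cur, ref, top=4, left=4, size=4, int_mv=(1, 0))
print(mv, cost)
```

Only nine candidates are tested in the fine step, so the extra cost over integer-pixel estimation is small compared with interpolating and searching the whole area at half-pixel resolution.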



          Example: MC-Prediction for Two Consecutive Frames

          [Figure: previous (reference) frame and current frame (to be predicted);
           numbered blocks of the reference frame are displaced to form the predicted frame]

          [Figure: prediction of the current frame, and the prediction error (residual)]

         Block Matching Algorithm: Summary
         • Issues:
              – Block size?
              – Search range?
              – Motion vector accuracy?
         • Motion typically estimated only from luminance
         • Advantages:
              – Good, robust performance for compression
              – Resulting motion vector field is easy to represent (one MV
                per block) and useful for compression
              – Simple, periodic structure, easy VLSI implementations
         • Disadvantages:
              – Assumes translational motion model → Breaks down for
                more complex motion
              – Often produces blocking artifacts (OK for coding with
                Block DCT)
         Bi-Directional MC-Prediction

         [Figure: numbered blocks in the previous frame, the current frame, and the future frame]

              • Bi-Directional MC-Prediction is used to estimate a block in the
                current frame from a block in:
                 1) Previous frame
                 2) Future frame
                 3) Average of a block from the previous frame and a block
                    from the future frame
                 4) Neither, i.e. code current block without prediction
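
The four options can be sketched as a per-block mode decision. The inputs `fwd` and `bwd` are assumed to be the motion-compensated blocks already found in the previous and future frames; choosing by smallest absolute error and falling back to intra coding above a fixed `intra_threshold` is an assumption of this sketch, not a rule stated in the lecture.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def choose_prediction(cur, fwd, bwd, intra_threshold):
    """Pick the best of: previous-frame block, future-frame block, their
    average, or no prediction at all (intra) when every option is poor."""
    avg = [[(p + q) / 2 for p, q in zip(row_p, row_q)]
           for row_p, row_q in zip(fwd, bwd)]
    candidates = {"previous": fwd, "future": bwd, "average": avg}
    costs = {name: sad(cur, pred) for name, pred in candidates.items()}
    mode = min(costs, key=costs.get)
    if costs[mode] > intra_threshold:
        return "intra", None   # code the block without prediction
    return mode, candidates[mode]


cur = [[10, 20], [30, 40]]
darker = [[8, 18], [28, 38]]      # e.g. matching block from the previous frame
brighter = [[12, 22], [32, 42]]   # e.g. matching block from the future frame
mode, prediction = choose_prediction(cur, darker, brighter, intra_threshold=50)
print(mode)
```

In this toy case the current block lies exactly between the two references, so the averaged prediction gives zero error.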
         MC-Prediction and Bi-Directional MC-Prediction (P- and B-frames)


                       • Motion compensated prediction: Predict the current frame
                         based on reference frame(s) while compensating for the motion
                       • Examples of block-based motion-compensated prediction
                         (P-frame) and bi-directional prediction (B-frame):



         [Figure: a P-frame predicted from the previous frame; a B-frame predicted from
          the previous frame and the future frame]

           Video Compression

           • Main addition over image compression:
               – Exploit the temporal redundancy
           • Predict current frame based on previously coded frames
           • Three types of coded frames:
               – I-frame: Intra-coded frame, coded independently of all
                 other frames
               – P-frame: Predictively coded frame, coded based on
                 previously coded frame
               – B-frame: Bi-directionally predicted frame, coded based
                 on both previous and future coded frames




         [Figure: example I-frame, P-frame, and B-frame]

          Example Use of I-,P-,B-frames: MPEG Group of Pictures (GOP)

         • Arrows show prediction dependencies between frames




          I0   B1    B2   P3     B4   B5   P6    B7    B8           I9


                               MPEG GOP
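
Because a B-frame depends on a future I- or P-frame, that future frame has to be coded before the B-frame even though it is displayed after it. The sketch below derives one coding order that satisfies those dependencies for the GOP shown; the function name and the string labels are illustrative.

```python
def coding_order(display_order):
    """Reorder frames so every B-frame follows both of its anchor frames."""
    out, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)   # must wait for the next I- or P-frame
        else:
            out.append(frame)         # anchor (I or P) is coded first
            out.extend(pending_b)     # then the B-frames that depend on it
            pending_b = []
    return out + pending_b            # trailing B-frames, if any


gop = "I0 B1 B2 P3 B4 B5 P6 B7 B8 I9".split()
order = coding_order(gop)
print(" ".join(order))
```

This reordering is also a source of delay: the encoder cannot emit B1 and B2 until P3 has arrived and been coded.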



         Summary of Temporal Processing

         • Use MC-prediction (P and B frames) to reduce temporal
           redundancy
         • MC-prediction usually performs well; in compression, there is a
           second chance to recover when it performs badly
         • MC-prediction yields:
            – Motion vectors
            – MC-prediction error or residual → Code error with
               conventional image coder
         • Sometimes MC-prediction may perform badly
            – Examples: Complex motion, new imagery (occlusions)
            – Approach:
                1. Identify frame or individual blocks where prediction fails
                2. Code without prediction

         Basic Video Compression Architecture

         • Exploiting the redundancies:
             – Temporal: MC-prediction (P and B frames)
             – Spatial: Block DCT
             – Color: Color space conversion
         • Scalar quantization of DCT coefficients
         • Zigzag scanning, runlength and Huffman coding of the
           nonzero quantized DCT coefficients
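
Zigzag scanning and runlength coding of a quantized block can be sketched as follows. The example treats all 64 coefficients alike and stops at (run, value) pairs plus an end-of-block marker; the Huffman stage, and any separate handling of the DC coefficient, are left out. The `"EOB"` marker and function names are illustrative.

```python
def zigzag_order(n=8):
    """Positions of an n x n block in zigzag order (low to high frequency)."""
    def key(pos):
        i, j = pos
        diagonal = i + j
        # Odd anti-diagonals run top-right to bottom-left, even ones the reverse.
        return (diagonal, i if diagonal % 2 else -i)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)


def run_length(values):
    """(zero_run, value) pairs for the nonzero values, then an end-of-block mark."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")   # trailing zeros are implied by the marker
    return pairs


# Quantized block with a few nonzero low-frequency coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[2][0], block[0][2] = 12, 3, -2, 1, 5
scanned = [block[i][j] for i, j in zigzag_order()]
pairs = run_length(scanned)
print(pairs)
```

The zigzag order groups the typically nonzero low-frequency coefficients at the front, so the long tail of zeros collapses into the single end-of-block marker.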




         Example Video Encoder

         [Block diagram: Input Video Signal → RGB to YUV → subtract MC-Prediction to form
          the Residual → DCT → Quantize → Huffman Coding → Buffer → Output Bitstream.
          Buffer fullness is fed back to the quantizer. Reconstruction loop: Quantize →
          Inverse Quantize → Inverse DCT → add MC-Prediction → Frame Store (previous
          reconstructed frame) → Motion Estimation and Motion Compensation → MC-Prediction.
          MV data goes to Motion Compensation and to Huffman Coding.]

                Example Video Decoder

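         [Block diagram: Input Bitstream → Buffer → Huffman Decoder → Inverse Quantize →
          Inverse DCT → Residual; Residual + MC-Prediction = Reconstructed Frame →
          YUV to RGB → Output Video Signal. The reconstructed frame is also written to the
          Frame Store (previous reconstructed frame), which feeds Motion Compensation,
          driven by the decoded MV data, to form the MC-Prediction.]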
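
One point the encoder and decoder diagrams share is that both predict from the previous reconstructed frame, so the encoder contains a copy of the decoder. The sketch below shows only that loop, under heavy simplification: frames are short lists of pixel values, motion is assumed to be zero, and the DCT and Huffman stages are omitted. `STEP` and the function names are illustrative.

```python
STEP = 4  # quantizer step size (illustrative)


def quantize(residual):
    return [round(v / STEP) for v in residual]


def dequantize(q):
    return [v * STEP for v in q]


def encode(frames):
    """Predict each frame from the previous *reconstructed* frame."""
    store = [0] * len(frames[0])   # frame store; zeros make frame 0 act as intra
    stream = []
    for frame in frames:
        residual = [x - p for x, p in zip(frame, store)]
        q = quantize(residual)
        stream.append(q)
        # Rebuild exactly what the decoder will have in its frame store.
        store = [p + r for p, r in zip(store, dequantize(q))]
    return stream


def decode(stream):
    store = [0] * len(stream[0])
    out = []
    for q in stream:
        store = [p + r for p, r in zip(store, dequantize(q))]
        out.append(store)
    return out


frames = [[10, 20, 30], [12, 21, 33], [15, 25, 31]]
decoded = decode(encode(frames))
max_error = max(abs(a - b) for f, d in zip(frames, decoded) for a, b in zip(f, d))
print(decoded, max_error)
```

Because the prediction uses reconstructed rather than original frames, the quantization error stays bounded by half a step in every frame instead of accumulating over time.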




         Motivation for Scalable Coding

         Basic situation:
         1. Diverse receivers may request the same video
             – Different bandwidths, spatial resolutions, frame rates,
                 computational capabilities
         2. Heterogeneous networks and a priori unknown network conditions
             – Wired and wireless links, time-varying bandwidths
         → When you originally code the video you don’t know which client
            or network situation will exist in the future
         → Probably have multiple different situations, each requiring a
            different compressed bitstream
         → Need a different compressed video matched to each situation
         • Possible solutions:
             1. Compress & store MANY different versions of the same video
             2. Real-time transcoding (e.g. decode/re-encode)
             3. Scalable coding
         Scalable Video Coding

         • Scalable coding:
            – Decompose video into multiple layers of prioritized
                importance
            – Code layers into base and enhancement bitstreams
            – Progressively combine one or more bitstreams to produce
                different levels of video quality
         • Example of scalable coding with base and two enhancement
           layers: Can produce three different qualities
                 1. Base layer (lowest quality)
                 2. Base + Enh1 layers
                 3. Base + Enh1 + Enh2 layers (highest quality)
         • Scalability with respect to: Spatial or temporal resolution, bit
           rate, computation, memory


          Example of Scalable Coding
 • Encode image/video into three layers:
       Encoder → Base, Enh1, Enh2

 • Low-bandwidth receiver: Send only Base layer
       Base → Decoder → Low Res

 • Medium-bandwidth receiver: Send Base & Enh1 layers
       Base + Enh1 → Decoder → Med Res

 • High-bandwidth receiver: Send all three layers
       Base + Enh1 + Enh2 → Decoder → High Res

 • Can adapt to different clients and network situations
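
As a rough sketch of how a sender might choose which layers to transmit, the snippet below picks the longest run of layers that fits a receiver's bandwidth. The layer bit rates and the `select_layers` helper are assumptions made for illustration; they do not come from the lecture.

```python
def select_layers(layer_rates_kbps, available_kbps):
    """Return the longest prefix of layers whose total bit rate fits the link."""
    chosen = []
    total = 0
    for name, rate in layer_rates_kbps:
        if total + rate > available_kbps:
            # Layers are nested: an enhancement layer is useless without
            # every layer below it, so stop at the first one that does not fit.
            break
        chosen.append(name)
        total += rate
    return chosen


# Hypothetical bit rates for the three layers, in kb/s.
LAYERS = [("Base", 128), ("Enh1", 256), ("Enh2", 640)]

low = select_layers(LAYERS, 200)
medium = select_layers(LAYERS, 500)
high = select_layers(LAYERS, 2000)
```

The same coded layers serve all three receivers; only the subset that is sent changes.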
         Scalable Video Coding (cont.)

         • Three basic types of scalability (refine video quality
           along three different dimensions):
             – Temporal scalability → Temporal resolution
             – Spatial scalability → Spatial resolution
             – SNR (quality) scalability → Amplitude resolution
         • Each type of scalable coding provides scalability of one
           dimension of the video signal
             – Can combine multiple types of scalability to provide
               scalability along multiple dimensions




         Scalable Coding: Temporal Scalability

         • Temporal scalability: Based on the use of B-frames to
           refine the temporal resolution
             – B-frames are dependent on other frames
             – However, no other frame depends on a B-frame
             – Each B-frame may be discarded without affecting
               other frames




               I0   B1   B2   P3     B4   B5   P6   B7   B8   I9


                                   MPEG GOP
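
A small sketch of temporal scalability on the GOP shown above: since no frame depends on a B-frame, dropping every B-frame leaves a decodable lower-frame-rate stream. The helper name `temporal_base_layer` is hypothetical.

```python
def temporal_base_layer(gop):
    """Keep only I- and P-frames; B-frames form the temporal enhancement layer."""
    return [frame for frame in gop if not frame.startswith("B")]


# The GOP from the slide, in display order.
GOP = ["I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "I9"]

base = temporal_base_layer(GOP)
enhancement = [frame for frame in GOP if frame.startswith("B")]
```

With two B-frames between anchors, the base layer here carries one frame in three.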
             Scalable Coding: Spatial Scalability

             • Spatial scalability: Based on refining the spatial resolution
                 – Base layer is low resolution version of video
                 – Enh1 contains coded difference between upsampled
                   base layer and original video
                 – Also called: Pyramid coding

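             [Figure: two-layer pyramid coder. The original video is
              downsampled (↓2) and coded (Enc) as the base layer; decoding
              the base layer (Dec) gives the low-res video. At the encoder,
              the decoded base layer is upsampled (↑2) and subtracted from
              the original, and the difference is coded (Enc) as the
              enhancement layer. At the decoder, the decoded enhancement
              layer (Dec) is added to the upsampled (↑2) decoded base layer
              to give the high-res video.]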
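
The pyramid structure can be illustrated on a single row of pixels. This sketch makes several simplifying assumptions: the layers are not actually compressed (no Enc/Dec stage), downsampling averages pixel pairs, upsampling repeats each sample, and the row length is even. All function names are hypothetical.

```python
def downsample2(row):
    """Halve the resolution by averaging neighbouring pixel pairs."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]


def upsample2(row):
    """Double the resolution by repeating every sample."""
    return [value for value in row for _ in range(2)]


def encode_spatial(row):
    """Split a row into a low-res base layer and a full-res difference layer."""
    base = downsample2(row)
    enh = [orig - up for orig, up in zip(row, upsample2(base))]
    return base, enh


def decode_spatial(base, enh=None):
    """Base alone gives the low-res signal; base + enhancement gives high-res."""
    if enh is None:
        return base
    return [up + diff for up, diff in zip(upsample2(base), enh)]


row = [10, 12, 20, 24, 30, 30, 41, 45]
base, enh = encode_spatial(row)
low_res = decode_spatial(base)
high_res = decode_spatial(base, enh)
```

Because nothing is quantized here, the high-res output equals the input exactly; in a real coder both layers are lossy.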

         Scalable Coding: SNR (Quality) Scalability

         • SNR (Quality) Scalability: Based on refining the
           amplitude resolution
             – Base layer uses a coarse quantizer
             – Enh1 applies a finer quantizer to the difference
               between the original DCT coefficients and the
               coarsely quantized base layer coefficients


         [Figure: enhancement-layer EI and EP frames shown above the
          corresponding base-layer I and P frames.]

         Note: Base & enhancement layers are at the same spatial resolution
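
The two-quantizer idea can be sketched on a handful of DCT coefficients. The step sizes (16 and 4), the sample coefficients, and the function names are assumptions chosen for illustration only.

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each value to the nearest multiple of step."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Map quantizer indices back to reconstruction values."""
    return [q * step for q in levels]


def encode_snr(coeffs, coarse_step, fine_step):
    """Base layer: coarse quantizer. Enh1: finer quantizer on what the base missed."""
    base_levels = quantize(coeffs, coarse_step)
    base_recon = dequantize(base_levels, coarse_step)
    residual = [c - b for c, b in zip(coeffs, base_recon)]
    enh_levels = quantize(residual, fine_step)
    return base_levels, enh_levels


def decode_snr(base_levels, coarse_step, enh_levels=None, fine_step=None):
    """Decode the base layer alone, or refine it with the enhancement layer."""
    recon = dequantize(base_levels, coarse_step)
    if enh_levels is not None:
        refinement = dequantize(enh_levels, fine_step)
        recon = [b + e for b, e in zip(recon, refinement)]
    return recon


coeffs = [37, -12, 5, 0]
base_levels, enh_levels = encode_snr(coeffs, coarse_step=16, fine_step=4)
base_only = decode_snr(base_levels, 16)
refined = decode_snr(base_levels, 16, enh_levels, 4)
```

Both outputs have the same number of coefficients (same spatial resolution); the enhancement layer only reduces the amplitude error.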
         Summary of Scalable Video Coding
         • Three basic types of scalable video coding:
             – Temporal scalability
             – Spatial scalability
             – SNR (quality) scalability
         • Scalable coding produces different layers with prioritized
           importance
         • Prioritized importance is key for a variety of applications:
             – Adapting to different bandwidths, or client resources
               such as spatial or temporal resolution or computational
               power
             – Facilitates error-resilience by explicitly identifying most
               important and less important bits


         Motivation for Standards

         • Goals of standards:
             – Ensuring interoperability: Enabling communication
               between devices made by different manufacturers
             – Promoting a technology or industry
             – Reducing costs




         What do the Standards Specify?



         [Figure: Encoder → Bitstream → Decoder (Decoding Process). The scope
          of standardization covers the bitstream and the decoding process.]

   • Not the encoder
   • Not the decoder
   • Just the bitstream syntax and the decoding process (e.g. use IDCT,
     but not how to implement the IDCT)
       → Enables improved encoding & decoding strategies to be
          employed in a standard-compatible manner

         Current Image and Video Compression Standards

 Standard             Application                                       Bit Rate
 JPEG                 Continuous-tone still-image compression           Variable
 H.261                Video telephony and teleconferencing over ISDN    p x 64 kb/s
 MPEG-1               Video on digital storage media (CD-ROM)           1.5 Mb/s
 MPEG-2               Digital Television                                2-20 Mb/s
 H.263                Video telephony over PSTN                         33.6-? kb/s
 MPEG-4               Object-based coding, synthetic content,           Variable
                      interactivity
 JPEG-2000            Improved still image compression                  Variable
 H.264 / MPEG-4 AVC   Improved video compression                        10s to 100s of kb/s

         Comparing Current Video Compression Standards

    • Based on the same fundamental building blocks
         – Motion-compensated prediction (I, P, and B frames)
         – 2-D Discrete Cosine Transform (DCT)
         – Color space conversion
         – Scalar quantization, runlengths, Huffman coding
    • Additional tools added for different applications:
         – Progressive or interlaced video
         – Improved compression, error resilience, scalability, etc.
    • MPEG-1/2/4, H.261/3/4: Frame-based coding
    • MPEG-4: Object-based coding and Synthetic video



         MPEG Group of Pictures (GOP) Structure
         • Composed of I, P, and B frames
         • Arrows show prediction dependencies
         • Periodic I-frames enable random access into the coded bitstream
         • Parameters: (1) Spacing between I frames, (2) number of B frames
           between I and P frames




           I0   B1     B2   P3     B4    B5   P6    B7     B8           I9


                                 MPEG GOP
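
The two GOP parameters are enough to regenerate the frame pattern shown above. The sketch below is one way to do so, assuming the I-frame spacing is a multiple of the anchor spacing (number of B-frames plus one); the function name and its arguments are hypothetical.

```python
def gop_frame_types(i_spacing, b_between, num_frames):
    """Return display-order frame labels such as 'I0', 'B1', 'B2', 'P3'."""
    anchor_spacing = b_between + 1  # distance between consecutive I/P frames
    labels = []
    for n in range(num_frames):
        position = n % i_spacing
        if position == 0:
            kind = "I"   # periodic I-frame: random access point
        elif position % anchor_spacing == 0:
            kind = "P"   # predicted anchor frame
        else:
            kind = "B"   # bidirectionally predicted frame
        labels.append(f"{kind}{n}")
    return labels


# I-frame every 9 frames, 2 B-frames between anchors, as in the slide.
pattern = gop_frame_types(i_spacing=9, b_between=2, num_frames=10)
```

Shorter I-frame spacing gives finer random access at the cost of more bits, since I-frames are the most expensive to code.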
         MPEG Structure

         • MPEG codes video in a hierarchy of layers. The
           sequence layer is not shown.
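          [Figure: nested layers, from largest to smallest unit]
              – GOP layer: sequence of pictures (I B B P B B P ...)
              – Picture layer
              – Slice layer
              – Macroblock layer: 4 8x8 DCT blocks, 1 MV
              – Block layer: 8x8 DCT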
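
One way to picture the hierarchy is as nested containers. The classes below are an illustrative data model only, not the MPEG bitstream syntax: they mirror what the figure labels (four 8x8 blocks and one motion vector per macroblock) and leave out the sequence layer, chroma blocks, and all header fields. Every class and function name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    coeffs: List[List[int]]  # 8x8 quantized DCT coefficients


@dataclass
class Macroblock:
    motion_vector: Tuple[int, int]  # one MV per macroblock
    blocks: List[Block]             # four 8x8 blocks, as labelled in the figure


@dataclass
class Slice:
    macroblocks: List[Macroblock] = field(default_factory=list)


@dataclass
class Picture:
    frame_type: str  # "I", "P" or "B"
    slices: List[Slice] = field(default_factory=list)


@dataclass
class GroupOfPictures:
    pictures: List[Picture] = field(default_factory=list)


def empty_macroblock():
    """Build a macroblock with four all-zero 8x8 blocks and a zero MV."""
    blocks = [Block([[0] * 8 for _ in range(8)]) for _ in range(4)]
    return Macroblock(motion_vector=(0, 0), blocks=blocks)


def count_blocks(gop):
    """Walk the hierarchy from GOP down to block level."""
    return sum(len(mb.blocks)
               for picture in gop.pictures
               for sl in picture.slices
               for mb in sl.macroblocks)


gop = GroupOfPictures([
    Picture("I", [Slice([empty_macroblock() for _ in range(3)])]),
    Picture("P", [Slice([empty_macroblock() for _ in range(3)])]),
])
total_blocks = count_blocks(gop)
```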




            MPEG-2 Profiles and Levels

            • Goal: To enable more efficient implementations for
              different applications (interoperability points)
                – Profile: Subset of the tools applicable for a family of
                  applications
                – Level: Bounds on the complexity for any profile
            [Figure: grid of levels (Low, Main, High) against profiles
             (Simple, Main, High), with two interoperability points marked:]
                – HDTV: Main Profile at High Level (MP@HL)
                – DVD & SD Digital TV: Main Profile at Main Level (MP@ML)
         MPEG-4 Natural Video Coding

         • Extension of MPEG-1/2-type algorithms to code
           arbitrarily shaped objects




         [Figure: frame-based coding compared with object-based coding.
          Source: MPEG Committee]

         Basic Idea: Extend Block-DCT and Block-ME/MC-prediction
          to code arbitrarily shaped objects

         Example of MPEG-4 Scene (Object-based Coding)

         [Figure: example MPEG-4 scene. Source: MPEG Committee]
         Example MPEG-4 Object Decoding Process

         [Figure: example MPEG-4 object decoding process. Source: MPEG Committee]

         Sprite Coding (Background Prediction)

         • Sprite: Large background image
            – Hypothesis: Same background exists for many frames, with
              changes resulting from camera motion and occlusions
         • One possible coding strategy:
            1. Code & transmit entire sprite once
            2. Only transmit camera motion parameters for each
               subsequent frame
         • Significant coding gain for some scenes
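
The coding strategy above can be sketched as follows, under a strong simplifying assumption: camera motion is a pure translation, so each frame's background is just a window cut out of the sprite at a per-frame offset. Foreground objects and more general camera motion are not handled, and the `frame_from_sprite` helper is hypothetical.

```python
def frame_from_sprite(sprite, offset, height, width):
    """Cut the visible background window out of the sprite for one frame."""
    top, left = offset
    return [row[left:left + width] for row in sprite[top:top + height]]


# Toy sprite: 4 rows x 8 columns, transmitted once.
sprite = [[10 * r + c for c in range(8)] for r in range(4)]

# Afterwards only one (top, left) offset is sent per frame.
camera_offsets = [(0, 0), (0, 2), (1, 4)]
frames = [frame_from_sprite(sprite, offset, 2, 3) for offset in camera_offsets]
```

The gain comes from sending two numbers per frame instead of the full background; it shrinks if the background itself changes or the sprite is rarely reused.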




           Sprite Coding Example




         [Figure: the sprite (background) and the foreground object are
          combined to give the reconstructed frame. Source: MPEG Committee]


             Review of Today’s Lecture

         • Motivation for compression
         • Brief review of generic compression system (from prior lecture)
         • Brief review of image compression (from last lecture)
         • Video compression
             – Exploit temporal dimension of video signal
             – Motion-compensated prediction
             – Generic (MPEG-type) video coder architecture
             – Scalable video coding
         • Overview of current video compression standards
             – What do the standards specify?
             – Frame-based video coding: MPEG-1/2/4, H.261/3/4
             – Object-based video coding: MPEG-4

         References and Further Reading

         General Video Compression References:
         • J.G. Apostolopoulos and S.J. Wee, "Video Compression Standards",
           Wiley Encyclopedia of Electrical and Electronics Engineering, John
           Wiley & Sons, Inc., New York, 1999.
         • V. Bhaskaran and K. Konstantinides, Image and Video Compression
           Standards: Algorithms and Architectures, Boston, Massachusetts:
           Kluwer Academic Publishers, 1997.
         • J.L. Mitchell, W.B. Pennebaker, C.E. Fogg, and D.J. LeGall, MPEG
           Video Compression Standard, New York: Chapman & Hall, 1997.
         • B.G. Haskell, A. Puri, A.N. Netravali, Digital Video: An Introduction to
           MPEG-2, Kluwer Academic Publishers, Boston, 1997.
         MPEG web site:
          http://drogo.cselt.stet.it/mpeg




 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 

video_compression_2004

  • 1. Video Coding Video Compression MIT 6.344, Spring 2004 John G. Apostolopoulos Streaming Media Systems Group Hewlett-Packard Laboratories japos@hpl.hp.com John G. Apostolopoulos April 22, 2004 Page 1
  • 2. Overview of Next Three Lectures
      • Video Compression (Thurs, 4/22; today)
          – Principles and practice of video coding
          – Basics behind MPEG compression algorithms
          – Current image & video compression standards
      • Video Communication & Video Streaming I (Tues, 4/27)
          – Video application contexts & examples: DVD and Digital TV
          – Challenges in video streaming over the Internet
          – Techniques for overcoming these challenges
      • Video Communication & Video Streaming II (Thurs, 4/29)
          – Video over lossy packet networks and wireless links → Error-resilient video communications
  • 3. Outline of Today’s Lecture
      • Motivation for compression
      • Brief review of generic compression system (from prior lecture)
      • Brief review of image compression (from last lecture)
      • Video compression
          – Exploit temporal dimension of video signal
          – Motion-compensated prediction
          – Generic (MPEG-type) video coder architecture
          – Scalable video coding
      • Overview of current video compression standards
          – What do the standards specify?
          – Frame-based video coding: MPEG-1/2/4, H.261/3/4
          – Object-based video coding: MPEG-4
  • 4. Motivation for Compression: Example of HDTV Video Signal
      • Problem:
          – Raw video contains an immense amount of data
          – Communication and storage capabilities are limited and expensive
      • Example HDTV video signal:
          – 720x1280 pixels/frame, progressive scanning at 60 frames/s:
            $$\left(\frac{720 \times 1280\ \text{pixels}}{\text{frame}}\right)\left(\frac{60\ \text{frames}}{\text{sec}}\right)\left(\frac{3\ \text{colors}}{\text{pixel}}\right)\left(\frac{8\ \text{bits}}{\text{color}}\right) \approx 1.3\ \text{Gb/s}$$
          – 20 Mb/s HDTV channel bandwidth
            → Requires compression by a factor of about 70 (equivalent to about 0.35 bits/pixel)
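The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the calculation; the variable names are illustrative, and the numbers are the ones quoted on the slide.

```python
# Raw bit rate of the HDTV example: 720x1280 pixels, 60 frames/s,
# 3 color components per pixel, 8 bits per component.
width, height = 1280, 720
frames_per_sec = 60
colors_per_pixel = 3
bits_per_color = 8
channel_bps = 20e6  # 20 Mb/s HDTV channel quoted on the slide

raw_bps = width * height * frames_per_sec * colors_per_pixel * bits_per_color
compression_factor = raw_bps / channel_bps
bits_per_pixel = channel_bps / (width * height * frames_per_sec)

print(f"raw rate: {raw_bps / 1e9:.2f} Gb/s")
print(f"required compression factor: {compression_factor:.1f}")
print(f"coded bits per pixel: {bits_per_pixel:.2f}")
```

The exact values come out slightly different from the rounded figures on the slide (a factor of roughly 66 rather than 70), which is consistent with the slide quoting order-of-magnitude numbers.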
  • 5. Achieving Compression
      • Reduce redundancy and irrelevancy
      • Sources of redundancy
          – Temporal: Adjacent frames highly correlated
          – Spatial: Nearby pixels are often correlated with each other
          – Color space: RGB components are correlated among themselves
          → Relatively straightforward to exploit
      • Irrelevancy
          – Perceptually unimportant information
          → Difficult to model and exploit
  • 6. Spatial and Temporal Redundancy
      • Why can video be compressed?
          – Video contains much spatial and temporal redundancy.
      • Spatial redundancy: Neighboring pixels are similar
      • Temporal redundancy: Adjacent frames are similar
      Compression is achieved by exploiting the spatial and temporal redundancy inherent to video.
  • 8. Generic Compression System
      [Block diagram: Original Signal → Representation (Analysis) → Quantization → Binary Encoding → Compressed Bitstream]
      A compression system is composed of three key building blocks:
      • Representation
          – Concentrates important information into a few parameters
      • Quantization
          – Discretizes parameters
      • Binary encoding
          – Exploits non-uniform statistics of quantized parameters
          – Creates bitstream for transmission
  • 9. Generic Compression System (cont.)
      [Block diagram: Original Signal → Representation (Analysis) [generally lossless] → Quantization [lossy] → Binary Encoding [lossless] → Compressed Bitstream]
      • Generally, the only operation that is lossy is the quantization stage
      • The fact that all the loss (distortion) is localized to a single operation greatly simplifies system design
      • Can design loss to exploit human visual system (HVS) properties
  • 10. Generic Compression System (cont.)
      [Block diagram, Source Encoder: Original Signal → Representation (Analysis) → Quantization → Binary Encoding → Compressed Bitstream → Channel]
      [Block diagram, Source Decoder: Channel → Binary Decoding → Inverse Quantization → Representation (Synthesis) → Reconstructed Signal]
      • Source decoder performs the inverse of each of the three operations
  • 11. Review of Image Compression
      [Block diagram: Original Image → RGB to YUV → Block DCT → Quantization → Runlength & Huffman Coding → Compressed Bitstream]
      • Coding an image (single frame):
          – RGB to YUV color-space conversion
          – Partition image into 8x8-pixel blocks
          – 2-D DCT of each block
          – Quantize each DCT coefficient
          – Runlength and Huffman code the nonzero quantized DCT coefficients
      → Basis for the JPEG Image Compression Standard
      → JPEG-2000 uses wavelet transform and arithmetic coding
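The transform and quantization steps listed on this slide can be sketched in plain Python. This is a minimal illustration and not the JPEG algorithm itself: it assumes an orthonormal 8x8 DCT-II and one uniform quantizer step for every coefficient, whereas JPEG uses a per-coefficient quantization table. The function names are illustrative.

```python
import math

N = 8  # block size from the slide

# Orthonormal DCT-II basis: C[u][x] = a(u) * cos((2x + 1) * u * pi / (2N))
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(block):
    """2-D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))

def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C."""
    return matmul(matmul(transpose(C), coeffs), C)

def quantize(coeffs, step):
    """Uniform scalar quantization; this is the only lossy step."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]

# A smooth block (horizontal ramp) concentrates its energy in a few coefficients.
block = [[100 + 3 * x for x in range(N)] for _ in range(N)]
levels = quantize(dct2(block), step=16)
recon = idct2(dequantize(levels, step=16))
nonzero = sum(1 for row in levels for q in row if q != 0)
max_err = max(abs(block[i][j] - recon[i][j]) for i in range(N) for j in range(N))
print(f"{nonzero} nonzero levels out of {N * N}, max pixel error {max_err:.2f}")
```

For smooth blocks most quantized coefficients are zero, which is what makes the subsequent runlength and Huffman stage effective.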
  • 13. Video Compression
      • Video: Sequence of frames (images) that are related
      • Related along the temporal dimension
          – Therefore temporal redundancy exists
      • Main addition over image compression
          – Temporal redundancy
          → Video coder must exploit the temporal redundancy
  • 14. Temporal Processing
      • Usually high frame rate: Significant temporal redundancy
      • Possible representations along temporal dimension:
          – Transform/subband methods
              – Good for textbook case of constant-velocity, uniform global motion
              – Inefficient for nonuniform motion, i.e. real-world motion
              – Requires large number of frame stores, which leads to delay (memory cost may also be an issue)
          – Predictive methods
              – Good performance using only 2 frame stores
              – However, simple frame differencing is not enough…
  • 15. Video Compression
      • Goal: Exploit the temporal redundancy
      • Predict current frame based on previously coded frames
      • Three types of coded frames:
          – I-frame: Intra-coded frame, coded independently of all other frames
          – P-frame: Predictively coded frame, coded based on previously coded frame
          – B-frame: Bi-directionally predicted frame, coded based on both previous and future coded frames
      [Figure: example I-frame, P-frame, and B-frame]
  • 16. Temporal Processing: Motion-Compensated Prediction
      • Simple frame differencing fails when there is motion
      • Must account for motion
        → Motion-compensated (MC) prediction
      • MC-prediction generally provides significant improvements
      • Questions:
          – How can we estimate motion?
          – How can we form MC-prediction?
  • 17. Temporal Processing: Motion Estimation
      • Ideal situation:
          – Partition video into moving objects
          – Describe object motion
          → Generally very difficult
      • Practical approach: Block-Matching Motion Estimation
          – Partition each frame into blocks, e.g. 16x16 pixels
          – Describe motion of each block
          → No object identification required
          → Good, robust performance
  • 18. Block-Matching Motion Estimation
      [Figure: Reference Frame and Current Frame, each divided into blocks numbered 1-16, with a Motion Vector (mv1, mv2) drawn between them]
      • Assumptions:
          – Translational motion within block:
            $$f(n_1, n_2, k_{cur}) = f(n_1 - mv_1,\ n_2 - mv_2,\ k_{ref})$$
          – All pixels within each block have the same motion
      • ME Algorithm:
          1) Divide current frame into non-overlapping N1xN2 blocks
          2) For each block, find the best matching block in reference frame
      • MC-Prediction Algorithm:
          – Use best matching blocks of reference frame as prediction of blocks in current frame
  • 19. Block Matching: Determining the Best Matching Block
      • For each block in the current frame search for best matching block in the reference frame
          – Metrics for determining “best match”:
            $$MSE = \sum\sum_{(n_1, n_2) \in Block} \left[ f(n_1, n_2, k_{cur}) - f(n_1 - mv_1,\ n_2 - mv_2,\ k_{ref}) \right]^2$$
            $$MAE = \sum\sum_{(n_1, n_2) \in Block} \left| f(n_1, n_2, k_{cur}) - f(n_1 - mv_1,\ n_2 - mv_2,\ k_{ref}) \right|$$
          – Candidate blocks: All blocks in, e.g., (±32, ±32) pixel area
          – Strategies for searching candidate blocks for best match
              – Full search: Examine all candidate blocks
              – Partial (fast) search: Examine a carefully selected subset
      • Estimate of motion for best matching block: “motion vector”
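A full search over the candidate blocks could be sketched as follows. This is an illustrative sketch, not code from the lecture: frames are assumed to be lists of rows of luminance values, the cost is the unnormalized sum of absolute differences (the MAE expression above), and the sign convention follows the slide, so the candidate block sits at (n1 - mv1, n2 - mv2) in the reference frame. The function names and the synthetic ramp frame are assumptions.

```python
def block_cost(cur, ref, top, left, mv1, mv2, n):
    """Sum of absolute differences between the current block and the
    reference block displaced by (-mv1, -mv2), as in the slide's equation."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(cur[top + i][left + j] - ref[top + i - mv1][left + j - mv2])
    return total

def full_search(cur, ref, top, left, n, search_range):
    """Examine every candidate MV in (+/-search_range, +/-search_range)."""
    rows, cols = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for mv1 in range(-search_range, search_range + 1):
        for mv2 in range(-search_range, search_range + 1):
            r0, c0 = top - mv1, left - mv2
            # Skip candidates that fall outside the reference frame.
            if r0 < 0 or c0 < 0 or r0 + n > rows or c0 + n > cols:
                continue
            cost = block_cost(cur, ref, top, left, mv1, mv2, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mv1, mv2), cost
    return best_mv, best_cost

# Synthetic example: the current frame is the reference shifted by mv = (2, -1).
size = 12
ref = [[100 * r + c for c in range(size)] for r in range(size)]
cur = [[ref[r - 2][c + 1] if r >= 2 and c + 1 < size else 0
        for c in range(size)] for r in range(size)]
print(full_search(cur, ref, top=4, left=4, n=4, search_range=3))
```

The cost of a full search grows with the square of the search range, which motivates the fast search strategies mentioned on the slide.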
  • 20. Motion Vectors and Motion Vector Field
      • Motion vector
          – Expresses the relative horizontal and vertical offsets (mv1, mv2), or motion, of a given block from one frame to another
          – Each block has its own motion vector
      • Motion vector field
          – Collection of motion vectors for all the blocks in a frame
  • 21. Example of Fast Motion Estimation Search: 3-Step (Log) Search
      • Goal: Reduce number of search points
      • Example: (±7, ±7) search area
      • Dots represent search points
      • Search performed in 3 steps (coarse-to-fine):
          Step 1: (±4 pixels)
          Step 2: (±2 pixels)
          Step 3: (±1 pixel)
      • Best match is found at each step
      • Next step: Search is centered around the best match of prior step
      • Speedup increases for larger search areas
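The coarse-to-fine procedure above can be sketched as follows. To keep the example short, the matching metric is passed in as a function `cost(mv1, mv2)`; that interface, the function name, and the synthetic quadratic cost are assumptions made for illustration. In a real coder the cost would be the block-matching error for that candidate.

```python
def three_step_search(cost, first_step=4):
    """Coarse-to-fine search: test the 8 neighbours of the current best MV
    at spacing 4, then 2, then 1 (covers a +/-7 area when first_step is 4)."""
    best_mv = (0, 0)
    best_cost = cost(0, 0)
    step = first_step
    while step >= 1:
        center = best_mv  # each step is centered on the best match so far
        for d1 in (-step, 0, step):
            for d2 in (-step, 0, step):
                if d1 == 0 and d2 == 0:
                    continue  # center was already evaluated
                mv = (center[0] + d1, center[1] + d2)
                c = cost(*mv)
                if c < best_cost:
                    best_mv, best_cost = mv, c
        step //= 2
    return best_mv, best_cost

# Synthetic, well-behaved cost surface with its minimum at (6, -7).
evaluations = []
def demo_cost(mv1, mv2):
    evaluations.append((mv1, mv2))
    return (mv1 - 6) ** 2 + (mv2 + 7) ** 2

print(three_step_search(demo_cost), len(evaluations), "points evaluated")
```

This sketch evaluates 25 points, compared with 225 for a full search of the same (±7, ±7) area. A fast search is not guaranteed to find the global best match when the cost surface has several local minima.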
  • 22. Motion Vector Precision?
      • Motivation:
          – Motion is not limited to integer-pixel offsets
          – However, video only known at discrete pixel locations
          – To estimate sub-pixel motion, frames must be spatially interpolated
      • Fractional MVs are used to represent the sub-pixel motion
      • Improved performance (extra complexity is worthwhile)
      • Half-pixel ME used in most standards: MPEG-1/2/4
      • Why are half-pixel motion vectors better?
          – Can capture half-pixel motion
          – Averaging effect (from spatial interpolation) reduces prediction error → Improved prediction
          – For noisy sequences, averaging effect reduces noise → Improved compression
  • 23. Practical Half-Pixel Motion Estimation Algorithm
      • Half-pixel ME (coarse-fine) algorithm:
          1) Coarse step: Perform integer motion estimation on blocks; find best integer-pixel MV
          2) Fine step: Refine estimate to find best half-pixel MV
              a) Spatially interpolate the selected region in reference frame
              b) Compare current block to interpolated reference frame block
              c) Choose the integer or half-pixel offset that provides best match
      • Typically, bilinear interpolation is used for spatial interpolation
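The fine step could be sketched as below. This is a simplified illustration under stated assumptions: coordinates are kept in half-pixel units, bilinear interpolation averages the nearest integer-position pixels, the cost is a sum of absolute differences, and the caller is assumed to keep the block at least one pixel away from the frame border (no bounds checks). The function names are illustrative.

```python
def sample_half_pel(ref, r2, c2):
    """Bilinear sample of ref at (r2 / 2, c2 / 2); r2 and c2 are
    non-negative coordinates in half-pixel units."""
    r0, c0 = r2 // 2, c2 // 2
    r1, c1 = r0 + (r2 % 2), c0 + (c2 % 2)
    return (ref[r0][c0] + ref[r0][c1] + ref[r1][c0] + ref[r1][c1]) / 4

def refine_half_pel(cur, ref, top, left, n, int_mv):
    """Fine step: test the integer MV and its 8 half-pixel neighbours."""
    best_mv, best_cost = None, None
    for h1 in (-1, 0, 1):
        for h2 in (-1, 0, 1):
            mv1_half = 2 * int_mv[0] + h1
            mv2_half = 2 * int_mv[1] + h2
            cost = 0.0
            for i in range(n):
                for j in range(n):
                    pred = sample_half_pel(ref,
                                           2 * (top + i) - mv1_half,
                                           2 * (left + j) - mv2_half)
                    cost += abs(cur[top + i][left + j] - pred)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mv1_half / 2, mv2_half / 2), cost
    return best_mv, best_cost

# Synthetic example: the current frame equals the reference sampled half a
# pixel to the right, i.e. a true motion vector of (0, -0.5).
ref = [[100 * r + 10 * c for c in range(8)] for r in range(8)]
cur = [[100 * r + 10 * c + 5 for c in range(8)] for r in range(8)]
print(refine_half_pel(cur, ref, top=2, left=2, n=4, int_mv=(0, 0)))
```

Only the nine candidates around the best integer MV are interpolated and compared, so the fine step adds a small, fixed cost on top of the integer search.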
  • 24. Example: MC-Prediction for Two Consecutive Frames
      [Figure: Previous Frame (Reference Frame) and Current Frame (To be Predicted); below, the Reference Frame and the Predicted Frame, each divided into blocks numbered 1-16]
  • 25. Example: MC-Prediction for Two Consecutive Frames (cont.)
      [Figure: Prediction of Current Frame; Prediction Error (Residual)]
  • 26. Block Matching Algorithm: Summary
      • Issues:
          – Block size?
          – Search range?
          – Motion vector accuracy?
      • Motion typically estimated only from luminance
      • Advantages:
          – Good, robust performance for compression
          – Resulting motion vector field is easy to represent (one MV per block) and useful for compression
          – Simple, periodic structure, easy VLSI implementations
      • Disadvantages:
          – Assumes translational motion model → Breaks down for more complex motion
          – Often produces blocking artifacts (OK for coding with Block DCT)
  • 27. Bi-Directional MC-Prediction
      [Figure: Previous Frame, Current Frame, and Future Frame, each divided into blocks numbered 1-16]
      • Bi-Directional MC-Prediction is used to estimate a block in the current frame from a block in:
          1) Previous frame
          2) Future frame
          3) Average of a block from the previous frame and a block from the future frame
          4) Neither, i.e. code current block without prediction
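Choosing among the four options for one block might look like the sketch below. The slide does not say how the choice is made, so the decision rule here is an assumption: pick the candidate with the smallest sum of absolute differences, and fall back to coding without prediction when even the best candidate exceeds a hypothetical threshold `intra_threshold`. Blocks are flattened to lists of pixel values, and the two prediction blocks are assumed to be already motion compensated.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def choose_b_block_mode(cur, prev_pred, future_pred, intra_threshold):
    """Return (mode, cost) for one block of a B-frame.

    prev_pred / future_pred: motion-compensated blocks from the previous
    and future reference frames. intra_threshold is a hypothetical tuning
    parameter, not something specified in the lecture.
    """
    average = [(p + f) / 2 for p, f in zip(prev_pred, future_pred)]
    costs = {
        "previous": sad(cur, prev_pred),
        "future": sad(cur, future_pred),
        "average": sad(cur, average),
    }
    mode = min(costs, key=costs.get)
    if costs[mode] > intra_threshold:
        return "intra", None  # option 4: code the block without prediction
    return mode, costs[mode]

# A block that lies halfway between its two references is best predicted
# by their average.
print(choose_b_block_mode([10, 20, 30, 40], [0, 10, 20, 30], [20, 30, 40, 50],
                          intra_threshold=200))
```

A practical encoder would more likely compare estimated bit costs rather than a fixed threshold; the threshold is used here only to keep the sketch short.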
  • 28. MC-Prediction and Bi-Directional MC-Prediction (P- and B-frames)
      • Motion compensated prediction: Predict the current frame based on reference frame(s) while compensating for the motion
      • Examples of block-based motion-compensated prediction (P-frame) and bi-directional prediction (B-frame):
      [Figure: Previous Frame → P-Frame; Previous Frame and Future Frame → B-Frame; each frame divided into blocks numbered 1-16]
  • 29. Video Compression
      • Main addition over image compression:
          – Exploit the temporal redundancy
      • Predict current frame based on previously coded frames
      • Three types of coded frames:
          – I-frame: Intra-coded frame, coded independently of all other frames
          – P-frame: Predictively coded frame, coded based on previously coded frame
          – B-frame: Bi-directionally predicted frame, coded based on both previous and future coded frames
      [Figure: example I-frame, P-frame, and B-frame]
  • 30. Example Use of I-, P-, B-frames: MPEG Group of Pictures (GOP)
      • Arrows show prediction dependencies between frames
      [Figure: MPEG GOP in display order: I0 B1 B2 P3 B4 B5 P6 B7 B8 I9]
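Because a B-frame is predicted from both a previous and a future coded frame, the future reference has to be coded before the B-frames that depend on it. The sketch below derives one such coding order from the display order shown on the slide. The function name is illustrative, and the rule (emit each I- or P-frame, then the B-frames that were waiting for it) is a simplification that assumes the GOP structure shown above.

```python
def coding_order(display_order):
    """Reorder frames so every B-frame follows both of its reference frames."""
    ordered, waiting_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            waiting_b.append(frame)    # needs the next I/P frame first
        else:
            ordered.append(frame)      # I or P frame: code it now
            ordered.extend(waiting_b)  # then the B-frames that needed it
            waiting_b = []
    ordered.extend(waiting_b)          # trailing B-frames, if any
    return ordered

display = "I0 B1 B2 P3 B4 B5 P6 B7 B8 I9".split()
print(" ".join(coding_order(display)))
```

The reordering is one reason B-frames add delay: the encoder must wait for the future reference frame before it can code the B-frames that precede it in display order.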
  • 31. Summary of Temporal Processing
      • Use MC-prediction (P and B frames) to reduce temporal redundancy
      • MC-prediction usually performs well; in compression there is a second chance to recover when it performs badly
      • MC-prediction yields:
          – Motion vectors
          – MC-prediction error or residual → Code error with conventional image coder
      • Sometimes MC-prediction may perform badly
          – Examples: Complex motion, new imagery (occlusions)
          – Approach:
              1. Identify frame or individual blocks where prediction fails
              2. Code without prediction
  • 32. Basic Video Compression Architecture
      • Exploiting the redundancies:
          – Temporal: MC-prediction (P and B frames)
          – Spatial: Block DCT
          – Color: Color space conversion
      • Scalar quantization of DCT coefficients
      • Zigzag scanning, runlength and Huffman coding of the nonzero quantized DCT coefficients
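The zigzag scan and runlength step can be sketched as follows. This is an illustration under assumptions: the scan order is the common JPEG/MPEG-style zigzag, each nonzero level is emitted as a (zero-run, value) pair, and an "EOB" marker closes the block. The Huffman stage that would assign codewords to these pairs is omitted, and the function names are illustrative.

```python
def zigzag_order(n=8):
    """Visit an n x n block along anti-diagonals, alternating direction,
    so low-frequency coefficients come first."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def run_length_encode(levels, n=8):
    """Turn quantized levels into (run of zeros, nonzero value) pairs."""
    pairs, run = [], 0
    for i, j in zigzag_order(n):
        value = levels[i][j]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by end-of-block
    return pairs

# Quantized block with three nonzero low-frequency levels.
levels = [[0] * 8 for _ in range(8)]
levels[0][0], levels[0][1], levels[2][0] = 50, 3, -2
print(run_length_encode(levels))
```

The zigzag order groups the zeros of the high-frequency coefficients into long runs, so a block with few nonzero levels is described by only a handful of symbols.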
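  • 33. Example Video Encoder
      [Block diagram, forward path: Input Video Signal → RGB to YUV → subtract MC-Prediction to form Residual → DCT → Quantize → Huffman Coding → Buffer → Output Bitstream; Buffer fullness is fed back to the quantizer]
      [Block diagram, prediction loop: Quantize → Inverse Quantize → Inverse DCT → add MC-Prediction → Frame Store (Previous Reconstructed Frame) → Motion Compensation → MC-Prediction; Motion Estimation supplies MV data to Motion Compensation and to Huffman Coding]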
  • 34. Example Video Decoder
      [Block diagram: Input Bitstream → Buffer → Huffman Decoder → Inverse Quantize → Inverse DCT → Reconstructed Residual → add MC-Prediction → Reconstructed Frame → YUV to RGB → Output Video Signal; MV data from the Huffman Decoder drives Motion Compensation, which reads the Previous Reconstructed Frame from the Frame Store]
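Both diagrams keep the previous reconstructed frame, not the previous original frame, in the frame store. The toy sketch below illustrates why: the encoder repeats the decoder's reconstruction, so both sides form identical predictions and quantization error does not accumulate. It is heavily simplified and rests on stated assumptions: frames are short lists of pixel values, prediction uses zero motion, and the DCT and entropy coding are left out so that only the prediction loop and the quantizer remain.

```python
def encode(frames, step):
    """Closed-loop predictive coder: quantize the residual against the
    previous reconstructed frame, then update the frame store."""
    frame_store = [0] * len(frames[0])
    all_levels, recon_frames = [], []
    for frame in frames:
        levels = [round((x - p) / step) for x, p in zip(frame, frame_store)]
        # Same reconstruction the decoder will perform.
        frame_store = [p + q * step for p, q in zip(frame_store, levels)]
        all_levels.append(levels)
        recon_frames.append(frame_store)
    return all_levels, recon_frames

def decode(all_levels, step):
    """Decoder: add each dequantized residual to its own frame store."""
    frame_store = [0] * len(all_levels[0])
    decoded = []
    for levels in all_levels:
        frame_store = [p + q * step for p, q in zip(frame_store, levels)]
        decoded.append(frame_store)
    return decoded

frames = [[10, 20, 30], [12, 25, 29], [15, 22, 40]]
levels, encoder_recon = encode(frames, step=4)
print(decode(levels, step=4) == encoder_recon)
```

With this structure the reconstruction error of every frame stays within half a quantizer step, regardless of how many predicted frames follow one another.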
  • 36. Motivation for Scalable Coding
      Basic situation:
          1. Diverse receivers may request the same video
              – Different bandwidths, spatial resolutions, frame rates, computational capabilities
          2. Heterogeneous networks and a priori unknown network conditions
              – Wired and wireless links, time-varying bandwidths
          → When you originally code the video you don’t know which client or network situation will exist in the future
          → Probably have multiple different situations, each requiring a different compressed bitstream
          → Need a different compressed video matched to each situation
      • Possible solutions:
          1. Compress & store MANY different versions of the same video
          2. Real-time transcoding (e.g. decode/re-encode)
          3. Scalable coding
  • 37. Scalable Video Coding
      • Scalable coding:
          – Decompose video into multiple layers of prioritized importance
          – Code layers into base and enhancement bitstreams
          – Progressively combine one or more bitstreams to produce different levels of video quality
      • Example of scalable coding with base and two enhancement layers: Can produce three different qualities (quality increases from 1 to 3)
          1. Base layer
          2. Base + Enh1 layers
          3. Base + Enh1 + Enh2 layers
      • Scalability with respect to: Spatial or temporal resolution, bit rate, computation, memory
  • 38. Example of Scalable Coding
      • Encode image/video into three layers: Encoder → Base, Enh1, Enh2
      • Low-bandwidth receiver: Send only Base layer (Base → Decoder → Low Res)
      • Medium-bandwidth receiver: Send Base & Enh1 layers (Base + Enh1 → Decoder → Med Res)
      • High-bandwidth receiver: Send all three layers (Base + Enh1 + Enh2 → Decoder → High Res)
      • Can adapt to different clients and network situations
  • 39. Scalable Video Coding (cont.)
      • Three basic types of scalability (refine video quality along three different dimensions):
          – Temporal scalability → Temporal resolution
          – Spatial scalability → Spatial resolution
          – SNR (quality) scalability → Amplitude resolution
      • Each type of scalable coding provides scalability of one dimension of the video signal
          – Can combine multiple types of scalability to provide scalability along multiple dimensions
  • 40. Scalable Coding: Temporal Scalability
      • Temporal scalability: Based on the use of B-frames to refine the temporal resolution
          – B-frames are dependent on other frames
          – However, no other frame depends on a B-frame
          – Each B-frame may be discarded without affecting other frames
      [Figure: MPEG GOP: I0 B1 B2 P3 B4 B5 P6 B7 B8 I9]
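Since no frame depends on a B-frame, a lower frame rate can be obtained simply by dropping B-frames from the stream. The sketch below shows this on the GOP from the slide; the frame labels and the function name are illustrative.

```python
def drop_b_frames(gop):
    """Base temporal layer: keep only I- and P-frames.
    Safe because no other frame is predicted from a B-frame."""
    return [frame for frame in gop if not frame.startswith("B")]

gop = "I0 B1 B2 P3 B4 B5 P6 B7 B8 I9".split()
base_layer = drop_b_frames(gop)
print(" ".join(base_layer))
print(f"kept {len(base_layer)} of {len(gop)} frames")
```

With two B-frames between reference frames, the base layer runs at one third of the full frame rate, and the B-frames form the temporal enhancement layer.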
  • 41. Scalable Coding: Spatial Scalability
      • Spatial scalability: Based on refining the spatial resolution
          – Base layer is low resolution version of video
          – Enh1 contains coded difference between upsampled base layer and original video
          – Also called: Pyramid coding
      [Block diagram, base layer: Original Video → ↓2 → Enc → Dec → Low-Res Video]
      [Block diagram, enhancement layer: Original Video minus the upsampled (↑2) decoded base layer → Enc → Dec; decoded enhancement plus upsampled (↑2) decoded base layer → High-Res Video]
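The two-layer pyramid can be sketched on a 1-D signal. This is only an illustration under assumptions: downsampling averages pairs of samples, upsampling repeats each sample, and the encoder/decoder stages of both layers are omitted, so the enhancement layer restores the original exactly. In a real coder both layers would be lossily coded.

```python
def downsample2(signal):
    """Halve the resolution by averaging neighbouring pairs (the ↓2 block)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upsample2(signal):
    """Double the resolution by repeating each sample (the ↑2 block)."""
    return [s for s in signal for _ in range(2)]

def encode_pyramid(signal):
    base = downsample2(signal)
    # Enhancement layer: difference between original and upsampled base.
    enh = [x - u for x, u in zip(signal, upsample2(base))]
    return base, enh

def decode_pyramid(base, enh=None):
    low_res = upsample2(base)
    if enh is None:
        return low_res                          # base layer only
    return [u + e for u, e in zip(low_res, enh)]  # base + enhancement

signal = [10, 12, 20, 24, 30, 30, 8, 0]
base, enh = encode_pyramid(signal)
print(base, enh, decode_pyramid(base, enh))
```

A receiver that gets only `base` can still display a low-resolution version, while a receiver that also gets `enh` recovers the full resolution.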
  • 42. Scalable Coding: SNR (Quality) Scalability
      • SNR (Quality) Scalability: Based on refining the amplitude resolution
          – Base layer uses a coarse quantizer
          – Enh1 applies a finer quantizer to the difference between the original DCT coefficients and the coarsely quantized base layer coefficients
      [Figure: enhancement-layer frames (EI frame, EP frame) above the base-layer frames (I frame, P-frame)]
      Note: Base & enhancement layers are at the same spatial resolution
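The coarse-then-fine quantization described above can be sketched on a few DCT coefficients. The step sizes (16 for the base layer, 4 for the enhancement layer) and the function names are assumptions chosen for illustration.

```python
def snr_scalable_encode(coeffs, base_step=16, enh_step=4):
    """Base layer: coarse quantizer. Enhancement layer: finer quantizer
    applied to what the coarse quantizer left over."""
    base_levels = [round(c / base_step) for c in coeffs]
    leftovers = [c - q * base_step for c, q in zip(coeffs, base_levels)]
    enh_levels = [round(d / enh_step) for d in leftovers]
    return base_levels, enh_levels

def snr_scalable_decode(base_levels, enh_levels=None, base_step=16, enh_step=4):
    recon = [q * base_step for q in base_levels]
    if enh_levels is not None:
        recon = [r + e * enh_step for r, e in zip(recon, enh_levels)]
    return recon

coeffs = [45, -20, 7]
base_levels, enh_levels = snr_scalable_encode(coeffs)
print("base only:   ", snr_scalable_decode(base_levels))
print("base + enh1: ", snr_scalable_decode(base_levels, enh_levels))
```

Decoding the base layer alone gives a coarse approximation of each coefficient, and adding the enhancement layer brings every coefficient to within half of the finer step size.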
  • 43. Summary of Scalable Video Coding
      • Three basic types of scalable video coding:
          – Temporal scalability
          – Spatial scalability
          – SNR (quality) scalability
      • Scalable coding produces different layers with prioritized importance
      • Prioritized importance is key for a variety of applications:
          – Adapting to different bandwidths, or client resources such as spatial or temporal resolution or computational power
          – Facilitates error-resilience by explicitly identifying most important and less important bits
  • 45. Motivation for Standards
      • Goal of standards:
          – Ensuring interoperability: Enabling communication between devices made by different manufacturers
          – Promoting a technology or industry
          – Reducing costs
  • 46. What do the Standards Specify?
      [Diagram: Encoder → Bitstream → Decoder]
  • 47. What do the Standards Specify?
      [Diagram: Encoder → Bitstream → Decoder (Decoding Process); the Scope of Standardization covers the bitstream and the decoding process]
      • Not the encoder
      • Not the decoder
      • Just the bitstream syntax and the decoding process (e.g. use IDCT, but not how to implement the IDCT)
        → Enables improved encoding & decoding strategies to be employed in a standard-compatible manner
  • 48. Current Image and Video Compression Standards

      Standard             Application                                             Bit Rate
      -------------------  ------------------------------------------------------  -------------------
      JPEG                 Continuous-tone still-image compression                 Variable
      H.261                Video telephony and teleconferencing over ISDN          p x 64 kb/s
      MPEG-1               Video on digital storage media (CD-ROM)                 1.5 Mb/s
      MPEG-2               Digital Television                                      2-20 Mb/s
      H.263                Video telephony over PSTN                               33.6-? kb/s
      MPEG-4               Object-based coding, synthetic content, interactivity   Variable
      JPEG-2000            Improved still image compression                        Variable
      H.264 / MPEG-4 AVC   Improved video compression                              10’s to 100’s kb/s
  • 49. Comparing Current Video Compression Standards
      • Based on the same fundamental building blocks
          – Motion-compensated prediction (I, P, and B frames)
          – 2-D Discrete Cosine Transform (DCT)
          – Color space conversion
          – Scalar quantization, runlengths, Huffman coding
      • Additional tools added for different applications:
          – Progressive or interlaced video
          – Improved compression, error resilience, scalability, etc.
      • MPEG-1/2/4, H.261/3/4: Frame-based coding
      • MPEG-4: Object-based coding and Synthetic video
  • 50. MPEG Group of Pictures (GOP) Structure
      • Composed of I, P, and B frames
      • Arrows show prediction dependencies
      • Periodic I-frames enable random access into the coded bitstream
      • Parameters: (1) Spacing between I frames, (2) number of B frames between I and P frames
      [Figure: MPEG GOP: I0 B1 B2 P3 B4 B5 P6 B7 B8 I9]
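The two GOP parameters are enough to generate the frame-type pattern in the figure. The sketch below is illustrative: the function and parameter names (`i_spacing`, `num_b`) are assumptions, and it assumes the I-frame spacing is a multiple of `num_b + 1` so the pattern stays regular.

```python
def gop_pattern(num_frames, i_spacing, num_b):
    """Frame types in display order.

    i_spacing: distance between consecutive I-frames (parameter 1)
    num_b:     number of B-frames between reference (I or P) frames (parameter 2)
    """
    frames = []
    for n in range(num_frames):
        position = n % i_spacing
        if position == 0:
            kind = "I"
        elif position % (num_b + 1) == 0:
            kind = "P"
        else:
            kind = "B"
        frames.append(f"{kind}{n}")
    return frames

# The GOP from the slide: I-frames every 9 frames, 2 B-frames between references.
print(" ".join(gop_pattern(10, i_spacing=9, num_b=2)))
```

Shorter I-frame spacing gives finer random access at the cost of more bits, while more B-frames usually improve compression at the cost of extra delay and frame storage.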
  • 51. MPEG Structure
      • MPEG codes video in a hierarchy of layers. The sequence layer is not shown.
      [Figure: GOP Layer (I B B P B B P) → Picture Layer → Slice Layer → Macroblock Layer (4 8x8 DCT, 1 MV) → Block Layer (8x8 DCT)]
  • 52. MPEG-2 Profiles and Levels
      • Goal: To enable more efficient implementations for different applications (interoperability points)
          – Profile: Subset of the tools applicable for a family of applications
          – Level: Bounds on the complexity for any profile
      [Figure: grid of Levels (Low, Main, High) against Profiles (Simple, Main, High)]
          – HDTV: Main Profile at High Level (MP@HL)
          – DVD & SD Digital TV: Main Profile at Main Level (MP@ML)
  • 53. MPEG-4 Natural Video Coding
      • Extension of MPEG-1/2-type algorithms to code arbitrarily shaped objects
      [Figure: Frame-based Coding vs. Object-based Coding; source: MPEG Committee]
      Basic Idea: Extend Block-DCT and Block-ME/MC-prediction to code arbitrarily shaped objects
  • 54. Example of MPEG-4 Scene (Object-based Coding)
      [Figure: example scene; source: MPEG Committee]
  • 55. Example MPEG-4 Object Decoding Process
      [Figure: object decoding process; source: MPEG Committee]
  • 56. Sprite Coding (Background Prediction)
      • Sprite: Large background image
          – Hypothesis: Same background exists for many frames, changes resulting from camera motion and occlusions
      • One possible coding strategy:
          1. Code & transmit entire sprite once
          2. Only transmit camera motion parameters for each subsequent frame
      • Significant coding gain for some scenes
  • 57. Sprite Coding Example
      [Figure: Sprite (background), Foreground Object, and Reconstructed Frame; source: MPEG Committee]
  • 58. Review of Today’s Lecture
      • Motivation for compression
      • Brief review of generic compression system (from prior lecture)
      • Brief review of image compression (from last lecture)
      • Video compression
          – Exploit temporal dimension of video signal
          – Motion-compensated prediction
          – Generic (MPEG-type) video coder architecture
          – Scalable video coding
      • Overview of current video compression standards
          – What do the standards specify?
          – Frame-based video coding: MPEG-1/2/4, H.261/3/4
          – Object-based video coding: MPEG-4
  • 59. References and Further Reading
      General Video Compression References:
      • J.G. Apostolopoulos and S.J. Wee, "Video Compression Standards", Wiley Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, Inc., New York, 1999.
      • V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architectures, Kluwer Academic Publishers, Boston, Massachusetts, 1997.
      • J.L. Mitchell, W.B. Pennebaker, C.E. Fogg, and D.J. LeGall, MPEG Video Compression Standard, Chapman & Hall, New York, 1997.
      • B.G. Haskell, A. Puri, A.N. Netravali, Digital Video: An Introduction to MPEG-2, Kluwer Academic Publishers, Boston, 1997.
      MPEG web site: http://drogo.cselt.stet.it/mpeg