SlideShare una empresa de Scribd logo
1 de 30
CS 414 - Spring 2010
CS 414 – Multimedia Systems Design
Lecture 10 – MPEG-1 Video
and MP3 Audio) (Part 5)
Klara Nahrstedt
Spring 2010
CS 414 - Spring 2010
Administrative
 MP2 posted on February 9 and deadline
will be March 1
 MP2 demos – Monday, March 1, 5-7pm,
0216 SC
 Please, start early
 We will have a discussion section next
week for MP2
Outline
 Few Comments about losses that occur in
JPEG (and similar in MPEG later on)
 MPEG-1 Hybrid Coding
 MP3 Audio Encoding
 Reading:
 Required: Media Coding Book, Section 7.7.1-7.7.4 and lectures
 Recommended Paper on MP3: Davis Pan, “A Tutorial on MPEG/Audio
Compression”, IEEE Multimedia, pp. 6-74, 1995
 Recommended books on JPEG/ MPEG Audio/Video Fundamentals:
 Haskell, Puri, Netravali, “Digital Video: An Introduction to MPEG-2”,
Chapman and Hall, 1996
 Pennebaker, Mitchel, “JPEG: Still Image Data Compression Standard,
VNR Publisher, 1993
CS 414 - Spring 2010
Comment on JPEG Losses
 Losses in JPEG compression occur during the
quantization because S(u,v)’ = Round(S(u,v)/Q(u,v)).
1. Image Preparation =>
2. DCT Transform S(u,v) = DCT(S(x,y)) =>
3. Quantization with S(u,v)’ = Round(S(u,v)/Q(u,v) =>
4. Entropy Encoding =>
5. Transmission of compressed Image =>
6. Entropy Decoding =>
7. De-quantization S(u,v)’’ = S(u,v)’*Q(u,v) Compression Loss
since S(u,v)” ≠ S(u,v) =>
8. Apply Inverse DCT =>
9. Display Image CS 414 - Spring 2010
Motion Picture Expert Group
(MPEG)
 General Information about MPEG
Began in 1988; Part of Same ISO as JPEG
 MPEG-1/Video
 MPEG/Audio – MP3
 MPEG-2
 MPEG-4
 MPEG-7
 MPEG-21 CS 414 - Spring 2010
MPEG General Information
 Goal: data compression 1.5 Mbps
 MPEG defines video, audio coding and
system data streams with synchronization
 MPEG information
Aspect ratios: 1:1 (CRT), 4:3 (NTSC), 16:9
(HDTV)
Refresh frequencies: 23.975, 24, 25, 29.97,
50, 59.94, 60 Hz
CS 414 - Spring 2010
MPEG Image Preparation
(Resolution and Dimension)
 MPEG defines exactly format
 Three components: Luminance and two chrominance
components (2:1:1)
 Resolution of luminance comp:X1 ≤ 768; Y1 ≤ 576
pixels
 Pixel precision is 8 bits for each component
 Example of Video format: 352x240 pixels, 30
fps; chrominance components: 176x120 pixels
CS 414 - Spring 2010
MPEG Image Preparation -
Blocks
 Each image is divided into macro-blocks
 Macro-block : 16x16 pixels for luminance;
8x8 for each chrominance component
 Macro-blocks are useful for Motion
Estimation
 No MCUs which implies sequential non-
interleaving order of pixels values
CS 414 - Spring 2010
MPEG Video Processing
 Intra frames (same as JPEG)
typically about 12 frames between I frames
 Predictive frames
encode from previous I or P reference frame
 Bi-directional frames
encode from previous and future I or P frames
CS 414 - Spring 2010
I P I
P P
B B B B B B B B
Selecting I, P, or B Frames
 Heuristics
change of scenes should generate I frame
limit B and P frames between I frames
B frames are computationally intense
Type Size Compress
I 18K 7:1
P 6K 20:1
B 2.5K 50:1
Avg 4.8K 27:1
CS 414 - Spring 2010
MPEG Video I-Frames
CS 414 - Spring 2010
Intra-coded images
I-frames – points of
random access in
MPEG stream
I-frames use 8x8
blocks defined within
Macro-block
No quantization
table for all DCT
coefficients, only
quantization factor
MPEG Video P-Frames
CS 414 - Spring 2010
Predictive coded frames
require information of
previous I frame and or
previous P frame for
encoding/decoding
For Temporary Redundancy
we determine last P or I frame
that is most similar to the
block under consideration
Motion Estimation Method
Motion Computation for P
Frames
 Predictive search
 Look for match window within a given
search window
Match window – macro-block
Search window – arbitrary window size
depending how far away are we willing to look
 Displacement of two match windows is
expressed by motion vector
CS 414 - Spring 2010
Matching Methods
 SSD metric
 SAD metric
 Minimum error represents best match
 must be below a specified threshold
 error and perceptual similarity not always correlated





1
0
2
)
(
N
i
i
i y
x
SSD





1
0
|
|
N
i
i
i y
x
SAD
CS 414 - Spring 2010
CS 414 - Spring 2010
Example of Finding Minimal SSD
CS 414 - Spring 2010
Example of Comparing Minimal SSD and
SAD
Syntax of P Frame
CS 414 - Spring 2010
Addr: address the syntax of P frame
Type: INTRA block is specified if no good match was found
Quant: quantization value per macro-block (vary quantization to
fine-tune compression)
Motion Vector: a 2D vector used for motion compensation provides
offset from coordinate position in target image to
coordinates in reference image
CBP(Coded Block Pattern): bit mask indicates which blocks are
present
MPEG Video B Frames
CS 414 - Spring 2010
Bi-directionally Predictive-coded frames
MPEG Video Decoding
CS 414 - Spring 2010
I1 P1 I2
P2 P3
B1 B2 B3 B4 B5 B6 B7 B8
B2
I1 P1 I2
P2 P3
B1 B3 B4 B5 B6 B7 B8
Display Order
Decoding Order
MPEG Video Quantization
 AC coefficients of B/P frames are usually
large values, I frames have smaller values
Adjust quantization
 If data rate increases over threshold, then
quantization enlarges step size (increase
quantization factor Q)
 If data rate decreases below threshold,
then quantization decreases Q
CS 414 - Spring 2010
MPEG-1 Interchange Format
CS 414 - Spring 2010
GOP GOP
...
Seq Seq Seq
…
Seq
Seq
SC
Video
Param
Bitstream
Param
QT,
misc
Pict Pict
...
GOP
SC
GOP
Param
Time
Code
MB MB
...
SSC QScale
Vert
Pos
Slice Slice
...
PSC Type Buffer
Param
Encode
Param
CBP b5
...
Addr Type Motion
Vector
QScale
b0
GOP Layer
Sequence Layer
Picture Layer
Slice Layer
Macro-block Layer
Block Layer
MPEG Audio Encoding
 Characteristics
Precision 16 bits
Sampling frequency: 32KHz, 44.1 KHz, 48
KHz
3 compression layers: Layer 1, Layer 2, Layer
3 (MP3)
 Layer 3: 32-320 kbps, target 64 kbps
 Layer 2: 32-384 kbps, target 128 kbps
 Layer 1: 32-448 kbps, target 192 kbps
CS 414 - Spring 2010
MPEG Audio Encoding Steps
CS 414 - Spring 2010
MPEG Audio Filter Bank
 Filter bank divides input into multiple sub-bands
(32 equal frequency sub-bands)
 Sub-band i defined
 - filter output sample for sub-band
i at time t, C[n] – one of 512 coefficients, x[n] –
audio input sample from 512 sample buffer
CS 414 - Spring 2010
]
64
[
*
]
64
[
(
*
64
)
16
)(
1
2
(
cos(
3
]
[
7
0
7
0
j
k
x
j
k
C
k
i
i
S
j
k
t 



 
 


]
[
],
31
,
0
[ i
S
i t

MPEG Audio Psycho-acoustic
Model
 MPEG audio compresses by removing
acoustically irrelevant parts of audio signals
 Takes advantage of human auditory systems
inability to hear quantization noise under
auditory masking
 Auditory masking: occurs when ever the
presence of a strong audio signal makes a
temporal or spectral neighborhood of weaker
audio signals imperceptible.
CS 414 - Spring 2010
CS 414 - Spring 2010
MPEG/audio divides audio signal into frequency sub-bands that approximate critical
bands. Then we quantize each sub-band according to the audibility of quantization
noise within the band
MPEG Audio Bit Allocation
 This process determines number of code bits allocated to
each sub-band based on information from the psycho-
acoustic model
 Algorithm:
1. Compute mask-to-noise ratio: MNR=SNR-SMR
 Standard provides tables that give estimates for SNR resulting
from quantizing to a given number of quantizer levels
2. Get MNR for each sub-band
3. Search for sub-band with the lowest MNR
4. Allocate code bits to this sub-band.
 If sub-band gets allocated more code bits than appropriate, look
up new estimate of SNR and repeat step 1
CS 414 - Spring 2010
MP3 Audio Format
CS 414 - Spring 2010
Source: http://wiki.hydrogenaudio.org/images/e/ee/Mp3filestructure.jpg
MPEG Audio Comments
 Precision of 16 bits per sample is needed to get
good SNR ratio
 Noise we are getting is quantization noise from
the digitization process
 For each added bit, we get 6dB better SNR ratio
 Masking effect means that we can raise the
noise floor around a strong sound because the
noise will be masked away
 Raising noise floor is the same as using less bits
and using less bits is the same as compression
CS 414 - Spring 2010
Conclusion
 MPEG system data stream
 Interchange format for audio and video streams
 Interleave audio and video packets; insert time stamps into each “frame”
of data
 Synchronization
 SRC – system clock reference; DTS – decoding time stamp; PTS –
presentation time stamp
 During encoding
 Insert SCR values into system stream
 Stamp each frame with PTS and DTS
 Use encode time to approximate decode time
 During decoding
 Initialize local decoder clock with start values
 Compare PTS to the value of local clock
 Periodically synchronize local clock to SCR
CS 414 - Spring 2010

Más contenido relacionado

Similar a lect10-mpeg1.ppt

Mm01 a vformat
Mm01 a vformatMm01 a vformat
Mm01 a vformatgotovikas
 
Chapter 5 - Data Compression
Chapter 5 - Data CompressionChapter 5 - Data Compression
Chapter 5 - Data CompressionPratik Pradhan
 
MPEG 3D graphics compression offer
MPEG 3D graphics compression offerMPEG 3D graphics compression offer
MPEG 3D graphics compression offerMarius Preda PhD
 
Video Coding Standard
Video Coding StandardVideo Coding Standard
Video Coding StandardVideoguy
 
audiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptxaudiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptxPawachMetharattanara
 
An overview Survey on Various Video compressions and its importance
An overview Survey on Various Video compressions and its importanceAn overview Survey on Various Video compressions and its importance
An overview Survey on Various Video compressions and its importanceINFOGAIN PUBLICATION
 
Multimedia
MultimediaMultimedia
MultimediaBUDNET
 
Image compression and it’s security1
Image compression and it’s security1Image compression and it’s security1
Image compression and it’s security1Reyad Hossain
 
damaro.ppt
damaro.pptdamaro.ppt
damaro.pptVideoguy
 
J03502050055
J03502050055J03502050055
J03502050055theijes
 
Digital Video And Compression
Digital Video And CompressionDigital Video And Compression
Digital Video And CompressionRobert Burk
 
MULTECH2 LESSON 5.pdf
MULTECH2 LESSON 5.pdfMULTECH2 LESSON 5.pdf
MULTECH2 LESSON 5.pdfRayCenteno1
 
Video Compression Technology
Video Compression TechnologyVideo Compression Technology
Video Compression TechnologyTong Teerayuth
 

Similar a lect10-mpeg1.ppt (20)

Mm01 a vformat
Mm01 a vformatMm01 a vformat
Mm01 a vformat
 
Compression
CompressionCompression
Compression
 
Jpeg and mpeg ppt
Jpeg and mpeg pptJpeg and mpeg ppt
Jpeg and mpeg ppt
 
Chapter 5 - Data Compression
Chapter 5 - Data CompressionChapter 5 - Data Compression
Chapter 5 - Data Compression
 
MPEG 3D graphics compression offer
MPEG 3D graphics compression offerMPEG 3D graphics compression offer
MPEG 3D graphics compression offer
 
Mpeg 2
Mpeg 2Mpeg 2
Mpeg 2
 
Video Coding Standard
Video Coding StandardVideo Coding Standard
Video Coding Standard
 
Audio compression
Audio compressionAudio compression
Audio compression
 
audiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptxaudiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptx
 
video comparison
video comparison video comparison
video comparison
 
An overview Survey on Various Video compressions and its importance
An overview Survey on Various Video compressions and its importanceAn overview Survey on Various Video compressions and its importance
An overview Survey on Various Video compressions and its importance
 
Multimedia
MultimediaMultimedia
Multimedia
 
Compression
CompressionCompression
Compression
 
Compression
CompressionCompression
Compression
 
Image compression and it’s security1
Image compression and it’s security1Image compression and it’s security1
Image compression and it’s security1
 
damaro.ppt
damaro.pptdamaro.ppt
damaro.ppt
 
J03502050055
J03502050055J03502050055
J03502050055
 
Digital Video And Compression
Digital Video And CompressionDigital Video And Compression
Digital Video And Compression
 
MULTECH2 LESSON 5.pdf
MULTECH2 LESSON 5.pdfMULTECH2 LESSON 5.pdf
MULTECH2 LESSON 5.pdf
 
Video Compression Technology
Video Compression TechnologyVideo Compression Technology
Video Compression Technology
 

Último

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Último (20)

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

lect10-mpeg1.ppt

  • 1. CS 414 - Spring 2010 CS 414 – Multimedia Systems Design Lecture 10 – MPEG-1 Video and MP3 Audio) (Part 5) Klara Nahrstedt Spring 2010
  • 2. CS 414 - Spring 2010 Administrative  MP2 posted on February 9 and deadline will be March 1  MP2 demos – Monday, March 1, 5-7pm, 0216 SC  Please, start early  We will have a discussion section next week for MP2
  • 3. Outline  Few Comments about losses that occur in JPEG (and similar in MPEG later on)  MPEG-1 Hybrid Coding  MP3 Audio Encoding  Reading:  Required: Media Coding Book, Section 7.7.1-7.7.4 and lectures  Recommended Paper on MP3: Davis Pan, “A Tutorial on MPEG/Audio Compression”, IEEE Multimedia, pp. 6-74, 1995  Recommended books on JPEG/ MPEG Audio/Video Fundamentals:  Haskell, Puri, Netravali, “Digital Video: An Introduction to MPEG-2”, Chapman and Hall, 1996  Pennebaker, Mitchel, “JPEG: Still Image Data Compression Standard, VNR Publisher, 1993 CS 414 - Spring 2010
  • 4. Comment on JPEG Losses  Losses in JPEG compression occur during the quantization because S(u,v)’ = Round(S(u,v)/Q(u,v)). 1. Image Preparation => 2. DCT Transform S(u,v) = DCT(S(x,y)) => 3. Quantization with S(u,v)’ = Round(S(u,v)/Q(u,v) => 4. Entropy Encoding => 5. Transmission of compressed Image => 6. Entropy Decoding => 7. De-quantization S(u,v)’’ = S(u,v)’*Q(u,v) Compression Loss since S(u,v)” ≠ S(u,v) => 8. Apply Inverse DCT => 9. Display Image CS 414 - Spring 2010
  • 5. Motion Picture Expert Group (MPEG)  General Information about MPEG Began in 1988; Part of Same ISO as JPEG  MPEG-1/Video  MPEG/Audio – MP3  MPEG-2  MPEG-4  MPEG-7  MPEG-21 CS 414 - Spring 2010
  • 6. MPEG General Information  Goal: data compression 1.5 Mbps  MPEG defines video, audio coding and system data streams with synchronization  MPEG information Aspect ratios: 1:1 (CRT), 4:3 (NTSC), 16:9 (HDTV) Refresh frequencies: 23.975, 24, 25, 29.97, 50, 59.94, 60 Hz CS 414 - Spring 2010
  • 7. MPEG Image Preparation (Resolution and Dimension)  MPEG defines exactly format  Three components: Luminance and two chrominance components (2:1:1)  Resolution of luminance comp:X1 ≤ 768; Y1 ≤ 576 pixels  Pixel precision is 8 bits for each component  Example of Video format: 352x240 pixels, 30 fps; chrominance components: 176x120 pixels CS 414 - Spring 2010
  • 8. MPEG Image Preparation - Blocks  Each image is divided into macro-blocks  Macro-block : 16x16 pixels for luminance; 8x8 for each chrominance component  Macro-blocks are useful for Motion Estimation  No MCUs which implies sequential non- interleaving order of pixels values CS 414 - Spring 2010
  • 9. MPEG Video Processing  Intra frames (same as JPEG) typically about 12 frames between I frames  Predictive frames encode from previous I or P reference frame  Bi-directional frames encode from previous and future I or P frames CS 414 - Spring 2010 I P I P P B B B B B B B B
  • 10. Selecting I, P, or B Frames  Heuristics change of scenes should generate I frame limit B and P frames between I frames B frames are computationally intense Type Size Compress I 18K 7:1 P 6K 20:1 B 2.5K 50:1 Avg 4.8K 27:1 CS 414 - Spring 2010
  • 11. MPEG Video I-Frames CS 414 - Spring 2010 Intra-coded images I-frames – points of random access in MPEG stream I-frames use 8x8 blocks defined within Macro-block No quantization table for all DCT coefficients, only quantization factor
  • 12. MPEG Video P-Frames CS 414 - Spring 2010 Predictive coded frames require information of previous I frame and or previous P frame for encoding/decoding For Temporary Redundancy we determine last P or I frame that is most similar to the block under consideration Motion Estimation Method
  • 13. Motion Computation for P Frames  Predictive search  Look for match window within a given search window Match window – macro-block Search window – arbitrary window size depending how far away are we willing to look  Displacement of two match windows is expressed by motion vector CS 414 - Spring 2010
  • 14. Matching Methods  SSD metric  SAD metric  Minimum error represents best match  must be below a specified threshold  error and perceptual similarity not always correlated      1 0 2 ) ( N i i i y x SSD      1 0 | | N i i i y x SAD CS 414 - Spring 2010
  • 15. CS 414 - Spring 2010 Example of Finding Minimal SSD
  • 16. CS 414 - Spring 2010 Example of Comparing Minimal SSD and SAD
  • 17. Syntax of P Frame CS 414 - Spring 2010 Addr: address the syntax of P frame Type: INTRA block is specified if no good match was found Quant: quantization value per macro-block (vary quantization to fine-tune compression) Motion Vector: a 2D vector used for motion compensation provides offset from coordinate position in target image to coordinates in reference image CBP(Coded Block Pattern): bit mask indicates which blocks are present
  • 18. MPEG Video B Frames CS 414 - Spring 2010 Bi-directionally Predictive-coded frames
  • 19. MPEG Video Decoding CS 414 - Spring 2010 I1 P1 I2 P2 P3 B1 B2 B3 B4 B5 B6 B7 B8 B2 I1 P1 I2 P2 P3 B1 B3 B4 B5 B6 B7 B8 Display Order Decoding Order
  • 20. MPEG Video Quantization  AC coefficients of B/P frames are usually large values, I frames have smaller values Adjust quantization  If data rate increases over threshold, then quantization enlarges step size (increase quantization factor Q)  If data rate decreases below threshold, then quantization decreases Q CS 414 - Spring 2010
  • 21. MPEG-1 Interchange Format CS 414 - Spring 2010 GOP GOP ... Seq Seq Seq … Seq Seq SC Video Param Bitstream Param QT, misc Pict Pict ... GOP SC GOP Param Time Code MB MB ... SSC QScale Vert Pos Slice Slice ... PSC Type Buffer Param Encode Param CBP b5 ... Addr Type Motion Vector QScale b0 GOP Layer Sequence Layer Picture Layer Slice Layer Macro-block Layer Block Layer
  • 22. MPEG Audio Encoding  Characteristics Precision 16 bits Sampling frequency: 32KHz, 44.1 KHz, 48 KHz 3 compression layers: Layer 1, Layer 2, Layer 3 (MP3)  Layer 3: 32-320 kbps, target 64 kbps  Layer 2: 32-384 kbps, target 128 kbps  Layer 1: 32-448 kbps, target 192 kbps CS 414 - Spring 2010
  • 23. MPEG Audio Encoding Steps CS 414 - Spring 2010
  • 24. MPEG Audio Filter Bank  Filter bank divides input into multiple sub-bands (32 equal frequency sub-bands)  Sub-band i defined  - filter output sample for sub-band i at time t, C[n] – one of 512 coefficients, x[n] – audio input sample from 512 sample buffer CS 414 - Spring 2010 ] 64 [ * ] 64 [ ( * 64 ) 16 )( 1 2 ( cos( 3 ] [ 7 0 7 0 j k x j k C k i i S j k t           ] [ ], 31 , 0 [ i S i t 
  • 25. MPEG Audio Psycho-acoustic Model  MPEG audio compresses by removing acoustically irrelevant parts of audio signals  Takes advantage of human auditory systems inability to hear quantization noise under auditory masking  Auditory masking: occurs when ever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible. CS 414 - Spring 2010
  • 26. CS 414 - Spring 2010 MPEG/audio divides audio signal into frequency sub-bands that approximate critical bands. Then we quantize each sub-band according to the audibility of quantization noise within the band
  • 27. MPEG Audio Bit Allocation  This process determines number of code bits allocated to each sub-band based on information from the psycho- acoustic model  Algorithm: 1. Compute mask-to-noise ratio: MNR=SNR-SMR  Standard provides tables that give estimates for SNR resulting from quantizing to a given number of quantizer levels 2. Get MNR for each sub-band 3. Search for sub-band with the lowest MNR 4. Allocate code bits to this sub-band.  If sub-band gets allocated more code bits than appropriate, look up new estimate of SNR and repeat step 1 CS 414 - Spring 2010
  • 28. MP3 Audio Format CS 414 - Spring 2010 Source: http://wiki.hydrogenaudio.org/images/e/ee/Mp3filestructure.jpg
  • 29. MPEG Audio Comments  Precision of 16 bits per sample is needed to get good SNR ratio  Noise we are getting is quantization noise from the digitization process  For each added bit, we get 6dB better SNR ratio  Masking effect means that we can raise the noise floor around a strong sound because the noise will be masked away  Raising noise floor is the same as using less bits and using less bits is the same as compression CS 414 - Spring 2010
  • 30. Conclusion  MPEG system data stream  Interchange format for audio and video streams  Interleave audio and video packets; insert time stamps into each “frame” of data  Synchronization  SRC – system clock reference; DTS – decoding time stamp; PTS – presentation time stamp  During encoding  Insert SCR values into system stream  Stamp each frame with PTS and DTS  Use encode time to approximate decode time  During decoding  Initialize local decoder clock with start values  Compare PTS to the value of local clock  Periodically synchronize local clock to SCR CS 414 - Spring 2010