SlideShare una empresa de Scribd logo
1 de 23
Descargar para leer sin conexión
Module 4
                      Arithmetic Coding
                             Prof. Hung-Ta Pai




Module 4, Data Compression       LISA, NTPU      1
Reals in Binary
          Any real number x in the interval [0, 1) can be
          represented in binary as .b1b2... where bi is a bit




Module 4, Data Compression       LISA, NTPU                 2
First Conversion
                 L:=0; R:=1; i :=1;
                 while x > L *
                     if x < (L+R)/2 then bi := 0; R := (L+R)/2;
                     if x ≥ (L+R)/2 then bi := 1; L := (L+R)/2;
                     i := i + 1;
                 end{while}
                 bi := 0 for all j ≥ i;


                  * Invariant: x is always in the interval [L, R)

Module 4, Data Compression          LISA, NTPU                      3
Basic Ideas
          Represent each string x of length n by a unique
          interval [L, R) in [0, 1)
          The width of the interval [L, R) represents the
          probability of x occurring
          The interval [L, R) can itself be represented by
          any number, called a tag, within the half open
          interval
          The k significant bits of the tag .t1t2t3.... is the
          code of x
               That is, .t1t2t3...tk000... is in the interval [L, R)

Module 4, Data Compression            LISA, NTPU                       4
Example
                             1. Tag must be in the half open interval
                             2. Tag can be chosen to be (L+R)/2
                             3. Code is the significant bits of the tag




Module 4, Data Compression         LISA, NTPU                        5
Better Tag




Module 4, Data Compression     LISA, NTPU   6
Example of Codes
          P(a) = 1/3, P(b) = 2/3




Module 4, Data Compression        LISA, NTPU    7
Code Generation from Tag
        If binary tag is .t1t2t3... = (L+R)/2 in [L, R), then
        we want to choose k to form the code t1t2 ...tk
        Short code: choose k to be as small as possible
        so that L ≤ . t1t2 ...tk000... < R
        Guaranteed code:
           Choose k = ⎡log2(1/(R-L))⎤ + 1
           L ≤ . t1t2 ...tkb1b2b3... < R for any bits b1b2b3...
           For fixed length strings provides a good prefix code
           Example: [.000000000..., .000010010...), tag
           = .000001001...
                   Short code: 0
                   Guaranteed code: 000001
Module 4, Data Compression           LISA, NTPU                   8
Guaranteed Code Example
          P(a) = 1/3, P(b) = 2/3




                             Guaranteed code -> Prefix code
Module 4, Data Compression   LISA, NTPU                       9
Coding Algorithm
          P(a1), P(a2), ..., P(am)
          C(ai) = P(a1) + P(a2) + ... +P(ai-1)
          Encode x1x2...xn
                    Initialize L := 0; and R:=1;
                    For i = 1 to n do
                        W := R - L;
                        L := L + W * C(xi);
                        R := L + W * P(xi);
                    end;
                    t := (L+R)/2; choose code for the tag
Module 4, Data Compression          LISA, NTPU              10
Coding Example
          P(a) = 1/4, P(b) = 1/2, P(c) = 1/4
          C(a) = 0, C(b) =1/4, C(c) = 3/4
          abca




Module 4, Data Compression       LISA, NTPU    11
Coding Excercise
          P(a) = 1/4, P(b) = 1/2, P(c) = 1/4
          C(a) = 0, C(b) =1/4, C(c) = 3/4
          bbbb




Module 4, Data Compression        LISA, NTPU    12
Decoding (1/3)
          Assume the length is known to be 3
          0001 which converts to the tag .0001000




Module 4, Data Compression       LISA, NTPU         13
Decoding (2/3)
          Assume the length is known to be 3
          0001 which converts to the tag .0001000




Module 4, Data Compression       LISA, NTPU         14
Decoding (3/3)
          Assume the length is known to be 3
          0001 which converts to the tag .0001000




Module 4, Data Compression       LISA, NTPU         15
Decoding Algorithm
          P(a1), P(a2), ..., P(am)
          C(ai) = P(a1) + P(a2) + ... +P(ai-1)
          Decode b1b2...bm, number of symbols is n
     Initialize L := 0; and R:=1;
     t := b1b2...bm000...
     for i = 1 to n do
         W := R - L;
         find j such that L + W * C(aj) ≤ t < L + W * (C(aj)+P(aj));
         output aj;
         L := L + W * C(aj); R = L + W * P(aj);
Module 4, Data Compression         LISA, NTPU                      16
Decoding Example
          P(a) = 1/4, P(b) = 1/2, P(c) = 1/4
          C(a) = 0, C(b) =1/4, C(c) = 3/4
          00101




Module 4, Data Compression        LISA, NTPU    17
Decoding Issues
          There are two ways for the decoder to know
          when to stop decoding
               Transmit the length of the string
               Transmit a unique end of string symbol




Module 4, Data Compression       LISA, NTPU             18
Practical Arithmetic Coding
          Scaling:
               By scaling we can keep L and R in a reasonable
               range of values so that W = R–L does not underflow
               The code can be produced progressively, not at the
               end
               Complicates decoding some
          Integer arithmetic coding avoids floating point
          altogether



Module 4, Data Compression       LISA, NTPU                    19
Adaptation
          Simple solution – Equally Probable Model
               Initially all symbols have frequency 1
               After symbol x is coded, increment its frequency by 1
               Use the new model for coding the next symbol
               Example in alphabet a, b, c, d




Module 4, Data Compression        LISA, NTPU                      20
Zero Frequency Problem
          How do we weight symbols that have not
          occurred yet?
               Equal weight? Not so good with many symbols
               Escape symbol, but what should its weight be?
               When a new symbol is encountered send the <esc>,
               followed by the symbol in the equally probable model
               (both encoded arithmetically)




Module 4, Data Compression        LISA, NTPU                     21
End of File Problem
          Similar to Zero Frequency Problem
          Reasonable solution:
               Add EOF to the post-ESC equally-probable model
          When done compressing:
               First send ESC
               Then send EOF
          What’s the cost of this approach?




Module 4, Data Compression         LISA, NTPU                   22
Arithmetic vs. Huffman
          Both compress very well
          For m symbol grouping
               Huffman is within 1/m of entropy
               Arithmetic is within 2/m of entropy
          Context
               Huffman needs a tree for every context
               Arithmetic needs a small table of frequencies for
               every context
          Adaptation
               Huffman has an elaborate adaptive algorithm
               Arithmetic has a simple adaptive mechanism

Module 4, Data Compression         LISA, NTPU                      23

Más contenido relacionado

La actualidad más candente

Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic codingVikas Goyal
 
Huffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysisHuffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysisRamakant Soni
 
Information Theory - Introduction
Information Theory  -  IntroductionInformation Theory  -  Introduction
Information Theory - IntroductionBurdwan University
 
discrete wavelet transform
discrete wavelet transformdiscrete wavelet transform
discrete wavelet transformpiyush_11
 
Windowing techniques of fir filter design
Windowing techniques of fir filter designWindowing techniques of fir filter design
Windowing techniques of fir filter designRohan Nagpal
 
PULSE CODE MODULATION (PCM)
PULSE CODE MODULATION (PCM)PULSE CODE MODULATION (PCM)
PULSE CODE MODULATION (PCM)vishnudharan11
 
Multiplexing and Frequency Division Multiplexing
Multiplexing and Frequency Division MultiplexingMultiplexing and Frequency Division Multiplexing
Multiplexing and Frequency Division MultiplexingKath Mataac
 
Delta modulation
Delta modulationDelta modulation
Delta modulationmpsrekha83
 
Digital t carriers and multiplexing power point (laurens)
Digital t carriers and multiplexing power point (laurens)Digital t carriers and multiplexing power point (laurens)
Digital t carriers and multiplexing power point (laurens)Laurens Luis Bugayong
 
Log Transformation in Image Processing with Example
Log Transformation in Image Processing with ExampleLog Transformation in Image Processing with Example
Log Transformation in Image Processing with ExampleMustak Ahmmed
 

La actualidad más candente (20)

Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
 
Huffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysisHuffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysis
 
06 Arithmetic 1
06 Arithmetic 106 Arithmetic 1
06 Arithmetic 1
 
Information Theory - Introduction
Information Theory  -  IntroductionInformation Theory  -  Introduction
Information Theory - Introduction
 
Chapter 03 cyclic codes
Chapter 03   cyclic codesChapter 03   cyclic codes
Chapter 03 cyclic codes
 
Information theory
Information theoryInformation theory
Information theory
 
Convolutional codes
Convolutional codesConvolutional codes
Convolutional codes
 
Arithmetic Coding
Arithmetic CodingArithmetic Coding
Arithmetic Coding
 
discrete wavelet transform
discrete wavelet transformdiscrete wavelet transform
discrete wavelet transform
 
Windowing techniques of fir filter design
Windowing techniques of fir filter designWindowing techniques of fir filter design
Windowing techniques of fir filter design
 
PULSE CODE MODULATION (PCM)
PULSE CODE MODULATION (PCM)PULSE CODE MODULATION (PCM)
PULSE CODE MODULATION (PCM)
 
Multiplexing and Frequency Division Multiplexing
Multiplexing and Frequency Division MultiplexingMultiplexing and Frequency Division Multiplexing
Multiplexing and Frequency Division Multiplexing
 
Delta modulation
Delta modulationDelta modulation
Delta modulation
 
BCH Codes
BCH CodesBCH Codes
BCH Codes
 
Digital t carriers and multiplexing power point (laurens)
Digital t carriers and multiplexing power point (laurens)Digital t carriers and multiplexing power point (laurens)
Digital t carriers and multiplexing power point (laurens)
 
Turbo Codes
Turbo CodesTurbo Codes
Turbo Codes
 
5 linear block codes
5 linear block codes5 linear block codes
5 linear block codes
 
Ripple Carry Adder
Ripple Carry AdderRipple Carry Adder
Ripple Carry Adder
 
Log Transformation in Image Processing with Example
Log Transformation in Image Processing with ExampleLog Transformation in Image Processing with Example
Log Transformation in Image Processing with Example
 
Convolution Codes
Convolution CodesConvolution Codes
Convolution Codes
 

Similar a Module 4 Arithmetic Coding

High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)Enrique Monzo Solves
 
basicsofcodingtheory-160202182933-converted.pptx
basicsofcodingtheory-160202182933-converted.pptxbasicsofcodingtheory-160202182933-converted.pptx
basicsofcodingtheory-160202182933-converted.pptxupendrabhatt13
 
The performance of turbo codes for wireless communication systems
The performance of turbo codes for wireless communication systemsThe performance of turbo codes for wireless communication systems
The performance of turbo codes for wireless communication systemschakravarthy Gopi
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marksvvcetit
 
Introduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxIntroduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxPJS KUMAR
 
Linear Cryptanalysis Lecture 線形解読法
Linear Cryptanalysis Lecture 線形解読法Linear Cryptanalysis Lecture 線形解読法
Linear Cryptanalysis Lecture 線形解読法Kai Katsumata
 
Matlab 3
Matlab 3Matlab 3
Matlab 3asguna
 
Intermediate code generator
Intermediate code generatorIntermediate code generator
Intermediate code generatorsanchi29
 
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐCHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐlykhnh386525
 
SURF 2012 Final Report(1)
SURF 2012 Final Report(1)SURF 2012 Final Report(1)
SURF 2012 Final Report(1)Eric Zhang
 
Best C++ Programming Homework Help
Best C++ Programming Homework HelpBest C++ Programming Homework Help
Best C++ Programming Homework HelpC++ Homework Help
 
Fast Fourier Transform (FFT) of Time Series in Kafka Streams
Fast Fourier Transform (FFT) of Time Series in Kafka StreamsFast Fourier Transform (FFT) of Time Series in Kafka Streams
Fast Fourier Transform (FFT) of Time Series in Kafka StreamsHostedbyConfluent
 
Pilot induced cyclostationarity based method for dvb system identification
Pilot induced cyclostationarity based method for dvb system identificationPilot induced cyclostationarity based method for dvb system identification
Pilot induced cyclostationarity based method for dvb system identificationiaemedu
 
01 - DAA - PPT.pptx
01 - DAA - PPT.pptx01 - DAA - PPT.pptx
01 - DAA - PPT.pptxKokilaK25
 

Similar a Module 4 Arithmetic Coding (20)

High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
 
Er24902905
Er24902905Er24902905
Er24902905
 
basicsofcodingtheory-160202182933-converted.pptx
basicsofcodingtheory-160202182933-converted.pptxbasicsofcodingtheory-160202182933-converted.pptx
basicsofcodingtheory-160202182933-converted.pptx
 
The performance of turbo codes for wireless communication systems
The performance of turbo codes for wireless communication systemsThe performance of turbo codes for wireless communication systems
The performance of turbo codes for wireless communication systems
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
 
Introduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptxIntroduction to data structures and complexity.pptx
Introduction to data structures and complexity.pptx
 
Linear Cryptanalysis Lecture 線形解読法
Linear Cryptanalysis Lecture 線形解読法Linear Cryptanalysis Lecture 線形解読法
Linear Cryptanalysis Lecture 線形解読法
 
Matlab 3
Matlab 3Matlab 3
Matlab 3
 
Lec5 Compression
Lec5 CompressionLec5 Compression
Lec5 Compression
 
internal assement 3
internal assement 3internal assement 3
internal assement 3
 
Ch9b
Ch9bCh9b
Ch9b
 
Intermediate code generator
Intermediate code generatorIntermediate code generator
Intermediate code generator
 
Modulation techniques matlab_code
Modulation techniques matlab_codeModulation techniques matlab_code
Modulation techniques matlab_code
 
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐCHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
CHƯƠNG 2 KỸ THUẬT TRUYỀN DẪN SỐ - THONG TIN SỐ
 
SURF 2012 Final Report(1)
SURF 2012 Final Report(1)SURF 2012 Final Report(1)
SURF 2012 Final Report(1)
 
Best C++ Programming Homework Help
Best C++ Programming Homework HelpBest C++ Programming Homework Help
Best C++ Programming Homework Help
 
Fast Fourier Transform (FFT) of Time Series in Kafka Streams
Fast Fourier Transform (FFT) of Time Series in Kafka StreamsFast Fourier Transform (FFT) of Time Series in Kafka Streams
Fast Fourier Transform (FFT) of Time Series in Kafka Streams
 
Pilot induced cyclostationarity based method for dvb system identification
Pilot induced cyclostationarity based method for dvb system identificationPilot induced cyclostationarity based method for dvb system identification
Pilot induced cyclostationarity based method for dvb system identification
 
01 - DAA - PPT.pptx
01 - DAA - PPT.pptx01 - DAA - PPT.pptx
01 - DAA - PPT.pptx
 

Más de anithabalaprabhu (20)

Shannon Fano
Shannon FanoShannon Fano
Shannon Fano
 
Ch 04 Arithmetic Coding ( P P T)
Ch 04  Arithmetic  Coding ( P P T)Ch 04  Arithmetic  Coding ( P P T)
Ch 04 Arithmetic Coding ( P P T)
 
Compression
CompressionCompression
Compression
 
Datacompression1
Datacompression1Datacompression1
Datacompression1
 
Speech Compression
Speech CompressionSpeech Compression
Speech Compression
 
Z24 4 Speech Compression
Z24   4   Speech CompressionZ24   4   Speech Compression
Z24 4 Speech Compression
 
Dictor
DictorDictor
Dictor
 
Dictionary Based Compression
Dictionary Based CompressionDictionary Based Compression
Dictionary Based Compression
 
Ch 04 Arithmetic Coding (Ppt)
Ch 04 Arithmetic Coding (Ppt)Ch 04 Arithmetic Coding (Ppt)
Ch 04 Arithmetic Coding (Ppt)
 
Compression Ii
Compression IiCompression Ii
Compression Ii
 
Lassy
LassyLassy
Lassy
 
Compression Ii
Compression IiCompression Ii
Compression Ii
 
Lossy
LossyLossy
Lossy
 
Planning
PlanningPlanning
Planning
 
Lossless
LosslessLossless
Lossless
 
Losseless
LosselessLosseless
Losseless
 
Lec32
Lec32Lec32
Lec32
 
Huffman Student
Huffman StudentHuffman Student
Huffman Student
 
Huffman Encoding Pr
Huffman Encoding PrHuffman Encoding Pr
Huffman Encoding Pr
 
Huffman1
Huffman1Huffman1
Huffman1
 

Último

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Último (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Module 4 Arithmetic Coding

  • 1. Module 4 Arithmetic Coding Prof. Hung-Ta Pai Module 4, Data Compression LISA, NTPU 1
  • 2. Reals in Binary Any real number x in the interval [0, 1) can be represented in binary as .b1b2... where bi is a bit Module 4, Data Compression LISA, NTPU 2
  • 3. First Conversion L:=0; R:=1; i :=1; while x > L * if x < (L+R)/2 then bi := 0; R := (L+R)/2; if x ≥ (L+R)/2 then bi := 1; L := (L+R)/2; i := i + 1; end{while} bi := 0 for all j ≥ i; * Invariant: x is always in the interval [L, R) Module 4, Data Compression LISA, NTPU 3
  • 4. Basic Ideas Represent each string x of length n by a unique interval [L, R) in [0, 1) The width of the interval [L, R) represents the probability of x occurring The interval [L, R) can itself be represented by any number, called a tag, within the half open interval The k significant bits of the tag .t1t2t3.... is the code of x That is, .t1t2t3...tk000... is in the interval [L, R) Module 4, Data Compression LISA, NTPU 4
  • 5. Example 1. Tag must be in the half open interval 2. Tag can be chosen to be (L+R)/2 3. Code is the significant bits of the tag Module 4, Data Compression LISA, NTPU 5
  • 6. Better Tag Module 4, Data Compression LISA, NTPU 6
  • 7. Example of Codes P(a) = 1/3, P(b) = 2/3 Module 4, Data Compression LISA, NTPU 7
  • 8. Code Generation from Tag If binary tag is .t1t2t3... = (L+R)/2 in [L, R), then we want to choose k to form the code t1t2 ...tk Short code: choose k to be as small as possible so that L ≤ . t1t2 ...tk000... < R Guaranteed code: Choose k = ⎡log2(1/(R-L))⎤ + 1 L ≤ . t1t2 ...tkb1b2b3... < R for any bits b1b2b3... For fixed length strings provides a good prefix code Example: [.000000000..., .000010010...), tag = .000001001... Short code: 0 Guaranteed code: 000001 Module 4, Data Compression LISA, NTPU 8
  • 9. Guaranteed Code Example P(a) = 1/3, P(b) = 2/3 Guaranteed code -> Prefix code Module 4, Data Compression LISA, NTPU 9
  • 10. Coding Algorithm P(a1), P(a2), ..., P(am) C(ai) = P(a1) + P(a2) + ... +P(ai-1) Encode x1x2...xn Initialize L := 0; and R:=1; For i = 1 to n do W := R - L; L := L + W * C(xi); R := L + W * P(xi); end; t := (L+R)/2; choose code for the tag Module 4, Data Compression LISA, NTPU 10
  • 11. Coding Example P(a) = 1/4, P(b) = 1/2, P(c) = 1/4 C(a) = 0, C(b) =1/4, C(c) = 3/4 abca Module 4, Data Compression LISA, NTPU 11
  • 12. Coding Excercise P(a) = 1/4, P(b) = 1/2, P(c) = 1/4 C(a) = 0, C(b) =1/4, C(c) = 3/4 bbbb Module 4, Data Compression LISA, NTPU 12
  • 13. Decoding (1/3) Assume the length is known to be 3 0001 which converts to the tag .0001000 Module 4, Data Compression LISA, NTPU 13
  • 14. Decoding (2/3) Assume the length is known to be 3 0001 which converts to the tag .0001000 Module 4, Data Compression LISA, NTPU 14
  • 15. Decoding (3/3) Assume the length is known to be 3 0001 which converts to the tag .0001000 Module 4, Data Compression LISA, NTPU 15
  • 16. Decoding Algorithm P(a1), P(a2), ..., P(am) C(ai) = P(a1) + P(a2) + ... +P(ai-1) Decode b1b2...bm, number of symbols is n Initialize L := 0; and R:=1; t := b1b2...bm000... for i = 1 to n do W := R - L; find j such that L + W * C(aj) ≤ t < L + W * (C(aj)+P(aj)); output aj; L := L + W * C(aj); R = L + W * P(aj); Module 4, Data Compression LISA, NTPU 16
  • 17. Decoding Example P(a) = 1/4, P(b) = 1/2, P(c) = 1/4 C(a) = 0, C(b) =1/4, C(c) = 3/4 00101 Module 4, Data Compression LISA, NTPU 17
  • 18. Decoding Issues There are two ways for the decoder to know when to stop decoding Transmit the length of the string Transmit a unique end of string symbol Module 4, Data Compression LISA, NTPU 18
  • 19. Practical Arithmetic Coding Scaling: By scaling we can keep L and R in a reasonable range of values so that W = R–L does not underflow The code can be produced progressively, not at the end Complicates decoding some Integer arithmetic coding avoids floating point altogether Module 4, Data Compression LISA, NTPU 19
  • 20. Adaptation Simple solution – Equally Probable Model Initially all symbols have frequency 1 After symbol x is coded, increment its frequency by 1 Use the new model for coding the next symbol Example in alphabet a, b, c, d Module 4, Data Compression LISA, NTPU 20
  • 21. Zero Frequency Problem How do we weight symbols that have not occurred yet? Equal weight? Not so good with many symbols Escape symbol, but what should its weight be? When a new symbol is encountered send the <esc>, followed by the symbol in the equally probable model (both encoded arithmetically) Module 4, Data Compression LISA, NTPU 21
  • 22. End of File Problem Similar to Zero Frequency Problem Reasonable solution: Add EOF to the post-ESC equally-probable model When done compressing: First send ESC Then send EOF What’s the cost of this approach? Module 4, Data Compression LISA, NTPU 22
  • 23. Arithmetic vs. Huffman Both compress very well For m symbol grouping Huffman is within 1/m of entropy Arithmetic is within 2/m of entropy Context Huffman needs a tree for every context Arithmetic needs a small table of frequencies for every context Adaptation Huffman has an elaborate adaptive algorithm Arithmetic has a simple adaptive mechanism Module 4, Data Compression LISA, NTPU 23