x86/x64 Optimization Study Group #4
An x86-optimized rank/select dictionary for bit sequences
2012/6/16
Takeshi Yamamuro

What Is a Succinct Data Structure?

SDS: Succinct Data Structure
• Recently gaining popularity in several areas
  – Both research and engineering

• Not a data structure as such, but a data representation
  – A compressed representation of other data structures
  – e.g., alphabets, trees, and graphs

• Operations work transparently, without explicit unpacking
  – e.g., succinct LZ77 compression*1

*1 Kreft, S. and Navarro, G.: LZ77-Like Compression with Fast Random Access, In Proceedings of DCC, 2010

More Details
• SDS = Succinct Data + Succinct Index

• Succinct Data
  – Compact representation of the target data
  – Close to the information-theoretic lower bound
      e.g., to distinguish N patterns, the lower bound is log2 N bits

• Succinct Index
  – O(1) operations on the target data
  – o(N) extra space, which is asymptotically negligible

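As a small worked example of the lower bound: an n-bit sequence with exactly m ones is one of C(n, m) possible patterns, so no representation can use fewer than ceil(log2 C(n, m)) bits. The sketch below only makes that arithmetic concrete; the function name `lower_bound_bits` is hypothetical and does not come from the slides.

```python
import math

def lower_bound_bits(n: int, m: int) -> int:
    """Minimum number of bits needed to tell apart all n-bit sequences with m ones."""
    patterns = math.comb(n, m)            # number of distinct sequences
    # ceil(log2(patterns)) computed exactly with integer arithmetic
    return (patterns - 1).bit_length()

# A 9-bit sequence with five 1s is one of C(9, 5) = 126 patterns.
print(lower_bound_bits(9, 5))
```

For such a short sequence the saving over the 9 raw bits is small; the bound matters more for long, sparse, or otherwise skewed sequences.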
More Details
• If you need more information, ...
  [image not preserved; cited from: http://goo.gl/rkQ5z]

A rank/select dictionary for SDS

Rank/Select Operations
• An SDS is composed of rank/select operations
  – rank/select are called many times internally

• Rank/select for a succinct bit sequence B[i]
  – rank_x(n, B): the number of occurrences of x in B[0...n]
  – select_x(n, B): the position of the n-th x in B[]

     i     0  1  2  3  4  5  6  7  8
     B[i]  1  0  1  1  0  0  1  1  0

     rank_1(5, B) = 3    select_1(4, B) = 6

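The two definitions can be checked against the example sequence with a naive linear-time sketch. It assumes, as the example values suggest, that rank counts B[0] through B[n] inclusive and that select counts occurrences from 1. The function names are illustrative only.

```python
from typing import Sequence

def rank(x: int, n: int, bits: Sequence[int]) -> int:
    """Number of occurrences of bit value x in bits[0..n], inclusive."""
    return sum(1 for b in bits[: n + 1] if b == x)

def select(x: int, n: int, bits: Sequence[int]) -> int:
    """Position of the n-th occurrence of x (counting from 1), or -1 if absent."""
    seen = 0
    for i, b in enumerate(bits):
        if b == x:
            seen += 1
            if seen == n:
                return i
    return -1

B = [1, 0, 1, 1, 0, 0, 1, 1, 0]
print(rank(1, 5, B))    # 3, as on the slide
print(select(1, 4, B))  # 6, as on the slide
```

Both functions scan the sequence, so they cost O(N) per query; the succinct index exists to bring that down to O(1).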
Rank/Select Operations
• Available rank/select implementations
  – ux-trie: http://code.google.com/p/ux-trie/
  – rx: http://code.google.com/p/mozc/
  – marisa-trie: http://code.google.com/p/marisa-trie/

• Today's contribution
  – x86-optimized rank/select
  – https://github.com/maropu/dbitv

Performance Results
• Benchmark setup*1
  – Generate a random bit sequence with 50% density
  – Issue random rank/select queries over the bits
  – CPU: Intel Core-i5 U470@1.33GHz

• Observed latency
  – Median latency over 11 trials

*1 Reference: http://d.hatena.ne.jp/s-yata/20111216/1324032373

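The measurement protocol (random bits at about 50% density, random queries, median of 11 trials) might be sketched as follows. This is not the benchmark code behind the slides: the prefix-sum list is a stand-in for a real rank index, the function name and parameters are hypothetical, and Python timings are far coarser than the nanosecond figures reported here.

```python
import random
import statistics
import time
from itertools import accumulate

def median_rank_latency_ns(num_bits: int, num_queries: int = 1000,
                           trials: int = 11, seed: int = 0) -> float:
    """Median over `trials` of the average latency (ns) of one rank query."""
    rng = random.Random(seed)
    bits = [rng.getrandbits(1) for _ in range(num_bits)]    # about 50% density
    prefix = [0] + list(accumulate(bits))                   # stand-in rank index
    queries = [rng.randrange(num_bits) for _ in range(num_queries)]

    per_trial = []
    for _ in range(trials):
        start = time.perf_counter_ns()
        checksum = 0
        for q in queries:
            checksum += prefix[q + 1]                       # rank1(q, B)
        elapsed = time.perf_counter_ns() - start
        per_trial.append(elapsed / num_queries)
    return statistics.median(per_trial)

print(median_rank_latency_ns(1 << 16))
```

Taking the median rather than the mean keeps a single slow trial (for example, one disturbed by another process) from skewing the reported latency.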
Performance Results: Rank
[Figure: averaged rank latency (ns, log scale from 1.E+00 to 1.E+03) versus bit length, for ux, rx, marisa, and opt]

Performance Results: Select
[Figure: averaged select latency (ns, log scale from 1.E+00 to 1.E+04) versus bit length, for ux, rx, marisa, and opt]

Implementation Details

Implementation: Method of Four Russians
• Rule: O(1) operation costs with o(N) space

  B[] = a sequence of N bits, split into blocks of log^2 N bits
  L[] = l1, l2, ...: one pre-computed count per block

• Split B[] into fixed-length blocks of log^2 N bits
• Total counts up to each block boundary are pre-computed in L[]

$$
\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i]
  = \underbrace{\sum_{i=1}^{\lfloor x/\ell \rfloor \ell} B[i]}_{L[\lfloor x/\ell \rfloor]:\ O(1)}
  + \underbrace{\sum_{i=\lfloor x/\ell \rfloor \ell + 1}^{x} B[i]}_{O(\log^2 N)},
  \qquad \ell = \log^2 N
$$

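A minimal sketch of this one-level scheme, under the following assumptions: the block length is passed in as a parameter rather than fixed to log^2 N, positions are 0-based, and rank is inclusive as in the earlier example. The names `build_block_counts` and `rank1` are illustrative.

```python
from typing import List, Sequence

def build_block_counts(bits: Sequence[int], block: int) -> List[int]:
    """L[k] = number of 1s in the first k blocks, i.e. in bits[0 : k * block]."""
    counts = [0]
    for start in range(0, len(bits), block):
        counts.append(counts[-1] + sum(bits[start:start + block]))
    return counts

def rank1(x: int, bits: Sequence[int], counts: Sequence[int], block: int) -> int:
    """Number of 1s in bits[0..x]: one L[] lookup plus a scan of fewer than `block` bits."""
    m = x + 1                      # how many leading bits to count
    k = m // block                 # full blocks covered by those m bits
    return counts[k] + sum(bits[k * block:m])

B = [1, 0, 1, 1, 0, 0, 1, 1, 0]
L = build_block_counts(B, block=4)
print(L)                  # [0, 3, 5, 5]
print(rank1(5, B, L, 4))  # 3
```

The remaining scan is the second term of the equation; it is what the second-level array and the popcount step later remove.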
Implementation: Method of Four Russians
• Rule: O(1) operation costs with o(N) space

  B[] = a sequence of N bits, split into blocks of log^2 N bits
  L[] = l1, l2, ...

• L[]: o(N) space costs

$$
\frac{N}{\log^2 N} \times \log N = O\!\left(\frac{N}{\log N}\right) = o(N)
$$

Implementation: Method of Four Russians
• Rule: O(1) operation costs with o(N) space

  B[] = a sequence of N bits, split into blocks of log^2 N bits
  L[] = l1, l2, ...
  S[] = s1, s2, ...: one entry per sub-block of (1/2) log N bits

• Split each block again into fixed-length sub-blocks of (1/2) log N bits
• Counts relative to the enclosing block are pre-computed in S[]

$$
\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i]
  = \underbrace{\sum_{i=1}^{\lfloor x/\ell \rfloor \ell} B[i]}_{L[\lfloor x/\ell \rfloor]:\ O(1)}
  + \underbrace{\sum_{i=\lfloor x/\ell \rfloor \ell + 1}^{\lfloor x/s \rfloor s} B[i]}_{S[\lfloor x/s \rfloor]:\ O(1)}
  + \underbrace{\sum_{i=\lfloor x/s \rfloor s + 1}^{x} B[i]}_{O(\log N)},
  \qquad \ell = \log^2 N,\ s = \tfrac{1}{2}\log N
$$

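Implementation: Method of Four Russians
• Rule: O(1) operation costs with o(N) space

  B[] = a sequence of N bits, split into blocks of log^2 N bits
  L[] = l1, l2, ...
  S[] = s1, s2, ...: one entry per sub-block of (1/2) log N bits

• S[]: o(N) space costs

$$
\frac{N}{\frac{1}{2}\log N} \times \log(\log^2 N) = O\!\left(\frac{N \log\log N}{\log N}\right) = o(N)
$$
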
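Implementation: Method of Four Russians
• Rule: O(1) operation costs with o(N) space

  B[] = a sequence of N bits, split into blocks of log^2 N bits
  L[] = l1, l2, ...
  S[] = s1, s2, ...: one entry per sub-block of (1/2) log N bits

• O(1) popcount/table lookup for the last term

$$
\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i]
  = \underbrace{\sum_{i=1}^{\lfloor x/\ell \rfloor \ell} B[i]}_{L[\lfloor x/\ell \rfloor]:\ O(1)}
  + \underbrace{\sum_{i=\lfloor x/\ell \rfloor \ell + 1}^{\lfloor x/s \rfloor s} B[i]}_{S[\lfloor x/s \rfloor]:\ O(1)}
  + \underbrace{\sum_{i=\lfloor x/s \rfloor s + 1}^{x} B[i]}_{O(\log N)\ \to\ O(1)},
  \qquad \ell = \log^2 N,\ s = \tfrac{1}{2}\log N
$$
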
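The table-lookup variant of the last term can be sketched as below. It assumes a 32-bit word in which the first bit of the sequence is the least significant bit; that bit order, the 8-bit table width, and the function names are choices made for illustration. On x86 a hardware popcount instruction can replace the table.

```python
# Pre-computed popcounts for every possible 8-bit value (the lookup table).
POPCOUNT8 = [bin(v).count("1") for v in range(256)]

def popcount32(word: int) -> int:
    """Popcount of a 32-bit word using four table lookups."""
    return (POPCOUNT8[word & 0xFF]
            + POPCOUNT8[(word >> 8) & 0xFF]
            + POPCOUNT8[(word >> 16) & 0xFF]
            + POPCOUNT8[(word >> 24) & 0xFF])

def popcount_low_bits(word: int, k: int) -> int:
    """Number of 1s among the k lowest bits of a 32-bit word (0 <= k <= 32)."""
    return popcount32(word & ((1 << k) - 1))

print(popcount_low_bits(0b1011, 2))  # 2
```

Because the sub-block fits in a machine word, masking and counting take a fixed number of operations regardless of N, which is what turns the last term into O(1).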
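Implementation: Method of Four Russians
• Rule: O(1) operation costs with o(N) space

  B[] = a sequence of N bits, split into blocks of log^2 N bits
  L[] = l1, l2, ...
  S[] = s1, s2, ...: one entry per sub-block of (1/2) log N bits

• As a result, o(N) space costs in total

$$
\underbrace{\frac{N}{\log N}}_{L[]\ \text{size}}
  + \underbrace{\frac{4N \log\log N}{\log N}}_{S[]\ \text{size}}
  = O\!\left(\frac{N \log\log N}{\log N}\right) = o(N)
$$
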
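To get a feel for how quickly this bound shrinks, the two terms can be evaluated for a few sizes. The sketch assumes base-2 logarithms and ignores rounding of block sizes; `index_overhead_ratio` is a hypothetical helper name.

```python
import math

def index_overhead_ratio(n_bits: int) -> float:
    """(|L[]| + |S[]|) / N using the sizes N/log N and 4 N log log N / log N."""
    log_n = math.log2(n_bits)
    l_size = n_bits / log_n
    s_size = 4 * n_bits * math.log2(log_n) / log_n
    return (l_size + s_size) / n_bits

for exp in (20, 32, 64):
    print(exp, round(index_overhead_ratio(2 ** exp), 3))
```

Under these assumptions the ratio is about 0.91 at N = 2^20 and about 0.39 at N = 2^64. It does tend to zero, but slowly, so the asymptotic block sizes are not a practical recipe on their own.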
Implementation: Practice
• Low computation costs, but high cache penalties
  – 3 cache/TLB misses per rank: one each in B[], L[], and S[]

  ex. rank1(402 = 256*1 + 32*4 + 18, B)

  B[] (256-bit blocks, 32-bit sub-blocks):
        01..000000....101......0 0110....001...............0 0000100 ...
        Miss! Popcount these left bits
  L[]:  18  21  …                                      (Miss!)
  S[]:  1 3 4 6 7 9 10 13 2 5 7 9 12 13 18 19 1 3 7 …  (Miss!)

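The decomposition in the example can be written out as plain arithmetic. The sketch assumes that L[] has one entry per 256-bit block and that S[] and B[] are both indexed per 32-bit sub-block, stored as three separate arrays; the helper names are hypothetical.

```python
BLOCK_BITS = 256      # one L[] entry per 256-bit block
SUB_BLOCK_BITS = 32   # one S[] entry per 32-bit sub-block
SUBS_PER_BLOCK = BLOCK_BITS // SUB_BLOCK_BITS

def decompose(x: int):
    """Split a bit position into (block, sub-block within the block, bit offset)."""
    block, rest = divmod(x, BLOCK_BITS)
    sub_block, offset = divmod(rest, SUB_BLOCK_BITS)
    return block, sub_block, offset

def touched_entries(x: int):
    """Indices read by one rank query when L[], S[], and B[] are separate arrays."""
    block, sub_block, _ = decompose(x)
    word = block * SUBS_PER_BLOCK + sub_block
    return {"L": block, "S": word, "B_word": word}

print(decompose(402))        # (1, 4, 18)
print(touched_entries(402))  # {'L': 1, 'S': 12, 'B_word': 12}
```

The three indices point into three different arrays, so for a large bit sequence each lookup is likely to land on a different cache line. That is the source of the three misses.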
Implementation: Practice
• Packing the required data into a single cache line

  [Figure: layout of a 56B chunk inside a 64B cache line; labels: 4B, 1B, 32B (bits 0110....001..........0), 12B padding, padding]

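A hedged sketch of the packing idea follows. The exact field order and use of padding cannot be recovered from the figure, so the layout here is an assumption: a 4-byte block count, eight 1-byte sub-block counts, the 32 bytes of bits, and 12 bytes of padding (56 bytes in total), plus 8 further bytes so that each record occupies exactly one 64-byte line. Bit i of the sequence is assumed to be bit i % 32 of word i // 32. This is not the dbitv layout, and all names are hypothetical.

```python
import struct

# Assumed record layout (little-endian, no implicit alignment):
#   I    4B   number of 1s before this 256-bit block
#   8B   8x1B number of 1s before each 32-bit sub-block, within the block
#   8I   32B  the 256 bits themselves, as eight 32-bit words
#   12x  12B  padding (56B so far), then 8x to fill a 64B cache line
RECORD = struct.Struct("<I8B8I12x8x")
assert RECORD.size == 64

def build_records(words):
    """Pack 32-bit words into 64-byte records holding counts and bits together."""
    out = bytearray()
    total = 0
    for start in range(0, len(words), 8):
        block = list(words[start:start + 8])
        block += [0] * (8 - len(block))          # zero-fill the last block
        subs, inside = [], 0
        for w in block:
            subs.append(inside)                  # 1s before this sub-block
            inside += bin(w).count("1")
        out += RECORD.pack(total, *subs, *block)
        total += inside
    return bytes(out)

def rank1(x, records):
    """Number of 1s in bits[0..x], inclusive, read from a single 64-byte record."""
    block, rest = divmod(x, 256)
    sub, offset = divmod(rest, 32)
    fields = RECORD.unpack_from(records, block * RECORD.size)
    base, subs, words = fields[0], fields[1:9], fields[9:17]
    mask = (1 << (offset + 1)) - 1               # bits 0..offset of the word
    return base + subs[sub] + bin(words[sub] & mask).count("1")
```

Storing the count of 1s before each sub-block keeps every sub-block entry at or below 224, so it fits in one byte. All three values a query needs then sit in the same 64-byte record.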
Implementation: Practice
• By the way, what about select?
  – Omitted due to time constraints
  – Please see the code ...

• Two implementation approaches
  – O(logN) complexity
     • ux-trie, rx, and marisa-trie
     • Binary search using rank
     • Suffers many cache/TLB misses

  – O(1) complexity
     • My implementation, designed to minimize these penalties
     • One rank, one SIMD comparison, and an O(1) bsf (bit scan forward)
     • Only 2 cache/TLB misses
     Not implemented yet ...

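The O(logN) approach, select by binary search over rank, can be sketched generically as follows. This is not code from ux-trie, rx, or marisa-trie: the prefix-sum list is a stand-in for an O(1) rank index, positions are 0-based, and occurrences are counted from 1.

```python
from itertools import accumulate
from typing import List, Sequence

def build_prefix(bits: Sequence[int]) -> List[int]:
    """prefix[i] = number of 1s in bits[0 : i]; a stand-in for an O(1) rank index."""
    return [0] + list(accumulate(bits))

def rank1(x: int, prefix: Sequence[int]) -> int:
    """Number of 1s in bits[0..x], inclusive."""
    return prefix[x + 1]

def select1(n: int, prefix: Sequence[int]) -> int:
    """Smallest position p with rank1(p) >= n, by binary search; -1 if there is none."""
    size = len(prefix) - 1
    if n < 1 or n > prefix[size]:
        return -1
    lo, hi = 0, size - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if rank1(mid, prefix) >= n:
            hi = mid            # the answer is at mid or to its left
        else:
            lo = mid + 1        # the answer is strictly to the right
    return lo

P = build_prefix([1, 0, 1, 1, 0, 0, 1, 1, 0])
print(select1(4, P))  # 6
```

Each probe of the search calls rank at a position far from the previous one, which is consistent with the many cache/TLB misses noted for this approach.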

Más contenido relacionado

La actualidad más candente

Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkDB Tsai
 
Data assimilation with OpenDA
Data assimilation with OpenDAData assimilation with OpenDA
Data assimilation with OpenDAnilsvanvelzen
 
Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011Ed Dodds
 
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...Yusuf Bhujwalla
 
2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC MeetupDavid Smiley
 
Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015David Smiley
 
STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...Luuk Brederode
 
The status of the GeoServer WPS
The status of the GeoServer WPSThe status of the GeoServer WPS
The status of the GeoServer WPSGeoSolutions
 
Reduced ordered binary decision diagram
Reduced ordered binary decision diagramReduced ordered binary decision diagram
Reduced ordered binary decision diagramTeam-VLSI-ITMU
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model佳蓉 倪
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionKazuki Fujikawa
 
19. algorithms and-complexity
19. algorithms and-complexity19. algorithms and-complexity
19. algorithms and-complexityashishtinku
 
Algorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAlgorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAdelina Ahadova
 
Final presentation optical flow estimation with DL
Final presentation  optical flow estimation with DLFinal presentation  optical flow estimation with DL
Final presentation optical flow estimation with DLLeapMind Inc
 

La actualidad más candente (18)

Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
 
Data assimilation with OpenDA
Data assimilation with OpenDAData assimilation with OpenDA
Data assimilation with OpenDA
 
Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011Liszt los alamos national laboratory Aug 2011
Liszt los alamos national laboratory Aug 2011
 
4241
42414241
4241
 
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
The Impact of Smoothness on Model Class Selection in Nonlinear System Identif...
 
Binary decision diagrams
Binary decision diagramsBinary decision diagrams
Binary decision diagrams
 
2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup2016-01 Lucene Solr spatial in 2015, NYC Meetup
2016-01 Lucene Solr spatial in 2015, NYC Meetup
 
An32272275
An32272275An32272275
An32272275
 
Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015Lucene/Solr spatial in 2015
Lucene/Solr spatial in 2015
 
STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...STAQ based Matrix estimation - initial concept (presented at hEART conference...
STAQ based Matrix estimation - initial concept (presented at hEART conference...
 
The status of the GeoServer WPS
The status of the GeoServer WPSThe status of the GeoServer WPS
The status of the GeoServer WPS
 
Reduced ordered binary decision diagram
Reduced ordered binary decision diagramReduced ordered binary decision diagram
Reduced ordered binary decision diagram
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph Convolution
 
19. algorithms and-complexity
19. algorithms and-complexity19. algorithms and-complexity
19. algorithms and-complexity
 
Algorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAlgorithm Complexity and Main Concepts
Algorithm Complexity and Main Concepts
 
Final presentation optical flow estimation with DL
Final presentation  optical flow estimation with DLFinal presentation  optical flow estimation with DL
Final presentation optical flow estimation with DL
 
MSc Presentation
MSc PresentationMSc Presentation
MSc Presentation
 

Destacado

Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介MITSUNARI Shigeo
 
x86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNTx86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNTtakesako
 
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜Ryoma Sin'ya
 
Popcntによるハミング距離計算
Popcntによるハミング距離計算Popcntによるハミング距離計算
Popcntによるハミング距離計算Norishige Fukushima
 
X86opti01 nothingcosmos
X86opti01 nothingcosmosX86opti01 nothingcosmos
X86opti01 nothingcosmosnothingcosmos
 

Destacado (6)

Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介Haswellサーベイと有限体クラスの紹介
Haswellサーベイと有限体クラスの紹介
 
x86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNTx86x64 SSE4.2 POPCNT
x86x64 SSE4.2 POPCNT
 
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
AVX2時代の正規表現マッチング 〜半群でぐんぐん!〜
 
Popcntによるハミング距離計算
Popcntによるハミング距離計算Popcntによるハミング距離計算
Popcntによるハミング距離計算
 
X86opti01 nothingcosmos
X86opti01 nothingcosmosX86opti01 nothingcosmos
X86opti01 nothingcosmos
 
明日使えないすごいビット演算
明日使えないすごいビット演算明日使えないすごいビット演算
明日使えないすごいビット演算
 

Similar a A x86-optimized rank&select dictionary for bit sequences

Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsYu Liu
 
Threshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random PermutationsThreshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random PermutationsAleksandr Yampolskiy
 
Graph Regularised Hashing
Graph Regularised HashingGraph Regularised Hashing
Graph Regularised HashingSean Moran
 
1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptxpallavidhade2
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorPTIHPA
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Leonid Zhukov
 
Ch01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluitonCh01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluitonshin
 
Generic parallelization strategies for data assimilation
Generic parallelization strategies for data assimilationGeneric parallelization strategies for data assimilation
Generic parallelization strategies for data assimilationnilsvanvelzen
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...Alex Pruden
 
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Sean Moran
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marksvvcetit
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesRakuten Group, Inc.
 
Code generation in Compiler Design
Code generation in Compiler DesignCode generation in Compiler Design
Code generation in Compiler DesignKuppusamy P
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Matthew Lease
 
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-documentmaomao125
 
Selective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarizationSelective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarizationKodaira Tomonori
 
Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2Khaja Dileef
 

Similar a A x86-optimized rank&select dictionary for bit sequences (20)

Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applications
 
Threshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random PermutationsThreshold and Proactive Pseudo-Random Permutations
Threshold and Proactive Pseudo-Random Permutations
 
Slide11 icc2015
Slide11 icc2015Slide11 icc2015
Slide11 icc2015
 
Graph Regularised Hashing
Graph Regularised HashingGraph Regularised Hashing
Graph Regularised Hashing
 
1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx
 
Mmclass5
Mmclass5Mmclass5
Mmclass5
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.
 
Basic data structures part I
Basic data structures part IBasic data structures part I
Basic data structures part I
 
Ch01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluitonCh01 basic concepts_nosoluiton
Ch01 basic concepts_nosoluiton
 
Generic parallelization strategies for data assimilation
Generic parallelization strategies for data assimilationGeneric parallelization strategies for data assimilation
Generic parallelization strategies for data assimilation
 
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
zkStudyClub: PLONKUP & Reinforced Concrete [Luke Pearson, Joshua Fitzgerald, ...
 
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
Learning to Project and Binarise for Hashing-based Approximate Nearest Neighb...
 
system software 16 marks
system software 16 markssystem software 16 marks
system software 16 marks
 
Faster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select DictionariesFaster Practical Block Compression for Rank/Select Dictionaries
Faster Practical Block Compression for Rank/Select Dictionaries
 
Code generation in Compiler Design
Code generation in Compiler DesignCode generation in Compiler Design
Code generation in Compiler Design
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
 
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document15 06-0459-02-003c-cm-matlab-release-0-85-support-document
15 06-0459-02-003c-cm-matlab-release-0-85-support-document
 
Selective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarizationSelective encoding for abstractive sentence summarization
Selective encoding for abstractive sentence summarization
 
Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2Systemsoftwarenotes 100929171256-phpapp02 2
Systemsoftwarenotes 100929171256-phpapp02 2
 

Más de Takeshi Yamamuro

LT: Spark 3.1 Feature Expectation
LT: Spark 3.1 Feature ExpectationLT: Spark 3.1 Feature Expectation
LT: Spark 3.1 Feature ExpectationTakeshi Yamamuro
 
Quick Overview of Upcoming Spark 3.0 + α
Quick Overview of Upcoming Spark 3.0 + αQuick Overview of Upcoming Spark 3.0 + α
Quick Overview of Upcoming Spark 3.0 + αTakeshi Yamamuro
 
MLflowによる機械学習モデルのライフサイクルの管理
MLflowによる機械学習モデルのライフサイクルの管理MLflowによる機械学習モデルのライフサイクルの管理
MLflowによる機械学習モデルのライフサイクルの管理Takeshi Yamamuro
 
Taming Distributed/Parallel Query Execution Engine of Apache Spark
Taming Distributed/Parallel Query Execution Engine of Apache SparkTaming Distributed/Parallel Query Execution Engine of Apache Spark
Taming Distributed/Parallel Query Execution Engine of Apache SparkTakeshi Yamamuro
 
LLJVM: LLVM bitcode to JVM bytecode
LLJVM: LLVM bitcode to JVM bytecodeLLJVM: LLVM bitcode to JVM bytecode
LLJVM: LLVM bitcode to JVM bytecodeTakeshi Yamamuro
 
20180417 hivemall meetup#4
A x86-optimized rank&select dictionary for bit sequences

  • 1. x86/x64最適化勉強会#4 (x86/x64 Optimization Study Group #4): A x86-optimized rank/select dictionary for bit sequences, 2012/6/16, Takeshi Yamamuro
  • 2. What Is a Succinct Data Structure?
  • 3. SDS: Succinct Data Structure
      • Recently getting popular in some areas
          – Research and engineering
      • Not a data structure, but a data representation
          – A compression method for other data structures
          – e.g., alphabets, trees, and graphs
      • Transparent operations without explicit unpacking
          – e.g., succinct LZ77 compression*1
      *1 Kreft, S. and Navarro, G.: LZ77-Like Compression with Fast Random Access, In Proceedings of DCC, 2010
  • 4. More Details
      • SDS = Succinct Data + Succinct Index
      • Succinct Data
          – Compact representation of the target data
          – Close to the information-theoretic lower bound, e.g., if there are N possible patterns, the lower bound is log N bits
      • Succinct Index
          – O(1) operations on the target data
          – o(N) space cost: asymptotically negligible
  • 5. More Details: If you need more information, ... (cited from: http://goo.gl/rkQ5z)
  • 7. Rank/Select Operations
      • An SDS is composed of rank/select operations
          – Many calls to rank/select inside
      • Rank/select for succinct bit sequences: B[i]
          – rankx(n, B): the number of occurrences of x in B[0...n]
          – selectx(n, B): the position of the n-th x in B[]
      • Example:
            i     0  1  2  3  4  5  6  7  8
            B[i]  1  0  1  1  0  0  1  1  0
        rank1(5, B) = 3, select1(4, B) = 6
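As a reference point for the definitions on slide 7, here is a minimal linear-time sketch of both operations. It is not taken from dbitv. The function names, the argument order, and the choice to count the 1s in B[0..n) (exclusive upper bound) are assumptions made for illustration; the slide's example gives the same answers under either an inclusive or an exclusive bound.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive O(n) reference versions (hypothetical names, for illustration only).

// Number of 1s in bits[0..n), i.e. the first n positions.
std::size_t rank1(const std::vector<bool>& bits, std::size_t n) {
  assert(n <= bits.size());
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i) count += bits[i] ? 1 : 0;
  return count;
}

// 0-based position of the k-th 1 (k counts from 1), or -1 if there is none.
long select1(const std::vector<bool>& bits, std::size_t k) {
  std::size_t seen = 0;
  for (std::size_t i = 0; i < bits.size(); ++i) {
    if (bits[i] && ++seen == k) return static_cast<long>(i);
  }
  return -1;
}
```

With the bit sequence from the slide (1 0 1 1 0 0 1 1 0), `rank1(bits, 5)` yields 3 and `select1(bits, 4)` yields 6, matching the example. The rest of the deck is about replacing these linear scans with constant-time lookups.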
  • 8. Rank/Select Operations
      • Available rank/select implementations
          – ux-trie: http://code.google.com/p/ux-trie/
          – rx: http://code.google.com/p/mozc/
          – marisa-trie: http://code.google.com/p/marisa-trie/
      • Today's contribution
          – x86-optimized rank/select
          – https://github.com/maropu/dbitv
  • 9. Performance Results
      • Benchmark setup*1
          – Generate a random sequence of bits with 50% density
          – Issue random rank/select queries over the bits
          – CPU: Intel Core-i5 U470@1.33GHz
      • Observed latency
          – Median latency over 11 trials
      *1 Reference: http://d.hatena.ne.jp/s-yata/20111216/1324032373
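The measurement procedure on slide 9 (random bits at 50% density, repeated trials, median latency) could be sketched roughly as follows. This is not the benchmark code behind the reported numbers. The helper names `random_bits`, `median`, and `median_latency_ns`, the fixed seed, and the use of `std::chrono::steady_clock` are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Random bit sequence in which each bit is 1 with probability `density`.
std::vector<bool> random_bits(std::size_t n, double density, std::uint32_t seed) {
  std::mt19937 gen(seed);
  std::bernoulli_distribution coin(density);
  std::vector<bool> bits(n);
  for (std::size_t i = 0; i < n; ++i) bits[i] = coin(gen);
  return bits;
}

// Median of the samples (upper median for an even count).
double median(std::vector<double> v) {
  assert(!v.empty());
  auto mid = v.begin() + static_cast<std::ptrdiff_t>(v.size() / 2);
  std::nth_element(v.begin(), mid, v.end());
  return *mid;
}

// Runs `trials` timed trials of `queries` calls each and returns the median
// latency per call in nanoseconds. `query(q)` must return an integer.
template <typename Fn>
double median_latency_ns(Fn query, std::size_t queries, int trials = 11) {
  std::vector<double> per_call;
  volatile std::size_t sink = 0;  // keeps the calls from being optimized away
  for (int t = 0; t < trials; ++t) {
    auto start = std::chrono::steady_clock::now();
    for (std::size_t q = 0; q < queries; ++q) sink = sink + query(q);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    per_call.push_back(elapsed.count() / static_cast<double>(queries));
  }
  return median(per_call);
}
```

Taking the median rather than the mean makes the figure less sensitive to a single slow trial, which is presumably why the slide reports it.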
  • 10. Performance Results: Rank [Chart: averaged rank latency (ns), log-scale axis from 1.E+00 to 1.E+03, against bit length; series: ux, rx, marisa, opt]
  • 11. Performance Results: Select [Chart: averaged select latency (ns), log-scale axis from 1.E+00 to 1.E+04, against bit length; series: ux, rx, marisa, opt]
  • 13. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits
  • 14. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits; L[] = l1, l2, ...
      • Split B[] into fixed-length blocks of $\log^2 N$ bits
      • Total counts up to each block boundary are pre-computed in L[]
        $\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor \log^2 N} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \sum_{i=\lfloor x/\log^2 N \rfloor \log^2 N + 1}^{x} B[i]$
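  • 15. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits; L[] = l1, l2, ...
      • Split B[] into fixed-length blocks of $\log^2 N$ bits
      • Total counts up to each block boundary are pre-computed in L[]
        $\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor \log^2 N} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \sum_{i=\lfloor x/\log^2 N \rfloor \log^2 N + 1}^{x} B[i]$
      • Cost: the L[] lookup is O(1); the remaining sum scans up to $\log^2 N$ bits, i.e. $O(\log^2 N)$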
  • 16. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits, in blocks of $\log^2 N$ bits; L[] = l1, l2, ...
      • L[]: o(N) space cost
        $\frac{N}{\log^2 N} \cdot \log N = O\!\left(\frac{N}{\log N}\right) = o(N)$
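  • 17. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits, in blocks of $\log^2 N$ bits; L[] = l1, l2, ...; S[] = s1, s2, ...
      • Split each block again into fixed-length small blocks of $\frac{1}{2}\log N$ bits
      • Total counts up to each small-block boundary are pre-computed in S[]
        $\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor \log^2 N} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \underbrace{\sum_{i=\lfloor x/\log^2 N \rfloor \log^2 N + 1}^{\lfloor x/(\frac{1}{2}\log N) \rfloor \frac{1}{2}\log N} B[i]}_{S[\lfloor x/(\frac{1}{2}\log N) \rfloor]} + \sum_{i=\lfloor x/(\frac{1}{2}\log N) \rfloor \frac{1}{2}\log N + 1}^{x} B[i]$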
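  • 18. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits, in blocks of $\log^2 N$ bits; L[] = l1, l2, ...; S[] = s1, s2, ...
      • Split each block again into fixed-length small blocks of $\frac{1}{2}\log N$ bits
      • Total counts up to each small-block boundary are pre-computed in S[]
        $\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor \log^2 N} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \underbrace{\sum_{i=\lfloor x/\log^2 N \rfloor \log^2 N + 1}^{\lfloor x/(\frac{1}{2}\log N) \rfloor \frac{1}{2}\log N} B[i]}_{S[\lfloor x/(\frac{1}{2}\log N) \rfloor]} + \sum_{i=\lfloor x/(\frac{1}{2}\log N) \rfloor \frac{1}{2}\log N + 1}^{x} B[i]$
      • Cost: the L[] and S[] lookups are O(1) each; the last sum scans up to $\frac{1}{2}\log N$ bits, i.e. $O(\log N)$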
  • 19. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits, in blocks of $\log^2 N$ bits and small blocks of $\frac{1}{2}\log N$ bits; L[] = l1, l2, ...; S[] = s1, s2, ...
      • S[]: o(N) space cost
        $\frac{N}{\frac{1}{2}\log N} \cdot \log(\log^2 N) = O\!\left(\frac{N \log\log N}{\log N}\right) = o(N)$
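  • 20. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits, in blocks of $\log^2 N$ bits and small blocks of $\frac{1}{2}\log N$ bits; L[] = l1, l2, ...; S[] = s1, s2, ...
      • O(1) popcount or table lookup for the last term
        $\mathrm{rank}_1(x, B) = \sum_{i=1}^{x} B[i] = \underbrace{\sum_{i=1}^{\lfloor x/\log^2 N \rfloor \log^2 N} B[i]}_{L[\lfloor x/\log^2 N \rfloor]} + \underbrace{\sum_{i=\lfloor x/\log^2 N \rfloor \log^2 N + 1}^{\lfloor x/(\frac{1}{2}\log N) \rfloor \frac{1}{2}\log N} B[i]}_{S[\lfloor x/(\frac{1}{2}\log N) \rfloor]} + \sum_{i=\lfloor x/(\frac{1}{2}\log N) \rfloor \frac{1}{2}\log N + 1}^{x} B[i]$
      • Cost: the L[] and S[] lookups are O(1) each; the last sum drops from $O(\log N)$ to O(1)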
  • 21. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
      • B[] = a sequence of N bits, in blocks of $\log^2 N$ bits and small blocks of $\frac{1}{2}\log N$ bits; L[] = l1, l2, ...; S[] = s1, s2, ...
      • As a result, the total space cost is o(N)
        $\underbrace{\frac{N}{\log N}}_{L[]\ \text{size}} + \underbrace{\frac{4N \log\log N}{\log N}}_{S[]\ \text{size}} = O\!\left(\frac{N \log\log N}{\log N}\right) = o(N)$
  • 22. Implementation: Method of Four Russians
      • Rule: O(1) operation cost with o(N) space
  • 23. Implementation: Practice
      • Low computation cost but high cache penalties
          – 3 cache/TLB misses per rank
      • Example: rank1(402 = 256*1 + 32*4 + 18, B)
          – B[]: 01..000000....101......0 0110....001...............0 0000100 ... (256-bit blocks, 32-bit small blocks; popcount the remaining bits)
          – L[]: 18 21 …
          – S[]: 1 3 4 6 7 9 10 13 2 5 7 9 12 13 18 19 1 3 7 …
  • 24. Implementation: Practice
      • Low computation cost but high cache penalties
          – 3 cache/TLB misses per rank
      • Example: rank1(402 = 256*1 + 32*4 + 18, B)
          – B[]: 01..000000....101......0 0110....001...............0 0000100 ... (256-bit blocks, 32-bit small blocks; popcount the remaining bits): miss!
          – L[]: 18 21 …: miss!
          – S[]: 1 3 4 6 7 9 10 13 2 5 7 9 12 13 18 19 1 3 7 …: miss!
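The three memory accesses counted on slides 23 and 24 can be seen in a small two-level rank index like the one below. This is a sketch under assumptions, not the dbitv code. The struct name `RankIndex`, the 0-based exclusive convention (1s in B[0..x)), the bit order within a word, and the portable bit-trick popcount are all illustrative choices. Only the 256-bit and 32-bit block sizes come from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Portable population count: number of set bits in a 32-bit word.
inline std::uint32_t popcount32(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
  return (((v + (v >> 4)) & 0x0F0F0F0Fu) * 0x01010101u) >> 24;
}

// Hypothetical two-level index with three separate arrays.
struct RankIndex {
  static constexpr std::size_t kLarge = 256;  // bits per large block
  static constexpr std::size_t kSmall = 32;   // bits per small block (one word)

  std::vector<std::uint32_t> words;  // B[]: bit i is bit (i % 32) of words[i / 32]
  std::vector<std::uint32_t> large;  // L[]: 1s before each large block
  std::vector<std::uint8_t> small;   // S[]: 1s from the large-block start to each word
  std::size_t n_bits;                // number of addressable bits

  explicit RankIndex(std::vector<std::uint32_t> w)
      : words(std::move(w)), n_bits(words.size() * kSmall) {
    words.push_back(0u);  // padding word so that rank1(n_bits) stays in range
    std::uint32_t total = 0, in_block = 0;
    for (std::size_t i = 0; i < words.size(); ++i) {
      if (i % (kLarge / kSmall) == 0) {
        large.push_back(total);
        in_block = 0;
      }
      small.push_back(static_cast<std::uint8_t>(in_block));  // at most 224
      std::uint32_t c = popcount32(words[i]);
      total += c;
      in_block += c;
    }
  }

  // Number of 1s in B[0..x): one read each from L[], S[], and B[].
  std::uint32_t rank1(std::size_t x) const {
    assert(x <= n_bits);
    std::size_t wi = x / kSmall;
    std::uint32_t rem = static_cast<std::uint32_t>(x % kSmall);
    std::uint32_t mask = (1u << rem) - 1u;  // keeps the low `rem` bits
    return large[wi / (kLarge / kSmall)] + small[wi] + popcount32(words[wi] & mask);
  }
};
```

For x = 402 = 256*1 + 32*4 + 18, the lookup reads `large[1]`, `small[12]`, and the low 18 bits of `words[12]`. Because the three arrays live in different memory regions, each read can miss the cache independently, which is the cost the slide points out. On x86 the bit-trick popcount would normally be replaced by the hardware popcount instruction.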
  • 25. Implementation: Practice
      • Packing the required data into a single cache line
      • [Figure labels: 56B chunk; 4B; 1B ・・・; 32B (bits 0110....001..........0); 12B padding; padding; 64B cache line]
  • 26. Implementation: Practice
      • Packing the required data into a single cache line
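One way to picture the packing idea on slides 25 and 26 is a 64-byte-aligned struct that keeps the large count, the small counts, and the bits of one 256-bit block together, so that a rank query touches a single cache line. The exact layout in the slides cannot be recovered from the transcript, so the field order, field widths, and padding below are guesses, and `Chunk`, `build_chunks`, and `rank1` are hypothetical names. `alignas(64)` on elements of a `std::vector` is only honored by the default allocator from C++17 onward.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count: number of set bits in a 32-bit word.
inline std::uint32_t popcount32(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
  return (((v + (v >> 4)) & 0x0F0F0F0Fu) * 0x01010101u) >> 24;
}

// Assumed layout: 4B + 8 x 1B + 32B = 44B of data, padded to one 64B line.
struct alignas(64) Chunk {
  std::uint32_t large;    // 1s before this chunk
  std::uint8_t small[8];  // 1s before each 32-bit word inside the chunk
  std::uint32_t bits[8];  // 256 bits of payload
};
static_assert(sizeof(Chunk) == 64, "a chunk should fill exactly one cache line");

// Builds the chunks from packed 32-bit words (bit i is bit i % 32 of words[i / 32]).
std::vector<Chunk> build_chunks(const std::vector<std::uint32_t>& words) {
  std::vector<Chunk> chunks(words.size() / 8 + 1);  // +1 keeps rank1(all bits) in range
  std::uint32_t total = 0;
  for (std::size_t c = 0; c < chunks.size(); ++c) {
    Chunk& ch = chunks[c];
    ch.large = total;
    std::uint32_t in_chunk = 0;
    for (std::size_t j = 0; j < 8; ++j) {
      std::size_t wi = c * 8 + j;
      ch.bits[j] = wi < words.size() ? words[wi] : 0u;
      ch.small[j] = static_cast<std::uint8_t>(in_chunk);
      in_chunk += popcount32(ch.bits[j]);
    }
    total += in_chunk;
  }
  return chunks;
}

// Number of 1s in B[0..x); every read falls inside chunks[x / 256].
std::uint32_t rank1(const std::vector<Chunk>& chunks, std::size_t x) {
  assert(x / 256 < chunks.size());
  const Chunk& ch = chunks[x / 256];
  std::size_t j = (x % 256) / 32;
  std::uint32_t rem = static_cast<std::uint32_t>(x % 32);
  return ch.large + ch.small[j] + popcount32(ch.bits[j] & ((1u << rem) - 1u));
}
```

Compared with three separate arrays, the arithmetic is the same, but the three reads now share one cache line, trading some padding bytes for fewer misses.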
  • 27. Implementation: Practice
      • By the way, what about select?
          – Omitted because of the time limit
          – Please see the code ...
      • Two implementation approaches
          – O(log N) complexity
              • ux-trie, rx, and marisa-trie
              • Binary search using rank
              • Suffers many cache/TLB misses
          – O(1) complexity
              • My implementation, designed to minimize these penalties
              • One rank, one SIMD comparison, and an O(1) bsf
              • Only 2 cache/TLB misses
  • 28. Implementation: Practice
      • By the way, what about select?
          – Omitted because of the time limit
          – Please see the code ...
      • Two implementation approaches
          – O(log N) complexity
              • ux-trie, rx, and marisa-trie
              • Binary search using rank
              • Suffers many cache/TLB misses
          – O(1) complexity
              • My implementation, designed to minimize these penalties
              • One rank, one SIMD comparison, and an O(1) bsf
              • Only 2 cache/TLB misses
      • Not implemented yet ...
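The O(log N) approach listed above (binary search using rank) can be sketched generically as below. This illustrates the idea only and is not code from ux-trie, rx, marisa-trie, or dbitv. `select1_by_rank` and `prefix_counts` are hypothetical names, and the rank callable is assumed to return the number of 1s in B[0..i).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// select by binary search over a rank function.
// rank(i) must return the number of 1s in B[0..i) for 0 <= i <= n.
// Returns the 0-based position of the k-th 1 (k >= 1), or n if there are
// fewer than k ones.
template <typename RankFn>
std::size_t select1_by_rank(RankFn rank, std::size_t n, std::size_t k) {
  assert(k >= 1);
  if (rank(n) < k) return n;
  std::size_t lo = 0, hi = n;
  // Find the smallest p with rank(p + 1) >= k; B[p] is then the k-th 1.
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (rank(mid + 1) >= k) {
      hi = mid;
    } else {
      lo = mid + 1;
    }
  }
  return lo;
}

// Simple stand-in for a rank structure: pre[i] = number of 1s in bits[0..i).
std::vector<std::size_t> prefix_counts(const std::vector<bool>& bits) {
  std::vector<std::size_t> pre(bits.size() + 1, 0);
  for (std::size_t i = 0; i < bits.size(); ++i) pre[i + 1] = pre[i] + (bits[i] ? 1 : 0);
  return pre;
}
```

Each probe of the binary search calls rank at a different position, so with a large index the probes land on different cache lines. That is the source of the many cache/TLB misses noted for this approach.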