Tiny Google Projects
- 11. Tesseract OCR
Arabic, English, Bulgarian, Catalan, Czech,
Chinese (Simplified and Traditional), Danish
(standard and Fraktur script), German, Greek,
Finnish, French, Hebrew, Croatian, Hungarian,
Indonesian, Italian, Japanese, Korean, Latvian,
Lithuanian, Dutch, Norwegian, Polish,
Portuguese, Romanian, Russian, Slovak
(standard and Fraktur script), Slovenian,
Spanish, Serbian, Swedish, Tagalog, Thai,
Turkish, Ukrainian and Vietnamese
- 22. google protocol buffers
message Person {
  required int32 id = 1;
  required string name = 2;
  optional string email = 3;
}

Person person;
person.set_id(123);
person.set_name("Bob");
person.set_email("bob@example.com");
fstream out("person.pb", ios::out ...
person.SerializeToOstream(&out);
out.close();
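To make the serialization step a bit more concrete, here is a hand-rolled sketch of the bytes the protobuf wire format would hold for this Person. It does not use the protobuf library; the helper names (`put_varint`, `put_tag`, `put_string_field`, `encode_person`) are hypothetical and exist only for illustration. Real code would use the protoc-generated `Person` class as on the slide.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Wire types used by this sketch (varint and length-delimited).
enum WireType : std::uint32_t { kVarint = 0, kLengthDelimited = 2 };

// Append v as a base-128 varint: 7 payload bits per byte, low bits first,
// high bit set on every byte except the last.
void put_varint(std::string& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// A field key is (field_number << 3) | wire_type, itself written as a varint.
void put_tag(std::string& out, std::uint32_t field, WireType type) {
    assert(field >= 1);  // field numbers start at 1
    put_varint(out, (static_cast<std::uint64_t>(field) << 3) | type);
}

// Strings are length-delimited: key, byte length, then the raw bytes.
void put_string_field(std::string& out, std::uint32_t field,
                      const std::string& s) {
    put_tag(out, field, kLengthDelimited);
    put_varint(out, s.size());
    out += s;
}

// Hypothetical hand encoder for the Person message on the slide.
// Assumption: all three fields are set and id is non-negative
// (negative int32 values need a 10-byte sign-extended varint, omitted here).
std::string encode_person(std::int32_t id, const std::string& name,
                          const std::string& email) {
    assert(id >= 0);
    std::string out;
    put_tag(out, 1, kVarint);
    put_varint(out, static_cast<std::uint64_t>(id));
    put_string_field(out, 2, name);
    put_string_field(out, 3, email);
    return out;
}
```

For id 123, name "Bob" and the slide's email address, `encode_person` returns 24 bytes beginning 08 7B 12 03 42 6F 62.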
- 23. 512 bytes / tweet
340,000,000 tweets / day (2012)
7,253,333,333 bytes / hour
2,014,814 bytes / second
1.921 Mbytes / second
15.371 Mbits / second
8 Tbytes / day (2011)
Google: ~ 377M searches/day
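The tweet arithmetic on this slide can be reproduced with a short sketch. It takes the slide's own inputs (512 bytes per tweet, 340,000,000 tweets per day) and assumes binary megabytes (1 Mbyte = 1024 * 1024 bytes), which is what the slide's figures appear to use. The function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per hour for a stream of fixed-size messages.
// Integer division truncates, which matches the slide's figures.
std::uint64_t bytes_per_hour(std::uint64_t bytes_per_msg,
                             std::uint64_t msgs_per_day) {
    // Guard against overflow in the multiplication below.
    assert(bytes_per_msg == 0 || msgs_per_day <= UINT64_MAX / bytes_per_msg);
    return bytes_per_msg * msgs_per_day / 24;
}

// Bytes per second, derived from the (truncated) hourly figure.
std::uint64_t bytes_per_second(std::uint64_t bytes_per_msg,
                               std::uint64_t msgs_per_day) {
    return bytes_per_hour(bytes_per_msg, msgs_per_day) / 3600;
}

// Assumption: binary megabytes, 1 Mbyte = 1024 * 1024 bytes.
double to_mbytes(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// 8 bits per byte.
double to_mbits(std::uint64_t bytes) {
    return to_mbytes(bytes) * 8.0;
}
```

With 512 and 340,000,000 as inputs this gives 7,253,333,333 bytes per hour and 2,014,814 bytes per second, which is about 1.92 Mbytes or 15.37 Mbits per second.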
- 33. Size (less is better)
[Bar chart: compression ratio (%) per compressor: lzjb 2010, lzo 2.04 1x, fastlz 0.1 -1, fastlz 0.1 -2, lzf 3.6 vf, lzf 3.6 uf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy 1.0, quicklz 1.5.0 -1, quicklz 1.5.0 -2]
- 34. Data types
[Bar chart: compression ratio of snappy vs. zlib for plain text, html, jpeg]
- 36. Speed (more is better)
[Bar chart: compression speed (MB/s) per compressor: lzjb 2010, lzo 2.04 1x, fastlz 0.1 -1, fastlz 0.1 -2, lzf 3.6 vf, lzf 3.6 uf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy 1.0, quicklz 1.5.0 -1, quicklz 1.5.0 -2]
- 37. Speed (more is better)
[Bar chart: decompression speed (MB/s) per compressor: lzjb 2010, lzo 2.04 1x, fastlz 0.1 -1, fastlz 0.1 -2, lzf 3.6 vf, lzf 3.6 uf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy 1.0, quicklz 1.5.0 -1, quicklz 1.5.0 -2]
- 38. On 1 core of 64-bit Core i7 processor:
• Compression: 250MB/s
• Decompression: 500MB/s
:P
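For a rough feel of what these rates mean, the sketch below estimates single-core processing time for a given input size. It assumes 1 MB = 1,000,000 bytes and a constant rate, which real data will not guarantee; `seconds_at` and the two constants are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Throughputs quoted on the slide, in MB/s (1 core, 64-bit Core i7).
const double kCompressMBps = 250.0;
const double kDecompressMBps = 500.0;

// Seconds needed to process `bytes` at `mb_per_s`.
// Assumption: 1 MB = 1,000,000 bytes and a constant rate.
double seconds_at(std::uint64_t bytes, double mb_per_s) {
    assert(mb_per_s > 0.0);
    return static_cast<double>(bytes) / (mb_per_s * 1e6);
}
```

Under these assumptions, 1,000,000,000 bytes would take about 4 seconds to compress and about 2 seconds to decompress.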
- 43. @TarasRoshko
HTTP headers here:
http://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
- 44. QA? Ostap Andrusiv
Software Engineer
Eleks software
@p1f
Editor's Notes
- http://www.cloudera.com/blog/2011/09/snappy-and-hadoop/
- In-memory test (compression and decompression) with ENWIK8 using 1 core of Intel Xeon X5355 @ 2.66GHz (64-bit compilation under gcc 4.1.1 (Linux) -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999 -DNDEBUG)
- Compression ratios. plain text: snappy 1.5-1.7, zlib 2.7; html: snappy 2-4, zlib 3-7; jpeg: snappy 1, zlib 1
- http://aws.amazon.com/glacier/
- http://pastebin.com/SFaNzRuf
- http://encode.ru/threads/1255-Google-released-Snappy-compression-decompression-library