How to Write the Fastest JSON Parser/Writer in the World

How RapidJSON is developed in order to achieve highest performance among 20 C/C++ JSON libraries. Benchmarks, some C++ design, algorithm and low-level optimizations are covered.


  1. 1. How to Write the Fastest JSON Parser/Writer in the World Milo Yip Tencent 28 Mar 2015
  2. 2. Milo Yip 叶劲峰 • Expert Engineer (2011 to now) – Engine Technology Center, R & D Department, Interactive Entertainment Group (IEG), Tencent • Master of Philosophy in System Engineering & Engineering Management, CUHK • Bachelor of Cognitive Science, HKU • https://github.com/miloyip • http://www.cnblogs.com/miloyip • http://www.zhihu.com/people/miloyip
  3. 3. Table of Contents 1. Introduction 2. Benchmark 3. Design 4. Limitations 5. Thoughts 6. References
  4. 4. 1. INTRODUCTION
  5. 5. JSON • JavaScript Object Notation • Alternative to XML • Human-readable text to transmit/persist data • RFC 7159/ECMA-404 • Common uses – Open API (e.g. Twitter, Facebook, etc.) – Data storage/exchange (e.g. GeoJSON)
  6. 6. RapidJSON • https://github.com/miloyip/rapidjson • MIT License • C++ Header-only Library • Started in Nov 2011 • Inspired by RapidXML • Will release 1.0 under Tencent *soon*
  7. 7. Features • Both SAX and DOM style API • Fast • Cross platform/compiler • No dependencies • Memory friendly • UTF-8/16/32/ASCII and transcoding • In-situ Parsing • More at http://miloyip.github.io/rapidjson/md_doc_features.html
  8. 8. Hello RapidJSON!

        #include "rapidjson/document.h"
        #include "rapidjson/writer.h"
        #include "rapidjson/stringbuffer.h"
        #include <iostream>

        using namespace rapidjson;

        int main() {
            // 1. Parse a JSON string into DOM.
            const char* json = "{\"project\":\"rapidjson\",\"stars\":10}";
            Document d;
            d.Parse(json);

            // 2. Modify it by DOM.
            Value& s = d["stars"];
            s.SetInt(s.GetInt() + 1);

            // 3. Stringify the DOM.
            StringBuffer buffer;
            Writer<StringBuffer> writer(buffer);
            d.Accept(writer);

            // Output {"project":"rapidjson","stars":11}
            std::cout << buffer.GetString() << std::endl;
            return 0;
        }
  9. 9. Fast, AND Reliable • 103 Unit Tests • Continuous Integration – Travis on Linux – AppVeyor on Windows – Valgrind (Linux) for memory leak checking • Used in real applications – Used in client and server applications at Tencent – A user reported parsing 50 million JSON documents daily
  10. 10. Public Projects using RapidJSON • Cocos2D-X: Cross-Platform 2D Game Engine http://cocos2d-x.org/ • Microsoft Bond: Cross-Platform Serialization https://github.com/Microsoft/bond/ • Google Angle: OpenGL ES 2 for Windows https://chromium.googlesource.com/angle/angle/ • CERN LHCb: Large Hadron Collider beauty http://lhcb-comp.web.cern.ch/lhcb-comp/ • Tell me if you know more
  11. 11. 2. BENCHMARK
  12. 12. Benchmarks for Native JSON libraries • https://github.com/miloyip/nativejson-benchmark • Compare 20 open source C/C++ JSON libraries • Evaluate speed, memory and code size • For parsing, stringify, traversal, and more
  13. 13. Libraries • CAJUN • Casablanca • cJSON • dropbox/json11 • FastJson • gason • jansson • json-c • json spirit • Json Box • JsonCpp • JSON++ • parson • picojson • RapidJSON • simplejson • udp/json • ujson4c • vincenthz/libjson • YAJL
  14. 14. Results: Parsing Speed
  15. 15. Results: Parsing Memory
  16. 16. Results: Stringify Speed
  17. 17. Results: Code Size
  18. 18. Benchmarks for Spine • Spine is a 2D skeletal animation tool • Spine-C is the official runtime in C https://github.com/EsotericSoftware/spine-runtimes/tree/master/spine-c • It uses JSON as data format • It has a custom JSON parser • Adapt RapidJSON and compare loading time
  19. 19. Test Data • http://esotericsoftware.com/forum/viewtopic.php?f=3&t=2831 • Original 80KB JSON • Interpolate to get multiple JSON files • Load 100 times
  20. 20. Results
  21. 21. 3. DESIGN
  22. 22. The Zero Overhead Principle • Bjarne Stroustrup[1]: “What you don't use, you don't pay for.” • RapidJSON tries to obey this principle – SAX and DOM – Combinable options, configurations
  23. 23. SAX and DOM • When parsing JSON into a DOM, SAX events are used to build the DOM. • When stringifying a DOM, it is traversed and generates events for a SAX handler. • Example JSON: {"hello":"world", "t":true, "f":false, "n":null, "i":123, "pi":3.1416, "a":[1, 2, 3, 4]} • Corresponding SAX events:

        StartObject()
        Key("hello", 5, true)
        String("world", 5, true)
        Key("t", 1, true)
        Bool(true)
        Key("f", 1, true)
        Bool(false)
        Key("n", 1, true)
        Null()
        Key("i")
        Uint(123)
        Key("pi")
        Double(3.1416)
        Key("a")
        StartArray()
        Uint(1)
        Uint(2)
        Uint(3)
        Uint(4)
        EndArray(4)
        EndObject(7)
  24. 24. Architecture (diagram) • DOM: Value, Document • SAX: Reader, Writer • Concepts: Handler, Stream, Encoding, Allocator • Relationships shown in the diagram: calls, implements, accepts, has
  25. 25. Handler: Template Parameter • Handler handles SAX event callbacks • How to implement callbacks? – Traditional: virtual function – RapidJSON: template parameter • No virtual function overhead • Inline callback functions

        template <unsigned parseFlags, typename InputStream, typename Handler>
        ParseResult Reader::Parse(InputStream& is, Handler& handler);
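To make the template-parameter idea concrete, here is a minimal, self-contained sketch. It does not use RapidJSON: `ParseUints`, `SumHandler` and `SumOfUints` are hypothetical names invented for this illustration, and the "reader" only recognises unsigned integers. The point it shows is that the handler's `Uint()` call is bound at compile time, so no virtual dispatch is involved.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical miniature reader: it only recognises unsigned integers and
// reports each one through handler.Uint(). The handler type is a template
// parameter, so the call is resolved at compile time and can be inlined.
template <typename Handler>
bool ParseUints(const std::string& text, Handler& handler) {
    std::size_t i = 0;
    while (i < text.size()) {
        if (std::isdigit(static_cast<unsigned char>(text[i]))) {
            unsigned value = 0;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                value = value * 10 + static_cast<unsigned>(text[i] - '0');
                ++i;
            }
            if (!handler.Uint(value)) return false;  // handler may abort parsing
        } else {
            ++i;  // skip separators; a real reader would validate them
        }
    }
    return true;
}

// Any type with a matching Uint() member works; no base class is required.
struct SumHandler {
    unsigned long long sum = 0;
    bool Uint(unsigned u) { sum += u; return true; }
};

// Convenience wrapper used to exercise the sketch.
inline unsigned long long SumOfUints(const std::string& text) {
    SumHandler h;
    ParseUints(text, h);
    return h.sum;
}
```

Calling `SumOfUints("[1, 2, 3, 4]")` feeds four `Uint` events to the handler. A virtual-function design would need a common base class and an indirect call per event.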
  26. 26. Parsing Options: Template Argument • Many parse options -> Zero overhead principle • Use integer template argument • Compiler optimization eliminates unused code

        template <unsigned parseFlags, typename InputStream, typename Handler>
        ParseResult Reader::Parse(InputStream& is, Handler& handler);

        if (parseFlags & kParseInsituFlag) {
            // ...
        }
        else {
            // ...
        }
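A small standalone sketch of the same pattern follows. The flag names (`kSketchNoFlags`, `kSketchUppercaseFlag`) and the function `CopyString` are made up for this example and are not RapidJSON constants. Because `flags` is a compile-time constant, each instantiation contains a branch whose outcome is known, which an optimizing compiler can be expected to remove.

```cpp
#include <string>

// Hypothetical flags for this sketch; they are not RapidJSON's constants.
enum SketchFlag : unsigned {
    kSketchNoFlags = 0,
    kSketchUppercaseFlag = 1
};

// The flag test depends only on a compile-time constant, so the optimizer
// can drop the branch that is not taken in each instantiation.
template <unsigned flags>
std::string CopyString(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (flags & kSketchUppercaseFlag) {
            if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
        }
        out.push_back(c);
    }
    return out;
}
```

`CopyString<kSketchNoFlags>` and `CopyString<kSketchUppercaseFlag>` are two separate functions, so users who never enable an option do not pay for its check at run time.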
  27. 27. Recursive SAX Parser • Simple to write/optimize by hand • Use program stack to maintain parsing state of the tree structure • Prone to stack overflow – So also provide an iterative parser (Contributed by Don Ding @thebusytypist)
  28. 28. Normal Parsing
  29. 29. In situ Parsing No allocation and copying for strings! Cache Friendly!
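The in situ idea can be sketched without RapidJSON as follows. `DecodeStringInSitu` is a hypothetical helper written for this illustration; it handles only a few escapes and assumes the input buffer is mutable and NUL-terminated. It decodes a JSON string inside the source buffer and returns a pointer into that buffer, so no string storage is allocated or copied elsewhere.

```cpp
#include <cstddef>

// Decodes the JSON string starting at buf[pos] (which must be '"') in place.
// Escapes shrink the text, so the write cursor never overtakes the read cursor.
// On success returns a pointer into buf to the NUL-terminated decoded string
// and stores its length in *length; returns nullptr on malformed input.
// Only \" \\ \/ \n \t are handled here; \uXXXX and others are rejected.
inline char* DecodeStringInSitu(char* buf, std::size_t pos, std::size_t* length) {
    if (buf[pos] != '"') return nullptr;
    char* start = buf + pos + 1;
    char* read = start;
    char* write = start;
    for (;;) {
        char c = *read++;
        if (c == '\0') return nullptr;  // unterminated string
        if (c == '"') break;            // closing quote
        if (c == '\\') {
            const char e = *read++;
            switch (e) {
                case '"':  c = '"';  break;
                case '\\': c = '\\'; break;
                case '/':  c = '/';  break;
                case 'n':  c = '\n'; break;
                case 't':  c = '\t'; break;
                default:   return nullptr;  // unsupported or truncated escape
            }
        }
        *write++ = c;
    }
    *write = '\0';  // lands on the closing quote or on already-consumed bytes
    *length = static_cast<std::size_t>(write - start);
    return start;
}
```

The trade-off is that the source buffer is modified and must outlive every string that points into it.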
  30. 30. Parsing Number: the Pain ;( • RapidJSON supports parsing JSON number to uint32_t, int32_t, uint64_t, int64_t, double • Difficult to detect in single pass • Even more difficult for double (strtod() is slow) • Implemented kFullPrecision option using 1. Fast-path 2. DIY-FP (https://github.com/floitsch/double-conversion) 3. Big Integer method [2]
  31. 31. How difficult? • PHP Hangs On Numeric Value 2.2250738585072011e-308 http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/ • Java Hangs When Converting 2.2250738585072012e-308 http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/ • "2.22507385850720113605740979670913197593481954635164564e-308" → 2.2250738585072009e-308 • "2.22507385850720113605740979670913197593481954635164565e-308" → 2.2250738585072014e-308 • And need to be fast…
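As an illustration of what a "fast path" for decimal-to-double conversion typically looks like, here is a sketch of the classic exact case described by Clinger [2]. It is not RapidJSON's code: `FastPathToDouble` is a hypothetical name, and the sketch assumes IEEE 754 double arithmetic without extended intermediate precision. When the decimal significand fits in 53 bits and the power of ten is at most 10^22, both operands are exact doubles, so one multiplication or division yields the correctly rounded result.

```cpp
#include <cstdint>

// Returns false when the input is outside the fast path; a caller would then
// fall back to a slower exact method (for example DIY-FP or big integers).
inline bool FastPathToDouble(std::uint64_t significand, int exp10, double* out) {
    // Powers of ten up to 1e22 are exactly representable as doubles.
    static const double kPow10[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
        1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
    };
    // Integers up to 2^53 convert to double without rounding.
    if (significand > (std::uint64_t(1) << 53)) return false;
    if (exp10 < -22 || exp10 > 22) return false;
    const double d = static_cast<double>(significand);
    *out = exp10 >= 0 ? d * kPow10[exp10] : d / kPow10[-exp10];
    return true;
}
```

For the text "3.1416" a parser would pass `significand = 31416` and `exp10 = -4`. Inputs such as the 54-digit examples above fall outside this path, which is why slower exact methods are still needed.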
  32. 32. DOM Designed for Fast Parsing • A JSON value can be one of 6 types – object, array, number, string, boolean, null • Inheritance needs new for each value • RapidJSON uses a single variant type Value
  33. 33. Layout of Value (one row per representation, fields in order)

        String:  Ch* str, SizeType length, unsigned flags
        Number:  int i, unsigned u, int64_t i64, uint64_t u64, double d (the slide also shows two 0 entries), unsigned flags
        Object:  Member* members, SizeType size, SizeType capacity, unsigned flags
        Array:   Value* values, SizeType size, SizeType capacity, unsigned flags
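A cut-down sketch of such a variant type is shown below. `SketchValue` and its members are hypothetical and only loosely follow the layout above: the alternatives share one union, and a separate flags word records which one is active. Objects are left out to keep the example short.

```cpp
#include <cstdint>

typedef unsigned SizeType;  // assumption: 32-bit size type, as on the slide

// Hypothetical cut-down variant: every JSON type shares one union, and a
// flags word records which member is active. No per-value heap allocation
// or virtual dispatch is needed to represent a value.
struct SketchValue {
    enum Flags : unsigned {
        kNullFlag, kInt64Flag, kDoubleFlag, kStringFlag, kArrayFlag
    };

    struct String { const char* str; SizeType length; };
    struct Array { SketchValue* values; SizeType size; SizeType capacity; };
    union Data {
        std::int64_t i64;
        double d;
        String s;
        Array a;
    };

    Data data;
    unsigned flags;

    static SketchValue Null() {
        SketchValue v; v.data.i64 = 0; v.flags = kNullFlag; return v;
    }
    static SketchValue Int64(std::int64_t i) {
        SketchValue v; v.data.i64 = i; v.flags = kInt64Flag; return v;
    }
    static SketchValue Double(double d) {
        SketchValue v; v.data.d = d; v.flags = kDoubleFlag; return v;
    }
    // Stores a reference to s; the caller keeps ownership of the characters.
    static SketchValue StringRef(const char* s, SizeType n) {
        SketchValue v; v.data.s.str = s; v.data.s.length = n;
        v.flags = kStringFlag; return v;
    }

    bool IsNumber() const { return flags == kInt64Flag || flags == kDoubleFlag; }
    bool IsString() const { return flags == kStringFlag; }
    // Precondition: IsNumber().
    double GetDouble() const {
        return flags == kDoubleFlag ? data.d : static_cast<double>(data.i64);
    }
};
```

Because every value has the same size, an array of values is one contiguous block, unlike an inheritance hierarchy where each element is allocated separately with `new`.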
  34. 34. Move Semantics • Deep copying object/array/string is slow • RapidJSON enforces move semantics
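The effect of enforced move semantics can be sketched with a move-only type in C++11. `SketchString` is a hypothetical class for illustration; RapidJSON's own mechanism may differ in detail. Transferring a value steals its buffer and leaves the source empty, so no deep copy is made.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical move-only string: copying is disabled, moving steals the buffer.
class SketchString {
public:
    explicit SketchString(const char* s)
        : size_(std::strlen(s)), data_(new char[size_ + 1]) {
        std::memcpy(data_, s, size_ + 1);  // copies the terminating NUL too
    }
    ~SketchString() { delete[] data_; }

    // Deep copies are not allowed.
    SketchString(const SketchString&) = delete;
    SketchString& operator=(const SketchString&) = delete;

    // Moving transfers ownership and leaves the source in a null state.
    SketchString(SketchString&& other) noexcept
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }
    SketchString& operator=(SketchString&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            size_ = other.size_;
            data_ = other.data_;
            other.size_ = 0;
            other.data_ = nullptr;
        }
        return *this;
    }

    bool IsNull() const { return data_ == nullptr; }
    const char* c_str() const { return data_ ? data_ : ""; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;  // declared before data_ so it is initialized first
    char* data_;
};
```

After `SketchString b(std::move(a));` the buffer pointer in `b` is the one `a` used to hold, and `a.IsNull()` is true.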
  35. 35. The Default Allocator • Internally allocates a singly linked list of buffers • Does not free individual objects (thus FAST!) • Suitable for parsing (creating values consecutively) • Not suitable for DOM manipulation
  36. 36. Custom Initial Buffer • User can provide a custom initial buffer – For example, buffer on stack, scratch buffer • The allocator uses that buffer first until it is full • Possible to achieve zero allocation in parsing
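The two allocator slides can be illustrated with a small bump allocator. `SketchPoolAllocator` is a hypothetical class, not RapidJSON's allocator: it hands out memory from a user-supplied buffer first, then from heap chunks kept in a singly linked list, and it never frees individual objects. The sketch assumes the user buffer is suitably aligned.

```cpp
#include <cstddef>
#include <cstdlib>

class SketchPoolAllocator {
public:
    // userBuffer may be null; it is used first and is never freed by the pool.
    // Assumption: userBuffer is aligned to kAlign bytes.
    SketchPoolAllocator(void* userBuffer, std::size_t userSize,
                        std::size_t chunkSize = 4096)
        : head_(nullptr),
          cur_(static_cast<char*>(userBuffer)),
          end_(cur_ ? cur_ + userSize : nullptr),
          chunkSize_(chunkSize),
          heapChunks_(0) {}

    // All heap chunks are released together when the pool is destroyed.
    ~SketchPoolAllocator() {
        while (head_) {
            Chunk* next = head_->next;
            std::free(head_);
            head_ = next;
        }
    }
    SketchPoolAllocator(const SketchPoolAllocator&) = delete;
    SketchPoolAllocator& operator=(const SketchPoolAllocator&) = delete;

    void* Malloc(std::size_t size) {
        size = (size + kAlign - 1) & ~(kAlign - 1);  // keep allocations aligned
        if (cur_ == nullptr || static_cast<std::size_t>(end_ - cur_) < size) {
            if (!AddChunk(size > chunkSize_ ? size : chunkSize_)) return nullptr;
        }
        void* p = cur_;
        cur_ += size;  // bump the cursor; nothing else to record
        return p;
    }
    static void Free(void*) {}  // individual frees are no-ops

    std::size_t HeapChunkCount() const { return heapChunks_; }

private:
    struct Chunk { Chunk* next; };
    static constexpr std::size_t kAlign = 16;

    bool AddChunk(std::size_t capacity) {
        const std::size_t header = (sizeof(Chunk) + kAlign - 1) & ~(kAlign - 1);
        char* raw = static_cast<char*>(std::malloc(header + capacity));
        if (!raw) return false;
        Chunk* c = reinterpret_cast<Chunk*>(raw);
        c->next = head_;  // push onto the singly linked list of chunks
        head_ = c;
        cur_ = raw + header;
        end_ = cur_ + capacity;
        ++heapChunks_;
        return true;
    }

    Chunk* head_;
    char* cur_;
    char* end_;
    std::size_t chunkSize_;
    std::size_t heapChunks_;
};
```

With a 256-byte stack buffer, two 100-byte requests are served from the stack with no heap allocation. Only the third request triggers a heap chunk.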
  37. 37. Short String Optimization • Many JSON keys are short • Contributor @Kosta-Github submitted a PR to optimize short strings • String layout: Ch* str, SizeType length, unsigned flags • ShortString layout: Ch str[11]; uint8_t x; unsigned flags • Let length = 11 – x • So for an 11-char string, x is 0 and doubles as the terminating '\0'
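The length trick can be sketched in a few lines. `SketchShortString` is a hypothetical type for illustration; it merges `str[11]` and `x` into one 12-byte array so that reading through the last byte is well defined, and it omits the flags word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical inline short string: buf[0..10] hold the characters and
// buf[11] holds (11 - length). For an 11-char string that byte is 0, so it
// also serves as the NUL terminator.
struct SketchShortString {
    enum { kMaxChars = 11 };
    char buf[kMaxChars + 1];

    static bool Fits(std::size_t length) { return length <= kMaxChars; }

    void Set(const char* s, std::size_t length) {
        assert(Fits(length));
        std::memcpy(buf, s, length);
        if (length < kMaxChars) buf[length] = '\0';
        buf[kMaxChars] = static_cast<char>(kMaxChars - length);
    }
    std::size_t Length() const {
        return static_cast<std::size_t>(kMaxChars - buf[kMaxChars]);
    }
    const char* CStr() const { return buf; }
};

static_assert(sizeof(SketchShortString) == 12, "11 chars plus one length byte");
```

Storing `11 - length` instead of `length` is what lets a full 11-character string fit without a separate terminator byte.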
  38. 38. SIMD Optimization • Using SSE2/SSE4 to skip whitespaces (space, tab, LF, CR) • Each iteration compares 16 chars against the 4 whitespace chars • Fast for JSON with indentation • Visual C++ 2010 32-bit test, time to skip 1M whitespace characters (ms) – strlen() (for reference): 752 – strspn(): 3011 – RapidJSON (no SIMD): 1349 – RapidJSON (SSE2): 170 – RapidJSON (SSE4): 102
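A minimal SSE2 sketch of this idea is given below. It is an illustration, not RapidJSON's code: `SkipWhitespace` and `IsJsonSpace` are hypothetical names, the function works on a pointer and length, and it falls back to a scalar loop when SSE2 is not available. Each iteration loads 16 bytes, compares them against the four whitespace characters, and stops at the first block that contains anything else.

```cpp
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#else
#define SKETCH_HAS_SSE2 0
#endif

inline bool IsJsonSpace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Returns the index of the first non-whitespace byte in [p, p + n),
// or n if every byte is whitespace.
inline std::size_t SkipWhitespace(const char* p, std::size_t n) {
    std::size_t i = 0;
#if SKETCH_HAS_SSE2
    const __m128i sp  = _mm_set1_epi8(' ');
    const __m128i tab = _mm_set1_epi8('\t');
    const __m128i lf  = _mm_set1_epi8('\n');
    const __m128i cr  = _mm_set1_epi8('\r');
    for (; i + 16 <= n; i += 16) {
        const __m128i s =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(p + i));
        __m128i m = _mm_cmpeq_epi8(s, sp);
        m = _mm_or_si128(m, _mm_cmpeq_epi8(s, tab));
        m = _mm_or_si128(m, _mm_cmpeq_epi8(s, lf));
        m = _mm_or_si128(m, _mm_cmpeq_epi8(s, cr));
        // Bit k of the mask is set when byte k of the block is whitespace.
        const unsigned ws = static_cast<unsigned>(_mm_movemask_epi8(m));
        if (ws != 0xFFFFu) break;  // this block holds a non-whitespace byte
    }
#endif
    // The scalar loop finishes the partial block and the tail.
    while (i < n && IsJsonSpace(p[i])) ++i;
    return i;
}
```

The scalar tail keeps the sketch simple. A tuned version could instead locate the first zero bit of the mask directly.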
  39. 39. Integer-to-String Optimization • Integer-To-String conversion is simple – E.g. 123 -> “123” • But standard library is quite slow – E.g. sprintf(), _itoa(), etc. • Tried various implementations
  40. 40. My implementations • https://github.com/miloyip/itoa-benchmark • Visual C++ 2013 on Windows 64-bit
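One widely used technique for faster integer-to-string conversion is a two-digit lookup table, which halves the number of divisions. The sketch below shows that technique under the hypothetical name `U32ToA`; it is offered as an example of the kind of implementation such a benchmark compares, not as the exact code from the slides.

```cpp
#include <cstdint>

// Writes the decimal form of value into buffer (at least 11 bytes, including
// the terminating NUL) and returns a pointer to the terminator.
inline char* U32ToA(std::uint32_t value, char* buffer) {
    // "00" "01" ... "99": two characters per entry.
    static const char kDigits[201] =
        "00010203040506070809"
        "10111213141516171819"
        "20212223242526272829"
        "30313233343536373839"
        "40414243444546474849"
        "50515253545556575859"
        "60616263646566676869"
        "70717273747576777879"
        "80818283848586878889"
        "90919293949596979899";
    char temp[10];       // a 32-bit value has at most 10 decimal digits
    char* p = temp + 10; // fill from the least significant digit backwards
    while (value >= 100) {
        const unsigned i = (value % 100) * 2;
        value /= 100;
        *--p = kDigits[i + 1];
        *--p = kDigits[i];
    }
    if (value < 10) {
        *--p = static_cast<char>('0' + value);
    } else {
        const unsigned i = value * 2;
        *--p = kDigits[i + 1];
        *--p = kDigits[i];
    }
    while (p != temp + 10) *buffer++ = *p++;  // copy digits in forward order
    *buffer = '\0';
    return buffer;
}
```

Returning the end pointer lets a writer append the next token without calling `strlen()`.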
  41. 41. Double-to-String Optimization • Double-to-string conversion is very slow – E.g. 3.14 -> “3.14” • Grisu2 is a fast algorithm for this[3] – 100% cases give correct results – >99% cases give optimal results • Google V8 has an implementation – https://github.com/floitsch/double-conversion – But not header-only, so…
  42. 42. My Grisu2 Implementation • https://github.com/miloyip/dtoa-benchmark • Visual C++ 2013 on Windows 64-bit:
  43. 43. 4. LIMITATIONS
  44. 44. Tradeoff: User-Friendliness • DOM only supports move semantics – Cannot copy-construct Value/Document – So, cannot pass them by value or put them in containers • DOM APIs need an allocator as a parameter, e.g. numbers.PushBack(1, allocator); • User must manage the life cycle of the allocator and the values it allocated
  45. 45. Pausing in Parsing • Cannot pause in parsing and resume it later – Not keeping all parsing states explicitly – Doing so will be much slower • Typical Scenario – Streaming JSON from network – Don’t want to store the JSON in memory • Solution – Parse in a separate thread – Block the input stream to pause
  46. 46. 5. THOUGHTS
  47. 47. Origin • RapidJSON is my hobby project in 2011 • Also my first open source project • First version released in 2 weeks
  48. 48. Community • Google Code helped with tracking bugs but made it hard to attract contributions • After migrating to GitHub in 2014 – Community much more active – Issue tracking more powerful – Pull requests ease contributions
  49. 49. Future • Official Release under Tencent – 1.0 beta → 1.0 release (after 3+ years…) – Can work on it in working time – Involve marketing and other colleagues – Establish Community in China • Post-1.0 Features – Easy DOM API (but slower) – JSON Schema – Relaxed JSON syntax – Optimization on Object Member Access • Open source our internal projects at Tencent
  50. 50. To Establish an Open Source Project • Courage • Start Small • Make Different – Innovative Idea? – Easy to Use? – Good Performance? • Embrace Community • Learn
  51. 51. References 1. Stroustrup, Bjarne. The Design and Evolution of C++. Pearson Education India, 1994. 2. Clinger, William D. "How to read floating point numbers accurately." ACM SIGPLAN Notices 25.6 (1990): 92-101. 3. Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers." ACM SIGPLAN Notices 45.6 (2010): 233-243.
  52. 52. Q&A