NoSQL databases

NoSQL: state of the art.
Review of main NoSQL database models and available software solutions.

Published in: Software

  1. NoSQL databases: state of the art. 4/14/2017 BY MARKIYAN RIZUN, UNIVERSITÉ LILLE 1, SOFTEAM, EMAIL: MRIZUN@SOFTEAM.FR
  2. I - Overview
  3. What is NoSQL?
  4. (Typically) NoSQL is: non-relational, distributed, horizontally scalable, designed for big data, performant, open source.
  5. Relational vs. NoSQL (property: relational / NoSQL)
     • Performance for high data volume: low / high
     • Horizontal scalability: complex, error-prone / simple
     • Flexibility: low / high
     • Consistency: strong (ACID) / eventual (BASE)
     • Indexing: multiple columns / single column
     • Data duplication: not possible / allowed
     • Standard query language: yes / no
     • Data model: single / multiple
  6. II - Models
  7. Main NoSQL database models: key-value, document, column, graph.
  8. Key-value store. Data model: each key maps to a value (Key 1 → Value 1, Key 2 → Value 2, Key 3 → Value 3).
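The key-to-value mapping on the slide above can be sketched with a plain in-memory dictionary. This is only an illustration of the data model, not how any particular product works; the class name `KeyValueStore` and its methods are hypothetical.

```python
class KeyValueStore:
    """Minimal in-memory key-value store (illustrative sketch only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store or overwrite the value held under this key.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup is by exact key only; there is no query on values.
        return self._data.get(key, default)

    def delete(self, key):
        # Return True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("Key 1", "Value 1")
store.put("Key 2", "Value 2")
```

Everything goes through the key, which is what keeps the model simple and lookups fast, and also why query capabilities are limited.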
  9. Key-value store. Characteristics
     • Pros: frequent reads / writes; simple data model; rapid query execution
     • Cons: small reads / writes; simple data model; poor query capabilities
  10. Key-value store. Implementations
  11. Document store. Data model (four example documents, each stored under its own ID)
     • Document 1 (ID 1, JSON): { id: '1', name: 'foo', attributeX: 'bar' }
     • Document 2 (ID 2, JSON): { id: '2', name: 'bar' }
     • Document 3 (ID 3, XML): <element> <name>A</name> <content> <type>B</type> <color>red</color> </content> </element>
     • Document 4 (ID 4, XML): <element> <name>B</name> <value>5</value> </element>
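A rough sketch of the document model, using the two JSON documents from the slide above: documents in one collection need not share the same fields, and queries can match on any field. The `find` helper is hypothetical and not the API of any real document store.

```python
# The two JSON documents from the slide, keyed by document ID.
documents = {
    "1": {"id": "1", "name": "foo", "attributeX": "bar"},
    "2": {"id": "2", "name": "bar"},
}


def find(collection, **criteria):
    """Return every document whose fields equal all the given criteria."""
    return [
        doc
        for doc in collection.values()
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]


by_name = find(documents, name="bar")            # matches document 2
by_attribute = find(documents, attributeX="bar")  # matches document 1
```

Document 2 simply lacks `attributeX`, so it is skipped by the second query rather than causing an error.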
  12. Document store. Characteristics
     • Pros: flexible; object in a single document; rich querying capabilities
     • Cons: no joins
  13. Document store. Implementations
  14. Column store. Data model: a column family holds rows; each row has a row key and its own set of columns, and each column is a name : value pair with a timestamp.
     • Row1 (Row Key1): Column1 (name1 : value1, timestamp1), Column2 (name2 : value2, timestamp2), …, ColumnN (nameN : valueN, timestampN)
     • Row2 (Row Key2): Column1 (name1 : value1, timestamp1), Column3 (name3 : value3, timestamp3), …, ColumnM (nameM : valueM, timestampM)
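The column family layout above can be approximated as a nested mapping: row key, then column name, then a (value, timestamp) pair. The sketch below assumes that the newest timestamp wins when the same column is written twice; that rule and the `ColumnFamily` class are illustrative assumptions, not taken from the slides.

```python
import time


class ColumnFamily:
    """Rows keyed by row key; each row holds its own dynamic set of columns."""

    def __init__(self):
        self._rows = {}  # row key -> {column name: (value, timestamp)}

    def write(self, row_key, name, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        row = self._rows.setdefault(row_key, {})
        current = row.get(name)
        # Assumed conflict rule: keep the write with the newest timestamp.
        if current is None or ts >= current[1]:
            row[name] = (value, ts)

    def read(self, row_key, name):
        cell = self._rows.get(row_key, {}).get(name)
        return None if cell is None else cell[0]

    def columns(self, row_key):
        # Rows in the same family may expose different column names.
        return sorted(self._rows.get(row_key, {}))


family = ColumnFamily()
family.write("Row Key1", "name1", "value1", timestamp=1)
family.write("Row Key1", "name2", "value2", timestamp=2)
family.write("Row Key2", "name3", "value3", timestamp=3)
family.write("Row Key1", "name1", "stale", timestamp=0)  # older, ignored
```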
  15. Column store. Data model: a super column family adds one level of nesting, so each row holds super columns that group columns.
     • Row1 (Row Key1): SuperColumnX … (name1, value1, timestamp1; nameN, valueN, timestampN), SuperColumnY … (name1, value1, timestamp1; nameM, valueM, timestampM)
  16. Column store. Characteristics
     • Pros: large volumes of data (in dynamic columns); fast queries on columns (usually reads)
     • Cons: slow queries on rows (usually writes)
  17. Column store. Implementations
  18. Graph store. Data model: nodes (Node1 to Node6) connected by edges (Edge1 to Edge6), plus properties (Property1, Property2, Property3).
  19. Graph store. Characteristics
     • Pros: network modelling; graph-like queries; rapid deep traversal; fully ACID
     • Cons: no sharding; poor horizontal scalability; complex data model
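To illustrate what a deep traversal means, here is a small breadth-first search over an adjacency list. The node names follow the data model slide, but the edge layout is made up for the example, since the original diagram's topology is not recoverable from the transcript; `reachable_within` is a hypothetical helper.

```python
from collections import deque

# Hypothetical edge layout (the real slide's topology is unknown).
graph = {
    "Node1": ["Node2", "Node3"],
    "Node2": ["Node4"],
    "Node3": ["Node4"],
    "Node4": ["Node5"],
    "Node5": ["Node6"],
    "Node6": [],
}


def reachable_within(adjacency, start, max_depth):
    """Return the nodes reachable from start in at most max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    seen.discard(start)
    return seen


two_hops = reachable_within(graph, "Node1", 2)
```

In a relational schema each extra hop would typically be another join; in a graph store it is one more step along stored edges.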
  20. Graph store. Implementations
  21. Other NoSQL database models
     • Multimodel: based on several other models
     • Object-oriented: follows OOP principles
     • MultiValue: multi-valued attributes
     • Time series: optimized to manage time series data
     • And many more …
  22. Comparison of NoSQL models * (model: performance / scalability / flexibility / complexity / functionality)
     • Key-value: high / high / high / none / variable (none)
     • Document: high / variable (high) / high / low / variable (low)
     • Column: high / high / moderate / low / minimal
     • Graph: variable / variable / high / high / graph theory
     • Relational: variable / variable / low / moderate / relational algebra
     * Summary of a presentation by Ben Scofield: https://www.slideshare.net/bscofield/nosql-codemash-2010
  23. Comparison by data size / complexity [Figure: key-value, column, document and graph stores plotted against two axes, data size and data complexity.]
  24. III – Software
  25. Criteria for evaluation: popularity rank *, data model, consistency, availability, concurrency, scalability, querying.
     * According to the DB-Engines ranking https://db-engines.com/en/ranking (April 2017). Relational DBMSs were discarded.
  26. Top 4 systems
     • 1. MongoDB: document
     • 2. Cassandra: column + key-value
     • 3. Redis: in-memory key-value
     • 4. Elasticsearch: document (search engine)
  27. Consistency
     • MongoDB: configurable; strong by default
     • Cassandra: configurable
     • Redis: eventual
     • Elasticsearch: configurable; consistent, with options
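As general background for "configurable" consistency (not something stated on the slides): in replicated stores with tunable read and write acknowledgements, a common rule of thumb is that a read is guaranteed to overlap the latest acknowledged write when R + W > N, where N is the number of replicas. The function below is a hypothetical helper that only checks that inequality; it is not the API of any of the four systems.

```python
def quorum_overlaps(replicas, write_acks, read_acks):
    """Return True when every read set must intersect every write set.

    replicas   -- N, number of copies of each item
    write_acks -- W, replicas that must confirm a write
    read_acks  -- R, replicas consulted on a read
    """
    if not (1 <= write_acks <= replicas and 1 <= read_acks <= replicas):
        raise ValueError("acknowledgement counts must be between 1 and replicas")
    # Pigeonhole argument: R + W > N forces at least one shared replica.
    return read_acks + write_acks > replicas


strong = quorum_overlaps(replicas=3, write_acks=2, read_acks=2)
eventual = quorum_overlaps(replicas=3, write_acks=1, read_acks=1)
```

With three replicas, quorum reads and writes (2 and 2) overlap, while single-replica reads and writes (1 and 1) do not, which trades consistency for latency.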
  28. Availability (high availability)
     • MongoDB: replicated
     • Cassandra: distributed
     • Redis: replicated
     • Elasticsearch: replicated
  29. Concurrency
     • MongoDB: multi-granularity locking (MGL)
     • Cassandra: multiversion concurrency control (MVCC)
     • Redis: optimistic concurrency control (OCC)
     • Elasticsearch: optimistic concurrency control (OCC)
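Optimistic concurrency control, listed above for Redis and Elasticsearch, can be sketched as a version check at write time: a writer states which version it read, and the write is rejected if someone else has updated the item since. The `VersionedStore` class below is a generic, hypothetical illustration and does not mirror either product's actual mechanism.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        # Unknown keys are reported as (None, 0).
        return self._data.get(key, (None, 0))

    def write(self, key, value, expected_version):
        _, current = self._data.get(key, (None, 0))
        if current != expected_version:
            # Another writer got there first; the caller must re-read and retry.
            raise VersionConflict(
                f"{key!r}: expected version {expected_version}, found {current}"
            )
        self._data[key] = (value, current + 1)
        return current + 1
```

No lock is held between the read and the write; conflicts are detected only at write time, which works well when collisions are rare.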
  30. Scalability
     • MongoDB: high (automatic data sharding)
     • Cassandra: high (automatic addition / removal of nodes in cluster)
     • Redis: poor
     • Elasticsearch: high (dynamic sharding on live cluster)
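Sharding, mentioned above for MongoDB and Elasticsearch, spreads items across nodes by deriving a shard number from each key. A minimal hash-based sketch follows; the function name `shard_for`, the key format, and the choice of SHA-256 are assumptions for illustration, and real systems use their own partitioning schemes.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard index in range(shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # A stable hash (unlike Python's built-in hash()) gives the same
    # placement in every process and on every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


placement = {f"user:{i}": shard_for(f"user:{i}", 4) for i in range(1000)}
```

One known drawback of plain modulo placement is that changing `shard_count` moves most keys, which is why schemes such as consistent hashing are often preferred when nodes are added or removed.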
  31. Querying
     • MongoDB: internal API (MapReduce); complex query support
     • Cassandra: internal API, CQL (SQL-like); complex query support
     • Redis: by key or value range; rapid; no complex queries
     • Elasticsearch: own query language (Query DSL); full text search, filters
  32. IV – Geospatial
  33. GIS (geographic information system)
  35. Idea behind GIS "magic": geospatial data, geohash, API, GIS support.
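The geohash step named above turns a latitude / longitude pair into a short string, so that nearby points tend to share a prefix and can be indexed like ordinary keys. Below is a sketch of the standard public geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character); the function name `geohash_encode` is hypothetical and the slides do not say which implementation each database uses.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current character
    bit_count = 0
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)


cell = geohash_encode(42.6, -5.6, precision=5)  # "ezs42"
```

A shorter precision yields a prefix of the longer hash, so a coarse cell contains all the finer cells that start with its string.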
  36. Available solutions
  37. Solutions
     • MongoDB: new document format GeoJSON
     • Hadoop: GeoMesa + Apache Spark
     • Cassandra: CQL extension
     • CouchDB: GeoCouch extension
     • Redis: fast I/O in-memory geospatial operations
     • Neo4j: library Neo4j Spatial
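For the GeoJSON entry above, a point is a small JSON object with a `type` of `"Point"` and a `coordinates` array in longitude, latitude order. The sketch below builds such an object and embeds it in a document; the helper `geojson_point`, the field names `name` and `location`, and the coordinates are illustrative assumptions, not data from the slides.

```python
import json


def geojson_point(lon, lat):
    """Build a GeoJSON Point; note the order is longitude first, then latitude."""
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("longitude or latitude out of range")
    return {"type": "Point", "coordinates": [lon, lat]}


# Hypothetical document embedding a location as GeoJSON.
place = {"name": "example place", "location": geojson_point(-5.6, 42.6)}
serialized = json.dumps(place)
```

Swapping latitude and longitude is the most common mistake with this format, so the range check above catches at least the cases where a longitude beyond ±90 lands in the latitude slot.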
  38. V - Conclusion