Anatomy of NoSQL Databases
Date: 11/10/13
Amit Kumar
Agenda
+Background
+What are NoSQL Databases
+Relational vs NoSQL Databases
+HBase
+Cassandra
+Design Strategies behind NoSQL Databases
Background
+Traditional Applications
Limited Data
Top priority on consistency
Focus on average latency
Ideally fit with RDBMS
Utilized the DB intrinsic features well
Good part of logic resided in DB
+Next Gen Applications
Web Scale (~infinite)
ALWAYS available
High performance in ALL cases
Data in the form of key/value pair
Logic part of Application Layer
RDBMS with Next Gen Apps – Failure
+Scale
Limit to maximum data supported
Sharding is an option, but then RDBMS features are lost
+Economy
Requires large arrays of fast, expensive disks
Very expensive
+Availability still an issue
NoSQL Databases
+Name is confusing
Not RDBMS at all
NoREL Databases a better name
+Key Value Store
+Extremely scalable
+High performance
+Always available
+Weak Consistency (CAP Theorem)
+Distributed
Use commodity hardware - Cheap
+Might not hold ACID properties
+Only for specific use cases – not a fit for every workload
RDBMS vs NoSQL Databases
+Go for RDBMS when
Small instances of simple, straightforward systems
Joins, secondary indexing, referential integrity, group by/order by
+Go for NoSQL when
Data scale
Read/write scale
Data model is
Flexible
Semi-structured
NoSQL Current Limitations
+Maturity
+Support
+Analytics & Business Intelligence
+Administration
+Ease of Use
Some famous NoSQL Databases
+Open-source
HBase
Cassandra
Voldemort
Dynomite
Hypertable
CouchDB
VPork
MongoDB
Riak
+Closed-source
BigTable
Dynamo
PNUTS
HBase
+Based on Google BigTable
+Sparse distributed persistent multi-dimensional sorted map
+On top of Hadoop HDFS
+Master Slave Model
Single Master (SPOF)
+Especially good when
Objects are huge
Data production/consumption is distributed and is tunneled through MapReduce jobs
+Loose Data Model
Column Families
+Timestamp based versioning
+Not supported on Windows
+Major Users – Adobe, Twitter, Yahoo, Veoh, Streamy, Trend Micro
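The "sparse distributed persistent multi-dimensional sorted map" description can be pictured with nested sorted maps. The sketch below is an in-memory illustration only, not HBase code: the class `SparseSortedTable` and its methods are hypothetical, and Strings stand in for the byte[] values HBase actually stores.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the Bigtable/HBase data model:
// row key -> "family:qualifier" -> timestamp -> value.
class SparseSortedTable {
    // Rows are kept sorted by key; versions of a cell are kept newest-first.
    private final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> rows = new TreeMap<>();

    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, TreeMap<Long, String>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    // Returns the newest version of a cell, or null if the cell was never written.
    String get(String row, String column) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    // Returns the newest version written at or before the given timestamp, or null.
    String getAsOf(String row, String column, long timestamp) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        if (versions == null) return null;
        // With reverse ordering, ceilingEntry yields the largest timestamp <= the argument.
        Map.Entry<Long, String> e = versions.ceilingEntry(timestamp);
        return e == null ? null : e.getValue();
    }

    // Sparse: a cell that was never written has no entry at all.
    private TreeMap<Long, String> versionsOf(String row, String column) {
        TreeMap<String, TreeMap<Long, String>> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }
}
```

Keeping every level sorted is what makes range scans over rows and "latest version" reads cheap in this picture; distribution and persistence are left out of the sketch entirely.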
HBase Architecture & Table Structure
+Range partitioning by row key (each region is a contiguous key range, rather than a position on a consistent-hash ring)
+Table made up of regions
Region specified by startkey and endkey
A region may live on a different node.
+Tables sorted by Rows
+Schema defines column families only
Each family consists of any no. of columns
Each column consists of any no. of versions
Columns within a family are sorted & stored together
+Everything except the table name is a byte[]
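Because rows are sorted and each region is bounded by a start key and an end key, finding the region for a row reduces to a sorted lookup. The following is a minimal sketch of that idea, assuming regions are indexed by start key; `RegionLocator`, `addRegion`, and `locate` are hypothetical names, not HBase classes.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of locating the region that serves a row key.
// Each region covers [startKey, nextStartKey); regions are indexed by start key.
class RegionLocator {
    // start key -> node hosting that region; "" is the start key of the first region.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String node) {
        regionsByStartKey.put(startKey, node);
    }

    // The serving region is the one with the greatest start key <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> e = regionsByStartKey.floorEntry(rowKey);
        if (e == null) {
            throw new IllegalStateException("no region covers " + rowKey);
        }
        return e.getValue();
    }
}
```

For example, with regions starting at "" and "m", the key "apple" resolves to the first region and "zebra" to the second.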
Connecting to HBase
+Java Client API
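// HTable-based client API, as used at the time of the talk
Configuration config = HBaseConfiguration.create();
HTable table = new HTable(config, "table_name");
// Write: Put.add takes (column family, column qualifier, value)
Put p = new Put(Bytes.toBytes("key"));
p.add(Bytes.toBytes("columnfamily"), Bytes.toBytes("columnname"), Bytes.toBytes("value"));
table.put(p);
// Read the row back by key
Get g = new Get(Bytes.toBytes("key"));
Result r = table.get(g);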
+HBase Shell
$ ${HBASE_HOME}/bin/hbase shell
hbase> describe 'table_name'
hbase> put 'table_name', 'key', 'columnfamily:columnname', 'value'
hbase> get 'table_name', 'key'
hbase> scan 'table_name'
+Thrift Gateway
+REST Gateway
+Many other non-java clients
Cassandra
+Based on Amazon Dynamo
+Open sourced by Facebook in 2008
+Peer to Peer Model
No Master Node
+Works on Windows as well
+Distributed Key/Value Store
+Configurable parameters for Consistency/Availability
+Especially suited if
Number of Objects is huge
Objects are small (<1 MB)
+Major Users: Facebook, Digg, Twitter etc.
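The "configurable parameters for Consistency/Availability" bullet can be made concrete with the usual replica-overlap argument for Dynamo-style stores: with N replicas, a read that contacts R replicas is guaranteed to overlap the latest acknowledged write to W replicas when R + W > N. The helper below is a hedged sketch of that arithmetic only; `TunableConsistency` and its methods are hypothetical and are not part of any Cassandra client.

```java
// Hypothetical helper illustrating the R + W > N overlap rule.
class TunableConsistency {
    // Smallest majority of the replica set, e.g. 2 of 3 or 3 of 5.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    // True when every read set must intersect every write set.
    static boolean readsSeeLatestWrite(int n, int r, int w) {
        if (n < 1 || r < 1 || w < 1 || r > n || w > n) {
            throw new IllegalArgumentException("require 1 <= r, w <= n");
        }
        return r + w > n;
    }
}
```

With three replicas, reading and writing at quorum (2 + 2 > 3) gives overlapping reads, while reading and writing a single replica (1 + 1 <= 3) trades that guarantee for lower latency and higher availability.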
NoSQL Databases – Assumptions
+Data size is huge
System must partition its data across multiple nodes
+Reliable
Data must be safe even when disks and nodes fail
System must replicate data
+Performance
Needs to perform well on cheap hardware and maintain low latency ALWAYS
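One common way to meet the partitioning and replication assumptions together is a consistent-hash ring, as described in the Dynamo paper: each key is placed on a ring and stored on the next few distinct nodes clockwise. The sketch below assumes MD5 as the hash and virtual nodes named `node#i`; `HashRing` and its methods are hypothetical and do not reproduce any particular database's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical consistent-hash ring with virtual nodes and N-way replication.
class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(node + "#" + i), node);
    }

    void removeNode(String node) {
        // Two-argument remove only deletes positions still owned by this node.
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(node + "#" + i), node);
    }

    // Walk clockwise from the key's position, collecting the first `replicas` distinct nodes.
    List<String> nodesFor(String key, int replicas) {
        Set<String> owners = new LinkedHashSet<>();
        long h = hash(key);
        for (String node : ring.tailMap(h, true).values()) {
            if (owners.size() == replicas) break;
            owners.add(node);
        }
        for (String node : ring.headMap(h, false).values()) { // wrap around the ring
            if (owners.size() == replicas) break;
            owners.add(node);
        }
        return new ArrayList<>(owners);
    }

    // First 8 bytes of the MD5 digest, interpreted as a signed 64-bit ring position.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xff);
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The property that matters for cheap, failure-prone hardware is that removing or adding a node only moves the keys adjacent to that node's ring positions; keys owned by other nodes keep their replica set.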
14
NoSQL Databases – Design Strategies
+Complex Distributed System
+Partitioning
Consistent Hashing
+Consistency
Eventual Consistency
Vector Clocks
+Data Models
Primary Key -> Value
Value can be semi-structured
Multi-version Storage
+Storage Layouts
Column storage with Locality groups
Log structured Merge Trees
+Cluster Management
Peer to Peer vs Master/Slave approach
Gossip
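Of the strategies listed, vector clocks are the easiest to show in a few lines. Under eventual consistency, two replicas may accept writes to the same key independently; a vector clock per version lets the system tell whether one version supersedes another or whether the two are concurrent and need reconciling. This is a minimal sketch along the lines of the Dynamo paper; the `VectorClock` class and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical immutable-style vector clock: node id -> number of updates seen from that node.
class VectorClock {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    private final Map<String, Long> counters = new HashMap<>();

    // Returns a copy of this clock with the given node's counter incremented.
    VectorClock tick(String node) {
        VectorClock next = new VectorClock();
        next.counters.putAll(counters);
        next.counters.merge(node, 1L, Long::sum);
        return next;
    }

    // Compares this clock with another; missing counters count as zero.
    Order compare(VectorClock other) {
        boolean less = false;
        boolean greater = false;
        Set<String> nodes = new HashSet<>(counters.keySet());
        nodes.addAll(other.counters.keySet());
        for (String n : nodes) {
            long a = counters.getOrDefault(n, 0L);
            long b = other.counters.getOrDefault(n, 0L);
            if (a < b) less = true;
            if (a > b) greater = true;
        }
        if (less && greater) return Order.CONCURRENT; // neither version descends from the other
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.EQUAL;
    }

    // Element-wise maximum, used once two concurrent versions have been reconciled.
    VectorClock merge(VectorClock other) {
        VectorClock merged = new VectorClock();
        merged.counters.putAll(counters);
        other.counters.forEach((n, c) -> merged.counters.merge(n, c, Long::max));
        return merged;
    }
}
```

If two replicas A and B each update the same base version, the resulting clocks compare as CONCURRENT, and the merged clock compares as AFTER both of them.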
References
+Bigtable: A Distributed Storage System for Structured Data
http://labs.google.com/papers/bigtable-osdi06.pdf
+Dynamo: Amazon's Highly Available Key-value Store
http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
+NOSQL debrief, June 2009
http://static.last.fm/johan/nosql-20090611/intro_nosql.pdf
http://static.last.fm/johan/nosql-20090611/hbase_nosql.pdf
http://static.last.fm/johan/nosql-20090611/cassandra_nosql.ppt
+NoSQL Databases Official Site
http://nosql-database.org
+HBase – Hadoop Wiki
http://wiki.apache.org/hadoop/Hbase
+Apache Cassandra Wikipedia
http://en.wikipedia.org/wiki/Apache_Cassandra
Questions + Answers
Thank You

Editor's Notes

  1. DB features like joins, db links, constraints, streams,