Data center infrastructure is growing rapidly, keeping pace with the user activity and data generation of the digital era. At the same time, the nature of the typical application deployment within the data center is changing to accommodate new business needs. Those changes introduce complexity into application deployment architecture and design, which in turn creates requirements for a new generation of database technology (NoSQL) intended to ease that complexity. This webcast discusses the modern data-centric data center application, the complexities that must be dealt with, and the common architectures used to describe and prescribe new data-center-aware services. We'll look at the practical issues of implementation and survey the current state of the art in NoSQL database technology for solving the problems of data center awareness in application development.
NoSQL – Data Center Centric Application Enablement
1. Data Centers and NoSQL Database
Robert Greene – Product Management
2. Data Center Trends and Business Drivers
Key Tech Requirements
The Data Center Aware Application
3 NoSQL Data Center Architectures
Data Locality and Reliability Isolation
Data Consistency Considerations
NoSQL Data Center Implementation
Summary - NoSQL & Data Centers
NoSQL - Enabling Data Center Applications
3. Data Centers - Trends and Drivers
Trends – More deployments
• SMBs going global
• Consumer facing
• Terabytes of Data
• Proximity and Revenue
• Requirements
• Security
• Regulatory
• Availability
IDC survey (companies with revenue > $1B or more than 5,000 employees)
• 26% have 6 or more data centers
4. Cloud deployments
Data Centers - Trends and Drivers
5. Business Drivers
• Cost Optimization
• SaaS and Desktop Virtualization
• Globalized Sensor Data and M2M
• Big Data – Physics of IT
• Revenue
• Latency and Availability
Data Centers - Trends and Drivers
6. Latency reduction
• Amazon study
• 100 ms of added latency = 1% of revenue
• $150M
• SMBs seeing more data
• Physics
• More data = more time
• Longer distance = more time
Key Technical Requirements
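The "physics" bullets above can be made concrete with a rough lower bound. This is a minimal sketch, not a measurement: it assumes signals travel through optical fiber at about 200,000 km/s (roughly two thirds of the speed of light in a vacuum) and ignores routing, queuing, and processing delays. The function names and the example distances are illustrative only.

```python
# Assumption: propagation speed in optical fiber is roughly 200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over a fiber path of this length."""
    one_way_seconds = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_seconds * 1000.0


def transfer_time_ms(payload_bytes: int, bandwidth_bits_per_s: float) -> float:
    """Time in milliseconds to push a payload onto a link of the given bandwidth."""
    return payload_bytes * 8 / bandwidth_bits_per_s * 1000.0


if __name__ == "__main__":
    # Longer distance = more time (hypothetical path lengths).
    for label, km in [("metro", 50), ("cross-continent", 4_000), ("intercontinental", 10_000)]:
        print(f"{label}: at least {min_round_trip_ms(km):.1f} ms round trip")
    # More data = more time (1 MB and 100 MB over a 1 Gbit/s link).
    for size in (1_000_000, 100_000_000):
        print(f"{size} bytes: about {transfer_time_ms(size, 1e9):.0f} ms to send")
```

Under these assumptions a 10,000 km path alone consumes on the order of the 100 ms budget cited in the Amazon bullet, which is why proximity to the user matters.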
7. Availability
• 2003 New York Blackout
• Dec 2012 AWS Outage
• Regulatory Mandates
Key Technical Requirements
8. Process & Data Redundancy
• Specialized Needs
• Provisioning
• Placement
• Communications
• Encryption
• Security
• Data Availability
• Upgrade processing
• Failover processes
The Data Center Aware Application
9. 3 Prevalent Architectures
• Container
• Component
• Tag
NoSQL Data Center Architectures
[Diagram: Data, Replica, Replica spread across Data Center 1, Data Center 2, and Data Center 3]
10. NoSQL Data Center Deployment Architectures
Container Based
[Diagram: two nodes, each holding Data, Replica, Replica. Labels: Node, Hardware, Unit of Replication, Process]
Strengths
• Allows Local Writes
• Simple Replication Admin
• Good Global Availability
Weakness
• Many Data Copies
• High Bandwidth Replication
• No Consistent Reads
[Diagram labels: Durability & Consistency (at each site), Replication, More Sites]
11. NoSQL Data Center Deployment Architectures
Component Based
[Diagram: Data, Replica, Replica. Labels: Hardware, Unit of Replication, Process, Durability & Consistency, Sync Channel]
Strengths
• Minimal Data Copies
• Low Bandwidth Replication
• Good Regional Availability
• Simple Admin
Weakness
• Network Latency Sensitive
• Data placement
12. NoSQL Data Center Deployment Architectures
Tag Based
[Diagram: nodes holding Data, Replica, Replica, with copies labeled TAG_A. Labels: Node, Hardware, Unit of Replication, Process, Durability & Consistency, Sync Channel]
Strengths
• Complete Control
• Targeted Data
Weakness
• Coding Specific
• Brittle to system change
• Complex management
• Non-optimal data copies
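The "Many Data Copies" versus "Minimal Data Copies" contrast between the container and component approaches comes down to simple arithmetic. The sketch below assumes a container deployment keeps a full replica set in every data center (RF times the number of data centers), while a component deployment spreads a single replica set across sites (RF total). The function names and example figures are hypothetical, and tag-based deployments are left out because their copy count depends on which subset of data each tag covers.

```python
def container_copies(replication_factor: int, data_centers: int) -> int:
    """Assumed model: every data center holds its own full replica set."""
    return replication_factor * data_centers


def component_copies(replication_factor: int) -> int:
    """Assumed model: one replica set whose members are spread across sites."""
    return replication_factor


def stored_terabytes(dataset_tb: float, copies: int) -> float:
    """Total raw storage needed to hold the given number of copies of the data set."""
    return dataset_tb * copies


if __name__ == "__main__":
    rf, sites, dataset_tb = 3, 3, 10.0  # illustrative values only
    print("container:", stored_terabytes(dataset_tb, container_copies(rf, sites)), "TB")
    print("component:", stored_terabytes(dataset_tb, component_copies(rf)), "TB")
```

The same multiplier applies to replication bandwidth, since every extra copy must also receive every write.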
13. Hybrid Architectures
• Component (local reliability)
• Container (multi channel)
Cloud Architectures
• Model Physical Infrastructure
• Geographic Regions
• Zoned local isolation points
• Power & Network only
NoSQL Data Center Architectures
14. Physical - Points of Failure Isolation
• Disk, Server, Rack, Power, Network Switches
Latency Placement Considerations
Data Locality and Reliability Isolation
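[Diagram: a Region containing a Data Center divided into Zone 1, Zone 2, and Zone 3, with data and replica copies spread across the zones. Layers shown: Power, Net Switches, Racks, Servers, Disks. Latency labels: High/Low Latency, Low Latency]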
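Failure isolation at the zone level can be illustrated with a small placement routine. This is a sketch under stated assumptions: data is split into units called shards here (a hypothetical term, not from the slides), each shard gets one copy per zone up to the replication factor, and zones are assigned round-robin. Real products use their own placement logic.

```python
from typing import Dict, List


def place_replicas(shard_ids: List[str], zones: List[str],
                   replication_factor: int) -> Dict[str, List[str]]:
    """Assign each shard's copies to distinct zones, rotating the starting zone."""
    if replication_factor > len(zones):
        raise ValueError("need at least one zone per copy to keep copies isolated")
    placement: Dict[str, List[str]] = {}
    for index, shard in enumerate(shard_ids):
        placement[shard] = [zones[(index + offset) % len(zones)]
                            for offset in range(replication_factor)]
    return placement


def survives_zone_loss(placement: Dict[str, List[str]], failed_zone: str) -> bool:
    """True if every shard still has at least one copy outside the failed zone."""
    return all(any(zone != failed_zone for zone in shard_zones)
               for shard_zones in placement.values())


if __name__ == "__main__":
    layout = place_replicas(["s0", "s1", "s2"], ["zone1", "zone2", "zone3"], 2)
    print(layout)
    print("survives loss of zone2:", survives_zone_loss(layout, "zone2"))
```

Because no shard keeps two copies in the same zone, the loss of one zone's power or network leaves every shard readable from another zone.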
15. Multiple locations need to be kept in sync
Latency – Read Consistency across Data Centers
• Consistency based on eventually consistent processes
• How to ensure you have the latest data if needed
• How to read locally to keep latency low
Throughput – Write Durability across Data Centers
• Durability based on write copies, eventually everywhere
• How to get copies written without high latency
• How to resolve conflicting updates
Techniques: quorum voting, vector clocks, timestamps
Data Consistency Considerations
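Of the techniques listed, vector clocks are the one that detects conflicting updates made in different data centers. The following is a minimal sketch of the general technique, not any vendor's implementation: each clock is a dictionary mapping a node or data center name to a counter, and the names `dc1` and `dc2` are illustrative.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node_id: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node_id] = updated.get(node_id, 0) + 1
    return updated


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify a relative to b: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither update saw the other: a conflict
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock for a value that reconciles both versions (per-node maximum)."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


if __name__ == "__main__":
    base = increment({}, "dc1")
    left = increment(base, "dc1")    # update applied in data center 1
    right = increment(base, "dc2")   # update applied in data center 2
    print(compare(left, right))      # the two updates conflict
    print(merge(left, right))
```

A "concurrent" result is the signal that the application or the database must resolve the conflict, for example by merging values or falling back to timestamps.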
16. Container Architecture
• Reads and writes always local to the container
• No option to ensure consistent data
• Sync involves key differentiation scheme (e.g. Merkle tree)
Component Architecture
• Reads and writes optionally to local component
• Can request consistent operation (possible latency cost)
• Sync involves log ordered sequencing (no differentiation)
Tagged Architecture
• Reads and writes explicit at the data level
• Can request consistent operation (possible latency cost)
• Sync is application code dependent
Data Consistency Considerations
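The Merkle tree mentioned for container sync lets two sites find the keys that differ without shipping the whole data set. The sketch below is a simplified illustration of the idea, assuming keys are hashed into a power-of-two number of buckets and SHA-256 is used for hashing; the function names are hypothetical and real systems differ in detail.

```python
import hashlib
from typing import Dict, List


def _digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hashes(store: Dict[str, str], bucket_count: int) -> List[bytes]:
    """Hash the key/value pairs of each bucket; keys map to buckets by key hash."""
    buckets: List[List[str]] = [[] for _ in range(bucket_count)]
    for key in sorted(store):
        index = int.from_bytes(_digest(key.encode())[:4], "big") % bucket_count
        buckets[index].append(f"{key}={store[key]}")
    return [_digest("\n".join(bucket).encode()) for bucket in buckets]


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return tree levels: levels[0] is the leaves, levels[-1] is [root]."""
    if len(leaves) & (len(leaves) - 1) != 0 or not leaves:
        raise ValueError("bucket count must be a power of two")
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_digest(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_buckets(tree_a: List[List[bytes]], tree_b: List[List[bytes]]) -> List[int]:
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # roots match, so the two stores are in sync
    candidates = [0]
    for level in range(top - 1, -1, -1):
        candidates = [child
                      for parent in candidates
                      for child in (2 * parent, 2 * parent + 1)
                      if tree_a[level][child] != tree_b[level][child]]
    return candidates


if __name__ == "__main__":
    site_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
    site_b = dict(site_a, k2="changed")
    diff = differing_buckets(build_tree(leaf_hashes(site_a, 4)),
                             build_tree(leaf_hashes(site_b, 4)))
    print("buckets to resync:", diff)
```

Only the buckets returned need their keys exchanged, which keeps sync traffic proportional to the amount of divergence rather than to the data size.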
17. Review of a few well known vendors
• Cassandra
• HBase
• MongoDB
• OnDB
• Riak
NoSQL Data Center Implementation
18. Cassandra Data Center Implementation
Data Center Architecture: Container
Replication Unit: Node
Reliability: Cluster
Data Copies: RF * Data Center
Durability: One, Local_Quorum
Consistency: Timestamp
Placement: Multiple copies per Data Center
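Timestamp-based consistency, as listed above, is commonly realized as last-write-wins: when two copies of a value disagree, the one with the newer write timestamp is kept. The following is a minimal sketch of that general idea, not Cassandra's code; the class name, the microsecond field, and the tie-break on value are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp_micros: int  # write time supplied by the writer (assumed unit)


def resolve(a: Versioned, b: Versioned) -> Versioned:
    """Last-write-wins: keep the version with the newer timestamp."""
    if a.timestamp_micros != b.timestamp_micros:
        return a if a.timestamp_micros > b.timestamp_micros else b
    # Assumed tie-break so every replica picks the same winner deterministically.
    return a if a.value >= b.value else b


if __name__ == "__main__":
    from_dc1 = Versioned("shipped", 1_700_000_000_000_100)
    from_dc2 = Versioned("cancelled", 1_700_000_000_000_250)
    print(resolve(from_dc1, from_dc2).value)
```

The approach is simple and needs no coordination, but it silently discards the older of two concurrent updates and depends on reasonably synchronized clocks.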
19. HBase Data Center Implementation
Data Center Architecture: Tag
Replication Unit: Column Family
Reliability: Cluster
Data Copies: RF x Clusters (family subset)
Durability: ACID
Consistency: Absolute
Placement: None
Extensions:
• Multi-Channel Replication (per region server)
20. MongoDB Data Center Implementation
Data Center Architecture: Tag
Replication Unit: Range of Collection
Reliability: Replica Set
Data Copies: RF x Tagged Replica Set (range subset)
Durability: WriteConcern
Consistency: ReadConcern
Placement: Hard Coded
Extensions:
• Read-Only Shards
21. Riak Data Center Implementation
Data Center Architecture: Container
Replication Unit: Cluster
Reliability: Local Cluster
Data Copies: RF x Clusters
Durability: One, Quorum, All
Consistency: Vector Clock
Placement: Multiple copies per cluster
Extensions:
• Multi-Channel Sink
• Read-Only Cluster
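Durability levels such as One, Quorum, and All trade latency against the guarantee that a read sees the latest write. A common rule of thumb is that reads and writes overlap on at least one replica when R + W > N. The sketch below illustrates that rule generically; the level names are lowercase strings chosen for this example and are not any product's API.

```python
def replicas_required(level: str, replica_count: int) -> int:
    """Number of replicas that must acknowledge an operation at the given level."""
    if level == "one":
        return 1
    if level == "quorum":
        return replica_count // 2 + 1  # strict majority
    if level == "all":
        return replica_count
    raise ValueError(f"unknown level: {level}")


def read_sees_latest_write(replica_count: int, write_level: str, read_level: str) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    w = replicas_required(write_level, replica_count)
    r = replicas_required(read_level, replica_count)
    return r + w > replica_count


if __name__ == "__main__":
    n = 3
    for write_level, read_level in [("one", "one"), ("quorum", "quorum"), ("all", "one")]:
        ok = read_sees_latest_write(n, write_level, read_level)
        print(f"N={n} write={write_level} read={read_level}: overlap guaranteed = {ok}")
```

Writing and reading at One keeps latency lowest but allows stale reads; Quorum on both sides guarantees overlap while still tolerating the loss of a minority of replicas.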
22. OnDB – Oracle NoSQL DB Data Center
Data Center Architecture: Component
Replication Unit: Zone
Reliability: Zone(s)
Data Copies: RF
Durability: ACID, One, Quorum, All
Consistency: Absolute, Quorum
Placement: Copy per Zone
Extensions:
• Local Quorum
• Read-Only Zones
[Diagram: Data, Replica, Replica, one per Zone, in Near Data Centers forming an Availability Group. Labels: Client Read, Client Write]
23. Data Center Deployments
• Increasingly Common: Cost, Revenue & Reliability Advantages
The Data Center Aware Application ( latency & consistency )
Summary – NoSQL & Data Centers
Component X
Container X X
Tag X X
Near Data Centers X X X
Multi Channel
X
Disaster Recovery
Placement
Far Data Centers Placement X X Placement
Consistent Data X X
In-Consistent Data X X X
Data Size - Copies Low High High Low Medium
24. Oracle NoSQL DB Resources
• NoSQL Database Downloads
http://www.oracle.com/technetwork/products/nosqldb/downloads/index.html
• NoSQL Database Documentation
http://www.oracle.com/technetwork/products/nosqldb/documentation/index.html
• NoSQL Database Contacts
• David Segleau – Director Product Management dave.segleau@oracle.com
• Robert Greene – Senior Principal Product Manager robert.c.greene@oracle.com