Scalability

Nicola Baldi

http://it.linkedin.com/in/nicolabaldi

Luigi Berrettini

http://it.linkedin.com/in/luigiberrettini
The need for speed

15/12/2012

Scalability

2
Companies keep growing
More and more data and traffic
More and more computing resources needed
SOLUTION

SCALING
vertical scalability = scale up
 single server
 performance ⇒ more resources (CPUs, storage, memory)
 volumes increase ⇒ more difficult and expensive to scale
 not reliable: individual machine failures are common
horizontal scalability = scale out
 cluster of servers
 performance ⇒ more servers
 cheaper hardware (more likely to fail)
 volumes increase ⇒ complexity ~ constant, costs ~ linear
 reliability: CAN operate despite failures
 complex: use only if benefits are compelling
Vertical scalability

All data on a single node
Use cases
 data usage = mostly processing aggregates
 many graph databases

Pros/Cons
 RDBMSs or NoSQL databases
 simplest and most often recommended option
 only vertical scalability

Horizontal scalability
Architectures and
distribution models

Shared everything
 every node has access to all data
 all nodes share memory and disk storage
 used on some RDBMSs

Shared disk
 every node has access to all data
 all nodes share disk storage
 used on some RDBMSs

Shared nothing
 nodes are independent and self-sufficient
 no shared memory or disk storage
 used on some RDBMSs and all NoSQL databases

Sharding
different data put on different nodes
Replication
same data copied over multiple nodes
Sharding + replication
the two orthogonal techniques combined

Different parts of the data onto different nodes
 data accessed together (aggregates) are on the same node
 clumps arranged by physical location, to keep load even,
or according to any domain-specific access rule
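
As an illustration of the "keep load even" rule, here is a minimal sketch of hash-based shard routing. The shard names and the helper functions `shard_for` and `place` are hypothetical, and a hash is only one possible placement rule; location-based or domain-specific rules would replace `shard_for`.

```python
import hashlib

# Hypothetical shard names; a real cluster would discover its nodes dynamically.
SHARDS = ["shard-0", "shard-1", "shard-2"]


def shard_for(aggregate_key, shards=SHARDS):
    """Map an aggregate key to one shard with a stable hash.

    Everything stored under the same aggregate key lands on the same node,
    so data accessed together stays together.
    """
    digest = hashlib.md5(aggregate_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def place(keys, shards=SHARDS):
    """Return a mapping shard name -> list of keys routed to that shard."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement


# The nine data items used in the slides' diagrams.
placement = place(["A", "B", "C", "D", "E", "F", "G", "H", "I"])
```

With a plain modulo scheme like this one, changing the number of shards moves most keys, which is one reason real stores often use more elaborate placement.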

Figure: three shards, each serving reads (R) and writes (W) for its own part of the data: {A, F, H}, {B, E, G}, {C, D, I}

Use cases
 different people access different parts of the dataset
 to horizontally scale writes
Pros/Cons
 “manual” sharding with every RDBMS or NoSQL store
 better read performance
 better write performance
 low resilience: all but failing node data available
 high licensing costs for RDBMSs
 difficult or impossible cluster-level operations
(querying, transactions, consistency controls)
Data replicated across multiple nodes
 One designated master (primary) node
• contains the original
• processes writes and passes them on
 All other nodes are slave (secondary)
• contain the copies
• synchronized with the master during a replication process
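
The roles above can be sketched in a few lines. This is a toy in-memory model, not any product's replication protocol: the class names, the explicit `replicate()` step and the round-robin read routing are all assumptions made for illustration.

```python
class Node:
    """A single server holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    """Hypothetical cluster: one master takes writes, slaves serve reads."""

    def __init__(self, slave_count=2):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(slave_count)]
        self.pending = []   # writes accepted by the master, not yet replicated
        self._next = 0      # round-robin cursor for reads

    def write(self, key, value):
        # Only the master processes writes.
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # The replication process: pass pending writes on to every slave.
        for key, value in self.pending:
            for slave in self.slaves:
                slave.data[key] = value
        self.pending.clear()

    def read(self, key):
        # Reads are spread over the slaves.
        slave = self.slaves[self._next % len(self.slaves)]
        self._next += 1
        return slave.data.get(key)


cluster = MasterSlaveCluster()
cluster.write("A", 1)
stale = cluster.read("A")   # replication has not run yet
cluster.replicate()
fresh = cluster.read("A")   # the slaves now hold the update
```

The read issued before `replicate()` shows the read inconsistency that comes with this model: a slave can answer before the update has reached it.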

Figure: master-slave replication. One master holding A, B, C accepts writes (W) and reads (R); two slaves holding copies of A, B, C serve reads (R) only.

Use cases
 load balancing cluster: data usage mostly read-intensive
 failover cluster: single server with hot backup
Pros/Cons
 better read performance
 worse write performance (write management)
 high read (slave) resilience:
master failure ⇒ slaves can still handle read requests
 low write (master) resilience:
master failure ⇒ no writes until old/new master is up
 read inconsistencies: update not propagated to all slaves
 master = bottleneck and single point of failure
 high licensing costs for RDBMSs
Data replicated across multiple nodes
 All nodes are peer (equal weight): no master, no slaves

 All nodes can both read and write

Figure: peer-to-peer replication. Three peers, each holding A, B, C and each serving both reads (R) and writes (W).

Use cases
 load balancing cluster: data usage read/write-intensive
 need to scale out more easily
Pros/Cons
 better read performance
 better write performance
 high resilience:
node failure ⇒ reads/writes handled by other nodes
 read inconsistencies: update not propagated to all nodes
 write inconsistencies: same record at the same time
 high licensing costs for RDBMSs
Sharding + master-slave replication
 multiple masters
 each data item has a single master
 node configurations:
• master
• slave
• master for some data / slave for other data

Sharding + peer-to-peer replication

Figure: sharding + master-slave replication. Data {A, F, H}: Master 1 (W, R) and Slave 1 (R). Data {B, E, G}: Master/Slave 2 (R, W) and Slave/Master 2 (R, W). Data {C, D, I}: Master 3 (R, W) and Slave 3 (R).

Figure: sharding + peer-to-peer replication. Six peers, each serving reads (R) and writes (W): Peer 1/2 {A, F, H}, Peer 1/4 {A, F, E}, Peer 3/4 {B, E, G}, Peer 2/3 {B, H, G}, Peer 5/6 {C, D, I}, Peer 5/6 {C, D, I}.

Oracle Database
Oracle RAC: shared everything

Microsoft SQL Server
All editions: shared nothing; master-slave replication

IBM DB2
DB2 pureScale: shared disk
DB2 HADR: shared nothing; master-slave replication (failover cluster)

Oracle MySQL
MySQL Cluster: shared nothing; sharding, replication, sharding + replication

The PostgreSQL Global Development Group PostgreSQL
PGCluster-II: shared disk
Postgres-XC: shared nothing; sharding, replication, sharding + replication

Horizontal scalability
Consistency

Inconsistent write = write-write conflict
multiple writes of the same data at the same time
(highly likely with peer-to-peer replication)

Inconsistent read = read-write conflict
read in the middle of someone else’s write
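
A write-write conflict can be reproduced with two plain dictionaries standing in for two peers. This is only a sketch: the key name `stock`, the values and the naive "apply the other peer's write last" replication are assumptions chosen to make the divergence visible.

```python
# Two peers start with the same copy of the data.
peer_1 = {"stock": 10}
peer_2 = {"stock": 10}

# Two clients write the same record at the same time, on different peers.
write_1 = ("stock", 9)   # accepted by peer_1
write_2 = ("stock", 7)   # accepted by peer_2
peer_1[write_1[0]] = write_1[1]
peer_2[write_2[0]] = write_2[1]

# Naive replication: each peer applies the other peer's write after its own.
peer_1[write_2[0]] = write_2[1]
peer_2[write_1[0]] = write_1[1]

# The peers applied the updates in different orders and now disagree.
diverged = peer_1["stock"] != peer_2["stock"]
```

Both peers saw both writes, but in a different order, so the replicas end up with different values until something detects and resolves the conflict.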

 Pessimistic approach
prevent conflicts from occurring
 Optimistic approach
detect conflicts and fix them

Implementation (pessimistic approach)
 write locks ⇒ acquire a lock before updating a value
(only one lock at a time can be taken)
Pros/Cons
 often severely degrade system responsiveness
 often leads to deadlocks (hard to prevent/debug)
 rely on a consistent serialization of the updates*
* sequential consistency
ensuring that all nodes apply operations in the same order
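
A write lock can be sketched with Python's standard `threading.Lock`. This is a single-process stand-in: in a cluster the lock would have to be a distributed one, and the `LockedCounter` class and worker counts are hypothetical.

```python
import threading


class LockedCounter:
    """A value that can only be updated while holding its write lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Acquire the lock before updating; only one thread can hold it,
        # so the read-modify-write below is never interleaved.
        with self._lock:
            current = self.value
            self.value = current + 1


counter = LockedCounter()


def worker(times):
    for _ in range(times):
        counter.increment()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Every writer waits its turn, which prevents the conflict but also shows where the loss of responsiveness comes from.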

Implementation (optimistic approach)
 conditional updates ⇒ test a value before updating it
(to see if it's changed since the last read)
 merged updates ⇒ merge conflicted updates somehow
(save updates, record conflict and merge somehow)
Pros/Cons
 conditional updates
rely on a consistent serialization of the updates*
* sequential consistency
ensuring that all nodes apply operations in the same order
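
A conditional update can be sketched as a compare-and-set on a version number. The `VersionedStore` class, its method names and the booking scenario are hypothetical; real stores expose this idea through their own APIs.

```python
class VersionedStore:
    """Toy store that keeps a version number next to every value."""

    def __init__(self):
        self._items = {}   # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._items.get(key, (0, None))

    def conditional_update(self, key, expected_version, new_value):
        """Write only if the value has not changed since it was read."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            return False   # conflict detected: someone else updated first
        self._items[key] = (current_version + 1, new_value)
        return True


store = VersionedStore()
store.conditional_update("room", 0, "free")

# Two users read the same version, then both try to update.
version_seen_by_alice, _ = store.read("room")
version_seen_by_bob, _ = store.read("room")
alice_ok = store.conditional_update("room", version_seen_by_alice, "booked by Alice")
bob_ok = store.conditional_update("room", version_seen_by_bob, "booked by Bob")
```

The second update is rejected instead of silently overwriting the first, so the losing client can re-read and decide what to do.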
 Logical consistency
different data make sense together
 Replication consistency
same data ⇒ same value on different replicas
 Read-your-writes consistency
users continue seeing their updates

ACID transactions ⇒ aggregate-ignorant DBs
Partially atomic updates ⇒ aggregate-oriented DBs
 atomic updates within an aggregate
 no atomic updates between aggregates
 updates of multiple aggregates: inconsistency window
 replication can lengthen inconsistency windows

Eventual consistency
 nodes may have replication inconsistencies:

stale (out of date) data

 eventually all nodes will be synchronized

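Eventual consistency can be illustrated with replicas that queue incoming updates before applying them. This is a sketch under simplifying assumptions: the `Replica` class, the `inbox` queue and the `write` helper are hypothetical, and no updates are lost or reordered.

```python
from collections import deque


class Replica:
    """A node that applies replicated updates some time after receiving them."""

    def __init__(self):
        self.data = {}
        self.inbox = deque()   # updates received but not yet applied

    def apply_pending(self):
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value


replicas = [Replica() for _ in range(3)]


def write(key, value):
    # The first replica applies the write at once; the others only queue it.
    replicas[0].data[key] = value
    for replica in replicas[1:]:
        replica.inbox.append((key, value))


write("price", 42)
stale_reads = [replica.data.get("price") for replica in replicas]

# Once every replica has processed its queue, all nodes agree again.
for replica in replicas:
    replica.apply_pending()
converged_reads = [replica.data.get("price") for replica in replicas]
```

Between the write and the moment the queues are drained, two of the three nodes return stale data; afterwards all reads agree.
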
Session consistency
 within a user’s session there is read-your-writes consistency

(no stale data read from a node after an update on another one)
 consistency lost if
• session ends
• the system is accessed simultaneously from different PCs

 implementations
• sticky session/session affinity = sessions tied to one node
 affects load balancing
 quite intricate with master-slave replication

• version stamps
 track latest version stamp seen by a session
 ensure that all interactions with the data store include it

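The version stamp approach to session consistency can be sketched as follows. All names (`Replica`, `Session`, `session_write`, `session_read`) are hypothetical, and a single integer counter is assumed as the version stamp.

```python
class Replica:
    """A node whose state is summarized by one version stamp."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, version, key, value):
        self.data[key] = value
        self.version = version


class Session:
    """Tracks the latest version stamp this session has seen."""

    def __init__(self):
        self.last_seen = 0


def session_write(session, replica, key, value):
    replica.apply(replica.version + 1, key, value)
    session.last_seen = max(session.last_seen, replica.version)


def session_read(session, replicas, key):
    # Only answer from a replica at least as recent as what the session saw.
    for replica in replicas:
        if replica.version >= session.last_seen:
            session.last_seen = replica.version
            return replica.data.get(key)
    raise RuntimeError("no replica is fresh enough for this session")


up_to_date, lagging = Replica(), Replica()
session = Session()
session_write(session, up_to_date, "color", "blue")

# The lagging replica is skipped, so the session reads its own write.
value = session_read(session, [lagging, up_to_date], "color")
```

Unlike sticky sessions, this does not tie the session to one node; any replica that has caught up may answer.
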
Horizontal scalability
CAP theorem

Consistency
all nodes see the same data at the same time

Latency
the response time in interactions between nodes
Availability
 every nonfailing node must reply to requests
 the limit of latency that we are prepared to tolerate:
once latency gets too high, we give up and treat data as
unavailable
Partition tolerance
the cluster can survive communication breakages
(separating it into partitions unable to communicate with each other)
Transaction to transfer $50 from account A to account B

1) read(A)
2) A = A - 50
3) write(A)
4) read(B)
5) B = B + 50
6) write(B)

 Atomicity

• transaction fails after 3 and before 6 ⇒ the system should
ensure that its updates are not reflected in the database

 Consistency
• A + B is unchanged by the execution of the transaction
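
The six steps and the atomicity requirement can be sketched with an in-memory dictionary. This is an illustration only: the `transfer` function, the `fail_after_debit` switch and the snapshot-based rollback are assumptions, not how a database engine implements transactions.

```python
def transfer(accounts, source, target, amount, fail_after_debit=False):
    """Move `amount` from source to target, undoing everything on failure."""
    snapshot = dict(accounts)            # copy used to roll back
    try:
        a = accounts[source]             # 1) read(A)
        a = a - amount                   # 2) A = A - 50
        accounts[source] = a             # 3) write(A)
        if fail_after_debit:
            raise RuntimeError("simulated failure between steps 3 and 6")
        b = accounts[target]             # 4) read(B)
        b = b + amount                   # 5) B = B + 50
        accounts[target] = b             # 6) write(B)
    except Exception:
        # Atomicity: none of the partial updates may remain visible.
        accounts.clear()
        accounts.update(snapshot)
        raise


accounts = {"A": 100, "B": 20}
total_before = sum(accounts.values())

transfer(accounts, "A", "B", 50)         # completes normally

try:
    transfer(accounts, "A", "B", 50, fail_after_debit=True)
except RuntimeError:
    pass                                 # the failed transfer left no trace

total_after = sum(accounts.values())
```

The successful transfer changes both balances, the failed one changes neither, and A + B is the same before and after.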

Transaction to transfer $50 from account A to account B

1) read(A)
2) A = A - 50
3) write(A)
4) read(B)
5) B = B + 50
6) write(B)

 Isolation

• if another transaction reads A and B between steps 3 and 6, it sees inconsistent data
(A + B will be less than it should be)
• Isolation can be ensured trivially by running transactions
serially ⇒ performance issue

 Durability
• user notified that transaction completed ($50 transferred)
⇒ transaction updates must persist despite failures
Basically Available
Soft state
Eventually consistent
Soft state and eventual consistency are techniques that work
well in the presence of partitions and thus promote availability

Given the three properties of
Consistency, Availability and
Partition tolerance,
you can only get two

C
being up and keeping consistency is reasonable
A
one node: if it’s up it’s available
P
a single machine can’t partition

AP ( C )
partition ⇒ update on one node = inconsistency

CP ( A )
partition ⇒ consistency only if one nonfailing
node stops replying to requests
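
One way a nonfailing node can decide to stop replying is to refuse requests whenever it cannot reach a majority of the cluster. The majority rule, the `CPNode` class and the `reachable` counter are assumptions made for this sketch; the slides do not prescribe a particular mechanism.

```python
class CPNode:
    """Hypothetical node that gives up availability to stay consistent."""

    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        self.reachable = cluster_size   # nodes it can talk to, itself included
        self.data = {}

    def has_majority(self):
        return self.reachable > self.cluster_size // 2

    def write(self, key, value):
        if not self.has_majority():
            # The node is up, but it refuses to answer inside a minority partition.
            raise ConnectionError(f"{self.name}: minority partition, request refused")
        self.data[key] = value


nodes = [CPNode(f"n{i}", cluster_size=3) for i in (1, 2, 3)]

# A partition separates n3 from n1 and n2.
nodes[0].reachable = 2
nodes[1].reachable = 2
nodes[2].reachable = 1

nodes[0].write("x", 1)                  # majority side keeps working
try:
    nodes[2].write("x", 2)
    refused = False
except ConnectionError:
    refused = True                      # minority side stops replying
```

Because the isolated node rejects the write, no conflicting value for `x` is ever accepted, at the cost of that node being unavailable during the partition.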

CA ( P )
nodes communicate ⇒ C and A can be preserved
partition ⇒ all nodes on one partition must be
turned off (failing nodes preserve A)
difficult and expensive

ACID databases
focus on consistency first and availability second

BASE databases
focus on availability first and consistency second

Single server
 no partitions
 consistency versus performance: relaxed isolation
levels or no transactions
Cluster
 consistency versus latency/availability
 durability versus performance (e.g. in memory DBs)
 durability versus latency (e.g. the master
acknowledges the update to the client only after
having been acknowledged by some slaves)

strong write consistency ⇒ write to the master
strong read consistency ⇒ read from the master

N = replication factor

(nodes involved in replication NOT nodes in the cluster)

W = nodes confirming a write
R = nodes needed for a consistent read

write quorum: W > N/2

read quorum: R + W > N

Consistency is on a per operation basis
Choose the most appropriate combination of
problems and advantages
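
The two quorum inequalities can be checked directly. The function names and the sample configurations below are illustrative; only the conditions W > N/2 and R + W > N come from the slide.

```python
def write_quorum_ok(n, w):
    """Write quorum: more than half of the N replicas confirm the write."""
    return w > n / 2


def read_quorum_ok(n, r, w):
    """Read quorum: every read set overlaps every write set (R + W > N)."""
    return r + w > n


# (N, R, W) combinations for a replication factor of 3.
configs = [(3, 2, 2), (3, 1, 1), (3, 1, 3)]
results = {
    (n, r, w): (write_quorum_ok(n, w), read_quorum_ok(n, r, w))
    for n, r, w in configs
}
```

With N = 3, choosing R = 2 and W = 2 satisfies both conditions; R = 1 and W = 1 satisfies neither and trades consistency for speed; R = 1 and W = 3 makes reads cheap by making every write wait for all replicas.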
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Scalability

  • 2. The need for speed 15/12/2012 Scalability 2
  • 3. Companies continuously grow
       - more and more data and traffic
       - more and more computing resources needed
       Solution: scaling
  • 4. vertical scalability = scale up
       - single server
       - performance ⇒ more resources (CPUs, storage, memory)
       - volumes increase ⇒ more difficult and expensive to scale
       - not reliable: individual machine failures are common
       horizontal scalability = scale out
       - cluster of servers
       - performance ⇒ more servers
       - cheaper hardware (more likely to fail)
       - volumes increase ⇒ complexity ~ constant, costs ~ linear
       - reliability: CAN operate despite failures
       - complex: use only if benefits are compelling
  • 6. All data on a single node
       Use cases
       - data usage = mostly processing aggregates
       - many graph databases
       Pros/Cons
       - RDBMSs or NoSQL databases
       - simplest and most often recommended option
       - only vertical scalability
       15/12/2012 Scalability – Vertical scalability
  • 8. Shared everything
       - every node has access to all data
       - all nodes share memory and disk storage
       - used on some RDBMSs
       15/12/2012 Scalability – Horizontal scalability: architectures and distribution models
  • 9. Shared disk
       - every node has access to all data
       - all nodes share disk storage
       - used on some RDBMSs
  • 10. Shared nothing
        - nodes are independent and self-sufficient
        - no shared memory or disk storage
        - used on some RDBMSs and all NoSQL databases
  • 11. Sharding: different data put on different nodes
        Replication: same data copied over multiple nodes
        Sharding + replication: the two orthogonal techniques combined
  • 12. Different parts of the data are placed on different nodes
        - data accessed together (aggregates) are on the same node
        - clumps arranged by physical location, to keep load even, or according to any domain-specific access rule
        [Diagram: three shards, each serving reads (R) and writes (W): Shard {A, F, H}, Shard {B, E, G}, Shard {C, D, I}]
  • 13. Use cases
        - different people access different parts of the dataset
        - to horizontally scale writes
        Pros/Cons
        - “manual” sharding with every RDBMS or NoSQL store
        - better read performance
        - better write performance
        - low resilience: only the data on the failing node becomes unavailable, but nothing else holds a copy of it
        - high licensing costs for RDBMSs
        - difficult or impossible cluster-level operations (querying, transactions, consistency controls)
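
The sharding idea on slides 12 and 13 can be sketched as a toy key-value store in which every aggregate lives on exactly one shard. The class `ShardedStore` and its methods are hypothetical names, and hash-based placement is only an assumption made for the example; the slides also allow placement by physical location, load, or a domain-specific rule.

```python
import hashlib


class ShardedStore:
    """Toy sharded key-value store: each aggregate lives on exactly one shard."""

    def __init__(self, shard_count):
        # One dictionary per shard stands in for one node.
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, aggregate_id):
        # A stable hash keeps the same aggregate on the same shard every time.
        digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def write(self, aggregate_id, value):
        self.shards[self.shard_for(aggregate_id)][aggregate_id] = value

    def read(self, aggregate_id):
        return self.shards[self.shard_for(aggregate_id)].get(aggregate_id)


store = ShardedStore(shard_count=3)
for key in "ABCDEFGHI":
    store.write(key, f"value-{key}")
```

Reads and writes for one aggregate touch a single shard, which is why both scale out; a query spanning all aggregates would have to visit every shard, which is the cluster-level difficulty listed among the cons.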
  • 14. Data replicated across multiple nodes
        - One designated master (primary) node
          - contains the original
          - processes writes and passes them on
        - All other nodes are slaves (secondary)
          - contain the copies
          - synchronized with the master during a replication process
  • 15. Master-slave replication
        [Diagram: one Master holding A, B, C serves writes (W) and reads (R); two Slaves, each holding a copy of A, B, C, serve reads (R) only]
  • 16. Use cases
        - load balancing cluster: data usage mostly read-intensive
        - failover cluster: single server with hot backup
        Pros/Cons
        - better read performance
        - worse write performance (write management)
        - high read (slave) resilience: master failure ⇒ slaves can still handle read requests
        - low write (master) resilience: master failure ⇒ no writes until old/new master is up
        - read inconsistencies: update not propagated to all slaves
        - master = bottleneck and single point of failure
        - high licensing costs for RDBMSs
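
A minimal in-memory sketch of master-slave replication, assuming asynchronous replication triggered by an explicit `replicate()` call. `Node`, `MasterSlaveCluster` and all method names are hypothetical and only illustrate the read inconsistency mentioned above.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    def __init__(self, slave_count):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(slave_count)]
        self.pending = []  # writes accepted by the master, not yet replicated
        self._next_slave = 0

    def write(self, key, value):
        # Every write goes through the single master.
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Replication process: push pending writes to every slave, in order.
        for key, value in self.pending:
            for slave in self.slaves:
                slave.data[key] = value
        self.pending.clear()

    def read(self, key):
        # Reads are spread over the slaves in round-robin fashion.
        slave = self.slaves[self._next_slave % len(self.slaves)]
        self._next_slave += 1
        return slave.data.get(key)


cluster = MasterSlaveCluster(slave_count=2)
cluster.write("A", 1)
stale = cluster.read("A")   # replication has not run yet
cluster.replicate()
fresh = cluster.read("A")   # the slaves have caught up
```

Here `stale` is `None` because the slave has not yet received the update, while `fresh` is `1`: reads scale with the number of slaves, at the price of a window in which they can return out-of-date data.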
  • 17. Data replicated across multiple nodes
        - All nodes are peers (equal weight): no master, no slaves
        - All nodes can both read and write
  • 18. [Diagram: three Peers, each holding a copy of A, B, C and serving both reads (R) and writes (W)]
  • 19. Use cases
        - load balancing cluster: data usage read/write-intensive
        - need to scale out more easily
        Pros/Cons
        - better read performance
        - better write performance
        - high resilience: node failure ⇒ reads/writes handled by other nodes
        - read inconsistencies: update not propagated to all nodes
        - write inconsistencies: same record written on different nodes at the same time
        - high licensing costs for RDBMSs
  • 20. Sharding + master-slave replication
        - multiple masters
        - each data item has a single master
        - node configurations:
          - master
          - slave
          - master for some data / slave for other data
        Sharding + peer-to-peer replication
  • 21. [Diagram: sharding + master-slave replication across six nodes. Master 1 (A, F, H): R W; Slave 1 (A, F, H): R; Master/Slave 2 (B, E, G): R W; Slave/Master 2 (B, E, G): R W; Master 3 (C, D, I): R W; Slave 3 (C, D, I): R]
  • 22. [Diagram: six peers, all serving R W. Peer 1/2 (A, F, H); Peer 1/4 (A, F, E); Peer 3/4 (B, E, G); Peer 2/3 (B, H, G); Peer 5/6 (C, D, I); Peer 5/6 (C, D, I)]
  • 23. Oracle Database
        - Oracle RAC: shared everything
        Microsoft SQL Server
        - All editions: shared nothing, master-slave replication
        IBM DB2
        - DB2 pureScale: shared disk
        - DB2 HADR: shared nothing, master-slave replication (failover cluster)
  • 24. Oracle MySQL
        - MySQL Cluster: shared nothing; sharding, replication, sharding + replication
        The PostgreSQL Global Development Group PostgreSQL
        - PGCluster-II: shared disk
        - Postgres-XC: shared nothing; sharding, replication, sharding + replication
  • 26. Inconsistent write = write-write conflict
        multiple writes of the same data at the same time (highly likely with peer-to-peer replication)
        Inconsistent read = read-write conflict
        read in the middle of someone else’s write
        15/12/2012 Scalability – Horizontal scalability: consistency
  • 27. - Pessimistic approach: prevent conflicts from occurring
        - Optimistic approach: detect conflicts and fix them
  • 28. Implementation (pessimistic approach)
        - write locks ⇒ acquire a lock before updating a value (only one lock at a time can be taken)
        Pros/Cons
        - often severely degrades system responsiveness
        - often leads to deadlocks (hard to prevent/debug)
        - relies on a consistent serialization of the updates*
        * sequential consistency ensuring that all nodes apply operations in the same order
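
As a single-process analogy for the write-lock idea, the sketch below guards a read-modify-write with `threading.Lock` from the Python standard library. `LockedCounter` and `worker` are hypothetical names; a distributed store would need a cluster-wide lock service, which this example does not attempt to model.

```python
import threading


class LockedCounter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Acquire the write lock before touching the value; only one
        # thread at a time can hold it, so no update is lost.
        with self._lock:
            current = self.value
            self.value = current + 1


def worker(counter, times):
    for _ in range(times):
        counter.increment()


counter = LockedCounter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Every increment waits for the lock, which is the responsiveness cost listed above: correctness is bought by serializing writers.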
  • 29. Implementation (optimistic approach)
        - conditional updates ⇒ test a value before updating it (to see if it's changed since the last read)
        - merged updates ⇒ merge conflicted updates somehow (save updates, record conflict and merge somehow)
        Pros/Cons
        - conditional updates rely on a consistent serialization of the updates*
        * sequential consistency ensuring that all nodes apply operations in the same order
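
A conditional update can be sketched with a per-key version number: the write succeeds only if the version the client last read is still current. `VersionedStore`, `VersionConflict` and `conditional_write` are hypothetical names, and the single in-process dictionary trivially provides the consistent serialization that a real cluster would have to guarantee.

```python
class VersionConflict(Exception):
    """Raised when the value changed since the caller last read it."""


class VersionedStore:
    def __init__(self):
        self._items = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._items.get(key, (0, None))

    def conditional_write(self, key, expected_version, value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(f"{key}: expected v{expected_version}, found v{current_version}")
        self._items[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()
store.conditional_write("room-42", 0, "booked by Alice")

# Two clients read the same version, then both try to update.
version, _ = store.read("room-42")
store.conditional_write("room-42", version, "booked by Bob")
try:
    store.conditional_write("room-42", version, "booked by Carol")
    conflict_detected = False
except VersionConflict:
    conflict_detected = True
```

The second client's write is rejected instead of silently overwriting the first, so the conflict is detected and can then be retried or merged.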
  • 30. - Logical consistency: different data make sense together
        - Replication consistency: same data ⇒ same value on different replicas
        - Read-your-writes consistency: users continue seeing their updates
  • 31. ACID transactions ⇒ aggregate-ignorant DBs
        Partially atomic updates ⇒ aggregate-oriented DBs
        - atomic updates within an aggregate
        - no atomic updates between aggregates
        - updates of multiple aggregates: inconsistency window
        - replication can lengthen inconsistency windows
  • 32. Eventual consistency
        - nodes may have replication inconsistencies: stale (out of date) data
        - eventually all nodes will be synchronized
  • 33. Session consistency
        - within a user’s session there is read-your-writes consistency (no stale data read from a node after an update on another one)
        - consistency lost if
          - session ends
          - the system is accessed simultaneously from different PCs
        - implementations
          - sticky session/session affinity = sessions tied to one node
            - affects load balancing
            - quite intricate with master-slave replication
          - version stamps
            - track latest version stamp seen by a session
            - ensure that all interactions with the data store include it
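
The version-stamp implementation can be sketched as follows, assuming one monotonically increasing stamp per replica and a single writer. `Replica`, `Session` and their methods are hypothetical names chosen for the example, not the API of any particular data store.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, version, key, value):
        self.data[key] = value
        self.version = version


class Session:
    def __init__(self, replicas):
        self.replicas = replicas
        self.last_seen = 0  # latest version stamp seen by this session

    def write(self, replica_index, key, value):
        replica = self.replicas[replica_index]
        replica.apply(replica.version + 1, key, value)
        self.last_seen = replica.version

    def read(self, key):
        # Accept only a replica that has caught up with this session's stamp.
        for replica in self.replicas:
            if replica.version >= self.last_seen:
                return replica.data.get(key)
        raise RuntimeError("no replica is up to date for this session")


replicas = [Replica(), Replica()]
session = Session(replicas)
session.write(1, "profile", "new")       # the update lands on the second replica only
own_view = session.read("profile")       # skips the stale first replica

other_session = Session(replicas)        # a new session carries no stamp
other_view = other_session.read("profile")
```

The writing session keeps seeing its own update, while a fresh session may still be served by the stale replica, matching the point that the guarantee is lost once the session ends.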
  • 35. Consistency: all nodes see the same data at the same time
        Latency: the response time in interactions between nodes
        Availability
        - every nonfailing node must reply to requests
        - the limit of latency that we are prepared to tolerate: once latency gets too high, we give up and treat data as unavailable
        Partition tolerance: the cluster can survive communication breakages (separating it into partitions unable to communicate with each other)
        15/12/2012 Scalability – Horizontal scalability: CAP theorem
  • 36. Transaction to transfer $50 from account A to account B
        1) read(A)
        2) A = A - 50
        3) write(A)
        4) read(B)
        5) B = B + 50
        6) write(B)
        - Atomicity
          - transaction fails after step 3 and before step 6 ⇒ the system should ensure that its updates are not reflected in the database
        - Consistency
          - A + B is unchanged by the execution of the transaction
  • 37. Same six-step transaction to transfer $50 from account A to account B
        - Isolation
          - without isolation, another transaction running between steps 3 and 6 would see inconsistent data (A + B will be less than it should be)
          - isolation can be ensured trivially by running transactions serially ⇒ performance issue
        - Durability
          - user notified that transaction completed ($50 transferred) ⇒ transaction updates must persist despite failures
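
The transfer used on slides 36 and 37 can be tried out with Python's standard `sqlite3` module, where using the connection as a context manager commits on success and rolls back on an exception. The `accounts` table, the `transfer` function and the starting balances are assumptions made for the sketch.

```python
import sqlite3


def transfer(conn, source, target, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the debit is never visible without the matching credit.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (source,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

transfer(conn, "A", "B", 50)        # succeeds
try:
    transfer(conn, "A", "B", 500)   # fails after the debit, so it is rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After both calls the failed transfer has left no trace and A + B still equals the initial total, which illustrates atomicity and the consistency invariant; an in-memory database obviously says nothing about durability.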
  • 38. Basically Available, Soft state, Eventually consistent (BASE)
        Soft state and eventual consistency are techniques that work well in the presence of partitions and thus promote availability
  • 39. Given the three properties of Consistency, Availability and Partition tolerance, you can only get two
  • 40. - C: being up and keeping consistency is reasonable
        - A: one node: if it’s up it’s available
        - P: a single machine can’t partition
  • 41. AP (forfeiting C)
        - partition ⇒ update on one node = inconsistency
  • 42. CP (forfeiting A)
        - partition ⇒ consistency only if one nonfailing node stops replying to requests
  • 43. CA (forfeiting P)
        - nodes communicate ⇒ C and A can be preserved
        - partition ⇒ all nodes on one partition must be turned off (failing nodes preserve A)
        - difficult and expensive
  • 44. ACID databases focus on consistency first and availability second
        BASE databases focus on availability first and consistency second
  • 45. Single server
        - no partitions
        - consistency versus performance: relaxed isolation levels or no transactions
        Cluster
        - consistency versus latency/availability
        - durability versus performance (e.g. in-memory DBs)
        - durability versus latency (e.g. the master acknowledges the update to the client only after having been acknowledged by some slaves)
  • 46. strong write consistency ⇒ write to the master
        strong read consistency ⇒ read from the master
  • 47. N = replication factor (nodes involved in replication, NOT nodes in the cluster)
        W = nodes confirming a write
        R = nodes needed for a consistent read
        write quorum: W > N/2
        read quorum: R + W > N
        Consistency is on a per operation basis
        Choose the most appropriate combination of problems and advantages
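
The two quorum inequalities can be checked mechanically for any choice of N, W and R. The function name `quorum_properties` and the sample configurations are hypothetical, chosen only to show how the conditions behave.

```python
def quorum_properties(n, w, r):
    """Evaluate the quorum conditions for replication factor n,
    w nodes confirming a write and r nodes contacted for a read."""
    return {
        "write_quorum": w > n / 2,   # a majority confirms each write
        "read_quorum": r + w > n,    # every read overlaps the latest write
    }


strict = quorum_properties(n=3, w=2, r=2)
fast_reads = quorum_properties(n=3, w=2, r=1)
fast_everything = quorum_properties(n=3, w=1, r=1)
```

With N = 3, choosing W = 2 and R = 2 satisfies both conditions; dropping R to 1 keeps the write quorum but no longer guarantees that a read sees the latest write; W = 1 and R = 1 satisfies neither, trading consistency for lower latency on each operation.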