Lecture 20 
Scalability
Agenda 
 Evolution - where are we today? 
 Requirements of 21st century web applications 
 Session State 
 Distribution Strategies 
 Scale Cube 
 Eventual Consistency 
– CAP Theorem
 Real World Example
Evolution 
 60s: IBM mainframes; limited layering or abstraction
 70s: IBM, DEC; mini-computers; Unix, VAX; “dumb” terminals; screens/files
 80s: PC, Intel; DOS, Mac, Unix, Windows; client/server; RMDB
 90s: Windows; Internet, HTTP; web browsers; web applications; RMDB
 00s: Windows, Linux, MacOS; browsers, services; domain applications; RMDB
 10s: iOS, Android; HTML5; browsers, apps; API; cloud; NoSQL
Motivation 
 Requirements of 21st century web systems 
– High availability 
– Millions of simultaneous users 
– Peak load of 1000s tx/sec 
 Example 
– What if we need to handle a load of 20,000 tx/sec?
– That’s 1.2 million tx per minute
Session State
Business Transactions 
 Transactions that span more than one request
– User is working with data before they are committed 
to the database 
• Example: User logs in, puts products in a shopping cart, 
buys, and logs out 
– Where do we keep the state between transactions?
State 
 Server with state vs. stateless server 
– Stateful server must keep the state between requests 
 Problem with stateful servers 
– Need more resources, limit scalability
Stateless Servers 
 Stateless servers scale much better 
 Use fewer resources 
 Example: 
– View book information 
– Each request is separate 
 REST was designed to be stateless
Stateful Servers 
 Stateful servers are the norm 
 Not easy to get rid of them 
 Problem: they take resources and cause server 
affinity 
 Example: 
– 100 users make a request every 10 seconds, each request takes 1 second
– One stateful object per user
– Objects are idle 90% of the time
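The arithmetic behind the 100-user example above can be checked with a short sketch. The function names are illustrative and not part of the lecture; the model simply assumes each user's requests are evenly spaced.

```python
def idle_fraction(request_interval_s, service_time_s):
    """Fraction of time a per-user stateful object waits between requests."""
    return 1.0 - service_time_s / request_interval_s


def average_busy_objects(users, request_interval_s, service_time_s):
    """Average number of stateful objects doing work at any instant."""
    return users * service_time_s / request_interval_s


# Numbers from the slide: 100 users, one request every 10 s, 1 s per request.
idle = idle_fraction(request_interval_s=10, service_time_s=1)
busy = average_busy_objects(users=100, request_interval_s=10, service_time_s=1)
```

Under these assumptions the server holds 100 objects in memory while, on average, only about 10 of them are in use at any moment.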
Session State 
 State that is relevant to a session 
– State used in business transactions and belonging to a specific client
– Data structure belonging to a client
– May not be consistent until it is persisted
 Session is distinct from record data
– Record data is long-term persistent data in a database
– Session state might end up as record data
EXERCISE
Question: 
Where do you store the session?
Ways to Store Session State 
 We have three players 
– The client using a web browser 
– The Server running the web application and domain 
– The database storing all the data
Ways to Store Session State 
 Three basic choices 
– Client Session State 
– Server Session State 
– Database Session State
Client Session State 
Store session state on the client 
 How It Works 
– Desktop applications can store the state in memory 
– Web solutions can store state in cookies, hide it in the 
web page, or use the URL 
– Data Transfer Object can be used 
– Session ID is the minimum client state 
– Works well with REST
Client Session State 
 When to Use It 
– Works well if server is stateless 
– Maximal clustering and failover resiliency 
 Drawbacks 
– Does not work well for large amounts of data
– Data gets lost if client crashes 
– Security issues
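One common way to reduce the security drawback of client session state is to sign the data before handing it to the client, so the server can detect tampering while staying stateless. The sketch below is a minimal illustration using only the Python standard library; `SECRET_KEY`, `encode_session` and `decode_session` are hypothetical names, and the signature protects integrity only (the client can still read the data).

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-secret-change-me"  # hypothetical key, for illustration only


def encode_session(state):
    """Serialize the session state and append an HMAC-SHA256 signature."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_session(cookie_value):
    """Verify the signature and return the state; reject modified values."""
    payload, _, signature = cookie_value.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("session cookie was modified")
    return json.loads(base64.urlsafe_b64decode(payload))
```

The returned string could be stored in a cookie or a hidden form field. Because every request carries the whole state, this only suits small amounts of data, as the drawbacks above note.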
Server Session State 
Store session state on a server in a 
serialized form 
 How It Works 
– Session Objects – data structures on the server 
keyed to session Id 
 Format of data 
– Can be binary, objects or XML 
 Where to store session 
– Memory, application server, file or local database
Server Session State 
 Specific Implementations 
– HttpSession 
– Stateful Session Beans – EJB 
 When to Use It 
– Simplicity, it is easy to store and retrieve data
 Drawbacks
– Data can get lost if server goes down
– Clustering and session migration become difficult
– Space complexity (memory of server) 
– Inactive sessions need to be cleaned up
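A minimal sketch of server session state follows: session objects held in server memory, keyed by session ID, with a timeout for cleaning up inactive sessions. The class and method names are assumptions made for this example and are not the API of HttpSession or any other product.

```python
import time
import uuid


class ServerSessionStore:
    """In-memory session objects keyed by session ID (illustrative only)."""

    def __init__(self, timeout_s=1800.0):
        self.timeout_s = timeout_s
        self._sessions = {}  # session_id -> [last_access_time, data dict]

    def create(self, now=None):
        """Start a session and return the ID the client must send back."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = [self._clock(now), {}]
        return session_id

    def get(self, session_id, now=None):
        """Return the session's data dict, or None if unknown or timed out."""
        entry = self._sessions.get(session_id)
        current = self._clock(now)
        if entry is None or current - entry[0] > self.timeout_s:
            self._sessions.pop(session_id, None)
            return None
        entry[0] = current  # touching the session resets its idle timer
        return entry[1]

    def purge_inactive(self, now=None):
        """Drop every session idle longer than the timeout; return the count."""
        current = self._clock(now)
        dead = [sid for sid, (last, _) in self._sessions.items()
                if current - last > self.timeout_s]
        for sid in dead:
            del self._sessions[sid]
        return len(dead)

    @staticmethod
    def _clock(now):
        # The optional `now` argument exists only to make the sketch testable.
        return time.monotonic() if now is None else now
```

Everything lives in one process's memory, which is exactly why the data is lost if that server goes down and why other servers in a cluster cannot see it.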
Database Session State 
Store session data as committed data in the database 
 How It Works 
– Session State stored in the database 
– Can be stored as temporary data to distinguish from 
committed record data 
 Pending session data 
– Pending session data might violate integrity rules 
– Use of pending field or pending tables 
• When pending session data becomes record data it is saved in the real tables
Database Session State 
 When to Use It 
– Improved scalability – easy to add servers 
– Works well in clusters 
– Data is persisted, even if data centre goes down 
 Drawbacks 
– Database becomes a bottleneck 
– Needs a clean-up procedure for pending data that never became record data, for example when the user simply left
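The pending-field approach and its clean-up routine can be sketched with SQLite from the Python standard library. The `orders` table, its columns and the function names are hypothetical, chosen only to illustrate the pattern.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a pending flag on each row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " session_id TEXT NOT NULL,"
        " product TEXT NOT NULL,"
        " pending INTEGER NOT NULL DEFAULT 1,"  # 1 = session data, 0 = record data
        " updated_at REAL NOT NULL)"
    )
    return conn


def add_to_cart(conn, session_id, product, now):
    """Store session state as a pending row; any server can read it back."""
    with conn:
        conn.execute(
            "INSERT INTO orders (session_id, product, updated_at) VALUES (?, ?, ?)",
            (session_id, product, now),
        )


def checkout(conn, session_id):
    """Pending session data becomes record data."""
    with conn:
        conn.execute("UPDATE orders SET pending = 0 WHERE session_id = ?", (session_id,))


def purge_abandoned(conn, older_than):
    """Retention routine: delete pending rows whose user never came back."""
    with conn:
        cur = conn.execute(
            "DELETE FROM orders WHERE pending = 1 AND updated_at < ?", (older_than,)
        )
    return cur.rowcount
```

A scheduled job would call `purge_abandoned` periodically; the right retention period depends on the application.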
What about dead sessions? 
 Client session 
– Not our problem 
 Server session 
– Web servers will send inactive message upon timeout 
 Database session 
– Needs to be cleaned up
– Retention routines
Caching 
 A cache is temporary data that is kept in memory between requests for performance reasons
– Not session data 
– Can be thrown away and retrieved any time 
 Saves the round-trip to the database 
 Can become stale or outdated
– Distributed caching (message-driven cache) is one way to solve that
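As a rough illustration of the points above, the sketch below caches values in memory and reloads them once they are older than a time-to-live. This is a simple expiry-based cache, not the message-driven distributed cache mentioned on the slide; `TTLCache` and the `loader` callback are names assumed for the example.

```python
import time


class TTLCache:
    """Keeps loaded values in memory and refetches them after ttl_s seconds."""

    def __init__(self, loader, ttl_s=30.0):
        self._loader = loader   # function that performs the database round-trip
        self._ttl_s = ttl_s
        self._entries = {}      # key -> (expires_at, value)

    def get(self, key, now=None):
        current = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is not None and current < entry[0]:
            return entry[1]                      # hit: no round-trip needed
        value = self._loader(key)                # miss or stale: reload
        self._entries[key] = (current + self._ttl_s, value)
        return value

    def invalidate(self, key):
        """Cached data can be thrown away at any time and retrieved again."""
        self._entries.pop(key, None)
```

A shorter time-to-live reduces staleness at the cost of more round-trips to the database.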
Practical Example 
 Client session 
– For preferences, user selections
 Server session
– Used for browsing and caching
– Logged in customer 
 Database 
– “Legal” session 
– Stored, trackable, need to survive between sessions
QUIZ 
We are building an application for processing development 
grants. The application is complicated and users can login any 
time and continue work on their application. What design 
pattern would we use for storing the session? 
A) Client Session State 
B) Server Session State 
C) Database Session State 
D) No state required
Answer: C) Database Session State
Distribution Strategies
Distributed Architecture 
 Distribute processing by placing objects on 
different nodes
 Benefits 
– Load is distributed between different nodes giving 
overall better performance 
– It is easy to add new nodes 
– Middleware products make calls between nodes 
transparent 
But is this true?
Distributed Architecture 
“This design sucks like an inverted hurricane” – 
Fowler 
Fowler’s First Law of Distributed Object Design: 
Don't Distribute your objects!
Remote and Local Interfaces 
 Local calls 
– Calls between components on the same node are 
local 
 Remote calls 
– Calls between components on different machines are 
remote 
 Object-oriented programming
– Promotes fine-grained objects
Remote and Local Interfaces 
 Local call within a process is very, very fast 
 Remote call between two processes is orders of magnitude slower
– Marshalling and un-marshalling of objects 
– Data transfer over the network 
 With fine-grained object-oriented design, remote components can kill performance
 Example
– Address object has get and set methods for each member: city, street, and so on
– Will result in many remote calls
Remote and Local Interfaces 
 With distributed architectures, interfaces must be coarse-grained
– Minimizing remote function calls
 Example
– Instead of having getters and setters for each field, bulk accessors are used
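The difference between fine-grained and coarse-grained interfaces can be made concrete by counting simulated round trips. In this sketch `RemoteAddress` is only a stand-in for a remote object (no real network is involved), and the `zip_code` field is an assumption added for illustration.

```python
from dataclasses import dataclass, replace


@dataclass
class AddressDTO:
    """Data Transfer Object carrying all address fields in one message."""
    street: str
    city: str
    zip_code: str


class RemoteAddress:
    """Stand-in for a remote object; each method call counts as one round trip."""

    def __init__(self, street, city, zip_code):
        self._data = AddressDTO(street, city, zip_code)
        self.round_trips = 0

    # Fine-grained interface: one remote call per field.
    def get_street(self):
        self.round_trips += 1
        return self._data.street

    def get_city(self):
        self.round_trips += 1
        return self._data.city

    def get_zip_code(self):
        self.round_trips += 1
        return self._data.zip_code

    # Coarse-grained interface: one remote call returns a copy of everything.
    def get_address(self):
        self.round_trips += 1
        return replace(self._data)
```

Reading three fields costs three round trips through the getters but only one through `get_address`, and the gap widens with every field added.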
Distributed Architecture 
 Better distribution model (X scaling) 
– Load Balancing or Clustering the application involves 
putting several copies of the same application on 
different nodes
Where You Have to Distribute 
 As an architect, try to eliminate as many remote calls as possible
– If this cannot be achieved, choose carefully where the distribution boundaries lie
 Distribution Boundaries 
– Client/Server 
– Server/Database 
– Web Server/Application Server 
– Separation due to vendor differences 
– There might be some genuine reason
Optimizing Remote Calls 
 We know remote calls are expensive 
 How can we minimize the cost of remote calls? 
 The overhead is 
– Marshaling or serializing data 
– Network transfer 
 Put as much data as possible into each call
– Coarse-grained calls
– Use binary protocols – avoid XML
The Right Balance 
 In Service Architecture, we want to split by 
functionality (Y Scaling) 
– Boundaries must be well designed – objects that work 
together are grouped together 
– APIs must be
The Scale Cube
Scaling the application 
 Today’s web sites must handle multiple simultaneous users
 Examples:
– All web based apps must handle several users
– mbl.is handles >200,000 users/day
– Betware must handle up to 100,000 simultaneous users and 1.2 million tx/min for terminal system peak load
The World we Live in 
 Average number of tweets per day: 500 million
 Total number of minutes spent on Facebook each month: 700 billion
 SnapChat has 100 million daily active users who send 1 billion snaps each day
 Instagram has over 200 million users on the platform who send 60 million photos per day
Scalability 
 Scalability is the ability of a system, network, or 
process to handle a growing amount of work in a 
capable manner or its ability to be enlarged to 
accommodate that growth 
 As load increases, how does the performance of the system vary?
Scalability 
 Scalability is the measure of how adding resources (usually hardware) affects performance
– Vertical scalability (up) – increase server power
– Horizontal scalability (out) – increase the number of servers
 Session migration
– Move the session from one server to another
 Server affinity 
– Keep the session on one server and make the client 
always use the same server
Scalability 
 What is the system’s growth pattern – what is the formula?
Amdahl’s Law 
 This law is used to find the maximum expected 
improvement to an overall system when only 
part of the system is improved 
 In parallel computing, it states that a small 
portion of the program which cannot be 
parallelized will limit the overall speed-up 
available from parallelization
Amdahl’s Law 
 Amdahl’s law for overall speedup

$$\text{Overall speedup} = \frac{1}{(1 - F) + \frac{F}{S}}$$

F = The fraction enhanced
S = The speedup of the enhanced fraction

If we make 20% of the program 10x faster: F = 0.2, S = 10

$$\text{Overall speedup} = \frac{1}{(1 - 0.2) + \frac{0.2}{10}} \approx 1.22$$

If S = 1000, overall speedup is approximately 1.25
Amdahl’s Corollary 
 Make the common case fast 
– Common case being defined as “most time 
consuming” 
40% 10x faster => 1.5625
20% 100x faster => 1.2469
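The figures quoted for Amdahl's law and its corollary can be reproduced with a few lines of code. The function name is illustrative; the formula is the one given above, with F the fraction enhanced and S the speedup of that fraction.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - F) + F / S)."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# The cases from the slides, as (F, S) pairs.
cases = [(0.2, 10), (0.2, 1000), (0.4, 10), (0.2, 100)]
results = {case: amdahl_speedup(*case) for case in cases}
```

Speeding up 40% of the program by 10x beats speeding up 20% of it by 100x, which is the point of the corollary: improve the part where most of the time is spent.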
The Optimization Process 
 There is only one way to test scalability: 
Measure 
– Find the bottleneck (the common case) 
– Hypothesize about improvement 
– Make optimization – change only one thing at a time
– Measure again and repeat
The Scaling Problem 
 We need to handle a growing number of requests to our system
 There are two ways to scale:
– Vertically (scale up): add more capacity to your hardware, more memory for example
– Horizontally (scale out): add more machines
Scaling Up 
 This is the traditional approach for many 
monolithic systems 
 Use a big powerful system 
 Pros: 
– Easy to do, easy to understand 
– One memory space and one database 
 Cons: 
– Has very hard limits 
– Does not work for the 21st century requirements
The Scale Cube
[Figure: the scale cube. Origin: one monolithic system (no splits). X-axis: horizontal duplication, towards many duplicated, clustered systems. Y-axis: split by functionality, towards multiple services. Other label: single data lookups.]
Scaling Out (X scaling) 
 This can work for monolithic systems if the database requirements are not high
 Use many machines and distribute the load
– Have one big powerful database 
 Pros: 
– Scales well – handles much more load 
– Shared database 
 Cons: 
– Session management is a challenge 
– Database is a bottleneck
Load Distribution 
 Use a number of machines to handle requests
 Load balancer directs each request to a particular server
– All requests in one session go 
to the same server 
– Server affinity 
 Benefits 
– Load can be increased 
– Easy to add new pairs 
– Uptime is increased 
 Drawbacks 
– Database is a bottleneck
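Server affinity can be approximated by deriving the target server from the session ID, so that every request in a session lands on the same machine. The sketch below uses a plain hash-modulo scheme; the function name and server names are hypothetical, and real load balancers typically offer their own mechanisms such as cookies or source-address rules.

```python
import hashlib


def pick_server(session_id, servers):
    """Map a session ID to one server; the same ID always maps to the same server."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
first = pick_server("session-42", servers)
second = pick_server("session-42", servers)
```

With this simple scheme, adding or removing a server changes the mapping for many existing sessions, which is one reason session state on the server makes scaling out harder.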
Clustering 
 With clustering, servers are connected together as if they were a single computer
– Requests can be handled by any server
– Sessions are stored on multiple servers
– Servers can be added and removed any time
 Problem is with state
– State in application servers reduces scalability
– Clients become dependent on particular nodes
Clustering State 
 Application functionality 
– Handle it yourself, but this is complicated, not worth 
the effort 
 Shared resources 
– Well-known pattern (Database Session State) 
– Bottlenecks limit scalability
 Clustering Middleware 
– Several solutions, for example JBoss, Terracotta 
 Clustering JVM or network 
– Low levels, transparent to applications
Scalability Example
The Scale Cube
[Figure: the scale cube. Origin: one monolithic system (no splits). X-axis: horizontal duplication, towards many duplicated, clustered systems. Y-axis: split by functionality, towards multiple services. Additional labels: single data lookups; many services; data/user splits; near infinite scaling.]
Eventual Consistency
Transactions 
 A transaction is a bounded sequence of work
– Both start and finish are well defined
– Transaction must complete on an all-or-nothing basis 
 All resources are in consistent state before and 
after the transaction 
 Example: Database transaction 
– Withdraw money from account
– Buy the product 
– Update stock information 
 Transactions must have ACID properties
ACID properties 
 Atomicity 
– All steps are completed successfully – or rolled back 
 Consistency 
– Data is consistent at the start and the end of the 
transaction 
 Isolation 
– A transaction’s changes are not visible to any other transaction until it commits successfully
 Durability 
– Any results of a committed transaction must be made 
permanent
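Atomicity can be demonstrated with SQLite from the Python standard library, using the purchase example from the previous slide. The schema and function names are assumptions made for this sketch; the point is that a failure in a later step also undoes the earlier withdrawal.

```python
import sqlite3


def setup_shop():
    """One account with balance 100 and one product with a single unit in stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("CREATE TABLE stock (product_id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL)")
    conn.execute("INSERT INTO accounts VALUES (1, 100)")
    conn.execute("INSERT INTO stock VALUES (1, 1)")
    conn.commit()
    return conn


def buy(conn, account_id, product_id, price):
    """Withdraw and update stock as one all-or-nothing unit of work."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (price, account_id)
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None or row[0] < 0:
            raise ValueError("insufficient funds")
        cur = conn.execute(
            "UPDATE stock SET quantity = quantity - 1 WHERE product_id = ? AND quantity > 0",
            (product_id,),
        )
        if cur.rowcount != 1:
            raise ValueError("out of stock")


def balance(conn, account_id):
    return conn.execute("SELECT balance FROM accounts WHERE id = ?", (account_id,)).fetchone()[0]
```

In this setup a second purchase of the same product fails at the stock step, and the account balance stays where it was after the first purchase.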
Transactional Resources 
 Anything that is transactional 
– Use transaction to control concurrency 
– Databases, printers, message queues 
 Transaction must be as short as possible 
– Provides greatest throughput 
– Should not span multiple requests 
– Long transactions span multiple requests
Transaction Isolation and Liveness
 Transactions lock tables (or resources)
– Need to provide isolation to guarantee correctness 
– Liveness suffers 
– We need to control isolation 
 Serializable Transactions 
– Full isolation 
– Transactions are executed serially, one after the other 
– Benefits: Guarantees correctness 
– Drawbacks: Can seriously damage liveness and 
performance
Isolation Level 
 Problems can be controlled by setting the 
isolation level 
– We don’t want to lock table since it reduces 
performance 
– Solution is to use as low isolation as possible while 
keeping correctness
Problem 
 Serialization creates scalability bottlenecks
 Applications that rely on fully serializable transactions in an RMDB have a hard time scaling
 Can we sacrifice something?
– Can we relax these requirements?
CAP Theorem 
 States that it is impossible for a distributed 
computer system to simultaneously provide all 
three of the following guarantees: 
– Consistency: all nodes see the same data at the 
same time 
– Availability: a guarantee that every request receives 
a response about whether it was successful or failed 
– Partition tolerance: the system continues to operate 
despite arbitrary message loss or failure of part of the 
system
ACID vs. BASE 
 BASE: Basically Available, Soft state, Eventual 
consistency 
 Basically Available: Guarantees availability of 
the database 
 Soft state: The state of the system can change 
over time - even without input. 
 Eventual consistency: The system will 
eventually become consistent over time given no 
new input
ACID vs. BASE 
 The difference has more to do with 
synchronous and asynchronous messaging 
 For large-scale systems, asynchronous messaging allows the fastest and least restricted workflow
Asynchronous 
 Eventual Consistency example
[Figure: asynchronous processing. Components shown: Requests, Web Layer, Msg Q (message queue), Process, Approve, RMDB.]
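One plausible reading of the asynchronous example is that the web layer accepts a request, places it on a message queue and returns immediately, while a separate process applies the change to the database later. The sketch below models that with an in-process queue; the class and method names are assumptions, and a real system would use a message broker and a database instead of a deque and a dict.

```python
from collections import deque


class EventuallyConsistentStore:
    """Writes are queued and applied later; reads may briefly see old data."""

    def __init__(self):
        self._queue = deque()   # stands in for the message queue
        self._records = {}      # stands in for the database

    def submit(self, key, value):
        """Web layer: accept the request and return without waiting."""
        self._queue.append((key, value))

    def read(self, key):
        """May return stale data until pending messages are processed."""
        return self._records.get(key)

    def process_pending(self):
        """Background process: apply all queued writes; return how many."""
        applied = 0
        while self._queue:
            key, value = self._queue.popleft()
            self._records[key] = value
            applied += 1
        return applied
```

Between `submit` and `process_pending` a reader sees the old value. Once the queue is drained and no new input arrives, all readers see the same state, which matches the definition of eventual consistency given earlier.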
Measuring Scalability 
 The only meaningful way to know about a system’s performance is to measure it
 Performance Tools can help this process 
– Give indication of scalability 
– Identify bottlenecks
Example tool: LoadRunner
Example tool: JMeter
Real World Examples: 
Betware Iceland Data Center
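[Figure: Betware Iceland data center. Components shown: two ISP connections (ISP1, ISP2), hardware firewall, load balancer, 16-port 2 Gbps QLogic SAN switch, 12 x 300 GB SAS 15K and 24 x 300 GB SAS 15K disk arrays, IBM blade chassis, system DB, CMS DB, backup software. Note on the figure: pair of each server on separate blade.]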
Summary 
 Requirements of 21st century web applications 
– Availability, Eventual consistency 
 Session State 
– Client, Server, Database 
 Distribution Strategies 
– Don’t distribute fine-grained objects – identify boundaries
 The Scale Cube 
 Eventual Consistency 
– CAP Theorem
 Real World Example

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

L21 scalability

  • 2. Agenda  Evolution - where are we today?  Requirements of 21st century web applications  Session State  Distribution Strategies  Scale Cube  Eventual Consistency – CAP Theorem  Real World Example
  • 3. Evolution 60s 70s 80s 90s 00s IBM Mainframes Limited layering or abstraction IBM, DEC Mini-computers Unix, VAX “Dumb” terminals Screens/Files PC, Intel, DOS, Mac, Unix, Windows Client/Server RMDB Windows Internet HTTP Web Browsers Web Applications RMDB Windows, Linux MacOS Browsers, Services Domain Applications RMDB 10s iOS Android HTML5 Browsers Apps API Cloud NoSQL
  • 4. Motivation  Requirements of 21st century web systems – High availability – Millions of simultaneous users – Peak load of 1000s tx/sec  Example – What if we need to handle a load of 20,000 tx/sec? – That’s 1.2 million tx per minute
  • 6. Business Transactions  Transactions that span more than one request – User is working with data before it is committed to the database • Example: User logs in, puts products in a shopping cart, buys, and logs out – Where do we keep the state between transactions?
  • 7. State  Server with state vs. stateless server – Stateful server must keep the state between requests  Problem with stateful servers – Need more resources, limit scalability
  • 8. Stateless Servers  Stateless servers scale much better  Use fewer resources  Example: – View book information – Each request is separate  REST was designed to be stateless
  • 9. Stateful Servers  Stateful servers are the norm  Not easy to get rid of them  Problem: they take resources and cause server affinity  Example: – 100 users make a request every 10 seconds, each request takes 1 second – One stateful object per user – Objects are idle 90% of the time
  • 10. Session State  State that is relevant to a session – State used in business transactions that belongs to a specific client – Data structure belonging to a client – May not be consistent until it is persisted  Session is distinct from record data – Record data is long-term persistent data in a database – Session state might end up as record data
  • 11. EXERCISE Question: Where do you store the session?
  • 12. Ways to Store Session State  We have three players – The client using a web browser – The Server running the web application and domain – The database storing all the data
  • 13. Ways to Store Session State  Three basic choices – Client Session State – Server Session State – Database Session State
  • 14. Client Session State Store session state on the client  How It Works – Desktop applications can store the state in memory – Web solutions can store state in cookies, hide it in the web page, or use the URL – Data Transfer Object can be used – Session ID is the minimum client state – Works well with REST
  • 15. Client Session State  When to Use It – Works well if server is stateless – Maximal clustering and failover resiliency  Drawbacks – Does not work well for large amounts of data – Data gets lost if client crashes – Security issues
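One way to reduce the "security issues" drawback of client session state is to sign the state before handing it to the browser, so the server can detect tampering. The sketch below is illustrative only: `SECRET_KEY`, `encode_state` and `decode_state` are hypothetical names, not something from the slides, and signing does not hide the data from the client (that would need encryption).

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key for illustration; a real key would come from configuration.
SECRET_KEY = b"demo-secret"


def encode_state(state: dict) -> str:
    """Serialize the session state and append an HMAC signature."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def decode_state(token: str) -> dict:
    """Check the signature and return the state; raise ValueError if it was altered."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("session state was modified on the client")
    return json.loads(base64.urlsafe_b64decode(payload))
```

The resulting token could travel in a cookie, a hidden form field, or the URL, which are the three web options the slides list.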
  • 16. Server Session State Store session state on a server in a serialized form  How It Works – Session Objects – data structures on the server keyed to session Id  Format of data – Can be binary, objects or XML  Where to store session – Memory, application server, file or local database
  • 17. Server Session State  Specific Implementations – HttpSession – Stateful Session Beans – EJB  When to Use It – Simplicity, it is easy to store and retrieve data  Drawbacks – Data can get lost if server goes down – Clustering and session migration become difficult – Space complexity (memory of server) – Inactive sessions need to be cleaned up
  • 18. Database Session State Store session data as committed data in the database  How It Works – Session State stored in the database – Can be stored as temporary data to distinguish from committed record data  Pending session data – Pending session data might violate integrity rules – Use of pending field or pending tables • When pending session data becomes record data it is saved in the real tables
  • 19. Database Session State  When to Use It – Improved scalability – easy to add servers – Works well in clusters – Data is persisted, even if data centre goes down  Drawbacks – Database becomes a bottleneck – Needs a clean-up procedure for pending data that never became record data (the user just left)
  • 20. What about dead sessions?  Client session – Not our problem  Server session – Web servers will send inactive message upon timeout  Database session – Needs to be cleaned up – Retention routines
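As a rough illustration of server session state, the sketch below keeps session objects in memory keyed by session ID and purges inactive ones. The class `SessionStore` and its methods are hypothetical names chosen for this example; real containers such as HttpSession provide this behaviour themselves. The `now` parameter exists only to make the timeout logic easy to follow.

```python
import time
import uuid
from typing import Optional


class SessionStore:
    """In-memory session objects keyed by session ID, with an idle timeout."""

    def __init__(self, timeout_seconds: float = 1800.0):
        self.timeout_seconds = timeout_seconds
        self._sessions = {}  # session_id -> (last_access_time, data dict)

    def create(self, now: Optional[float] = None) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = (time.time() if now is None else now, {})
        return session_id

    def get(self, session_id: str, now: Optional[float] = None) -> Optional[dict]:
        """Return the session data, or None if the session is unknown or expired."""
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None or now - entry[0] > self.timeout_seconds:
            self._sessions.pop(session_id, None)
            return None
        self._sessions[session_id] = (now, entry[1])  # refresh last access time
        return entry[1]

    def purge_inactive(self, now: Optional[float] = None) -> int:
        """Remove sessions idle longer than the timeout; return how many were removed."""
        now = time.time() if now is None else now
        stale = [sid for sid, (seen, _) in self._sessions.items()
                 if now - seen > self.timeout_seconds]
        for sid in stale:
            del self._sessions[sid]
        return len(stale)
```

Because the dictionary lives in one process, the data is lost if that server goes down and other servers cannot see it, which matches the drawbacks listed on the slide.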
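The pending-table idea and the retention routine for database session state might look roughly like this. It is a minimal sketch using Python's built-in `sqlite3` module; the table names `orders` and `pending_orders` and the functions are assumptions made for illustration, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, item TEXT);
    CREATE TABLE pending_orders (
        session_id TEXT, customer TEXT, item TEXT, last_seen REAL
    );
""")


def add_to_cart(session_id, customer, item, now):
    """Store session state as pending data, separate from record data."""
    with conn:
        conn.execute("INSERT INTO pending_orders VALUES (?, ?, ?, ?)",
                     (session_id, customer, item, now))


def checkout(session_id):
    """Turn pending session data into record data in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO orders (customer, item) "
            "SELECT customer, item FROM pending_orders WHERE session_id = ?",
            (session_id,))
        conn.execute("DELETE FROM pending_orders WHERE session_id = ?",
                     (session_id,))


def purge_abandoned(now, max_age):
    """Retention routine: delete pending rows from sessions that went quiet."""
    with conn:
        cur = conn.execute("DELETE FROM pending_orders WHERE last_seen < ?",
                           (now - max_age,))
        return cur.rowcount
```

Any application server can call these functions against the shared database, which is why this pattern works well in clusters, at the cost of extra database load.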
  • 21. Caching  Caching is temporary data that is kept in memory between requests for performance reasons – Not session data – Can be thrown away and retrieved any time  Saves the round-trip to the database  Can become stale or old and out-dated – Distributed caching (message driven cache) is one way to solve that
  • 22. Practical Example  Client session – For preferences, user selections  Server session – Used for browsing and caching – Logged in customer  Database – “Legal” session – Stored, trackable, need to survive between sessions
  • 23. QUIZ We are building an application for processing development grants. The application is complicated and users can login any time and continue work on their application. What design pattern would we use for storing the session? A) Client Session State B) Server Session State C) Database Session State D) No state required
  • 24. QUIZ We are building an application for processing development grants. The application is complicated and users can login any time and continue work on their application. What design pattern would we use for storing the session? A) Client Session State B) Server Session State C) Database Session State D) No state required ✔
  • 26. Distributed Architecture  Distribute processing by placing objects on different nodes
  • 27. Distributed Architecture  Distribute processing by placing objects on different nodes  Benefits – Load is distributed between different nodes giving overall better performance – It is easy to add new nodes – Middleware products make calls between nodes transparent But is this true?
  • 28. Distributed Architecture  Distribute processing by placing objects on different nodes “This design sucks like an inverted hurricane” – Fowler. Fowler’s First Law of Distributed Object Design: Don't distribute your objects!
  • 29. Remote and Local Interfaces  Local calls – Calls between components on the same node are local  Remote calls – Calls between components on different machines are remote  Object-oriented programming – Promotes fine-grained objects
  • 30. Remote and Local Interfaces  Local call within a process is very, very fast  Remote call between two processes is an order of magnitude slower – Marshalling and un-marshalling of objects – Data transfer over the network  With fine-grained object-oriented design, remote components can kill performance  Example – Address object has get and set method for each member, city, street, and so on – Will result in many remote calls
  • 31. Remote and Local Interfaces  With distributed architectures, interfaces must be coarse-grained – Minimizing remote function calls  Example – Instead of having getters and setters for each field, bulk accessors are used
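To make the difference between fine-grained and coarse-grained interfaces concrete, the sketch below counts round trips. `RemoteAddress` only stands in for an object on another node (no network is involved), and `AddressDTO` and all method names are hypothetical; each method call is treated as one remote call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressDTO:
    """Data Transfer Object carrying all address fields in one message."""
    street: str
    city: str
    zip_code: str


class RemoteAddress:
    """Stand-in for a remote object; every method call counts as one round trip."""

    def __init__(self):
        self._data = AddressDTO("Main Street 1", "Reykjavik", "101")
        self.remote_calls = 0

    # Fine-grained interface: one remote call per field.
    def get_street(self):
        self.remote_calls += 1
        return self._data.street

    def get_city(self):
        self.remote_calls += 1
        return self._data.city

    def get_zip_code(self):
        self.remote_calls += 1
        return self._data.zip_code

    # Coarse-grained interface: one bulk accessor for the whole address.
    def get_address(self):
        self.remote_calls += 1
        return self._data
```

Reading the three fields one by one costs three round trips, while `get_address()` returns the same information in one.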
  • 32. Distributed Architecture  Better distribution model (X scaling) – Load Balancing or Clustering the application involves putting several copies of the same application on different nodes
  • 33. Where You Have to Distribute  As architect, try to eliminate as many remote calls as possible – If this cannot be achieved, choose carefully where the distribution boundaries lie  Distribution Boundaries – Client/Server – Server/Database – Web Server/Application Server – Separation due to vendor differences – There might be some genuine reason
  • 34. Optimizing Remote Calls  We know remote calls are expensive  How can we minimize the cost of remote calls?  The overhead is – Marshalling or serializing data – Network transfer  Put enough data into each call – Coarse-grained calls – Use binary protocols – avoid XML
  • 35. The Right Balance  In Service Architecture, we want to split by functionality (Y Scaling) – Boundaries must be well designed – objects that work together are grouped together – APIs must be
  • 37. Scaling the application  Today’s web sites must handle multiple simultaneous users  Examples: – All web based apps must handle several users – mbl.is handles >200,000 users/day – Betware must handle up to 100,000 simultaneous users and 1.2 million tx/min for terminal system peak load
  • 39. The World we Live in  Average number of tweets per day 500 million  Total number of minutes spent on Facebook each month 700 billion  SnapChat has 100 million daily active users who send 1 billion snaps each day  Instagram has over 200 million users on the platform who send 60 million photos per day
  • 40. Scalability  Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth  With more load, how does the load of the system vary?
  • 41. Scalability  Scalability is the measure of how adding resources (usually hardware) affects the performance – Vertical scalability (up) – increase server power – Horizontal scalability (out) – increase the number of servers  Session migration – Move the session from one server to another  Server affinity – Keep the session on one server and make the client always use the same server
  • 42. Scalability  What is the system’s growth pattern – what is the formula?
  • 43. Amdahl’s Law  This law is used to find the maximum expected improvement to an overall system when only part of the system is improved  In parallel computing, it states that a small portion of the program which cannot be parallelized will limit the overall speed-up available from parallelization
  • 44. Amdahl’s Law  Amdahl’s law for overall speedup: $\text{Overall speedup} = \dfrac{1}{(1 - F) + \frac{F}{S}}$, where F = the fraction enhanced and S = the speedup of the enhanced fraction. If we make 20% of the program 10x faster, then F = 0.2 and S = 10: $\text{Overall speedup} = \dfrac{1}{(1 - 0.2) + \frac{0.2}{10}}$, which gives 1.22 in overall speedup. If S = 1000, the overall speedup is 1.25
  • 45. Amdahl’s Corollary  Make the common case fast – Common case being defined as “most time consuming” – 40% made 10x faster => 1.5625 – 20% made 100x faster => 1.2469
  • 46. The Optimization Process  There is only one way to test scalability: Measure – Find the bottleneck (the common case) – Hypothesize about improvement – Make optimization – change only one thing at a time – Measure again and repeat
  • 47. The Scaling Problem  We need to handle the number of requests to our system  There are two ways to scale: – Vertically or scale up: Add more capacity to your hardware, more memory for example – Horizontally or scale out: Add more machines
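The Amdahl's law numbers on these two slides can be reproduced with a few lines of Python. The function name `overall_speedup` is just a label chosen for this sketch; it takes the fraction enhanced (F) and the speedup of that fraction (S).

```python
def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Amdahl's law: 1 / ((1 - F) + F / S)."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


if __name__ == "__main__":
    # The four cases mentioned on the slides.
    for f, s in [(0.2, 10), (0.2, 1000), (0.4, 10), (0.2, 100)]:
        print(f"F={f}, S={s}: overall speedup {overall_speedup(f, s):.4f}")
```

As S grows without bound, the result approaches 1 / (1 - F), so improving 20% of a program can never yield more than a 1.25x overall speedup. That is why the corollary says to optimize the most time-consuming part first.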
  • 48. Scaling Up  This is the traditional approach for many monolithic systems  Use a big powerful system  Pros: – Easy to do, easy to understand – One memory space and one database  Cons: – Has very hard limits – Does not work for the 21st century requirements
  • 49. The Scale Cube [Figure: the Scale Cube. Axis labels: X-Axis: Horizontal Duplication; Y-Axis: Split by Functionality. Other labels: One Monolithic system; Multiple Services; Clustered; Single data lookups; Many Systems Duplicated; No splits; Splits by functionality]
  • 50. Scaling Out (X scaling)  This can work for monolithic systems if the database requirements are not high  Use many machines and distribute the load – Have one big powerful database  Pros: – Scales well – handles much more load – Shared database  Cons: – Session management is a challenge – Database is a bottleneck
  • 51. Load Distribution  Use a number of machines to handle requests  Load Balancer directs each request to a particular server – All requests in one session go to the same server – Server affinity  Benefits – Load can be increased – Easy to add new pairs – Uptime is increased  Drawbacks – Database is a bottleneck
  • 52. Clustering  With clustering, servers are connected together as if they were a single computer – Request can be handled by any server – Sessions are stored on multiple servers – Servers can be added and removed any time  Problem is with state – State in application servers reduces scalability – Clients become dependent on particular nodes
  • 53. Clustering State  Application functionality – Handle it yourself, but this is complicated, not worth the effort  Shared resources – Well-known pattern (Database Session State) – Problem with bottlenecks limits scalability  Clustering Middleware – Several solutions, for example JBoss, Terracotta  Clustering JVM or network – Low levels, transparent to applications
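Server affinity can be sketched as a deterministic mapping from session ID to server, so that every request in a session lands on the same node. The function `pick_server` and the server names below are hypothetical; real load balancers typically offer this as a configuration option (sticky sessions) and do not require custom code.

```python
import hashlib


def pick_server(session_id: str, servers: list) -> str:
    """Map a session ID to one server so all its requests go to the same node."""
    digest = hashlib.sha256(session_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
    for sid in ["session-a", "session-b", "session-a"]:
        print(sid, "->", pick_server(sid, servers))
```

One limitation of this simple modulo scheme is that changing the number of servers remaps most sessions, which is part of why session migration and clustering are hard when state lives in the application servers.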
  • 56. The Scale Cube [Figure: the Scale Cube. Axis labels: X-Axis: Horizontal Duplication; Y-Axis: Split by Functionality. Other labels: One Monolithic system; Multiple Services; Clustered; Single data lookups; Many Systems Duplicated; No splits; Many Services; Data/User splits; Splits by functionality; Near infinite scaling]
  • 58. Transactions  Transaction is a bounded sequence of work – Both start and finish are well defined – Transaction must complete on an all-or-nothing basis  All resources are in consistent state before and after the transaction  Example: Database transaction – Withdraw money from account – Buy the product – Update stock information  Transactions must have ACID properties
  • 59. ACID properties  Atomicity – All steps are completed successfully – or rolled back  Consistency – Data is consistent at the start and the end of the transaction  Isolation – Transaction is not visible to any other until that transaction commits successfully  Durability – Any results of a committed transaction must be made permanent
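Atomicity can be demonstrated with the withdraw / update-stock example from the Transactions slide. This is a minimal sketch with Python's built-in `sqlite3`; the tables, the CHECK constraints, and the function `buy` are assumptions made for illustration. A connection used as a context manager commits if the block succeeds and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (name TEXT PRIMARY KEY,
                           balance INTEGER CHECK (balance >= 0));
    CREATE TABLE stock (item TEXT PRIMARY KEY,
                        quantity INTEGER CHECK (quantity >= 0));
    INSERT INTO accounts VALUES ('anna', 100);
    INSERT INTO stock VALUES ('book', 1);
""")


def buy(customer, item, price):
    """Withdraw the price and reduce the stock as one all-or-nothing transaction."""
    try:
        with conn:  # commit on success, roll back if any statement fails
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (price, customer))
            conn.execute("UPDATE stock SET quantity = quantity - 1 WHERE item = ?",
                         (item,))
        return True
    except sqlite3.IntegrityError:
        return False  # a constraint was violated; no partial update remains
```

If the item is out of stock, the second UPDATE violates its constraint and the earlier withdrawal is rolled back too, so the account is never charged for a product that could not be delivered.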
  • 60. Transactional Resources  Anything that is transactional – Use transaction to control concurrency – Databases, printers, message queues  Transaction must be as short as possible – Provides greatest throughput – Should not span multiple requests – Long transactions span multiple requests
  • 61. Transaction Isolations and Liveness  Transactions lock tables (or resources) – Need to provide isolation to guarantee correctness – Liveness suffers – We need to control isolation  Serializable Transactions – Full isolation – Transactions are executed serially, one after the other – Benefits: Guarantees correctness – Drawbacks: Can seriously damage liveness and performance
  • 62. Isolation Level  Problems can be controlled by setting the isolation level – We don’t want to lock tables since it reduces performance – Solution is to use as low an isolation level as possible while keeping correctness
  • 63. Problem  Serialization creates scalability bottlenecks  Applications that rely on fully serializable transactions in an RMDB have a hard time scaling  Can we sacrifice something? – Can we relax these requirements?
  • 64. CAP Theorem  States that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: – Consistency: all nodes see the same data at the same time – Availability: a guarantee that every request receives a response about whether it was successful or failed – Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system
  • 66. ACID vs. BASE  BASE: Basically Available, Soft state, Eventual consistency  Basically Available: Guarantees availability of the database  Soft state: The state of the system can change over time - even without input.  Eventual consistency: The system will eventually become consistent over time given no new input
  • 67. ACID vs. BASE  The difference has more to do with synchronous and asynchronous messaging  For large scale systems asynchronous caters for the fastest and least restricted workflow
  • 68. Asynchronous  Eventual Consistency example [Diagram labels: Web Layer; Requests; Approve; RMDB; Msg Q; Process]
  • 69. Measuring Scalability  The only meaningful way to know about a system’s performance is to measure it  Performance Tools can help this process – Give indication of scalability – Identify bottlenecks
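The asynchronous flow named on this slide (web layer, message queue, processing step, database) could be sketched as below. This is only an in-process illustration with Python's standard `queue` module: `handle_request`, `process_pending` and the `database` dictionary are hypothetical stand-ins for a real web layer, background processor and RMDB.

```python
import queue

message_queue = queue.Queue()  # stands in for the message queue
database = {}                  # stands in for the RMDB


def handle_request(request_id, payload):
    """Web layer: accept the request immediately and defer the database write."""
    message_queue.put((request_id, payload))
    return "accepted"


def process_pending():
    """Background processor: apply queued messages to the database."""
    processed = 0
    while not message_queue.empty():
        request_id, payload = message_queue.get()
        database[request_id] = payload
        processed += 1
    return processed
```

Between `handle_request` returning and `process_pending` running, a reader of `database` does not yet see the new data. Once the queue has drained and no new input arrives, the database reflects every accepted request, which is the eventual consistency property described under BASE.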
  • 72. Real World Examples: Betware Iceland Data Center
  • 73. ISP1 ISP2 Hardware firewall Load balancer 16 port 2Gbps SAN switch QLogic 12 x 300GB SAS 15K 24 x 300GB SAS 15K IBM Blade Chassis System DB CMS DB Backup Software Pair of each server on separate blade
  • 77. Summary  Requirements of 21st century web applications – Availability, Eventual consistency  Session State – Client, Server, Database  Distribution Strategies – Don’t distribute fine-grained objects – identify boundaries  The Scale Cube  Eventual Consistency – CAP Theorem  Real World Example