Scalable Web Architecture and
     Distributed Systems


 Gloria Chow, Jacqui Killow, Peter Moon, Jessica Wong
Introduction to Large-Scale Web Systems
 The design of large-scale web systems is
 influenced by six key principles:
     Availability
     Performance
     Reliability
     Scalability
     Manageability
     Cost
Introduction to Large-Scale Web Systems
 Four core factors are central to almost all large
 web applications:
   Services
   Redundancy
   Partitions
   Failure Handling
Core Factors of Large Web
Systems
 Services
   Increase scalability by separating services so that
    each service has its own function
   Interactions outside of the service’s context are
    handled by an abstract interface
   This is called “Service-Oriented Architecture (SOA)”
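A minimal sketch of what an abstract interface between services could look like. The `ImageStore` interface, the in-memory implementation, and the `upload_then_fetch` caller are all hypothetical names chosen for illustration; the slides do not prescribe any particular API.

```python
from typing import Protocol


class ImageStore(Protocol):
    """Abstract interface: callers depend on this, not on a concrete service."""

    def read(self, name: str) -> bytes: ...

    def write(self, name: str, data: bytes) -> None: ...


class InMemoryImageStore:
    """One possible implementation; a networked service could replace it."""

    def __init__(self):
        self._files = {}

    def read(self, name):
        return self._files[name]

    def write(self, name, data):
        self._files[name] = data


def upload_then_fetch(store: ImageStore, name: str, data: bytes) -> bytes:
    """Caller code only uses the interface, so the backing service can change."""
    store.write(name, data)
    return store.read(name)
```

Because the caller sees only the interface, the service behind it can be scaled or swapped without changing the caller.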


 Redundancy
   The server needs to be able to create redundant
   copies of data in case of data loss
Core Factors of Large Web
Systems
 Partitions
   When performance degrades, more resources
    should be added
   Systems can:
     scale vertically: add more resources to the server
      (e.g. increasing capacity)
     scale horizontally: add more servers


 Failure Handling
   System should be able to recover from errors and
   faults
Achieving Fast and Scalable Data Access
 By providing fast access to data, services also
 become faster and more scalable.

 There are many options available for increasing
 the speed of data access:
   Caches
   Proxies
   Indexes
   Load Balancers
   Queues
Caches
 Similar to short-term memory
 A limited amount of storage space
 Only contains the most recently-accessed items
 Faster than accessing the original data source
 Used in all levels of architecture
   Most commonly found near the front end


 Two types of cache: Global vs. Distributed
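A minimal sketch of a small cache with limited space. It assumes a least-recently-used eviction policy, which is one common way to keep only the most recently accessed items; the slides do not name a policy, and `LRUCache` and `capacity` are illustrative names.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that keeps only the most recently accessed items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used item if full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
```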
Global Cache
 Involves adding a server or a file store that is
  faster than the original storage and accessible
  by all request nodes
 Request nodes query the cache in the same way
  they would query a local storage
 There are two ways a global cache can be set up
Two types of Global Cache:
 1. Cache is responsible for retrieval of data.
   For each request, the request node will check the cache first.
   If the data was not in the cache, the cache will pull the data
    from the origin and then keep it for future requests.
 [Diagram: four Request Nodes query the Global Cache, which sits in front of the Data origin]
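The first setup, where the cache itself fetches missing data, could be sketched as follows. `ReadThroughCache` and `load_from_origin` are hypothetical names, and the dictionary stands in for whatever storage a real cache server would use.

```python
class ReadThroughCache:
    """Global cache that fetches from the origin itself on a miss."""

    def __init__(self, load_from_origin):
        # load_from_origin: hypothetical callable mapping key -> value
        self._load = load_from_origin
        self._store = {}

    def get(self, key):
        if key not in self._store:
            # Miss: the cache, not the request node, pulls from the origin
            # and keeps the result for future requests.
            self._store[key] = self._load(key)
        return self._store[key]
```

Request nodes only ever call `get`; they never talk to the origin directly.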
Two types of Global Cache:
 2. Nodes are responsible for retrieval of data.
   1. For each request, the request node will check the global cache first.
   2. If the data was not in the cache, then the request node will
      retrieve it from the origin.
   3. When data is retrieved from the origin, it can be added to the cache.
 [Diagram: four Request Nodes, each connected to both the Global Cache and the Data origin]
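The second setup, where each request node handles misses on its own, could be sketched like this. The function name `fetch`, the plain dictionary used as the cache, and the `read_origin` callable are assumptions made for illustration.

```python
def fetch(key, cache, read_origin):
    """Request-node logic: check the cache, fall back to the origin, then populate."""
    value = cache.get(key)          # 1. check the global cache first
    if value is None:
        value = read_origin(key)    # 2. miss: the node reads the origin itself
        cache[key] = value          # 3. add the retrieved data to the cache
    return value
```

This sketch treats a stored `None` as a miss, which is acceptable here only because the example origin never returns `None`.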
Distributed Cache
  Each node owns part of the cached data
  Uses a hashing function so nodes can quickly find
   the data they need
  Storage space can easily be increased just by
   adding more nodes
  The node checks the cache based on an item key obtained by
   using a hash algorithm, and then the data origin
 [Diagram: four Request Nodes, each holding one cache partition (Cache A to Cache D), in front of the Data origin]
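A minimal sketch of locating the cache node for an item key with a hash function. It uses plain modulo hashing over SHA-256 as an assumption; the slides do not say which hashing scheme is used, and `pick_cache_node` and the node names are illustrative.

```python
import hashlib


def pick_cache_node(key, nodes):
    """Map an item key to one cache node using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

Every node computes the same answer for the same key, so no lookup table is needed. Note that with plain modulo hashing, changing the number of nodes remaps many keys; consistent hashing is a common way to limit that.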
Proxies
  Coordinates requests from multiple clients and
   relays them to the backend
  Can speed up data access by grouping similar
   requests and treating them as one
   (“collapsed forwarding”)
 [Diagram: three Request Nodes each send Request A to the Proxy Server, which forwards a single Request A to the Data origin]
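Collapsed forwarding could be sketched as follows for a batch of requests that arrive together. The function name, the `(client, key)` tuple shape, and the `fetch_from_backend` callable are hypothetical; a real proxy would also have to handle requests that overlap in time.

```python
def collapse_and_forward(requests, fetch_from_backend):
    """Serve a batch of (client, key) requests with one backend read per distinct key."""
    results = {}
    for _client, key in requests:
        if key not in results:
            # Only the first request for a key reaches the backend.
            results[key] = fetch_from_backend(key)
    # Every client still gets its own response, built from the shared result.
    return [(client, results[key]) for client, key in requests]
```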
Indexes
  A table that stores the location of data
  Stored in memory or somewhere local to incoming
   requests
  Increases speed of data reads
  Data writes are slower because the index must be
   updated for each write
 [Diagram: an index table maps data items A, B, C, D to Locations 0, 1, 2, 3 in memory; Location 1 holds B in three parts]
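A minimal sketch of the read and write trade-off of an index. `IndexedStore` is a hypothetical class; the list stands in for slower backing storage and the dictionary for the in-memory index.

```python
class IndexedStore:
    """Append-only storage with an in-memory index of key -> location."""

    def __init__(self):
        self._records = []   # stands in for the slower backing storage
        self._index = {}     # key -> position in _records

    def write(self, key, value):
        self._records.append(value)
        # Extra work on every write: the index must be kept up to date.
        self._index[key] = len(self._records) - 1

    def read(self, key):
        # Direct lookup by location, with no scan over the records.
        return self._records[self._index[key]]
```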
Load Balancers
  Distributes load across a set of nodes responsible
   for handling requests
  Algorithms determine how to distribute requests
 [Diagram: Requests A, B, and C arrive at the Load Balancer (LB), which forwards each one to one of three Request Nodes]
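One possible distribution algorithm is round robin, sketched below. `RoundRobinBalancer` and the node names are illustrative; picking a random node or selecting by some criterion would be equally valid choices.

```python
import itertools


class RoundRobinBalancer:
    """Hand each incoming request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        self._rotation = itertools.cycle(list(nodes))

    def route(self, request):
        """Return the (node, request) pair chosen for this request."""
        node = next(self._rotation)
        return node, request
```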
Queues
  Incoming requests are added to a queue and are
   taken off when the server is able to process them
  Allows clients to work asynchronously
 [Diagram: a Queue holding Tasks 1 to 5; Tasks 1 and 2 are shown as running]
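A minimal sketch of a task queue with a single worker, using Python's standard `queue` and `threading` modules. The `run_worker` function, the `None` stop sentinel, and the squaring tasks are assumptions made for illustration.

```python
import queue
import threading


def run_worker(tasks, results):
    """Take tasks off the queue one at a time until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            break
        results.append(task())  # process the task and record its result


tasks = queue.Queue()
results = []
worker = threading.Thread(target=run_worker, args=(tasks, results))
worker.start()

# The client enqueues work and moves on instead of waiting for each task.
for n in range(3):
    tasks.put(lambda n=n: n * n)

tasks.put(None)   # tell the worker to stop once the queue is drained
worker.join()
```

With one worker the tasks complete in the order they were queued; with several workers the completion order would no longer be guaranteed.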
Summary
 The six principles that govern the design of large
  web systems are: availability, performance,
  reliability, scalability, manageability, and cost.
 Services, redundancy, partitions, and failure
  handling are core factors common in almost all
  large web systems.
 Caches, Proxies, Indexes, Load Balancers and
  Queues are all used to increase the speed of
  data access. Multiple methods can be used
  together in one web system.
Questions?

Editor's notes

  1. If a node goes missing or down, the system will need to pull data from the datastore. The data will still be there, but performance will be slightly affected.
  2. Possible algorithms include picking random nodes, round robin, or selecting nodes based on some criteria.