SlideShare una empresa de Scribd logo
1 de 18
Descargar para leer sin conexión
High Scalability by Example –
                                          How can Web Architecture scale like
                                          Facebook, Twitter?




                                          JAX 2011, 03.05.2011
                                          Robert Mederer, Markus Bukowski
    Copyright © 2011 Accenture All Rights Reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.




Speaker

                             Robert Mederer                                                                     Markus Bukowski
                             Lead Technology Architect                                                          Senior Software Engineer
                              Anni-Albers-Straße 11                                                              Kaistraße 20
                                                                                                                 40221 Düsseldorf
                              80807 München
                                                                                                                 Mobil:    +49-175-57-64217
                              Mobil:    +49-175-57-68012
                                                                                                                 markus.bukowski@accenture.com
                              robert.mederer@accenture.com




Experience
2000 - 2005: Technology Architect and Software                                      2006 - 2009: Software Engineer in several projects
Engineer in several projects                                                        (during studies) for mainly telecommunication
                                                                                    companies
2006: Technical Architecture Lead, Integration and
Execution Architecture for Location-Based Service                                   2009 - 2011: Senior Software Engineer in several
Provider                                                                            projects

2009: Technical Architecture Lead, Frontend and                                     2009/2010: Coach and Software Engineer for a
Execution Architecture Government Agency                                            Health Insurance Fund

2009/2010: Technical Architecture and front-office                                  2011: Technical Architecture Lead during the
integration build lead, Integration and Execution                                   development of a Document Management Solution
Architecture Financial Services Agency                                              for a Government Agency

2011: Architect and Performance Engineer for
Location Based Services Platform
Copyright © 2010 Accenture. All rights reserved.                                                                                                 2
Accenture
High performance achieved

Company Profile                                    Worldwide Revenues $21.6 billion
•  Global management consulting,                   (in US$ billion, as of August 31, 2010)
   technology services and outsourcing                                          Communications &
   company                                            Resources                 High Tech
•  215.000 employees
•  Rank 47 among the                                              3.88
                                                                                4.83
   “Best Global Brands 2008”
•  Top 100 Employer
                                                               4.32
•  28 of the DAX-30-Companies                                                      2.98
•  96 of the Fortune-Global-100                                                             Public
•  More than three-quarters of the Fortune-        Financial             5.53               Service
                                                   Services
   Global-500
•  87 of our Top 100-clients have been with                           Products
   us for 10 or more years



Copyright © 2010 Accenture. All rights reserved.                                                      3




Local Accenture … ???

Geographic unit
•  Austria
•  Switzerland
•  Germany
Employees                                                                         Berlin
•  Ca. 6000                                                       Düsseldorf
•  We are hiring!
Exciting Technology work                                          Frankfurt
•  Large scale projects
   (100+ people / multiple years)
                                                                                Munich
•  Most challenging requirements
   –  Stock Exchange / Banking / Trading Systems                                           Vienna
   –  AEMS Mobility Platform
   –  Large Scale Web Applications                                Zurich
      (> 1M page views / day)
   –  Batch Architectures


Copyright © 2010 Accenture. All rights reserved.                                                      4
Agenda


•  Introduction
   – Scalability & CAP Theorem
   – Traditional websites & Social media websites
   – High Scalability in Numbers
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                             5




Scalability



 A system’s capacity to uphold the same performance
               under heavier volumes.

(Patterns for Performance and Operability: Building and Testing Enterprise Software, Chris
                                  Ford et. al., 2008)




Copyright © 2011 Accenture. All rights reserved.                                             6
Vertical Scalability


•  Is achieved by increasing the capacity of a
   single node
   – CPU,
   – Memory,
   – Bandwidth, …
•  Simple Process
   – Application is generally not affected by
     those changes

•  Classical Example are Super Computers
   like
   – HP Integrity Superdome
   – IBM Mainframe
                                                                       Source: Hewlett-Packard




Copyright © 2011 Accenture. All rights reserved.                                                      7




Horizontal Scalability


•  Application is spread on a cluster with
   several nodes
•  Nodes can be added to scale out
•  Produces overhead
   – Keep cluster consistent
   – Node error detection and handling
   – Communication between nodes
•  May be used to increase reliability and
   availability

•  Distributed Systems and Programs like
   – SETI@Home                                     Source: Space Sciences Laboratory, U.C. Berkeley


   – World Wide Web
   – Domain Name Service

Copyright © 2011 Accenture. All rights reserved.                                                      8
CAP Theorem (Brewer‘s Theorem)


                                                                             •  Consistency – all clients see the
                                                                                same data at the same time
                           Consistency                                       •  Availability – all clients can find all
                                                                                data even in presence of failure
                                                                             •  Partition Tolerance – system works
                                                                                even when one node failed


   Partition
                                                     Availability
   Tolerance




                                                    Impossible
Source: PODC-keynote, Towards Robust Distributed Systems, Dr. Eric A. Brewer, 2000
Copyright © 2011 Accenture. All rights reserved.                                                                           9




CAP Theorem

Normally, two of these properties for any shared-data system
                  Consistency + Availability
    C             •  High data integrity
 P      A         •  Single site, cluster database, LDAP, etc.
                  •  2-phase commit, data replication, etc.


          C                       Consistency + Partition
                                  •  Distributed database, distributed locking, etc.
   P             A                •  Pessimistic locking, etc.

                                  Availability + Partition
          C                       •  High scalability
   P             A                •  Distributed cache, DNS, etc.
                                  •  Optimistic locking, expiration/leases (timeout), etc.

Source: “Architecting Cloudy Applications”, David Chou
Copyright © 2011 Accenture. All rights reserved.                                                                          10
Agenda


•  Introduction
   – Scalability & CAP Theorem
   – Traditional websites & Social media websites
   – High Scalability in Numbers
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                 11




Traditional Websites




(Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011)

Copyright © 2011 Accenture. All rights reserved.                                 12
Social Media Websites




(Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011)

Copyright © 2011 Accenture. All rights reserved.                                 13




Agenda


•  Introduction
   – Scalability & CAP Theorem
   – Traditional websites & Social media websites
   – High Scalability in Numbers
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                 14
High Scalability in Numbers

The World Is Obsessed With Facebook (Video)




Source: http://www.youtube.com/watch?v=xJXOavGwAW8
Copyright © 2011 Accenture. All rights reserved.                                                            15




High Scalability in Numbers

 facebook                                                                        twitter

 Active users                                                500.000.000 New accounts per day        500.000



 Active users logging in                                     250.000.000 Tweets per day          140.000.000
 daily                                                                   (a year ago)            (50.000.000)


 Active mobile users                                         250.000.000 Max tweets per second         6.939



 In 20 minutes …


 Links shared                                                   1.000.000

 Status updates                                                 1.851.000


 Photos uploaded                                                2.716.000

Source: http://www.facebook.com/press/info.php?statistics, http://www.allfacebook.com/
 Comments made                                                10.208.000                                    16
Copyright © 2011 Accenture. All rights reserved.
High Scalability in Numbers

 LinkedIn
 Members                                                                                          >100.000.000
 Connections between Members                                                                      1.300.000.000


 Visitors per month                                                                                128.000.000
 % of global internet users who visit LinkedIn daily                                                        4%




Source: http://blog.linkedin.com, http://press.linkedin.com/about/, LinkedIn Demographics 20111

Copyright © 2011 Accenture. All rights reserved.                                                              17




Agenda


•  Introduction
•  Accenture – High Scalability by Example
   – The Royal Wedding Website
   – US Based Professional Sport League .com Website
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                                              18
Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
   – Facebook
   – Twitter
   – LinkedIn
   – Challenges
•  Common concepts of Scalability
•  Conclusion


Copyright © 2011 Accenture. All rights reserved.              19




Facebook – Key Facts


•  Biggest Social Network

•  People are linked together
•  People share and exchange information in real-time

•  Facebook provides a
   – A personalized News Feed containing news about friends
   – Picture Service including people tagging / linking
   – Messaging Service to stay in connect with friends
   – Platform to play Games with friends




Copyright © 2011 Accenture. All rights reserved.              20
Facebook – Architecture (1)


•  Facebook „main“ architecture is based                                                                  Memcache
   on a LAMP stack
•  The application logic resides in Web
   Server layer (PHP based)
•  The application layer takes care of the
   data distribution

•  Some services do not fit (C++, Java)
   – Search                                                                                                  MySQL
   – Ads
   – People You May Know
                                                                              Invalidation logic is implemented in
   – Multifeed
                                                                              the application!



Copyright © 2011 Accenture. All rights reserved.                                                                     21




Facebook – Architecture (2)

Multifeed / Newsfeed
•  Distributed System

•  Every user gets
   – 45 best rated updates
   –  on every reload

•  Every DB-Update results in
   a Scribe Notification
   –  Leaf Nodes / Server are
     notified



Source: “High Performance at Massive Scale – Lessons learned at Facebook”, Jeff Rothschild, 2009
Inspired By: “Architecting Cloudy Applications”, David Chou, 2010
Copyright © 2011 Accenture. All rights reserved.                                                                     22
Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
   – Facebook
   – Twitter
   – LinkedIn
   – Challenges
•  Common concepts of Scalability
•  Conclusion


Copyright © 2011 Accenture. All rights reserved.   23




Twitter – Key Facts


•  Twitter is a real-time information network

•  Twitter‘s key characteristics
   – Tweets are used for sharing information
   – Timeline of tweets must be kept
   – A Social Graph is maintained to deliver
     information




Copyright © 2011 Accenture. All rights reserved.   24
Twitter - Architecture


                                                          Load Balancers
Web
Frontend
                                                         Apache mod_proxy

Apps &                                                                                 Distributed
                                                          Rails (Unicorn)              Cache
Services

                                        Flock                memcached       Kestrel   Queues
Partitioned
Data                                                                                   Distributed
                                                     MySQL           Cassandra
                                                                                       Storage

                                                             Daemons
Source: “Architecting Cloudy Applications”, David Chou
Copyright © 2011 Accenture. All rights reserved.                                                 25




Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
   – Facebook
   – Twitter
   – LinkedIn
   – Challenges
•  Common concepts of Scalability
•  Conclusion


Copyright © 2011 Accenture. All rights reserved.                                                 26
LinkedIn – Key Facts


 •  A social media networking platform to find
    connections to recommended job candidates,
    industry experts and business partners
 •  99% pure Java

 LinkedIn – Applications
 •  Groups
 •  Job market
 •  Mailing
 •  Profiles (Friends and Companies)
 •  Mobile Applications (Android, iPhone,
    Blackberry, Palm)




 Copyright © 2011 Accenture. All rights reserved.                                                                                         27




 LinkedIn - Architecture

Web Frontend
                                                                                                  LinkedIn
                                                                                                  Spring Search
                                                                                                  Grails MVC
                                                                                                  Queuing Spring
                                                                                                  Caching / Messaging
                                                                                                  DWR
       Spring MVC                                   Grails                          DWR              Spring Framework (IoC, Web
                                                                                                       Lucene Searchprocessing AOP,
                                                                                                       Model-View-Controller DI, for
                                                                                                       Asynchronous Engine
                                                                                                  •  Direct Web Remoting (AJAX for
                                                                                                       Voldemort
                                                                                                       Open-source web application
                                                                                                       Frameworktriggered writes
                                                                                                  •  Java Library) Concerns,..) by
                                                                                                     Separation of open "coding
                                                                                                       Aims to bring the source in
                                                                                                          •  Apache combines high-
                                                                                                          •  Voldemort
                                                                                                                user
Apps & Services                                                                                   •  RPC library toparadigm to the text
                                                                                                       Used performance, full-featured
                                                                                                     LinkedIn extensions with Groovy
                                                                                                                memory caching
                                                                                                                for
                                                                                                       convention" call Java functions
                                                                                                                search engine library written
                                                                                                                storage system so that a
      Service        Service                 LinkedIn Spring              Service    Service         from JavaScript and (lin:bean,
                                                                                                          •  XML Vocabulary to call
                                                                                                  StorageentirelyLoad of Page not
                                                                                                                Initial
                                                                                                  •  Why Grails? incaching tier is
                                                                                                                  Valdemort
                                                                                                                  Hadoop
                                                                                                                separate Java
                                                                                                  Storagerequired a more from Javaweb
                                                                                                                  Sensei
                                                                                                     JavaScript functions productive
                                                                                                          •  lin:component, …)
                                                                                                                Needed software
                                                                                                       Hadoop isOperations framework
                                                                                                                Pull a
                                                                                                  •  ZoieProperty Editors a major
                                                                                                       Valdemort used as
                                                                                                          •  Forframework
                                                                                                  •  Used supports data-intensive
                                                                                                       distributed database
                                                                                                       distributed key-value data cache
                Lucene
    Zoie                         Bobo        Caching         Voldemort    ehcache      Memcache   •  that realtime indexing/search
                                                                                                       ehcache
                                                                                                          •  …
                                                                                                          •  New kind of application context
              Search Index                                                                                •  Asynchronous Communication
                                                                                                       for• the social graph to (map/
                                                                                                       horizontallyprototyping
                                                                                                                Rapid
                                                                                                  •  distributed applications source,
                                                                                                                system scaledopen
                                                                                                          •  for life-cycle management with
                                                                                                                Ehcache is an
                                                                                                          •  Push Operations with and
                                                                                                                demonstrate feasibility
                                                                                                  •  reduce /Consistency cachehigh
                                                                                                                standards-based
                                                                                                       relaxed distributed file system)  used
                                                                         Queuing / Messaging
                                                                                                       Bobo productivity gain
                                                                                                       Advantage:
                                                                                                             components
                                                                                                       durabilityis usedwith engine offload
                                                                                                          •  …
                                                                                                                to boost performance,
                                                                                                                     guarantees majorjava/
                                                                                                  •  Hadoop database and simplify the
                                                                                                                Integration as a
                                                                                                                faceted search existing for
                                                                              ActiveMQ                          the concurrent read
                                                                                                                Fast
                                                                                                                Socialkey-value data
                                                                                                                        Graph
                                                                                                       distributed index building store
                                                                                                          •  spring-based solutions
                                                                                                                scalability
                                                                                                                Fast
                                                                                                          •  handles semi-structured and
                                                                                                  •  for the social graph
                                                                                                       Memcache structured data
Storage
                                                                                                  •  Bottleneck:
                                                                                                          •  Distributed Cache
               MySQL                    Sensei                  Hadoop              Voldemort             •  Read data
                                                                                                          •  Index building

• Keep core data ACID
• Keep replicated and cached data BASE
• Cache data on a cheap memory (memcached)
• Use a hint to route the client to his / her’s ACID data
 Source: Project Voldemort, Jay Kreps            Source: http://sna-projects.com
 Copyright © 2011 Accenture. All rights reserved.                                                                                         28
Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion




Copyright © 2011 Accenture. All rights reserved.                                                 29




Common concepts of Scalability




                                                         parallelization

              asynchronous                                                       idempotent
                                                         7 Habits of              operations
                                                            Good
           partitioned data                              Distributed           fault-tolerance
                                                          Systems

                 optimistic concurrency                                    shared nothing

Source: “Architecting Cloudy Applications”, David Chou
Source: highscalability.com
Copyright © 2011 Accenture. All rights reserved.                                                 30
Common concepts of Scalability
    Hybrid architectures


• Scale-out (horizontal)                           • Scale-up (vertical)
  – BASE                                             – ACID
  – focus on “commit”                                – availability first
  – optimistic locking                               – pessimistic locking
  – shared nothing                                   – transactional
  – maximize scalability                             – favor accuracy/consistency


                                 Most distributed systems realize both




Copyright © 2011 Accenture. All rights reserved.                                    31




Agenda


•  Introduction
•  Accenture – High Scalability by Example
•  Architecture at Internet Scale
•  Common concepts of Scalability
•  Conclusion
   –  Questionnaire
   –  Déjà vu - Common concepts
   –  Don’t forget the Infrastructure




Copyright © 2011 Accenture. All rights reserved.                                    32
Conclusion
                 Questionnaire


•  Is there a need to scale my application?
   –  Vertical scaling is more easy to achieve
   –  Use horizontal scaling only when required

•  Is there a plan to proof your designed solution?
   –  Plan to do a lot of realistic Proof-of-Concepts

•  Is there a one size fits all solution?
   –  NO!

•  How important is ACID?
   – Is BASE enough?
   – Can a NoSQL solution be used?
Copyright © 2011 Accenture. All rights reserved.             33




Conclusion
      Déjà vu - Common concepts


•  Asynchronous Processing
   – Keep Facebook, Twitter and LinkedIn in mind
   –  Asynchronous writes and even UI

•  Data Partioining
   –  You must understand your data
   –  Is a hybrid solution as used by LinkedIn applicable?

•  Design To Failure
   –  Lessons learned from Amazon‘s cloud solution outage




Copyright © 2011 Accenture. All rights reserved.             34
Conclusion
      Don’t forget the Infrastructure


•  Disk IO compared to RAM is slow
      Avoid „slow“ IO access whenever possible
      Caching, Caching, Caching
      Bulk Updates (Batch) over small updates

•  Network Latency
      Keep in mind that there is a network
      Where are the users located?
      How are your users connected? mobile application?
      Is there a need for a datacenter distribution?

•  Hardware configuration
   –  Is the CPU Power Saving working right?
Copyright © 2011 Accenture. All rights reserved.          35




Questions ?




Copyright © 2011 Accenture. All rights reserved.          36

Más contenido relacionado

La actualidad más candente

DevOps and the DBA- 24 Hours of Pass
DevOps and the DBA-  24 Hours of PassDevOps and the DBA-  24 Hours of Pass
DevOps and the DBA- 24 Hours of PassKellyn Pot'Vin-Gorman
 
Azure Storage Revisited
Azure Storage RevisitedAzure Storage Revisited
Azure Storage RevisitedJoel Cochran
 
LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)Sid Anand
 
Moving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application MigrationMoving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application Migrationanilmadugula
 
Randy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural PrinciplesRandy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural Principlesdeimos
 
Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010cScape
 
Rethink your architecture - Marten Deinum
Rethink your architecture - Marten DeinumRethink your architecture - Marten Deinum
Rethink your architecture - Marten DeinumNLJUG
 
JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010David Nuescheler
 
Effective admin and development in iib
Effective admin and development in iibEffective admin and development in iib
Effective admin and development in iibm16k
 
Responsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROIResponsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROIJuan Carlos Duron
 
The Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogicThe Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogicBrian Huff
 
Build highly scalable_low_latency_applications
Build highly scalable_low_latency_applicationsBuild highly scalable_low_latency_applications
Build highly scalable_low_latency_applicationsShivnarayan Varma
 
windows server 2012 R2
windows server 2012 R2windows server 2012 R2
windows server 2012 R2Gol D Roger
 
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...Roberto Vazquez Delgado
 
Databases & Microsoft SQL Server
Databases & Microsoft SQL ServerDatabases & Microsoft SQL Server
Databases & Microsoft SQL ServerMahmoud Abdallah
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0EDB
 
Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)Chris Richardson
 

La actualidad más candente (20)

DevOps and the DBA- 24 Hours of Pass
DevOps and the DBA-  24 Hours of PassDevOps and the DBA-  24 Hours of Pass
DevOps and the DBA- 24 Hours of Pass
 
Azure Storage Revisited
Azure Storage RevisitedAzure Storage Revisited
Azure Storage Revisited
 
LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)LinkedIn Data Infrastructure Slides (Version 2)
LinkedIn Data Infrastructure Slides (Version 2)
 
How to prepare for your SharePoint upgrade
How to prepare for your SharePoint upgradeHow to prepare for your SharePoint upgrade
How to prepare for your SharePoint upgrade
 
Moving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application MigrationMoving to Web 2.0 - Best Practices for Business and Application Migration
Moving to Web 2.0 - Best Practices for Business and Application Migration
 
Randy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural PrinciplesRandy Shoup eBays Architectural Principles
Randy Shoup eBays Architectural Principles
 
Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010Planning your Migration for SharePoint 2010
Planning your Migration for SharePoint 2010
 
DevOps and DBA- Delphix
DevOps and DBA-  DelphixDevOps and DBA-  Delphix
DevOps and DBA- Delphix
 
Rethink your architecture - Marten Deinum
Rethink your architecture - Marten DeinumRethink your architecture - Marten Deinum
Rethink your architecture - Marten Deinum
 
JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010JBoye Presentation: WCM Trends for 2010
JBoye Presentation: WCM Trends for 2010
 
Effective admin and development in iib
Effective admin and development in iibEffective admin and development in iib
Effective admin and development in iib
 
Responsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROIResponsive Web Design ~ Best Practices for Maximizing ROI
Responsive Web Design ~ Best Practices for Maximizing ROI
 
The Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogicThe Top 10 Things Oracle UCM Users Need To Know About WebLogic
The Top 10 Things Oracle UCM Users Need To Know About WebLogic
 
DevOps for the DBA- Jax Style!
DevOps for the DBA-  Jax Style!DevOps for the DBA-  Jax Style!
DevOps for the DBA- Jax Style!
 
Build highly scalable_low_latency_applications
Build highly scalable_low_latency_applicationsBuild highly scalable_low_latency_applications
Build highly scalable_low_latency_applications
 
windows server 2012 R2
windows server 2012 R2windows server 2012 R2
windows server 2012 R2
 
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
Successful Migration to SharePoint 2013 - Planning Considerations & Migration...
 
Databases & Microsoft SQL Server
Databases & Microsoft SQL ServerDatabases & Microsoft SQL Server
Databases & Microsoft SQL Server
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0
 
Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)Developing modular, polyglot applications with Spring (SpringOne India 2012)
Developing modular, polyglot applications with Spring (SpringOne India 2012)
 

Similar a High Scalability by Example – How can Web-Architecture scale like Facebook, Twitter, YouTube, etc.

Things you should know about Scalability!
Things you should know about Scalability!Things you should know about Scalability!
Things you should know about Scalability!Robert Mederer
 
Portfolio Sameer
Portfolio SameerPortfolio Sameer
Portfolio Sameersamahmedksa
 
Career Portfolio Sameer Ahmed
Career Portfolio Sameer AhmedCareer Portfolio Sameer Ahmed
Career Portfolio Sameer Ahmedsamahmedksa
 
Destination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud PresentationDestination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud PresentationIsaac Christoffersen
 
Telecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin SystemsTelecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin SystemsSriram Subramanian
 
OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013OCSL
 
CWIN17 london becoming cloud native part 2 - guy martin docker
CWIN17 london   becoming cloud native part 2 - guy martin dockerCWIN17 london   becoming cloud native part 2 - guy martin docker
CWIN17 london becoming cloud native part 2 - guy martin dockerCapgemini
 
Akshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infraAkshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infraAkshay Guleria
 
CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016Don Mori
 
AWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWSAWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWSAmazon Web Services
 
Startups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demoStartups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demoCloudOps Summit
 
Sybrant Technologies Company Presentation
Sybrant Technologies Company PresentationSybrant Technologies Company Presentation
Sybrant Technologies Company Presentationmanimsquare
 
DECIDE H2020 Brochure
DECIDE H2020 BrochureDECIDE H2020 Brochure
DECIDE H2020 BrochureDECIDEH2020
 
Resume_Achhar_Kalia
Resume_Achhar_KaliaResume_Achhar_Kalia
Resume_Achhar_KaliaAchhar Kalia
 
Oracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao DatacenterOracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao DatacenterGeneXus
 

Similar a High Scalability by Example – How can Web-Architecture scale like Facebook, Twitter, YouTube, etc. (20)

Things you should know about Scalability!
Things you should know about Scalability!Things you should know about Scalability!
Things you should know about Scalability!
 
Jobs in the Cloud
 Jobs in the Cloud Jobs in the Cloud
Jobs in the Cloud
 
Portfolio Sameer
Portfolio SameerPortfolio Sameer
Portfolio Sameer
 
Agcv
AgcvAgcv
Agcv
 
Career Portfolio Sameer Ahmed
Career Portfolio Sameer AhmedCareer Portfolio Sameer Ahmed
Career Portfolio Sameer Ahmed
 
Destination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud PresentationDestination Marketing Open Source and Cloud Presentation
Destination Marketing Open Source and Cloud Presentation
 
Telecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin SystemsTelecom Clouds crossing borders, Chet Golding, Zefflin Systems
Telecom Clouds crossing borders, Chet Golding, Zefflin Systems
 
OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013OCSL - Migrating to a Virtualised Modern Desktop June 2013
OCSL - Migrating to a Virtualised Modern Desktop June 2013
 
CWIN17 london becoming cloud native part 2 - guy martin docker
CWIN17 london   becoming cloud native part 2 - guy martin dockerCWIN17 london   becoming cloud native part 2 - guy martin docker
CWIN17 london becoming cloud native part 2 - guy martin docker
 
Akshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infraAkshay guleria cloud_dev_ops_infra
Akshay guleria cloud_dev_ops_infra
 
CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016CIS Infrastructure Group Don Mori Profile 2016
CIS Infrastructure Group Don Mori Profile 2016
 
AWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWSAWS Webcast - Datacenter Migration to AWS
AWS Webcast - Datacenter Migration to AWS
 
Startups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demoStartups: Streit, Scaleup - introduction and product demo
Startups: Streit, Scaleup - introduction and product demo
 
I Tech Art
I Tech ArtI Tech Art
I Tech Art
 
Sybrant Technologies Company Presentation
Sybrant Technologies Company PresentationSybrant Technologies Company Presentation
Sybrant Technologies Company Presentation
 
DECIDE H2020 Brochure
DECIDE H2020 BrochureDECIDE H2020 Brochure
DECIDE H2020 Brochure
 
Chiranjib resume
Chiranjib resumeChiranjib resume
Chiranjib resume
 
Resume_Achhar_Kalia
Resume_Achhar_KaliaResume_Achhar_Kalia
Resume_Achhar_Kalia
 
Oracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao DatacenterOracle - Soluções do device ao Datacenter
Oracle - Soluções do device ao Datacenter
 
Views from the Unified Communications Summit
Views from the Unified Communications SummitViews from the Unified Communications Summit
Views from the Unified Communications Summit
 

Último

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Último (20)

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

High Scalability by Example – How can Web-Architecture scale like Facebook, Twitter, YouTube, etc.

  • 1. High Scalability by Example – How can Web Architecture scale like Facebook, Twitter? JAX 2011, 03.05.2011 Robert Mederer, Markus Bukowski Copyright © 2011 Accenture All Rights Reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture. Speaker Robert Mederer Markus Bukowski Lead Technology Architect Senior Software Engineer Anni-Albers-Straße 11 Kaistraße 20 40221 Düsseldorf 80807 München Mobil: +49-175-57-64217 Mobil: +49-175-57-68012 markus.bukowski@accenture.com robert.mederer@accenture.com Experience 2000 - 2005: Technology Architect and Software 2006 - 2009: Software Engineer in several projects Engineer in several projects (during studies) for mainly telecommunication companies 2006: Technical Architecture Lead, Integration and Execution Architecture for Location-Based Service 2009 - 2011: Senior Software Engineer in several Provider projects 2009: Technical Architecture Lead, Frontend and 2009/2010: Coach and Software Engineer for a Execution Architecture Government Agency Health Insurance Fund 2009/2010: Technical Architecture and front-office 2011: Technical Architecture Lead during the integration build lead, Integration and Execution development of a Document Management Solution Architecture Financial Services Agency for a Government Agency 2011: Architect and Performance Engineer for Location Based Services Platform Copyright © 2010 Accenture. All rights reserved. 2
  • 2. Accenture High performance achieved Company Profile Worldwide Revenues $21.6 billion •  Global management consulting, (in US$ billion, as of August 31, 2010) technology services and outsourcing Communications & company Resources High Tech •  215.000 employees •  Rank 47 among the 3.88 4.83 “Best Global Brands 2008” •  Top 100 Employer 4.32 •  28 of the DAX-30-Companies 2.98 •  96 of the Fortune-Global-100 Public •  More than three-quarters of the Fortune- Financial 5.53 Service Services Global-500 •  87 of our Top 100-clients have been with Products us for 10 or more years Copyright © 2010 Accenture. All rights reserved. 3 Local Accenture … ??? Geographic unit •  Austria •  Switzerland •  Germany Employees Berlin •  Ca. 6000 Düsseldorf •  We are hiring! Exciting Technology work Frankfurt •  Large scale projects (100+ people / multiple years) Munich •  Most challenging requirements –  Stock Exchange / Banking / Trading Systems Vienna –  AEMS Mobility Platform –  Large Scale Web Applications Zurich (> 1M page views / day) –  Batch Architectures Copyright © 2010 Accenture. All rights reserved. 4
  • 3. Agenda •  Introduction – Scalability & CAP Theorem – Traditional websites & Social media websites – High Scalability in Numbers •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 5 Scalability A system’s capacity to uphold the same performance under heavier volumes. (Patterns for Performance and Operability: Building and Testing Enterprise Software, Chris Ford et. al., 2008) Copyright © 2011 Accenture. All rights reserved. 6
  • 4. Vertical Scalability •  Is achieved by increasing the capacity of a single node – CPU, – Memory, – Bandwidth, … •  Simple Process – Application is generally not affected by those changes •  Classical Example are Super Computers like – HP Integrity Superdome – IBM Mainframe Source: Hewlett-Packard Copyright © 2011 Accenture. All rights reserved. 7 Horizontal Scalability •  Application is spread on a cluster with several nodes •  Nodes can be added to scale out •  Produces overhead – Keep cluster consistent – Node error detection and handling – Communication between nodes •  May be used to increase reliability and availability •  Distributed Systems and Programs like – SETI@Home Source: Space Sciences Laboratory, U.C. Berkeley – World Wide Web – Domain Name Service Copyright © 2011 Accenture. All rights reserved. 8
  • 5. CAP Theorem (Brewer‘s Theorem) •  Consistency – all clients see the same data at the same time Consistency •  Availability – all clients can find all data even in presence of failure •  Partition Tolerance – system works even when one node failed Partition Availability Tolerance Impossible Source: PODC-keynote, Towards Robust Distributed Systems, Dr. Eric A. Brewer, 2000 Copyright © 2011 Accenture. All rights reserved. 9 CAP Theorem Normally, two of these properties for any shared-data system Consistency + Availability C •  High data integrity P A •  Single site, cluster database, LDAP, etc. •  2-phase commit, data replication, etc. C Consistency + Partition •  Distributed database, distributed locking, etc. P A •  Pessimistic locking, etc. Availability + Partition C •  High scalability P A •  Distributed cache, DNS, etc. •  Optimistic locking, expiration/leases (timeout), etc. Source: “Architecting Cloudy Applications”, David Chou Copyright © 2011 Accenture. All rights reserved. 10
  • 6. Agenda •  Introduction – Scalability & CAP Theorem – Traditional websites & Social media websites – High Scalability in Numbers •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 11 Traditional Websites (Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011) Copyright © 2011 Accenture. All rights reserved. 12
  • 7. Social Media Websites (Scaling the Social Graph: Infrastructure at Facebook, Jason Sobel, QCon 2011) Copyright © 2011 Accenture. All rights reserved. 13 Agenda •  Introduction – Scalability & CAP Theorem – Traditional websites & Social media websites – High Scalability in Numbers •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 14
  • 8. High Scalability in Numbers The World Is Obsessed With Facebook (Video) Source: http://www.youtube.com/watch?v=xJXOavGwAW8 Copyright © 2011 Accenture. All rights reserved. 15 High Scalability in Numbers facebook twitter Active users 500.000.000 New accounts per day 500.000 Active users logging in 250.000.000 Tweets per day 140.000.000 daily (a year ago) (50.000.000) Active mobile users 250.000.000 Max tweets per second 6.939 In 20 minutes … Links shared 1.000.000 Status updates 1.851.000 Photos uploaded 2.716.000 Source: http://www.facebook.com/press/info.php?statistics, http://www.allfacebook.com/ Comments made 10.208.000 16 Copyright © 2011 Accenture. All rights reserved.
  • 9. High Scalability in Numbers LinkedIn Members >100.000.000 Connections between Members 1.300.000.000 Visitors per month 128.000.000 % of global internet users who visit LinkedIn daily 4% Source: http://blog.linkedin.com, http://press.linkedin.com/about/, LinkedIn Demographics 20111 Copyright © 2011 Accenture. All rights reserved. 17 Agenda •  Introduction •  Accenture – High Scalability by Example – The Royal Wedding Website – US Based Professional Sport League .com Website •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 18
  • 10. Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale – Facebook – Twitter – LinkedIn – Challenges •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 19 Facebook – Key Facts •  Biggest Social Network •  People are linked together •  People share and exchange information in real-time •  Facebook provides a – A personalized News Feed containing news about friends – Picture Service including people tagging / linking – Messaging Service to stay in connect with friends – Platform to play Games with friends Copyright © 2011 Accenture. All rights reserved. 20
  • 11. Facebook – Architecture (1) •  Facebook „main“ architecture is based Memcache on a LAMP stack •  The application logic resides in Web Server layer (PHP based) •  The application layer takes care of the data distribution •  Some services do not fit (C++, Java) – Search MySQL – Ads – People You May Know Invalidation logic is implemented in – Multifeed the application! Copyright © 2011 Accenture. All rights reserved. 21 Facebook – Architecture (2) Multifeed / Newsfeed •  Distributed System •  Every user gets – 45 best rated updates –  on every reload •  Every DB-Update results in a Scribe Notification –  Leaf Nodes / Server are notified Source: “High Performance at Massive Scale – Lessons learned at Facebook”, Jeff Rothschild, 2009 Inspired By: “Architecting Cloudy Applications”, David Chou, 2010 Copyright © 2011 Accenture. All rights reserved. 22
  • 12. Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale – Facebook – Twitter – LinkedIn – Challenges •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 23 Twitter – Key Facts •  Twitter is a real-time information network •  Twitter‘s key characteristics – Tweets are used for sharing information – Timeline of tweets must be kept – A Social Graph is maintained to deliver information Copyright © 2011 Accenture. All rights reserved. 24
  • 13. Twitter - Architecture Load Balancers Web Frontend Apache mod_proxy Apps & Distributed Rails (Unicorn) Cache Services Flock memcached Kestrel Queues Partitioned Data Distributed MySQL Cassandra Storage Daemons Source: “Architecting Cloudy Applications”, David Chou Copyright © 2011 Accenture. All rights reserved. 25 Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale – Facebook – Twitter – LinkedIn – Challenges •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 26
  • 14. LinkedIn – Key Facts •  A social media networking platform to find connections to recommended job candidates, industry experts and business partners •  99% pure Java LinkedIn – Applications •  Groups •  Job market •  Mailing •  Profiles (Friends and Companies) •  Mobile Applications (Android, iPhone, Blackberry, Palm) Copyright © 2011 Accenture. All rights reserved. 27 LinkedIn - Architecture Web Frontend LinkedIn Spring Search Grails MVC Queuing Spring Caching / Messaging DWR Spring MVC Grails DWR Spring Framework (IoC, Web Lucene Searchprocessing AOP, Model-View-Controller DI, for Asynchronous Engine •  Direct Web Remoting (AJAX for Voldemort Open-source web application Frameworktriggered writes •  Java Library) Concerns,..) by Separation of open "coding Aims to bring the source in •  Apache combines high- •  Voldemort user Apps & Services •  RPC library toparadigm to the text Used performance, full-featured LinkedIn extensions with Groovy memory caching for convention" call Java functions search engine library written storage system so that a Service Service LinkedIn Spring Service Service from JavaScript and (lin:bean, •  XML Vocabulary to call StorageentirelyLoad of Page not Initial •  Why Grails? incaching tier is Valdemort Hadoop separate Java Storagerequired a more from Javaweb Sensei JavaScript functions productive •  lin:component, …) Needed software Hadoop isOperations framework Pull a •  ZoieProperty Editors a major Valdemort used as •  Forframework •  Used supports data-intensive distributed database distributed key-value data cache Lucene Zoie Bobo Caching Voldemort ehcache Memcache •  that realtime indexing/search ehcache •  … •  New kind of application context Search Index •  Asynchronous Communication for• the social graph to (map/ horizontallyprototyping Rapid •  distributed applications source, system scaledopen •  for life-cycle management with Ehcache is an •  Push Operations with and demonstrate feasibility •  reduce /Consistency cachehigh standards-based relaxed distributed file system) used Queuing / Messaging Bobo productivity gain Advantage: components durabilityis usedwith engine offload •  … to boost performance, guarantees majorjava/ •  Hadoop database and simplify the Integration as a faceted search existing for ActiveMQ the concurrent read Fast Socialkey-value data Graph distributed index building store •  spring-based solutions scalability Fast •  handles semi-structured and •  for the social graph Memcache structured data Storage •  Bottleneck: •  Distributed Cache MySQL Sensei Hadoop Voldemort •  Read data •  Index building • Keep core data ACID • Keep replicated and cached data BASE • Cache data on a cheap memory (memcached) • Use a hint to route the client to his / her’s ACID data Source: Project Voldemort, Jay Kreps Source: http://sna-projects.com Copyright © 2011 Accenture. All rights reserved. 28
  • 15. Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion Copyright © 2011 Accenture. All rights reserved. 29 Common concepts of Scalability parallelization asynchronous idempotent 7 Habits of operations Good partitioned data Distributed fault-tolerance Systems optimistic concurrency shared nothing Source: “Architecting Cloudy Applications”, David Chou Source: highscalability.com Copyright © 2011 Accenture. All rights reserved. 30
  • 16. Common concepts of Scalability Hybrid architectures • Scale-out (horizontal) • Scale-up (vertical) – BASE – ACID – focus on “commit” – availability first – optimistic locking – pessimistic locking – shared nothing – transactional – maximize scalability – favor accuracy/consistency Most distributed systems realize both Copyright © 2011 Accenture. All rights reserved. 31 Agenda •  Introduction •  Accenture – High Scalability by Example •  Architecture at Internet Scale •  Common concepts of Scalability •  Conclusion –  Questionnaire –  Déjà vu - Common concepts –  Don’t forget the Infrastructure Copyright © 2011 Accenture. All rights reserved. 32
  • 17. Conclusion Questionnaire •  Is there a need to scale my application? –  Vertical scaling is more easy to achieve –  Use horizontal scaling only when required •  Is there a plan to proof your designed solution? –  Plan to do a lot of realistic Proof-of-Concepts •  Is there a one size fits all solution? –  NO! •  How important is ACID? – Is BASE enough? – Can a NoSQL solution be used? Copyright © 2011 Accenture. All rights reserved. 33 Conclusion Déjà vu - Common concepts •  Asynchronous Processing – Keep Facebook, Twitter and LinkedIn in mind –  Asynchronous writes and even UI •  Data Partioining –  You must understand your data –  Is a hybrid solution as used by LinkedIn applicable? •  Design To Failure –  Lessons learned from Amazon‘s cloud solution outage Copyright © 2011 Accenture. All rights reserved. 34
  • 18. Conclusion Don’t forget the Infrastructure •  Disk IO compared to RAM is slow   Avoid „slow“ IO access whenever possible   Caching, Caching, Caching   Bulk Updates (Batch) over small updates •  Network Latency   Keep in mind that there is a network   Where are the users located?   How are your users connected? mobile application?   Is there a need for a datacenter distribution? •  Hardware configuration –  Is the CPU Power Saving working right? Copyright © 2011 Accenture. All rights reserved. 35 Questions ? Copyright © 2011 Accenture. All rights reserved. 36