1. The document discusses strategies for scaling web applications, including scaling the client, web/application, and database tiers.
2. It covers techniques like load balancing, domain sharding, caching, and database partitioning to distribute load across servers.
3. Scaling the database tier involves strategies such as replication, indexing, and moving to NoSQL databases which sacrifice some consistency for improved scalability.
4. Scalability: A system is scalable when it can accommodate heavier loads and larger data sets by adding hardware power. Scalability implies performance, but not the other way around.
5. Vertical Scaling vs. Horizontal Scaling http://www.ourofficebox.co.uk/Images/scalability2%20750.png http://www.gigaspaces.com/files/pics/online_gaming_scalability_diagram_website.jpg
6. Don't Underestimate Vertical Scaling. StackOverflow.com: 1M page views per day; 500K questions and millions of posts; 817th largest site. Hardware: 2 web servers (1 Xeon 4-core processor, 8GB RAM); 1 database server (2 Xeon 4-core processors, 48GB RAM). Software: IIS 7.5 on Windows Server 2008 R2; HAProxy (inside Linux VM); SQL Server 2008 Enterprise; ASP.NET MVC and .NET 3.5. Source: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?casestudyid=4000006676 Microsoft Confidential
8. Client Resource Management. Minify and Gzip JavaScript and CSS (MS Ajax Minifier, YUI Compressor, Google Closure Compiler). Combine JavaScript and CSS (MS Script Loader). Put CSS at the top and JavaScript at the bottom. Add Expires headers for all resources.
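A minimal sketch of two of the steps on this slide, Gzip compression and a far-future Expires header, written in Python with only the standard library. The function names, the one-year lifetime, and the sample script are assumptions made for illustration; the tools named on the slide do this work inside the .NET stack.

```python
import gzip
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def compress_asset(body: bytes) -> bytes:
    """Gzip a text asset (JavaScript or CSS) before it is sent."""
    return gzip.compress(body)


def expires_headers(now: datetime, days: int = 365) -> dict:
    """Build response headers with a far-future Expires date.

    `now` must be a timezone-aware UTC datetime; the 365-day default
    is an assumed lifetime, not a value taken from the slides.
    """
    expires = now + timedelta(days=days)
    return {
        "Content-Encoding": "gzip",
        # HTTP dates use the RFC 1123 format, e.g. "... 00:00:00 GMT".
        "Expires": format_datetime(expires, usegmt=True),
    }


# Repetitive text such as script source compresses well.
script = b"function add(a, b) { return a + b; }\n" * 200
packed = compress_asset(script)
headers = expires_headers(datetime(2024, 1, 1, tzinfo=timezone.utc))
```

A far-future Expires date only works if a changed file gets a new URL (for example a version number in the file name), otherwise browsers keep serving the stale copy.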
9. Combres. Currently 2.0, Apache license: http://combres.codeplex.com/ Key features: combine, compress, minify, and cache (server and client) JavaScript and CSS resources; automatic change detection; extensible architecture. User guide: http://www.codeproject.com/KB/aspnet/combres2.aspx
10. JavaScript & Ajax Optimization. Use reverse Ajax (Comet) instead of polling; use a dedicated Comet server, e.g. StreamHub, Meteor (see http://cometdaily.com/maturity.html). Yield to the timer by chunking processing. Avoid outer-scope lookups. Split the initial payload (i.e. before onload). Employ non-blocking JS loading techniques: MS Ajax Library Script Loader, YUI Script Loader.
12. Load Balancing: the act of properly distributing workload across machines/resources in a cluster. Approaches: DNS "A" records; poor man's hand-coded redirection; software, e.g. HAProxy, NLB, LVS; hardware, e.g. F5. Considerations: session state, view state.
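The session-state consideration can be illustrated with a small sketch. It assumes a hypothetical `RoundRobinBalancer` class and made-up server names; it shows plain round-robin distribution next to a sticky variant that always sends the same session to the same server. It is not how HAProxy, NLB, or LVS are implemented.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Toy balancer: rotates through servers, with optional stickiness."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def pick(self) -> str:
        """Return the next server in strict rotation."""
        return next(self._rotation)

    def pick_sticky(self, session_id: str) -> str:
        """Map a session ID to a fixed server.

        Useful when session state lives in server memory, so that
        follow-up requests land on the machine that holds it.
        """
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]


balancer = RoundRobinBalancer(["web1", "web2"])  # hypothetical hosts
rotation = [balancer.pick() for _ in range(4)]
```

Stickiness by modulo hashing reshuffles sessions whenever the server list changes, which is one reason to keep session state out of process, for example in a shared cache.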
13. Domain Sharding: partition resources across different hosts. By type, e.g. static/dynamic, JS/CSS etc.; by functionality, e.g. forum module. Benefits: balanced load, parallel downloads, no redundant cookies, isolated optimization.
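Sharding by resource type could be sketched as a URL helper that picks a host from the file extension. The hostnames and the `resource_url` function are assumptions for illustration only.

```python
from posixpath import splitext

# Hypothetical hosts, one per resource type.
HOST_BY_TYPE = {
    ".js": "js.example.com",
    ".css": "css.example.com",
}
DEFAULT_HOST = "www.example.com"  # dynamic pages and everything else


def resource_url(path: str) -> str:
    """Return an absolute URL on the host that serves this resource type."""
    _, extension = splitext(path)
    host = HOST_BY_TYPE.get(extension.lower(), DEFAULT_HOST)
    return f"http://{host}{path}"
```

Because the mapping is deterministic, a given resource always resolves to the same host, so browser caches stay effective. Serving static files from cookie-free hosts is what avoids the redundant cookies listed under benefits.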
14. Content Delivery Network (CDN). Services: Microsoft, Google, Akamai etc. Static resources only. Benefits: the same as domain sharding, plus redundancy and availability, smart DNS routing, and a high chance of a browser cache hit.
15. Distributed Cache: enables a persistent cache for a server cluster. Fast (in-memory). Distributed, and transparent to clients. Solutions: AppFabric Cache (Velocity), which supports POCO, XML, and binary data, integrates with ASP.NET, and has an extensible cache provider. Others: memcached.
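A common way to use such a cache is the cache-aside pattern: look in the cache first and fall back to the database only on a miss. The sketch below assumes a hypothetical `FakeCacheCluster` class that stands in, in process, for a real cache client; it does not reproduce the AppFabric or memcached APIs.

```python
import time


class FakeCacheCluster:
    """In-process stand-in for a distributed cache client (hypothetical)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def get_or_load(cache, key, loader, ttl_seconds=60):
    """Cache-aside read: return the cached value, loading it on a miss."""
    value = cache.get(key)
    if value is None:
        value = loader()  # e.g. a database query
        cache.set(key, value, ttl_seconds)
    return value
```

Every web server in the cluster would call `get_or_load` against the same cache, so a value loaded by one server is a hit for the others.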
16. Concurrency: it's a waste of cores if you don't have enough threads. Solutions: Task Parallel Library (TPL), PLINQ, F#.
17. Others. Compensation over distributed transactions. Asynchronous over synchronous services. Related: asynchronous controllers in MVC 2.0.
18. Side Notes on Entity Framework. A singleton ObjectContext doesn't fit web apps: it is not thread-safe and it consumes RAM, although it might fit a rich client. One ObjectContext per request is the common strategy for web apps. Use compiled queries to reuse the generated EF command tree.
20. Indexes. Clustered vs. non-clustered indexes. Guidelines: index columns used in WHERE and SELECT; index columns with highly unique values; put the clustered index on primary keys; avoid over-indexing.
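The effect of indexing a WHERE column can be observed in a query plan. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for SQL Server; the table, column, and index names are made up, and SQLite's plan output differs from SQL Server's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO customers (email, country) VALUES (?, ?)",
    [(f"user{i}@example.com", "US" if i % 2 else "DE") for i in range(1000)],
)


def plan(sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail


query = "SELECT id FROM customers WHERE email = ?"
before = plan(query, ("user42@example.com",))  # full table scan

# email is highly unique and appears in WHERE, so it is a good candidate.
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
after = plan(query, ("user42@example.com",))  # index search
```

Printing `before` and `after` shows the plan switching from a scan of the table to a search that uses `idx_customers_email`. Each extra index also has to be maintained on every write, which is the cost behind the advice to avoid over-indexing.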
21. Replication. Publisher-subscribers configuration, i.e. master-slave in MySQL. Writes to subscribers won't propagate back. Common options: transactional (near real-time) and snapshot (at time intervals). Suitable for read-intensive applications.
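Because writes to subscribers do not propagate back, the application has to send every write to the publisher and may spread reads over the subscribers. A minimal routing sketch follows, with a hypothetical `ReplicatedRouter` class and made-up server names; classifying statements by their first keyword is a simplification.

```python
from itertools import cycle


class ReplicatedRouter:
    """Send writes to the publisher and rotate reads across subscribers."""

    def __init__(self, publisher, subscribers):
        self.publisher = publisher
        self._subscribers = cycle(list(subscribers))

    def route(self, sql: str) -> str:
        """Return the name of the server that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._subscribers)
        # INSERT, UPDATE, DELETE, DDL: only the publisher accepts changes.
        return self.publisher


router = ReplicatedRouter("db-publisher", ["db-sub1", "db-sub2"])
```

With transactional replication the subscribers lag slightly behind the publisher, so a read that must see a write made a moment ago may still need to go to the publisher.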
23. Partitioning. Vertical Partitioning (Clustering): distribute tables into multiple DBs, each representing a cluster of related tables, e.g. customer DB, product DB, forum DB etc.; the application layer aggregates data. Horizontal Partitioning (Sharding): distribute table rows into logical groups, e.g. US customers, European customers; the application layer picks shards and aggregates data.
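For horizontal partitioning, the two jobs of the application layer, picking a shard and aggregating across shards, might look like the sketch below. Plain Python lists stand in for the shard databases, and the `ShardedCustomers` class and the country-to-shard mapping are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from a customer's country to a logical shard.
SHARD_BY_COUNTRY = {"US": "us", "DE": "europe", "FR": "europe"}


class ShardedCustomers:
    """Customer rows split across shards by region."""

    def __init__(self):
        self.shards = defaultdict(list)  # shard name -> rows

    def pick_shard(self, country: str) -> str:
        return SHARD_BY_COUNTRY[country]

    def insert(self, customer: dict) -> None:
        self.shards[self.pick_shard(customer["country"])].append(customer)

    def find_by_country(self, country: str) -> list:
        """Single-shard query: only one shard needs to be asked."""
        rows = self.shards[self.pick_shard(country)]
        return [row for row in rows if row["country"] == country]

    def count_all(self) -> int:
        """Cross-shard query: the application aggregates the results."""
        return sum(len(rows) for rows in self.shards.values())
```

Queries that include the shard key stay cheap, while anything that spans shards (counts, joins, sorting) has to be assembled in application code.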
24. NoSQL. Partitioning makes relational databases not so relational any more and adds complexity in DB design and the application layer. NoSQL, "not-only-relational", is about DBs built with scalability in mind; they sacrifice integrity and ACID to a certain extent. Apache Cassandra: auto load balancing, identical nodes, elastic capacity, flexible schema. Others: MongoDB, Voldemort, Tokyo Cabinet.
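Identical nodes and elastic capacity are commonly achieved with consistent hashing, where each node owns ranges of a hash ring and adding a node moves only a fraction of the keys. The sketch below illustrates that general technique with a hypothetical `HashRing` class; it is not Cassandra's actual partitioner, and the virtual-node count is an arbitrary choice.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Hash a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes
        self._tokens = []  # sorted ring positions
        self._owners = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place a node on the ring at several pseudo-random positions."""
        for i in range(self._vnodes):
            token = _token(f"{node}#{i}")
            self._owners[token] = node
            bisect.insort(self._tokens, token)

    def node_for(self, key: str) -> str:
        """Return the node owning the first position after the key's hash."""
        index = bisect.bisect_right(self._tokens, _token(key))
        return self._owners[self._tokens[index % len(self._tokens)]]
```

When a node is added, every key either stays where it was or moves to the new node, which is what keeps growing the cluster cheap compared with rehashing everything.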