How to scale your platform when you are not Facebook
2013 Carlos Herrera
caherrerapa@gmail.com
Who am I?
CTO at Binumi; former Head of Product and IT at Lazada TH (Rocket Internet)
MBA and B. Computer Science
7 countries
10 years in tech; contributed to some OSS
Product management, software development, ethical hacking, corporate governance
Coca-Cola FEMSA, Infosys, among others
Why shouldn’t I do exactly as Facebook does?
Be humble
“Normal”-scale platforms are different.
– A paying customer vs. a “Like” on a cat photo
– Sensitivity*
– Structured vs. unstructured information
– Concurrency
– Reach across regions
– GBs vs. several TBs or PBs
You scale a platform, not a language*
Language selection drivers
– Problem
– Maturity
– Community
– Talent pool
Nice to experiment, but focus on the problem (yes, I’m talking about you, MongoDB guys)
Different solutions / Creativity
No silver bullet
Scaling in 7 steps
How does it start?
Everything on the same server
Reading the DB all the time
JS, CSS, images served from your server
1. Define, measure, benchmark
a. Bare metal vs. cloud (Amazon)
Physical vs. virtualized
Powerful vs. flexible
Normal skills vs. hard-core skills
Support vs. on your own
Outages?
Avoid anything with cPanel, for God’s sake
International gateway
b. Measure
Munin
Icinga
StatsD
New Relic $
Pingdom $
c. Benchmark
Siege
JMeter
Sysbench
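A minimal sketch of what "measure" can produce before any tuning starts: throughput in requests per minute (RPM) and latency percentiles for one measurement window. The function names and the sample numbers are hypothetical; in practice the latencies would come from a tool such as Siege or JMeter, or from your access logs.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms, window_seconds):
    """Summarize one measurement window: throughput plus latency percentiles."""
    return {
        "rpm": len(latencies_ms) * 60 / window_seconds,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "max_ms": max(latencies_ms),
    }


# Hypothetical sample: 10 requests observed over a 30-second window.
baseline = summarize(
    [120, 80, 95, 110, 300, 90, 85, 100, 105, 2000],
    window_seconds=30,
)
```

Keeping a baseline like this makes it possible to tell whether each later step actually helped.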
2. Cache system
Cache?
Set content in memory/drive
Faster than DB
Key = Value
TTL
Memcached (> 1 server), APC (1 server)
Other common use: sessions
Extremely important in cloud
Living without a cache
Constant hits to DB
DB easily the bottleneck
Wasting $$ you paid for memory
Introducing cache
a. Customer goes to your website’s homepage.
b. The requested page needs to load a list of products.
c. Is the list of products in the cache under the key “XXXXX”?
   a. Yes. Retrieve it from the cache using key “XXXXX”, use it and return the page.
   b. No. Go to the database, perform the query, save the result in the cache for later, use it and return the page.
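The lookup flow above can be sketched as follows. This is an illustration only: `TTLCache` is a tiny in-process stand-in for Memcached or APC, and `query_products_from_db` is a hypothetical placeholder for the real product query.

```python
import time


class TTLCache:
    """Tiny in-process key/value cache with per-entry expiry (stand-in for Memcached/APC)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry outlived its TTL: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


db_queries = 0  # counts how often the "database" is actually hit


def query_products_from_db():
    """Hypothetical stand-in for the real product query."""
    global db_queries
    db_queries += 1
    return ["product-1", "product-2"]


def get_products(cache, key="XXXXX", ttl_seconds=60):
    products = cache.get(key)
    if products is None:
        # Cache miss: query the database and save the result for later requests.
        products = query_products_from_db()
        cache.set(key, products, ttl_seconds)
    return products


cache = TTLCache()
first = get_products(cache)   # miss, goes to the database
second = get_products(cache)  # hit, served from memory
```

The TTL bounds how stale the product list can get; a shorter TTL means fresher data but more database hits.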
3. Content Delivery Network (CDN)
Static files (CSS, JS, HTML, JPG)
Amazon, Rackspace, Cloudflare, Akamai
Lower page load time
Lower server load (important in cloud)
Inexpensive
More automation
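One way the application side of a CDN setup might look, as a sketch under assumptions: templates call a helper that points static asset URLs at the CDN host instead of the web server. The hostname `cdn.example.com` and the function `cdn_url` are hypothetical, and the content hash in the file name is a common cache-busting convention, not something the slides prescribe.

```python
import hashlib
import posixpath

CDN_HOST = "https://cdn.example.com"  # hypothetical CDN hostname


def cdn_url(path, content):
    """Return a CDN URL for a static asset, fingerprinted with a hash of its content."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, ext = posixpath.splitext(path.lstrip("/"))
    return f"{CDN_HOST}/{stem}.{digest}{ext}"


old_css = cdn_url("/css/site.css", b"body { color: black; }")
new_css = cdn_url("/css/site.css", b"body { color: navy; }")
```

Because the URL changes whenever the file content changes, the CDN and browsers can cache each version for a long time without serving stale files.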
4. Decouple and revisit your web node
Separation of concerns
Web nodes apart from DB
More visibility
Easier to scale
Horizontal
Vertical
Evaluate Nginx, Puma (RoR), PHP-FPM
5. Adding web nodes / load balancing
Load Balancer
Sends traffic to internal web nodes
Easier to pay. ELB*, Rackspace
No budget? Nginx, HAProxy
Location depends on whether you pay or build it*
www.yourapp.com
web01.yourapp.com
web02.yourapp.com
Beware of session management
If you don’t use sticky sessions, be careful: you can end up not remembering your customers, as the load balancer sends traffic to the least busy node.
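The session problem can be reproduced in a few lines. This sketch assumes a simple round-robin balancer rather than a least-busy one, and plain dictionaries in place of real session storage; all class names are hypothetical. It contrasts per-node sessions with a store shared by all web nodes (for example Memcached, which the cache step already lists as a common home for sessions).

```python
import itertools


class WebNode:
    """A web node that keeps sessions in whatever store it is given."""

    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        return self.sessions.get(session_id)


class RoundRobinBalancer:
    """Hands out web nodes in turn (a simplification of real balancing policies)."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def route(self):
        return next(self._cycle)


# Case 1: every node keeps its own in-memory sessions.
local = RoundRobinBalancer([WebNode("web01", {}), WebNode("web02", {})])
local.route().login("abc", "alice")      # handled by web01
forgotten = local.route().whoami("abc")  # handled by web02, which never saw the login

# Case 2: all nodes read and write one shared session store.
shared_store = {}
shared = RoundRobinBalancer(
    [WebNode("web01", shared_store), WebNode("web02", shared_store)]
)
shared.route().login("abc", "alice")
remembered = shared.route().whoami("abc")
```

Sticky sessions avoid the first case by pinning a customer to one node; a shared store avoids it without pinning, so any node can be removed or added freely.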
6. Scaling the database: vertical
Bigger server
RDS (downtime)
Easier
But there is a limit
7. Scaling the database: horizontal
Master / Slave
More effort for the sysadmin
Master
Important reads
Always writes
Slave
Only reads
Consider replication delay
Enough for 1000 RPM per node*
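A minimal sketch of the routing rule described above: writes and "important" reads go to the master, other reads are spread over the slaves. The class and its `important` flag are hypothetical, connections are represented by plain strings, and treating every statement that starts with SELECT as a read is a simplifying assumption.

```python
import itertools


class ReplicatedDatabase:
    """Chooses a connection for each statement in a master/slave setup."""

    def __init__(self, master, slaves):
        self.master = master
        # Rotate over the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def connection_for(self, sql, important=False):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not important and self._slaves is not None:
            return next(self._slaves)
        # Writes always go to the master, and so do reads that cannot
        # tolerate replication delay.
        return self.master


db = ReplicatedDatabase("master", ["slave1", "slave2"])
listing = db.connection_for("SELECT * FROM products")
search = db.connection_for("SELECT * FROM products WHERE name LIKE 'a%'")
order = db.connection_for("INSERT INTO orders (product_id) VALUES (1)")
balance = db.connection_for("SELECT total FROM orders WHERE id = 1", important=True)
```

Marking a read as important is the application's way of saying that a slightly outdated answer from a slave would be unacceptable, for example right after the customer has paid.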
Need more?
Separate memcache servers
Sharding
RAIDs
If you are doing heavy search, add Apache Solr
Revisit your problem
Do you need Hadoop?
Can you use NoSQL (Cassandra, Mongo)?
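Sharding, in its simplest form, means picking a database by hashing a key. The sketch below assumes a fixed number of shards and a hypothetical `shard_for` helper; it is not the only scheme, and it does not cover moving data when the shard count changes.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key (for example a customer id) to a shard number in [0, shard_count)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical usage: 4 database shards, customers spread by id.
assignments = {customer_id: shard_for(customer_id, 4) for customer_id in range(1000)}
```

The same key always lands on the same shard, but changing `shard_count` remaps most keys, which is one reason to treat sharding as a later step rather than a first one.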
Something is not working?
Revisit your database design
Indexes anyone?
Revisit your app design
Pessimistic locks?
Lack of good algorithms?
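"Indexes anyone?" can be checked rather than guessed. The sketch below uses SQLite from the Python standard library purely as a stand-in for your real database (an assumption; the table and index names are hypothetical) and asks the query planner how it would run the same query before and after adding an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO products (category, name) VALUES (?, ?)",
    [(f"cat-{i % 50}", f"product-{i}") for i in range(5000)],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


sql = "SELECT name FROM products WHERE category = ?"
plan_before = query_plan(sql, ("cat-7",))  # full scan of the table
conn.execute("CREATE INDEX idx_products_category ON products (category)")
plan_after = query_plan(sql, ("cat-7",))   # lookup through the new index
```

MySQL and PostgreSQL offer the same kind of check through their own `EXPLAIN` statements, with different output formats.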
TV and BTS at Lazada
Bottleneck => web nodes
We added one web and one DB node
RPM
Used historical performance data
Rackspace (no choice)
Reduce loaded elements on front page
Standby: monitor Apdex
We did well. No downtimes
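For reference, the Apdex score watched during the campaign is a simple ratio: requests at or under a target time T count as satisfied, requests up to 4T count as tolerating (half weight), and anything slower counts as frustrated. A minimal sketch, with a hypothetical threshold and sample:

```python
def apdex(response_times, threshold):
    """Apdex score in [0, 1] for response times and a target threshold T (same unit)."""
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical sample in seconds with T = 0.5 s:
# two satisfied, two tolerating, one frustrated.
score = apdex([0.2, 0.4, 0.6, 1.5, 3.0], threshold=0.5)
```

Tools such as New Relic report this score directly; computing it by hand is mainly useful for understanding what a drop in the number means.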
12/12/12 Campaign
20-25% of the traffic went to just one page
That page was on a small server we controlled in TH
2 hours handling our own end of the world
What could be done better?
Tech communication with business
Monitoring
VPS
Static files
Cache
Solution
Delay traffic / Business guys
EC2
CloudFront
Cache and fix code
New Relic
1M video clips
20K+ video clips per month
Heavy search
Streaming
Fast preview, faster than Animoto or WeVideo
Rendering
Some lessons learnt
7 DO’s
1. No fear
2. Iterate fast
3. Decouple and create interfaces
4. Track and monitor
5. Version and manage branches
6. Use the right tools
…“mmm, I will need a bigger drill”
7. Code conventions / process
7 DON’Ts
1. Don’t do “temporary fixes”. Bad code is like karma
2. “DRY”*
3. Don’t modify “live” files or databases directly
4. Don’t forget the testing environment
5. Don’t use self-managed servers
6. Don’t be the last to know
7. Don’t scale too soon, just be prepared
Where can I see how the big guys do it?
https://github.com/blog/530-how-we-made-github-fast
http://nerds.airbnb.com/
http://www.facebook.com/Engineering
http://engineering.twitter.com/
http://highscalability.com/blog/2012/4/16/instagram-architecture-update-whats-new-with-instagram.html
Interesting projects
Apache Mesos (Distributed apps).
Cassandra (Database)
Kestrel (Queue)
BERT (RPC)
Apache Hadoop
Apache Zookeeper
HipHopVM
Do you code in PHP?
I’m Hiring
caherrerapa@gmail
“Scaling is like replacing all components on a car while driving it at 100 mph.”
“Initial scaling won’t be glamorous”

Scalability meetup-05-2013-presentation

Editor's notes

  1. Hackathon