The whole is greater than the sum of the parts
Spotify services
Niklas Gustavsson
Monday, 27 May 2013
Distributed systems geek
Spotify since 2011
ngn@spotify.com
@protocol7
Me
Architectural overview
Lots of questions!
Last year
Spotify has more than a hundred backend services. They handle enormous amounts of data.
They should always be available. How are they built?
Today
In praise of small services
A small code base is simpler to understand and reason about
Doing one thing and one thing only means no compromises
In praise of small services
[Diagram with nodes labeled C, AP, and S]
“Rule of Modularity: Developers should build a program out of simple parts connected by well
defined interfaces, so problems are local, and parts of the program can be replaced in future
versions to support new features. This rule aims to save time on debugging complex code that
is complex, long, and unreadable.”
Eric S. Raymond, The Art of Unix Programming
“Decouple until it breaks, and then back off just a little”
Strive to make services autonomous
Watch your latency, though the added cost is commonly not significant
Decouple
[Diagram with nodes labeled C, AP, and S]
Use scaffolding to quickly get the basic service structure
Reuse in libraries
Don’t overuse patterns. Don’t use layers upon layers. Keep it simple
Simple codebases
We build services in Python and Java
Python is awesome for quick development and beautiful code
The JVM is stable, performant and transparent
Languages and runtimes
Performance at scale
Care about your performance. Set clear goals. Measure, measure, measure.
Have an architecture that allows for scale. Build out as needed. Measure, measure, measure.
Performance at scale
http://www.bbc.co.uk/programmes/b01qzdc1
Prefer stateless services when possible
Scales out linearly
Isolate mutating operations
Prefer stateless services
Fast, efficient, RESTful protocols
Connection pools are hard. Overloaded TCP servers are complicated
Use queues. Proper pushback. Naturally asynchronous.
Efficient protocols
Small payloads, fast marshaling
gzip
http://qconsf.com/dl/qcon-sanfran-2011/slides/SastryMalladi_DealingWithPerformanceChallengesOptimizedSerializationTechniques.pdf
Efficient payloads
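The effect of gzip on a payload can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the `tracks` payload is invented for illustration and is not taken from the talk.

```python
import gzip
import json

# Hypothetical payload: a repetitive JSON list, similar in shape to what a
# listing endpoint might return. The field names are made up for this sketch.
tracks = [
    {"id": i, "title": f"Track {i}", "artist": "Example Artist"}
    for i in range(200)
]
raw = json.dumps(tracks).encode("utf-8")

# What a server could send with "Content-Encoding: gzip".
compressed = gzip.compress(raw)

# What the client gets back after decompressing.
restored = gzip.decompress(compressed)

print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes")
```

How much gzip saves depends heavily on the payload; repetitive text formats such as JSON tend to shrink a lot, while compact binary formats usually gain less.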
ZeroMQ. Light-weight, fast as hell, queue based
Protobuf. Small, fast, schema-based, simple binary format
Request-reply and pub/sub
Hermes
Don’t be afraid to drop requests (and replies) when overloaded
Use shallow queues
Use short timeouts
Use small thread pools
Use small connection pools
Drop requests
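One way to read the advice above is as load shedding: a small worker pool behind a shallow queue that rejects new work as soon as the queue is full. The sketch below is an illustration under that assumption; `SheddingWorkerPool` and its parameters are hypothetical names, not part of any Spotify code.

```python
import queue
import threading


class SheddingWorkerPool:
    """A small thread pool behind a shallow queue that drops work when full."""

    def __init__(self, handler, workers=2, queue_depth=4):
        self._handler = handler
        # Shallow queue: a request either starts soon or is rejected at once.
        self._queue = queue.Queue(maxsize=queue_depth)
        self.dropped = 0
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, request):
        """Return True if the request was accepted, False if it was dropped."""
        try:
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            # Overloaded: fail fast so the caller can back off or go elsewhere.
            self.dropped += 1
            return False

    def _run(self):
        while True:
            request = self._queue.get()
            try:
                self._handler(request)
            finally:
                self._queue.task_done()

    def drain(self):
        """Block until every accepted request has been handled."""
        self._queue.join()
```

A caller that receives `False` from `submit` can answer immediately with an error instead of letting the request wait in a long queue until the client has already timed out.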
We use the best tool for each case from a small, carefully selected set of options
PostgreSQL as the default mutable storage
Cassandra for large scale (heavy writes) or multi-site services
Various read-only key-value stores
http://labs.spotify.com/2013/02/25/in-praise-of-boring-technology/
Scaling storage
Always fail, never fail
Stuff is always broken. Deal with it.
Always design for redundancy
Always keep an eye on your world
Don’t DDoS yourself
Always fail, never fail
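The slides do not say how to avoid DDoSing yourself. One common cause is clients that all retry a failing backend at the same moment, and a common remedy is exponential backoff with random jitter. The sketch below assumes that reading; `call_with_backoff` and its parameters are hypothetical.

```python
import random
import time


def call_with_backoff(operation, attempts=4, base_delay=0.05, max_delay=1.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on failure with randomized, growing delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                # Out of attempts: surface the last error to the caller.
                raise
            # The cap doubles with each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Sleep a random fraction of the cap so clients do not retry in lockstep.
            sleep(rng() * cap)
```

The `sleep` and `rng` arguments exist only so the delays can be replaced in tests; in normal use the defaults are enough.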
Build your system to run on multiple servers
Use service discovery everywhere. We use DNS SRV records.
Make deployment and configuration automated and repeatable
Make sure your service is actually running
Many commodity servers
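A DNS SRV record carries a priority, a weight, a port, and a target host. A client is expected to prefer the lowest priority value and to spread load across records of equal priority in proportion to their weights. The sketch below shows that selection step on records that have already been resolved; it is a simplification of the procedure in RFC 2782, and the `SrvRecord` type, `pick_target`, and the host names are invented for illustration.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int  # lower values are tried first
    weight: int    # relative share among records with the same priority
    port: int
    target: str


def pick_target(records, rng=random):
    """Pick one record: lowest priority first, then weighted random choice."""
    if not records:
        raise LookupError("no SRV records to choose from")
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical records, as a resolver might return them for a service name.
records = [
    SrvRecord(priority=10, weight=60, port=8080, target="a.example.net"),
    SrvRecord(priority=10, weight=40, port=8080, target="b.example.net"),
    SrvRecord(priority=20, weight=100, port=8080, target="backup.example.net"),
]
chosen = pick_target(records)
print(f"connect to {chosen.target}:{chosen.port}")
```

The actual DNS lookup is left out because the Python standard library has no SRV resolver; a third-party resolver would be needed to fetch the records.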
Instrument your code with metrics everywhere
We use our own library for Python and http://metrics.codahale.com for Java
Monitor your infrastructure. JVMs, OS, network, storage
Measure everything
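To make "instrument your code" concrete, here is a minimal in-process metrics registry with counters and timers. It is a sketch only: it is not Spotify's Python library and does not mirror the API of the Java metrics library linked above, and the `Metrics` class and metric names are hypothetical. It is also not thread-safe.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Tiny registry of counters and timing samples (not thread-safe)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.counters = defaultdict(int)
        self.timings = defaultdict(list)

    def incr(self, name, amount=1):
        self.counters[name] += amount

    @contextmanager
    def timer(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the timed block raised.
            self.timings[name].append(self._clock() - start)

    def snapshot(self):
        """Return a flat dict that a reporter could ship to a graphing system."""
        out = dict(self.counters)
        for name, samples in self.timings.items():
            out[f"{name}.count"] = len(samples)
            out[f"{name}.max_s"] = max(samples)
        return out


metrics = Metrics()
with metrics.timer("lookup"):
    metrics.incr("requests")
print(metrics.snapshot())
```

A production library would add thread safety, histograms, and a reporter that pushes snapshots out periodically.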
Graph your important metrics, and strive for a latency of seconds from measurement to graph
We use a heavily extended derivative of Munin
Graph
It is hard to know beforehand what you will need, so err on the side of logging too much (within reason)
Use a structured format
Use syslog
Collect your logs in a central place
Store your logs and make them analyzable
Log what’s important
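A structured format can be as simple as one JSON object per log line. The sketch below uses Python's standard `logging` module; the field names, the `fields` convention for extra data, and the logger name are assumptions made for this example, not something the talk specifies.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra key/value pairs passed as extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


def build_logger(name, handler):
    """Attach a JSON-formatting handler to the named logger."""
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger


# StreamHandler writes to stderr. To send the same lines to syslog, a
# logging.handlers.SysLogHandler could be passed instead; its address
# (for example "/dev/log" on many Linux systems) depends on the platform.
log = build_logger("example-service", logging.StreamHandler())
log.info("request served", extra={"fields": {"status": 200, "duration_ms": 12}})
```

Because every line is valid JSON, a central collector can parse and index the logs without per-service regular expressions.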
Consistently build to some form of package. Keep track of dependencies
We build everything* into Debian packages and use package dependencies
Debian is awesome. Use it.
Automate deployment
* Except Maven dependencies
Keep everything under version control
Use a provisioning tool
We use Puppet and store every configuration in Git. Everything*.
250 modules, 880 classes
Automate configuration
* Everything
Trust your developers and ops. Let your teams be autonomous
Long-term ownership
Minimize interruptions (aka meetings)
Favor asynchronous communication. We coordinate over IRC and use email
Ship.
Development
We’re hiring → spotify.com/jobs (ngn@spotify.com)
Questions?

Spotify services (SDC 2013)
