Let's build a PaaS platform, how hard could it be?

Presentation given by Błażej Kasperczyk at the Pykonik meetup in Kraków.

How many applications, and where do we put them? Why is our system so bad at keeping up with what the users want? What to do in case of a noisy neighbour?

When you aim to provide a platform where developers can easily launch an application without worrying about configuring the system, you will have to code it sooner or later. As with most very simple concepts, it presents a plethora of challenges.

Published in: Software

  1. Błażej Kasperczyk, Kraków, 05.10.2017. Hey, let's build a PaaS Cloud! ...it's easy, right?
  2. Team PaaS... • DevOps team • Develop and maintain the platform • Backend-oriented • Python 3.x, Tornado • Caption: "With the friends you have in your team, you don't really need enemies!"
  3. ...And our little cloud – and what runs on it • Approx. 2300 VMs of varying sizes • 1400 active applications, 600 of them in Python 3.x • 9000+ running instances • ...a third of them are Python, Tornado-based applications
  4. The PaaS layer • Push-button deployment • Scaling by available resources and the number of applications • Quick application installation with our build system • Communications bus between applications
  5. The slow start • Work started in 2011 • Python 2.7 + gevent • Works over SSH • Push model • Hard limit: 10 applications on each VM • ...and it works!
  6. The inevitable • Approx. 150 VMs max • 300 VMs becomes a hard limit that cannot be bypassed • A single point of failure • Caption: While the panel was primitive, "papyrus" was a top trending colour!
  7. A moment of reinvention: what if we use our cloud to scale our cloud? • In place of the old orchestrator: a table of states and a coordinator • An API that exposes what needs to be done to reach the desired state • A daemon running on the VM handles the rest • It's 2013: let's be modern, let's do it in Python 3!
  8. Scoreboard • Coordinates cloud management • PostgreSQL backend • Responsible for provisioning • Supports over 2000 machines... • ...each querying multiple times every minute... • ...currently. • It can rebuild itself in case of a database failure
  9. Agent daemon • Runs on the VM it manages • Automatically launched with each new VM • Launches and maintains applications • Reports statistics for monitoring purposes • Allows the developer to remotely shut the application down
  10. Density problems • Over-taxing VMs causes performance issues • As it is, the allocation is hit-and-miss.
  11. Weight balancing • Each VM has a capacity limit • Each application declares its size: Light (White/Green), Medium (Yellow) or Heavy (Red) • ...that should do it, right?
  12. Oversized cats • A worker can spike to 100% CPU usage while averaging only 10%. • An application can declare high usage but be harmless.
  13. The R&D • Docker? • LXC? • ...CGroups?
  14. Docker • Requires a major overhaul of our application building and deployment... • ...and would only do what we already do.
  15. LXC • The current architecture requires that there be no network translation between the Agent and the application... • ...and that caused issues when launching applications
  16. CGroups! • The same mechanism that is used by most containers • Automatic cleanup • Simplicity of the solution
  17. Everything is now in a box... • Applications in the cloud no longer exceed their assigned resources • CPU is limited for each instance • The OOM killer kicks in for memory-heavy applications that try to exceed their limits
  18. ...time to relax, right? • Time does not stop, or that time we went Xenial and got eaten by systemd • The sword of Damocles called "Impending Knapsack Problem" • Autoscaling • ...and a few other things • Caption: As a side effect, we actually made a sane frontend.
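
Items 7 to 9 of the list above describe a pull model: a table of desired states, an API that tells each VM what needs to be done, and an agent on the VM that carries it out. The talk does not show the actual API or schema, so the following is only a minimal sketch of the idea. The function name `plan_actions`, the application names and the dict-based state are all hypothetical.

```python
from typing import Dict, List, Tuple

# (verb, application name, instance count); a hypothetical shape for illustration.
Action = Tuple[str, str, int]


def plan_actions(desired: Dict[str, int], running: Dict[str, int]) -> List[Action]:
    """Return the start/stop actions that move `running` towards `desired`.

    Both arguments map an application name to an instance count on one VM.
    """
    actions: List[Action] = []
    # Walk every application known to either side, in a stable order.
    for app in sorted(set(desired) | set(running)):
        want = desired.get(app, 0)
        have = running.get(app, 0)
        if want > have:
            actions.append(("start", app, want - have))
        elif have > want:
            actions.append(("stop", app, have - want))
    return actions


if __name__ == "__main__":
    desired = {"billing": 2, "search": 1}
    running = {"billing": 1, "legacy": 3}
    for action in plan_actions(desired, running):
        print(action)
```

In a design like this the agent would call such a function on every poll, so a lost message only delays convergence until the next poll rather than leaving the VM in the wrong state.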

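The weight balancing in item 11 (each VM has a capacity, each application declares a size) can be illustrated with a small placement routine. The talk does not say which placement algorithm was used; this sketch swaps in first-fit decreasing, a common bin-packing heuristic. The numeric weights, the capacity and the name `place` are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical weights; the talk names the size classes but not their values.
WEIGHTS = {"light": 1, "medium": 2, "heavy": 4}


def place(apps: Dict[str, str], vm_capacity: int, vm_count: int) -> Optional[List[List[str]]]:
    """First-fit decreasing: put each app on the first VM with enough room.

    `apps` maps an application name to its declared size class.
    Returns one list of app names per VM, or None if some app does not fit.
    """
    free = [vm_capacity] * vm_count
    placement: List[List[str]] = [[] for _ in range(vm_count)]
    # Heaviest first; ties are broken by name so the result is deterministic.
    for name in sorted(apps, key=lambda n: (-WEIGHTS[apps[n]], n)):
        weight = WEIGHTS[apps[name]]
        for vm in range(vm_count):
            if free[vm] >= weight:
                free[vm] -= weight
                placement[vm].append(name)
                break
        else:
            # No VM had room for this application.
            return None
    return placement


if __name__ == "__main__":
    apps = {"api": "heavy", "cron": "light", "web": "medium", "worker": "heavy"}
    print(place(apps, vm_capacity=6, vm_count=2))
```

A scheme like this trusts the declared size, which is the weakness item 12 points out: a worker that averages 10% CPU but spikes to 100% is mis-sized under any static weight.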
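
Items 16 and 17 of the list say that CGroups cap CPU per instance and let the OOM killer handle applications that exceed their memory limit. As a rough sketch, assuming the cgroup v1 interface (the talk does not state the version), the limits come down to three control files: `cpu.cfs_period_us`, `cpu.cfs_quota_us` and `memory.limit_in_bytes`. The helper name `cgroup_v1_limits` and its parameters are hypothetical, and the function only computes the values; writing them needs root and a mounted cgroup hierarchy.

```python
from typing import Dict


def cgroup_v1_limits(cpu_cores: float, memory_mib: int, period_us: int = 100_000) -> Dict[str, str]:
    """Map cgroup v1 control file names to the values that enforce the limits."""
    if cpu_cores <= 0 or memory_mib <= 0:
        raise ValueError("limits must be positive")
    return {
        "cpu.cfs_period_us": str(period_us),
        # Quota is the CPU time allowed per period: 0.5 cores is half the period.
        "cpu.cfs_quota_us": str(int(cpu_cores * period_us)),
        # Exceeding this limit is what triggers the OOM killer inside the group.
        "memory.limit_in_bytes": str(memory_mib * 1024 * 1024),
    }


if __name__ == "__main__":
    for name, value in cgroup_v1_limits(cpu_cores=0.5, memory_mib=256).items():
        print(name, value)
```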