This talk examines the issues at play when operating systems that run on public clouds. It should get you thinking about why service levels should not be treated as just a sequence of 9s, how to take a more holistic approach, and how to invest the right amount in resilience before going live and running in production. It is equally important to understand the human element, which is where most errors occur in any case, and to be able to minimize both the occurrence and the impact of human errors. The key takeaway is that everything can and eventually will fail, and that you should approach your design in such a way that you can handle those situations gracefully.