Apache Spark is a general-purpose big data execution engine. You can work with different data sources through the same set of APIs in both batch and streaming mode. Such flexibility is great if you are an experienced Spark developer solving a complicated data engineering problem that might include ML or streaming. At Airbnb, 95% of all data pipelines are daily batch jobs that read from Hive tables and write to Hive tables. For such jobs, you would like to trade some flexibility for more extensive functionality around writing to Hive and orchestrating multi-day processing. Another advantage of reducing flexibility is that it lets you establish best practices that less experienced data engineers can follow. At Airbnb, we've created a framework called "Sputnik" that tries to address these issues. In this talk, I'll show the typical boilerplate code that Sputnik reduces and the concepts it introduces to simplify pipeline development.
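For context, here is a minimal sketch of the kind of per-job boilerplate meant above; it is not Sputnik code, and the table names, the ds partition column, and the aggregation are hypothetical. Session setup, run-date parsing, and partition-aware writes like these tend to be repeated in every daily Hive-to-Hive job:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

object DailyJob {
  def main(args: Array[String]): Unit = {
    // Every job parses its own run date argument, e.g. "2020-01-01".
    val ds = args(0)

    // Every job builds its own Hive-enabled session.
    val spark = SparkSession.builder()
      .appName("daily_job")
      .enableHiveSupport()
      .getOrCreate()

    // Read a single day's partition from an input Hive table.
    val input = spark.table("src_db.events").where(s"ds = '$ds'")

    // The actual business logic is often a small fraction of the file.
    val output = input.groupBy("user_id").count()

    // Write back to a partitioned Hive table, overwriting only this
    // run date's partition (requires the target table to exist).
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
    output.withColumn("ds", lit(ds))
      .write
      .mode("overwrite")
      .insertInto("dst_db.user_event_counts")
  }
}
```

A framework aimed at this job shape can own the session, the date arithmetic (including backfills over ranges of days), and the Hive write path, leaving only the transformation to the pipeline author.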