
Simple Data Engineering in Python 3.5+ — Pycon.DE 2017 Karlsruhe — Bonobo ETL

Simple Data Engineering in Python 3.5+ using Bonobo ETL, with a real-world example using Django2 and DBPedia.

https://www.bonobo-project.org/

Presentation from Pycon.DE 2017 in Karlsruhe

  1. Bonobo ETL: Data Engineering for Humans (Python 3.5+)
  2. quick intro • 101: one minute to jump in • 102: theory and concepts • 103: apply to reality • wrap up
  3. Intro
  4. Extract → Transform → Load (diagram labels: foo, bar, baz)
  5. Extract → Transform → Load (diagram labels: foo, bar, baz, Transform, more, Join, DB, HTTP POST, log?)
  6. Why don’t we have…
     • Extract Transform Load using code as configuration.
     • Preferably Python code.
     • Something that can be tested (I mean, by a machine).
     • Something that can use inheritance.
     • Quick install on a laptop, great on servers too.
  7. Meet Bonobo
  8. Bonobo is…
     • A framework to write ETL jobs
     • …using code as configuration
     • …with the same concepts as legacy ETLs.
     • It’s just Python!
  9. Bonobo is not…
     • …a data analysis or statistical toolkit
     • …a scheduler or dependency manager
     • …a big data tool
  10. </tl;dr>
  11. Action
  12. 101: jump in
  13. $ pip install bonobo
  14. $ bonobo init pycon
  15. $ bonobo run pycon
  16. - extract in=1 out=42
      - transform in=42 out=21
      - load in=21
  17. 102: theory
  18. Graphs…
  19. Graphs…
      import bonobo
      graph = bonobo.Graph()
      graph.add_chain(
          extract,
          transform,
          load,
      )
  20. Transformations…
  21. Functions (1 In / 1 Out)
      items = {...}
      def get_item(id):
          return id, items.get(id)
  22. Generators (1 In / 0-n Out)
      orders = {...}
      def get_orders(user):
          yield from orders.get(user)
  23. Iterators (0 In / 0-n Out)
      numbers = range(2017)
      messages = [
          ('Bonjour', 'Paris', ),
          ('Ciao', 'Rimini', ),
          ('Guten Tag', 'Karlsruhe', ),
      ]
  24. Classes
      class ExtractMessages:
          def __call__(self):
              yield 'Bonjour', 'Paris', {'year': 2017}
              yield 'Ciao', 'Rimini', {'year': 2017}
              yield 'Guten Tag', 'Karlsruhe', {'year': 2017}
  25. … anything, as long as it’s callable().
  26. 103: reality
  27. 1 • Bonobo meets Django2
      2 • Hello, DBPedia
      3 • Music groups…
      4 • Music genres…
      5 • Links and play!
  29. DBPedia & SPARQL
      • Triplets: (SUBJECT, PREDICATE, OBJECT)
      • DBPedia: Wikipedia as triplets
      • SPARQL: Query language for triplet stores.
  33. Wrap up
  34. State of Bonobo ETL
      • First commit: December 2016
      • 25 releases, ~565 commits, 12 contributors
      • Current « stable »: 0.5.1
      • Target: 1.0 early 2018 when it’s ready
  35. Small scale … < Tb
      • One minute to install.
      • Easy to deploy.
      • It is a Lean Manufacturing Toolkit for Data.
  36. www.bonobo-project.org • Data Engineering for Humans • @bonobo_etl
  37. Romain Dorgueil • @rdorgueil
  38. Oh, wait!
  39. Sprint!
  40. Stickers!
  41. Feedback!
  42. Danke! https://goo.gl/e25eoa
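
The transformation kinds on slides 21 to 25 (function, generator, callable class) and the linear chain on slide 19 can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not Bonobo's implementation: `run_chain` is a hypothetical helper that feeds every output of one node into the next, the `ORDERS` and `PRICES` dictionaries are made-up sample data (the slides elide the real contents), and the per-node counters merely imitate the `in=… out=…` lines of slide 16.

```python
import inspect

# Hypothetical sample data; the slides only show "{...}" here.
ORDERS = {"alice": ["book", "pen"], "bob": []}
PRICES = {"book": 12, "pen": 2}


def extract():
    """0 in / 0-n out: a generator acting as the source of the chain."""
    yield "alice"
    yield "bob"


def get_orders(user):
    """1 in / 0-n out: a generator may emit any number of rows per input."""
    yield from ORDERS.get(user, ())


def get_price(item):
    """1 in / 1 out: a plain function returns exactly one row per input."""
    return item, PRICES.get(item)


class Collect:
    """Any callable can be a node; this one stores the rows it receives."""

    def __init__(self):
        self.rows = []

    def __call__(self, row):
        self.rows.append(row)


def run_chain(source, *nodes):
    """Hypothetical stand-in for a graph runner (not Bonobo's executor).

    Returns a list of (node name, rows in, rows out) tuples.
    """
    rows = list(source())
    stats = [(source.__name__, 0, len(rows))]
    for node in nodes:
        next_rows = []
        for row in rows:
            result = node(row)
            if inspect.isgenerator(result):
                next_rows.extend(result)  # generator node: 0-n outputs
            elif result is not None:
                next_rows.append(result)  # function node: one output
        name = getattr(node, "__name__", type(node).__name__)
        stats.append((name, len(rows), len(next_rows)))
        rows = next_rows
    return stats


load = Collect()
stats = run_chain(extract, get_orders, get_price, load)
for name, n_in, n_out in stats:
    print(f"- {name} in={n_in} out={n_out}")
print(load.rows)
```

The sketch processes one node at a time for readability; a real runner would typically stream rows between nodes instead of materializing each intermediate list.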

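
The triplet model from slide 29 can be illustrated with a tiny in-memory store. This is a toy sketch, not SPARQL and not the DBPedia API: the `Triple` type, the `match` helper, and the three sample triples are all assumptions introduced for illustration, and `None` plays the role of a query variable that matches anything.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical toy data, not taken from DBPedia.
TRIPLES = [
    Triple("Bonobo", "written_in", "Python"),
    Triple("Bonobo", "is_a", "ETL framework"),
    Triple("Django", "written_in", "Python"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples whose fixed positions equal the given values.

    A position left as None is treated as a wildcard.
    """
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# "Which subjects are written in Python?"
python_projects = [
    t.subject for t in match(TRIPLES, predicate="written_in", obj="Python")
]
print(python_projects)
```

A triplet store answers questions by fixing some positions of the (SUBJECT, PREDICATE, OBJECT) pattern and leaving the others open, which is the idea a SPARQL query expresses declaratively.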