My talk at LVEE 2016
1. Using the Hadoop stack to build a cloud service for revising VAT declarations
Alex Chistyakov
Git in Sky
Grodno, LVEE 2016
2. Who I am
●
Hello, my name is Alex
●
Principal Engineer @ Git in Sky
●
Hadoop operations engineer
●
Former Java developer (not only Java, and not so “former” in fact)
3. Who are you?
●
Linux and OSS enthusiasts?
●
Software developers?
●
DevOps engineers?
●
Big data guys?
8. Well, what is this all about?
●
Configuring a Hadoop/HBase cluster is easy
●
1) Buy a lot of hardware
●
2) Configure the bloody cluster!
●
3) ???
●
4) PROFIT!!!
9. Big Data is hard!
●
A customer wants a number of environments for different purposes (dev, testing, staging & production)
●
DevOps culture requires repeatability!
●
(Observe a beautiful snowflake to the right)
●
Business wants to reduce costs
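To make the repeatability point concrete, here is a minimal sketch, not taken from the talk, of rendering Hadoop-style site files for every environment from one plain-data definition. The hostnames, the port, and the helper names (ENVIRONMENTS, render_site_xml, core_site, hbase_site) are assumptions made up for illustration; only the property names fs.defaultFS and hbase.zookeeper.quorum and the configuration/property/name/value layout come from Hadoop and HBase themselves.

```python
from xml.etree import ElementTree as ET

# Hypothetical per-environment settings; every hostname is a placeholder.
ENVIRONMENTS = {
    "dev": {"namenode": "nn.dev.example.com",
            "zookeeper": ["zk1.dev.example.com"]},
    "production": {"namenode": "nn.prod.example.com",
                   "zookeeper": ["zk1.prod.example.com",
                                 "zk2.prod.example.com",
                                 "zk3.prod.example.com"]},
}


def render_site_xml(properties):
    """Render a dict of settings as a Hadoop-style *-site.xml string."""
    root = ET.Element("configuration")
    # Sort keys so the same input always produces byte-identical output.
    for name, value in sorted(properties.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def core_site(env):
    """core-site.xml content for one environment (port 8020 is assumed)."""
    settings = ENVIRONMENTS[env]
    return render_site_xml(
        {"fs.defaultFS": "hdfs://%s:8020" % settings["namenode"]})


def hbase_site(env):
    """hbase-site.xml content for one environment."""
    settings = ENVIRONMENTS[env]
    return render_site_xml(
        {"hbase.zookeeper.quorum": ",".join(settings["zookeeper"])})


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        print(env, core_site(env))
        print(env, hbase_site(env))
```

Because the environments differ only in data, adding testing or staging means adding one dictionary entry, and no environment becomes a hand-configured snowflake.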
11. So, we need a detailed plan
●
1) Buy an enterprise subscription from Oracle
●
^ FAIL!
14. So, we need a detailed plan
●
1) Read the manual on the product site
●
2) Configure everything manually
●
^ FAIL!
18. So, we need a detailed plan
●
1) Take Cloudera distribution of Hadoop
●
2) Configure everything from a web interface
●
3) Don’t forget to buy an enterprise subscription
●
4) ^ MULTIPLE FAILS!!!
19. A word on proprietary software
●
Proprietary software is full of nasty bugs, period
20. A word on open source software
●
Open source software is awesome
22. Software market in 2016
●
It’s not “proprietary vs open source”
●
It’s “open source vs open source”
23. Open source vs open source
●
Cloudera CDH vs vanilla Apache
28. So, we need a detailed plan
●
1) Hire a DevOps engineer
●
2) Use Chef or something
●
3) Automate all the things
●
4) ???
●
5) PROFIT!!!
29. 100 reasons not to use Cloudera CDH
●
Cloudera CDH obscures configuration
●
Cloudera CDH generates textual configs from the DB
●
Cloudera CDH is web-interface centric
●
Cloudera CDH is a monolith with a vendor lock-in
30. Our own little open source product
●
Based on Ansible (Ansible is like Chef but awesome)
●
https://github.com/gitinsky/ansible-hadoop-stack-howto
●
https://github.com/gitinsky/ansible-role-*
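As a rough illustration of the plain-text, version-controllable approach that Ansible encourages, the sketch below generates an INI-style Ansible inventory from a dictionary. It is an assumption-laden example and does not reproduce the layout of the repositories above: the group names (namenodes, datanodes), the hostnames, and the render_inventory helper are all hypothetical. Only the inventory shape, a [group] header followed by one host per line, is standard Ansible.

```python
def render_inventory(groups):
    """Render {group: [hosts]} as an INI-style Ansible inventory string."""
    lines = []
    for group, hosts in groups.items():
        lines.append("[%s]" % group)   # group header
        lines.extend(hosts)            # one host per line
        lines.append("")               # blank line between groups
    return "\n".join(lines)


# Hypothetical cluster layout; group names and hostnames are placeholders.
CLUSTER = {
    "namenodes": ["nn1.example.com"],
    "datanodes": ["dn1.example.com", "dn2.example.com"],
}

if __name__ == "__main__":
    print(render_inventory(CLUSTER))
```

The resulting text file can be committed, diffed, and reviewed like any other source file, which is the opposite of a configuration that lives in a database behind a web interface.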
36. So long, and thanks for all the fish!
●
Ask your questions please
●
Alex Chistyakov, Principal Engineer @ Git in Sky
●
http://gitinsky.com
●
alex@gitinsky.com
●
http://meetup.com/DevOps-40