6. Let’s Talk Schema
Cons
❑No mechanism for evolution
❑No contract enforcement
Avro
Schema Registry
@erik_tank
7. History of Avro (Aircraft)
1909
Roe I Triplane
First aircraft built by the Roes.
First flight on June 5, 1909.
1910
Founded by Alliott & Humphrey Verdon Roe
One of the world's first aircraft builders.
1956
Avro Vulcan B.1
High-altitude strategic bomber
Operated by the Royal Air Force (RAF) from 1956 until 1984
1963
Merged into Hawker Siddeley Aviation
However, the Avro name has been used for some aircraft since then.
8. History of Avro (Software)
2002
Nutch
Doug Cutting set out to build an open-source, full-scale search engine.
2003
Writable/SequenceFile
Improving the ability to run MapReduce and other parallel data computations.
2006
Hadoop
Search is removed from Nutch; with only the data processing/sharing parts left, Hadoop is born.
2009
Apache Avro
A specification-based design to serve the data exchange needs of a world with multiple *everything*.
9. Avro
❑Language neutral serialization system
❑Rich data structures
❑Compact, fast, binary data format
❑Integration with dynamic languages
❑Code generation (Java)
❑RPC
❑Schema evolution
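To make the "compact, fast, binary data format" bullet concrete, here is a minimal sketch of how Avro's binary encoding lays out a record. It hand-rolls the encoding for illustration only and is not the Avro library API. The User record with a string `name` and a long `age` is a hypothetical example, as are the helper names. Per the Avro specification, longs are written as zig-zag variable-length integers, strings as a length followed by UTF-8 bytes, and record fields are concatenated in schema order with no field names.

```python
import json


def encode_long(n: int) -> bytes:
    """Encode a signed 64-bit integer as an Avro long (zig-zag, then base-128 varint)."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: 0, -1, 1, -2 map to 0, 1, 2, 3
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length (a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_user(user: dict) -> bytes:
    """Encode the hypothetical User record: fields in schema order, no field names."""
    return encode_string(user["name"]) + encode_long(user["age"])


user = {"name": "foo", "age": 27}
avro_bytes = encode_user(user)
json_bytes = json.dumps(user).encode("utf-8")

# Compare the binary payload with the same record serialized as JSON text.
print(avro_bytes.hex(), len(avro_bytes), len(json_bytes))
```

The binary payload carries no field names, which is why a reader needs the schema to decode it.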
10. Why Use
❑ Data compression
❑ Language support
❑ Schema evolution
❑ Flexibility
11. Why Not
❑ Initial Lift
❑ Binary format
❑ Not turnkey in every language
❑ Prototyping
❑ You’re afraid?
14. Short Kafka Detour
❑Topic contains messages
❑Message is a key-value pair
❑Subject – the name a schema is registered under in Schema Registry
topic_name-key
topic_name-value
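The subject pattern above can be sketched as a small helper. This assumes the topic-based naming shown on the slide (one subject for the key schema, one for the value schema); the function name and the example topic are hypothetical.

```python
from typing import Tuple


def subject_names(topic: str) -> Tuple[str, str]:
    """Return the (key subject, value subject) pair for a topic.

    Assumes the topic-based convention from the slide:
    <topic>-key and <topic>-value.
    """
    if not topic:
        raise ValueError("topic name must not be empty")
    return f"{topic}-key", f"{topic}-value"


key_subject, value_subject = subject_names("tracking.user.clicks")
print(key_subject)
print(value_subject)
```

Because the subject is derived from the topic name, renaming a topic also means registering its schemas under new subjects.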
15. Picking a Topic Name
❑Avoid names that change over time
❑Settle on a template for your topics
❑ <message type>.<dataset name>.<data name>
❑Topic should reflect its purpose
Source: https://riccomini.name/how-paint-bike-shed-kafka-topic-naming-conventions
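One way to make the template stick is to build topic names through a single function rather than by hand. The sketch below assumes the `<message type>.<dataset name>.<data name>` template from the slide; the function name and example values are hypothetical. The character set and the 249-character limit reflect Kafka's rules for legal topic names.

```python
import re

# Kafka topic names may contain ASCII letters, digits, '.', '_' and '-'.
_LEGAL_PART = re.compile(r"[a-zA-Z0-9_-]+")
MAX_TOPIC_LENGTH = 249


def topic_name(message_type: str, dataset: str, data_name: str) -> str:
    """Build '<message type>.<dataset name>.<data name>' and validate it."""
    parts = (message_type, dataset, data_name)
    for part in parts:
        # Dots are reserved as the separator, so they are rejected inside a part.
        if not _LEGAL_PART.fullmatch(part):
            raise ValueError(f"illegal topic name component: {part!r}")
    name = ".".join(parts)
    if len(name) > MAX_TOPIC_LENGTH:
        raise ValueError(f"topic name longer than {MAX_TOPIC_LENGTH} characters")
    return name


print(topic_name("tracking", "user", "clicks"))
```

Rejecting dots inside a component keeps the three parts unambiguous when a name is later split back apart.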
21. Best Practices (Do’s)
❑Default Value
❑Document!!!
❑Carefully consider topic name
❑Carefully consider changes to schema
❑Treat Avro like a DB connection
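The "Default Value" item matters because defaults are what let a reader with a newer schema consume records written with an older one. The sketch below is a deliberately simplified model of that rule, not the Avro library's resolution code. It ignores type promotion, aliases, and nested types, and the schemas and the `resolve` helper are hypothetical.

```python
def resolve(record: dict, reader_schema: dict) -> dict:
    """Project a decoded record onto the reader's schema.

    Simplified rule: take the written value if present, otherwise the
    reader field's default, otherwise fail. Fields the reader does not
    declare are dropped.
    """
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return resolved


# Hypothetical version 2 of a User schema: 'email' was added with a default.
user_v2 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

old_record = {"name": "foo"}  # written before 'email' existed
print(resolve(old_record, user_v2))
```

Had `email` been added without a default, the same old record would fail to resolve, which is the situation the default guards against.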