
How Kroger embraced a "schema first" philosophy in building real-time data pipelines (Rob Hoeting, Rob Hammonds & Lauren McDonald, Kroger) Kafka Summit SF 2019

Early attempts at real-time business event streaming at Kroger were based on JSON-formatted events. Modifications to the event formats occasionally broke downstream consumers, causing costly downtime. In the course of reimagining what an industrial-strength streaming platform would look like, we decided to focus heavily on schema lifecycle and management as a foundation. The schema registry is a great service, but it's only one part of the schema lifecycle management process.

Here are the core principles around schema management:
(1) Event schemas are expressed in Avro
(2) New versions will be fully compatible with older versions
(3) Event producers create, manage, and fully document event schemas
(4) Avro schemas are managed in git and represent the source of truth
(5) Complex schemas can be broken into smaller reusable component schemas and referenced in larger schemas

The CI/CD build process, in conjunction with customized Gradle plugins, performs the following:
(1) Constructs the full event schemas from components into larger registerable schemas
(2) Generates Java source code based on the event schemas
(3) Checks compatibility with prior registered versions
(4) Registers the new/updated version in the schema registry
(5) Publishes the generated JAR file into Artifactory for producers and consumers
(6) Other source code generation (future)
(7) Publishes the schema into other metadata tools to help make them more discoverable (future)
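Principle (5) and build step (1) describe assembling a full event schema from smaller components. The talk does not show how the Gradle plugins do this, so the following is only a minimal sketch of the idea, using a hypothetical `SchemaComposer` class and plain Java collections rather than the Avro library. It borrows the Address and Person components from slide 10; the `string` field types are assumptions, since the slide lists only field names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: expands named component schemas into one full schema description. */
final class SchemaComposer {
    // Component name -> (field name -> type name), kept in declaration order.
    private final Map<String, Map<String, String>> components = new LinkedHashMap<>();

    /** Registers a component from alternating field-name / type-name pairs. */
    SchemaComposer define(String name, String... nameTypePairs) {
        if (nameTypePairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected name/type pairs");
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < nameTypePairs.length; i += 2) {
            fields.put(nameTypePairs[i], nameTypePairs[i + 1]);
        }
        components.put(name, fields);
        return this;
    }

    /**
     * Recursively inlines component references. Type names that are not
     * registered components are treated as primitives and returned unchanged.
     * Cyclic references are not handled in this sketch.
     */
    String expand(String typeName) {
        Map<String, String> fields = components.get(typeName);
        if (fields == null) {
            return typeName;
        }
        StringBuilder out = new StringBuilder(typeName).append("{");
        String separator = "";
        for (Map.Entry<String, String> field : fields.entrySet()) {
            out.append(separator)
               .append(field.getKey())
               .append(": ")
               .append(expand(field.getValue()));
            separator = ", ";
        }
        return out.append("}").toString();
    }

    public static void main(String[] args) {
        SchemaComposer composer = new SchemaComposer()
            .define("Address", "Street", "string", "City", "string",
                    "State", "string", "ZipCode", "string")
            .define("Person", "firstName", "string", "lastName", "string",
                    "address", "Address");
        // Prints Person with the Address component inlined.
        System.out.println(composer.expand("Person"));
    }
}
```

Keeping components in a lookup table and inlining them at build time means the registry only ever sees one self-contained schema per event, which is one plausible reason for composing before registering.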

Published in: Technology


  1. How Kroger Embraced a “Schema First” Philosophy in Building Real-time Data Pipelines
  2. Rob Hammonds, Rob Hoeting, Lauren McDonald
  3. 460,000 associates company-wide. $121.2 billion in 2018 total sales. 2,764 supermarkets and multi-department stores. Serving customers in 35 states and the District of Columbia.
  4. We are evolving!
  5. “This is just a data swamp.” “My report just broke!” “How do I use the data?” “Where do I find the data?” “I want this new business feature by Friday.” “Let's use AI and streaming analytics to solve all our problems!” “I’m spending too much time spraying data to everyone!” “I had to roll back my prod release because my data change broke someone!”
  6. Event Streaming Platform Tenets: Thou shalt democratize data. Thou shalt model business processes. Thou shalt have a high developer experience.
  7. Avro to the rescue!
  8. Schema-First Development
     StoreCreated.avsc:
       {
         "type": "record",
         "name": "Store",
         "doc": "This is an avro schema for a store",
         "namespace": "com.kroger",
         "fields": [
           { "name": "StoreName", "type": "string" },
           { "name": "StoreID", "type": "int" },
           { "name": "Location", "type": "string" }
         ]
       }
     Serialization Framework:
       StoreCreated.newBuilder()
           .setStoreName("Kroger-Cincinnati")
           .setStoreID(513)
           .setLocation("1014 Vine Street")
           .build()
  9. Ugh… How do I make sense out of all this?
  10. Let’s make the events composable!
      Address: { Street, City, State, ZipCode }
      Person: { firstName, lastName, address: Address }
      Role: { reportsTo, title }
      EmployeeHiredEvent: { employee: Person, role: Role, startDate: date, salary: float }
  11. Class Generation and Publishing. Register the Schema. Compatibility Checking.
  12. Bootstrap all the things! #LazyDevelopersAreTheBestDevelopers
  13. “My pipeline isn’t registering the schemas.” “I register schemas but my jar isn’t being published.” “I’m failing compatibility but I HAVE to change this field name.” “The bootstrap script isn’t working for me!”
  14. Schema Evolution (diagram): Producer v.1 and v.2; Avro Files v.1 and v.2; Consumer v.1 and v.2; Schema v.1 and v.2; Backward Compatibility, Forward Compatibility, Full Compatibility.
  15. Schema Editor: Event Schema Lifecycle (timeline: V1, V2, V2)
      StoreCreated.avsc – v1:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "SID", "type": "int" }
          ]
        }
      StoreCreated.avsc – v2:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "SID", "type": "int" },
            { "name": "Location", "type": "string", "default": "Cincinnati" }
          ]
        }
      PRODUCTION, STAGE, and DEVELOPMENT Schema Registries: Full Compatibility
  16. Schema Editor: Event Schema Lifecycle (timeline: V1, V2, V2, V3)
      StoreCreated.avsc – v1 and v2: as on slide 15.
      StoreCreated.avsc – v3:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "string", "default": "Cincinnati" }
          ]
        }
      Changing the name of an attribute breaks compatibility!
      PRODUCTION, STAGE, and DEVELOPMENT Schema Registries: Full Compatibility
  17. “I accidentally named a field wrong. Seems rather harsh, don’t you think?” “I’m blocked.” “Hey Robs and Lauren, can you delete this schema and clear my topic?”
  18. Schema Editor: Event Schema Lifecycle (timeline: V1, V2, V2, V2, V3)
      StoreCreated.avsc – v2:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "string" }
          ]
        }
      StoreCreated.avsc – v3:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "com.kroger.commons.Location" }
          ]
        }
      You can’t change the type of an attribute!
      PRODUCTION, STAGE, and DEVELOPMENT Schema Registries: Full Compatibility
  19. How do we protect clients in production, but improve the DevX of schema development?
  20. Schema Editor: Major Version Compatibility (MVC) (timeline: 2.0.0, 2.0.1)
      StoreCreated.avsc – v2.0.0:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "com.kroger.commons.Location" }
          ]
        }
      StoreCreated.avsc – v2.0.1:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "com.kroger.commons.Location" },
            { "name": "StoreManager", "type": "com.kroger.commons.Person", "doc": "Manager of the store" }
          ]
        }
      When adding an attribute, make it nullable and add a default!
      PRODUCTION Schema Registry: Full Transitive Compatibility. STAGE Schema Registry: No Compatibility. DEVELOPMENT Schema Registry: No Compatibility.
  21. Schema Editor: Major Version Compatibility (MVC) (timeline: 2.0.0, 2.0.1, 2.1.0, 2.0.0, 2.0.0, 2.1.0)
      StoreCreated.avsc – v2.0.1:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "com.kroger.commons.Location" },
            { "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" }
          ]
        }
      PRODUCTION Schema Registry: Full Transitive Compatibility. STAGE Schema Registry: No Compatibility. DEVELOPMENT Schema Registry: No Compatibility.
  22. Schema Editor: Major Version Compatibility (MVC) (timeline: 2.0.0, 2.0.1, 2.1.0, 2.0.0, 2.0.0, 2.1.0, 2.1.1)
      StoreCreated.avsc – v2.1.1:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "Store", "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "com.kroger.commons.Location" },
            { "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" }
          ]
        }
      You can’t change the name of an attribute, but you can add an alias!
      PRODUCTION Schema Registry: Full Transitive Compatibility. STAGE Schema Registry: No Compatibility. DEVELOPMENT Schema Registry: No Compatibility.
  23. Schema Editor: Major Version Compatibility (MVC) (timeline: 2.0.0, 2.0.1, 2.1.0, 2.0.0, 2.0.0, 2.1.0, 2.1.1)
      StoreCreated.avsc – v2.1.1:
        {
          "type": "record",
          "name": "Store",
          "doc": "This is an avro schema for a store",
          "namespace": "com.kroger",
          "fields": [
            { "name": "StoreName", "aliases": ["Store"], "type": "string" },
            { "name": "StoreID", "type": "int" },
            { "name": "Location", "type": "com.kroger.commons.Location" },
            { "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" }
          ]
        }
      PRODUCTION Schema Registry: Full Transitive Compatibility. STAGE Schema Registry: No Compatibility. DEVELOPMENT Schema Registry: No Compatibility.
  24. Schema Editor: Major Version Compatibility (MVC). Same schema, timeline, and registry settings as slide 23.
  25. Let’s automate all the things… #lazydevelopersarethebestdevelopers
  26. What are we improving again? Cross-platform Schema Development, Schema Composition, Schema Registration, JAR building, Schema Lifecycle.
  27. ELMR – Event Lifecycle Management Repository (architecture diagram): Development Schema Registry; Stage Schema Registry; Production Schema Registry; ELMR Global Schema and Metadata Store; ELMR; Artifactory JAR files; CI/CD Build Server; Java Plugin.
  28. Event Streaming Becomes a Thing
  29. The Business
  30. “No problem, I just sent you a 500 line Avro schema file which has everything you need.” “Can you help me understand all the fields in those inventory events?”
  31. How can these people live like this?
  32. How we improved data discovery... #LazyDevelopersAreTheBestDevelopers
  33. ELMR UI
  34. Architecture diagram as on slide 27, with ELMR-UI added.
  35. Artifact Discovery
  36. Architecture diagram as on slide 34.
  37. A Review of our Journey: Event Discovery & Socialization, Event Lifecycle Management, CI/CD Automation, Avro Standard, Events Streaming.
  38. What’s next?
  39. Future Ideas: Intuitive schema editing. Global structures and fields. Extended attributes and validation.
  40. Thank You!!! @RHammonds1 @RobHoeting @Lew181818
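
The compatibility rules illustrated on slides 14 to 16 can be approximated in a few lines. This is a deliberately simplified sketch, not Avro's actual schema-resolution algorithm or the Schema Registry's implementation: it ignores aliases, type promotion, and nested records, and the `CompatibilitySketch` class and its helpers are hypothetical names. It reduces each field to a name, a type, and whether it carries a default, and reuses the v1, v2, and v3 `Store` schemas from slides 15 and 16.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified compatibility sketch; not Avro's actual schema-resolution rules. */
final class CompatibilitySketch {
    /** A record field reduced to the three properties this check looks at. */
    static final class Field {
        final String name;
        final String type;
        final boolean hasDefault;

        Field(String name, String type, boolean hasDefault) {
            this.name = name;
            this.type = type;
            this.hasDefault = hasDefault;
        }
    }

    /** True if a reader holding readerFields can decode data written with writerFields. */
    static boolean canRead(List<Field> readerFields, List<Field> writerFields) {
        for (Field wanted : readerFields) {
            Field written = null;
            for (Field candidate : writerFields) {
                if (candidate.name.equals(wanted.name)) {
                    written = candidate;
                }
            }
            if (written == null) {
                if (!wanted.hasDefault) {
                    return false; // nothing to fill the missing field with
                }
            } else if (!written.type.equals(wanted.type)) {
                return false; // type changed
            }
        }
        return true;
    }

    /** Full compatibility: new readers handle old data and old readers handle new data. */
    static boolean isFullyCompatible(List<Field> newer, List<Field> older) {
        return canRead(newer, older) && canRead(older, newer);
    }

    /** Full transitive: the candidate must be fully compatible with every earlier version. */
    static boolean isFullTransitive(List<Field> candidate, List<List<Field>> history) {
        for (List<Field> earlier : history) {
            if (!isFullyCompatible(candidate, earlier)) {
                return false;
            }
        }
        return true;
    }

    // The three Store schema versions from slides 15 and 16.
    static List<Field> v1() {
        return Arrays.asList(new Field("StoreName", "string", false),
                             new Field("SID", "int", false));
    }

    static List<Field> v2() {
        return Arrays.asList(new Field("StoreName", "string", false),
                             new Field("SID", "int", false),
                             new Field("Location", "string", true));
    }

    static List<Field> v3() {
        return Arrays.asList(new Field("StoreName", "string", false),
                             new Field("StoreID", "int", false),
                             new Field("Location", "string", true));
    }

    public static void main(String[] args) {
        System.out.println("v2 vs v1: " + isFullyCompatible(v2(), v1()));
        System.out.println("v3 vs v2: " + isFullyCompatible(v3(), v2()));
        System.out.println("v3 vs all: " + isFullTransitive(v3(), Arrays.asList(v1(), v2())));
    }
}
```

Under these simplified rules, adding `Location` with a default passes the full check, while renaming `SID` to `StoreID` fails it because the renamed field has no counterpart and no default in the other version. That matches the outcomes shown on the slides.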
