Exactly Once Processing
The Sad Truth
Yair Weinberger
alooma | CTO
@yairwein
#TLVDataPlumbers
● Create a community of data plumbers
● "The best minds of my generation are deleting commas from log files, and that makes me sad." @medriscoll, http://adage.com/...
● “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights” @mrogati, http://www.nytimes.com/2014/08/18/...
● “It’s time to take care of our clogged data plumbing”, Tom Davenport, http://venturebeat.com/2014/09/18/...
A real-time platform that abstracts the data layer
[Diagram: Mobile, Servers, Sensors, Devices → tons of plumbing to make it work (Unscalable, Rigid, Leaky, Slow) → Analytics, Personalization, Monitoring, and more…]
A real-time platform that abstracts the data layer
[Diagram: Servers, Sensors, Devices, Mobile → platform (Scalable, Flexible, Reliable, Fast) → Analytics, Personalization, Monitoring, and more…]
Exactly Once Semantics
Same goes for exactly-once semantics
Maybe exists Does not exist
Storm + Trident
● What is Storm?
● What is Trident? (What is a transaction?)
Exactly once processing - Storm + Trident
Common myths
● We use Trident, so we guarantee “exactly once”
● If a tuple in a transaction fails, the whole transaction will be repeated, and the computation done on the transaction so far will be discarded
● Someone has already put up a working transactional state
Transactional state is hard (impossible?)
● Trident States
Even if the backing store is transactional!
Theory:
- begin commit => begin transaction in the backing store
- update state => write to the backing store
- commit => commit in the backing store
Practice:
- This only works for single-threaded states!
- commit is called once per thread
Experience:
- Even in a single-threaded state, the thread can crash between the commit to the backing store and the ack
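The failure window in the last bullet can be reproduced with a toy model. The sketch below is plain Python, not Storm or Trident code; `BackingStore`, `process_batch`, and the `crash_before_ack` flag are hypothetical names used only for illustration. It shows how a non-idempotent update (a counter increment) is applied twice when a batch is replayed after a crash between the store commit and the ack.

```python
class BackingStore:
    """Toy stand-in for a transactional backing store (hypothetical)."""

    def __init__(self):
        self.count = 0

    def commit_increment(self, amount):
        # Assume the store's own transaction succeeds atomically.
        self.count += amount


def process_batch(store, batch, crash_before_ack=False):
    """Apply a batch to the store; return True only if the batch was acked."""
    store.commit_increment(len(batch))
    if crash_before_ack:
        # Simulated crash: the store has committed, but no ack is sent.
        return False
    return True


store = BackingStore()
batch = ["a", "b", "c"]

acked = process_batch(store, batch, crash_before_ack=True)
if not acked:
    # No ack was seen, so the same batch is replayed from the start.
    acked = process_batch(store, batch)

print(store.count)  # 6, although the batch holds only 3 tuples
```

The store behaved correctly in both calls; the double count comes only from the gap between "committed" and "acked".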
Fake it ‘till you make it
Idempotency to the rescue
Simple (non distributed) example: TCP sequence numbers
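A heavily simplified sketch of the sequence-number idea, in Python. Real TCP numbers bytes rather than messages and buffers out-of-order segments; here the hypothetical `Receiver` class numbers whole messages and ignores anything that is not the next expected one. The point is only that a retransmitted segment is acknowledged again but never delivered twice.

```python
class Receiver:
    """Toy receiver that delivers each sequence number at most once."""

    def __init__(self):
        self.next_seq = 0    # next sequence number we expect
        self.delivered = []  # payloads handed to the application

    def receive(self, seq, payload):
        """Handle one segment and return the cumulative ack number."""
        if seq == self.next_seq:
            self.delivered.append(payload)
            self.next_seq += 1
        # Any other segment (a retransmission with seq < next_seq, or a
        # future segment) is not delivered; we just repeat the ack.
        return self.next_seq


receiver = Receiver()
# The sender retransmits segment 1 because its ack was lost.
for seq, payload in [(0, "x"), (1, "y"), (1, "y"), (2, "z")]:
    receiver.receive(seq, payload)

print(receiver.delivered)  # ['x', 'y', 'z']
```

Delivery to the receiver is at-least-once, but the effect on the application is exactly-once because the duplicate is recognized by its sequence number.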
Fake it ‘till you make it
Idempotent distributed examples
- Database with primary key (INSERT … IGNORE)
- Kafka with log compaction
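The primary-key case can be tried with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `events` table, its columns, and the `write_event` helper are made up for illustration, and SQLite spells the statement `INSERT OR IGNORE` where MySQL uses `INSERT IGNORE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, payload TEXT)")


def write_event(event_id, payload):
    """Insert an event; a repeated event_id is silently skipped."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO events (event_id, payload) VALUES (?, ?)",
            (event_id, payload),
        )


# "e1" arrives twice, as it would after a replayed batch.
for event_id, payload in [("e1", "login"), ("e2", "click"), ("e1", "login")]:
    write_event(event_id, payload)

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(row_count)  # 2
```

This relies on every event carrying a stable, unique identifier that survives a replay.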
Kafka Log Compaction
● Eventually retains only the most recent message per key, so rewriting the same key and value is idempotent.
● We can achieve exactly once even without transactional state!
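A toy model of the compaction argument, in Python. It does not use any Kafka API; `compact` is a hypothetical helper that mimics the end result of compaction by keeping the last value written for each key. Real compaction runs in the background, so a consumer reading the uncompacted part of the log may still see duplicates for a while.

```python
def compact(log):
    """Return the compacted view: the latest value seen for each key."""
    latest = {}
    for key, value in log:
        latest[key] = value
    return latest


log = [("user:1", "a"), ("user:2", "b"), ("user:1", "c")]
replayed_log = log + log  # the whole batch is written again after a retry

print(compact(log) == compact(replayed_log))  # True
```

The compacted state is the same whether the batch was written once or twice, which is what makes the replay harmless. This holds when a retry re-sends the same keys and values in the same order.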
Exactly once - with idempotence
Maybe exists Does not exist
Questions?
Resources
https://storm.apache.org/documentation/Trident-state
https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction
http://bravenewgeek.com/you-cannot-have-exactly-once-delivery/
Storm Real-Time Processing Cookbook, Quinton Anderson
https://github.com/quintona/trident-kafka-push/blob/master/src/main/java/com/github/quintona/KafkaState.java
