This document discusses a formal specification for event processing called LEAD. It proposes new operators and a rule grammar to address challenges in complex event processing (CEP) such as performance, scalability, state management, ambiguous semantics, and lack of expressiveness. The key contributions are a pattern algebra extending common CEP operators, a rule grammar to define patterns and derive actions, and a novel logical execution plan using timed colored Petri nets to facilitate deployment. An example use case of product roll-up tracking illustrates how LEAD can formulate a problem with interdependent patterns in fewer queries than other CEP frameworks.
IBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka | Kai Wähner
The Rise of Data in Motion powered by Event Streaming - Use Cases and Architecture for IBM Cloud Pak with Confluent Platform. Including screenshots of the live demo (integration between IBM and Kafka via Confluent Platform and Kafka Connect connectors).
Learn about the integration capabilities of IBM Cloud Pak for Integration, now with the industry’s leading event streaming platform from Confluent Platform powered by Apache Kafka.
Financial Event Sourcing at Enterprise Scale | confluent
For years, Rabobank has been actively investing in becoming a real-time, event-driven bank. If you are familiar with banking processes, you will understand that this is not simple. Many banking processes are implemented as batch jobs on not-so-commodity hardware, meaning that any migration effort is immense.
*Find out how Rabobank redesigned Rabo Alerts while continuing to provide a robust and stable alert system for its existing user base
*Learn how the project team managed to achieve a balance between the need to decentralise activity while not losing control
*Understand how Rabobank re-invented a reliable service to meet modern customer expectations
Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or ... | confluent
Apache Kafka can act as both an enemy and a friend to traditional middleware like message queues, ETL tools, and enterprise service buses. As an enemy, Kafka replaces many of the individual components and provides a single scalable platform for messaging, storage, and processing. However, Kafka can also integrate with traditional middleware as a friend through connectors and client APIs, allowing certain use cases to still leverage existing tools. In complex environments with both new and legacy systems, Kafka acts as a "frenemy" - replacing some functions but integrating with other existing technologies to provide a bridge to new architectures.
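The "friend" side of this relationship is typically wired up declaratively with Kafka Connect. As a minimal sketch, a Confluent JDBC source connector configuration that streams rows from a legacy database into Kafka; the connection URL, column, and topic prefix are hypothetical:

```json
{
  "name": "legacy-db-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://legacy-host:5432/orders",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "legacy-",
    "tasks.max": "1"
  }
}
```

Posted to the Kafka Connect REST API, a configuration like this lets the existing system keep running unchanged while its data flows into Kafka topics.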
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization | Kai Wähner
Data in Motion powered by the Apache Kafka ecosystem for Situational Awareness, Threat Detection, Forensics, Zero Trust Zones and Air-Gapped Environments.
Agenda:
1) Cybersecurity in 202X
2) Data in Motion as Cybersecurity Backbone
3) Situational Awareness
4) Threat Intelligence
5) Forensics
6) Air-Gapped and Zero Trust Environments
7) SIEM / SOAR Modernization
More details in the "Kafka for Cybersecurity" blog series:
https://www.kai-waehner.de/blog/2021/07/02/kafka-cybersecurity-siem-soar-part-1-of-6-data-in-motion-as-backbone/
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin... | confluent
This document summarizes Chris D'Agostino's presentation on enabling real-time event processing at scale. The presentation covers event-based architecture, self-service streaming and data governance, and complex event processing (CEP) and IFTTT capabilities. It discusses goals like data democracy, shared infrastructure, and making data tools and platforms user-centered. It also provides examples of how their streaming data platform supports features like stream design and management, data validation, and automatic data enrichment.
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a... | confluent
This document provides an overview of a webinar on driving business transformation with real-time analytics using Apache Kafka and KSQL. The webinar features presentations from Nick Dearden of Confluent, John Thuma of Arcadia Data, and Thomas Clarke of RCG Global Services. It discusses how Kafka and KSQL can be used together to enable real-time data processing and analytics. It also highlights how Arcadia Data provides a BI tool for KSQL that allows for easy drag-and-drop dashboarding on streaming data. RCG then discusses its approach to digital transformation and data architecture services. The webinar concludes with a Q&A section.
Apache Kafka in Financial Services - Use Cases and Architectures | Kai Wähner
The Rise of Event Streaming in Financial Services - Use Cases, Architectures and Examples powered by Apache Kafka.
The New FinServ Enterprise Reality: Every company is a software company. Innovate OR be Disrupted. Learn how Event Streaming with Apache Kafka and its ecosystem help...
More details:
https://www.kai-waehner.de/apache-kafka-financial-services-industry-banking-finserv-payment-fraud-middleware-messaging-transactions
https://www.kai-waehner.de/blog/2020/04/15/apache-kafka-machine-learning-banking-finance-industry/
https://www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/
Stream me to the Cloud (and back) with Confluent & MongoDB | confluent
In this online talk, we’ll explore how and why companies are leveraging Confluent and MongoDB to modernize their architecture and leverage the scalability of the cloud and the velocity of streaming. Based upon a sample retail business scenario, we will explain how changes in an on-premise database are streamed via the Confluent Cloud to MongoDB Atlas and back.
Confluent x imply: Build the last mile to value for data streaming applications | confluent
The document discusses how modern applications require real-time connectivity and instant reactions using data streams, as opposed to traditional batch processing with databases. It explains how Apache Kafka and stream processing with ksqlDB can act as the central nervous system to instantly connect data sources and sinks in real-time. The document also describes how Confluent Cloud provides a fully managed service for Apache Kafka deployments in public clouds.
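The "central nervous system" idea above usually takes the form of continuous queries over topics. A minimal ksqlDB sketch that turns a raw topic into a continuously updated aggregate; the `orders` stream, its columns, and the window size are hypothetical:

```sql
-- Expose a raw Kafka topic as a typed stream (illustrative schema)
CREATE STREAM orders (order_id VARCHAR, amount DOUBLE, region VARCHAR)
  WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON');

-- Continuously maintain hourly revenue per region as new events arrive
CREATE TABLE revenue_by_region AS
  SELECT region, SUM(amount) AS total
  FROM orders
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY region
  EMIT CHANGES;
```

Downstream applications can then subscribe to `revenue_by_region` instead of re-deriving the aggregate in batch.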
Event Mesh: The Architecture Layer That Will Power Your Digital Transformation | Solace
Event mesh is an architectural layer that routes events from producers to consumers in a flexible, reliable and governed manner, no matter where your apps are deployed. Crispin Clark, SVP Europe at Solace, and Harsh Jegadeesan, Head of Product Management Integration Platform at SAP, discuss in depth the evolution of the event mesh.
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20... | confluent
The document discusses event streaming architectures and use cases. It describes how event streaming is used across different industries for applications like inventory management, fraud detection, customer profiles, and more. It then discusses the key components of an event streaming platform, including processing event streams in real-time, publishing and subscribing to streams, and storing streams fault-tolerantly. Finally, it provides examples of common event streaming architectures like change data capture and integration with messaging systems.
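Change data capture, mentioned above as a common architecture, is often implemented with a log-based CDC connector such as Debezium. A minimal sketch of a Debezium MySQL source configuration; hostnames, credentials, and table names are hypothetical, and parameter names vary somewhat across Debezium versions:

```json
{
  "name": "inventory-cdc",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql-host",
    "database.port": "3306",
    "database.user": "cdc_user",
    "database.password": "********",
    "database.server.id": "184054",
    "topic.prefix": "inventory",
    "table.include.list": "shop.orders,shop.customers"
  }
}
```

Each committed row change in the listed tables becomes an event on a Kafka topic, which is what makes the CDC pattern a natural on-ramp to event streaming.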
Data reply sneak peek: real time decision engines | confluent
This document discusses real-time decision engines and how they can react to business events in real-time. It provides examples of how real-time decision engines work in different industries like telecommunications, banking, insurance, and media. Real-time decision engines integrate real-time data sources to understand customer context and trigger actions in response to events. They are built using a microservices approach and streaming data technologies. Examples of applications include real-time marketing, fraud detection, content recommendations, and enforcing business rules.
Originally presented by Jonathan Schabowsky at Kafka Summit 2020.
** About PubSub+ Event Portal for Apache Kafka **
You know and love Apache Kafka, but have you ever tried to visualize Kafka topology, or figure out who owns what event stream in a Kafka cluster? Your event-driven architecture has evolved, and your system has grown to the point where you’re feeling a bit… out of control.
You need a tool to discover your Kafka event streams, represent it in a graphical view, and make it easy to share and reuse events. Basically, you need an API portal, but for asynchronous, event-driven applications.
That is why we have developed PubSub+ Event Portal. This event management toolset makes it easy for you to discover, visualize, catalog and share your Apache Kafka event streams, including those from Confluent and Amazon MSK.
To learn more, visit: https://solace.com/products/portal/kafka/
Apache Kafka® Use Cases for Financial Services | confluent
Traditional systems were designed in an era that predates large-scale distributed systems. These systems often lack the ability to scale to meet the needs of the modern data-driven organisation. Adding to this is the accumulation of technologies and the explosion of data which can result in complex point-to-point integrations where data becomes siloed or separated across the enterprise.
The demand for fast results and rapid decision-making has driven financial institutions to adopt real-time event streaming and processing to stay on the competitive edge. Apache Kafka and the Confluent Platform are designed to solve the problems associated with traditional systems and provide a modern, distributed architecture with real-time data streaming capability. In addition, these technologies open up a range of use cases for Financial Services organisations, many of which will be explored in this talk.
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19 | confluent
Google Cloud Next ’19
Speakers:
Gaetan Castelein, Confluent Product Marketing
Kir Titievsky, Google Product Management
Confluent Cloud for Apache Kafka® was a session conducted at Google Cloud Next ’19 on the topic of how Confluent and Google are partnering to give you a complete event-streaming platform that extends Kafka with essential capabilities for developers and enterprises. Confluent is available as a fully managed, first class service on GCP, or can be deployed on-premises on Google Cloud Services Platform. Developers can deploy Confluent Cloud™ in minutes right from the Google Cloud Console to start building event-driven applications. Enterprises can build hybrid cloud streaming solutions with a common platform that spans from on-premises to GCP, streaming data to GCP to leverage best-of-breed services such as BigQuery and TensorFlow. Review this presentation to learn about Confluent and GCP services, and see how you can get started in just minutes with no upfront commitment.
How Apache Kafka helps to create Data Culture – How to Cross the Kafka Chasm | confluent
In this webinar we want to share our experience on how the Swiss Mobiliar, the biggest Swiss household insurance enterprise, introduced Kafka and led it to enterprise-wide adoption with the help of AGOORA.com.
Top 5 Event Streaming Use Cases for 2021 with Apache Kafka | Kai Wähner
Apache Kafka and Event Streaming are two of the most relevant buzzwords in tech these days. Ever wonder what the predicted TOP 5 Event Streaming Architectures and Use Cases for 2021 are? Check out the following presentation. Learn about edge deployments, hybrid and multi-cloud architectures, service mesh-based microservices, streaming machine learning, and cybersecurity.
On-demand video recording: https://videos.confluent.io/watch/XAjxV3j8hzwCcEKoZVErUJ
Using Kafka in Your Organization with Real-Time User Insights for a Customer ... | confluent
The document discusses using Apache Kafka to capture customer data across multiple systems for a healthcare organization. It describes implementing a Kafka event streaming pipeline to collect user interaction data from a member portal. This provided a single view of members across different systems to improve customer experience, operational efficiency, and adopt new technologies. The implementation was successful and prepared the organization to stream more customer data for analytics and better customer service.
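The pipeline described above starts by shaping each portal interaction into a consistent event before it is published. A minimal Python sketch of that serialization step; the field names, topic, and producer call in the comment are hypothetical:

```python
import json
from datetime import datetime, timezone

def build_member_event(member_id: str, action: str, source: str) -> bytes:
    """Serialize one portal interaction into a Kafka-ready JSON payload."""
    event = {
        "member_id": member_id,
        "action": action,    # e.g. "viewed_claims", "updated_address"
        "source": source,    # originating system, for the single member view
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(event).encode("utf-8")

# A real pipeline would hand this payload to a Kafka producer, e.g.
# producer.produce("member-interactions", key=member_id, value=payload)
payload = build_member_event("m-1001", "viewed_claims", "member-portal")
```

Keying events by member ID is what lets consumers assemble the single view of a member across systems.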
Building a Secure, Tamper-Proof & Scalable Blockchain on Top of Apache Kafka ... | confluent
Apache Kafka is an open source event streaming platform. It is often used to complement or even replace existing middleware to integrate applications and build microservice architectures. Apache Kafka is already used in various projects in almost every larger company today. Understood, battle-tested, highly scalable, reliable, real-time.
Blockchain is a different story. This technology is frequently in the news, especially in connection with cryptocurrencies like Bitcoin. But what is the added value for software architectures? Is blockchain just hype that adds complexity? Or will it be used by everybody in the future, like a web browser or mobile app today? And how does it relate to an integration architecture and event streaming platform?
This session explores use cases for blockchains and discusses different alternatives such as Hyperledger, Ethereum and a Kafka-native tamper-proof blockchain implementation. Different architectures are discussed to understand when blockchain really adds value and how it can be combined with the Apache Kafka ecosystem to integrate blockchain with the rest of the enterprise architecture to build a highly scalable and reliable event streaming infrastructure.
Speakers:
Kai Waehner, Technology Evangelist, Confluent
Stephen Reed, CTO, Co-Founder, AiB
Enabling Smarter Cities and Connected Vehicles with an Event Streaming Platfo... | Kai Wähner
Many cities are investing in technologies to transform themselves into smart-city environments in which data collection and analysis are used to manage assets and resources efficiently. Modern technology can help connect the right data, at the right time, to the right people, processes and systems. Innovations around smart cities and the Internet of Things give cities the ability to improve road safety, unify and manage transportation systems and traffic, save energy and provide a better experience for residents.
By utilizing an event streaming platform, like Confluent, cities are able to process data in real-time from thousands of sources, such as sensors. By aggregating that data and analyzing real-time data streams, more informed decisions can be made and fine-tuned operations developed for a positive impact on everyday challenges faced by cities.
Learn how to:
-Overcome challenges for building a smarter city
-Build a real time infrastructure to correlate relevant events
-Connect thousands of devices, machines, and people
-Leverage open source and fully managed solutions from the Apache Kafka ecosystem
APAC Confluent Consumer Data Right the Lowdown and the Lessons | confluent
The document discusses the Consumer Data Right (CDR) framework in Australia and lessons that can be learned from it. It provides an overview of the CDR, noting that it applies to existing consumer data and requires data holders to share data with accredited third parties when authorized by consumers. It also notes the CDR will apply across multiple sectors, starting with banking, energy, and telecommunications. The document also discusses some of the technical challenges of implementing CDR, such as maintaining a single customer view, tracking accredited parties, and ensuring data privacy and governance. It provides examples of how streaming data platforms like Apache Kafka can power use cases enabled by CDR, such as customer and product 360-degree views, payments traceability, and open banking.
Understanding Event Streaming in Under 10 Minutes | confluent
To increase business velocity, strengthen competitiveness through new products and services, and react quickly to sudden market shifts, data and event streams must be shared, processed, and analyzed in real time. Apache Kafka has established itself here as the industry standard for event streaming. Whether Connected Car, Industry 4.0, or Customer 360: all of these forward-looking topics require fast communication, efficient networking, and the processing of enormous data volumes in real time.
Kappa vs. Lambda Architectures and Technology Comparison | Kai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM, Red Hat, Apache Flink, Apache Pulsar, AWS Kinesis, Amazon MSK, Azure Event Hubs, Google Pub/Sub, and more.
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
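The Kappa idea described above, one code path for both replayed history and live events, can be sketched in a few lines of Python; the event shape and the running-total logic are illustrative only:

```python
from typing import Dict, Iterable

def process(events: Iterable[Dict], state: Dict[str, float]) -> Dict[str, float]:
    """Single processing path: fold events into per-user running totals."""
    for e in events:
        state[e["user"]] = state.get(e["user"], 0.0) + e["amount"]
    return state

# Kappa: "historical" data is just the log replayed from offset 0 ...
historical = [{"user": "a", "amount": 10.0}, {"user": "b", "amount": 5.0}]
# ... and live traffic then flows through the very same code path.
live = [{"user": "a", "amount": 2.5}]

state: Dict[str, float] = {}
process(historical, state)   # replay of the log (the "batch" part)
process(live, state)         # real-time continuation
```

Because replay and live processing share one function, there is no separate batch layer to keep in sync, which is the core argument against Lambda.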
The rise of data in motion in the insurance industry is visible across all lines of business including life, healthcare, travel, vehicle, and others. Apache Kafka changes how enterprises rethink data. This blog post explores use cases and architectures for event streaming. Real-world examples from Generali, Centene, Humana, and Tesla show innovative insurance-related data integration and stream processing in real-time.
Event-Based Business Architecture: Orchestrating Enterprise Communications | confluent
(Gary Samuelson, GarySamuelson) Kafka Summit SF 2018
A business-oriented view, illustrating both process models and in-flight task progress, is critical to understanding organizational health, efficiency and alignment to strategic goals. The intent of this talk is to illustrate the real-time relationship between Kafka-managed events (event driven) and business architecture via actionable models (real-time analytics).
Takeaways:
-Understand how business views technology in terms of capabilities aligned to strategy.
-Introduce process model and performance views into an event-oriented dashboard. This view illustrates the organization in terms of collaborating human and automated services.
-Illustrate how system architecture dovetails into business goals with an aligned business/IT architectures.
Bridge Your Kafka Streams to Azure Webinar | confluent
With a fully managed Apache Kafka(R) as-a-service on Microsoft Azure, businesses can focus on building applications and not managing clusters. Build a persistent bridge from on-premises data systems to the cloud with a hybrid Kafka service or stream across public clouds for multi-cloud data pipelines.
In this session for business and technical data leaders, you can learn about powering business applications with the managed Kafka service that streams data into Azure SQL Data Warehouse, Cosmos DB, Azure Data Lake Storage and Azure Blob Storage.
This document discusses measuring the business value of using Kafka to power event-driven applications. It begins by explaining why measuring value is important for ROI, stakeholder commitment, and benefits realization. It then outlines three real-world examples of using Kafka: resolving ATM disputes faster resulted in 50% less agent time and 75% fewer avoidable payments; a customer 360 application improved targeted offers for increased revenue and better inventory management; and a fraud prevention system enabled real-time detection and prevention, decreasing insurance premiums. The document concludes by recommending establishing credibility through sound assumptions, defining what is actually being measured, and accepting that value is subjective and changing over time.
Event: https://www.meetup.com/de-DE/Vienna-Kafka-meetup/events/262314643/
Speaker: Patrik Kleindl (patrik.kleindl@bearingpoint.com)
Slides of the introduction to Apache Kafka and some popular use cases.
Slides were provided by Confluent (confluent.io)
Workshop 1. Architecting Innovative Graph Applications
Join this hands-on workshop for beginners led by Neo4j experts guiding you to systematically uncover contextual intelligence. Using a real-life dataset we will build step-by-step a graph solution; from building the graph data model to running queries and data visualization. The approach will be applicable across multiple use cases and industries.
This document summarizes a final year internship presentation on detecting fake news using machine learning. The intern worked at Syslog Technologies on a project to build a model that can classify news articles as real or fake. The methodology involved collecting a dataset of real and fake news, preprocessing the data, training classification algorithms, and evaluating the models' performance. The system architecture included feature extraction, training/testing datasets, applying algorithms like random forest and Naive Bayes, and selecting the best model based on accuracy metrics. The presentation covered technologies used like Python, OpenCV, Anaconda, and modeling tools like Jupyter Notebook and Spyder.
This session explores use cases for blockchains and discusses different alternatives such as Hyperledger, Ethereum and a Kafka-native tamper-proof blockchain implementation. Different architectures are discussed to understand when blockchain really adds value and how it can be combined with the Apache Kafka ecosystem to integrate blockchain with the rest of the enterprise architecture to build a highly scalable and reliable event streaming infrastructure.
Speakers:
Kai Waehner, Technology Evangelist, Confluent
Stephen Reed, CTO, Co-Founder, AiB
Enabling Smarter Cities and Connected Vehicles with an Event Streaming Platfo...Kai Wähner
Many cities are investing in technologies to transform themselves into smart city environments in which data collection and analysis is utilized to manage assets and resources efficiently. Modern technology can help connect the right data, at the right time, to the right people, processes and systems. Innovations around smart cities and the Internet of Things give cities the ability to improve motor safety, unify and manage transportation systems and traffic, save energy and provide a better experience for residents.
By utilizing an event streaming platform, like Confluent, cities are able to process data in real-time from thousands of sources, such as sensors. By aggregating that data and analyzing real-time data streams, more informed decisions can be made and fine-tuned operations developed for a positive impact on everyday challenges faced by cities.
Learn how to:
-Overcome challenges for building a smarter city
-Build a real time infrastructure to correlate relevant events
-Connect thousands of devices, machines, and people
-Leverage open source and fully managed solutions from the Apache Kafka ecosystem
APAC Confluent Consumer Data Right the Lowdown and the Lessonsconfluent
The document discusses the Consumer Data Right (CDR) framework in Australia and lessons that can be learned from it. It provides an overview of the CDR, including that it applies to existing consumer data and requires data holders to share data with accredited third parties if authorized by consumers. It also notes the CDR will apply across multiple sectors starting with banking, energy, and telecommunications. The document also discusses some of the technical challenges of implementing CDR like maintaining a single customer view, tracking accredited parties, and ensuring data privacy and governance. It provides examples of how streaming data platforms like Apache Kafka can be used to power use cases enabled by CDR like customer and product 360-degree views, payments traceability, and open banking.
Understanding Event Streaming in Under 10 Minutes – confluent
To increase business velocity, strengthen competitiveness through new products and services, and react quickly to sudden market changes, data and event streams must be shared, processed, and analysed in real time. Apache Kafka has established itself as the industry standard for event streaming. Whether Connected Car, Industry 4.0, or Customer 360 – all of these forward-looking topics require fast communication, efficient networking, and real-time processing of enormous volumes of data.
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM, Red Hat, Apache Flink, Apache Pulsar, AWS Kinesis, Amazon MSK, Azure Event Hubs, Google Pub/Sub, and more.
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
The rise of data in motion in the insurance industry is visible across all lines of business including life, healthcare, travel, vehicle, and others. Apache Kafka changes how enterprises rethink data. This blog post explores use cases and architectures for event streaming. Real-world examples from Generali, Centene, Humana, and Tesla show innovative insurance-related data integration and stream processing in real-time.
Event-Based Business Architecture: Orchestrating Enterprise Communications confluent
(Gary Samuelson) Kafka Summit SF 2018
A business-oriented view, illustrating both process models and in-flight task progress, is critical to understanding organizational health, efficiency and alignment to strategic goals. The intent of this talk is to illustrate the real-time relationship between Kafka-managed events (event driven) and business architecture via actionable models (real-time analytics).
Takeaways:
-Understand how business views technology in terms of capabilities aligned to strategy.
-Introduce process model and performance views into an event-oriented dashboard. This view illustrates the organization in terms of collaborating human and automated services.
-Illustrate how system architecture dovetails into business goals through aligned business/IT architectures.
Bridge Your Kafka Streams to Azure Webinarconfluent
With a fully managed Apache Kafka(R) as-a-service on Microsoft Azure, businesses can focus on building applications and not managing clusters. Build a persistent bridge from on-premises data systems to the cloud with a hybrid Kafka service or stream across public clouds for multi-cloud data pipelines.
In this session for business and technical data leaders, you can learn about powering business applications with the managed Kafka service that streams data into Azure SQL Data Warehouse, Cosmos DB, Azure Data Lake Storage and Azure Blob Storage.
This document discusses measuring the business value of using Kafka to power event-driven applications. It begins by explaining why measuring value is important for ROI, stakeholder commitment, and benefits realization. It then outlines three real-world examples of using Kafka: resolving ATM disputes faster resulted in 50% less agent time and 75% fewer avoidable payments; a customer 360 application improved targeted offers for increased revenue and better inventory management; and a fraud prevention system enabled real-time detection and prevention, decreasing insurance premiums. The document concludes by recommending establishing credibility through sound assumptions, defining what is actually being measured, and accepting that value is subjective and changing over time.
Event: https://www.meetup.com/de-DE/Vienna-Kafka-meetup/events/262314643/
Speaker: Patrik Kleindl (patrik.kleindl@bearingpoint.com)
Slides of the introduction to Apache Kafka and some popular use cases.
Slides were provided by Confluent (confluent.io)
Workshop 1. Architecting Innovative Graph Applications
Join this hands-on workshop for beginners, led by Neo4j experts who will guide you to systematically uncover contextual intelligence. Using a real-life dataset, we will build a graph solution step by step, from building the graph data model to running queries and data visualization. The approach will be applicable across multiple use cases and industries.
This document summarizes a final year internship presentation on detecting fake news using machine learning. The intern worked at Syslog Technologies on a project to build a model that can classify news articles as real or fake. The methodology involved collecting a dataset of real and fake news, preprocessing the data, training classification algorithms, and evaluating the models' performance. The system architecture included feature extraction, training/testing datasets, applying algorithms like random forest and Naive Bayes, and selecting the best model based on accuracy metrics. The presentation covered technologies used like Python, OpenCV, Anaconda, and modeling tools like Jupyter Notebook and Spyder.
The document discusses career progression and development at Ticketmaster. It introduces a competency model to map skills and roles across the software development lifecycle. The model is based on the IEEE Software Engineering Competency Model framework and includes technical, professional, and behavioral skill levels. Career mapping helps individuals understand where they are, where they want to go, and the steps needed to get there with support from available resources.
ATMOSPHERE was invited to be a speaker at Think Milano event, on 6th June from 14.30 to 17.30, to join a panel discussion called “L’infrastruttura cloud ready protagonista del future” on how cloud infrastructures are important for different market sectors.
Kislaya Kumar Singh provides a summary of his career objective, education, experience, technical skills, projects, and work experience. He has a Post Graduate Program in Business Analytics and Business Intelligence in progress and a Bachelor's degree in Information Science. His experience includes data modeling, machine learning, data analysis tools like Python and R, and statistical modeling. Some of his projects include analyzing cricket data, factor analysis of cereal brands, clustering engineering colleges, and predictive modeling of employee attrition. He has worked as a Technical Lead for Western Union and as a Software Engineer for Symantec, where his responsibilities included API testing, incident management, and executing various security and code coverage tools.
This presentation outlines the outcomes of 3D printing on entrepreneurship.
The evolution of 3D printers, The market opportunity, and the application in industry.
Software Architecture Evaluation: A Systematic Mapping StudySofia Ouhbi
Sofia Ouhbi presented a systematic mapping study on software architecture evaluation approaches at the 13th International Conference on Evaluation of Novel Approaches to Software Engineering in Funchal, Madeira, Portugal. The study analyzed 60 papers on software architecture evaluation to identify key publication sources, trends over time, research types, validation methods, evaluation approaches, quality models used, and description models. The results showed that journals and conferences are the main publication sources, interest peaked in the late 2000s and has declined recently, case studies are common validation methods, and quality attributes like performance and maintainability are frequently evaluated using models like the ISO 9126.
IRJET- Search Improvement using Digital Thread in Data AnalyticsIRJET Journal
This document discusses the use of digital thread in data analytics to improve search and provide end-to-end visibility across product lifecycles. Digital thread is a communication system that connects manufacturing process elements and provides a complete view of each element throughout the lifecycle. It allows sharing of information across organizations and suppliers. Digital thread brings quality gains by managing large amounts of data and complex supply chains. It helps enterprises quickly redesign products and meet timelines while maintaining visibility of each component's journey. The document proposes using a Neo4j graph database hosted on AWS cloud to implement a digital thread that links product data. This would provide security, performance, and analytics benefits across the overall manufacturing process.
Transformacion del Negocio Financiero por medio de Tecnologias CloudRaul Goycoolea Seoane
This document discusses how cloud technologies can transform businesses. It provides an overview of Xertica, a leading cloud consulting firm in Latin America. The document then discusses how various industries like financial services, retail, and manufacturing are using technologies like cloud platforms, machine learning, and analytics to improve areas such as customer service, risk management, and modernizing legacy infrastructure. Specific customer examples are provided for each area to illustrate the business benefits seen such as reduced costs, improved efficiencies and increased innovation.
This document discusses graph data science and Neo4j's capabilities. It describes how Neo4j can help simplify graph data science through its native graph database, graph data science library, and data visualization tool. Example use cases are also provided that demonstrate how Neo4j has helped companies with fraud detection, customer journey analysis, supply chain management, and patient outcomes.
Emerging engineering issues for building large scale AI systems By Srinivas P...Analytics India Magazine
The document discusses an online 6-month certificate program in artificial intelligence and deep learning from Manipal Prolearn. It provides awarding from MAHE, hands-on training using real-world data from different domains, and instruction from industry experts. The program teaches skills for developing end-to-end AI/ML systems and covers topics like data acquisition, modeling, evaluation, and deployment.
Harbor Research: 3D Printing Growth OpportunityHarbor Research
This document discusses 3D printing as a growth opportunity for smart systems. It provides an overview of the 3D printing market and key trends, including increasing speeds and decreasing costs of printers. While adoption has been limited by materials and other factors, the document outlines several innovators in the 3D printing space that are working to address pain points and drive further adoption through new technologies and business models. It also provides a breakdown of the 3D printing ecosystem and the various players involved.
I like to take up new roles, lead the team, and play with data. I have turned my day-to-day passion into a new role, and I am working with data sets to extract more and more information. I love to recognize patterns in data, and this is what I do very passionately.
This document discusses security issues related to moving from single cloud to multi-cloud environments. It first provides background on the increased use of cloud computing and the privacy and security concerns organizations have in using single cloud providers. It then discusses the trend toward multi-cloud/inter-cloud environments to address issues like availability and potential insider threats. The document examines research on security issues in single and multi-cloud environments and outlines the objective to automatically block attackers and securely compute data across clouds.
Borys Pratsiuk is the Head of R&D at an unnamed company. He has over 15 years of experience in engineering roles related to Android development, embedded systems, and solid state electronics. He holds a PhD in Solid State Electronics from Kiev Polytechnic Institute and has worked in both academic and industry roles in South Korea and Ukraine. The presentation discusses big data, analytics, artificial intelligence and machine learning applications across various industries. It provides examples of deep learning solutions developed for clients in areas like computer vision, natural language processing, predictive analytics and process automation. The presentation emphasizes Ciklum's full-service approach to developing and deploying deep learning solutions from data collection and modeling to deployment and ongoing support.
EXPLORING POSSIBILITIES
Delivering Business Outcomes Through Innovation
Showcasing exemplary stories of success where channel partners have gone to great lengths to implement innovative solutions. Acclaiming those partners who have risen to the challenges of the digital era and transformed their business to a solutions offering. Inspiring channel businesses to become value-added providers and trusted allies to their customers. Stories that made a Difference.
This document is a resume for Vignesh Thulasi Dass summarizing his education and experience. He has a Master's degree in Data Analytics from Northeastern University and a Bachelor's degree in Computer Science. His skills include programming languages like R, Python, and SQL as well as tools like Hadoop, Tableau, and PowerBI. He has work experience as a Software Developer at Just Dial India where he performed website and data analysis. His academic projects include predicting Airbnb user bookings using R and Tableau and analyzing household energy consumption using PySpark and PowerBI. He also has leadership experience co-founding an NGO and being a member of clubs at Northeastern and in Bangalore.
Gabriel Lucaciu presented on future technology trends and how Progress keeps pace. He discussed emerging technologies like chatbots, artificial intelligence, analytics, mobile backends, augmented reality, blockchain, and how Progress offers solutions like NativeChat, DataRPM, Kinvey, and Consensus to support these trends. Progress aims to help organizations easily adopt new technologies through products that integrate features such as predictive modeling, serverless platforms, and cross-platform development.
This document discusses the future of enterprise search in 2025. It notes that digital transformation, changing customer expectations, and the increasing role of AI will drive the need for adaptable and flexible workflows, human-machine augmentation, and an emphasis on innovation and creativity. By 2025, enterprise search is predicted to provide a unified interface for accessing information, connecting insights to actions, and measuring outcomes. Key technologies enabling this vision include natural language processing, machine learning, and a modular architecture providing scalability, security and a user-focused experience.
Neo4j GraphDay Seattle- Sept19- Connected data imperativeNeo4j
The document outlines an agenda for a Neo4j Graph Day event including sessions on connected data, graphs and artificial intelligence, a lunch break, Neo4j training, and a reception. Key topics include Neo4j in production environments, its role in boosting artificial intelligence, and training opportunities.
The Ipsos - AI - Monitor 2024 Report.pdfSocial Samosa
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"sameer shah
Embark on a captivating financial journey with "Financial Odyssey," our hackathon project. Delve deep into the past performance of two companies as we employ an array of financial statement analysis techniques. From ratio analysis to trend analysis, uncover insights crucial for informed decision-making in the dynamic world of finance.
Analysis insight about a Flyball dog competition team's performanceroli9797
Insights from my analysis of a Flyball dog competition team's performance last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataKiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
3. We want to have a positive impact on the world.
We believe technology is the engine of change. To embrace the change and its possibilities, we believe that we have to stay at the edge of knowledge. More: we believe creating new knowledge is the best way to see further and to lead the path to tomorrow.
So, the positive impact on the world is led by a sound and cutting-edge knowledge of technology.
Vision
Impacting society with technologies
4. OUR MISSION
CRAFT: We develop technologies for clients and we implement the solutions they need. We commit to the results & deliveries.
SERVE: We bring our cross-domain knowledge through our experts to deliver more value to our clients. On request of the client, we help them define strategic opportunities.
EXPLORE: We work on the most promising IT technologies to share our R&D knowledge & build dedicated training sessions delivered by renowned experts.
8. A PRIVATE R&D CENTER
36 SCIENTIFIC PUBLICATIONS SINCE 2008
● 4th IEEE Workshop on Real-time and Stream Analytics in Big Data & Stream Data Management, CA, USA
● Anas Al Bassit, Sabri Skhiri, LEAD: A Formal Specification For Event Processing, in 13th ACM International Conference on Distributed and Event-Based Systems, 2019
● Katsiaryna Krasnashchok, Aymen Cherif, Coherence Regularization for Neural Topic Models, in 16th International Symposium on Neural Networks 2019 (ISNN 2019)
● Aymen Cherif, Salim Jouili, Pairwise Image Ranking with Deep Comparative Network, ESANN 2018: ES2018-200
● Qile Zhu, Xiaolin Li, Ana Conesa, Cécile Pereira, GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text, Bioinformatics, May 2018
● Katsiaryna Krasnashchok, Salim Jouili, Improving Topic Quality by Promoting Named Entities in Topic Modeling, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Vol. 2, 2018
● Amine Ghrab, Oscar Romero, Salim Jouili, Sabri Skhiri, Graph BI & Analytics: Current State and Future Challenges, DaWaK 2018: 3-18
● De Visscher, I., Stempfel, G., Rooseleer, F. & Treve, V., Data mining and Machine Learning techniques supporting Time-Based Separation concept deployment, in 37th Digital Avionics Systems Conference (DASC), pp. 594-603, London, UK, September 23-27, 2018
R&D expertise
12. Our Research Tracks
Jericho and Asgard
[Timeline: JERICHO runs from 2017 to 2019; ASGARD from 2019 to 2022.]
3 Tracks:
ELI: Elastic & Large Indexing
ECCO: Easy Cluster Configuration Optimization
LEAD: Live Event Analysis & Detection
13. JERICHO: ELI
ELI's GOAL: Extract relevant information from data lakes containing large volumes of heterogeneous data.
Topics: Topic Modeling, Embeddings Improvements, Learning to Rank.
● Aymen Cherif, Salim Jouili, Image retrieval and ranking through Deep Comparative Neural Networks, ESANN 2018: ES2018-200
● Katsiaryna Krasnashchok, Salim Jouili, Improving Topic Quality by Promoting Named Entities in Topic Modeling, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2018
● Luca De Petris, Aymen Cherif, LSTM Siamese Network for Question Answering System, SLSP 2018
● Katsiaryna Krasnashchok, Salim Jouili, Hierarchical Attention-Based Neural Topic Model, SLSP 2018
● Katsiaryna Krasnashchok, Aymen Cherif, Coherence Regularization for Neural Topic Models, 16th International Symposium on Neural Networks 2019 (ISNN 2019)
14. JERICHO: ECCO
ECCO's GOAL: Assist users in the configuration & optimisation of their Big Data frameworks.
Select parameters for performance prediction and performance optimization. Given the infrastructure constraints, a specific application (job), and a specific input dataset, we build an optimiser that searches for the best parameters.
● Muaz Twaty, Amine Ghrab, Sabri Skhiri, GraphOpt: a Framework for Automatic Parameters Tuning of Graph Processing Frameworks, IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD 2019)
15. JERICHO: LEAD
LEAD's GOAL: Allow users to benefit from stream analytics in a large variety of domains.
[Diagram: a context-aware application detects patterns by joining Stream 1 and Stream 2, followed by Stream 3.]
● Anas Al Bassit, Sabri Skhiri, LEAD: A Formal Specification For Event Processing, 13th ACM International Conference on Distributed and Event-Based Systems, 2019
16. Our Research Tracks
Jericho and Asgard
[Timeline: JERICHO runs from 2017 to 2019; ASGARD from 2019 to 2022.]
4 Tracks:
MJOLNIR
RUNE
YGGDRASIL
VAGGDELMIR
19. ASGARD program
AI & ML in industries
[Diagram: "THE NEW WALL" standing between data and business needs, with "CONNECT" bridging the two.]
21. THE COST OF COMPLEXITY
40% of business initiatives do not achieve their objectives because of the quality/governance of the data.
The minimum cost of an AI/Big Data use case: 500,000 to 1 million.
700,000 positions are waiting to be filled.
GDPR compliance costs: $1 to $5 million.
ASGARD: what kind of barrier?
70% of business leaders are worried about the lack of data scientists in the next 2 years.
24. Introduction
What is Complex Event Processing?
Systems that are able to detect interesting situations by correlating events from different streams, transforming and aggregating them, and then generating actions are referred to as CEP engines.
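As a minimal illustration of that definition, here is a toy engine that correlates two event types from one merged stream and emits an action; the event types, the rule, and all names are hypothetical, not LEAD syntax:

```python
from dataclasses import dataclass

@dataclass
class Event:
    type: str   # e.g. "install", "access"
    user: str
    time: int   # event timestamp

def correlate(events, window=3):
    """Toy CEP rule: emit an action when an 'install' is followed
    by an 'access' from the same user within `window` time units."""
    installs = {}                          # user -> install time
    for ev in sorted(events, key=lambda e: e.time):
        if ev.type == "install":
            installs[ev.user] = ev.time
        elif ev.type == "access":
            t = installs.get(ev.user)
            if t is not None and ev.time - t <= window:
                yield ("notify", ev.user)  # the generated action

events = [Event("install", "u1", 0), Event("access", "u1", 2)]
print(list(correlate(events)))  # [('notify', 'u1')]
```

Real engines keep such state incrementally over unbounded streams rather than sorting a finite list, which is exactly where the performance and state-management challenges below come from.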
25. Introduction
What is Complex Event Processing?
Example: "A Manchester fan is moving far from home to see a match."
"Fan" fact (location & context): the user was localised more than x times in the last 3 months at Manchester events.
"Fan moving": now there is a Manchester match at Chelsea, and a Manchester fan is located there.
30. CEP Challenges
Technical:
● Performance (throughput & latency)
● Scalability
● State management
Logical:
● Ambiguous semantics (absence of formalisms and of selection & consumption policies)
● Lack of expressiveness and user-friendliness
● Missing operators (negations, sequences, repetitions, etc.)
31. Motivation
Product Roll-up Tracking
Five streams of events:
● installations,
● accesses,
● artifacts bought,
● connects,
● and shares.
How to identify good customers?
How to measure the success of the application in real time?
32. Motivation
Product Roll-up Tracking
We assume the following four actions for each user and game, within the first 3 days from installation:
1. Success (S): ≥5, ≥2, ≥2
2. Middle-success & Leaving (L): ≥3 and ≤5, 0, 0, and the user did not connect within 2 days after the last access
3. Middle-success (M): ≥3, and not (S) nor (L)
4. Failure (F): ≤2, 0, 0
(Streams: installations, accesses, artifacts bought, connects, and shares.)
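A sketch of these four rules as plain predicates, assuming (our reading of the slide) that the three numbers in each rule constrain the counts of accesses, artifacts bought, and shares, in that order:

```python
def classify(accesses, artifacts, shares, connected_within_2d):
    """Toy classifier for the roll-up categories. The mapping of the
    triple to (accesses, artifacts bought, shares) is our assumption;
    `connected_within_2d` stands for "the user connected within
    2 days after the last access"."""
    if accesses >= 5 and artifacts >= 2 and shares >= 2:
        return "S"   # Success
    if 3 <= accesses <= 5 and artifacts == 0 and shares == 0 \
            and not connected_within_2d:
        return "L"   # Middle-success & Leaving
    if accesses >= 3:
        return "M"   # Middle-success: >=3 accesses, not S nor L
    if accesses <= 2 and artifacts == 0 and shares == 0:
        return "F"   # Failure
    return None      # not covered by the four rules

print(classify(6, 2, 3, True))   # S
print(classify(4, 0, 0, False))  # L
print(classify(4, 1, 0, True))   # M
print(classify(1, 0, 0, False))  # F
```

Note how M is defined negatively ("not S nor L"), which is why the rule order matters here and why, as the next slide argues, the patterns are inter-dependent.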
33. Motivation
Product Roll-up Tracking
There is no CEP framework capable of formulating this problem in fewer than four queries, although the patterns are similar to each other and have inter-dependencies.
34. Contributions
1. A pattern algebra that extends the common set of CEP operators and defines them formally using TRIO [1, 2], a logic-based specification language augmented with temporal features.
2. A rule grammar that, using our pattern algebra, allows users to obtain different kinds of actions depending on the characteristics of a matched pattern.
3. A novel logical execution plan based on a combination of timed colored Petri nets with aging tokens [3] and prioritized Petri nets [4], which we believe will facilitate its future deployment.
37. Pattern Model
Event Representation & Formal Definitions
An event is written as, e.g., Access (GID: 123, UID: 321): 999, where Access is the event type, GID and UID are attributes with values 123 and 321, and 999 is the event time.
[Formulas: formal definitions of the Sequence operator and the Repetition operator]
38. Pattern Model
LEAD Operators
Basic Operators:
● Renaming
● Filtering
Core Operators:
● Conjunction
● Disjunction
● Negation
● Sequence
● Repetition
● Subcontext
Temporal Constraints:
● Within
● Wait
Selection & Consumption Policies:
● First
● Last
● Adjacent
● Every
● All
● All … Consume
● Repetition Max
● Repetition Min
39. Pattern Model
Context and Sub-context
[Timeline: the context opens at "installed" and closes at "installed + 3 days" or at the sixth access (ac::6), whichever comes first; accesses ac::1 and ac::2 fall inside it]
Middle-success & Leaving (L)
● 3 ≤ accesses ≤ 5
● The user did not connect within 2 days after the last access
40. Pattern Model
Context and Sub-context
[Timeline: after the third access ac::3, a sub-context opens and closes at ac::3 + 2 days, inside the context that ends at "installed + 3 days" or ac::6]
Middle-success & Leaving (L)
● 3 ≤ accesses ≤ 5
● The user did not connect within 2 days after the last access
41. Pattern Model
Context and Sub-context
[Timeline: when a new access ac::4 arrives, the sub-context slides forward and now ends at ac::4 + 2 days]
Middle-success & Leaving (L)
● 3 ≤ accesses ≤ 5
● The user did not connect within 2 days after the last access
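A minimal sketch of the context/sub-context check for (L), assuming the context closes at installed + 3 days or at the sixth access (whichever comes first) and the sub-context is a 2-day window after the last in-context access; the function and parameter names are hypothetical, not LEAD syntax:

```python
from datetime import datetime, timedelta

def leaving_matches(installed, access_times, connect_times):
    """Check the (L) pattern for one user/game. Illustrative only."""
    context_end = installed + timedelta(days=3)
    in_ctx = sorted(t for t in access_times if installed <= t < context_end)
    if len(in_ctx) >= 6:          # ac::6 closes the context early
        in_ctx = in_ctx[:5]
    if not 3 <= len(in_ctx) <= 5:
        return False
    # sub-context: 2 days after the last in-context access
    sub_end = in_ctx[-1] + timedelta(days=2)
    return not any(in_ctx[-1] <= t < sub_end for t in connect_times)
```

The sliding behaviour of slides 40-41 corresponds to recomputing `sub_end` whenever a newer access arrives inside the context.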
45. Let’s come back
Product Roll-up Tracking
No existing CEP framework can formulate this problem with fewer than four queries, even though the patterns are similar to each other and have inter-dependencies:
1. Success (S): ≥5, ≥2, ≥2
2. Middle-success & Leaving (L): 3-5, 0, 0, and the user did not connect within 2 days after the last access
3. Middle-success (M): ≥3 accesses, and not (S) nor (L)
4. Failure (F): ≤2, 0, 0
46. Let’s come back
Product Roll-up Tracking
[Diagram: within the context from "installed" to "installed + 3 days || ac::6", the access count splits the outcomes: ≤2 leads to F (≤2, 0, 0); ≥3 leads to S (≥5, ≥2, ≥2), to L (3-5, 0, 0, no connect within 2 days after the last access), or to M (≥3, not (S) nor (L))]
56. LOGICAL EXECUTION PLAN
Why Petri Nets?
● Concurrency & synchronization
● Simple building blocks: places, transitions, edges, and tokens
● Probabilistic CEP
57. LOGICAL EXECUTION PLAN
Petri Net
A Petri net consists of places, transitions, and arcs. Arcs run from a place to a transition or vice versa, never between places or between transitions. (https://en.wikipedia.org/wiki/Petri_net)
Coloured Petri nets allow tokens to have a data value attached to them; this attached data value is called the token colour.
Age can be used to invalidate tokens over time.
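The firing rule described above can be sketched as follows, for a plain place/transition net without colours or ages (illustrative only):

```python
def fire(marking, inputs, outputs):
    """Fire one transition: it is enabled when every input place holds
    at least one token; firing consumes one token from each input place
    and produces one on each output place.
    marking: dict mapping place name -> token count."""
    if any(marking.get(p, 0) < 1 for p in inputs):
        return None  # transition not enabled
    new = dict(marking)
    for p in inputs:
        new[p] -= 1
    for p in outputs:
        new[p] = new.get(p, 0) + 1
    return new
```

Coloured and aging variants extend this by attaching a data value and a birth time to each token instead of a bare count.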
58. LOGICAL EXECUTION PLAN
Aging-Tokens Prioritized Colored Petri Net (APCPN) Definition
N = (Σ, P, I, IC, OC, TT, π, IT, G, r₀)
● Σ: a finite set of types (colours), Σ ⊆ E[n], n ∈ ℕ;
● P ≡ [p₁, p₂, ..., p_|P|]: a finite set of places, which can be either stateless, i.e. they pass tokens between transitions, or stateful, i.e. they preserve tokens in ordered structures;
● I: a finite set of transitions; transitions are either temporal guards, consumers, or intermediate transitions;
● IC ⊆ (P × I): a finite non-empty set of input arcs;
● OC ⊆ (I × P): a finite non-empty set of output arcs;
● TT: P → Σ: a colour function, where each place has a single type that belongs to Σ, and all the tokens on that place must be of the same type;
● π: IC → ℕ₀: a priority function;
● IT: I → ℝ: a time-expression function;
● G: I → boolean: a guard function that maps each transition i ∈ I to a boolean expression over all its incoming arcs IC(i) ⊆ IC;
● r₀ ∈ R: an initial marking from the set of all markings R.
60. LOGICAL EXECUTION PLAN
LEAD Rules in APCPN
[Diagram: a LEAD rule in APCPN; every place has a single token type, the union of its input token types]
61. LOGICAL EXECUTION PLAN
LEAD Rules in APCPN
[Diagram: the source pattern and its compact version; priorities are assigned to the transitions]
62. LOGICAL EXECUTION PLAN
LEAD Rules in APCPN
Two forms of sequencing events:
● Within operator: A within 10s from B (B matched before A)
● Sequence operator: A followed by B
63. LOGICAL EXECUTION PLAN
LEAD Rules in APCPN
Two forms of sequencing events:
● Within operator: A within 10s from B (B matched before A): matches against the past, stateful
● Sequence operator: A followed by B: matches from now on, stateless
64. LOGICAL EXECUTION PLAN
LEAD Rules in APCPN
Two forms of sequencing events:
● Within operator: A within 10s from B (B matched before A)
● Sequence operator: A followed by B
Aging tokens are used for the temporal constraints.
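The aging-token idea can be sketched as a token that carries its birth time and is invalidated once it outlives the window, e.g. for "A within 10s from B" (class and field names are assumptions, not from the paper):

```python
class AgingToken:
    """A token with an attached value (its colour), a birth timestamp,
    and a time-to-live; it expires instead of needing an explicit
    cleanup query."""
    def __init__(self, value, born, ttl):
        self.value = value
        self.born = born
        self.ttl = ttl

    def alive(self, now):
        # The token is valid only while it has not outlived its window.
        return now - self.born <= self.ttl

# Token produced when B matches; A must arrive within 10 seconds.
b_token = AgingToken("B", born=0.0, ttl=10.0)
```

A temporal-guard transition would then simply drop tokens for which `alive(now)` is false.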
68. Streaming job
The basic foundation: the keyed stream processors & the function library
Keyed Stream
● Key-preserving stream (avoids re-shuffling)
● A kind of hard-coded optimization: ensures that Flink does not reshuffle keys between operators (the data is already on the task manager)
Stream functions
● Enable
● And
● Or
● ...
69. The activation challenge
How to feed back an activation on a stream based on the current event?
Sequence operator: A followed by B. The activation of B starts only when A is received.
[Diagram: operators on streams A and B; a control signal "You can start processing B" flows back to B's operator]
70. LEAD Stream Operator
The activation challenge
How to feed back an activation on a stream based on the current event?
Each transition in the ACPN becomes a function:
● The functions are stored in the keyed state store
● They can be activated/deactivated by listening to control events coming from the "Context"
[Diagram: streams A and B flow through operators; the control signal "You can start processing B" reaches Stream Function 2]
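The activation mechanism described above might be sketched as follows, with a hypothetical `on_control`/`process` API standing in for the real Flink stream functions:

```python
class StreamFunction:
    """A function that stays dormant until a control event from the
    Context enables it; while disabled, it drops its input events.
    API names are illustrative assumptions."""
    def __init__(self, name):
        self.name = name
        self.enabled = False

    def on_control(self, event):
        # Control events arrive as ("enable"/"disable", function name).
        if event == ("enable", self.name):
            self.enabled = True
        elif event == ("disable", self.name):
            self.enabled = False

    def process(self, event):
        # Pass events through only while enabled.
        return event if self.enabled else None
```

In the sequence example, receiving an A would make an upstream function emit `("enable", "B")` on the control stream.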
71. LEAD Stream Operator
Modeling ACPN choice
Functions are grouped into one operator only when a transition faces a choice.
[Diagram: a Petri-net prioritized choice (priorities H, L, LL) maps to a single LEAD stream operator containing Stream Functions 1-3; without a choice, an operator holds only Stream Function 2; both are fed by streams A and B]
72. The context - the control plane
An iterative stream simulating a pub/sub.
[Diagram: the Context broadcasts control events, which each LEAD stream operator (Stream Functions 1-3) joins with its data and processes]
73. Streaming job
What logic do the functions execute?
Sequence operator: A followed by B. The activation of B starts only when A is received.
Source Operator (stream A)
● A Source Function:
1. Get an A event
2. Enable "B Source Function"
Source Operator (stream B)
● On control events: enable "B Source Function"
● B Source Function:
1. Get a B event
Sequence Operator
● Sequence Function:
1. Get an event
2. If (event == A): store it in state
3. If (event == B and !state.isEmpty): produce a sequence event A --> B
● Garbage Collector Function:
1. Discard a collected event B
[Diagram legend: data flow vs. control flow]
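The Sequence Function logic above can be sketched as a small stateful closure; the tuple encoding of events and the FIFO pairing of a B with the oldest buffered A are illustrative assumptions (the actual pairing depends on the selection policy):

```python
def make_sequence_function():
    """Build a stateful handler for 'A followed by B'. Illustrative."""
    state = []  # buffered A events, in arrival order

    def on_event(event):
        # Events are encoded as (type, payload) tuples for this sketch.
        if event[0] == "A":
            state.append(event)       # store A in state
            return None
        if event[0] == "B" and state:
            a = state.pop(0)          # pair B with the oldest A
            return (a, event)         # produce the sequence event A --> B
        return None                   # B with no stored A: no match

    return on_event
```

A garbage-collector step would additionally discard buffered events once their context or temporal window closes.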
74. Challenges to tackle
1. Control stream: how to ensure that the control stream is faster than the data stream, and how to avoid race conditions?
2. Matching per partition: is there any case where we need to match across partitions?
3. Finishing the end-to-end code:
a. Petri 2 Flink compiler
b. End-to-end functional and performance tests on a distributed environment
75. Summary
● Both technical and logical challenges motivated LEAD;
● 18 operators were introduced and formalized using TRIO, aiming to eliminate ambiguous behaviours;
● The rich set of operators and the extended query-language capabilities are meant to increase the expressive power of CEP;
● Aging-tokens prioritized colored Petri nets, as a logical execution plan, leave room for optimisations.
Future Work
● Query optimizations on both the logical and physical levels
● Benchmarking the performance of LEAD CEP
● Probabilistic CEP
76. Open Sourcing Dev
● Preparing a Flink integration proposal and a merge request
● Considering integration with the Stream SQL API as MATCH_RECOGNIZE
● Why not a KStream implementation?