Everyone knows Cassandra as a NoSQL solution for data storage. But this data is often processed through message queues built on some external messaging provider, which occasionally leads to data inconsistency and adds another infrastructure layer to maintain. Since one of our services keeps all of its data in Cassandra, we developed a message queue solution on top of it that automatically gained a number of useful features: scalability, high availability and flexibility. This is the solution I will present in this talk.
19. System components: Message
A message is the real request data with a unique ID. The row is keyed by the message ID, and each field of the request is stored in its own column ({ field data }).

The ID format allows 4096 messages per millisecond from one node:
• Timestamp – 44 bits
• Counter – 12 bits
• Cluster node ID – 8 bits
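To make the layout concrete, here is a minimal sketch of how such a 64-bit ID could be packed in Python; the field order and the helper names are my assumptions, not the exact implementation behind the talk.

```python
import itertools
import time

# Assumed 64-bit layout: 44-bit millisecond timestamp | 12-bit counter | 8-bit node ID.
TIMESTAMP_BITS, COUNTER_BITS, NODE_BITS = 44, 12, 8

_counter = itertools.count()  # per-process counter, masked to 12 bits below

def new_message_id(node_id):
    """Pack timestamp, counter and cluster node ID into one 64-bit integer."""
    millis = int(time.time() * 1000) & ((1 << TIMESTAMP_BITS) - 1)
    counter = next(_counter) & ((1 << COUNTER_BITS) - 1)   # 4096 distinct IDs per millisecond
    node = node_id & ((1 << NODE_BITS) - 1)                # up to 256 cluster nodes
    return (millis << (COUNTER_BITS + NODE_BITS)) | (counter << NODE_BITS) | node

def unpack_message_id(message_id):
    """Split an ID back into (timestamp_ms, counter, node_id)."""
    node = message_id & ((1 << NODE_BITS) - 1)
    counter = (message_id >> NODE_BITS) & ((1 << COUNTER_BITS) - 1)
    millis = message_id >> (COUNTER_BITS + NODE_BITS)
    return millis, counter, node
```

Because the timestamp occupies the high bits, IDs generated on one node sort chronologically, which is what makes the ascending column ordering on the following slides work.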
20. System components: Batch
• Open for at least 1 second
• Closed if it has been open for > 10 seconds
• Closed if it contains > 100 messages (the closing rules are sketched below)

The row is keyed by the batch ID; message IDs are stored as columns in ascending order, each optionally carrying the message data ({ opt message data }).

The ID format is what requires a batch to stay open for > 1 second:
• Timestamp – rounded to seconds
• Cluster node ID + batch type – the last 3 digits
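The closing rules from the bullets above can be expressed as one small check; the one-second minimum follows from the batch ID being rounded to whole seconds (two batches opened by the same node within the same second would collide). The class and constant names are illustrative assumptions.

```python
import time

MIN_OPEN_SECONDS = 1    # the seconds-resolution ID forces a batch to live at least 1 s
MAX_OPEN_SECONDS = 10   # close if the batch has been open for more than 10 s
MAX_MESSAGES = 100      # close if the batch holds more than 100 messages

class Batch:
    def __init__(self, batch_id):
        self.batch_id = batch_id
        self.opened_at = time.time()
        self.message_ids = []

    def should_close(self):
        age = time.time() - self.opened_at
        if age < MIN_OPEN_SECONDS:
            return False   # never close before the ID's second has fully passed
        return age > MAX_OPEN_SECONDS or len(self.message_ids) > MAX_MESSAGES
```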
21. System components: Queue
• Similar to a batch
• Unlimited in size
• May contain batches with timestamps in the past

The row is keyed by the queue name; batch IDs are stored as columns in ascending order, each holding the time at which the batch was processed ({ processed at }).
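A hedged sketch of how the queue could look as a Cassandra table, with one wide row per queue and batch IDs as ascending clustering columns; the keyspace, table and column names are illustrative assumptions, not the actual schema from the talk.

```python
from cassandra.cluster import Cluster  # DataStax Python driver

session = Cluster(["127.0.0.1"]).connect("queues")  # assumed keyspace

session.execute("""
    CREATE TABLE IF NOT EXISTS queue (
        queue_name   text,
        batch_id     bigint,
        processed_at timestamp,
        PRIMARY KEY (queue_name, batch_id)
    ) WITH CLUSTERING ORDER BY (batch_id ASC)
""")

# Unprocessed batches are simply the columns whose processed_at is still empty;
# ascending batch_id ordering returns them in chronological order.
rows = session.execute(
    "SELECT batch_id, processed_at FROM queue WHERE queue_name = %s",
    ("incoming",),
)
pending = [r.batch_id for r in rows if r.processed_at is None]
```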
22. System components: Broker
The broker polls the queue for batches, checks each batch's time, locks the batch for processing in ZooKeeper, and hands it off to the processor (sketched below).

• Natural pre-fetch thanks to batches
• Easy to control message processing
• Simple concurrency model
• Easily scalable across nodes
• No high load on Cassandra
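A minimal sketch of the broker loop from the diagram, assuming a per-batch ZooKeeper lock (via kazoo) so that only one broker node processes a given batch; the names, the polling interval and the batch-ID arithmetic are illustrative assumptions.

```python
import time
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

def broker_loop(queue_name, fetch_batches, process_batch, poll_interval=1.0):
    """Poll the queue, skip batches that may still be open, lock and process the rest."""
    while True:
        for batch_id in fetch_batches(queue_name):              # batches polling
            batch_second = batch_id // 1000                     # assumed: ID prefix is a seconds timestamp
            if batch_second >= int(time.time()):                # check batch time: may still be open
                continue
            lock = zk.Lock(f"/locks/{queue_name}/{batch_id}")   # lock batch for processing
            if lock.acquire(blocking=False):
                try:
                    process_batch(batch_id)                     # hand the batch to the processor
                finally:
                    lock.release()
        time.sleep(poll_interval)
```

Because each batch already groups many messages, a single poll fetches work for the whole group (the natural pre-fetch above), and the per-batch lock is the only coordination point between broker nodes.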
26. System components: Processor
On failure, the processor redelivers the message into another batch (sketched below).

• Tries to process messages as quickly as possible
• On error, simply redelivers the message
• Messages are processed concurrently
• Any redelivery business logic is easy to implement
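A hedged sketch of the processor: the messages of a batch are handled concurrently, and a message that fails is handed to a redelivery callback (for example, appended to a later batch). The thread-pool approach and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def process_messages(messages, handle, redeliver, max_workers=8):
    """Process a batch's messages concurrently; on failure, redeliver the message."""
    def run(message):
        try:
            handle(message)        # business logic for a single message
        except Exception:
            redeliver(message)     # e.g. append the message to another batch
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(run, messages))  # block until the whole batch is done
```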
27. Warnings and benefits
• Messages and batches must be checked before processing
• The "queue" size is hard to explain
• Separate columns are used for tracking message status (see the sketch below)
• Correct compaction has to be performed from time to time
• The expected load is handled by a single node
• Everything works on commodity hardware
• A single storage for all the data
• The system is easily scalable and reliable (no message has been lost)
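To illustrate the first and third warnings, the message status could live in its own columns next to (but separate from) the payload, so the broker can re-check it cheaply before processing; counting pending messages across such wide data is also part of why the queue size is hard to explain. The table and column names below are assumptions.

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("queues")  # same assumed keyspace as before

# Status columns are separate from the payload, so updating the status never
# rewrites the message data.
session.execute("""
    CREATE TABLE IF NOT EXISTS message (
        message_id bigint PRIMARY KEY,
        payload    blob,
        status     text,          -- e.g. 'pending' / 'done'
        updated_at timestamp
    )
""")

row = session.execute(
    "SELECT status FROM message WHERE message_id = %s", (1234,)
).one()
if row is not None and row.status != "done":   # check the message before processing it
    pass                                       # safe to process, then mark it as 'done'
```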