Our Story with ClickHouse at seo.do
Metehan Çetinkaya
What is seo.do?
Yandex Metrica for SEO professionals.
50,000 keywords
50,000 x 2 x 100 x 365
≈ 3.65 billion rows per client
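The multiplication on the slide can be checked directly. This is a minimal sketch; the slide does not say what the factors 2, 100 and 365 stand for, so the code treats them as opaque multipliers and the names below are illustrative only.

```python
import math

# Factors copied from the slide; their meaning is not stated there.
KEYWORDS = 50_000
OTHER_FACTORS = (2, 100, 365)

# Rows per client = keywords multiplied by every other factor.
rows_per_client = KEYWORDS * math.prod(OTHER_FACTORS)

print(f"{rows_per_client:,} rows per client")
```

The product comes to 3.65 billion rows.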
How did our use case comply with ClickHouse?
● Continuous data insertion
○ Inserts in ClickHouse do not take locks that block reads, so inserting data does not hold up
running queries
● No updates / deletes necessary
○ We don't have a use case for UPDATE and DELETE operations; to delete obsolete data we
simply use partitions and their operations (e.g. dropping a whole partition)
■ You can still update the data using ALTER mutations
● Billions of rows to be processed
○ Time series data which will be aggregated constantly to create reports etc.
● SQL dialect close to Standard SQL
● Very handy built-in functions
○ avgIf, sumIf, countIf, ORDER BY with if, date conversion functions (toWeek, toMonth etc.), splitByChar
■ SELECT avgIf(wage, age BETWEEN 18 AND 25) AS youthWageAvg
FROM people WHERE country_id = 745261
● High compression ratio
○ The compression ratio depends entirely on the data; since our data is time series with many
repeated strings, our compression ratio is around 1:15
● Continuous development and great community
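To make the `-If` combinator concrete, here is a small Python sketch of what the avgIf query above computes: an average taken only over the rows that satisfy a condition. The `people` rows and the `avg_if` helper are hypothetical and exist only for illustration; this is not ClickHouse code.

```python
import math

def avg_if(rows, value, condition):
    """Average of value(row) over the rows where condition(row) holds."""
    matched = [value(r) for r in rows if condition(r)]
    # This sketch returns NaN when no row matches.
    return sum(matched) / len(matched) if matched else math.nan

# Hypothetical sample rows standing in for the `people` table.
people = [
    {"age": 19, "wage": 1000.0},
    {"age": 24, "wage": 2000.0},
    {"age": 40, "wage": 9000.0},
]

youth_wage_avg = avg_if(
    people,
    value=lambda r: r["wage"],
    condition=lambda r: 18 <= r["age"] <= 25,
)
print(youth_wage_avg)
```

The point of the combinator is that the filter lives inside the aggregate, so several conditional aggregates can share one pass over the data.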
How did we design our table schemas?
Understanding ClickHouse Index Structure
● The index structure is unlike a traditional RDBMS index: there is no B+Tree, and the
primary key does not enforce a unique constraint
● Data is physically sorted on disk
○ A background job sorts and merges the data; this happens eventually, not at insert time
● Choose the primary key / sorting key based on query conditions
○ To keep reads to a minimum, consider all likely queries and their conditions, and then
choose the primary key
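Why the sorting key matters can be sketched with a toy sparse index. Rows are kept sorted by the key, only the first key of every block of rows (a granule) is indexed, and a binary search over those marks decides which granules must be read. This is a simplified model, not ClickHouse's implementation; the `(cid, position)` key, the toy data and the granule size of 4 are assumptions made for illustration.

```python
import bisect

GRANULE = 4  # rows per granule; tiny on purpose, purely illustrative

# Hypothetical rows, physically sorted by the sorting key (cid, position).
rows = sorted((cid, pos) for cid in (1, 2, 3) for pos in range(1, 6))

# Sparse index: one mark (the first key) per granule instead of one entry per row.
marks = [rows[i] for i in range(0, len(rows), GRANULE)]

def granules_to_read(cid):
    """Indices of the granules that may contain rows with the given cid."""
    first = bisect.bisect_left(marks, (cid,))                  # first mark >= (cid, ...)
    last = bisect.bisect_right(marks, (cid, float("inf")))     # first mark beyond cid
    # Conservative: the granule just before `first` may end with matching rows.
    return list(range(max(first - 1, 0), last))

print(marks)
print(granules_to_read(2))
```

A filter on the leading key column touches only a couple of granules, while a filter on a column that is not in the key would have to scan all of them. That is the reason to pick the key from the query conditions.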
How did we design our table schemas?
What Did We Do?
● Created initial table schemas on a single server, based on our data
structure and possible queries
● Inserted billions of rows of dummy data that closely simulated our actual case
○ Since our data is time series, we created the actual data for one day and
replicated it for 180 days
● Wrote sample queries, a few for each dimension we would query the
data by
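The "one real day, replicated for 180 days" approach can be sketched as a small generator. The column names, the sample rows and the start date are hypothetical; only the replication idea comes from the slide.

```python
from datetime import date, timedelta

def replicate_day(day_rows, start, days):
    """Yield a copy of every row in day_rows for each of `days` consecutive dates."""
    for offset in range(days):
        current = start + timedelta(days=offset)
        for row in day_rows:
            # Copy the row and stamp it with the shifted date.
            yield {**row, "date": current.isoformat()}

# Hypothetical rows standing in for one day of real data.
one_day = [
    {"cid": 149315, "keyword": "a", "position": 5},
    {"cid": 149315, "keyword": "b", "position": 3},
]

dummy_rows = list(replicate_day(one_day, start=date(2019, 1, 1), days=180))
print(len(dummy_rows), dummy_rows[0]["date"], dummy_rows[-1]["date"])
```

Because the generator is lazy, the same function could feed batched inserts without holding the whole data set in memory.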
Refactoring The Schemas & Writing Queries
● Ran a single query at a time and checked the results
○ Tip: use ' tail -f /var/log/clickhouse-server/clickhouse-server.log ' to see the execution log of the
query that was executed (peak memory usage, threads used, marks read). Or simply pass
'send_logs_level=trace' when connecting with the CLI: ' clickhouse-client --send_logs_level=trace '
● Refactored the table schemas, or even created new ones, based on the
query results
● Once the results were satisfactory, we ran a series of load tests with the sample
queries
Using JOIN in ClickHouse
● Since ClickHouse's compression ratio is very high, we traded storage
for performance
○ How so? We did not force the tables to be atomic; to keep relations and JOINs to a minimum, we
allowed duplicated data. It's a trade-off.
● Tried to avoid JOINs, but with our data structure that was not possible, so
we kept JOINs to a minimum
● Always avoided raw JOINs
● Used JOINs with subqueries
Raw JOIN
SELECT keyword, group_id
FROM keyword_data AS ss
INNER JOIN keyword_info AS kw
    ON ss.keyword = kw.keyword
    AND ss.cid = kw.cid
    AND ss.cid = 149315
    AND ss.position = 5
    AND toDate(ss.timestamp) = toDate('2019-10-20')
Raw JOIN
● Took 5 seconds to complete
○ Slow, and it keeps system resources busy for a long time (this matters because ClickHouse can
fully utilize system resources, so under load the average query response time will be high)
● Processed 369.74 million rows
● Executed the query with 8 threads
○ Ran on a server with 16 cores. By default ClickHouse sets max_threads to the number of
physical CPU cores (half of the logical core count when hyper-threading is on), so with 8 threads
this query used every thread available to ClickHouse
● Peak Memory Usage : 90 Megabytes
JOIN With Sub Query
SELECT keyword, group_id
FROM
    (SELECT keyword FROM keyword_data
     PREWHERE position = 5
     WHERE cid = 149315 AND toDate(timestamp) = '2019-10-20') AS ss
INNER JOIN
    (SELECT keyword, group_id FROM keyword_info WHERE cid = 149315) AS kw
    ON ss.keyword = kw.keyword
JOIN With Sub Query
● Took 100 milliseconds
○ 50 times faster than the raw JOIN, and it releases resources quickly
● Processed 65.54 thousand rows
● Executed the query with 2 threads
○ The query needed 6 fewer threads. Keeping the thread count per query to a minimum is very
important for us because our QPS rate will be high. If we ignored this, we
would have to solve our performance problems with new replicas in the future, which means
new servers and an ongoing cost
● Peak Memory Usage : 5 Megabytes
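The difference between the two query shapes can be illustrated with a toy join in Python: filtering each side before the join gives the same result while feeding far fewer rows into the join. This is a sketch of the idea only, not of how ClickHouse plans or executes queries; the sample rows and the `hash_join` helper are hypothetical.

```python
CID, POSITION = 149315, 5

# Hypothetical rows: keyword_data is (cid, keyword, position),
# keyword_info is (cid, keyword, group_id).
keyword_data = [(149315, "a", 5), (149315, "b", 3), (149315, "c", 5), (7, "a", 5), (7, "d", 5)]
keyword_info = [(149315, "a", 10), (149315, "b", 11), (149315, "c", 12), (7, "a", 20), (7, "d", 21)]

def hash_join(left, right):
    """Inner join on (cid, keyword); also report how many input rows the join consumed."""
    index = {(cid, kw): group_id for cid, kw, group_id in right}
    joined = [(cid, kw, pos, index[(cid, kw)]) for cid, kw, pos in left if (cid, kw) in index]
    return joined, len(left) + len(right)

# "Raw" shape: join everything, apply the conditions afterwards.
joined, raw_rows_in = hash_join(keyword_data, keyword_info)
raw_result = sorted((kw, g) for cid, kw, pos, g in joined if cid == CID and pos == POSITION)

# Subquery shape: filter each side first, then join the small inputs.
left = [r for r in keyword_data if r[0] == CID and r[2] == POSITION]
right = [r for r in keyword_info if r[0] == CID]
joined, filtered_rows_in = hash_join(left, right)
filtered_result = sorted((kw, g) for cid, kw, pos, g in joined)

print(raw_result == filtered_result, raw_rows_in, filtered_rows_in)
```

On the measurements reported above the same effect shows up as 65.54 thousand rows processed instead of 369.74 million.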
Creating ClickHouse Cluster
● Replication
○ High Availability
○ Load Scaling
● Sharding
○ Data size
○ Split data into smaller parts
Creating ClickHouse Cluster
● Created a cluster with 6 servers (2 shards and 3 replicas)
● Set up ZooKeeper with 3 additional servers
○ Since latency is critical for ZooKeeper and ClickHouse can use all available system
resources, we don't run ZooKeeper on the same servers as ClickHouse (a ZooKeeper cluster
of 3 servers can tolerate the failure of 1 server)
● Used ClickHouse's internal load balancing mechanism to distribute queries
over the replicas
○ HAProxy or CHProxy could be used as separate load balancer
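The sizing above follows from simple arithmetic: servers = shards x replicas, and a majority-quorum ensemble such as ZooKeeper stays available as long as more than half of its members are up. A minimal sketch, with illustrative function and variable names:

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Members that can fail while a strict majority is still alive."""
    return (ensemble_size - 1) // 2

SHARDS, REPLICAS_PER_SHARD = 2, 3
clickhouse_servers = SHARDS * REPLICAS_PER_SHARD

for size in (3, 4, 5):
    print(f"ensemble of {size}: tolerates {tolerated_failures(size)} failure(s)")
print(f"ClickHouse servers: {clickhouse_servers}")
```

An ensemble of 4 tolerates no more failures than an ensemble of 3, which is why odd sizes are the usual choice.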
Chaos Testing
● Cluster is set up & running
● Developed a dummy API with Golang Gin that executes random queries
○ This is not a performance test, so we just went with the fastest way for us
● Created a basic load test that continuously makes requests to our Go API;
this way we can monitor how ClickHouse behaves
under failures
Without Data Loss
● Initiated the load test
● Killed ClickHouse on a random node during the load test
○ This scenario simulates a ClickHouse failure and a temporary server crash
● Monitored the behavior of ClickHouse's load balancer and the system load
● Restarted ClickHouse on that server
○ The load test kept running in the meantime
What Happened?
● Since the ClickHouse server on the chosen node was unreachable, the ClickHouse
load balancer stopped sending requests to that node
● Queries were distributed over the remaining 2 replicas
● After ClickHouse was restarted, the chosen node did not get any requests
for another 10 minutes; then it started to receive requests again
○ This is because the ClickHouse load balancer takes error counts into account when distributing queries
○ Error count is halved each minute
○ Maximum error count is 1000 by default
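The 10-minute delay is consistent with the two numbers above. A minimal sketch, assuming the failed replica had reached the maximum error count of 1000 and that each halving rounds down to an integer (both are assumptions of this sketch, not statements from the slides):

```python
ERROR_CAP = 1000        # maximum error count, from the slide
HALF_LIFE_MINUTES = 1   # the error count is halved each minute, from the slide

def minutes_until_clear(error_count: int) -> int:
    """Minutes of halving until the error count reaches zero."""
    minutes = 0
    while error_count > 0:
        error_count //= 2            # assumed: integer halving, rounding down
        minutes += HALF_LIFE_MINUTES
    return minutes

print(minutes_until_clear(ERROR_CAP))
```

In ClickHouse these two values appear to correspond to the `distributed_replica_error_half_life` and `distributed_replica_error_cap` settings, so the recovery delay should be tunable.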
With Data Loss
● Chose another shard at random
● This time we didn't just kill the ClickHouse server; we formatted the disks as well
● The ClickHouse configuration and all the data were lost
● The chosen node was still in the ZooKeeper config
● Reconfigured the server and reinstalled ClickHouse
● Copied metadata from a running replica
● Monitored ClickHouse behavior and system's load
What Happened?
● As in the previous case, the chosen node did not get any requests
● Once the server had been reconfigured, replication took place
● As in the other scenario, the chosen node started getting requests after 10
minutes
Deciding the API Framework
● Developed API endpoints that execute the sample queries with Flask, Golang
Gin and FastAPI
● Ran a separate load test for each of the three (10K requests per minute
for 20 minutes)
● Monitored the results
Deciding the API Framework
● Flask
○ Was not able to handle all the requests, so after some time errors started to occur
○ Average response time was 3 seconds and 10% of the incoming requests resulted in errors
● Gin
○ Was able to handle all the requests without errors
○ Average response time was 350 milliseconds and there was no error at all
● FastAPI
○ Was able to handle all the requests without errors as well
○ Average response time was 300 milliseconds without errors
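For scale, the load-test parameters above translate into the following totals. This is plain arithmetic on the numbers from the slides; the variable names are illustrative.

```python
REQUESTS_PER_MINUTE = 10_000
DURATION_MINUTES = 20
FLASK_ERROR_PERCENT = 10   # share of failed requests reported for Flask

total_requests = REQUESTS_PER_MINUTE * DURATION_MINUTES
requests_per_second = REQUESTS_PER_MINUTE / 60
flask_failed_requests = total_requests * FLASK_ERROR_PERCENT // 100

print(f"{total_requests:,} requests per run, about {requests_per_second:.0f} per second")
print(f"about {flask_failed_requests:,} failed requests in the Flask run")
```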

MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
Harnessing ChatGPT - Elevating Productivity in Today's Agile Environment
Harnessing ChatGPT  - Elevating Productivity in Today's Agile EnvironmentHarnessing ChatGPT  - Elevating Productivity in Today's Agile Environment
Harnessing ChatGPT - Elevating Productivity in Today's Agile Environment
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 

Our Story With ClickHouse at seo.do

  • 1. Our Story w/ Clickhouse @seo.do Metehan Çetinkaya
  • 3. What is seo.do? Yandex Metrica for SEO professionals.
  • 16. 50,000 keywords: 50,000 x 2 x 100 x 365 = 3.65 billion rows per client
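The arithmetic on this slide can be checked with a short Python sketch. The slide gives only the numbers, so the factor names below (two variants per keyword, 100 results per query, 365 days) are assumptions made for illustration, not something the talk states.

```python
# Rough row-count estimate for one client, following the slide's formula.
# Only the numbers come from the slide; the factor names are assumptions.
KEYWORDS = 50_000        # keywords tracked per client (from the slide)
VARIANTS = 2             # assumed meaning of the "x 2" factor
RESULTS_PER_QUERY = 100  # assumed meaning of the "x 100" factor
DAYS = 365               # assumed meaning of the "x 365" factor


def rows_per_client(keywords: int = KEYWORDS) -> int:
    """Multiply the four factors from the slide."""
    return keywords * VARIANTS * RESULTS_PER_QUERY * DAYS


print(f"{rows_per_client():,} rows per client")
```

The product is 3.65 billion, which the slide rounds to 3.6 billion.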
  • 17. How did our use case comply with ClickHouse?
      ● Continuous data insertion
          ○ Since ClickHouse does not lock tables on insert, inserting data does not affect query performance
      ● No updates / deletes necessary
          ○ We have no use case for UPDATE and DELETE operations; to delete obsolete data we simply use partitions and their operations
              ■ You can still update data using ALTER mutations
      ● Billions of rows to be processed
          ○ Time series data that is aggregated constantly to create reports, etc.
      ● SQL dialect close to standard SQL
  • 18. How did our use case comply with ClickHouse?
      ● Very handy built-in functions
          ○ avgIf, sumIf, countIf, ORDER BY if, date conversion functions (toWeek, toMonth, etc.), splitByChar
              ■ SELECT avgIf(wage, age BETWEEN 18 AND 25) AS youthWageAvg FROM people WHERE country_id = 745261
      ● High compression ratio
          ○ The compression ratio depends entirely on the data; since our data is time series with many repeated strings, our ratio is around 1:15
      ● Continuous development and a great community
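To make the conditional aggregate concrete, here is a minimal Python sketch of what avgIf computes: the average of a column over only those rows where a condition holds. The sample rows are made up, and avg_if is a hypothetical helper, not a ClickHouse API.

```python
import math


def avg_if(values, conditions):
    """Average of the values whose paired condition is true (NaN if none match)."""
    selected = [v for v, ok in zip(values, conditions) if ok]
    return sum(selected) / len(selected) if selected else math.nan


# Hypothetical (age, wage) rows standing in for the `people` table.
people = [(17, 900.0), (22, 1500.0), (25, 2100.0), (40, 3800.0)]

youth_wage_avg = avg_if(
    [wage for _, wage in people],
    [18 <= age <= 25 for age, _ in people],
)
print(youth_wage_avg)  # 1800.0
```

Only the rows aged 22 and 25 satisfy the condition, so the result is the mean of their two wages; the other rows are scanned but do not contribute.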
  • 19. How did we design our table schemas? Understanding ClickHouse Index Structure
      ● The index structure is unlike a traditional RDBMS index: there is no B+Tree, and the primary key does not enforce a unique constraint
      ● Data is physically sorted on disk
          ○ A background job sorts and merges the data, so this happens eventually
      ● Choose the primary key / sorting key based on query conditions
          ○ To keep reads to a minimum, consider all likely queries and their conditions, then choose the primary key
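The following Python sketch illustrates why the choice of sorting key decides how much data a query reads. It is a simplified model of a sparse index over sorted data, not ClickHouse's implementation: only the first key of each block of rows (a "granule") is indexed, and a lookup selects the granules that might contain the key. The granule size of 4 is chosen for readability (ClickHouse's default index_granularity is 8192), and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

GRANULE = 4  # rows per granule; tiny on purpose for illustration


def build_sparse_index(sorted_keys, granule=GRANULE):
    """Keep only the first key of every granule (one mark per granule)."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule)]


def granules_to_read(marks, key):
    """Granule numbers that may contain `key`; every other granule is skipped."""
    # The granule before the first mark >= key can still end with `key`.
    start = max(bisect_left(marks, key) - 1, 0)
    # Granules whose first key is already greater than `key` cannot contain it.
    end = bisect_right(marks, key)
    return range(start, end)


# Rows sorted by a hypothetical client id column.
cids = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5, 5, 6, 7, 8, 9]
marks = build_sparse_index(cids)

print(marks)                              # [1, 2, 3, 6]
print(list(granules_to_read(marks, 3)))   # [1, 2]
print(list(granules_to_read(marks, 5)))   # [2]
```

A filter on the sorting key touches one or two granules out of four here. A filter on a column that is not part of the sorting key gets no help from the marks and has to scan every granule, which is why the key should follow the query conditions.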
  • 20. How did we design our table schemas? What Did We Do?
      ● Created initial table schemas on a single server, based on our data structure and expected queries
      ● Inserted billions of rows of dummy data that closely simulated our actual case
          ○ Since our data is time series, we created realistic data for one day and replicated it for 180 days
      ● Wrote sample queries, a few for each dimension we would query the data by
  • 21. Refactoring The Schemas & Writing Queries
      ● Ran a single query at a time and checked the results
          ○ Tip: use 'tail -f /var/log/clickhouse-server/clickhouse-server.log' to see the execution log of the query that was run (peak memory usage, threads used, marks read). Or simply pass send_logs_level=trace when connecting with the CLI: 'clickhouse-client --send_logs_level=trace'
      ● Refactored the table schemas, or even created new ones, based on the query results
      ● Once the results were satisfactory, we ran a series of load tests with the sample queries
  • 22. Using JOIN in ClickHouse
      ● Since ClickHouse's compression ratio is very high, we traded storage for performance
          ○ How so? We did not force the tables to be atomic (fully normalized); to keep relations and JOINs to a minimum, we allowed duplicated data. It's a trade-off.
      ● Tried to avoid JOINs entirely, but with our data structure that was not possible, so we kept them to a minimum
      ● Always avoided raw JOINs
      ● Used JOINs with subqueries instead
  • 23. Raw JOIN
      SELECT keyword, group_id
      FROM keyword_data AS ss
      INNER JOIN keyword_info AS kw
          ON ss.keyword = kw.keyword AND ss.cid = kw.cid
          AND ss.cid = 149315 AND ss.position = 5
          AND toDate(ss.timestamp) = toDate('2019-10-20')
  • 24. Raw JOIN
      ● Took 5 seconds to complete
          ○ Slow, and it keeps system resources busy for a long time (this matters because ClickHouse can fully utilize system resources, so under load the average query response time will be high)
      ● Processed 369.74 million rows
      ● Executed the query with 8 threads
          ○ The query ran on a server with 16 cores. By default ClickHouse sets max_threads to half the number of physical cores, so this query used all the cores available to ClickHouse
      ● Peak memory usage: 90 megabytes
  • 25. JOIN With Sub Query
      SELECT keyword, group_id
      FROM (
          SELECT keyword FROM keyword_data
          PREWHERE position = 5
          WHERE cid = 149315 AND toDate(timestamp) = '2019-10-20'
      ) AS ss
      INNER JOIN (
          SELECT keyword, group_id FROM keyword_info WHERE cid = 149315
      ) AS kw ON ss.keyword = kw.keyword
  • 26. JOIN With Sub Query
      ● Took 100 milliseconds
          ○ 50 times faster than the raw JOIN, and it releases resources quickly
      ● Processed 65.54 thousand rows
      ● Executed the query with 2 threads
          ○ The query needed 6 fewer threads. Keeping the thread count per query to a minimum is very important for us because our QPS will be high. If we ignored this, we would later have to solve performance problems by adding replicas, which means new servers and recurring cost
      ● Peak memory usage: 5 megabytes
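The two measurements can be put side by side with a short Python sketch. The figures are the ones reported on the slides; the "thread-seconds" number is only a rough upper bound on how long CPU threads are held per query, under the assumption that every thread stays busy for the whole query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryStats:
    seconds: float
    rows_read: int
    threads: int
    peak_memory_mb: int

    @property
    def thread_seconds(self) -> float:
        # Assumption: all threads are occupied for the full duration.
        return self.seconds * self.threads


# Figures as reported on the slides.
raw_join = QueryStats(seconds=5.0, rows_read=369_740_000, threads=8, peak_memory_mb=90)
subquery_join = QueryStats(seconds=0.1, rows_read=65_540, threads=2, peak_memory_mb=5)

speedup = raw_join.seconds / subquery_join.seconds
row_ratio = raw_join.rows_read / subquery_join.rows_read
cpu_ratio = raw_join.thread_seconds / subquery_join.thread_seconds

print(f"latency:        {speedup:.0f}x faster")
print(f"rows read:      {row_ratio:.0f}x fewer")
print(f"thread-seconds: {cpu_ratio:.0f}x fewer")
```

Under that assumption, the latency gain of 50x understates the capacity gain: each raw JOIN holds roughly 200 times more thread time than the subquery version, which is what matters at a high QPS.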
  • 27. Creating ClickHouse Cluster
      ● Replication
          ○ High availability
          ○ Load scaling
      ● Sharding
          ○ Data size
          ○ Split data into smaller parts
  • 28. Creating ClickHouse Cluster
      ● Created a cluster with 6 servers (2 shards, each with 3 replicas)
      ● Set up ZooKeeper on 3 additional servers
          ○ Latency is critical for ZooKeeper, and ClickHouse can use all available system resources, so we do not run ZooKeeper on the same servers as ClickHouse (a 3-server ZooKeeper cluster can tolerate the failure of 1 server)
      ● Used ClickHouse's internal load-balancing mechanism to distribute queries across the replicas
          ○ HAProxy or CHProxy could be used as a separate load balancer
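The server counts on this slide follow from two small rules, sketched below in Python: one server per (shard, replica) slot, and a majority quorum for the ZooKeeper ensemble. The function names are hypothetical and the layout is a plain enumeration, not a ClickHouse configuration.

```python
def cluster_layout(shards: int, replicas_per_shard: int):
    """Enumerate (shard, replica) slots; each slot is one ClickHouse server."""
    return [
        (shard, replica)
        for shard in range(1, shards + 1)
        for replica in range(1, replicas_per_shard + 1)
    ]


def tolerated_failures(ensemble_size: int) -> int:
    """A majority-quorum ensemble works while more than half its members are up."""
    return (ensemble_size - 1) // 2


servers = cluster_layout(shards=2, replicas_per_shard=3)
print(len(servers))           # 6
print(tolerated_failures(3))  # 1
print(tolerated_failures(4))  # 1
```

The last line shows why ensembles are usually sized with an odd number of servers: a fourth ZooKeeper server adds cost without tolerating any additional failure.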
  • 29. Chaos Testing
      ● The cluster is set up and running
      ● Developed a dummy API with Golang Gin that executes random queries
          ○ This is not a performance test, so we just went with the fastest option for us
      ● Created a basic load test that sends requests to our Go API continuously, so that we can monitor how ClickHouse behaves during failures
  • 30. Without Data Loss
      ● Started the load test
      ● Killed ClickHouse on a random node during the load test
          ○ This scenario simulates a ClickHouse failure and a temporary server crash
      ● Monitored the behavior of ClickHouse's load balancer and the system load
      ● Restarted ClickHouse on that server
          ○ The load test kept running in the meantime
  • 31. What Happened?
      ● Since the ClickHouse server in the chosen shard was unreachable, the ClickHouse load balancer stopped sending requests to that node
      ● Queries were distributed over the remaining 2 replicas
      ● After the ClickHouse server was restarted, the chosen node received no requests for another 10 minutes, then started receiving them again
          ○ This is because the ClickHouse load balancer takes error counts into account when distributing queries
          ○ The error count is halved every minute
          ○ The maximum error count is 1000 by default
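The 10-minute delay is consistent with the two numbers on this slide, as the Python sketch below shows. It assumes the node had reached the maximum error count, that halving uses integer division, and that the node is preferred again only once its count has decayed to zero; the slide does not state the exact threshold, so this is an illustration rather than a description of ClickHouse's internals.

```python
def minutes_until_clean(error_count: int, half_life_minutes: int = 1) -> int:
    """Minutes until an error counter that halves every period reaches zero."""
    minutes = 0
    while error_count > 0:
        error_count //= 2  # assumed integer halving, once per half-life
        minutes += half_life_minutes
    return minutes


MAX_ERROR_COUNT = 1000  # default cap mentioned on the slide

print(minutes_until_clean(MAX_ERROR_COUNT))  # 10
```

Ten halvings take 1000 down through 500, 250, 125, 62, 31, 15, 7, 3 and 1 to 0, which matches the observed 10 minutes.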
  • 32. With Data Loss
      ● Chose another shard at random
      ● This time we did not just kill the ClickHouse server; we formatted the disks as well
      ● The ClickHouse configuration and all the data were lost
      ● The chosen node was still in the ZooKeeper config
      ● Reconfigured the server and reinstalled ClickHouse
      ● Copied metadata from a running replica
      ● Monitored ClickHouse's behavior and the system load
  • 33. What Happened?
      ● As in the previous case, the chosen node did not get any requests
      ● Once the server was configured, replication took place
      ● As in the other scenario, the chosen node started getting requests after 10 minutes
  • 34. Deciding the API Framework
      ● Developed API endpoints that execute the sample queries with Flask, Golang Gin and FastAPI
      ● Ran the load test for each of the three separately (10K requests per minute for 20 minutes)
      ● Monitored the results
  • 35. Deciding the API Framework
      ● Flask
          ○ Could not handle all the requests, so after some time errors started to occur
          ○ Average response time was 3 seconds, and 10% of incoming requests resulted in errors
      ● Gin
          ○ Handled all the requests without errors
          ○ Average response time was 350 milliseconds
      ● FastAPI
          ○ Also handled all the requests without errors
          ○ Average response time was 300 milliseconds
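To put the load-test figures in perspective, the Python sketch below derives the request rate, the total request count, and a rough estimate of how many requests were in flight at once for each framework. The in-flight estimate uses Little's law (arrival rate times average response time) and assumes a steady state, which the Flask run did not really reach, so treat that row as indicative only.

```python
REQUESTS_PER_MINUTE = 10_000
DURATION_MINUTES = 20

total_requests = REQUESTS_PER_MINUTE * DURATION_MINUTES
requests_per_second = REQUESTS_PER_MINUTE / 60

# (average response time in seconds, error rate) as reported on the slide.
results = {
    "Flask": (3.0, 0.10),
    "Gin": (0.35, 0.0),
    "FastAPI": (0.30, 0.0),
}


def in_flight(avg_response_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return requests_per_second * avg_response_seconds


def failed_requests(error_rate: float) -> int:
    return round(total_requests * error_rate)


print(f"{total_requests:,} requests at {requests_per_second:.1f} req/s")
for name, (latency, error_rate) in results.items():
    print(f"{name}: ~{in_flight(latency):.0f} in flight, "
          f"{failed_requests(error_rate):,} failed")
```

At roughly 167 requests per second, a 3-second average response time implies on the order of 500 concurrent requests, compared with about 50 to 60 for Gin and FastAPI.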