SlideShare una empresa de Scribd logo
1 de 17
Huy Nguyen
CTO, Co-founder - Holistics.io
Building A Job Queue System
with PostgreSQL & Ruby
Grokking TechTalk #24 - Background Job Queues
Ho Chi Minh City - March 2018
About Me
Education:
● Pho Thong Nang Khieu, Tin 04-07
● National University of Singapore (NUS), Computer Science Major.
Work:
● Software Engineer Intern, SenseGraphics (Stockholm, Sweden)
● Software Engineer Intern, Facebook (California, US)
● Data Infrastructure Engineer, Viki (Singapore)
Now:
● Co-founder & CTO, Holistics Software
● Co-founder, Grokking Vietnam
huy@holistics.io facebook.com/huy bit.ly/huy-linkedin
● Building Analytics Infrastructure at Viki
● Why PostgreSQL for Analytics Infrastructure
● PostgreSQL Internals: Discussing on Uber Moving From
PostgreSQL to MySQL
● Now: Building A Job Queue System with PostgreSQL & Ruby
Some Talks I’ve Given
huy@holistics.io facebook.com/huy bit.ly/huy-linkedin
B2B SaaS Web Application.
Connect to customer’s DB
→ run queries
→ wait for results
→ display to end users
Background
Requirements
Customer A Customer B Customer C
Reliability: Job should be processed only once
and never missed; job pickup order; retry
mechanism.
Jobs Persistence: store jobs info, track job’s
statistics (run duration, start time, end time, etc)
Multi-tenancy: Each customer has own queue
slots
Queue Architecture
Customer A Customer B Customer C
Customer’s Queue Slot
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
...
Sidekiq
Holistics Job Queue Layer
● Ruby + Rails
● PostgreSQL for DB
● Sidekiq Background Job
class DataReport < ApplicationRecord
include Queuable
def execute
# compose and run this report
return values
end
end
report = DataReport.find(123)
# normal: execute synchronously, this returns the
return value of `execute` method
report_results = report.execute
# execute asynchronously, this returns a job ID (int)
job_id = report.async.execute
CREATE TABLE jobs (
id INTEGER PRIMARY KEY,
source_type VARCHAR,
source_method VARCHAR,
source_id INTEGER,
args JSONB DEFAULT '{}',
status VARCHAR,
start_time TIMESTAMP,
queued_time TIMESTAMP,
end_time TIMESTAMP,
created_at TIMESTAMP,
stats JSONB DEFAULT '{}',
tenant_id INTEGER
)
Job Status:
created → queued → running → success / failure
Storing Jobs Data
-- finds out how many jobs are running per queue, so we know if it's full
WITH running_jobs_per_queue AS (
SELECT
tenant_id,
count(1) AS running_jobs from jobs
WHERE (status = 'running' OR status = 'queued') -- running or queued
AND created_at > NOW() - INTERVAL '6 HOURS' -- ignore jobs past 6 hours
group by 1
),
-- find out queues that are full
full_queues AS (
select
R.tenant_id
from running_jobs_per_queue R
left join tenant_queues Q ON R.tenant_id = Q.tenant_id
where R.running_jobs >= Q.num_slots
)
select id from jobs
where status = 'created'
and tenant_id NOT IN ( select tenant_id from full_queues )
order by id asc
for update skip locked
limit 1
SQL to claim next job
Select the next job which customer
still have available queue slots.
Skip over rows that’s been selected
(SKIP LOCKED)
Acquire a row-level lock upon
selecting.
CREATE TABLE tenant_queues (
id INTEGER PRIMARY KEY,
tenant_id INTEGER,
num_slots INTEGER
)
Each job is processed in a transaction.
Upon finding next job, change its state
and send over to Sidekiq
Queuing next job
class Job
def self.queue_next_job()
ActiveRecord::Base.transaction do
ret = ActiveRecord::Base.connection.execute queue_sql
return nil if ret.values.empty?
job_id = ret.values.first.first.to_i
job = Job.find(job_id)
# send to background worker
job.status = 'queued' && job.save
JobWorker.perform_async(job_id)
end
end
end
Generic Sidekiq job worker
# simplified code
class JobWorker
include Sidekiq::Worker
def perform(job_id)
job = Job.find(job_id)
job.status = 'running' && job.save
obj = job.source_type.constantize.find(job.source_id)
obj.call(job.source_method, job.args)
job.status = 'success' && job.save
rescue
job.status = 'error' && job.save
ensure
Job.queue_next_job()
end
end
This is run inside Sidekiq worker
(background).
Pull relevant instance from database and
construct object.
Invoke the method with relevant
parameters
Supervisor vs non-supervisor
OTHER JOB QUEUE HOLISTICS JOB QUEUE
Master
Dedicated process to receive
request
SQL + inline with existing Rails or
Sidekiq process
Workers Dedicated processes or threads Pass over to Sidekiq
Easily switch between synchronous vs
asynchronous.
No need to create dedicated workers code.
Thanks to Ruby’s metaprogramming.
The .async keyword
class DataReport < ApplicationRecord
include Queuable
def execute
# compose and run this report
return values
end
end
report = DataReport.find(123)
# normal: execute synchronously, this returns the return
value of `execute` method
report_results = report.execute
# execute asynchronously, this returns a job ID (int)
job_id = report.async.execute
Summary
● Why Reinvent The Wheel? Or did we.
● What we have now.
Customer A Customer B Customer C
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
...
Sidekiq
Holistics Job Queue Layer
● que, an open-source job queue written for Ruby &
PostgreSQL.
● Using advisory locks.
● Learn more: https://github.com/chanks/que
Other Job Queue Using PostgreSQL: que
Q&As?
Jobs @ Holistics
Looking for comrades to join our small team.
Why?
● Global, enterprise product used by well-known companies
(Grab, Traveloka, ...)
● Small, lean team that moves fast
Positions:
● Backend Engineer
● Front-end Engineer
● Technical Sales Engineer
● Product Manager
holistics.io/careers

Más contenido relacionado

La actualidad más candente

The Future starts with a Promise
The Future starts with a PromiseThe Future starts with a Promise
The Future starts with a PromiseAlexandru Nedelcu
 
Flink Forward SF 2017: Dean Wampler - Streaming Deep Learning Scenarios with...
Flink Forward SF 2017: Dean Wampler -  Streaming Deep Learning Scenarios with...Flink Forward SF 2017: Dean Wampler -  Streaming Deep Learning Scenarios with...
Flink Forward SF 2017: Dean Wampler - Streaming Deep Learning Scenarios with...Flink Forward
 
26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat
26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat
26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman FarahatSpark Summit
 
Scala Future & Promises
Scala Future & PromisesScala Future & Promises
Scala Future & PromisesKnoldus Inc.
 
Introduction to Structured Streaming
Introduction to Structured StreamingIntroduction to Structured Streaming
Introduction to Structured StreamingKnoldus Inc.
 
From rest api to graph ql a 10 year journey
From rest api to graph ql a 10 year journeyFrom rest api to graph ql a 10 year journey
From rest api to graph ql a 10 year journeyArno Schulz
 
EclairJS = Node.Js + Apache Spark
EclairJS = Node.Js + Apache SparkEclairJS = Node.Js + Apache Spark
EclairJS = Node.Js + Apache SparkJen Aman
 
Spark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit
 
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015Martin Junghanns
 
H2O World - Sparkling Water - Michal Malohlava
H2O World - Sparkling Water - Michal MalohlavaH2O World - Sparkling Water - Michal Malohlava
H2O World - Sparkling Water - Michal MalohlavaSri Ambati
 
Rapid API Development ArangoDB Foxx
Rapid API Development ArangoDB FoxxRapid API Development ArangoDB Foxx
Rapid API Development ArangoDB FoxxMichael Hackstein
 
Ajax Introduction Presentation
Ajax   Introduction   PresentationAjax   Introduction   Presentation
Ajax Introduction Presentationthinkphp
 
Raphael Amorim - Scrating React Fiber
Raphael Amorim - Scrating React FiberRaphael Amorim - Scrating React Fiber
Raphael Amorim - Scrating React FiberReact Conf Brasil
 
OQGraph @ SCaLE 11x 2013
OQGraph @ SCaLE 11x 2013OQGraph @ SCaLE 11x 2013
OQGraph @ SCaLE 11x 2013Antony T Curtis
 
Parallel & async processing using tpl dataflow
Parallel & async processing using tpl dataflowParallel & async processing using tpl dataflow
Parallel & async processing using tpl dataflowCodecamp Romania
 
Introduction to ajax
Introduction  to  ajaxIntroduction  to  ajax
Introduction to ajaxPihu Goel
 
Gatling overview
Gatling overviewGatling overview
Gatling overviewViral Jain
 

La actualidad más candente (20)

The Future starts with a Promise
The Future starts with a PromiseThe Future starts with a Promise
The Future starts with a Promise
 
Flink Forward SF 2017: Dean Wampler - Streaming Deep Learning Scenarios with...
Flink Forward SF 2017: Dean Wampler -  Streaming Deep Learning Scenarios with...Flink Forward SF 2017: Dean Wampler -  Streaming Deep Learning Scenarios with...
Flink Forward SF 2017: Dean Wampler - Streaming Deep Learning Scenarios with...
 
26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat
26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat
26 Trillion App Recomendations using 100 Lines of Spark Code - Ayman Farahat
 
Scala Future & Promises
Scala Future & PromisesScala Future & Promises
Scala Future & Promises
 
Introduction to Structured Streaming
Introduction to Structured StreamingIntroduction to Structured Streaming
Introduction to Structured Streaming
 
From rest api to graph ql a 10 year journey
From rest api to graph ql a 10 year journeyFrom rest api to graph ql a 10 year journey
From rest api to graph ql a 10 year journey
 
EclairJS = Node.Js + Apache Spark
EclairJS = Node.Js + Apache SparkEclairJS = Node.Js + Apache Spark
EclairJS = Node.Js + Apache Spark
 
Javantura v3 - Going Reactive with RxJava – Hrvoje Crnjak
Javantura v3 - Going Reactive with RxJava – Hrvoje CrnjakJavantura v3 - Going Reactive with RxJava – Hrvoje Crnjak
Javantura v3 - Going Reactive with RxJava – Hrvoje Crnjak
 
Spark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan Pu
 
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink Forward 2015
 
H2O World - Sparkling Water - Michal Malohlava
H2O World - Sparkling Water - Michal MalohlavaH2O World - Sparkling Water - Michal Malohlava
H2O World - Sparkling Water - Michal Malohlava
 
Rapid API Development ArangoDB Foxx
Rapid API Development ArangoDB FoxxRapid API Development ArangoDB Foxx
Rapid API Development ArangoDB Foxx
 
Ajax
AjaxAjax
Ajax
 
What is Spark
What is SparkWhat is Spark
What is Spark
 
Ajax Introduction Presentation
Ajax   Introduction   PresentationAjax   Introduction   Presentation
Ajax Introduction Presentation
 
Raphael Amorim - Scrating React Fiber
Raphael Amorim - Scrating React FiberRaphael Amorim - Scrating React Fiber
Raphael Amorim - Scrating React Fiber
 
OQGraph @ SCaLE 11x 2013
OQGraph @ SCaLE 11x 2013OQGraph @ SCaLE 11x 2013
OQGraph @ SCaLE 11x 2013
 
Parallel & async processing using tpl dataflow
Parallel & async processing using tpl dataflowParallel & async processing using tpl dataflow
Parallel & async processing using tpl dataflow
 
Introduction to ajax
Introduction  to  ajaxIntroduction  to  ajax
Introduction to ajax
 
Gatling overview
Gatling overviewGatling overview
Gatling overview
 

Similar a Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & PostgreSQL

Sql storeprocedure
Sql storeprocedureSql storeprocedure
Sql storeprocedureftz 420
 
Salesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUGSalesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUGvraopolisetti
 
Intro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran MizrahiIntro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran MizrahiRan Mizrahi
 
Node js
Node jsNode js
Node jshazzaz
 
Javascript Everywhere
Javascript EverywhereJavascript Everywhere
Javascript EverywherePascal Rettig
 
Javascript unit testing with QUnit and Sinon
Javascript unit testing with QUnit and SinonJavascript unit testing with QUnit and Sinon
Javascript unit testing with QUnit and SinonLars Thorup
 
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job QueueTask Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job QueueSam Hennessy
 
Apex code Benchmarking
Apex code BenchmarkingApex code Benchmarking
Apex code BenchmarkingAmit Chaudhary
 
Sherlock Homepage - A detective story about running large web services - NDC ...
Sherlock Homepage - A detective story about running large web services - NDC ...Sherlock Homepage - A detective story about running large web services - NDC ...
Sherlock Homepage - A detective story about running large web services - NDC ...Maarten Balliauw
 
Django for IoT: From hackathon to production (DjangoCon US)
Django for IoT: From hackathon to production (DjangoCon US)Django for IoT: From hackathon to production (DjangoCon US)
Django for IoT: From hackathon to production (DjangoCon US)Anna Schneider
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaChetan Khatri
 
Live Streaming & Server Sent Events
Live Streaming & Server Sent EventsLive Streaming & Server Sent Events
Live Streaming & Server Sent Eventstkramar
 
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...Holden Karau
 
Using Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App EngineUsing Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App EngineRiver of Talent
 
Twins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingTwins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingRichardWarburton
 
Job Queues Overview
Job Queues OverviewJob Queues Overview
Job Queues Overviewjoeyrobert
 
Scalable web application architecture
Scalable web application architectureScalable web application architecture
Scalable web application architecturepostrational
 

Similar a Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & PostgreSQL (20)

Sql storeprocedure
Sql storeprocedureSql storeprocedure
Sql storeprocedure
 
Salesforce asynchronous apex
Salesforce asynchronous apexSalesforce asynchronous apex
Salesforce asynchronous apex
 
Salesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUGSalesforce Batch processing - Atlanta SFUG
Salesforce Batch processing - Atlanta SFUG
 
Intro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran MizrahiIntro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran Mizrahi
 
Introduction to Django
Introduction to DjangoIntroduction to Django
Introduction to Django
 
Node js
Node jsNode js
Node js
 
Javascript Everywhere
Javascript EverywhereJavascript Everywhere
Javascript Everywhere
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
Javascript unit testing with QUnit and Sinon
Javascript unit testing with QUnit and SinonJavascript unit testing with QUnit and Sinon
Javascript unit testing with QUnit and Sinon
 
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job QueueTask Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
Task Scheduling and Asynchronous Processing Evolved. Zend Server Job Queue
 
Apex code Benchmarking
Apex code BenchmarkingApex code Benchmarking
Apex code Benchmarking
 
Sherlock Homepage - A detective story about running large web services - NDC ...
Sherlock Homepage - A detective story about running large web services - NDC ...Sherlock Homepage - A detective story about running large web services - NDC ...
Sherlock Homepage - A detective story about running large web services - NDC ...
 
Django for IoT: From hackathon to production (DjangoCon US)
Django for IoT: From hackathon to production (DjangoCon US)Django for IoT: From hackathon to production (DjangoCon US)
Django for IoT: From hackathon to production (DjangoCon US)
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
 
Live Streaming & Server Sent Events
Live Streaming & Server Sent EventsLive Streaming & Server Sent Events
Live Streaming & Server Sent Events
 
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
 
Using Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App EngineUsing Task Queues and D3.js to build an analytics product on App Engine
Using Task Queues and D3.js to build an analytics product on App Engine
 
Twins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingTwins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional Programming
 
Job Queues Overview
Job Queues OverviewJob Queues Overview
Job Queues Overview
 
Scalable web application architecture
Scalable web application architectureScalable web application architecture
Scalable web application architecture
 

Más de Grokking VN

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking VN
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking VN
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking VN
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking VN
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking VN
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking VN
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...Grokking VN
 
Grokking Techtalk #37: Data intensive problem
 Grokking Techtalk #37: Data intensive problem Grokking Techtalk #37: Data intensive problem
Grokking Techtalk #37: Data intensive problemGrokking VN
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoringGrokking VN
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...Grokking VN
 
Grokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking VN
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking VN
 
SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design PatternsGrokking VN
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking VN
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking VN
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking VN
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking VN
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking VN
 

Más de Grokking VN (20)

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applications
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 
Grokking Techtalk #37: Data intensive problem
 Grokking Techtalk #37: Data intensive problem Grokking Techtalk #37: Data intensive problem
Grokking Techtalk #37: Data intensive problem
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoring
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellchecking
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 
Grokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKI
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
 
SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design Patterns
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous Communications
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search Tree
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the Magic
 

Último

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 

Último (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & PostgreSQL

  • 1. Huy Nguyen CTO, Co-founder - Holistics.io Building A Job Queue System with PostgreSQL & Ruby Grokking TechTalk #24 - Background Job Queues Ho Chi Minh City - March 2018
  • 2. About Me Education: ● Pho Thong Nang Khieu, Tin 04-07 ● National University of Singapore (NUS), Computer Science Major. Work: ● Software Engineer Intern, SenseGraphics (Stockholm, Sweden) ● Software Engineer Intern, Facebook (California, US) ● Data Infrastructure Engineer, Viki (Singapore) Now: ● Co-founder & CTO, Holistics Software ● Co-founder, Grokking Vietnam huy@holistics.io facebook.com/huy bit.ly/huy-linkedin
  • 3. ● Building Analytics Infrastructure at Viki ● Why PostgreSQL for Analytics Infrastructure ● PostgreSQL Internals: Discussing on Uber Moving From PostgreSQL to MySQL ● Now: Building A Job Queue System with PostgreSQL & Ruby Some Talks I’ve Given huy@holistics.io facebook.com/huy bit.ly/huy-linkedin
  • 4. B2B SaaS Web Application. Connect to customer’s DB → run queries → wait for results → display to end users Background
  • 5. Requirements Customer A Customer B Customer C Reliability: Job should be processed only once and never missed; job pickup order; retry mechanism. Jobs Persistence: store jobs info, track job’s statistics (run duration, start time, end time, etc) Multi-tenancy: Each customer has own queue slots
  • 6.
  • 7. Queue Architecture Customer A Customer B Customer C Customer’s Queue Slot Worker Worker Worker Worker Worker Worker Worker Worker ... Sidekiq Holistics Job Queue Layer ● Ruby + Rails ● PostgreSQL for DB ● Sidekiq Background Job
  • 8. class DataReport < ApplicationRecord include Queuable def execute # compose and run this report return values end end report = DataReport.find(123) # normal: execute synchronously, this returns the return value of `execute` method report_results = report.execute # execute asynchronously, this returns a job ID (int) job_id = report.async.execute CREATE TABLE jobs ( id INTEGER PRIMARY KEY, source_type VARCHAR, source_method VARCHAR, source_id INTEGER, args JSONB DEFAULT '{}', status VARCHAR, start_time TIMESTAMP, queued_time TIMESTAMP, end_time TIMESTAMP, created_at TIMESTAMP, stats JSONB DEFAULT '{}', tenant_id INTEGER ) Job Status: created → queued → running → success / failure Storing Jobs Data
  • 9. -- finds out how many jobs are running per queue, so we know if it's full WITH running_jobs_per_queue AS ( SELECT tenant_id, count(1) AS running_jobs from jobs WHERE (status = 'running' OR status = 'queued') -- running or queued AND created_at > NOW() - INTERVAL '6 HOURS' -- ignore jobs past 6 hours group by 1 ), -- find out queues that are full full_queues AS ( select R.tenant_id from running_jobs_per_queue R left join tenant_queues Q ON R.tenant_id = Q.tenant_id where R.running_jobs >= Q.num_slots ) select id from jobs where status = 'created' and tenant_id NOT IN ( select tenant_id from full_queues ) order by id asc for update skip locked limit 1 SQL to claim next job Select the next job which customer still have available queue slots. Skip over rows that’s been selected (SKIP LOCKED) Acquire a row-level lock upon selecting. CREATE TABLE tenant_queues ( id INTEGER PRIMARY KEY, tenant_id INTEGER, num_slots INTEGER )
  • 10. Each job is processed in a transaction. Upon finding next job, change its state and send over to Sidekiq Queuing next job class Job def self.queue_next_job() ActiveRecord::Base.transaction do ret = ActiveRecord::Base.connection.execute queue_sql return nil if ret.values.empty? job_id = ret.values.first.first.to_i job = Job.find(job_id) # send to background worker job.status = 'queued' && job.save JobWorker.perform_async(job_id) end end end
  • 11. Generic Sidekiq job worker # simplified code class JobWorker include Sidekiq::Worker def perform(job_id) job = Job.find(job_id) job.status = 'running' && job.save obj = job.source_type.constantize.find(job.source_id) obj.call(job.source_method, job.args) job.status = 'success' && job.save rescue job.status = 'error' && job.save ensure Job.queue_next_job() end end This is run inside Sidekiq worker (background). Pull relevant instance from database and construct object. Invoke the method with relevant parameters
  • 12. Supervisor vs non-supervisor OTHER JOB QUEUE HOLISTICS JOB QUEUE Master Dedicated process to receive request SQL + inline with existing Rails or Sidekiq process Workers Dedicated processes or threads Pass over to Sidekiq
  • 13. Easily switch between synchronous vs asynchronous. No need to create dedicated workers code. Thanks to Ruby’s metaprogramming. The .async keyword class DataReport < ApplicationRecord include Queuable def execute # compose and run this report return values end end report = DataReport.find(123) # normal: execute synchronously, this returns the return value of `execute` method report_results = report.execute # execute asynchronously, this returns a job ID (int) job_id = report.async.execute
  • 14. Summary ● Why Reinvent The Wheel? Or did we. ● What we have now. Customer A Customer B Customer C Worker Worker Worker Worker Worker Worker Worker Worker ... Sidekiq Holistics Job Queue Layer
  • 15. ● que, an open-source job queue written for Ruby & PostgreSQL. ● Using advisory locks. ● Learn more: https://github.com/chanks/que Other Job Queue Using PostgreSQL: que
  • 16. Q&As?
  • 17. Jobs @ Holistics Looking for comrades to join our small team. Why? ● Global, enterprise product used by well-known companies (Grab, Traveloka, ...) ● Small, lean team that moves fast Positions: ● Backend Engineer ● Front-end Engineer ● Technical Sales Engineer ● Product Manager holistics.io/careers