Tarantool/Box: a use case with
serving 2 billion queries a day
        November 1st, 2012, Tallinn, Estonia
            Konstantin Osipov, Mail.Ru
              kostja@tarantool.org
Agenda


Tarantool/Box: features and architecture
Case study 1: advanced memcache
Case study 2: a message queue server
Case study 3: a reliable in-memory database
Project future
Feature overview


● Flexible data model, primary and secondary keys
● Fully cached: 100% of data is cached in RAM
● Persistent using a Write Ahead Log
● Log shipping replication and online backup
● Extensible with Lua
Data model
● A game of fields, tuples and spaces
● HASH, TREE, BITMAP and partial indexes
● Secondary keys
● Single-part and multi-part keys
● STRING, NUM and NUM64 data types
Server architecture
Lua API

Redis                       Tarantool (Lua)
redis.set(key, value)       box.insert(space, key, value)
redis.get(key)              box.select(space, 0, key, value)
redis.getset(key, newkey)   box.update(space, key, '=p', 0, newkey)
redis.incr(key)             box.update(space, key, '+p', 1, 1)
redis.lpush(key, value)     box.update(space, key, '!p', 1, value)
redis.rpush(key, value)     You guessed it...
Performance overview


Intel i5, 4 GB RAM, 7200 RPM SATA
10 threads, 200-300 bytes per tuple
Tarantool 1.4.6: 170k writes, 260k reads
Memory footprint
Raw GET/SET performance
Use case 1: flexible memcache


● You can create your own fibers
● box.fiber.create(), box.fiber.yield()
● A background fiber performs customized expiration
● Session store: 20M online users, 200M monthly users, four 2-CPU units, 96 GB RAM each
● 60K RPS, CPU usage is below 20%
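The background-expiration idea on this slide can be sketched in plain Lua, with a coroutine standing in for a fiber. This is only an illustration under stated assumptions: the `sessions` table, the `expire_loop` function and the expiry predicate are hypothetical names, and nothing here is the actual Mail.Ru session store. On the server, the slide names box.fiber.create() and box.fiber.yield() as the corresponding primitives.

```lua
-- Hypothetical in-memory session store: key -> record with an expiry time.
local sessions = {
    alice = { value = "s1", expires_at = 100 },
    bob   = { value = "s2", expires_at = 200 },
    carol = { value = "s3", expires_at = 300 },
}

-- Expiration loop written as a coroutine (a stand-in for a fiber).
-- `is_expired` is the customizable part: any predicate over a record.
local function expire_loop(store, is_expired)
    while true do
        local removed = 0
        for key, rec in pairs(store) do
            if is_expired(rec) then
                store[key] = nil      -- clearing a field during traversal is allowed
                removed = removed + 1
            end
        end
        coroutine.yield(removed)      -- give up control until the next pass
    end
end

-- Drive the loop by hand with a fake clock.
local clock = 0
local expirer = coroutine.wrap(expire_loop)

clock = 150
local removed_first = expirer(sessions, function(rec)
    return rec.expires_at <= clock
end)

clock = 250
local removed_second = expirer()      -- resumes the loop for another pass
```

Each pass removes whatever the predicate marks as expired and then yields, so request handling can run between passes.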
Use case 2: message queue


● Reliable queues are a vital ingredient for building scalable applications
● In web apps, queues are used for delayed processing, load balancing, e-mail notifications
● Our use case: prefetching avatars
Message queues: how


● Rich Lua stored procedure environment
      ― box.fiber, box.space, box.cfg
● Specialized inter-procedure communication API:
      ― box.ipc
● Specialized data structures
      ― Bitmaps and partial keys
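As a rough illustration of what "reliable" means for a queue, here is a minimal plain-Lua sketch in which a task stays tracked until a worker acknowledges it. The `Queue` type, its method names and the `avatar:...` payloads are hypothetical and are not taken from the tntlua repository. The sketch keeps everything in ordinary Lua tables, so it shows only the take/ack/release bookkeeping, not persistence or the box.ipc API.

```lua
local Queue = {}
Queue.__index = Queue

function Queue.new()
    return setmetatable({ ready = {}, taken = {}, next_id = 1 }, Queue)
end

-- Add a task to the tail of the queue and return its id.
function Queue:put(payload)
    local id = self.next_id
    self.next_id = id + 1
    table.insert(self.ready, { id = id, payload = payload })
    return id
end

-- Hand the oldest ready task to a worker; it is remembered until acked.
function Queue:take()
    local task = table.remove(self.ready, 1)
    if task then
        self.taken[task.id] = task
    end
    return task
end

-- Confirm that a task was processed; returns false for unknown ids.
function Queue:ack(id)
    local task = self.taken[id]
    self.taken[id] = nil
    return task ~= nil
end

-- Put a taken task back at the head, e.g. after a worker failure.
function Queue:release(id)
    local task = self.taken[id]
    if not task then
        return false
    end
    self.taken[id] = nil
    table.insert(self.ready, 1, task)
    return true
end

local q = Queue.new()
q:put("avatar:1001")
q:put("avatar:1002")
local task = q:take()       -- a worker picks up avatar:1001
q:release(task.id)          -- the worker failed: the task goes back
local retry = q:take()      -- the same task is delivered again
q:ack(retry.id)             -- processed: the task is forgotten
```

A task is never dropped between take and ack, which gives at-least-once delivery; the price is that a worker may see the same task twice.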
Use case 3: reliable database


● User profile: 500 bytes of key/value pairs
● Math: 200M users * 500 bytes = 100 GB
● At least 2x smaller memory footprint than in an RDBMS
● Win: predictable response time, significantly higher performance
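The sizing arithmetic on this slide can be checked with a few lines of Lua. The two input figures come from the slide; the variable names are illustrative, and the result covers raw profile payload only (index and allocator overhead are not modeled).

```lua
local users = 200e6                 -- 200M users (from the slide)
local bytes_per_profile = 500       -- 500 bytes per profile (from the slide)

local total_bytes = users * bytes_per_profile
local total_gb  = total_bytes / 1e9     -- decimal gigabytes
local total_gib = total_bytes / 2^30    -- binary gibibytes

print(string.format("%.0f GB (%.1f GiB)", total_gb, total_gib))
-- prints "100 GB (93.1 GiB)"
```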
Example: a FIFO

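function fifo_push(name, val)
    -- Look up the FIFO tuple by name, creating it on first use
    local fifo = find_or_create_fifo(name)
    -- Fields 1 and 2 hold the top and bottom positions
    local top = box.unpack('i', fifo[1])
    local bottom = box.unpack('i', fifo[2])
    if top == fifomax+2 then -- % size
        top = 3
    …
    end
    -- Store both positions and write val into field number top
    return box.update(0, name, '=p=p=p', 1, top,
                      2, bottom, top, val)
end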
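The wrap-around idea in this example can be tried outside the server with a plain Lua table that mimics the field layout: field 1 is top, field 2 is bottom, and values start at field 3. This is a sketch under stated assumptions, not a reconstruction of the elided part of the slide: `FIFO_MAX`, `next_slot`, `fifo_new` and `fifo_pop` are hypothetical, `fifo_push` here takes the table itself rather than a name, and the overwrite-oldest policy when the buffer is full is a choice made for this illustration.

```lua
local FIFO_MAX = 3              -- hypothetical capacity
local FIRST = 3                 -- first value slot, after the two header fields
local LAST = FIFO_MAX + 2       -- last value slot

-- Create an empty FIFO: top == bottom and the slot at top is empty.
local function fifo_new()
    return { FIRST, FIRST }
end

-- Advance a slot index, wrapping from LAST back to FIRST.
local function next_slot(i)
    if i == LAST then
        return FIRST
    end
    return i + 1
end

-- Append a value; when the buffer is full the oldest value is overwritten.
local function fifo_push(fifo, val)
    local top, bottom = fifo[1], fifo[2]
    if fifo[top] ~= nil then    -- slot still occupied: the buffer is full
        bottom = next_slot(bottom)
    end
    fifo[top] = val
    fifo[1], fifo[2] = next_slot(top), bottom
end

-- Remove and return the oldest value, or nil when the FIFO is empty.
local function fifo_pop(fifo)
    local bottom = fifo[2]
    local val = fifo[bottom]
    if val == nil then
        return nil
    end
    fifo[bottom] = nil
    fifo[2] = next_slot(bottom)
    return val
end

local f = fifo_new()
for _, v in ipairs({ "a", "b", "c", "d" }) do
    fifo_push(f, v)             -- "d" overwrites "a", the oldest entry
end
local first, second, third = fifo_pop(f), fifo_pop(f), fifo_pop(f)
```

Keeping the positions and the values in one record is what lets the slide's version update them together in a single box.update call.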
Conclusion


● Tarantool/Box – a high performance data engine
● Tarantool/Lua – a building block for your heavily loaded web applications
● Tarantool – our approach to an easy to use database for highly volatile data
What's cooking in 1.5


● Disk-based backend
● Synchronous master-master replication
● New data types (array, date, currency, json)
● Authentication
Thank you!
Links


http://github.com/mailru/tarantool - source code
http://github.com/mailru/tntlua - open source stored procedures repository
http://groups.google.com/group/tarantool - mailing list
http://tarantool.org/dist/ - always fresh .tar.gz and .rpm

My talk at Topconf.com conference, Tallinn, 1st of November 2012
