SlideShare una empresa de Scribd logo
1 de 38
2TB of RAM ought to be enough for
anybody
PG Nordic Day 2014
The Presenter
1
Renaud Bruyeron - @brew_your_own
Paris, France
The Presenter – short bio
2
• 1998-1999: W3C @ MIT
• 1999-2013: FullSIX
• Top 3 Web agency in FR
• Strong focus on PHP, Java, and Oracle…
• 2013-present: CTO of
Schibsted Classified Media
From 1 to 30+ Countries in 7 years
3
4
Templated deployment in 30 countries w/ shared
technology
5
• Technology originally
inherited from
Blocket.se
• Has since evolved to
power all 30 sites
• Focus on performance
and ease of local
modificationsPostgreSQL
Middleware
(data access, business
rules)
Index
Presentation layer
APIs Web
Search C
C
C
PHP
C/PHP
C
PL/SQL
PostgreSQL in the SCM Platform
6
100+ servers running PostgreSQL
8TB of data
50+ million classified ads
Schibsted Classified Media & PostgreSQL
married…and in love 
7
#1 Classified Web site in France
8
History
9
Project initiated in 2006
Site launch: early 2007
Based on technology
from 1 to 230 people, challenger to #1 in 7 years
10
11
12
Explosive growth…with a few bumps along the way!
13
Big Audience
14
250M page views / day
5M unique visitors / day
18M UV / month
That’s 1/3 of French internet population…
600000+ new ads / day
25M live ads
#7 most visited Website in France
Big Ops
15
300+ servers in 2 DCs
20 servers hosting PG databases
(in production)
Built on SCM Technology
16
PostgreSQL
Middleware
(data access, business
rules)
Index
Presentation layer
APIs Web
Search
Built on SCM Technology
17
Website Data
Master
Slave
(hot
standby)
Slave
(short
queries)
Slave
(Long
queries)
Middleware
(data access, business
rules)
Index
Presentation layer
APIs Web
Search
Built on SCM Technology
18
Website Data
Master
Slave
(hot
standby)
Slave
(short
queries)
Slave
(Long
queries)
Middleware
(data access, business
rules)
Index
Presentation layer
APIs Web
Search
+15 support DBs
Website Data
+15 support DBs
We use PostgreSQL everywhere!
19
Backoffice & Analytics
Master
Slave
(hot
standby)
Slave
(short
queries)
Slave
(Long
queries)
ODS
OLAP
BI & CRM Middleware
(data access, business
rules)
Index
Presentation layer
APIs Web
Search
We try to limit writes!
20
• Index/Search acting as a structured cache
• Master DB workload = 70% writes
• Slaves used to offload read queries
• Main database = 6TB on disk…
• +4TB archived away…
• 20K LOCs of PL/SQL
Website Data
+15 support DBs
Master
Slave
(hot
standby)
Slave
(short
queries)
Slave
(Long
queries)
Index
Search
PostgreSQL works beautifully as a DW!
21
Website DataBackoffice & Analytics
Master
Slave
(hot
standby)
Slave
(short
queries)
Slave
(Long
queries)
ODS
OLAP
ETL moving
500GB / day into
the DW
2.2TB
Adding
6GB/day
Big Iron: HP DL980, 2TB of RAM, 64-80 cores
22
HTOP on the Master 
23
Physical storage for the main PostgreSQL instances
24
DL980 DL980
3Par
V800
(SAN)
Fusion-IO
DL980DL980
3Par
V800
(SAN)
Fusion-IO
Fiber
Channel
Fiber
Channel
Master Slave SlaveSlave
DC 1 DC 2
Big Iron: 3Par V800 SAN
25
Big Iron: thin provisioning with mix of SSD and FC disks
26
Big Iron: high performance…
27
Bragging about it ;)
28
Despite the caches and the search engine, we get
impressive workloads on the master DB
29
600 tx/sec
How did we get there?
30
2x DL980 w/
Fusion-IO
2x 3Par V800
2x DL980 w/
FC to the
V800s
We did go through growing pains and near disasters
31
The « Big
One »
Our own « worst day of our lives »: March 1st 2013 (1/2)
32
Master DB is slowing down dramatically
We find that Slony replication is the culprit
We don’t know what to do…
Until we find a solution on the net that involves cleaning up slony metadata…
…(you know where this is going)…
We fumble. We notice. The slave is borked.
Rebuilding the Slave with slony brings the Master down. Oh. God…
We take the Master off the stack, and start rebuilding the slave w/ Slony
…5 days later, we are done (!)…
Our own « worst day of our lives »: March 1st 2013 (2/2)
33
…but we are not out of the woods yet!...
Pent up demand is bringing the site down!
We decide to switch to native replication!
…but the network cards are maxed out by the replication data…
…triggering a kernel bug…
…(Murphy, could you please step out of the room?)…
We implement network card bonding, and start moving support tables off the main
instance
…and we are done!
What’s Next?
34
Vertical Scalability has limits!
35
We are already running on the biggest HW money can buy
Past certain volume levels, execution plans can change radically
Huge instances are difficult to maintain & backup safely
Rebuilding the slave in March 2013 took a full 5 days…
Although we are maxing out the HW available to us, especially
on the writes/s,
We are committed to PostgreSQL at the heart of our platform
Key ideas to take our platform to the next level
36
Sharding
Unbundling of
the schema
PGQ
Spread the reads & writes horizontally
Move parts of the schema (that can be
decoupled) to other instances. Spread the
workload
Reduce application-level transaction
interleaving by moving parts of the
transactions to asynchronous workers
37

Más contenido relacionado

La actualidad más candente

Writing Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySparkWriting Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySparkDatabricks
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...Databricks
 
Brussels Spark Meetup Oct 30, 2015: Spark After Dark 1.5:  Real-time, Advanc...
Brussels Spark Meetup Oct 30, 2015:  Spark After Dark 1.5:  Real-time, Advanc...Brussels Spark Meetup Oct 30, 2015:  Spark After Dark 1.5:  Real-time, Advanc...
Brussels Spark Meetup Oct 30, 2015: Spark After Dark 1.5:  Real-time, Advanc...Chris Fregly
 
Ireland OUG Meetup May 2017
Ireland OUG Meetup May 2017Ireland OUG Meetup May 2017
Ireland OUG Meetup May 2017Brendan Tierney
 
Python in the Hadoop Ecosystem (Rock Health presentation)
Python in the Hadoop Ecosystem (Rock Health presentation)Python in the Hadoop Ecosystem (Rock Health presentation)
Python in the Hadoop Ecosystem (Rock Health presentation)Uri Laserson
 
Introduction to Apache Drill - interactive query and analysis at scale
Introduction to Apache Drill - interactive query and analysis at scaleIntroduction to Apache Drill - interactive query and analysis at scale
Introduction to Apache Drill - interactive query and analysis at scaleMapR Technologies
 
Fast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGFast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGDuyhai Doan
 
Building a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopBuilding a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopHadoop User Group
 
Life of PySpark - A tale of two environments
Life of PySpark - A tale of two environmentsLife of PySpark - A tale of two environments
Life of PySpark - A tale of two environmentsShankar M S
 
Spark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesSpark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesDuyhai Doan
 
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...Spark Summit
 
NoSQL Couchbase Lite & BigData HPCC Systems
NoSQL Couchbase Lite & BigData HPCC SystemsNoSQL Couchbase Lite & BigData HPCC Systems
NoSQL Couchbase Lite & BigData HPCC SystemsFujio Turner
 
Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...
Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...
Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...Chris Fregly
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search rideDuyhai Doan
 
Introduction to Apache Tajo: Future of Data Warehouse
Introduction to Apache Tajo: Future of Data WarehouseIntroduction to Apache Tajo: Future of Data Warehouse
Introduction to Apache Tajo: Future of Data WarehouseJihoon Son
 
Presentation of OrientDB v2.2 - Webinar
Presentation of OrientDB v2.2 - WebinarPresentation of OrientDB v2.2 - Webinar
Presentation of OrientDB v2.2 - WebinarOrient Technologies
 
Spark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceSpark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceDuyhai Doan
 
20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所
20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所
20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所Ryuji Tamagawa
 

La actualidad más candente (20)

Writing Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySparkWriting Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySpark
 
Nov 2010 HUG: Fuzzy Table - B.A.H
Nov 2010 HUG: Fuzzy Table - B.A.HNov 2010 HUG: Fuzzy Table - B.A.H
Nov 2010 HUG: Fuzzy Table - B.A.H
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
 
Brussels Spark Meetup Oct 30, 2015: Spark After Dark 1.5:  Real-time, Advanc...
Brussels Spark Meetup Oct 30, 2015:  Spark After Dark 1.5:  Real-time, Advanc...Brussels Spark Meetup Oct 30, 2015:  Spark After Dark 1.5:  Real-time, Advanc...
Brussels Spark Meetup Oct 30, 2015: Spark After Dark 1.5:  Real-time, Advanc...
 
Ireland OUG Meetup May 2017
Ireland OUG Meetup May 2017Ireland OUG Meetup May 2017
Ireland OUG Meetup May 2017
 
Python in the Hadoop Ecosystem (Rock Health presentation)
Python in the Hadoop Ecosystem (Rock Health presentation)Python in the Hadoop Ecosystem (Rock Health presentation)
Python in the Hadoop Ecosystem (Rock Health presentation)
 
Introduction to Apache Drill - interactive query and analysis at scale
Introduction to Apache Drill - interactive query and analysis at scaleIntroduction to Apache Drill - interactive query and analysis at scale
Introduction to Apache Drill - interactive query and analysis at scale
 
Fast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGFast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ ING
 
Building a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopBuilding a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with Hadoop
 
Life of PySpark - A tale of two environments
Life of PySpark - A tale of two environmentsLife of PySpark - A tale of two environments
Life of PySpark - A tale of two environments
 
Spark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesSpark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-Cases
 
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
Lessons Learned with Spark at the US Patent & Trademark Office-(Christopher B...
 
HUG Nov 2010: HDFS Raid - Facebook
HUG Nov 2010: HDFS Raid - FacebookHUG Nov 2010: HDFS Raid - Facebook
HUG Nov 2010: HDFS Raid - Facebook
 
NoSQL Couchbase Lite & BigData HPCC Systems
NoSQL Couchbase Lite & BigData HPCC SystemsNoSQL Couchbase Lite & BigData HPCC Systems
NoSQL Couchbase Lite & BigData HPCC Systems
 
Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...
Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...
Advanced Apache Spark Meetup Data Sources API Cassandra Spark Connector Spark...
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search ride
 
Introduction to Apache Tajo: Future of Data Warehouse
Introduction to Apache Tajo: Future of Data WarehouseIntroduction to Apache Tajo: Future of Data Warehouse
Introduction to Apache Tajo: Future of Data Warehouse
 
Presentation of OrientDB v2.2 - Webinar
Presentation of OrientDB v2.2 - WebinarPresentation of OrientDB v2.2 - Webinar
Presentation of OrientDB v2.2 - Webinar
 
Spark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceSpark cassandra integration, theory and practice
Spark cassandra integration, theory and practice
 
20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所
20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所
20170927 pydata tokyo データサイエンスな皆様に送る分散処理の基礎の基礎、そしてPySparkの勘所
 

Destacado

(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...
(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...
(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...Phil Calçado
 
“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16
“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16
“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16Rafael Dohms
 
Real World React Native & ES7
Real World React Native & ES7Real World React Native & ES7
Real World React Native & ES7joestanton1
 
Expériencer les objets connectés
Expériencer les objets connectésExpériencer les objets connectés
Expériencer les objets connectésekino
 
SfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-like
SfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-likeSfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-like
SfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-likeTristan Maindron
 
Industrialisation PHP - Canal+
Industrialisation PHP - Canal+Industrialisation PHP - Canal+
Industrialisation PHP - Canal+ekino
 
Symfony2 for legacy app rejuvenation: the eZ Publish case study
Symfony2 for legacy app rejuvenation: the eZ Publish case studySymfony2 for legacy app rejuvenation: the eZ Publish case study
Symfony2 for legacy app rejuvenation: the eZ Publish case studyGaetano Giunta
 
Performance au quotidien dans un environnement symfony
Performance au quotidien dans un environnement symfonyPerformance au quotidien dans un environnement symfony
Performance au quotidien dans un environnement symfonyXavier Leune
 
The Art of Transduction
The Art of TransductionThe Art of Transduction
The Art of TransductionDavid Stockton
 
A tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp Krenn
A tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp KrennA tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp Krenn
A tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp Krenndistributed matters
 
Grand Rapids PHP Meetup: Behavioral Driven Development with Behat
Grand Rapids PHP Meetup: Behavioral Driven Development with BehatGrand Rapids PHP Meetup: Behavioral Driven Development with Behat
Grand Rapids PHP Meetup: Behavioral Driven Development with BehatRyan Weaver
 
ScalaItaly 2015 - Your Microservice as a Function
ScalaItaly 2015 - Your Microservice as a FunctionScalaItaly 2015 - Your Microservice as a Function
ScalaItaly 2015 - Your Microservice as a FunctionPhil Calçado
 
Profiling php5 to php7
Profiling php5 to php7Profiling php5 to php7
Profiling php5 to php7julien pauli
 
Finagle @ SoundCloud
Finagle @ SoundCloudFinagle @ SoundCloud
Finagle @ SoundCloudPhil Calçado
 
Using Magnolia in a Microservices Architecture
Using Magnolia in a Microservices ArchitectureUsing Magnolia in a Microservices Architecture
Using Magnolia in a Microservices ArchitectureMagnolia
 
Symfony Guard Authentication: Fun with API Token, Social Login, JWT and more
Symfony Guard Authentication: Fun with API Token, Social Login, JWT and moreSymfony Guard Authentication: Fun with API Token, Social Login, JWT and more
Symfony Guard Authentication: Fun with API Token, Social Login, JWT and moreRyan Weaver
 
All Aboard for Laravel 5.1
All Aboard for Laravel 5.1All Aboard for Laravel 5.1
All Aboard for Laravel 5.1Jason McCreary
 
La #NouvelleVague de l’écosystème français des startups
La #NouvelleVague de l’écosystème français des startupsLa #NouvelleVague de l’écosystème français des startups
La #NouvelleVague de l’écosystème français des startupsStripe
 
Composer in monolithic repositories
Composer in monolithic repositoriesComposer in monolithic repositories
Composer in monolithic repositoriesSten Hiedel
 
Keeping the frontend under control with Symfony and Webpack
Keeping the frontend under control with Symfony and WebpackKeeping the frontend under control with Symfony and Webpack
Keeping the frontend under control with Symfony and WebpackIgnacio Martín
 

Destacado (20)

(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...
(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...
(ThoughtWorks Away Day 2009) one or two things you may not know about typesys...
 
“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16
“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16
“Writing code that lasts” … or writing code you won’t hate tomorrow. - #PHPSRB16
 
Real World React Native & ES7
Real World React Native & ES7Real World React Native & ES7
Real World React Native & ES7
 
Expériencer les objets connectés
Expériencer les objets connectésExpériencer les objets connectés
Expériencer les objets connectés
 
SfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-like
SfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-likeSfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-like
SfPot Lille 07/2015 - Utiliser Symfony sur des environnements Heroku-like
 
Industrialisation PHP - Canal+
Industrialisation PHP - Canal+Industrialisation PHP - Canal+
Industrialisation PHP - Canal+
 
Symfony2 for legacy app rejuvenation: the eZ Publish case study
Symfony2 for legacy app rejuvenation: the eZ Publish case studySymfony2 for legacy app rejuvenation: the eZ Publish case study
Symfony2 for legacy app rejuvenation: the eZ Publish case study
 
Performance au quotidien dans un environnement symfony
Performance au quotidien dans un environnement symfonyPerformance au quotidien dans un environnement symfony
Performance au quotidien dans un environnement symfony
 
The Art of Transduction
The Art of TransductionThe Art of Transduction
The Art of Transduction
 
A tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp Krenn
A tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp KrennA tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp Krenn
A tale of queues — from ActiveMQ over Hazelcast to Disque - Philipp Krenn
 
Grand Rapids PHP Meetup: Behavioral Driven Development with Behat
Grand Rapids PHP Meetup: Behavioral Driven Development with BehatGrand Rapids PHP Meetup: Behavioral Driven Development with Behat
Grand Rapids PHP Meetup: Behavioral Driven Development with Behat
 
ScalaItaly 2015 - Your Microservice as a Function
ScalaItaly 2015 - Your Microservice as a FunctionScalaItaly 2015 - Your Microservice as a Function
ScalaItaly 2015 - Your Microservice as a Function
 
Profiling php5 to php7
Profiling php5 to php7Profiling php5 to php7
Profiling php5 to php7
 
Finagle @ SoundCloud
Finagle @ SoundCloudFinagle @ SoundCloud
Finagle @ SoundCloud
 
Using Magnolia in a Microservices Architecture
Using Magnolia in a Microservices ArchitectureUsing Magnolia in a Microservices Architecture
Using Magnolia in a Microservices Architecture
 
Symfony Guard Authentication: Fun with API Token, Social Login, JWT and more
Symfony Guard Authentication: Fun with API Token, Social Login, JWT and moreSymfony Guard Authentication: Fun with API Token, Social Login, JWT and more
Symfony Guard Authentication: Fun with API Token, Social Login, JWT and more
 
All Aboard for Laravel 5.1
All Aboard for Laravel 5.1All Aboard for Laravel 5.1
All Aboard for Laravel 5.1
 
La #NouvelleVague de l’écosystème français des startups
La #NouvelleVague de l’écosystème français des startupsLa #NouvelleVague de l’écosystème français des startups
La #NouvelleVague de l’écosystème français des startups
 
Composer in monolithic repositories
Composer in monolithic repositoriesComposer in monolithic repositories
Composer in monolithic repositories
 
Keeping the frontend under control with Symfony and Webpack
Keeping the frontend under control with Symfony and WebpackKeeping the frontend under control with Symfony and Webpack
Keeping the frontend under control with Symfony and Webpack
 

Similar a Pg nordic-day-2014-2 tb-enough

Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Big Data Analytics: Finding diamonds in the rough with Azure
Big Data Analytics: Finding diamonds in the rough with AzureBig Data Analytics: Finding diamonds in the rough with Azure
Big Data Analytics: Finding diamonds in the rough with AzureChristos Charmatzis
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!DataWorks Summit
 
Advanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and BackupAdvanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and BackupMongoDB
 
Hadoop Hardware @Twitter: Size does matter.
Hadoop Hardware @Twitter: Size does matter.Hadoop Hardware @Twitter: Size does matter.
Hadoop Hardware @Twitter: Size does matter.Michael Zhang
 
Long and winding road - 2014
Long and winding road  - 2014Long and winding road  - 2014
Long and winding road - 2014Connor McDonald
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolEDB
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogC4Media
 
OOW 2013 Highlights
OOW 2013 HighlightsOOW 2013 Highlights
OOW 2013 HighlightsAna Galindo
 
Lightning Fast Dataframes with Polars
Lightning Fast Dataframes with PolarsLightning Fast Dataframes with Polars
Lightning Fast Dataframes with PolarsAlberto Danese
 
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...Imperva Incapsula
 
Your backend architecture is what matters slideshare
Your backend architecture is what matters slideshareYour backend architecture is what matters slideshare
Your backend architecture is what matters slideshareColin Charles
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsAli Hodroj
 
MongoDC 2012: "Operationalizing" MongoDB@AOL
MongoDC 2012: "Operationalizing" MongoDB@AOLMongoDC 2012: "Operationalizing" MongoDB@AOL
MongoDC 2012: "Operationalizing" MongoDB@AOLMongoDB
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOLradiocats
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistTony Rogerson
 
Hadoop workshop
Hadoop workshopHadoop workshop
Hadoop workshopFang Mac
 
Why Hadoop is important to Syncsort
Why Hadoop is important to SyncsortWhy Hadoop is important to Syncsort
Why Hadoop is important to Syncsorthuguk
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 

Similar a Pg nordic-day-2014-2 tb-enough (20)

Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Big Data Analytics: Finding diamonds in the rough with Azure
Big Data Analytics: Finding diamonds in the rough with AzureBig Data Analytics: Finding diamonds in the rough with Azure
Big Data Analytics: Finding diamonds in the rough with Azure
 
Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter!
 
Advanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and BackupAdvanced Administration, Monitoring and Backup
Advanced Administration, Monitoring and Backup
 
Hadoop Hardware @Twitter: Size does matter.
Hadoop Hardware @Twitter: Size does matter.Hadoop Hardware @Twitter: Size does matter.
Hadoop Hardware @Twitter: Size does matter.
 
Long and winding road - 2014
Long and winding road  - 2014Long and winding road  - 2014
Long and winding road - 2014
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic Tool
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
Phissug s01 ep6, stretch database
Phissug s01 ep6, stretch databasePhissug s01 ep6, stretch database
Phissug s01 ep6, stretch database
 
OOW 2013 Highlights
OOW 2013 HighlightsOOW 2013 Highlights
OOW 2013 Highlights
 
Lightning Fast Dataframes with Polars
Lightning Fast Dataframes with PolarsLightning Fast Dataframes with Polars
Lightning Fast Dataframes with Polars
 
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
 
Your backend architecture is what matters slideshare
Your backend architecture is what matters slideshareYour backend architecture is what matters slideshare
Your backend architecture is what matters slideshare
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGs
 
MongoDC 2012: "Operationalizing" MongoDB@AOL
MongoDC 2012: "Operationalizing" MongoDB@AOLMongoDC 2012: "Operationalizing" MongoDB@AOL
MongoDC 2012: "Operationalizing" MongoDB@AOL
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOL
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/Specialist
 
Hadoop workshop
Hadoop workshopHadoop workshop
Hadoop workshop
 
Why Hadoop is important to Syncsort
Why Hadoop is important to SyncsortWhy Hadoop is important to Syncsort
Why Hadoop is important to Syncsort
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 

Último

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 

Último (20)

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 

Pg nordic-day-2014-2 tb-enough

  • 1. 2TB of RAM ought to be enough for anybody PG Nordic Day 2014
  • 2. The Presenter 1 Renaud Bruyeron - @brew_your_own Paris, France
  • 3. The Presenter – short bio 2 • 1998-1999: W3C @ MIT • 1999-2013: FullSIX • Top 3 Web agency in FR • Strong focus on PHP, Java, and Oracle… • 2013-present: CTO of
  • 4. Schibsted Classified Media From 1 to 30+ Countries in 7 years 3
  • 5. 4
  • 6. Templated deployment in 30 countries w/ shared technology 5 • Technology originally inherited from Blocket.se • Has since evolved to power all 30 sites • Focus on performance and ease of local modificationsPostgreSQL Middleware (data access, business rules) Index Presentation layer APIs Web Search C C C PHP C/PHP C PL/SQL
  • 7. PostgreSQL in the SCM Platform 6 100+ servers running PostgreSQL 8TB of data 50+ million classified ads
  • 8. Schibsted Classified Media & PostgreSQL married…and in love  7
  • 9. #1 Classified Web site in France 8
  • 10. History 9 Project initiated in 2006 Site launch: early 2007 Based on technology from 1 to 230 people, challenger to #1 in 7 years
  • 11. 10
  • 12. 11
  • 13. 12
  • 14. Explosive growth…with a few bumps along the way! 13
  • 15. Big Audience 14 250M page views / day 5M unique visitors / day 18M UV / month That’s 1/3 of French internet population… 600000+ new ads / day 25M live ads #7 most visited Website in France
  • 16. Big Ops 15 300+ servers in 2 DCs 20 servers hosting PG databases (in production)
  • 17. Built on SCM Technology 16 PostgreSQL Middleware (data access, business rules) Index Presentation layer APIs Web Search
  • 18. Built on SCM Technology 17 Website Data Master Slave (hot standby) Slave (short queries) Slave (Long queries) Middleware (data access, business rules) Index Presentation layer APIs Web Search
  • 19. Built on SCM Technology 18 Website Data Master Slave (hot standby) Slave (short queries) Slave (Long queries) Middleware (data access, business rules) Index Presentation layer APIs Web Search +15 support DBs
  • 20. Website Data +15 support DBs We use PostgreSQL everywhere! 19 Backoffice & Analytics Master Slave (hot standby) Slave (short queries) Slave (Long queries) ODS OLAP BI & CRM Middleware (data access, business rules) Index Presentation layer APIs Web Search
  • 21. We try to limit writes! 20 • Index/Search acting as a structured cache • Master DB workload = 70% writes • Slaves used to offload read queries • Main database = 6TB on disk… • +4TB archived away… • 20K LOCs of PL/SQL Website Data +15 support DBs Master Slave (hot standby) Slave (short queries) Slave (Long queries) Index Search
  • 22. PostgreSQL works beautifully as a DW! 21 Website DataBackoffice & Analytics Master Slave (hot standby) Slave (short queries) Slave (Long queries) ODS OLAP ETL moving 500GB / day into the DW 2.2TB Adding 6GB/day
  • 23. Big Iron: HP DL980, 2TB of RAM, 64-80 cores 22
  • 24. HTOP on the Master  23
  • 25. Physical storage for the main PostgreSQL instances 24 DL980 DL980 3Par V800 (SAN) Fusion-IO DL980DL980 3Par V800 (SAN) Fusion-IO Fiber Channel Fiber Channel Master Slave SlaveSlave DC 1 DC 2
  • 26. Big Iron: 3Par V800 SAN 25
  • 27. Big Iron: thin provisioning with mix of SSD and FC disks 26
  • 28. Big Iron: high performance… 27
  • 30. Despite the caches and the search engine, we get impressive workloads on the master DB 29 600 tx/sec
  • 31. How did we get there? 30 2x DL980 w/ Fusion-IO 2x 3Par V800 2x DL980 w/ FC to the V800s
  • 32. We did go through growing pains and near disasters 31 The « Big One »
  • 33. Our own « worst day of our lives »: March 1st 2013 (1/2) 32 Master DB is slowing down dramatically We find that Slony replication is the culprit We don’t know what to do… Until we find a solution on the net that involves cleaning up slony metadata… …(you know where this is going)… We fumble. We notice. The slave is borked. Rebuilding the Slave with slony brings the Master down. Oh. God… We take the Master off the stack, and start rebuilding the slave w/ Slony …5 days later, we are done (!)…
  • 34. Our own « worst day of our lives »: March 1st 2013 (2/2) 33 …but we are not out of the woods yet!... Pent up demand is bringing the site down! We decide to switch to native replication! …but the network cards are maxed out by the replication data… …triggering a kernel bug… …(Murphy, could you please step out of the room?)… We implement network card bonding, and start moving support tables off the main instance …and we are done!
  • 36. Vertical Scalability has limits! 35 We are already running on the biggest HW money can buy Past certain volume levels, execution plans can change radically Huge instances are difficult to maintain & backup safely Rebuilding the slave in March 2013 took a full 5 days… Although we are maxing out the HW available to us, especially on the writes/s, We are committed to PostgreSQL at the heart of our platform
  • 37. Key ideas to take our platform to the next level 36 Sharding Unbundling of the schema PGQ Spread the reads & writes horizontally Move parts of the schema (that can be decoupled) to other instances. Spread the workload Reduce application-level transaction interleaving by moving parts of the transactions to asynchronous workers
  • 38. 37