SSD-Aware Scan Operation Optimization in PostgreSQL Database


This document summarizes a study on optimizing scan operations in PostgreSQL for SSD storage. It hypothesizes that index scans can outperform other scan methods on SSDs because random and sequential block access times are nearly equal. The methodology compares index scans against bitmap index scans and sequential (heap) scans on an SSD-equipped system. Results show running-time reductions of 29% to 56% for log selectivities from -2 to 1. The optimization helps only when at least the majority of the table can reside in main memory.

A Study on SSD-Aware Scan Operation Optimization in PostgreSQL Database

SSDs vs Traditional Spin-Type HDDs

SSDs
Silicon memory chips
No moving parts
No rotational delay
Near-zero seek time
Random and sequential block access times are almost the same!

But ...
The cost models in RDBMSs are based on the characteristics of spin-type HDDs.
They assume random_block_access_time > sequential_block_access_time.
With SSDs this assumption is not valid.
- Are there opportunities for improvement?

Background information
Scan operation
- SELECT * FROM table WHERE condition
Selectivity
Scan operation alternatives in PostgreSQL
- Heap scan (sequential scan)
- Bitmap index scan + Bitmap heap scan
- Index scan

Our Hypothesis
An index scan based on a secondary index can perform better than other scan operations in databases that run on SSD-type storage media.
This is based on the fact that on SSDs the random block access cost is almost the same as the sequential block access cost.

Our Hypothesis (Continued)
SELECT * FROM table WHERE column = val
- column is indexed (not primary)
- correlation between primary index and secondary index is zero

Methodology
Kingston 8 GB Data Traveler
Dedicated PC running Ubuntu 12.04 (i5 2.3 GHz processor and 4 GB system memory)
PostgreSQL 9.3
Table with 36 columns, 6,000,000 rows of data
SELECT * FROM table_1 WHERE column_1 > val_1 AND column_1 < val_2
1.7 GB of data (with indexes)

Methodology (Continued)
Numeric field “idx_column” indexed using a B-tree index
Correlation between primary index and secondary index = 0.000000…
Cardinality of the “idx_column” field is 933900
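
To make the zero-correlation condition concrete, here is a minimal sketch, assuming the slide means something like the per-column correlation statistic that PostgreSQL reports in pg_stats (the correlation between the physical row order and the logical order of the column values). The helper names and the synthetic data are hypothetical and are not taken from the study.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def order_correlation(values):
    """Correlation between each row's physical position and its value's rank."""
    positions = list(range(len(values)))
    # Rank of each value in sorted order (values are assumed distinct here).
    rank_of = {v: r for r, v in enumerate(sorted(values))}
    ranks = [rank_of[v] for v in values]
    return pearson(positions, ranks)


# Hypothetical column stored in value order (fully clustered).
clustered = list(range(10_000))

# The same values stored in random physical order (fixed seed for repeatability).
scattered = clustered[:]
random.Random(42).shuffle(scattered)

print(round(order_correlation(clustered), 3))
print(round(order_correlation(scattered), 3))
```

The clustered column gives a value close to 1 and the shuffled column a value close to 0. With a correlation near zero, consecutive index entries point to unrelated heap pages, so an index scan turns into mostly random block reads, which is the case the hypothesis targets.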

Selectivity (log)   Seq scan (ms)   BHS + BIS (ms)   Index scan (ms)
-4                  10594           0                0
-3                  10269           1                0
-2                  10255           9                4
-1                  10260           94               44
0                   10278           644              457
1                   10407           8794             4915
2                   11600           16528            49395
BHS + BIS: bitmap heap scan + bitmap index scan
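
One way to read the measurements above is to pick the fastest scan method at each selectivity. The sketch below copies the running times from the table; the function name is hypothetical and the code is only a reading aid, not part of the study.

```python
# Measured running times (ms) from the table, keyed by log selectivity.
RUNNING_TIMES_MS = {
    -4: {"seq scan": 10594, "BHS + BIS": 0, "index scan": 0},
    -3: {"seq scan": 10269, "BHS + BIS": 1, "index scan": 0},
    -2: {"seq scan": 10255, "BHS + BIS": 9, "index scan": 4},
    -1: {"seq scan": 10260, "BHS + BIS": 94, "index scan": 44},
    0: {"seq scan": 10278, "BHS + BIS": 644, "index scan": 457},
    1: {"seq scan": 10407, "BHS + BIS": 8794, "index scan": 4915},
    2: {"seq scan": 11600, "BHS + BIS": 16528, "index scan": 49395},
}


def fastest_methods(times):
    """Return the scan method(s) with the lowest running time (ties kept)."""
    best = min(times.values())
    return [name for name, ms in times.items() if ms == best]


for selectivity, times in sorted(RUNNING_TIMES_MS.items()):
    print(selectivity, fastest_methods(times))
```

On these numbers the index scan is fastest (or tied at 0 ms) for log selectivities from -4 to 1, and the sequential scan wins only at log selectivity 2.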

In PostgreSQL
random_block_access_time = 4 * seq_block_access_time
This assumes spin-type HDDs.
What is the relation in SSDs?
random_block_access_time = seq_block_access_time ??
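
The effect of the 4:1 ratio can be illustrated with a deliberately simplified I/O-only cost model. This is a toy sketch, not PostgreSQL's actual cost functions: it assumes a sequential scan reads every table page once, and an index scan on an uncorrelated column pays one random page read per matching row, ignoring index pages, CPU cost and caching. The parameter names mirror PostgreSQL's seq_page_cost and random_page_cost settings (defaults 1.0 and 4.0); the value of 25 rows per page is a hypothetical figure, not one from the study.

```python
def seq_scan_cost(table_pages, seq_page_cost=1.0):
    """Toy cost of reading every table page sequentially."""
    return table_pages * seq_page_cost


def index_scan_cost(matching_rows, random_page_cost):
    """Toy cost of one random page read per matching row (uncorrelated index)."""
    return matching_rows * random_page_cost


def break_even_selectivity(rows_per_page, random_page_cost, seq_page_cost=1.0):
    """Fraction of rows at which both toy costs are equal."""
    return seq_page_cost / (rows_per_page * random_page_cost)


ROWS = 6_000_000       # table size from the methodology slide
ROWS_PER_PAGE = 25     # hypothetical assumption for illustration
PAGES = ROWS // ROWS_PER_PAGE

for ratio in (4.0, 1.0):  # HDD-style ratio vs. the SSD question on the slide
    s = break_even_selectivity(ROWS_PER_PAGE, ratio)
    print(ratio, s, seq_scan_cost(PAGES), index_scan_cost(s * ROWS, ratio))
```

Under these assumptions the break-even selectivity moves from 1% of the rows at a ratio of 4 to 4% at a ratio of 1, so a planner using the lower ratio would keep choosing the index scan over a wider range of selectivities.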

Selectivity (log)   Before optimization (ms)   Optimum (ms)   After optimization (ms)   Cost reduction (ms)   Cost reduction (%)
-4                  0                          0              0                         0                     -
-3                  1                          0              0                         1                     100
-2                  9                          4              4                         5                     56
-1                  94                         44             44                        50                    53
0                   644                        457            457                       187                   29
1                   8794                       4915           4915                      3879                  44
2                   11600                      11600          11600                     0                     0
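
The cost reduction columns follow directly from the before and after running times. The sketch below recomputes them from the table values as a check; the function name is hypothetical, and rounding to whole percent is an assumption that happens to reproduce the figures shown.

```python
# (log selectivity, time before optimization in ms, time after optimization in ms)
MEASUREMENTS = [
    (-4, 0, 0),
    (-3, 1, 0),
    (-2, 9, 4),
    (-1, 94, 44),
    (0, 644, 457),
    (1, 8794, 4915),
    (2, 11600, 11600),
]


def cost_reduction(before_ms, after_ms):
    """Return (saved ms, saved percent); percent is None when before is 0."""
    saved = before_ms - after_ms
    percent = None if before_ms == 0 else round(100 * saved / before_ms)
    return saved, percent


for selectivity, before_ms, after_ms in MEASUREMENTS:
    print(selectivity, cost_reduction(before_ms, after_ms))
```

The percentage is undefined at log selectivity -4 because the running time before optimization is 0 ms, which is why that row shows "-".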

Are we done?
We haven’t considered an important factor
- relative size of the table compared to the system memory

Observations
Sequential scan time remains consistent across all system memory values. Why?
Both BIS + BHS and index scan underperform drastically when system memory is reduced.
BIS + BHS performs slightly better than index scan.

So the optimization works only under special conditions, where at least the majority of the table content can reside in main memory.
- Does this mean the optimization is of no use?

Potential of this optimization
- Small table size databases
- Embedded devices
- Mobile phones etc.
