SlideShare una empresa de Scribd logo
1 de 25
Descargar para leer sin conexión
Inside PostgreSQL Shared Memory
BRUCE MOMJIAN
September, 2013
POSTGRESQL is an open-source, full-featured relational database.
This presentation gives an overview of the shared memory
structures used by Postgres.
Creative Commons Attribution License http://momjian.us/presentations
Inside PostgreSQL Shared Memory 1 / 25
Outline
1. File storage format
2. Shared memory creation
3. Shared buffers
4. Row value access
5. Locking
6. Other structures
Inside PostgreSQL Shared Memory 2 / 25
File System /data
Postgres
Postgres
Postgres
/data
Inside PostgreSQL Shared Memory 3 / 25
File System /data/base
Postgres
Postgres
Postgres
/data
/pg_clog
/pg_multixact
/pg_subtrans
/pg_tblspc
/pg_xlog
/global
/pg_twophase
/base
Inside PostgreSQL Shared Memory 4 / 25
File System /data/base/db
Postgres
Postgres
Postgres
/data /base /16385 (production)
/1 (template1)
/17982 (devel)
/16821 (test)
/21452 (marketing)
Inside PostgreSQL Shared Memory 5 / 25
File System /data/base/db/table
Postgres
Postgres
Postgres
/data /base /16385 /24692 (customer)
/27214 (order)
/25932 (product)
/25952 (employee)
/27839 (part)
Inside PostgreSQL Shared Memory 6 / 25
File System Data Pages
Postgres
Postgres
Postgres
8k 8k 8k 8k
/data /base /16385 /24692
Inside PostgreSQL Shared Memory 7 / 25
Data Pages
Postgres
Postgres
Postgres
Page Header Item Item Item
Tuple
Tuple Tuple Special
8K
8k 8k 8k 8k
/data /base /16385 /24692
Inside PostgreSQL Shared Memory 8 / 25
File System Block Tuple
Postgres
Postgres
Postgres
Page Header Item Item Item
Tuple
Tuple Tuple Special
8K
8k 8k 8k 8k
/data /base /16385 /24692
Tuple
Inside PostgreSQL Shared Memory 9 / 25
File System Tuple
hoff − length of tuple header
infomask − tuple flags
natts − number of attributes
ctid − tuple id (page / item)
cmax − destruction command id
xmin − creation transaction id
xmax − destruction transaction id
cmin − creation command id
bits − bit map representing NULLs
OID − object id of tuple (optional)
Tuple
Value Value ValueValue Value Value ValueHeader
int4in(’9241’)
textout()
’Martin’
Inside PostgreSQL Shared Memory 10 / 25
Tuple Header C Structures
typedef struct HeapTupleFields
{
TransactionId t_xmin; /* inserting xact ID */
TransactionId t_xmax; /* deleting or locking xact ID */
union
{
CommandId t_cid; /* inserting or deleting command ID, or both */
TransactionId t_xvac; /* VACUUM FULL xact ID */
} t_field3;
} HeapTupleFields;
typedef struct HeapTupleHeaderData
{
union
{
HeapTupleFields t_heap;
DatumTupleFields t_datum;
} t_choice;
ItemPointerData t_ctid; /* current TID of this or newer tuple */
/* Fields below here must match MinimalTupleData! */
uint16 t_infomask2; /* number of attributes + various flags */
uint16 t_infomask; /* various flag bits, see below */
uint8 t_hoff; /* sizeof header incl. bitmap, padding */
/* ^ − 23 bytes − ^ */
bits8 t_bits[1]; /* bitmap of NULLs −− VARIABLE LENGTH */
/* MORE DATA FOLLOWS AT END OF STRUCT */
} HeapTupleHeaderData; Inside PostgreSQL Shared Memory 11 / 25
Shared Memory Creation
postmaster postgres postgres
Program (Text)
Data
Program (Text)
Data
Shared Memory
Program (Text)
Data
Shared Memory Shared Memory
Stack Stack
fork()
Stack
Inside PostgreSQL Shared Memory 12 / 25
Shared Memory
Shared Buffers
Proc Array
PROC
Multi−XACT Buffers
Two−Phase Structs
Subtrans Buffers
CLOG Buffers
XLOG Buffers
Shared Invalidation
Lightweight Locks
Lock Hashes
Auto Vacuum
Btree Vacuum
Buffer Descriptors
Background Writer Synchronized Scan
Semaphores
Statistics
LOCK
PROCLOCK
Inside PostgreSQL Shared Memory 13 / 25
Shared Buffers
Page Header Item Item Item
Tuple
Tuple Tuple Special
8K
Postgres
Postgres
Postgres
8k 8k 8k 8k
/data /base /16385 /24692
Shared Buffers
LWLock − for page changes
Pin Count − prevent page replacement
read()
write()
8k 8k 8k
Buffer Descriptors
Inside PostgreSQL Shared Memory 14 / 25
HeapTuples
Shared Buffers
Postgres
HeapTuple
C pointer
hoff − length of tuple header
infomask − tuple flags
natts − number of attributes
ctid − tuple id (page / item)
cmax − destruction command id
xmin − creation transaction id
xmax − destruction transaction id
cmin − creation command id
bits − bit map representing NULLs
OID − object id of tuple (optional)
Tuple
Value Value ValueValue Value Value ValueHeader
int4in(’9241’)
textout()
’Martin’
Page Header Item Item Item
Tuple
Tuple Tuple Special
8K
8k 8k 8k
Inside PostgreSQL Shared Memory 15 / 25
Finding A Tuple Value in C
Datum
nocachegetattr(HeapTuple tuple,
int attnum,
TupleDesc tupleDesc,
bool *isnull)
{
HeapTupleHeader tup = tuple−>t_data;
Form_pg_attribute *att = tupleDesc−>attrs;
{
int i;
/*
* Note − This loop is a little tricky. For each non−null attribute,
* we have to first account for alignment padding before the attr,
* then advance over the attr based on its length. Nulls have no
* storage and no alignment padding either. We can use/set
* attcacheoff until we reach either a null or a var−width attribute.
*/
off = 0;
for (i = 0;; i++) /* loop exit is at "break" */
{
if (HeapTupleHasNulls(tuple) && att_isnull(i, bp))
continue; /* this cannot be the target att */
if (att[i]−>attlen == −1)
off = att_align_pointer(off, att[i]−>attalign, −1,
tp + off);
else
/* not varlena, so safe to use att_align_nominal */
off = att_align_nominal(off, att[i]−>attalign);
if (i == attnum)
break;
off = att_addlength_pointer(off, att[i]−>attlen, tp + off);
}
}
return fetchatt(att[attnum], tp + off);
} Inside PostgreSQL Shared Memory 16 / 25
Value Access in C
#define fetch_att(T,attbyval,attlen) 
( 
(attbyval) ? 
( 
(attlen) == (int) sizeof(int32) ? 
Int32GetDatum(*((int32 *)(T))) 
: 
( 
(attlen) == (int) sizeof(int16) ? 
Int16GetDatum(*((int16 *)(T))) 
: 
( 
AssertMacro((attlen) == 1), 
CharGetDatum(*((char *)(T))) 
) 
) 
) 
: 
PointerGetDatum((char *) (T)) 
)
Inside PostgreSQL Shared Memory 17 / 25
Test And Set Lock
Can Succeed Or Fail
0/1
1
0
Success
Was 0 on exchange
Lock already taken
Was 1 on exchange
Failure
1
1
Inside PostgreSQL Shared Memory 18 / 25
Test And Set Lock
x86 Assembler
static __inline__ int
tas(volatile slock_t *lock)
{
register slock_t _res = 1;
/*
* Use a non−locking test before asserting the bus lock. Note that the
* extra test appears to be a small loss on some x86 platforms and a small
* win on others; it’s by no means clear that we should keep it.
*/
__asm__ __volatile__(
" cmpb $0,%1 n"
" jne 1f n"
" lock n"
" xchgb %0,%1 n"
"1: n"
: "+q"(_res), "+m"(*lock)
:
: "memory", "cc");
return (int) _res;
}
Inside PostgreSQL Shared Memory 19 / 25
Spin Lock
Always Succeeds
0/1
1
Success
Was 0 on exchange
Failure
Was 1 on exchange
Lock already taken
Sleep of increasing duration
0 1
1
Spinlocks are designed for short-lived locking operations, like
access to control structures. They are not be used to protect code
that makes kernel calls or other heavy operations.
Inside PostgreSQL Shared Memory 20 / 25
Light Weight Locks
Shared Buffers
Proc Array
PROC
Multi−XACT Buffers
Two−Phase Structs
Subtrans Buffers
CLOG Buffers
XLOG Buffers
Shared Invalidation
Lightweight Locks
Auto Vacuum
Btree Vacuum
Background Writer Synchronized Scan
Semaphores
Statistics
LOCK
PROCLOCK
Lock Hashes
Buffer Descriptors
Sleep On Lock
Light weight locks attempt to acquire the lock, and go to sleep on
a semaphore if the lock request fails. Spinlocks control access to
the light weight lock control structure.
Inside PostgreSQL Shared Memory 21 / 25
Database Object Locks
PROCLOCKPROC LOCK
Lock Hashes
Inside PostgreSQL Shared Memory 22 / 25
Proc
Proc Array
used usedusedempty empty empty
PROC
Inside PostgreSQL Shared Memory 23 / 25
Other Shared Memory Structures
Shared Buffers
Proc Array
PROC
Multi−XACT Buffers
Two−Phase Structs
Subtrans Buffers
CLOG Buffers
XLOG Buffers
Shared Invalidation
Lightweight Locks
Lock Hashes
Auto Vacuum
Btree Vacuum
Buffer Descriptors
Background Writer Synchronized Scan
Semaphores
Statistics
LOCK
PROCLOCK
Inside PostgreSQL Shared Memory 24 / 25
Conclusion
http://momjian.us/presentations Pink Floyd: Wish You Were Here
Inside PostgreSQL Shared Memory 25 / 25

Más contenido relacionado

La actualidad más candente

Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performancePostgreSQL-Consulting
 
Understanding PostgreSQL LW Locks
Understanding PostgreSQL LW LocksUnderstanding PostgreSQL LW Locks
Understanding PostgreSQL LW LocksJignesh Shah
 
Linux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownScyllaDB
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...Databricks
 
Tuning PostgreSQL for High Write Throughput
Tuning PostgreSQL for High Write Throughput Tuning PostgreSQL for High Write Throughput
Tuning PostgreSQL for High Write Throughput Grant McAlister
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper lookJignesh Shah
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep InternalEXEM
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresqlbotsplash.com
 
Oracle 12cR2 Installation On Linux With ASM
Oracle 12cR2 Installation On Linux With ASMOracle 12cR2 Installation On Linux With ASM
Oracle 12cR2 Installation On Linux With ASMArun Sharma
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephSage Weil
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDatabricks
 
Ceph Tech Talk: Ceph at DigitalOcean
Ceph Tech Talk: Ceph at DigitalOceanCeph Tech Talk: Ceph at DigitalOcean
Ceph Tech Talk: Ceph at DigitalOceanCeph Community
 
Spark Performance Tuning .pdf
Spark Performance Tuning .pdfSpark Performance Tuning .pdf
Spark Performance Tuning .pdfAmit Raj
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep divet3rmin4t0r
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing GuideJose De La Rosa
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architectureAjeet Singh
 

La actualidad más candente (20)

Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
Understanding PostgreSQL LW Locks
Understanding PostgreSQL LW LocksUnderstanding PostgreSQL LW Locks
Understanding PostgreSQL LW Locks
 
Linux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance Showdown
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
 
Tuning PostgreSQL for High Write Throughput
Tuning PostgreSQL for High Write Throughput Tuning PostgreSQL for High Write Throughput
Tuning PostgreSQL for High Write Throughput
 
PostgreSQL Extensions: A deeper look
PostgreSQL Extensions:  A deeper lookPostgreSQL Extensions:  A deeper look
PostgreSQL Extensions: A deeper look
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
 
Oracle ASM Training
Oracle ASM TrainingOracle ASM Training
Oracle ASM Training
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
Oracle 12cR2 Installation On Linux With ASM
Oracle 12cR2 Installation On Linux With ASMOracle 12cR2 Installation On Linux With ASM
Oracle 12cR2 Installation On Linux With ASM
 
BlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for CephBlueStore: a new, faster storage backend for Ceph
BlueStore: a new, faster storage backend for Ceph
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things Right
 
Ceph Tech Talk: Ceph at DigitalOcean
Ceph Tech Talk: Ceph at DigitalOceanCeph Tech Talk: Ceph at DigitalOcean
Ceph Tech Talk: Ceph at DigitalOcean
 
Map reduce vs spark
Map reduce vs sparkMap reduce vs spark
Map reduce vs spark
 
Spark Performance Tuning .pdf
Spark Performance Tuning .pdfSpark Performance Tuning .pdf
Spark Performance Tuning .pdf
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architecture
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
 

Destacado

Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenSteve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenPostgresOpen
 
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenPostgresOpen
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenPostgresOpen
 
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013PostgresOpen
 
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...PostgresOpen
 
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenDavid Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenPostgresOpen
 
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...PostgresOpen
 
Keith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenKeith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenPostgresOpen
 
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...PostgresOpen
 
Islamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningIslamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningUmair Shahid
 
Islamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningIslamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningUmair Shahid
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Denish Patel
 
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres OpenRobert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres OpenPostgresOpen
 
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenMichael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenPostgresOpen
 
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To FinishPoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To Finishelliando dias
 
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres OpenKoichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres OpenPostgresOpen
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGiuseppe Broccolo
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HAharoonm
 
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenMichael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenPostgresOpen
 

Destacado (20)

Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenSteve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
 
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
 
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
 
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
 
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenDavid Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
 
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
 
Keith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenKeith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres Open
 
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...Selena Deckelmann - Sane Schema Management with  Alembic and SQLAlchemy @ Pos...
Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Pos...
 
Islamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningIslamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuning
 
Islamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningIslamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuning
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)
 
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres OpenRobert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
 
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenMichael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
 
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To FinishPoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
 
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres OpenKoichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfs
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HA
 
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenMichael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
 
Geometria Projetiva
Geometria ProjetivaGeometria Projetiva
Geometria Projetiva
 

Similar a Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open

The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelDivye Kapoor
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniZalando Technology
 
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the bfinalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the bChereCheek752
 
PG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
PG Day'14 Russia, PostgreSQL System Architecture, Heikki LinnakangasPG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
PG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangaspgdayrussia
 
Drizzles Approach To Improving Performance Of The Server
Drizzles  Approach To  Improving  Performance Of The  ServerDrizzles  Approach To  Improving  Performance Of The  Server
Drizzles Approach To Improving Performance Of The ServerPerconaPerformance
 
11 Things About11g
11 Things About11g11 Things About11g
11 Things About11gfcamachob
 
11thingsabout11g 12659705398222 Phpapp01
11thingsabout11g 12659705398222 Phpapp0111thingsabout11g 12659705398222 Phpapp01
11thingsabout11g 12659705398222 Phpapp01Karam Abuataya
 
ch 7 POSIX.pptx
ch 7 POSIX.pptxch 7 POSIX.pptx
ch 7 POSIX.pptxsibokac
 
Implementing of classical synchronization problem by using semaphores
Implementing of classical synchronization problem by using semaphoresImplementing of classical synchronization problem by using semaphores
Implementing of classical synchronization problem by using semaphoresGowtham Reddy
 
Kapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EngineKapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EnginePrashant Vats
 
An Overview of SystemVerilog for Design and Verification
An Overview of SystemVerilog  for Design and VerificationAn Overview of SystemVerilog  for Design and Verification
An Overview of SystemVerilog for Design and VerificationKapilRaghunandanTrip
 
Geep networking stack-linuxkernel
Geep networking stack-linuxkernelGeep networking stack-linuxkernel
Geep networking stack-linuxkernelKiran Divekar
 

Similar a Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open (20)

The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
Sysprog 14
Sysprog 14Sysprog 14
Sysprog 14
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the bfinalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
finalprojtemplatev5finalprojtemplate.gitignore# Ignore the b
 
Sockets and Socket-Buffer
Sockets and Socket-BufferSockets and Socket-Buffer
Sockets and Socket-Buffer
 
PG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
PG Day'14 Russia, PostgreSQL System Architecture, Heikki LinnakangasPG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
PG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
 
Writing MySQL UDFs
Writing MySQL UDFsWriting MySQL UDFs
Writing MySQL UDFs
 
Drizzles Approach To Improving Performance Of The Server
Drizzles  Approach To  Improving  Performance Of The  ServerDrizzles  Approach To  Improving  Performance Of The  Server
Drizzles Approach To Improving Performance Of The Server
 
11 Things About11g
11 Things About11g11 Things About11g
11 Things About11g
 
11thingsabout11g 12659705398222 Phpapp01
11thingsabout11g 12659705398222 Phpapp0111thingsabout11g 12659705398222 Phpapp01
11thingsabout11g 12659705398222 Phpapp01
 
Scope Stack Allocation
Scope Stack AllocationScope Stack Allocation
Scope Stack Allocation
 
ch 7 POSIX.pptx
ch 7 POSIX.pptxch 7 POSIX.pptx
ch 7 POSIX.pptx
 
Microkernel Development
Microkernel DevelopmentMicrokernel Development
Microkernel Development
 
Implementing of classical synchronization problem by using semaphores
Implementing of classical synchronization problem by using semaphoresImplementing of classical synchronization problem by using semaphores
Implementing of classical synchronization problem by using semaphores
 
Data recovery using pg_filedump
Data recovery using pg_filedumpData recovery using pg_filedump
Data recovery using pg_filedump
 
Kapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EngineKapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing Engine
 
An Overview of SystemVerilog for Design and Verification
An Overview of SystemVerilog  for Design and VerificationAn Overview of SystemVerilog  for Design and Verification
An Overview of SystemVerilog for Design and Verification
 
Osol Pgsql
Osol PgsqlOsol Pgsql
Osol Pgsql
 
Oop 1
Oop 1Oop 1
Oop 1
 
Geep networking stack-linuxkernel
Geep networking stack-linuxkernelGeep networking stack-linuxkernel
Geep networking stack-linuxkernel
 

Último

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Último (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open

  • 1. Inside PostgreSQL Shared Memory BRUCE MOMJIAN September, 2013 POSTGRESQL is an open-source, full-featured relational database. This presentation gives an overview of the shared memory structures used by Postgres. Creative Commons Attribution License http://momjian.us/presentations Inside PostgreSQL Shared Memory 1 / 25
  • 2. Outline 1. File storage format 2. Shared memory creation 3. Shared buffers 4. Row value access 5. Locking 6. Other structures Inside PostgreSQL Shared Memory 2 / 25
  • 5. File System /data/base/db Postgres Postgres Postgres /data /base /16385 (production) /1 (template1) /17982 (devel) /16821 (test) /21452 (marketing) Inside PostgreSQL Shared Memory 5 / 25
  • 6. File System /data/base/db/table Postgres Postgres Postgres /data /base /16385 /24692 (customer) /27214 (order) /25932 (product) /25952 (employee) /27839 (part) Inside PostgreSQL Shared Memory 6 / 25
  • 7. File System Data Pages Postgres Postgres Postgres 8k 8k 8k 8k /data /base /16385 /24692 Inside PostgreSQL Shared Memory 7 / 25
  • 8. Data Pages Postgres Postgres Postgres Page Header Item Item Item Tuple Tuple Tuple Special 8K 8k 8k 8k 8k /data /base /16385 /24692 Inside PostgreSQL Shared Memory 8 / 25
  • 9. File System Block Tuple Postgres Postgres Postgres Page Header Item Item Item Tuple Tuple Tuple Special 8K 8k 8k 8k 8k /data /base /16385 /24692 Tuple Inside PostgreSQL Shared Memory 9 / 25
  • 10. File System Tuple hoff − length of tuple header infomask − tuple flags natts − number of attributes ctid − tuple id (page / item) cmax − destruction command id xmin − creation transaction id xmax − destruction transaction id cmin − creation command id bits − bit map representing NULLs OID − object id of tuple (optional) Tuple Value Value ValueValue Value Value ValueHeader int4in(’9241’) textout() ’Martin’ Inside PostgreSQL Shared Memory 10 / 25
  • 11. Tuple Header C Structures typedef struct HeapTupleFields { TransactionId t_xmin; /* inserting xact ID */ TransactionId t_xmax; /* deleting or locking xact ID */ union { CommandId t_cid; /* inserting or deleting command ID, or both */ TransactionId t_xvac; /* VACUUM FULL xact ID */ } t_field3; } HeapTupleFields; typedef struct HeapTupleHeaderData { union { HeapTupleFields t_heap; DatumTupleFields t_datum; } t_choice; ItemPointerData t_ctid; /* current TID of this or newer tuple */ /* Fields below here must match MinimalTupleData! */ uint16 t_infomask2; /* number of attributes + various flags */ uint16 t_infomask; /* various flag bits, see below */ uint8 t_hoff; /* sizeof header incl. bitmap, padding */ /* ^ − 23 bytes − ^ */ bits8 t_bits[1]; /* bitmap of NULLs −− VARIABLE LENGTH */ /* MORE DATA FOLLOWS AT END OF STRUCT */ } HeapTupleHeaderData; Inside PostgreSQL Shared Memory 11 / 25
  • 12. Shared Memory Creation postmaster postgres postgres Program (Text) Data Program (Text) Data Shared Memory Program (Text) Data Shared Memory Shared Memory Stack Stack fork() Stack Inside PostgreSQL Shared Memory 12 / 25
  • 13. Shared Memory Shared Buffers Proc Array PROC Multi−XACT Buffers Two−Phase Structs Subtrans Buffers CLOG Buffers XLOG Buffers Shared Invalidation Lightweight Locks Lock Hashes Auto Vacuum Btree Vacuum Buffer Descriptors Background Writer Synchronized Scan Semaphores Statistics LOCK PROCLOCK Inside PostgreSQL Shared Memory 13 / 25
  • 14. Shared Buffers Page Header Item Item Item Tuple Tuple Tuple Special 8K Postgres Postgres Postgres 8k 8k 8k 8k /data /base /16385 /24692 Shared Buffers LWLock − for page changes Pin Count − prevent page replacement read() write() 8k 8k 8k Buffer Descriptors Inside PostgreSQL Shared Memory 14 / 25
  • 15. HeapTuples Shared Buffers Postgres HeapTuple C pointer hoff − length of tuple header infomask − tuple flags natts − number of attributes ctid − tuple id (page / item) cmax − destruction command id xmin − creation transaction id xmax − destruction transaction id cmin − creation command id bits − bit map representing NULLs OID − object id of tuple (optional) Tuple Value Value ValueValue Value Value ValueHeader int4in(’9241’) textout() ’Martin’ Page Header Item Item Item Tuple Tuple Tuple Special 8K 8k 8k 8k Inside PostgreSQL Shared Memory 15 / 25
  • 16. Finding A Tuple Value in C Datum nocachegetattr(HeapTuple tuple, int attnum, TupleDesc tupleDesc, bool *isnull) { HeapTupleHeader tup = tuple−>t_data; Form_pg_attribute *att = tupleDesc−>attrs; { int i; /* * Note − This loop is a little tricky. For each non−null attribute, * we have to first account for alignment padding before the attr, * then advance over the attr based on its length. Nulls have no * storage and no alignment padding either. We can use/set * attcacheoff until we reach either a null or a var−width attribute. */ off = 0; for (i = 0;; i++) /* loop exit is at "break" */ { if (HeapTupleHasNulls(tuple) && att_isnull(i, bp)) continue; /* this cannot be the target att */ if (att[i]−>attlen == −1) off = att_align_pointer(off, att[i]−>attalign, −1, tp + off); else /* not varlena, so safe to use att_align_nominal */ off = att_align_nominal(off, att[i]−>attalign); if (i == attnum) break; off = att_addlength_pointer(off, att[i]−>attlen, tp + off); } } return fetchatt(att[attnum], tp + off); } Inside PostgreSQL Shared Memory 16 / 25
  • 17. Value Access in C #define fetch_att(T,attbyval,attlen) ( (attbyval) ? ( (attlen) == (int) sizeof(int32) ? Int32GetDatum(*((int32 *)(T))) : ( (attlen) == (int) sizeof(int16) ? Int16GetDatum(*((int16 *)(T))) : ( AssertMacro((attlen) == 1), CharGetDatum(*((char *)(T))) ) ) ) : PointerGetDatum((char *) (T)) ) Inside PostgreSQL Shared Memory 17 / 25
  • 18. Test And Set Lock Can Succeed Or Fail 0/1 1 0 Success Was 0 on exchange Lock already taken Was 1 on exchange Failure 1 1 Inside PostgreSQL Shared Memory 18 / 25
  • 19. Test And Set Lock x86 Assembler static __inline__ int tas(volatile slock_t *lock) { register slock_t _res = 1; /* * Use a non−locking test before asserting the bus lock. Note that the * extra test appears to be a small loss on some x86 platforms and a small * win on others; it’s by no means clear that we should keep it. */ __asm__ __volatile__( " cmpb $0,%1 n" " jne 1f n" " lock n" " xchgb %0,%1 n" "1: n" : "+q"(_res), "+m"(*lock) : : "memory", "cc"); return (int) _res; } Inside PostgreSQL Shared Memory 19 / 25
  • 20. Spin Lock Always Succeeds 0/1 1 Success Was 0 on exchange Failure Was 1 on exchange Lock already taken Sleep of increasing duration 0 1 1 Spinlocks are designed for short-lived locking operations, like access to control structures. They are not be used to protect code that makes kernel calls or other heavy operations. Inside PostgreSQL Shared Memory 20 / 25
  • 21. Light Weight Locks Shared Buffers Proc Array PROC Multi−XACT Buffers Two−Phase Structs Subtrans Buffers CLOG Buffers XLOG Buffers Shared Invalidation Lightweight Locks Auto Vacuum Btree Vacuum Background Writer Synchronized Scan Semaphores Statistics LOCK PROCLOCK Lock Hashes Buffer Descriptors Sleep On Lock Light weight locks attempt to acquire the lock, and go to sleep on a semaphore if the lock request fails. Spinlocks control access to the light weight lock control structure. Inside PostgreSQL Shared Memory 21 / 25
  • 22. Database Object Locks PROCLOCKPROC LOCK Lock Hashes Inside PostgreSQL Shared Memory 22 / 25
  • 23. Proc Proc Array used usedusedempty empty empty PROC Inside PostgreSQL Shared Memory 23 / 25
  • 24. Other Shared Memory Structures Shared Buffers Proc Array PROC Multi−XACT Buffers Two−Phase Structs Subtrans Buffers CLOG Buffers XLOG Buffers Shared Invalidation Lightweight Locks Lock Hashes Auto Vacuum Btree Vacuum Buffer Descriptors Background Writer Synchronized Scan Semaphores Statistics LOCK PROCLOCK Inside PostgreSQL Shared Memory 24 / 25
  • 25. Conclusion http://momjian.us/presentations Pink Floyd: Wish You Were Here Inside PostgreSQL Shared Memory 25 / 25