SlideShare a Scribd company logo
1 of 105
Download to read offline
About multiblock reads
!
!
!

Frits Hoogland - UKOUG Tech - 2013

!1
Who am I?
! Frits Hoogland
! Working with Oracle products since 1996
!
!

! Blog: http://fritshoogland.wordpress.com
! Twitter: @fritshoogland
! Email: frits.hoogland@enkitec.com
! Oracle ACE Director
! OakTable Member
!2
Agenda
! The Oracle full segment scan implementation
! Options for full segments scans <=V10 vs. V11
!

! ‘traditional full scans’ & direct path full scans
!

! ‘autotune’ / adaptive direct path reads
!

! Implementation details

!3
What is this presentation about?
! Multiblock reads can behave different after 10.2
! This could lead to different behavior of applications using
the database.
!

! I assume the audience to have basic understanding about:
! Oracle execution plans
! Oracle SQL/10046 extended traces
! General execution behavior of the RDBMS engine
! C language in general
!4
Row source operations
! Multiblock reads are an optimised method to read
database blocks from disk for a database process.
!

! Mainly used for the rowsource operations:
!

! ‘TABLE ACCESS FULL’
! ‘FAST FULL INDEX SCAN’
! ‘BITMAP FULL SCAN’

!5
Row source operations
! For much of other segment access rowsource
actions, like:
! ‘INDEX UNIQUE SCAN’
! ‘INDEX RANGE SCAN’
! ‘INDEX FULL SCAN’
! ‘TABLE ACCESS BY INDEX ROWID’
! Use single block reads are mostly used.
! The order in which individual blocks are read is
important.
!6
db file multiblock read count
! MBRC is the maximum amount of blocks read in one IO
call.
! Buffered MBRC cannot cross extent borders.
!

! Concepts guide on full table scans: (11.2 version)
! A scan of table data in which the database sequentially
reads all rows from a table and filters out those that
do not meet the selection criteria. All data blocks
under the high water mark are scanned.

!7
db file multiblock read count
! Multiblock reads are done up to
DB_FILE_MULTIBLOCK_READ_COUNT blocks.
! If MBRC is unset, default is ‘maximum IO size
that can be efficiently performed’.
!

! Most operating systems allow a single IO
operation up to 1 MB.
! I prefer to set it manually.

!8
My test environment
! Mac OSX Mountain Lion, VMware fusion
! VM: OL6u3 x64
! Database 10.2.0.1, 11.2.0.1 and 11.2.0.3
! ASM GI 11.2.0.3
!

! Sample tables:
! T1 - 21504 blocks - 176M - 1’000’000 rows
! PK index - 2304 blocks / 19M
! T2 - 21504 blocks - 176M - 1’000’000 rows

!9
First test
! 10.2.0.1 instance:
! sga_target = 600M
! Effective buffercache size = 450M
! Freshly started

!10
First test
TS@v10201 > select /*+ index(t t1_pk_ix) */ count(id), sum(scattered) from t1 t;!

!
COUNT(ID) SUM(SCATTERED)!
---------- --------------!
1000000

9999500000!

!
----------------------------------------------------------------------------------!
| Id | Operation
| Name
| Rows | Bytes | Cost (%CPU) |!
----------------------------------------------------------------------------------!
|
0 | SELECT STATEMENT
|
|
1 |
5 | 23234
(1) |!
|
1 | SORT AGGREGATE
|
|
1 |
5 |
|!
|
2 |
TABLE ACCESS BY INDEX ROWID | T1
| 1000K | 4884K | 23234
(1) |!
|
3 |
INDEX FULL SCAN
| T1_PK_IX | 1000K |
| 2253
(2) |!
----------------------------------------------------------------------------------!

!11
First test
! How would you expect Oracle 10.2.0.1 to execute
this?
!

! In other words:
! What would be the result of a SQL trace with
waits? *
!

* If all blocks need to be read from disk (ie. not
cached)
!12
First test
! My guess would be:
!

! Index root bock (1 block)
! None, one or more branch blocks (1 block)
! Index leaf block, fetch values (1 block)
! Table block via index rowid, fetch value(s) (1/1+ block)
! Index values, block value(s), etc.

!13
First test
! That should look like something like this:
!

WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
...

#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:
#8:

nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db

file
file
file
file
file
file
file
file
file
file
file
file
file
file

sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential
sequential

read'
read'
read'
read'
read'
read'
read'
read'
read'
read'
read'
read'
read'
read'

ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=

326 file#=5 block#=43028 blocks=1!
197 file#=5 block#=43719 blocks=1!
227 file#=5 block#=43029 blocks=1!
125 file#=5 block#=20 blocks=1!
109 file#=5 block#=21 blocks=1!
242 file#=5 block#=22 blocks=1!
98 file#=5 block#=23 blocks=1!
76 file#=5 block#=24 blocks=1!
77 file#=5 block#=25 blocks=1!
77 file#=5 block#=26 blocks=1!
105 file#=5 block#=27 blocks=1!
82 file#=5 block#=28 blocks=1!
71 file#=5 block#=29 blocks=1!
93 file#=5 block#=43030 blocks=1!

!14
First test
! Instead, I get:
!
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
...!

#4:
#4:
#4:
#4:
#4:
#4:
#4:
#4:
#4:
#4:

nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db

file
file
file
file
file
file
file
file
file
file

scattered
scattered
scattered
scattered
scattered
scattered
scattered
scattered
scattered
scattered

read'
read'
read'
read'
read'
read'
read'
read'
read'
read'

ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=

361
220
205
219
192
141
123
190
231
113

file#=5
file#=5
file#=5
file#=5
file#=5
file#=5
file#=5
file#=5
file#=5
file#=5

block#=43025 blocks=8!
block#=43713 blocks=8!
block#=17 blocks=8!
block#=25 blocks=8!
block#=33 blocks=8!
block#=41 blocks=8!
block#=49 blocks=8!
block#=57 blocks=8!
block#=43033 blocks=8!
block#=65 blocks=8!

!15
First test
! Sets of 8 blocks are read for rowsources which really
need a single block.
! Reason:
! This is an empty cache.
! Oracle reads multiple blocks to get the cache filled.
! ‘cache warming’
! Statistic (‘physical reads cache prefetch’)
! Needed to tune the BC down to 50M and pre-warm it
with another table to get single block reads again (!!)
! Or set “_db_cache_pre_warm” to false
!16
Full scan - Oracle 10.2
! Let’s look at an Oracle 10.2.0.1 database
!

! SGA_TARGET 600M
!

! Table TS.T2 size 21504 blks / 176M

!17
Full scan - Oracle 10.2
TS@v10201 > set autot on exp stat!
TS@v10201 > select count(*) from t2;!

!
COUNT(*)!
----------!
1000000!

!
Execution Plan!
----------------------------------------------------------!
Plan hash value: 3724264953!

!
-------------------------------------------------------------------!
| Id

| Operation

| Name | Rows

| Cost (%CPU)| Time

|!

-------------------------------------------------------------------!
|

0 | SELECT STATEMENT

|

|

1 |

|

1 |

|

|

1 |

|

2 |

|

1007K|

SORT AGGREGATE

TABLE ACCESS FULL| T2

3674

(1)| 00:00:45 |!
|

3674

|!

(1)| 00:00:45 |!

-------------------------------------------------------------------

!18
Full scan - Oracle 10.2
Statistics!
----------------------------------------------------------!
212
0

recursive calls!
db block gets!

20976

consistent gets!

20942

physical reads!

0

redo size!

515

bytes sent via SQL*Net to client!

469

bytes received via SQL*Net from client!

2

SQL*Net roundtrips to/from client!

4

sorts (memory)!

0

sorts (disk)!

1

rows processed!

!19
Full scan - Oracle 10.2
SYS@v10201 AS SYSDBA> !
select object_id, object_name, owner from dba_objects where object_name = 'T2';!

!
OBJECT_ID OBJECT_NAME

OWNER!

---------- --------------------------------------------- ------------------!
10237 T2

TS!

!
!
SYS@v10201 AS SYSDBA> select * from x$kcboqh where obj# = 10237;!

!
ADDR

INDX

INST_ID

TS#

OBJ#

NUM_BUF HEADER!

---------------- -------- --------- ------- ------- --------- ----------------!
FFFFFD7FFD5C6FA8

335

1

5

10237

20942 000000038FBCF840!

!20
Full scan - Oracle 10.2
TS@v10201 > select count(*) from t2;!

!
Statistics!
----------------------------------------------------------!
0

recursive calls!

0

db block gets!

20953

consistent gets!

0

physical reads!

0

redo size!

515

bytes sent via SQL*Net to client!

469

bytes received via SQL*Net from client!

2

SQL*Net roundtrips to/from client!

0

sorts (memory)!

0

sorts (disk)!

1

rows processed!

!21
Full scan - Oracle 11.2
! Now look at an Oracle 11.2.0.3 database
!

! SGA_TARGET 600M
!

! Table TS.T2 size 21504 blks / 176M

!22
Full scan - Oracle 11.2
TS@v11203 > select count(*) from t2;!

!
COUNT(*)!
----------!
1000000!

!
!
Execution Plan!
----------------------------------------------------------!
Plan hash value: 3724264953!

!
-------------------------------------------------------------------!
| Id

| Operation

| Name | Rows

| Cost (%CPU)| Time

|!

-------------------------------------------------------------------!
|

0 | SELECT STATEMENT

|

|

1 |

|

1 |

|

|

1 |

|

2 |

|

1000K|

SORT AGGREGATE

TABLE ACCESS FULL| T2

3672

(1)| 00:00:45 |!
|

3672

|!

(1)| 00:00:45 |!

-------------------------------------------------------------------

!23
Full scan - Oracle 11.2
Statistics!
----------------------------------------------------------!
217
0

recursive calls!
db block gets!

20970

consistent gets!

20942

physical reads!

0

redo size!

526

bytes sent via SQL*Net to client!

523

bytes received via SQL*Net from client!

2

SQL*Net roundtrips to/from client!

4

sorts (memory)!

0

sorts (disk)!

1

rows processed!

!24
Full scan - Oracle 11.2
SYS@v11203 AS SYSDBA> !
select object_id, object_name, owner from dba_objects where object_name = 'T2';!

!
OBJECT_ID OBJECT_NAME

OWNER!

---------- ------------------------------------------------------ -------------!
66614 T2

TS!

!
!
SYS@v11203 AS SYSDBA> select * from x$kcboqh where obj# = 66614;!

!
ADDR

INDX

INST_ID

TS#

OBJ#

NUM_BUF HEADER!

---------------- ------- -------- ------ ---------- ---------- ----------------!
FFFFFD7FFC541B18

43

1

5

66614

1 000000039043E470!

!25
Full scan - Oracle 11.2
TS@v11203 > select count(*) from t2;!

!
Statistics!
----------------------------------------------------------!
0

recursive calls!

0

db block gets!

20945

consistent gets!

20941

physical reads!

0

redo size!

526

bytes sent via SQL*Net to client!

523

bytes received via SQL*Net from client!

2

SQL*Net roundtrips to/from client!

0

sorts (memory)!

0

sorts (disk)!

1

rows processed!

!26
Full scan 10.2 vs. 11.2
! Why does version 10 caches all the blocks read,
! And version 11 only 1 of them??
!

! Let’s do an extended SQL trace
! AKA 10046 level 8 trace.

!27
Full scan 10.2 vs. 11.2
! Relevant part of 10046/8 trace file of version
10.2.0.1:
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT
WAIT

#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:
#1:

nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db
nam='db

file
file
file
file
file
file
file
file
file
file
file
file
file
file
file

sequential read' ela= 32941 file#=5 block#=19 blocks=1!
scattered read' ela= 4003 file#=5 block#=20 blocks=5!
scattered read' ela= 6048 file#=5 block#=25 blocks=8!
scattered read' ela= 1155 file#=5 block#=34 blocks=7!
scattered read' ela= 860 file#=5 block#=41 blocks=8!
scattered read' ela= 837 file#=5 block#=50 blocks=7!
scattered read' ela= 1009 file#=5 block#=57 blocks=8!
scattered read' ela= 890 file#=5 block#=66 blocks=7!
scattered read' ela= 837 file#=5 block#=73 blocks=8!
scattered read' ela= 10461 file#=5 block#=82 blocks=7!
scattered read' ela= 623 file#=5 block#=89 blocks=8!
scattered read' ela= 1077 file#=5 block#=98 blocks=7!
scattered read' ela= 49146 file#=5 block#=105 blocks=8!
scattered read' ela= 719 file#=5 block#=114 blocks=7!
scattered read' ela= 1093 file#=5 block#=121 blocks=8!

!28
Full scan 10.2 vs. 11.2
! Relevant part of 10046/8 trace file of version
11.2.0.3:
WAIT #140120507194664: !
nam='db file sequential read' ela= 12607 file#=5 block#=43394 blocks=1 !
obj#=14033 tim=1329685383169372!
nam='direct path read' ela= 50599 file number=5 first dba=43395 block cnt=13!
nam='direct path read' ela= 21483 file number=5 first dba=43425 block cnt=15!
nam='direct path read' ela= 10766 file number=5 first dba=43441 block cnt=15!
nam='direct path read' ela= 12915 file number=5 first dba=43457 block cnt=15!
nam='direct path read' ela= 12583 file number=5 first dba=43473 block cnt=15!
nam='direct path read' ela= 11899 file number=5 first dba=43489 block cnt=15!
nam='direct path read' ela= 10010 file number=5 first dba=43505 block cnt=15!
nam='direct path read' ela= 160237 file number=5 first dba=43522 block cnt=126!
nam='direct path read' ela= 25561 file number=5 first dba=43650 block cnt=126!
nam='direct path read' ela= 121507 file number=5 first dba=43778 block cnt=126!
nam='direct path read' ela= 25253 file number=5 first dba=43906 block cnt=126

!29
First single block read
! The segment header is read separately
! Single block, read into SGA
!

! The header block is listed in dba_segments
!
select owner, segment_name, header_file, header_block !
from dba_segments where segment_name like 'T2';!

!
OWNER! SEGMENT_NAME !

HEADER_FILE HEADER_BLOCK!

----------

-------------------- ----------- ------------!

TS

T2

5

130

!30
Full scan 10.2 vs. 11.2
! A full scan uses direct path reads in the v11 case.
!

! Noticeable by ‘direct path read’ event
!

! Direct path reads go to PGA
! Which means the blocks read are not cached

!31
Full scan 10.2 vs. 11.2
! Do all full scans in version 11 always use direct
path?
! Direct path reads are considered
! #blk of the segment > 5*_small_table_threshold
!

! PS: MOS note 787373.1
! “How does Oracle load data into the buffer
cache for table scans ?”
! Mentions _small_table_threshold being the limit
! Note INCORRECT!
!32
Direct path read
! Small table threshold of my Oracle 11 instance:
!
NAME!

!

!

VALUE!

-------------------------- --------------------------!

_small_table_threshold

245!

!

!33
Direct path read
! This means objects up to 245*5=1225 blocks will
be read into buffercache / SGA.
!

! Let’s create a table with a size just below 1225
blocks:
!
TS@v11203 > create table t1_small as select * from t1 where id <= 47000;!
TS@v11203 > exec dbms_stats.gather_table_stats(null,‘T1_SMALL’);

!34
Direct path read
!
!
SYS@v11203 AS SYSDBA>!
!
select segment_name, blocks, bytes!
from dba_segments where segment_name = 'T1_SMALL';!

!
SEGMENT_NAME

BLOCKS

BYTES!

-------------------------------------- ---------- ----------!
T1_SMALL

1024

8388608!

!
!
SQL@v11203 AS SYSDBA> alter system flush buffer_cache;!

!

!35
Direct path read
TS@v11203 > set autot trace exp stat!
TS@v11203 > select count(*) from t1_small;!

!
!
Execution Plan!
----------------------------------------------------------!
Plan hash value: 1277318887!

!
-----------------------------------------------------------------------!
| Id

| Operation

| Name

| Rows

| Cost (%CPU)| Time

|!

-----------------------------------------------------------------------!
|

0 | SELECT STATEMENT

|

|

1 |

|

1 |

|

|

1 |

|

2 |

SORT AGGREGATE

TABLE ACCESS FULL| T1_SMALL | 47000 |

176

(1)| 00:00:03 |!
|

176

|!

(1)| 00:00:03 |!

-----------------------------------------------------------------------

!36
Direct path read
Statistics!
----------------------------------------------------------!
!

0

recursive calls!

!

0

db block gets!

!

983

consistent gets!

!

979

physical reads!

!

0

!

527

bytes sent via SQL*Net to client!

!

523

bytes received via SQL*Net from client!

!

2

SQL*Net roundtrips to/from client!

!

0

sorts (memory)!

!

0

sorts (disk)!

!

1

rows processed

redo size!

!37
Direct path read
SYS@v11203 AS SYSDBA> !
select object_id, object_name, owner !
from dba_objects where object_name = 'T1_SMALL';!

!

OBJECT_ID OBJECT_NAME

OWNER!

---------- ------------------------------------------------------ -------------!
66729 T1_SMALL

TS!

!
!
SYS@v11203 AS SYSDBA> select * from x$kcboqh where obj# = 66729;!

!
ADDR

INDX

INST_ID TS#

OBJ#

NUM_BUF

HEADER!

---------------- ------ ------- ------ ------- ---------- ----------------!
FFFFFD7FFC6E1EF0 0

1

5

66729

979

0000000390437840!

!38
Direct path read
! Ah, now the full scan is buffered!
! Another scan will reuse the cached blocks now:
!
TS@v11203 > select count(*) from t1_small;!

!
...!

!
Statistics!
----------------------------------------------------------!
!

0

recursive calls!

!

0

db block gets!

!

983

!

0

consistent gets!
physical reads

!39
Direct path read
!

! What type of wait event will be used for a full
scan:
!

! Oracle version 11.2
! If size is smaller than 5*_small_table_threshold?

!40
Direct path read
Well, try it:
!
TS@v11203 > alter session set events ‘10046 trace name context forever, level 8’;!
TS@v11203 > select count(*) from t1_small;!
...!
TS@v11203 > alter session set events ‘10046 trace name context off’;!

!

It shows:
!
WAIT #140358956326184: !
nam='db file sequential read' ela= 38476 file#=5 block#=88706 blocks=1 !
obj#=14047 tim=1330369985672633!
nam='db file scattered read' ela= 116037 file#=5 block#=88707 blocks=5 !
nam='db file scattered read' ela= 56675 file#=5 block#=88712 blocks=8 !
nam='db file scattered read' ela= 11195 file#=5 block#=88721 blocks=7 !
nam='db file scattered read' ela= 132928 file#=5 block#=88728 blocks=8!
nam='db file scattered read' ela= 18692 file#=5 block#=88737 blocks=7 !
nam='db file scattered read' ela= 87817 file#=5 block#=88744 blocks=8 !

!41
Oracle 11 multiblock IO
! In version 11 of the Oracle database
!

! Multiblocks reads use both wait events:
!

! db file scattered read
! direct path read
!

! Which are two different codepath’s
!42
Implementation
! Buffered multiblock reads
!

! Buffered multiblock reads == ‘db file scattered
read’
! Up to version 10 the ONLY* option for non-PQ
multiblock reads
! Starting from version 11, a possible multiblock
read option

!43
Buffered multiblock reads
SYS@v10201 AS SYSDBA> !
select segment_name, extent_id, block_id, blocks, bytes!
from dba_extents !
where segment_name = 'T2' and owner = 'TS' order by extent_id;!

!
SEGMENT_NAME

EXTENT_ID

BLOCKS

BYTES!

------------------------------------- ---------- --------- ----------!
T2

0

8

65536!

T2

15

8

65536!

T2

16

128

1048576!

T2

78

128

1048576!

T2

79

1024

8388608!

91

1024

8388608

...!

...!

...!
T2

!44
Buffered multiblock reads
Version 10 multiblock reads:
WAIT #2: nam='db file sequential read' ela= 12292 file#=5 block#=19 blocks=1!
WAIT #2: nam='db file scattered read' ela= 179162 file#=5 block#=20 blocks=5!
WAIT #2: nam='db file scattered read' ela= 47597 file#=5 block#=25 blocks=8!
WAIT #2: nam='db file scattered read' ela= 5206 file#=5 block#=34 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 94101 file#=5 block#=41 blocks=8 !
WAIT #2: nam='db file scattered read' ela= 512 file#=5 block#=50 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 87657 file#=5 block#=57 blocks=8 !
WAIT #2: nam='db file scattered read' ela= 27488 file#=5 block#=66 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 24316 file#=5 block#=73 blocks=8 !
WAIT #2: nam='db file scattered read' ela= 55251 file#=5 block#=82 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 641 file#=5 block#=89 blocks=8 !
WAIT #2: nam='db file scattered read' ela= 455 file#=5 block#=98 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 43826 file#=5 block#=105 blocks=8 !
WAIT #2: nam='db file scattered read' ela= 32685 file#=5 block#=114 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 60212 file#=5 block#=121 blocks=8 !
WAIT #2: nam='db file scattered read' ela= 37735 file#=5 block#=130 blocks=7 !
WAIT #2: nam='db file scattered read' ela= 59565 file#=5 block#=137 blocks=8 !
(ps: edited for clarity)

!45
nam='db file sequentialread' ela= 94101file#=5 block#=34 blocks=7
nam='db file scattered read' ela= 87657 file#=5 block#=41 blocks=8
nam='db filescattered read' ela=5206 file#=5block#=57 blocks=1
scattered
512
nam='db file scattered read' ela= 179162file#=5block#=50 blocks=7
47597
12292
block#=25 blocks=5
block#=19
block#=20
nam='db file scattered read' ela= 87657 file#=5 block#=147 blocks=126
17

129

25

137

33

145

41
49
57

273

65
73

401

81
89

529

97
105

657

113
121

785

!46
Non buffered multiblock reads
WAIT #140120507194664: !
nam='db file sequential read' ela= 12607 file#=5 block#=43394 blocks=1 obj#=14033
tim=1329685383169372!

!

nam='direct
nam='direct
nam='direct
nam='direct
nam='direct
nam='direct
nam='direct
nam='direct
nam='direct
nam='direct
nam='direct

path
path
path
path
path
path
path
path
path
path
path

read'
read'
read'
read'
read'
read'
read'
read'
read'
read'
read'

ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=
ela=

50599 file number=5 first dba=43395 block cnt=13!
21483 file number=5 first dba=43425 block cnt=15!
10766 file number=5 first dba=43441 block cnt=15!
12915 file number=5 first dba=43457 block cnt=15!
12583 file number=5 first dba=43473 block cnt=15!
11899 file number=5 first dba=43489 block cnt=15!
10010 file number=5 first dba=43505 block cnt=15!
160237 file number=5 first dba=43522 block cnt=126!
25561 file number=5 first dba=43650 block cnt=126!
121507 file number=5 first dba=43778 block cnt=126!
25253 file number=5 first dba=43906 block cnt=126

!47
nam='direct file sequential read' ela= 12607 on this later.
nam='db path read' ela= 50599 file more file#=5 block#=43394block cnt=13
not in 21483
tracefile. number=5 first dba=43395 blocks=1
dba=43425
cnt=15
43392
43400
43408

43504
43512
43520

43418
43424
43432

43648

43440
43448

43776

43456
43464

43904

43472
43480
43488
43496

!48
ASSM
! Automatic segment space management
! Tablespace property
! Default since Oracle 10.2
! Uses L 1/2/3 bitmap blocks for space management
! 8 blocks: 1 BMB as first block of every other
extent
! 128 blocks: 2 BMB as first blocks in all extents
! 1024 blocks: 4 BMB as first blocks in all extents

!49
Multiblock implementation
! Conclusion:
! Buffered reads scan up to:
! Non data (space admin. bitmap) block
! Extent border
! Block already in cache (from TOP)
!

! Direct path/non buffered reads scan up to:
! Non data (space admin. bitmap) block
! Block already in cache (from TOP)

!50
Waits and implementation
! ‘Wait’ or wait event
! Part of the formula:
! Elapsed time = CPU time + Wait time
! Inside the Oracle database it is meant to record
the time spent in a specific part of the oracle
database code not running on CPU.
! Let’s look at the implementation of some of the
wait events for multiblock reads!

!51
Waits and implementation
• Quick primer on read IO System calls on Linux:
!

• Synchronous
• pread64()
!

• Asynchronous
• io_submit()
• io_getevents()

!52
strace
! Linux tool for tracing (viewing) system calls
! Solaris/AIX: truss, HPUX: tusc.
!

! Very, very, useful to understand what is happening
! Much people are using it for years

• STRACE LIES!

(at least on linux)

!53
strace lies
! strace doesn’t show io_getevents() if:
! timeout struct set to {0,0} (‘zero’)
! does not succeed in reaping min_nr IO’s
!

! This strace omission is not documented!

!54
strace lies
! This is best seen with system’s IO capability
severely throttled (1 IOPS)
! See http://fritshoogland.wordpress.com/
2012/12/15/throttling-io-with-linux/
!

! Cgroups
! Control groups
! Linux feature
! Fully available with OL6
!55
strace lies
! Strace output
! Version 11.2.0.3
! IO throttled to 1 IOPS
! Full table scan doing ‘count(*) on t2’
! With 10046 at level 8
! To show where waits are occuring
!

! Start of FTS, up to first reap of IO

!56
strace lies
io_submit(139801394388992, 1, {{0x7f260a8b3450, 0, 0, 0, 257}}) = 1!
io_submit(139801394388992, 1, {{0x7f260a8b31f8, 0, 0, 0, 257}}) = 1!
io_getevents(139801394388992, 1, 128, {{0x7f260a8b3450, 0x7f260a8b3450, 106496, 0}},
{600, 0}) = 1!
write(8, "WAIT #139801362351208: nam='dire"..., 133) = 133!

!
* edited for clarity

!57
strace lies
! Profile the same using ‘gdb’
! Set breakpoints at functions:
! io_submit, io_getevents_0_4
! kslwtbctx, kslwtectx
! Let gdb continue after breakpoint
! The symbol table is preserved in the oracle
binary
! Making it able to set breakpoints at functions

!58
strace lies
#0
#0

io_submit (ctx=0x7f46fe708000, nr=1, iocbs=0x7fff24547ce0) at io_submit.c:23!
io_submit (ctx=0x7f46fe708000, nr=1, iocbs=0x7fff24547ce0) at io_submit.c:23!

!
Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128,
events=0x7fff24550348, timeout=0x7fff24551350) at io_getevents.c:46!
Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128,
events=0x7fff24553428, timeout=0x7fff24554430) at io_getevents.c:46!
Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128,
events=0x7fff24550148, timeout=0x7fff24551150) at io_getevents.c:46!
Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128,
events=0x7fff24553228, timeout=0x7fff24554230) at io_getevents.c:46!

!
#0 0x0000000008f9a652 in kslwtbctx ()!
Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=1, nr=128,
events=0x7fff24550138, timeout=0x7fff24551140) at io_getevents.c:46!
#0 0x0000000008fa1334 in kslwtectx ()!

!
* edited for clarity

!59
db file scattered read
Basic principle
ela time of ‘db file scattered read’

file # and # blocks are
determined

read ready, blocks
available

time

read call of # bytes

!60
db file scattered read
Implementation
synchronous IO
10.2.0.1/11.2.0.1/11.2.0.3

file # and # blocks are
determined

ela time of ‘db file scattered read’

read ready, blocks
available

time
darker quare means ‘optional’

pread64(fd, buf, #bytes, offset)

!61
db file scattered read
Implementation
asynchronous IO
10.2.0.1
file # and # blocks are
determined

ela time of ‘db file scattered read’

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)

!62
db file scattered read
Implementation
asynchronous IO
11.2.0.1
file # and # blocks are
determined

ela time of ‘db file scattered read’

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)

!63
db file scattered read
Implementation
asynchronous IO
11.2.0.3
file # and # blocks are
determined

ela time of ‘db file scattered read’

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)

!64
direct path read - 11g
! Time spent on waiting for reading blocks for
putting them into the PGA
! Reports wait time of the request that gets reaped
with a timed io_getevents() call.
! Multiple IO requests can be submitted with AIO
! At start, Oracle tries to keep 2 IO’s in flight
! Wait time is only reported if ‘waiting’ occurs
! Waiting means: not ALL IO’s can be reaped
immediately after submitting
!65
direct path read 11g
Implementation
asynchronous IO
11.2.0.1
file # and # blocks are
determined
for a number of IO’s

ela time of ‘direct path read’*

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)

!66
direct path read 11g
Implementation
asynchronous IO
11.2.0.3
file # and # blocks are
determined
for a number of IO’s

ela time of ‘direct path read’*

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)

!67
direct path read
Implementation
asynchronous IO
10.2.0.1
file # and # blocks are
determined
for a number of IO’s

ela time of ‘direct path read’

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)

!68
direct path read
Implementation
synchronous IO **
10.2.0.1/11.2.0.1/11.2.0.3
file # and # blocks are
determined
for a number of IO’s

ela time of ‘direct path read’

read ready, blocks
available

time
pread64(fd, buf, #bytes, offset)

!69
kfk: async disk IO
! Only seen with ‘direct path read’ waits and ASM
! Always seen in version 11.2.0.1
!

! Not normally seen in version 11.2.0.2+
!

! KFK = Kernel File K.. (?)
! Most ASM code is in this layer

!70
kfk: async disk io
Implementation
asynchronous IO
11.2.0.1

ela time of ‘direct path read’*

file # and # blocks are
determined
for a number of IO’s

read ready, blocks
available

time
io_submit(aio_ctx, #cb, {iocb})
io_getevents(aio_ctx, min_nr, nr, io_event,
timeout)
kfk: async disk io

!71
IO Slots
!
!
!
!

Discussion with Kerry Osborne about IO’s on Exadata

!72
!73
!74
IO Slots
! Jonathan Lewis pointed me to ‘total number of
slots’
! v$sysstat
! v$sesstat
! Global or per session number of slots
!

! ‘Slots are a unit of I/O and this factor controls the
number of outstanding I/Os’
! Comment with event 10353
!75
IO Slots
! ‘total number of slots’
!

! Is NOT cumulative!
!

! So you won’t capture this statistic when taking
delta’s from v$sysstat/v$sesstat!

!76
IO Slots
! Let’s look at the throughput statistics again
!

! But together with number of slots

!77
!78
!79
IO Slots
! IO Slots are a mechanism to take advantage of
storage bandwidth using AIO
! With version 11 direct path reads can be used by
both PQ slaves as well as non PQ foregrounds
! IO Slots are not used with buffered reads
! Each outstanding asynchronous IO request is
tracked using what is called a ‘slot’
! Default and minimal number of slots: 2

!80
‘autotune’
! The direct path code changed with version 11
! Second observation:
! The database foreground measures direct path
IO effectiveness
! It measures time, wait time and throughput
! The oracle process has the ability to add more
asynchronous IO slots
! Only does so starting from 11.2.0.2
! Although the mechanism is there in 11.2.0.1
!81
‘autotune’
! Introducing event 10365
! “turn on debug information for adaptive direct
reads”
!

! Set to 1 to get debug information
! alter session set events ‘10365 trace name
context forever, level 1’

!82
‘autotune’
kcbldrsini: Timestamp 61180 ms!
kcbldrsini: Current idx 16!
kcbldrsini: Initializing kcbldrps!
kcbldrsini: Slave idx 17!
kcbldrsini: Number slots 2!
kcbldrsini: Number of slots per session 2!

!
*** 2011-11-28 22:58:48.808!
kcblsinc:Timing time 1693472, wait time 1291416, ratio 76 st 248752270 cur 250445744!
kcblsinc: Timing curidx 17 session idx 17!
kcblsinc: Timestamp 64180 ms!
kcblsinc: Current idx 17!
kcblsinc: Slave idx 17!
kcblsinc: Number slots 2!
kcblsinc: Number of slots per session 2!
kcblsinc: Previous throughput 8378 state 2!
kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0

!83
‘autotune’
*** 2011-11-28 22:58:54.988!
kcblsinc:Timing time 2962717, wait time 2923226, ratio 98 st 253662983 cur 256625702!
kcblsinc: Timing curidx 19 session idx 19!
kcblsinc: Timestamp 70270 ms!
kcblsinc: Current idx 19!
kcblsinc: Slave idx 19!
kcblsinc: Number slots 2!
kcblsinc: Number of slots per session 2!
kcblsinc: Previous throughput 11210 state 1!
kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0!
kcblsinc: Adding extra slos 1!

!

*** 2011-11-28 22:58:58.999!
kcblsinc:Timing time 4011239, wait time 3528563, ratio 87 st 256625785 cur 260637026!
kcblsinc: Timing curidx 20 session idx 20!
kcblsinc: Timestamp 74170 ms!
kcblsinc: Current idx 20!
kcblsinc: Slave idx 20!
kcblsinc: Number slots 3!
kcblsinc: Number of slots per session 3!
kcblsinc: Previous throughput 12299 state 2!
kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0

!84
‘autotune’
! Looking at the 10365 trace, the reason 11.2.0.1
does not ‘autotune’ could be guessed....

!85
‘autotune’
*** 2011-11-28 22:54:18.361!
kcblsinc:Timing time 3092929, wait time 0, ratio 0 st 4271872759 cur 4274965690!
kcblsinc: Timing curidx 65 session idx 65!
kcblsinc: Timestamp 192430 ms!
kcblsinc: Current idx 65!
kcblsinc: Slave idx 65!
kcblsinc: Number slots 2!
kcblsinc: Number of slots per session 2!
kcblsinc: Previous throughput 20655 state 2!
kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0!

!
*** 2011-11-28 22:54:21.306!
kcblsinc:Timing time 2944852, wait time 0, ratio 0 st 4274965762 cur 4277910616!
kcblsinc: Timing curidx 66 session idx 66!
kcblsinc: Timestamp 195430 ms!
kcblsinc: Current idx 66!
kcblsinc: Slave idx 66!
kcblsinc: Number slots 2!
kcblsinc: Number of slots per session 2!
kcblsinc: Previous throughput 20746 state 1!
kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0
!86
IO slots
11.2.0.3
FAST IO

kcblsinc ()

io_submit(aio_ctx,1 , {iocb})

kcblsinc ()

kcbgtcr ()
# slots starts with 2
kcbgtcr ()

kcbgtcr ()

time
io_submit(aio_ctx,1 , {iocb})

io_submit(aio_ctx,1 , {iocb})

io_getevents(aio_ctx, 2, 128, io_event, {0, 0}) OK!

io_submit(aio_ctx,1 , {iocb})

io_getevents(aio_ctx, 2, 128, io_event, {0, 0}) OK!

!87
IO slots
11.2.0.3
FAST IO

kcblsinc ()
+1 ‘slos’

time
io_submit(aio_ctx,1 , {iocb})
io_getevents(aio_ctx, 3, 128, io_event, {0,
0}) OK!

!88
Time and waits
! Waits implementation
! Most are system call instrumentation
! db file sequential read
! ‘direct path read’ is different.
! Only shows up if not all IO can be reaped
immediately
! The wait only occurs if process is truly waiting
! With AIO, a process has the ability to keep on
processing without waiting on IO
! Wait time is not physical IO latency
!89
Super huge IOs
! Christian Antognini showed me an AWR report
! With an experiment of setting MBRC to 4000
! Which lead to IOs / stats of IOs > 1MB
!

! This conflicts with the official description about
MBRC in the Oracle concepts manual

!90
Super huge IOs
! Traditional multiblock reads
! “db file scattered read” event
! Indeed limited to 1 MB / block size

!91
Super huge IOs
! New direct path multiblock read
! “direct path read” event
! Experiments lead to reads up to 1024 blocks
! On the Oracle level
! Tested with Oracle 11.2.0.3 on Linux x64
! Limited by BMB, MBRC

!92
Super huge IOs
nam='direct path read' ela= 7996378 file number=5 first dba=174468 block cnt=1020
obj#=76227 tim=1373104660882677!
nam='direct path read' ela= 7995820 file number=5 first dba=175489 block cnt=1023
obj#=76227 tim=1373104668882345!
nam='direct path read' ela= 7996472 file number=5 first dba=176520 block cnt=632
obj#=76227 tim=1373104676882677!
nam='direct path read' ela= 7998049 file number=5 first dba=177152 block cnt=1024
obj#=76227 tim=1373104684883512!
nam='direct path read' ela= 7995472 file number=5 first dba=178176 block cnt=1024
obj#=76227 tim=1373104692882932!
nam='direct path read' ela= 7993677 file number=5 first dba=179200 block cnt=1024
obj#=76227 tim=1373104700880106!
nam='direct path read' ela= 7996969 file number=5 first dba=180224 block cnt=1024
obj#=76227 tim=1373104708880891!
nam='direct path read' ela= 5998630 file number=5 first dba=181248 block cnt=1024
obj#=76227 tim=1373104714882889!
nam='direct path read' ela= 9996459 file number=5 first dba=182272 block cnt=1024
obj#=76227 tim=1373104724882545!

!93
Super huge IOs
TRACEFILE EXCERPT

BLOCK ID

EXTENT SIZE

EXTENT ID

dba=174468 block cnt=1020

174464

1024

197

dba=175489 block cnt=1023

175488

1024

198

dba=176520 block cnt=632

176512

8192

199

dba=177152 block cnt=1024
dba=178176 block cnt=1024
dba=179200 block cnt=1024
dba=180224 block cnt=1024
dba=181248 block cnt=1024

!94
Super huge IOs
WAIT!
!

Isn’t the maximum physical (OS) limit 1MB on
linux?
(and most other platforms)
!

Did Oracle magically overcome this limitation?

!95
Super huge IOs
•Investigate!
!

•Throttle IOPS to get waits
•Setup session with MBRC=1024 & sql_trace level8
•Attach with gdb, break on io_submit, skip ~ 200

!96
Super huge IOs
(gdb) break io_submit!
Breakpoint 1 at 0x3f38200660: file io_submit.c, line 23.!
(gdb) ignore 1 198!
Will ignore next 198 crossings of breakpoint 1.!
(gdb) c!
Continuing.!

!
Breakpoint 1, io_submit (ctx=0x7f601fd11000, nr=8, iocbs=0x7fffbcbcb4f0) at
io_submit.c:23!
23!
io_syscall3(int, io_submit, io_submit, io_context_t, ctx, long, nr, struct
iocb **, iocbs)!

!

!97
Super huge IOs
!
(gdb) print *iocbs[0]!
$3 = {data = 0x7f601decfd60, key = 0, __pad2 = 0, aio_lio_opcode = 0, aio_reqprio =
0, aio_fildes = 257, u = {c = {buf = 0x7f601c544000, nbytes = 1048576, offset =
3319791616, __pad3 = 0, flags = 0, resfd = 0}, v = {vec = 0x7f601c544000, nr =
1048576, offset = 3319791616}, poll = {events = 475283456, __pad1 = 32608}, saddr =
{addr = 0x7f601c544000, len = 1048576}}}!
(gdb) print *iocbs[1]!
$4 = {data = 0x7f601decf658, key = 0, __pad2 = 0, aio_lio_opcode = 0, aio_reqprio =
0, aio_fildes = 256, u = {c = {buf = 0x7f601c444000, nbytes = 1048576, offset =
3328180224, __pad3 = 0, flags = 0, resfd = 0}, v = {vec = 0x7f601c444000, nr =
1048576, offset = 3328180224}, poll = {events = 474234880, __pad1 = 32608}, saddr =
{addr = 0x7f601c444000, len = 1048576}}}!

•From tracefile:
!

WAIT #140050797898872: nam='direct path read' ela= 7996566 file number=5 first
dba=177152 block cnt=1024 obj#=76227 tim=1381843670205294

!98
Super huge IOs
! Conclusion:
! When direct path multiblock IO is done.
! The database can read up to 1024 blocks.
! If BMB, MBRC allow

!99
Conclusion
! In Oracle version 10.2 and earlier non-PX reads use:
! db file sequential read / db file scattered read
events
! Read blocks go to buffercache.
! Starting from Oracle version 11 reads could do both
! buffered reads
! unbuffered or direct path reads
! Segment size determination
! Segment header or statistics (11.2.0.2+)
!100
Conclusion
! Direct path read is decision in IO codepath of full
scan.
! NOT an optimiser decision(!)
!

! In Oracle version 11, a read is done buffered,
unless the database decides to do a direct path
read

!101
Conclusion
! Direct path read decision for tables is influenced by
! Type of read (FTS)
! Size of segment (> 5 * _small_table_threshold)
! Number of blocks cached (< ~ 50%)
! Number of blocks dirty (< ~ 25%) -- + deterministics
! Direct path read decision for indexes is influenced by
! Type of read (FFIS)
! Size of segment (> 5 * _very_large_object_threshold)
! Number of blocks cached (< ~ 50%)
! Number of blocks dirty (< ~ 25%) -- + determinstics
!102
Conclusion
! By default, (AIO) direct path read uses two slots.
! ‘autotune’ scales up in steps.
! Direct path code has an ‘autotune’ function, which can
add IO slots.
! I’ve witnessed it scale up to 32 slots.
! In order to be able to use more bandwidth
! Direct path ‘autotune’ works for PX reads too!
! ‘autotune’ does not kick in with Oracle version
11.2.0.1
! Probably because some measurements return 0
!103
!
!

Thank you for attending!
!

Questions?

!104
Thanks, Links, etc.
! Tanel Poder, Kerry Osborne, Jason Arneil, Klaas-Jan Jongsma, Doug Burns,
Cary Millsap, Freek d’Hooge, Jonathan Lewis, Christian Antognini.
! http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-readswhat-is.html
! http://dioncho.wordpress.com/2009/07/21/disabling-direct-path-read-forthe-serial-full-table-scan-11g/
! http://www.oracle.com/pls/db112/homepage
! http://hoopercharles.wordpress.com/2010/04/10/auto-tuneddb_file_multiblock_read_count-parameter/
! http://fritshoogland.wordpress.com/2012/04/26/getting-to-know-oraclewait-events-in-linux/

!105

More Related Content

What's hot

Recent my sql_performance Test detail
Recent my sql_performance Test detailRecent my sql_performance Test detail
Recent my sql_performance Test detailLouis liu
 
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014odnoklassniki.ru
 
WiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeWiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeSveta Smirnova
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...MongoDB
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Denish Patel
 
Open Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second eraOpen Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second eraSveta Smirnova
 
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTroubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTanel Poder
 
PostgreSQL performance archaeology
PostgreSQL performance archaeologyPostgreSQL performance archaeology
PostgreSQL performance archaeologyTomas Vondra
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Denish Patel
 
Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018 Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018 Antonios Giannopoulos
 
Distributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevdayDistributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevdayodnoklassniki.ru
 
Oracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention TroubleshootingOracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention TroubleshootingTanel Poder
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL AdministrationEDB
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
 
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRestPGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRestPGDay.Amsterdam
 
MyAWR another mysql awr
MyAWR another mysql awrMyAWR another mysql awr
MyAWR another mysql awrLouis liu
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationMydbops
 
Postgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud PlatformPostgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud PlatformSungJae Yun
 
还原Oracle中真实的cache recovery
还原Oracle中真实的cache recovery还原Oracle中真实的cache recovery
还原Oracle中真实的cache recoverymaclean liu
 

What's hot (20)

Recent my sql_performance Test detail
Recent my sql_performance Test detailRecent my sql_performance Test detail
Recent my sql_performance Test detail
 
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
Add a bit of ACID to Cassandra. Cassandra Summit EU 2014
 
WiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-TreeWiredTiger In-Memory vs WiredTiger B-Tree
WiredTiger In-Memory vs WiredTiger B-Tree
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)Out of the Box Replication in Postgres 9.4(pgconfsf)
Out of the Box Replication in Postgres 9.4(pgconfsf)
 
Open Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second eraOpen Source SQL databases enters millions queries per second era
Open Source SQL databases enters millions queries per second era
 
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTroubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
 
PostgreSQL performance archaeology
PostgreSQL performance archaeologyPostgreSQL performance archaeology
PostgreSQL performance archaeology
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4
 
Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018 Elastic 101 tutorial - Percona Europe 2018
Elastic 101 tutorial - Percona Europe 2018
 
Distributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevdayDistributed systems at ok.ru #rigadevday
Distributed systems at ok.ru #rigadevday
 
Oracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention TroubleshootingOracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention Troubleshooting
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRestPGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
 
MyAWR another mysql awr
MyAWR another mysql awrMyAWR another mysql awr
MyAWR another mysql awr
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL Administration
 
Postgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud PlatformPostgres-BDR with Google Cloud Platform
Postgres-BDR with Google Cloud Platform
 
还原Oracle中真实的cache recovery
还原Oracle中真实的cache recovery还原Oracle中真实的cache recovery
还原Oracle中真实的cache recovery
 

Viewers also liked

전자도서관 이용하기
전자도서관 이용하기전자도서관 이용하기
전자도서관 이용하기kulibrary1
 
A Consolidation Success Story by Karl Arao
A Consolidation Success Story by Karl AraoA Consolidation Success Story by Karl Arao
A Consolidation Success Story by Karl AraoEnkitec
 
Panduan praktikum geometri analitik
Panduan praktikum geometri analitikPanduan praktikum geometri analitik
Panduan praktikum geometri analitiknitahidayati
 
Números al besòs
Números al besòsNúmeros al besòs
Números al besòsaajcgpss
 
Enkitec Exadata Human Factor
Enkitec Exadata Human FactorEnkitec Exadata Human Factor
Enkitec Exadata Human FactorEnkitec
 
Delitos contra las libertades
Delitos contra las libertadesDelitos contra las libertades
Delitos contra las libertadesVanessa Coronel
 
M&A. Troika and Sberbank
M&A. Troika and SberbankM&A. Troika and Sberbank
M&A. Troika and SberbankAnna Akimova
 
QPR product ad plan
QPR product ad planQPR product ad plan
QPR product ad planAnna Akimova
 
Alt tab - better apex tabs
Alt tab - better apex tabsAlt tab - better apex tabs
Alt tab - better apex tabsEnkitec
 
Fatkulin hotsos 2014
Fatkulin hotsos 2014Fatkulin hotsos 2014
Fatkulin hotsos 2014Enkitec
 
Panduan praktikum geometri analitik
Panduan praktikum geometri analitikPanduan praktikum geometri analitik
Panduan praktikum geometri analitiknitahidayati
 
APEX Security Primer
APEX Security PrimerAPEX Security Primer
APEX Security PrimerEnkitec
 
Welcome aboard-06-05-12
Welcome aboard-06-05-12Welcome aboard-06-05-12
Welcome aboard-06-05-12alwynniii
 
Epic Clarity Running on Exadata
Epic Clarity Running on ExadataEpic Clarity Running on Exadata
Epic Clarity Running on ExadataEnkitec
 
Making Sense of APEX Security by Christoph Ruepprich
Making Sense of APEX Security by Christoph RuepprichMaking Sense of APEX Security by Christoph Ruepprich
Making Sense of APEX Security by Christoph RuepprichEnkitec
 

Viewers also liked (19)

전자도서관 이용하기
전자도서관 이용하기전자도서관 이용하기
전자도서관 이용하기
 
A Consolidation Success Story by Karl Arao
A Consolidation Success Story by Karl AraoA Consolidation Success Story by Karl Arao
A Consolidation Success Story by Karl Arao
 
Panduan praktikum geometri analitik
Panduan praktikum geometri analitikPanduan praktikum geometri analitik
Panduan praktikum geometri analitik
 
Números al besòs
Números al besòsNúmeros al besòs
Números al besòs
 
Enkitec Exadata Human Factor
Enkitec Exadata Human FactorEnkitec Exadata Human Factor
Enkitec Exadata Human Factor
 
Delitos contra las libertades
Delitos contra las libertadesDelitos contra las libertades
Delitos contra las libertades
 
M&A. Troika and Sberbank
M&A. Troika and SberbankM&A. Troika and Sberbank
M&A. Troika and Sberbank
 
Brochure
BrochureBrochure
Brochure
 
QPR product ad plan
QPR product ad planQPR product ad plan
QPR product ad plan
 
Angles
AnglesAngles
Angles
 
Alt tab - better apex tabs
Alt tab - better apex tabsAlt tab - better apex tabs
Alt tab - better apex tabs
 
Fatkulin hotsos 2014
Fatkulin hotsos 2014Fatkulin hotsos 2014
Fatkulin hotsos 2014
 
Panduan praktikum geometri analitik
Panduan praktikum geometri analitikPanduan praktikum geometri analitik
Panduan praktikum geometri analitik
 
APEX Security Primer
APEX Security PrimerAPEX Security Primer
APEX Security Primer
 
Program linier
Program linierProgram linier
Program linier
 
Plan de trabajo tutorado
Plan de trabajo tutoradoPlan de trabajo tutorado
Plan de trabajo tutorado
 
Welcome aboard-06-05-12
Welcome aboard-06-05-12Welcome aboard-06-05-12
Welcome aboard-06-05-12
 
Epic Clarity Running on Exadata
Epic Clarity Running on ExadataEpic Clarity Running on Exadata
Epic Clarity Running on Exadata
 
Making Sense of APEX Security by Christoph Ruepprich
Making Sense of APEX Security by Christoph RuepprichMaking Sense of APEX Security by Christoph Ruepprich
Making Sense of APEX Security by Christoph Ruepprich
 

Similar to About Multiblock Reads v4

London Spark Meetup Project Tungsten Oct 12 2015
London Spark Meetup Project Tungsten Oct 12 2015London Spark Meetup Project Tungsten Oct 12 2015
London Spark Meetup Project Tungsten Oct 12 2015Chris Fregly
 
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesMySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesSeveralnines
 
Proving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slobProving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slobKapil Goyal
 
When is something overflowing
When is something overflowingWhen is something overflowing
When is something overflowingPeter Hlavaty
 
CONFidence 2015: when something overflowing... - Peter Hlavaty
CONFidence 2015: when something overflowing... - Peter HlavatyCONFidence 2015: when something overflowing... - Peter Hlavaty
CONFidence 2015: when something overflowing... - Peter HlavatyPROIDEA
 
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...Insight Technology, Inc.
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesTanel Poder
 
Creating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just WorksCreating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just WorksTim Callaghan
 
HPCC Systems vs Hadoop
HPCC Systems vs HadoopHPCC Systems vs Hadoop
HPCC Systems vs HadoopFujio Turner
 
Project Basecamp: News From Camp 4
Project Basecamp: News From Camp 4Project Basecamp: News From Camp 4
Project Basecamp: News From Camp 4Digital Bond
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsBenjamin Darfler
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLMorgan Tocker
 
Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...
Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...
Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...Chris Fregly
 
Golang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyGolang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyAerospike
 
Quantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container DayQuantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container DayPhil Estes
 
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...ScyllaDB
 
Logging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & KibanaLogging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & KibanaAmazee Labs
 
Finding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated DisassemblyFinding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated DisassemblyPriyanka Aash
 

Similar to About Multiblock Reads v4 (20)

London Spark Meetup Project Tungsten Oct 12 2015
London Spark Meetup Project Tungsten Oct 12 2015London Spark Meetup Project Tungsten Oct 12 2015
London Spark Meetup Project Tungsten Oct 12 2015
 
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesMySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
 
Proving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slobProving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slob
 
When is something overflowing
When is something overflowingWhen is something overflowing
When is something overflowing
 
CONFidence 2015: when something overflowing... - Peter Hlavaty
CONFidence 2015: when something overflowing... - Peter HlavatyCONFidence 2015: when something overflowing... - Peter Hlavaty
CONFidence 2015: when something overflowing... - Peter Hlavaty
 
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling Examples
 
Creating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just WorksCreating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just Works
 
HPCC Systems vs Hadoop
HPCC Systems vs HadoopHPCC Systems vs Hadoop
HPCC Systems vs Hadoop
 
Project Basecamp: News From Camp 4
Project Basecamp: News From Camp 4Project Basecamp: News From Camp 4
Project Basecamp: News From Camp 4
 
RocksDB meetup
RocksDB meetupRocksDB meetup
RocksDB meetup
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
Presentation day4 oracle12c
Presentation day4 oracle12cPresentation day4 oracle12c
Presentation day4 oracle12c
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 
Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...
Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...
Madrid Spark Big Data Bluemix Meetup - Spark Versus Hadoop @ 100 TB Daytona G...
 
Golang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyGolang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war story
 
Quantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container DayQuantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container Day
 
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,...
 
Logging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & KibanaLogging with Elasticsearch, Logstash & Kibana
Logging with Elasticsearch, Logstash & Kibana
 
Finding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated DisassemblyFinding Xori: Malware Analysis Triage with Automated Disassembly
Finding Xori: Malware Analysis Triage with Automated Disassembly
 

More from Enkitec

Using Angular JS in APEX
Using Angular JS in APEXUsing Angular JS in APEX
Using Angular JS in APEXEnkitec
 
Controlling execution plans 2014
Controlling execution plans   2014Controlling execution plans   2014
Controlling execution plans 2014Enkitec
 
Engineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service DemonstrationEngineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service DemonstrationEnkitec
 
Think Exa!
Think Exa!Think Exa!
Think Exa!Enkitec
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneEnkitec
 
In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1Enkitec
 
Mini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingMini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingEnkitec
 
Profiling Oracle with GDB
Profiling Oracle with GDBProfiling Oracle with GDB
Profiling Oracle with GDBEnkitec
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the TradeEnkitec
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsEnkitec
 
SQL Tuning Tools of the Trade
SQL Tuning Tools of the TradeSQL Tuning Tools of the Trade
SQL Tuning Tools of the TradeEnkitec
 
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan StabilityUsing SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan StabilityEnkitec
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceEnkitec
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture PerformanceEnkitec
 
How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?Enkitec
 
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...Enkitec
 
Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)Enkitec
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writerEnkitec
 
Combining ACS Flexibility with SPM Stability
Combining ACS Flexibility with SPM StabilityCombining ACS Flexibility with SPM Stability
Combining ACS Flexibility with SPM StabilityEnkitec
 
Why You May Not Need Offloading
Why You May Not Need OffloadingWhy You May Not Need Offloading
Why You May Not Need OffloadingEnkitec
 

More from Enkitec (20)

Using Angular JS in APEX
Using Angular JS in APEXUsing Angular JS in APEX
Using Angular JS in APEX
 
Controlling execution plans 2014
Controlling execution plans   2014Controlling execution plans   2014
Controlling execution plans 2014
 
Engineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service DemonstrationEngineered Systems: Environment-as-a-Service Demonstration
Engineered Systems: Environment-as-a-Service Demonstration
 
Think Exa!
Think Exa!Think Exa!
Think Exa!
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
 
In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1
 
Mini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingMini Session - Using GDB for Profiling
Mini Session - Using GDB for Profiling
 
Profiling Oracle with GDB
Profiling Oracle with GDBProfiling Oracle with GDB
Profiling Oracle with GDB
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the Trade
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning Fundamentals
 
SQL Tuning Tools of the Trade
SQL Tuning Tools of the TradeSQL Tuning Tools of the Trade
SQL Tuning Tools of the Trade
 
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan StabilityUsing SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
Using SQL Plan Management (SPM) to Balance Plan Flexibility and Plan Stability
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
 
How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?
 
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
Understanding how is that adaptive cursor sharing (acs) produces multiple opt...
 
Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)Sql tuning made easier with sqltxplain (sqlt)
Sql tuning made easier with sqltxplain (sqlt)
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
 
Combining ACS Flexibility with SPM Stability
Combining ACS Flexibility with SPM StabilityCombining ACS Flexibility with SPM Stability
Combining ACS Flexibility with SPM Stability
 
Why You May Not Need Offloading
Why You May Not Need OffloadingWhy You May Not Need Offloading
Why You May Not Need Offloading
 

Recently uploaded

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 

Recently uploaded (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 

About Multiblock Reads v4

  • 1. About multiblock reads ! ! ! Frits Hoogland - UKOUG Tech - 2013 !1
  • 2. Who am I? ! Frits Hoogland ! Working with Oracle products since 1996 ! ! ! Blog: http://fritshoogland.wordpress.com ! Twitter: @fritshoogland ! Email: frits.hoogland@enkitec.com ! Oracle ACE Director ! OakTable Member !2
  • 3. Agenda ! The Oracle full segment scan implementation ! Options for full segments scans <=V10 vs. V11 ! ! ‘traditional full scans’ & direct path full scans ! ! ‘autotune’ / adaptive direct path reads ! ! Implementation details !3
  • 4. What is this presentation about? ! Multiblock reads can behave different after 10.2 ! This could lead to different behavior of applications using the database. ! ! I assume the audience to have basic understanding about: ! Oracle execution plans ! Oracle SQL/10046 extended traces ! General execution behavior of the RDBMS engine ! C language in general !4
  • 5. Row source operations ! Multiblock reads are an optimised method to read database blocks from disk for a database process. ! ! Mainly used for the rowsource operations: ! ! ‘TABLE ACCESS FULL’ ! ‘FAST FULL INDEX SCAN’ ! ‘BITMAP FULL SCAN’ !5
  • 6. Row source operations ! For much of other segment access rowsource actions, like: ! ‘INDEX UNIQUE SCAN’ ! ‘INDEX RANGE SCAN’ ! ‘INDEX FULL SCAN’ ! ‘TABLE ACCESS BY INDEX ROWID’ ! Use single block reads are mostly used. ! The order in which individual blocks are read is important. !6
  • 7. db file multiblock read count ! MBRC is the maximum amount of blocks read in one IO call. ! Buffered MBRC cannot cross extent borders. ! ! Concepts guide on full table scans: (11.2 version) ! A scan of table data in which the database sequentially reads all rows from a table and filters out those that do not meet the selection criteria. All data blocks under the high water mark are scanned. !7
  • 8. db file multiblock read count ! Multiblock reads are done up to DB_FILE_MULTIBLOCK_READ_COUNT blocks. ! If MBRC is unset, default is ‘maximum IO size that can be efficiently performed’. ! ! Most operating systems allow a single IO operation up to 1 MB. ! I prefer to set it manually. !8
  • 9. My test environment ! Mac OSX Mountain Lion, VMware fusion ! VM: OL6u3 x64 ! Database 10.2.0.1, 11.2.0.1 and 11.2.0.3 ! ASM GI 11.2.0.3 ! ! Sample tables: ! T1 - 21504 blocks - 176M - 1’000’000 rows ! PK index - 2304 blocks / 19M ! T2 - 21504 blocks - 176M - 1’000’000 rows !9
  • 10. First test ! 10.2.0.1 instance: ! sga_target = 600M ! Effective buffercache size = 450M ! Freshly started !10
  • 11. First test TS@v10201 > select /*+ index(t t1_pk_ix) */ count(id), sum(scattered) from t1 t;! ! COUNT(ID) SUM(SCATTERED)! ---------- --------------! 1000000 9999500000! ! ----------------------------------------------------------------------------------! | Id | Operation | Name | Rows | Bytes | Cost (%CPU) |! ----------------------------------------------------------------------------------! | 0 | SELECT STATEMENT | | 1 | 5 | 23234 (1) |! | 1 | SORT AGGREGATE | | 1 | 5 | |! | 2 | TABLE ACCESS BY INDEX ROWID | T1 | 1000K | 4884K | 23234 (1) |! | 3 | INDEX FULL SCAN | T1_PK_IX | 1000K | | 2253 (2) |! ----------------------------------------------------------------------------------! !11
  • 12. First test ! How would you expect Oracle 10.2.0.1 to execute this? ! ! In other words: ! What would be the result of a SQL trace with waits? * ! * If all blocks need to be read from disk (ie. not cached) !12
  • 13. First test ! My guess would be: ! ! Index root bock (1 block) ! None, one or more branch blocks (1 block) ! Index leaf block, fetch values (1 block) ! Table block via index rowid, fetch value(s) (1/1+ block) ! Index values, block value(s), etc. !13
  • 14. First test ! That should look like something like this: ! WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT ... #8: #8: #8: #8: #8: #8: #8: #8: #8: #8: #8: #8: #8: #8: nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db file file file file file file file file file file file file file file sequential sequential sequential sequential sequential sequential sequential sequential sequential sequential sequential sequential sequential sequential read' read' read' read' read' read' read' read' read' read' read' read' read' read' ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= 326 file#=5 block#=43028 blocks=1! 197 file#=5 block#=43719 blocks=1! 227 file#=5 block#=43029 blocks=1! 125 file#=5 block#=20 blocks=1! 109 file#=5 block#=21 blocks=1! 242 file#=5 block#=22 blocks=1! 98 file#=5 block#=23 blocks=1! 76 file#=5 block#=24 blocks=1! 77 file#=5 block#=25 blocks=1! 77 file#=5 block#=26 blocks=1! 105 file#=5 block#=27 blocks=1! 82 file#=5 block#=28 blocks=1! 71 file#=5 block#=29 blocks=1! 93 file#=5 block#=43030 blocks=1! !14
  • 15. First test ! Instead, I get: ! WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT ...! #4: #4: #4: #4: #4: #4: #4: #4: #4: #4: nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db file file file file file file file file file file scattered scattered scattered scattered scattered scattered scattered scattered scattered scattered read' read' read' read' read' read' read' read' read' read' ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= 361 220 205 219 192 141 123 190 231 113 file#=5 file#=5 file#=5 file#=5 file#=5 file#=5 file#=5 file#=5 file#=5 file#=5 block#=43025 blocks=8! block#=43713 blocks=8! block#=17 blocks=8! block#=25 blocks=8! block#=33 blocks=8! block#=41 blocks=8! block#=49 blocks=8! block#=57 blocks=8! block#=43033 blocks=8! block#=65 blocks=8! !15
  • 16. First test ! Sets of 8 blocks are read for rowsources which really need a single block. ! Reason: ! This is an empty cache. ! Oracle reads multiple blocks to get the cache filled. ! ‘cache warming’ ! Statistic (‘physical reads cache prefetch’) ! Needed to tune the BC down to 50M and pre-warm it with another table to get single block reads again (!!) ! Or set “_db_cache_pre_warm” to false !16
  • 17. Full scan - Oracle 10.2 ! Let’s look at an Oracle 10.2.0.1 database ! ! SGA_TARGET 600M ! ! Table TS.T2 size 21504 blks / 176M !17
  • 18. Full scan - Oracle 10.2 TS@v10201 > set autot on exp stat! TS@v10201 > select count(*) from t2;! ! COUNT(*)! ----------! 1000000! ! Execution Plan! ----------------------------------------------------------! Plan hash value: 3724264953! ! -------------------------------------------------------------------! | Id | Operation | Name | Rows | Cost (%CPU)| Time |! -------------------------------------------------------------------! | 0 | SELECT STATEMENT | | 1 | | 1 | | | 1 | | 2 | | 1007K| SORT AGGREGATE TABLE ACCESS FULL| T2 3674 (1)| 00:00:45 |! | 3674 |! (1)| 00:00:45 |! ------------------------------------------------------------------- !18
  • 19. Full scan - Oracle 10.2 Statistics! ----------------------------------------------------------! 212 0 recursive calls! db block gets! 20976 consistent gets! 20942 physical reads! 0 redo size! 515 bytes sent via SQL*Net to client! 469 bytes received via SQL*Net from client! 2 SQL*Net roundtrips to/from client! 4 sorts (memory)! 0 sorts (disk)! 1 rows processed! !19
  • 20. Full scan - Oracle 10.2 SYS@v10201 AS SYSDBA> ! select object_id, object_name, owner from dba_objects where object_name = 'T2';! ! OBJECT_ID OBJECT_NAME OWNER! ---------- --------------------------------------------- ------------------! 10237 T2 TS! ! ! SYS@v10201 AS SYSDBA> select * from x$kcboqh where obj# = 10237;! ! ADDR INDX INST_ID TS# OBJ# NUM_BUF HEADER! ---------------- -------- --------- ------- ------- --------- ----------------! FFFFFD7FFD5C6FA8 335 1 5 10237 20942 000000038FBCF840! !20
  • 21. Full scan - Oracle 10.2 TS@v10201 > select count(*) from t2;! ! Statistics! ----------------------------------------------------------! 0 recursive calls! 0 db block gets! 20953 consistent gets! 0 physical reads! 0 redo size! 515 bytes sent via SQL*Net to client! 469 bytes received via SQL*Net from client! 2 SQL*Net roundtrips to/from client! 0 sorts (memory)! 0 sorts (disk)! 1 rows processed! !21
  • 22. Full scan - Oracle 11.2 ! Now look at an Oracle 11.2.0.3 database ! ! SGA_TARGET 600M ! ! Table TS.T2 size 21504 blks / 176M !22
  • 23. Full scan - Oracle 11.2 TS@v11203 > select count(*) from t2;! ! COUNT(*)! ----------! 1000000! ! ! Execution Plan! ----------------------------------------------------------! Plan hash value: 3724264953! ! -------------------------------------------------------------------! | Id | Operation | Name | Rows | Cost (%CPU)| Time |! -------------------------------------------------------------------! | 0 | SELECT STATEMENT | | 1 | | 1 | | | 1 | | 2 | | 1000K| SORT AGGREGATE TABLE ACCESS FULL| T2 3672 (1)| 00:00:45 |! | 3672 |! (1)| 00:00:45 |! ------------------------------------------------------------------- !23
  • 24. Full scan - Oracle 11.2 Statistics! ----------------------------------------------------------! 217 0 recursive calls! db block gets! 20970 consistent gets! 20942 physical reads! 0 redo size! 526 bytes sent via SQL*Net to client! 523 bytes received via SQL*Net from client! 2 SQL*Net roundtrips to/from client! 4 sorts (memory)! 0 sorts (disk)! 1 rows processed! !24
  • 25. Full scan - Oracle 11.2 SYS@v11203 AS SYSDBA> ! select object_id, object_name, owner from dba_objects where object_name = 'T2';! ! OBJECT_ID OBJECT_NAME OWNER! ---------- ------------------------------------------------------ -------------! 66614 T2 TS! ! ! SYS@v11203 AS SYSDBA> select * from x$kcboqh where obj# = 66614;! ! ADDR INDX INST_ID TS# OBJ# NUM_BUF HEADER! ---------------- ------- -------- ------ ---------- ---------- ----------------! FFFFFD7FFC541B18 43 1 5 66614 1 000000039043E470! !25
  • 26. Full scan - Oracle 11.2 TS@v11203 > select count(*) from t2;! ! Statistics! ----------------------------------------------------------! 0 recursive calls! 0 db block gets! 20945 consistent gets! 20941 physical reads! 0 redo size! 526 bytes sent via SQL*Net to client! 523 bytes received via SQL*Net from client! 2 SQL*Net roundtrips to/from client! 0 sorts (memory)! 0 sorts (disk)! 1 rows processed! !26
  • 27. Full scan 10.2 vs. 11.2 ! Why does version 10 caches all the blocks read, ! And version 11 only 1 of them?? ! ! Let’s do an extended SQL trace ! AKA 10046 level 8 trace. !27
  • 28. Full scan 10.2 vs. 11.2 ! Relevant part of 10046/8 trace file of version 10.2.0.1: WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT WAIT #1: #1: #1: #1: #1: #1: #1: #1: #1: #1: #1: #1: #1: #1: #1: nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db nam='db file file file file file file file file file file file file file file file sequential read' ela= 32941 file#=5 block#=19 blocks=1! scattered read' ela= 4003 file#=5 block#=20 blocks=5! scattered read' ela= 6048 file#=5 block#=25 blocks=8! scattered read' ela= 1155 file#=5 block#=34 blocks=7! scattered read' ela= 860 file#=5 block#=41 blocks=8! scattered read' ela= 837 file#=5 block#=50 blocks=7! scattered read' ela= 1009 file#=5 block#=57 blocks=8! scattered read' ela= 890 file#=5 block#=66 blocks=7! scattered read' ela= 837 file#=5 block#=73 blocks=8! scattered read' ela= 10461 file#=5 block#=82 blocks=7! scattered read' ela= 623 file#=5 block#=89 blocks=8! scattered read' ela= 1077 file#=5 block#=98 blocks=7! scattered read' ela= 49146 file#=5 block#=105 blocks=8! scattered read' ela= 719 file#=5 block#=114 blocks=7! scattered read' ela= 1093 file#=5 block#=121 blocks=8! !28
  • 29. Full scan 10.2 vs. 11.2 ! Relevant part of 10046/8 trace file of version 11.2.0.3: WAIT #140120507194664: ! nam='db file sequential read' ela= 12607 file#=5 block#=43394 blocks=1 ! obj#=14033 tim=1329685383169372! nam='direct path read' ela= 50599 file number=5 first dba=43395 block cnt=13! nam='direct path read' ela= 21483 file number=5 first dba=43425 block cnt=15! nam='direct path read' ela= 10766 file number=5 first dba=43441 block cnt=15! nam='direct path read' ela= 12915 file number=5 first dba=43457 block cnt=15! nam='direct path read' ela= 12583 file number=5 first dba=43473 block cnt=15! nam='direct path read' ela= 11899 file number=5 first dba=43489 block cnt=15! nam='direct path read' ela= 10010 file number=5 first dba=43505 block cnt=15! nam='direct path read' ela= 160237 file number=5 first dba=43522 block cnt=126! nam='direct path read' ela= 25561 file number=5 first dba=43650 block cnt=126! nam='direct path read' ela= 121507 file number=5 first dba=43778 block cnt=126! nam='direct path read' ela= 25253 file number=5 first dba=43906 block cnt=126 !29
  • 30. First single block read ! The segment header is read separately ! Single block, read into SGA ! ! The header block is listed in dba_segments ! select owner, segment_name, header_file, header_block ! from dba_segments where segment_name like 'T2';! ! OWNER! SEGMENT_NAME ! HEADER_FILE HEADER_BLOCK! ---------- -------------------- ----------- ------------! TS T2 5 130 !30
  • 31. Full scan 10.2 vs. 11.2 ! A full scan uses direct path reads in the v11 case. ! ! Noticeable by ‘direct path read’ event ! ! Direct path reads go to PGA ! Which means the blocks read are not cached !31
  • 32. Full scan 10.2 vs. 11.2 ! Do all full scans in version 11 always use direct path? ! Direct path reads are considered ! #blk of the segment > 5*_small_table_threshold ! ! PS: MOS note 787373.1 ! “How does Oracle load data into the buffer cache for table scans ?” ! Mentions _small_table_threshold being the limit ! Note INCORRECT! !32
  • 33. Direct path read ! Small table threshold of my Oracle 11 instance: ! NAME! ! ! VALUE! -------------------------- --------------------------! _small_table_threshold 245! ! !33
  • 34. Direct path read ! This means objects up to 245*5=1225 blocks will be read into buffercache / SGA. ! ! Let’s create a table with a size just below 1225 blocks: ! TS@v11203 > create table t1_small as select * from t1 where id <= 47000;! TS@v11203 > exec dbms_stats.gather_table_stats(null,‘T1_SMALL’); !34
  • 35. Direct path read ! ! SYS@v11203 AS SYSDBA>! ! select segment_name, blocks, bytes! from dba_segments where segment_name = 'T1_SMALL';! ! SEGMENT_NAME BLOCKS BYTES! -------------------------------------- ---------- ----------! T1_SMALL 1024 8388608! ! ! SQL@v11203 AS SYSDBA> alter system flush buffer_cache;! ! !35
  • 36. Direct path read TS@v11203 > set autot trace exp stat! TS@v11203 > select count(*) from t1_small;! ! ! Execution Plan! ----------------------------------------------------------! Plan hash value: 1277318887! ! -----------------------------------------------------------------------! | Id | Operation | Name | Rows | Cost (%CPU)| Time |! -----------------------------------------------------------------------! | 0 | SELECT STATEMENT | | 1 | | 1 | | | 1 | | 2 | SORT AGGREGATE TABLE ACCESS FULL| T1_SMALL | 47000 | 176 (1)| 00:00:03 |! | 176 |! (1)| 00:00:03 |! ----------------------------------------------------------------------- !36
  • 37. Direct path read Statistics! ----------------------------------------------------------! ! 0 recursive calls! ! 0 db block gets! ! 983 consistent gets! ! 979 physical reads! ! 0 ! 527 bytes sent via SQL*Net to client! ! 523 bytes received via SQL*Net from client! ! 2 SQL*Net roundtrips to/from client! ! 0 sorts (memory)! ! 0 sorts (disk)! ! 1 rows processed redo size! !37
  • 38. Direct path read SYS@v11203 AS SYSDBA> ! select object_id, object_name, owner ! from dba_objects where object_name = 'T1_SMALL';! ! OBJECT_ID OBJECT_NAME OWNER! ---------- ------------------------------------------------------ -------------! 66729 T1_SMALL TS! ! ! SYS@v11203 AS SYSDBA> select * from x$kcboqh where obj# = 66729;! ! ADDR INDX INST_ID TS# OBJ# NUM_BUF HEADER! ---------------- ------ ------- ------ ------- ---------- ----------------! FFFFFD7FFC6E1EF0 0 1 5 66729 979 0000000390437840! !38
  • 39. Direct path read ! Ah, now the full scan is buffered! ! Another scan will reuse the cached blocks now: ! TS@v11203 > select count(*) from t1_small;! ! ...! ! Statistics! ----------------------------------------------------------! ! 0 recursive calls! ! 0 db block gets! ! 983 ! 0 consistent gets! physical reads !39
  • 40. Direct path read ! ! What type of wait event will be used for a full scan: ! ! Oracle version 11.2 ! If size is smaller than 5*_small_table_threshold? !40
  • 41. Direct path read Well, try it: ! TS@v11203 > alter session set events ‘10046 trace name context forever, level 8’;! TS@v11203 > select count(*) from t1_small;! ...! TS@v11203 > alter session set events ‘10046 trace name context off’;! ! It shows: ! WAIT #140358956326184: ! nam='db file sequential read' ela= 38476 file#=5 block#=88706 blocks=1 ! obj#=14047 tim=1330369985672633! nam='db file scattered read' ela= 116037 file#=5 block#=88707 blocks=5 ! nam='db file scattered read' ela= 56675 file#=5 block#=88712 blocks=8 ! nam='db file scattered read' ela= 11195 file#=5 block#=88721 blocks=7 ! nam='db file scattered read' ela= 132928 file#=5 block#=88728 blocks=8! nam='db file scattered read' ela= 18692 file#=5 block#=88737 blocks=7 ! nam='db file scattered read' ela= 87817 file#=5 block#=88744 blocks=8 ! !41
  • 42. Oracle 11 multiblock IO ! In version 11 of the Oracle database ! ! Multiblocks reads use both wait events: ! ! db file scattered read ! direct path read ! ! Which are two different codepath’s !42
  • 43. Implementation ! Buffered multiblock reads ! ! Buffered multiblock reads == ‘db file scattered read’ ! Up to version 10 the ONLY* option for non-PQ multiblock reads ! Starting from version 11, a possible multiblock read option !43
  • 44. Buffered multiblock reads SYS@v10201 AS SYSDBA> ! select segment_name, extent_id, block_id, blocks, bytes! from dba_extents ! where segment_name = 'T2' and owner = 'TS' order by extent_id;! ! SEGMENT_NAME EXTENT_ID BLOCKS BYTES! ------------------------------------- ---------- --------- ----------! T2 0 8 65536! T2 15 8 65536! T2 16 128 1048576! T2 78 128 1048576! T2 79 1024 8388608! 91 1024 8388608 ...! ...! ...! T2 !44
  • 45. Buffered multiblock reads Version 10 multiblock reads: WAIT #2: nam='db file sequential read' ela= 12292 file#=5 block#=19 blocks=1! WAIT #2: nam='db file scattered read' ela= 179162 file#=5 block#=20 blocks=5! WAIT #2: nam='db file scattered read' ela= 47597 file#=5 block#=25 blocks=8! WAIT #2: nam='db file scattered read' ela= 5206 file#=5 block#=34 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 94101 file#=5 block#=41 blocks=8 ! WAIT #2: nam='db file scattered read' ela= 512 file#=5 block#=50 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 87657 file#=5 block#=57 blocks=8 ! WAIT #2: nam='db file scattered read' ela= 27488 file#=5 block#=66 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 24316 file#=5 block#=73 blocks=8 ! WAIT #2: nam='db file scattered read' ela= 55251 file#=5 block#=82 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 641 file#=5 block#=89 blocks=8 ! WAIT #2: nam='db file scattered read' ela= 455 file#=5 block#=98 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 43826 file#=5 block#=105 blocks=8 ! WAIT #2: nam='db file scattered read' ela= 32685 file#=5 block#=114 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 60212 file#=5 block#=121 blocks=8 ! WAIT #2: nam='db file scattered read' ela= 37735 file#=5 block#=130 blocks=7 ! WAIT #2: nam='db file scattered read' ela= 59565 file#=5 block#=137 blocks=8 ! (ps: edited for clarity) !45
  • 46. nam='db file sequentialread' ela= 94101file#=5 block#=34 blocks=7 nam='db file scattered read' ela= 87657 file#=5 block#=41 blocks=8 nam='db filescattered read' ela=5206 file#=5block#=57 blocks=1 scattered 512 nam='db file scattered read' ela= 179162file#=5block#=50 blocks=7 47597 12292 block#=25 blocks=5 block#=19 block#=20 nam='db file scattered read' ela= 87657 file#=5 block#=147 blocks=126 17 129 25 137 33 145 41 49 57 273 65 73 401 81 89 529 97 105 657 113 121 785 !46
  • 47. Non buffered multiblock reads WAIT #140120507194664: ! nam='db file sequential read' ela= 12607 file#=5 block#=43394 blocks=1 obj#=14033 tim=1329685383169372! ! nam='direct nam='direct nam='direct nam='direct nam='direct nam='direct nam='direct nam='direct nam='direct nam='direct nam='direct path path path path path path path path path path path read' read' read' read' read' read' read' read' read' read' read' ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= ela= 50599 file number=5 first dba=43395 block cnt=13! 21483 file number=5 first dba=43425 block cnt=15! 10766 file number=5 first dba=43441 block cnt=15! 12915 file number=5 first dba=43457 block cnt=15! 12583 file number=5 first dba=43473 block cnt=15! 11899 file number=5 first dba=43489 block cnt=15! 10010 file number=5 first dba=43505 block cnt=15! 160237 file number=5 first dba=43522 block cnt=126! 25561 file number=5 first dba=43650 block cnt=126! 121507 file number=5 first dba=43778 block cnt=126! 25253 file number=5 first dba=43906 block cnt=126 !47
  • 48. nam='direct file sequential read' ela= 12607 on this later. nam='db path read' ela= 50599 file more file#=5 block#=43394block cnt=13 not in 21483 tracefile. number=5 first dba=43395 blocks=1 dba=43425 cnt=15 43392 43400 43408 43504 43512 43520 43418 43424 43432 43648 43440 43448 43776 43456 43464 43904 43472 43480 43488 43496 !48
  • 49. ASSM ! Automatic segment space management ! Tablespace property ! Default since Oracle 10.2 ! Uses L 1/2/3 bitmap blocks for space management ! 8 blocks: 1 BMB as first block of every other extent ! 128 blocks: 2 BMB as first blocks in all extents ! 1024 blocks: 4 BMB as first blocks in all extents !49
  • 50. Multiblock implementation ! Conclusion: ! Buffered reads scan up to: ! Non data (space admin. bitmap) block ! Extent border ! Block already in cache (from TOP) ! ! Direct path/non buffered reads scan up to: ! Non data (space admin. bitmap) block ! Block already in cache (from TOP) !50
  • 51. Waits and implementation ! ‘Wait’ or wait event ! Part of the formula: ! Elapsed time = CPU time + Wait time ! Inside the Oracle database it is meant to record the time spent in a specific part of the oracle database code not running on CPU. ! Let’s look at the implementation of some of the wait events for multiblock reads! !51
  • 52. Waits and implementation • Quick primer on read IO System calls on Linux: ! • Synchronous • pread64() ! • Asynchronous • io_submit() • io_getevents() !52
  • 53. strace ! Linux tool for tracing (viewing) system calls ! Solaris/AIX: truss, HPUX: tusc. ! ! Very, very, useful to understand what is happening ! Much people are using it for years • STRACE LIES! (at least on linux) !53
  • 54. strace lies ! strace doesn’t show io_getevents() if: ! timeout struct set to {0,0} (‘zero’) ! does not succeed in reaping min_nr IO’s ! ! This strace omission is not documented! !54
  • 55. strace lies ! This is best seen with system’s IO capability severely throttled (1 IOPS) ! See http://fritshoogland.wordpress.com/ 2012/12/15/throttling-io-with-linux/ ! ! Cgroups ! Control groups ! Linux feature ! Fully available with OL6 !55
  • 56. strace lies ! Strace output ! Version 11.2.0.3 ! IO throttled to 1 IOPS ! Full table scan doing ‘count(*) on t2’ ! With 10046 at level 8 ! To show where waits are occuring ! ! Start of FTS, up to first reap of IO !56
  • 57. strace lies io_submit(139801394388992, 1, {{0x7f260a8b3450, 0, 0, 0, 257}}) = 1! io_submit(139801394388992, 1, {{0x7f260a8b31f8, 0, 0, 0, 257}}) = 1! io_getevents(139801394388992, 1, 128, {{0x7f260a8b3450, 0x7f260a8b3450, 106496, 0}}, {600, 0}) = 1! write(8, "WAIT #139801362351208: nam='dire"..., 133) = 133! ! * edited for clarity !57
  • 58. strace lies ! Profile the same using ‘gdb’ ! Set breakpoints at functions: ! io_submit, io_getevents_0_4 ! kslwtbctx, kslwtectx ! Let gdb continue after breakpoint ! The symbol table is preserved in the oracle binary ! Making it able to set breakpoints at functions !58
  • 59. strace lies #0 #0 io_submit (ctx=0x7f46fe708000, nr=1, iocbs=0x7fff24547ce0) at io_submit.c:23! io_submit (ctx=0x7f46fe708000, nr=1, iocbs=0x7fff24547ce0) at io_submit.c:23! ! Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128, events=0x7fff24550348, timeout=0x7fff24551350) at io_getevents.c:46! Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128, events=0x7fff24553428, timeout=0x7fff24554430) at io_getevents.c:46! Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128, events=0x7fff24550148, timeout=0x7fff24551150) at io_getevents.c:46! Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=2, nr=128, events=0x7fff24553228, timeout=0x7fff24554230) at io_getevents.c:46! ! #0 0x0000000008f9a652 in kslwtbctx ()! Breakpoint 3, io_getevents_0_4 (ctx=0x7f46fe708000, min_nr=1, nr=128, events=0x7fff24550138, timeout=0x7fff24551140) at io_getevents.c:46! #0 0x0000000008fa1334 in kslwtectx ()! ! * edited for clarity !59
  • 60. db file scattered read Basic principle ela time of ‘db file scattered read’ file # and # blocks are determined read ready, blocks available time read call of # bytes !60
  • 61. db file scattered read Implementation synchronous IO 10.2.0.1/11.2.0.1/11.2.0.3 file # and # blocks are determined ela time of ‘db file scattered read’ read ready, blocks available time darker quare means ‘optional’ pread64(fd, buf, #bytes, offset) !61
  • 62. db file scattered read Implementation asynchronous IO 10.2.0.1 file # and # blocks are determined ela time of ‘db file scattered read’ read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) !62
  • 63. db file scattered read Implementation asynchronous IO 11.2.0.1 file # and # blocks are determined ela time of ‘db file scattered read’ read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) !63
  • 64. db file scattered read Implementation asynchronous IO 11.2.0.3 file # and # blocks are determined ela time of ‘db file scattered read’ read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) !64
  • 65. direct path read - 11g ! Time spent on waiting for reading blocks for putting them into the PGA ! Reports wait time of the request that gets reaped with a timed io_getevents() call. ! Multiple IO requests can be submitted with AIO ! At start, Oracle tries to keep 2 IO’s in flight ! Wait time is only reported if ‘waiting’ occurs ! Waiting means: not ALL IO’s can be reaped immediately after submitting !65
  • 66. direct path read 11g Implementation asynchronous IO 11.2.0.1 file # and # blocks are determined for a number of IO’s ela time of ‘direct path read’* read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) !66
  • 67. direct path read 11g Implementation asynchronous IO 11.2.0.3 file # and # blocks are determined for a number of IO’s ela time of ‘direct path read’* read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) !67
  • 68. direct path read Implementation asynchronous IO 10.2.0.1 file # and # blocks are determined for a number of IO’s ela time of ‘direct path read’ read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) !68
  • 69. direct path read Implementation synchronous IO ** 10.2.0.1/11.2.0.1/11.2.0.3 file # and # blocks are determined for a number of IO’s ela time of ‘direct path read’ read ready, blocks available time pread64(fd, buf, #bytes, offset) !69
  • 70. kfk: async disk IO ! Only seen with ‘direct path read’ waits and ASM ! Always seen in version 11.2.0.1 ! ! Not normally seen in version 11.2.0.2+ ! ! KFK = Kernel File K.. (?) ! Most ASM code is in this layer !70
  • 71. kfk: async disk io Implementation asynchronous IO 11.2.0.1 ela time of ‘direct path read’* file # and # blocks are determined for a number of IO’s read ready, blocks available time io_submit(aio_ctx, #cb, {iocb}) io_getevents(aio_ctx, min_nr, nr, io_event, timeout) kfk: async disk io !71
  • 72. IO Slots ! ! ! ! Discussion with Kerry Osborne about IO’s on Exadata !72
  • 73. !73
  • 74. !74
  • 75. IO Slots ! Jonathan Lewis pointed me to ‘total number of slots’ ! v$sysstat ! v$sesstat ! Global or per session number of slots ! ! ‘Slots are a unit of I/O and this factor controls the number of outstanding I/Os’ ! Comment with event 10353 !75
  • 76. IO Slots ! ‘total number of slots’ ! ! Is NOT cumulative! ! ! So you won’t capture this statistic when taking delta’s from v$sysstat/v$sesstat! !76
  • 77. IO Slots ! Let’s look at the throughput statistics again ! ! But together with number of slots !77
  • 78. !78
  • 79. !79
  • 80. IO Slots ! IO Slots are a mechanism to take advantage of storage bandwidth using AIO ! With version 11 direct path reads can be used by both PQ slaves as well as non PQ foregrounds ! IO Slots are not used with buffered reads ! Each outstanding asynchronous IO request is tracked using what is called a ‘slot’ ! Default and minimal number of slots: 2 !80
  • 81. ‘autotune’ ! The direct path code changed with version 11 ! Second observation: ! The database foreground measures direct path IO effectiveness ! It measures time, wait time and throughput ! The oracle process has the ability to add more asynchronous IO slots ! Only does so starting from 11.2.0.2 ! Although the mechanism is there in 11.2.0.1 !81
  • 82. ‘autotune’ ! Introducing event 10365 ! “turn on debug information for adaptive direct reads” ! ! Set to 1 to get debug information ! alter session set events ‘10365 trace name context forever, level 1’ !82
  • 83. ‘autotune’ kcbldrsini: Timestamp 61180 ms! kcbldrsini: Current idx 16! kcbldrsini: Initializing kcbldrps! kcbldrsini: Slave idx 17! kcbldrsini: Number slots 2! kcbldrsini: Number of slots per session 2! ! *** 2011-11-28 22:58:48.808! kcblsinc:Timing time 1693472, wait time 1291416, ratio 76 st 248752270 cur 250445744! kcblsinc: Timing curidx 17 session idx 17! kcblsinc: Timestamp 64180 ms! kcblsinc: Current idx 17! kcblsinc: Slave idx 17! kcblsinc: Number slots 2! kcblsinc: Number of slots per session 2! kcblsinc: Previous throughput 8378 state 2! kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0 !83
  • 84. ‘autotune’ *** 2011-11-28 22:58:54.988! kcblsinc:Timing time 2962717, wait time 2923226, ratio 98 st 253662983 cur 256625702! kcblsinc: Timing curidx 19 session idx 19! kcblsinc: Timestamp 70270 ms! kcblsinc: Current idx 19! kcblsinc: Slave idx 19! kcblsinc: Number slots 2! kcblsinc: Number of slots per session 2! kcblsinc: Previous throughput 11210 state 1! kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0! kcblsinc: Adding extra slos 1! ! *** 2011-11-28 22:58:58.999! kcblsinc:Timing time 4011239, wait time 3528563, ratio 87 st 256625785 cur 260637026! kcblsinc: Timing curidx 20 session idx 20! kcblsinc: Timestamp 74170 ms! kcblsinc: Current idx 20! kcblsinc: Slave idx 20! kcblsinc: Number slots 3! kcblsinc: Number of slots per session 3! kcblsinc: Previous throughput 12299 state 2! kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0 !84
  • 85. ‘autotune’ ! Looking at the 10365 trace, the reason 11.2.0.1 does not ‘autotune’ could be guessed.... !85
  • 86. ‘autotune’ *** 2011-11-28 22:54:18.361! kcblsinc:Timing time 3092929, wait time 0, ratio 0 st 4271872759 cur 4274965690! kcblsinc: Timing curidx 65 session idx 65! kcblsinc: Timestamp 192430 ms! kcblsinc: Current idx 65! kcblsinc: Slave idx 65! kcblsinc: Number slots 2! kcblsinc: Number of slots per session 2! kcblsinc: Previous throughput 20655 state 2! kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0! ! *** 2011-11-28 22:54:21.306! kcblsinc:Timing time 2944852, wait time 0, ratio 0 st 4274965762 cur 4277910616! kcblsinc: Timing curidx 66 session idx 66! kcblsinc: Timestamp 195430 ms! kcblsinc: Current idx 66! kcblsinc: Slave idx 66! kcblsinc: Number slots 2! kcblsinc: Number of slots per session 2! kcblsinc: Previous throughput 20746 state 1! kcblsinc: adaptive direct read mode 1, adaptive direct write mode 0 !86
  • 87. IO slots 11.2.0.3 FAST IO kcblsinc () io_submit(aio_ctx,1 , {iocb}) kcblsinc () kcbgtcr () # slots starts with 2 kcbgtcr () kcbgtcr () time io_submit(aio_ctx,1 , {iocb}) io_submit(aio_ctx,1 , {iocb}) io_getevents(aio_ctx, 2, 128, io_event, {0, 0}) OK! io_submit(aio_ctx,1 , {iocb}) io_getevents(aio_ctx, 2, 128, io_event, {0, 0}) OK! !87
  • 88. IO slots 11.2.0.3 FAST IO kcblsinc () +1 ‘slos’ time io_submit(aio_ctx,1 , {iocb}) io_getevents(aio_ctx, 3, 128, io_event, {0, 0}) OK! !88
  • 89. Time and waits ! Waits implementation ! Most are system call instrumentation ! db file sequential read ! ‘direct path read’ is different. ! Only shows up if not all IO can be reaped immediately ! The wait only occurs if process is truly waiting ! With AIO, a process has the ability to keep on processing without waiting on IO ! Wait time is not physical IO latency !89
  • 90. Super huge IOs ! Christian Antognini showed me an AWR report ! With an experiment of setting MBRC to 4000 ! Which lead to IOs / stats of IOs > 1MB ! ! This conflicts with the official description about MBRC in the Oracle concepts manual !90
  • 91. Super huge IOs ! Traditional multiblock reads ! “db file scattered read” event ! Indeed limited to 1 MB / block size !91
  • 92. Super huge IOs ! New direct path multiblock read ! “direct path read” event ! Experiments lead to reads up to 1024 blocks ! On the Oracle level ! Tested with Oracle 11.2.0.3 on Linux x64 ! Limited by BMB, MBRC !92
  • 93. Super huge IOs nam='direct path read' ela= 7996378 file number=5 first dba=174468 block cnt=1020 obj#=76227 tim=1373104660882677! nam='direct path read' ela= 7995820 file number=5 first dba=175489 block cnt=1023 obj#=76227 tim=1373104668882345! nam='direct path read' ela= 7996472 file number=5 first dba=176520 block cnt=632 obj#=76227 tim=1373104676882677! nam='direct path read' ela= 7998049 file number=5 first dba=177152 block cnt=1024 obj#=76227 tim=1373104684883512! nam='direct path read' ela= 7995472 file number=5 first dba=178176 block cnt=1024 obj#=76227 tim=1373104692882932! nam='direct path read' ela= 7993677 file number=5 first dba=179200 block cnt=1024 obj#=76227 tim=1373104700880106! nam='direct path read' ela= 7996969 file number=5 first dba=180224 block cnt=1024 obj#=76227 tim=1373104708880891! nam='direct path read' ela= 5998630 file number=5 first dba=181248 block cnt=1024 obj#=76227 tim=1373104714882889! nam='direct path read' ela= 9996459 file number=5 first dba=182272 block cnt=1024 obj#=76227 tim=1373104724882545! !93
  • 94. Super huge IOs TRACEFILE EXCERPT BLOCK ID EXTENT SIZE EXTENT ID dba=174468 block cnt=1020 174464 1024 197 dba=175489 block cnt=1023 175488 1024 198 dba=176520 block cnt=632 176512 8192 199 dba=177152 block cnt=1024 dba=178176 block cnt=1024 dba=179200 block cnt=1024 dba=180224 block cnt=1024 dba=181248 block cnt=1024 !94
  • 95. Super huge IOs WAIT! ! Isn’t the maximum physical (OS) limit 1MB on linux? (and most other platforms) ! Did Oracle magically overcome this limitation? !95
  • 96. Super huge IOs •Investigate! ! •Throttle IOPS to get waits •Setup session with MBRC=1024 & sql_trace level8 •Attach with gdb, break on io_submit, skip ~ 200 !96
  • 97. Super huge IOs (gdb) break io_submit! Breakpoint 1 at 0x3f38200660: file io_submit.c, line 23.! (gdb) ignore 1 198! Will ignore next 198 crossings of breakpoint 1.! (gdb) c! Continuing.! ! Breakpoint 1, io_submit (ctx=0x7f601fd11000, nr=8, iocbs=0x7fffbcbcb4f0) at io_submit.c:23! 23! io_syscall3(int, io_submit, io_submit, io_context_t, ctx, long, nr, struct iocb **, iocbs)! ! !97
  • 98. Super huge IOs ! (gdb) print *iocbs[0]! $3 = {data = 0x7f601decfd60, key = 0, __pad2 = 0, aio_lio_opcode = 0, aio_reqprio = 0, aio_fildes = 257, u = {c = {buf = 0x7f601c544000, nbytes = 1048576, offset = 3319791616, __pad3 = 0, flags = 0, resfd = 0}, v = {vec = 0x7f601c544000, nr = 1048576, offset = 3319791616}, poll = {events = 475283456, __pad1 = 32608}, saddr = {addr = 0x7f601c544000, len = 1048576}}}! (gdb) print *iocbs[1]! $4 = {data = 0x7f601decf658, key = 0, __pad2 = 0, aio_lio_opcode = 0, aio_reqprio = 0, aio_fildes = 256, u = {c = {buf = 0x7f601c444000, nbytes = 1048576, offset = 3328180224, __pad3 = 0, flags = 0, resfd = 0}, v = {vec = 0x7f601c444000, nr = 1048576, offset = 3328180224}, poll = {events = 474234880, __pad1 = 32608}, saddr = {addr = 0x7f601c444000, len = 1048576}}}! •From tracefile: ! WAIT #140050797898872: nam='direct path read' ela= 7996566 file number=5 first dba=177152 block cnt=1024 obj#=76227 tim=1381843670205294 !98
  • 99. Super huge IOs ! Conclusion: ! When direct path multiblock IO is done. ! The database can read up to 1024 blocks. ! If BMB, MBRC allow !99
  • 100. Conclusion ! In Oracle version 10.2 and earlier non-PX reads use: ! db file sequential read / db file scattered read events ! Read blocks go to buffercache. ! Starting from Oracle version 11 reads could do both ! buffered reads ! unbuffered or direct path reads ! Segment size determination ! Segment header or statistics (11.2.0.2+) !100
  • 101. Conclusion ! Direct path read is decision in IO codepath of full scan. ! NOT an optimiser decision(!) ! ! In Oracle version 11, a read is done buffered, unless the database decides to do a direct path read !101
  • 102. Conclusion ! Direct path read decision for tables is influenced by ! Type of read (FTS) ! Size of segment (> 5 * _small_table_threshold) ! Number of blocks cached (< ~ 50%) ! Number of blocks dirty (< ~ 25%) -- + deterministics ! Direct path read decision for indexes is influenced by ! Type of read (FFIS) ! Size of segment (> 5 * _very_large_object_threshold) ! Number of blocks cached (< ~ 50%) ! Number of blocks dirty (< ~ 25%) -- + determinstics !102
  • 103. Conclusion ! By default, (AIO) direct path read uses two slots. ! ‘autotune’ scales up in steps. ! Direct path code has an ‘autotune’ function, which can add IO slots. ! I’ve witnessed it scale up to 32 slots. ! In order to be able to use more bandwidth ! Direct path ‘autotune’ works for PX reads too! ! ‘autotune’ does not kick in with Oracle version 11.2.0.1 ! Probably because some measurements return 0 !103
  • 104. ! ! Thank you for attending! ! Questions? !104
  • 105. Thanks, Links, etc. ! Tanel Poder, Kerry Osborne, Jason Arneil, Klaas-Jan Jongsma, Doug Burns, Cary Millsap, Freek d’Hooge, Jonathan Lewis, Christian Antognini. ! http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-readswhat-is.html ! http://dioncho.wordpress.com/2009/07/21/disabling-direct-path-read-forthe-serial-full-table-scan-11g/ ! http://www.oracle.com/pls/db112/homepage ! http://hoopercharles.wordpress.com/2010/04/10/auto-tuneddb_file_multiblock_read_count-parameter/ ! http://fritshoogland.wordpress.com/2012/04/26/getting-to-know-oraclewait-events-in-linux/ !105