Exadata Consolidation Success Story
Getting the kids to play nice with each other…
Presented by:
Karl Arao
whoami
Karl Arao
• Senior Technical Consultant @ Enkitec
• Performance and Capacity Planning Enthusiast
• 7 years DBA experience
• Oracle ACE, OCP-DBA, RHCE, OakTable
• Blog: karlarao.wordpress.com
• Wiki: karlarao.tiddlyspot.com
• Twitter: @karlarao
www.enkitec.com 2
Agenda
• Architecture
• Tools and Methodology
– Simple consolidation scenario
– Provisioning workflow and the worksheet
• War Stories
General Architecture
Primary Site Standby Site
Production
Test & Dev Disaster Recovery
Future Growth
The Stats
Three Half Rack Exadata clusters with High Capacity drives
Cluster #1
36 Dev/Test Databases
Cluster #2
11 Production Databases
Cluster #3
13 Dev/Test Databases
6 Standby Databases
Still more databases to come…
Why Consolidate?
Primary drivers for consolidation center around cost savings
• Reduces Oracle software licensing
• 3rd party products such as backup agents, ETL tools, etc.
• More efficient use of system resources
• Soft Costs
– Floor space
– Power & Cooling
– Administration, Staffing Costs (training, etc.)
7 Databases
A Simple Consolidation Example
Let’s say we have the following databases to migrate to Exadata:
For example, the first row should read: Database ‘A’ requires 4 CPUs and will run on nodes 1 and 2 (2 CPUs each).
Cluster Level Utilization
Per compute node
Utilization
Cluster Level Utilization = 29.2%
Per compute node Utilization: 25% 42% 33% 17%
Cluster Level Utilization = 29.2%
Per compute node Utilization: 8% 83% 17% 8%
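The two layouts can be reproduced by dividing requirements by capacity. The following is a minimal sketch, not the actual worksheet: the 24 CPUs per compute node and the per-node CPU requirements are assumptions, chosen only because they reproduce the percentages shown on the slides.

```python
# Assumption: 4 compute nodes with 24 CPUs each. The per-node CPU
# requirements below are hypothetical values picked to match the slides.
CPUS_PER_NODE = 24


def utilization(required_cpus, capacity_cpus):
    """Utilization = Requirements / Capacity, as a percentage."""
    return 100.0 * required_cpus / capacity_cpus


def summarize(required_per_node):
    """Return (cluster utilization, list of per-node utilizations)."""
    cluster = utilization(sum(required_per_node),
                          CPUS_PER_NODE * len(required_per_node))
    per_node = [utilization(r, CPUS_PER_NODE) for r in required_per_node]
    return cluster, per_node


balanced = [6, 10, 8, 4]   # CPUs required on nodes 1-4, first layout
skewed = [2, 20, 4, 2]     # the same 28 CPUs in total, second layout

for name, layout in (("balanced", balanced), ("skewed", skewed)):
    cluster, per_node = summarize(layout)
    print(name, f"cluster={cluster:.1f}%",
          " ".join(f"{u:.0f}%" for u in per_node))
```

Both layouts report the same cluster-level figure, which is why the per-node view matters: in the second layout one node runs at 83% while the others sit mostly idle.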
Tools And Methodology
• Gather Utilization Metrics (usage history)
• Create Provisioning Plan
• Implement Plan
• Audit Your Implementation
Provisioning Worksheet
• Capacity Planning
• Communication Tool
• Hand off
Supplement to existing Exadata installation tools:
• Site planning checklist
• Configuration Worksheet
• Exadata Configurator sheet
• CheckIP
• OneCommand
Utilization = Requirements / Capacity
Capacity
2 = quarter rack
4 = half rack
8 = full rack
SPECint_rate2006
http://goo.gl/doBI5
CPU_COUNT, threads, & cores
http://goo.gl/CunHN
96 to 144GB (frequency of the memory DIMMs drops to 800 MHz from 1333 MHz)
Space will also depend on:
• ASM redundancy
• DATA/RECO allocation
http://goo.gl/I3fjn
Query Low (4x)
Query High (6x)
Archive Low (7x)
Archive High (12x)
Smart Scans!
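As a rough illustration of how the compression levels listed above feed into a space estimate, the sketch below divides an uncompressed size by the nominal ratio for each level. The 1200 GB segment is a hypothetical input, and the ratios are the slide's rule-of-thumb values; actual ratios depend on the data.

```python
# Nominal compression ratios as listed on the slide (rules of thumb only).
COMPRESSION_RATIOS = {
    "Query Low": 4,
    "Query High": 6,
    "Archive Low": 7,
    "Archive High": 12,
}


def compressed_size_gb(uncompressed_gb, level):
    """Estimated size after compression at the given level."""
    return uncompressed_gb / COMPRESSION_RATIOS[level]


# Hypothetical 1200 GB segment, used only for illustration.
for level in COMPRESSION_RATIOS:
    print(f"{level}: {compressed_size_gb(1200, level):.0f} GB")
```

ASM redundancy and the DATA/RECO split still have to be applied on top of this figure to get the raw disk space required.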
CPU Core Comparison
Source
Destination: Sun Fire X4170 M2 X5670@2.93GHz

chip efficiency factor = source SPEC rating / Exadata SPEC rating
                       = 16/26
                       = .6154

EXA cores requirement = source host cores * utilization * chip efficiency factor
                      = 32 * .7 * .6154
                      = 13.78
with offload factor   = 13.78 * .5
                      = 6.89

utilization: how much of the source CPU cores are being used
chip efficiency factor: multiplier for equivalent database machine cores
offload factor: amount of CPU resources that will be offloaded to the storage cells
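The core comparison arithmetic can be sketched as a small function. This is only an illustration of the calculation on the slide: the function and parameter names are made up here, and the inputs (32 source cores, 70% utilization, SPEC ratings of 16 and 26, offload factor of .5) are the slide's example values.

```python
def exadata_cores_required(source_cores, utilization,
                           source_spec, exadata_spec,
                           offload_factor=1.0):
    """Estimate the Exadata cores needed to host a source workload.

    chip efficiency factor = source SPEC rating / Exadata SPEC rating
    requirement = source cores * utilization * chip efficiency * offload
    """
    chip_efficiency = source_spec / exadata_spec
    return source_cores * utilization * chip_efficiency * offload_factor


# Example values from the slide.
before_offload = exadata_cores_required(32, 0.7, 16, 26)
after_offload = exadata_cores_required(32, 0.7, 16, 26, offload_factor=0.5)
print(f"before offload: {before_offload:.2f} cores")
print(f"after offload:  {after_offload:.2f} cores")
```

Leaving the offload factor at 1.0 gives the conservative estimate for workloads that are not expected to benefit from storage cell offloading.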
The Perfect Storm
(Peoplesoft HR)
Month-end Processing
+ Weekly Time Entry
+ SQL Plan Change
------------------------------------
Uh-oh!
CPU Allocation
DB Uniq Name  DB Name   node 1         node 2         node 3         node 4
                        4 instances    5 instances    4 instances    3 instances
                        47% cpu used   75% cpu used   47% cpu used   18% cpu used
                        49% mem used   66% mem used   71% mem used   54% mem used
BIPRDDAL      biprd                    P              P
DBFSPRD       DBFSPRD   P              P              P              P
HCMPRDDAL     hcmprd    P              P
MTAPRD11DAL   mtaprd11                                P              P
PAPRDDAL      paprd     P              P
RMPRDDAL      rmprd     P              P
dbm           dbm       F              F              F              F
Fsprddal      fsprd                                   P              P
P = Preferred
F = Failover
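The instance counts in the CPU Allocation table can be cross-checked by counting preferred (P) placements per node. The sketch below is one way to audit such a layout; the dictionary is transcribed from the table, with the assumption that BIPRDDAL's preferred nodes are 2 and 3, which is the only placement consistent with the 4/5/4/3 instance counts.

```python
from collections import Counter

# Preferred (P) nodes per database, transcribed from the CPU Allocation
# table. The dbm row is failover-only (F) on every node, so it has no
# preferred nodes. Assumption: BIPRDDAL is preferred on nodes 2 and 3.
PREFERRED = {
    "BIPRDDAL": [2, 3],
    "DBFSPRD": [1, 2, 3, 4],
    "HCMPRDDAL": [1, 2],
    "MTAPRD11DAL": [3, 4],
    "PAPRDDAL": [1, 2],
    "RMPRDDAL": [1, 2],
    "dbm": [],
    "Fsprddal": [3, 4],
}


def instances_per_node(preferred):
    """Count how many databases list each node as preferred."""
    counts = Counter()
    for nodes in preferred.values():
        counts.update(nodes)
    return dict(sorted(counts.items()))


print(instances_per_node(PREFERRED))
```

Node 2 carries the most instances and also shows the highest CPU usage in the table, which is where the problems described next showed up.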
Load Map
(our first stop…)
User complaint: HR time entry and OBIEE reports painfully slow…
Top Activity - HCMPRD
Instance Activity – HCMPRD2
HCMPRD caged at 12 CPUs
SQL Profile installed to lock in good plan.
Problem: A single SQL statement overwhelming CPU resources.
Node 2
Memory Exhaustion
(OBIEE)
“1 Report = 1 SQL query, right?”
WRONG!
Overlapping workloads of three databases
across 3 nodes.
BIPRD, HCMPRD, and MTAPRD
Node 1
Node 2
Node 3
Node 4
Node Layout Revisited…
Notice what happens to CPU waits
and the system load average when
this report is run.
PGA Memory Spikes
Storage Cell Saturation
(OBIEE)
I/O Intensive Workload
Smart Scans as seen in Grid Control
25 Sessions Doing Smart Scans
…as seen in gv$sql
Smart Scan in action: the cells are scanning 1T but only returning 144G.
That’s on each of the highlighted row sources below.
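To put the Smart Scan numbers in perspective, the share of scanned data that never leaves the storage cells can be worked out as follows. This is a back-of-the-envelope sketch that assumes 1T means 1024G; the function name is illustrative only.

```python
def smart_scan_savings(scanned_gb, returned_gb):
    """Fraction of the scanned data that is filtered out on the cells."""
    return 1.0 - returned_gb / scanned_gb


# Assumption: 1T is taken as 1024G; 144G is returned to the database node.
savings = smart_scan_savings(1024, 144)
print(f"{savings:.1%} of the scanned data is filtered on the storage cells")
```

The database nodes receive far less data this way, but the cells still have to read the full 1T, which is the I/O load that the other databases feel.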
The databases on other nodes see the contention as “System I/O”
Without I/O resource management even critical processes are affected (CKPT, LGWR, …)
Inter-database IORM Plan
(only kicks in when needed)
I/O requests from critical processes like CKPT, LGWR, LMON get priority automatically.
Without IORM, I/O requests from these important processes receive the same priority as any other process.
*Side Benefit (automatic when IORM is enabled)
IORM Plan Definition
(on each storage cell)
Wrap up!
Provisioning Methodology & Tools
– Workflow
– Provisioning Spreadsheet
Success Stories
– CPU resource management
– Tuning and provisioning adjustments
– I/O resource management
Fastest Growing Companies
in Dallas
Contact Info…

DevEX - reference for building teams, processes, and platforms
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

KSCOPE 2013: Exadata Consolidation Success Story

Editor's notes

  1. Welcome. I'll mainly talk about server consolidation of a mixed-workload environment, specifically on a half rack Exadata, and I'll also cover the methodology and tools as well as lessons learned. The company, by the way, is a large real estate investment company that consolidated its PeopleSoft and OBIEE environments.
  2. Just a brief introduction of myself.
  3. Why do we consolidate? The primary driver is cost savings. When you consolidate, you reduce your total footprint on everything!
  4. (The idea behind all things.) Monitoring at both the cluster level and the node level is critical for managing the resources of a consolidated environment. The scenario is 7 databases that will be spread out across the 4 nodes.
  5. Let's say we have the following DBs to migrate to Exadata. I call this table “the node layout”, and it is read as: Database 'A' requires 4 CPUs and will run on nodes 1 and 2 (2 CPUs each). You have 4 nodes with 7 databases spread out. Now you want to be able to see the “cluster level utilization”, which you get by summing up all of their core requirements and dividing by the total number of cores across the cluster.
  6. BUT more important is seeing the per compute node utilization, because you may have a node that's 80% utilized while the rest of the nodes are in the 10% range.
  7. Here's another view of the node layout, where we distribute the CPU cores of the instances based on their node assignments. Each block on the left side is one CPU core. That's 24 cores, which counts the threads as cores. That number is based on the CPU_COUNT parameter and /proc/cpuinfo; you take the number of CPUs from CPU_COUNT when you do instance caging, and it keeps you consistent with the monitoring in OEM and AWR. So the cluster level utilization is 29.2%, while on the per compute node level the utilization is what the slide shows (25%, 42%, 33%, 17%).
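The two views of utilization described in notes 5 to 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 24 cores are taken to be per compute node (which is consistent with the 29.2% figure), and only database A's placement comes from the slides. The placements of B through G are hypothetical, chosen so the totals reproduce the percentages on the slide.

```python
# Hypothetical node layout: only database A (4 CPUs, 2 each on nodes 1 and 2)
# comes from the slides; B..G are made up to reproduce the slide's percentages.
CORES_PER_NODE = 24  # assumption: 24 CPU threads (counted as cores) per compute node
NODES = [1, 2, 3, 4]

# database -> {node: cores required on that node}
node_layout = {
    "A": {1: 2, 2: 2},
    "B": {1: 4},
    "C": {2: 4, 3: 4},
    "D": {2: 4},
    "E": {3: 4},
    "F": {4: 2},
    "G": {4: 2},
}


def per_node_utilization(layout, nodes, cores_per_node):
    """Sum the cores assigned to each node and divide by that node's core count."""
    used = {node: 0 for node in nodes}
    for assignment in layout.values():
        for node, cores in assignment.items():
            used[node] += cores
    return {node: used[node] / cores_per_node for node in nodes}


def cluster_utilization(layout, nodes, cores_per_node):
    """Sum every core requirement and divide by the total cores in the cluster."""
    required = sum(sum(assignment.values()) for assignment in layout.values())
    return required / (len(nodes) * cores_per_node)


node_util = per_node_utilization(node_layout, NODES, CORES_PER_NODE)
cluster_util = cluster_utilization(node_layout, NODES, CORES_PER_NODE)

print(f"cluster: {cluster_util:.1%}")
for node in NODES:
    print(f"node {node}: {node_util[node]:.0%}")
```

Moving instances between nodes in `node_layout` without changing their core counts leaves `cluster_util` untouched while the per-node numbers shift, which is exactly why both views need to be watched.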
  8. Now what we don't want to happen is this: we change the node layout and assign more instances to node 2 while keeping the same CPU core requirement across the databases. The cluster level utilization stays the same, BUT on the per compute node level you end up with node 2 at 80% utilization while the rest are pretty much idle. So we created a bunch of tools with which we can easily create a provisioning plan, play around with scenarios, and audit the result. And that's what Randy will be introducing. BTW, I like the part of the interview with Cary where he mentioned that even with a 135-lane highway you will still have the same traffic problem as with the 35-lane highway if you saturate it with a bunch of cars. So a capacity issue on small hardware can also be an issue on big hardware, and on Exadata too. This slide is similar to monitoring the utilization of the whole highway as well as the per-lane utilization of that highway.
  9. The three legs of the consolidation process: Gather Requirements, Provision Resources, Audit Results. Utilization metrics from the audit should be fed back into the provisioning process for re-evaluation.
  10. It's a capacity planning tool where we make sure that the demand on the 3 basic components (CPU, memory, IO) does not exceed the available capacity. Oracle has created a bunch of tools to standardize the installation of Exadata, which helps to avoid mistakes and configuration issues. BUT the problem is: once all of the infrastructure is in place, how do you get to the end state where all of the instances are up and running? This tool bridges that gap. And since it's an Excel-based tool it's pretty flexible, and you can hand it off to your boss as documentation of the instance layout.
  11. Now we move on to the capacity. There's a section on the provisioning worksheet where you input the capacity of the Exadata that you currently have. For the node count you put 2, 4, or 8. Then we get the SPECint_rate equivalent of the Exadata processor so you'll be able to compare the SPEED of the Exadata CPU against your source servers. I also explained earlier that we are counting threads as cores, and I have an investigation on that which is available at this link. Each node has 96GB of memory. Disk space depends on ASM redundancy and the DATA/RECO allocation. The table compression factor lets you gain more disk space as you compress the big tables. Then there's the OFFLOAD FACTOR, which is the amount of CPU resources that will be offloaded to the storage cells. This is art. This is NOT something that you can calculate. It's not math. It's like black magic. We have done a bunch of Exadata projects, so when we see a workload we can guess what we think the offload percentage is. That definitely affects the CPU, but it's not something that you can scientifically calculate.
  12. So there's a source platform and a destination platform, which is the Exadata. You are transferring a bunch of databases from different platforms, and you have to get the equivalent number of Exadata cores for each source system. What we do is: we find the type, speed, and number of CPU cores of the source platform, then we make use of SPECint comparisons to find the equivalent number of Exadata cores needed. Of course the CPU core capacity will depend on the Exadata that you have (Quarter, Half, Full, Multiple Racks). Let me introduce you to some simple math: the chip efficiency factor. That will be your multiplier for how a source core equates to Exadata cores, and we make use of it to get the “Exa cores requirement” (let me explain the formula). Now if it's a DW database you will probably be doing a lot of offloading, so that's where we factor in the offload factor. And I'm pretty sure that if you attended Tim's presentation or the tuning class, you will get a higher offload factor here ;)
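The notes do not spell out the worksheet's formula, so the following Python is only a sketch of one plausible form of it. The assumptions are: the chip efficiency factor is the ratio of per-core SPECint_rate between source and Exadata, the requirement is scaled by the source's peak CPU utilization, and the offload factor is applied as a simple percentage reduction. The function names and all the numbers in the example are hypothetical.

```python
def chip_efficiency_factor(source_specint_rate, source_cores,
                           exa_specint_rate, exa_cores):
    """Assumed definition: per-core speed of the source relative to one Exadata core."""
    source_per_core = source_specint_rate / source_cores
    exa_per_core = exa_specint_rate / exa_cores
    return source_per_core / exa_per_core


def exa_cores_requirement(source_cores, peak_cpu_utilization,
                          chip_eff_factor, offload_factor=0.0):
    """Assumed form: busy source cores, converted to Exadata cores, less the offloaded share."""
    busy_source_cores = source_cores * peak_cpu_utilization
    return busy_source_cores * chip_eff_factor * (1.0 - offload_factor)


# Made-up example: a 16-core source server rated at 240 versus a compute node
# with 24 threads rated at 720 (illustrative numbers, not published results).
factor = chip_efficiency_factor(240, 16, 720, 24)
oltp_cores = exa_cores_requirement(16, 0.5, factor)
dw_cores = exa_cores_requirement(16, 0.5, factor, offload_factor=0.25)
print(factor, oltp_cores, dw_cores)
```

The offload factor stays an input rather than something derived, which matches the point in note 11 that it is a judgment call and not a calculation.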
  13. The first event came about on a busy Friday, right in the middle of month-end processing. During month-end processing the 4 primary business databases become extremely busy and our configuration is put to the test.
  14. Just a quick review of the instance/node layout. Notice that the HR database (HCMPRD) shares node 2 with BIPRD (and two other smaller ones).
  15. Our first hint that there was a problem was the Oracle Load Map, which showed 66 active sessions on the HR database, all waiting for CPU! The complaints were: people could not enter their time (HR), and OBIEE was running painfully slow. When we tried to log in to the database server, it was so busy that we could hardly get logged in.
  16. We went to the Top Activity page for HCMPRD and found that the problem was with one particular SQL statement. We knew that we probably had a SQL statement with a bad plan, but we needed to take the pressure off of the CPU before we could do anything.
  17. Our first course of action was to implement instance caging to reduce the load on CPU resources and lessen the impact on the other databases sharing that node. When we confined the instance to 12 CPUs, notice what happened to the operating system run queue. Once we had limited the instance to 12 CPUs, we went about the task of investigating what went wrong with the SQL statement. We found that the execution plan *had* in fact changed. We used a SQL Profile to lock in the good execution plan. Now look at what happened to the active session count when we implemented the profile.
  18. During the course of troubleshooting this issue we discovered that the OBIEE application would fire off between 15 and 25 SQL queries with the click of a single button.
  19. Unlike the other databases, OBIEE was a new application without any utilization history, so we didn't know what to expect from it. BIPRD is still contending with HCMPRD, FSPRD, and MTAPRD when it runs some inefficient queries with Cartesian joins, which use a lot of PGA memory. The problem can be so extreme that we run out of memory and the system begins to swap heavily. Swapping causes high wait I/O and load average. Nothing will cripple a system faster than running out of memory. Swapping happens outside of the Oracle kernel, so the database doesn't know anything is wrong. All it *sees* are high waits for I/O. Since swapping happens outside of the database, instance caging does not help. This is a Tableau graph using metrics from Karl Arao's AWR Toolkit.
  20. We had to be drastic with our solution, so we segregated OBIEE from the other databases and ran it as a standalone database. The advantage is that we have isolated our “anti-social” database on a node by itself (node 3). So now when the workloads overlap, these databases won't have to compete for CPU and memory resources. This does *not* help us in terms of I/O. But we'll talk about that in our next story.
  21. This is the performance page whenever that inefficient batch of SQLs ran.
  22. Looking at the AWR data, we find spikes in PGA, WAIT IO, and LOAD AVG.
  23. And this is the inefficient SQL. It's a SQL that runs for 2 minutes, does a tiny bit of smart scans, and consumes 1.6G of PGA. That doesn't sound too terribly bad, but when you execute 60 of these at the same time it can quickly consume all the memory on the server.
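The arithmetic behind note 23 is worth making explicit. The sketch below multiplies the per-query PGA by the concurrency and compares it to the node's memory, assuming the 96GB per node figure from the capacity notes applies to the node in question. The function name is hypothetical.

```python
NODE_MEMORY_GB = 96       # per-node memory quoted in the capacity notes (assumed to apply here)
PGA_PER_QUERY_GB = 1.6    # PGA consumed by one execution of the inefficient SQL
CONCURRENT_QUERIES = 60   # executions running at the same time


def pga_demand_gb(pga_per_query_gb, concurrent_queries):
    """Total PGA needed if every concurrent execution holds its PGA at once."""
    return pga_per_query_gb * concurrent_queries


demand = pga_demand_gb(PGA_PER_QUERY_GB, CONCURRENT_QUERIES)
share_of_node = demand / NODE_MEMORY_GB

print(f"PGA demand: {demand:.0f} GB of {NODE_MEMORY_GB} GB ({share_of_node:.0%} of the node)")
```

Under these assumptions the PGA demand alone is about as large as the node's physical memory, before counting the SGAs of the instances on that node or the operating system, which is why the node starts to swap.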
  24. So we brought in Karen Morton and Martin Paynter to see if there was anything we could do to tune the SQL. This was when we discovered that a single report could kick off as many as 26 independent SQL queries (another ah-ha moment). The tuning greatly reduced the amount of PGA memory needed for the report. Martin also helped us configure OBIEE so that it would manage the number of SQL statements sent to the database. This graph shows what the load profile looked like when we were finished.
  25. IORM is an Exadata exclusive!
  26. We got an email from the DBA saying that his database refresh, which usually takes 40 minutes, took 12 hours. We immediately investigated for I/O contention issues.
  27. We saw that this BIUAT database is the only active database that is doing heavy I/O.
  28. Then we confirmed that it is doing sustained smart scans
  29. From v$sql we saw about 31 active sessions; most of which were doing smart scans
  30. And looking at the SQL it is scanning 4TB of data and returning 400GB of it. If you do this kind of SQL you will saturate the Exadata storage cells.
  31. You can see the details of the smart scans in the row source operations of the SQL. Here each of the row sources is scanning 1TB and returning 144GB.
  32. And this is what the Top Activity page of the other databases will look like when they run into an I/O contention issue. You'll see here that the database has high system I/O waits on the critical background processes CKPT, LGWR, DBWR, and LMON.
  33. This led us to a simple IORM plan that just caps the BIPRD database, which is the “anti-social” database, on Level 1, and puts the OTHER group, which is the rest of the databases on the cluster, on Level 2. This decision is based on the analysis of the workload of the databases that are doing the smart scans. BIPRD will still get 100% if the other databases are idle and will pull back to 30% when the databases from the OTHER group need I/O bandwidth (and vice versa).
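The sharing behaviour described in note 33 can be mimicked with a small toy model. This is not how IORM is implemented on the storage cells; it only reproduces the outcome the note describes, assuming level 1 (BIPRD) is guaranteed 30%, level 2 (OTHER) gets 100% of what remains, and any bandwidth one group does not ask for is handed to the other. The function name and the idea of expressing demand as a percentage of total I/O bandwidth are assumptions made for illustration.

```python
def iorm_shares(biprd_demand, other_demand, biprd_alloc=30.0):
    """Toy model: percentage of total I/O bandwidth each group ends up with.

    Demands and results are percentages of the total bandwidth (0 to 100).
    """
    other_alloc = 100.0 - biprd_alloc  # level 2 gets 100% of what level 1 leaves

    # Under contention each group is held to its allocation.
    biprd = min(biprd_demand, biprd_alloc)
    other = min(other_demand, other_alloc)

    # Bandwidth nobody claimed goes to whichever group still has unmet demand,
    # level 1 first.
    spare = 100.0 - biprd - other
    extra_biprd = min(spare, biprd_demand - biprd)
    biprd += extra_biprd
    spare -= extra_biprd
    other += min(spare, other_demand - other)
    return biprd, other


print(iorm_shares(100, 100))  # both groups busy
print(iorm_shares(100, 0))    # other databases idle
print(iorm_shares(0, 100))    # BIPRD idle
```

With both groups saturating the cells the model settles at 30/70, and when either side is idle the busy side gets the full bandwidth, matching the "and vice versa" in the note.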
  34. SYS@biprd2> show parameter db_uniq

      NAME                                 TYPE        VALUE
      ------------------------------------ ----------- ------------------------------
      db_unique_name                       string      BIPRDDAL

      # main commands
      alter iormplan dbPlan=( -
      (name= BIPRDDAL,    level=1, allocation=30), -
      (name=other,    level=2, allocation=100));
      alter iormplan active
      list iormplan detail

      list iormplan attributes objective
      alter iormplan objective = auto

      # list
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan attributes objective'

      # implement
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan dbPlan=\( \(name= BIPRDDAL,    level=1, allocation=30\), \(name=other,    level=2, allocation=100\)\);'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan active'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'

      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan objective = auto'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan attributes objective'

      # revert
      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan dbPlan=\"\"'
      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan catPlan=\"\"'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'
      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan inactive'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan detail'

      dcli -g ~/cell_group -l root 'cellcli -e alter iormplan objective=\"\"'
      dcli -g ~/cell_group -l root 'cellcli -e list iormplan attributes objective'