3. All businesses have high IT Costs
Silos of hardware, storage, software & applications
• Sized for individual peak loads
– Inefficient and expensive
• Meet changing business needs?
– Inflexible and unresponsive
• Expensive to manage
– Too many moving parts
4. Factors Driving Consolidation
Business Drivers
• Reduce IT Costs
– Lower CapEx: servers, storage, S/W licenses
– Lower OpEx: maintenance, management
• Reduce Complexity
– Reduce: configurations, services
– Standardize: OS, DB versions
• Increase Agility
– Enable: resource elasticity, rapid provisioning, fast deployment
• Increase Quality of Service
– Enhance: IT service time, availability, security
5. Server Consolidation
Utilizing more powerful hardware
• Servers are getting more and more powerful
– For example:
• Exadata X2-8: 8 Nehalem CPUs, 64 cores, 1TB memory
– Many databases don’t fully utilize their server!
• Solution: Server Consolidation
– Run multiple database instances on the same server
• But there may be problems:
– Contention for CPU, memory, and I/O
– An unexpected workload surge on one instance can impact other databases
[Figure: Videos, Carpool, OLTP_A, OLTP_P, MAIL_A and MAIL_P running on one server]
6. Database Consolidation
Utilizing the power of one
• Database consolidation means:
– multiple applications or workloads run within the same database
• For shared data consolidation it is almost imperative…
– “Reporting on an OLTP database”
– Tactical queries and advanced analytics in a data warehouse
• Pros and cons when it comes to:
– Upgrades and patching
– Backup and Recovery
[Figure: Videos, Carpool, OLTP_A, OLTP_P, MAIL_A and MAIL_P consolidated into Database One]
8. Mixed Workload on Consolidated Servers
[Figure: 4-node cluster for smaller databases; 4-8 node cluster for large databases]
• … different Cluster sizes for different use cases
• The question remains: “How to govern resources on the server?”
9. The “How” is -
Instance Caging
[Figure: Server A hosts DB1 and DB2; DB1 is caged to a 5 core limit, DB2 to a 3 core limit]
10. CPU Usage Without Instance Caging
• Oracle processes from one database instance try to use all CPUs
• Processes wait for CPU on the O/S run queue
[Figure: running processes and processes waiting on the O/S run queue]
11. CPU Usage With Instance Caging
• Instance Caging limits the number of Oracle processes running at any moment in time
• Processes wait for CPU on Resource Manager run queues
[Figure: running processes and processes waiting on Resource Manager run queues]
12. How to configure Instance Caging
• Limits the CPU resources that a database instance uses
• Available in 11.2.0.1
• Configured in just 2 steps:
1. Set “cpu_count” parameter
• Maximum number of CPUs the instance can use at any time
2. Set “resource_manager_plan” parameter
• Enables CPU Resource Manager
• E.g. out-of-box plan “DEFAULT_PLAN”
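The two configuration steps can be sketched as a small helper that renders the corresponding ALTER SYSTEM statements. This is a minimal sketch: the function name `instance_caging_statements` and the example value of 4 CPUs are illustrative assumptions, and the statements would still need to be run by a suitably privileged user.

```python
def instance_caging_statements(cpu_count, plan="DEFAULT_PLAN"):
    """Build the two ALTER SYSTEM statements that enable Instance Caging.

    Hypothetical helper for illustration; it only renders SQL text.
    """
    if not isinstance(cpu_count, int) or cpu_count < 1:
        raise ValueError("cpu_count must be a positive integer")
    if not plan.replace("_", "").isalnum():
        raise ValueError("plan name must be a simple identifier")
    return [
        # Step 1: maximum number of CPUs the instance can use at any time.
        f"ALTER SYSTEM SET cpu_count = {cpu_count}",
        # Step 2: enabling a resource plan turns on CPU Resource Manager.
        f"ALTER SYSTEM SET resource_manager_plan = '{plan}'",
    ]


# Cage this instance to 4 CPUs (4 is an arbitrary example value).
for statement in instance_caging_statements(4):
    print(statement + ";")
```

Both parameters must be set; a cpu_count without an active resource plan does not cage the instance.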
13. Instance Caging
Partitioning Approach
• Provides maximum isolation
• For performance-critical databases
• If one database is idle, its CPU allocation is unused
CPU Allocations
• Instance CRM: 2 CPUs
• Instance HR: 2 CPUs
• Instance ERP: 4 CPUs
• Instance EDW: 8 CPUs
[Figure: stacked CPU allocations on an axis from 0 to 32, with the number of CPUs on the server marked at 16]
14. Instance Caging
Over-Provisioning Approach
• For non-critical databases that are typically well-behaved
• Contention for CPU if databases are sufficiently loaded
– Not enough contention to destabilize OS or database instances
• Best approach if the goal is to fully utilize CPUs
CPU Allocations
• Instance CRM: 4 CPUs
• Instance HR: 4 CPUs
• Instance ERP: 8 CPUs
• Instance EDW: 8 CPUs
[Figure: stacked CPU allocations on an axis from 0 to 32, with the number of CPUs on the server marked at 16]
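The difference between the two approaches comes down to whether the cage sizes add up to more than the server's CPUs. A minimal sketch, assuming a 16-CPU server (read from the chart axis) and a hypothetical helper name `classify_cages`:

```python
def classify_cages(server_cpus, cages):
    """Return the approach name and the ratio of caged CPUs to real CPUs.

    Hypothetical helper: `cages` maps instance name to its cpu_count.
    """
    total = sum(cages.values())
    approach = "partitioning" if total <= server_cpus else "over-provisioning"
    return approach, total / server_cpus


SERVER_CPUS = 16  # assumption: value marked on the slides' chart axis

partitioned = {"CRM": 2, "HR": 2, "ERP": 4, "EDW": 8}
over_provisioned = {"CRM": 4, "HR": 4, "ERP": 8, "EDW": 8}

print(classify_cages(SERVER_CPUS, partitioned))
print(classify_cages(SERVER_CPUS, over_provisioned))
```

With these numbers the first set of cages exactly fills the server, while the second allocates 24 CPUs' worth of cages on 16 CPUs, so contention is possible only when several instances are busy at once.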
15. Instance Caging Results
• 4 CPU server
• Workload is a mix of OLTP transactions, parallel queries, and DMLs from Oracle Financials
16. Instance Caging
Best Practices
• Cage size, a.k.a. cpu_count, is a dynamic parameter
– Changes take place immediately
– Some overhead, so limit changes to once an hour
– Changes to cpu_count also affect other settings
• e.g. parallel execution
– Avoid huge changes to cpu_count, particularly from a small initial value (e.g. 1 or 2)
• Instance Caging in 11.2.0.1:
– See My Oracle Support note 1208064.1
• Monitor Instance Caging throttling
– AWR reports: “resmgr:cpu quantum” wait event
– Indicates that this instance would benefit from a larger cage size
17. Mixed Workload – More Aspects to Consider
[Figure: 4-node cluster for smaller databases; 4-8 node cluster for large databases; Videos, Carpool, OLTP_A, OLTP_P, MAIL_A and MAIL_P consolidated into Database One]
• Instance Caging can be used to govern “external CPU usage”
• What about governing CPU usage inside of one database?
18. The “How” is:
Configure Database Resource Manager
1. Group sessions with similar performance objectives into Consumer Groups
2. Allocate resources to consumer groups using Resource Plans
3. Enable Resource Plan
19. Problem:
Workloads contending for CPU
• When a database host has insufficient CPU for all workloads, the workloads will compete for CPU.
• Performance of all workloads will degrade!
• What if you cannot tolerate performance degradations for certain workloads?
[Figure: CPU usage. OLTP only: 80%. Reports only: 90%. OLTP + Reports: 100%, split 60% and 40%.]
20. Solution:
Resource Manager to manage such workloads
• With Resource Manager, you control how CPU resources should be allocated
[Figure: CPU usage. OLTP only: 80%. Reports only: 90%. OLTP + Reports with Resource Manager enabled: 80% and 20% when OLTP is prioritized, 90% and 10% when Reports is prioritized.]
22. Resource Management
Step 1: Create consumer groups and map sessions
[Figure: User Sessions pass through Mapping Rules into Consumer Groups (OLTP, Reports, Ad-Hoc, Low Pri)]
• Service, module, and action names (or combinations thereof), Oracle user name, client program name, OS user name, client machine name and client ID can be used to map sessions to consumer groups dynamically
23. Resource Management
Step 1: Create consumer groups and map sessions
[Figure: User Sessions pass through Mapping Rules into Consumer Groups (OLTP, Reports, Ad-Hoc, Low Pri)]
Mapping rules shown:
• client program = ‘Siebel Call Center’
• service = ‘Customer_Service’
• Oracle user = ‘Reports%’
• module = ‘Oscar’
• query has been running > 1 hour
• estimated execution time of query > 1 hour
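How attribute-based mapping rules pick a consumer group can be sketched as a small first-match lookup. This is an illustration only: the pairing of each rule with a group is an assumption (the slide's arrows did not survive), the attribute keys and function names are hypothetical, and only the `%` wildcard from the slide is handled. Sessions matching no rule fall into Oracle's OTHER_GROUPS.

```python
import re

# Assumed rule-to-group pairings; evaluated top to bottom, first match wins.
RULES = [
    ("client_program", "Siebel Call Center", "OLTP"),
    ("service", "Customer_Service", "OLTP"),
    ("oracle_user", "Reports%", "Reports"),
    ("module", "Oscar", "Ad-Hoc"),
]
DEFAULT_GROUP = "OTHER_GROUPS"


def matches(pattern, value):
    """Match `value` against `pattern`, where '%' stands for any text."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None


def map_session(session):
    """Return the consumer group for a dict of session attributes."""
    for attribute, pattern, group in RULES:
        value = session.get(attribute)
        if value is not None and matches(pattern, value):
            return group
    return DEFAULT_GROUP


print(map_session({"oracle_user": "Reports_EMEA"}))
print(map_session({"module": "Payroll"}))
```

The two time-based rules (running or estimated execution time over 1 hour) depend on query progress rather than session attributes, so they are left out of this sketch.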
24. Resource Management
Step 2: Create resource plans
[Figure: User Sessions are placed in Consumer Groups (OLTP, Reports, Ad-Hoc, Low Pri); Resource Plan(s) hold the resource allocations for Consumer Groups]
Consumer Group   Level 1 Allocation   Level 2 Allocation   Maximum Utilization
OLTP             90%                  -                    -
Reports          -                    60%                  80%
Ad-Hoc           10%                  30%                  50%
Low Pri          -                    10%                  50%
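To see how a two-level plan plays out, the sketch below applies a simplified model to the percentages in the table: each level divides whatever CPU the previous level left over among the groups that currently have work, and the maximum utilization acts as a cap. The model, the `PLAN` layout and the function name `cpu_shares` are assumptions for illustration, not a description of Resource Manager internals.

```python
# Percentages as read from the slide's table; None means no cap is shown.
PLAN = {
    "OLTP":    {"levels": [90, 0],  "max": None},
    "Reports": {"levels": [0, 60],  "max": 80},
    "Ad-Hoc":  {"levels": [10, 30], "max": 50},
    "Low Pri": {"levels": [0, 10],  "max": 50},
}


def cpu_shares(plan, active):
    """Return each group's CPU share (in %) given the set of busy groups."""
    shares = {group: 0.0 for group in plan}
    remaining = 100.0
    for level in range(2):
        used = 0.0
        for group in active:
            grant = remaining * plan[group]["levels"][level] / 100
            shares[group] += grant
            used += grant
        remaining -= used  # unclaimed CPU flows down to the next level
    for group, spec in plan.items():
        if spec["max"] is not None:
            shares[group] = min(shares[group], spec["max"])
    return shares


print(cpu_shares(PLAN, {"OLTP", "Reports", "Ad-Hoc", "Low Pri"}))
print(cpu_shares(PLAN, {"Reports", "Ad-Hoc", "Low Pri"}))
```

Under this model, when every group is busy Level 1 consumes all CPU and Level 2 groups get nothing; once OLTP goes idle, its 90% flows down and is split 60/30/10 among the Level 2 entries.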
25. Resource Management
Step 3: Enable plans
[Figure: User Sessions are placed in Consumer Groups (OLTP, Reports, Ad-Hoc, Low Pri); Resource Plan(s) hold the resource allocations for Consumer Groups]
• CPU: enforced by the Oracle Database Instance (DBRM)
• I/O: enforced by the Exadata Storage Server Software (IORM)
26. Resource Manager Example
Prioritizing Level 2 Allocation
[Figure: Reports and Ad-Hoc sessions waiting in the Oracle-internal CPU queue]
• Resource Manager: sessions scheduled every 100 milliseconds
Resource Plan
Consumer Group   Level 2 Allocation
Reports          60%
Ad-Hoc           30%
Low Pri          10%
27. CPU Usage with Resource Manager
• Sessions wait on “resmgr:cpu quantum” event
• CPU Resource Manager: sessions scheduled every 100 ms (OLTP picked 3 out of 4 times)
Resource Plan:
OLTP 75%
Reports 25%
[Figure: OLTP and Reports sessions waiting in the Oracle-internal CPU queue]
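The "3 out of 4 times" behaviour can be reproduced with a simple weighted scheduler. The sketch below uses smooth weighted round-robin as a stand-in; the slides do not describe the algorithm Resource Manager actually uses, so treat this as an illustration of the ratio only. It assumes both groups always have a session waiting.

```python
from collections import Counter


def schedule(weights, ticks):
    """Pick one consumer group per scheduling tick, in proportion to weights."""
    credit = {group: 0 for group in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(ticks):
        for group, weight in weights.items():
            credit[group] += weight          # every group earns its weight
        chosen = max(credit, key=credit.get)  # most credit runs this tick
        credit[chosen] -= total               # the winner pays for the tick
        picks.append(chosen)
    return picks


plan = {"OLTP": 75, "Reports": 25}
print(schedule(plan, 4))
print(Counter(schedule(plan, 400)))
```

With a 75/25 plan the credits return to zero every four ticks, so OLTP is chosen in three of every four ticks for as long as both groups have work.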
28. Manage Runaway Queries with
Resource Manager
• For Tactical consumer group, runaway means: 30+ sec
– Switch to “Low Priority” consumer group!
• For Reports consumer group, runaway means: 32GB+ I/Os
– Abort query!
• For Ad-Hoc consumer group, runaway means: 24+ hour estimated execution time
– Don’t execute!
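The three runaway policies above can be written down as a small decision function. This is a sketch of the policy logic only: the function and parameter names are hypothetical, and in a real database these thresholds would be set as directives in the resource plan rather than checked in application code.

```python
def runaway_action(group, elapsed_seconds=0, io_gigabytes=0, estimated_hours=0):
    """Return what happens to a query under the slide's three policies."""
    if group == "Tactical" and elapsed_seconds >= 30:
        return "switch to Low Priority consumer group"
    if group == "Reports" and io_gigabytes >= 32:
        return "abort query"
    if group == "Ad-Hoc" and estimated_hours >= 24:
        return "do not execute"
    return "run normally"


print(runaway_action("Tactical", elapsed_seconds=45))
print(runaway_action("Reports", io_gigabytes=40))
print(runaway_action("Ad-Hoc", estimated_hours=30))
print(runaway_action("Reports", io_gigabytes=5))
```

Each group has its own definition of "runaway" and its own response, so the same long-running statement can be demoted, cancelled or rejected up front depending on where its session was mapped.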
29. Mixed Workload – More Aspects to Consider
[Figure: 4-node cluster for smaller databases; 4-8 node cluster for large databases; workloads consolidated into Database One]
• Use the Database Resource Manager to control “CPU usage inside of one database”
• What about prioritising my statement execution?
30. The “How” is –
Parallel Statement Queuing
• When parallel servers become available, the resource plan is used to select a queue. The head parallel statement from that queue is run.
• Since Reports is a higher priority, its parallel statements are always selected first.
• Since there are no more Reports parallel statements, we pick either Ad-Hoc or Low Pri.
[Figure: the Resource Manager's Parallel Statement Queue Coordinator selects statements from the Reports Queue, Ad-Hoc Queue and Low Pri Queue and moves them to Running Queries; the figure shows 64 parallel servers]
Consumer Group Level 2 Level 3
Reports 60%
Ad-Hoc 30%
Low Pri 10%
31. Reserving Parallel Servers for Critical Work
• Since parallel servers are available, Report requests can be run immediately
• Reports limited to 50% of the parallel servers
[Figure: the Resource Manager's Parallel Statement Queue Coordinator with the Reports Queue, Ad-Hoc Queue and Low Pri Queue feeding Running Queries; Available Servers: 48 (the figure also shows the values 64 and 32)]
Consumer Group / Level 2 / Level 3 / Max % of Parallel Servers
Reports 60% 50%
Ad-Hoc 30% 50%
Low Pri 10% 50%
32. Mixed Workload – More Aspects to Consider
[Figure: 4-node cluster for smaller databases; 4-8 node cluster for large databases; Videos, Carpool, OLTP_A, OLTP_P, MAIL_A and MAIL_P consolidated into Database One]
• Use parallel statement queuing to prioritise statement execution
• What about governing I/O usage to all my databases?
33. The “How” is -
Exadata - I/O Resource Manager
• A Database Resource Plan manages workloads within a database
• An Inter-Database Resource Plan manages databases sharing Exadata storage cells
[Figure: workloads OLTP, Reports and Ad Hoc, and databases EDW, ERP and HR, sharing Exadata Storage]
34. Exadata I/O Resource Manager
1. Pick a database
2. Pick a Consumer Group
3. Issue the head I/O request
[Figure: IORM on an Exadata storage cell holds outstanding I/O requests in per-consumer-group queues: an OLTP Queue and a Reports Queue for the Sales database, a Tactical Queries Queue and a Batch Queries Queue for the Finance database. ERP and HR databases also appear in the figure.]
Database   Allocation
Sales      80%
Finance    20%
Resource Plans
Consumer Group   Level 1 Allocation
OLTP             75%
Reports          25%
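Because IORM first picks a database and then a consumer group, the two sets of percentages multiply. The sketch below works out the resulting share of I/O per queue when every queue has requests waiting. It assumes the OLTP/Reports plan belongs to the Sales database, as the figure layout suggests; the Finance plan's percentages are not given on the slide and are left out. The function name `effective_io_shares` is hypothetical.

```python
def effective_io_shares(database_allocation, plans):
    """Combine inter-database and per-database percentages into one share."""
    shares = {}
    for database, db_pct in database_allocation.items():
        for group, group_pct in plans.get(database, {}).items():
            shares[(database, group)] = db_pct * group_pct / 100
    return shares


database_allocation = {"Sales": 80, "Finance": 20}
plans = {"Sales": {"OLTP": 75, "Reports": 25}}  # assumed to be the Sales plan

for (database, group), pct in effective_io_shares(database_allocation, plans).items():
    print(f"{database} / {group}: {pct:.0f}% of I/O")
```

Under these assumptions Sales OLTP ends up with 60% of the cell's I/O and Sales Reports with 20%, while the remaining 20% goes to Finance.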
35. Mixed Workload – More Aspects to Consider
[Figure: 4-node cluster for smaller databases; 4-8 node cluster for large databases; Videos, Carpool, OLTP_A, OLTP_P, MAIL_A and MAIL_P consolidated into Database One]
• Use Exadata I/O Resource Management to govern I/O
• Can I dynamically move my mixed workload?
36. The “How” is –
Server Pools
• Logical division of a cluster into pools of servers.
• Hosts applications (which could be databases or applications)
Why Use Server Pools?
• Easy allocation of resources to workload
• Easy management of Oracle RAC
– Just define instance requirements (# of nodes – no fixed assignment)
• Facilitates Consolidation of Applications and Databases on Clusters
[Figure: Oracle Grid Infrastructure cluster divided into server pools hosting Siebel, PSFT, Oracle RAC DBs (RAC DB1, RAC DB2), RAC One and a FREE pool]
37. Policy-based Database Management
A new way of managing your Oracle RAC
• Policy-managed cluster management can be applied to Oracle Real Application Clusters (RAC)
• Two management styles available now:
• Administrator Managed
– Specifically define where the database should run with a list of server names (“traditional way”)
– Define where services should run within the DB
• Policy Managed
– Define resource requirements for expected workload
– Ensure enough instances are started to support expected workload, if there are enough nodes in the cluster
– Goal: remove hard coding of service to instance
[Figure: Oracle Grid Infrastructure cluster with Oracle RAC DBs (RAC DB1, RAC DB2), RAC One and a FREE pool]
38. What Management Style to use?
Policy managed is the future
• Administrator Managed
– Allows and requires maximum control
• Failover management is pre-set
– Existing systems have worked well using it
– Slows down dynamic addition of nodes to the cluster
– Suitable for smaller clusters or rather static systems
• Policy Managed
– Control is based on policies
– Additional capacity will be used instantaneously in accordance with the policies defined
– Optimizes bigger clusters (> 4 nodes)
– Enables dynamic cluster environments
– Useful for future projects and when planning ahead
40. Mixed Workload Management
Define, monitor, adjust resource sharing plans
• Define mixed workload plans
– Set priorities
– Allocate resources
– Set thresholds and throttles
• Monitor the workload
• Adjust policies over time
• If using Quality of Service
– May make recommendations
[Figure: cycle of Define Workload Plans, Execute Workloads, Monitor Workloads, Adjust Workload Plans]
42. Mixed Workload Management in Action
[Figure: Reports query response time (seconds): Reports Query Only vs. With Ad-Hoc]
43. Mixed Workload Management in Action
[Figure: Reports query response time (seconds): Reports Query Only; With Ad-Hoc; Enable Parallel Queuing]
44. Mixed Workload Management in Action
[Figure: Reports query response time (seconds): Reports Query Only; With Ad-Hoc; Enable Parallel Queuing; CPU Resource Manager]
45. Mixed Workload Management in Action
[Figure: Reports query response time (seconds): Reports Query Only; With Ad-Hoc; Parallel Queuing; CPU Resource Manager; I/O Resource Manager]
46. Summary
1) For mixed workload databases, use Resource Manager to ensure sufficient resources for workloads that are performance critical.
• CPU Resource Manager
• I/O Resource Manager
• Parallel Statement Queuing
• Runaway Query Management
2) For server consolidation, use Instance Caging to distribute CPU among the databases.
3) For storage consolidation, use IORM to distribute disk bandwidth among the databases.