2. WHO AM I?
I am an Adobe Community Professional
I started building web applications a long time ago
Contributor to Learn CF in a week
I have a ColdFusion podcast called
CFHour w/ Scott Stroz (@boyzoid)
(please listen)
3x California State Taekwondo
Weapons Champion
3. WHAT WILL WE COVER?
Running Queries
When good SQL goes bad
Bulk processing
Large volume datasets
Indexes
Outside influences
14. “My app works fine. It has
thousands of queries and we
only see slowness every once in
a while.”
15. Have you ever truly looked at
what your queries are doing?
16. Most developers don't bother.
They leave all that technical
database stuff up to the DBA.
But what if you are the
developer AND the DBA?
17. QUERY EXECUTION
Query Plan
Uses Execution Contexts
Created for each degree of parallelism for
a query
Execution Context
Specific to the query being executed.
Created for each query
21. WHAT A QUERY PLAN WILL TELL
YOU
Path taken to get data
Almost like a Java stack trace
Indexes usage
How the indexes are being used
Cost of each section of plan
Possible suggestions for performance
improvement
Whole bunch of other stuff
22. How long are plans / contexts kept?
1 Hour
1 Day
‘Til SQL Server restarts
Discards it immediately
The day after forever
Till the server runs out of cache space
23. What can cause plans to be flushed from
cache?
Forced via code
Memory pressure
Alter statements
Statistics update
AUTO_UPDATE_STATISTICS on
24. HOW CAN WE KEEP THE
DATABASE FROM
THROWING AWAY THE
PLANS?
32. According to the SQL
optimizer,
this query…
select id, name from myTable where id = 2
and this query…
select id, name from myTable where id = 5
are not the same.
So, they each get their own execution plan.
33. PLANS CAN BECOME DATA HOGS
select id, name from myTable where id = 2
If the query above ran 5,000 times over the
course of an hour (with different ids), you could
have that many plans cached.
That could equal around 120 MB of cache space!
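A rough way to see the effect: a plan cache keyed on statement text treats every literal id as a brand new statement, while a bound parameter keeps the text constant. The sketch below uses Python with the standard-library sqlite3 module as a stand-in for ColdFusion and SQL Server (an assumption for illustration only), and myTable is a throwaway in-memory table. It only counts the distinct statement texts sent to the database; it does not inspect a real plan cache.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table myTable (id integer primary key, name text)")
conn.executemany(
    "insert into myTable (id, name) values (?, ?)",
    [(i, f"row {i}") for i in range(1, 5001)],
)

literal_texts = set()
param_texts = set()

for wanted_id in range(1, 5001):
    # Literal baked into the SQL (demo only): the text differs on every call.
    literal_sql = f"select id, name from myTable where id = {wanted_id}"
    conn.execute(literal_sql).fetchone()
    literal_texts.add(literal_sql)

    # Placeholder: the text is identical on every call, only the bound value changes.
    param_sql = "select id, name from myTable where id = ?"
    conn.execute(param_sql, (wanted_id,)).fetchone()
    param_texts.add(param_sql)

print(len(literal_texts), "distinct statement texts with literals")
print(len(param_texts), "distinct statement text with a parameter")
```

Every distinct text is a candidate for its own cached plan, which is where the cache space goes.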
36. Using a simple query… let’s add a param for the
id.
<cfquery name="testQuery">
select a.ARTID, a.ARTNAME from ART a
where a.ARTID = <cfqueryparam value="5"
cfsqltype="cf_sql_integer">
</cfquery>
38. THE DEBUG OUTPUT LOOKS LIKE
THIS…
testQuery (Datasource=cfartgallery, Time=1ms, Records=1)
in /xxx/x.cfm
select a.ARTID, a.ARTNAME from ART a
where a.ARTID = ?
Query Parameter Value(s) -
Parameter #1(cf_sql_integer) = 5
39. IT EVEN WORKS ON LISTS…
testQuery (Datasource=cfartgallery, Time=8ms, Records=5) in
/xxx/x.cfm
select a.ARTID, a.ARTNAME from ART a
where a.ARTID in (?,?,?,?,?)
Query Parameter Value(s) -
Parameter #1(CF_SQL_CHAR) = 1
Parameter #2(CF_SQL_CHAR) = 2
Parameter #3(CF_SQL_CHAR) = 3
Parameter #4(CF_SQL_CHAR) = 4
Parameter #5(CF_SQL_CHAR) = 5
40. MORE ACCURATELY, THEY WORK ANYWHERE
YOU WOULD HAVE DYNAMIC INPUT...
testQuery (Datasource=cfartgallery, Time=3ms, Records=1) in
/xxx/x.cfm
select a.ARTID, a.ARTNAME, (
select count(*) from ORDERITEMS oi where oi.ARTID = ?
) as ordercount
from ART a
where a.ARTID in (?)
Query Parameter Value(s) -
Parameter #1(cf_sql_integer) = 5
Parameter #2(cf_sql_integer) = 5
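The list case from the debug output above (one ? per list item) can be sketched outside ColdFusion too. This is Python with sqlite3 as a stand-in, not the CFML from the slides; find_art is a hypothetical helper and the ART rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ART (ARTID integer primary key, ARTNAME text)")
conn.executemany(
    "insert into ART (ARTID, ARTNAME) values (?, ?)",
    [(i, f"Art {i}") for i in range(1, 11)],  # made-up sample rows
)

def find_art(conn, art_ids):
    """Return (ARTID, ARTNAME) rows for the given ids, binding each id as a parameter."""
    if not art_ids:
        return []  # "in ()" is not valid SQL, so skip the query entirely
    # Only the placeholders go into the SQL text; the values stay out of it.
    placeholders = ",".join("?" for _ in art_ids)
    sql = (
        "select a.ARTID, a.ARTNAME from ART a "
        f"where a.ARTID in ({placeholders}) order by a.ARTID"
    )
    return conn.execute(sql, list(art_ids)).fetchall()

rows = find_art(conn, [1, 2, 3, 4, 5])
print(rows)
```

One thing to keep in mind: a list of 5 ids and a list of 6 ids produce different statement texts, so list parameters cut down the number of distinct statements but do not get it to one.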
41. When can plans cause more harm
than help?
► When your data structure changes
► When data volume grows quickly
► When you have data with a high
degree of cardinality.
43. What do I mean by large data
sets?
► Tables over 1 million rows
► Large databases
► Heavily denormalized data
44. Ways to manage large data
► Only return what you need (no “select *”)
► Try and page the data in some fashion
► Optimize indexes to speed up where
clauses
► Avoid using triggers on large volume
inserts
► Reduce any post query processing as
much as possible
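The first two bullets (only return what you need, and page the data) can be sketched like this. It is Python with sqlite3 as a stand-in for the real app and database; fetch_page is a hypothetical helper and the DESCRIPTION column is invented to represent data the page never shows. The limit syntax differs by database (SQL Server uses TOP or OFFSET ... FETCH, for example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ART (ARTID integer primary key, ARTNAME text, DESCRIPTION text)"
)
conn.executemany(
    "insert into ART (ARTID, ARTNAME, DESCRIPTION) values (?, ?, ?)",
    [(i, f"Art {i}", "long text the listing page never shows") for i in range(1, 101)],
)

def fetch_page(conn, after_id, page_size):
    """Return the next page of (ARTID, ARTNAME) rows with ARTID greater than after_id."""
    # Name the two columns the page needs instead of "select *",
    # and let the primary key find the start of the page.
    return conn.execute(
        "select ARTID, ARTNAME from ART where ARTID > ? order by ARTID limit ?",
        (after_id, page_size),
    ).fetchall()

first_page = fetch_page(conn, 0, 10)
# The last id of one page is the starting point for the next one.
second_page = fetch_page(conn, first_page[-1][0], 10)
print(first_page[0], second_page[0])
```

Paging on the key like this keeps every page query the same shape, so it also plays well with parameterized plans.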
45. Inserting / Updating large datasets
► Reduce calls to database by combining
queries
► Use bulk loading features of your
Database
► Use XML/JSON to load data into
Database
48. Gotchas in query combining
► Errors could cause whole batch to fail
► Overflowing allowed query string size
► Database locking can be problematic
► Difficult to get any usable result from
query
49. Upsides of query combining
► Reduces network calls to database
► Processed as a single batch in database
► Generally processed many times faster
than doing the insert one at a time
► I have used this to insert over 50k rows
into MySQL in under one second.
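A minimal sketch of query combining, assuming a multi-row insert with bound parameters. It is Python with sqlite3 as a stand-in rather than ColdFusion and MySQL; the PRICE column, the sample rows, and BATCH_SIZE are all made up for illustration. Chunking the rows is one way to stay clear of the statement size gotcha, and the single transaction means an error rolls back the whole load instead of leaving it half done.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ORDERITEMS (ORDERITEMID integer primary key, ARTID integer, PRICE real)"
)

rows = [(art_id, 10.0 + art_id) for art_id in range(1, 1001)]  # made-up sample data

BATCH_SIZE = 100  # hypothetical; keeps each statement well under size and parameter limits

statements_sent = 0
with conn:  # one transaction: commits if every batch succeeds, rolls back on error
    for start in range(0, len(rows), BATCH_SIZE):
        batch = rows[start:start + BATCH_SIZE]
        # One "(?, ?)" group per row, so a whole batch travels in a single statement.
        values_clause = ",".join("(?, ?)" for _ in batch)
        params = [value for row in batch for value in row]
        conn.execute(
            f"insert into ORDERITEMS (ARTID, PRICE) values {values_clause}",
            params,
        )
        statements_sent += 1

inserted = conn.execute("select count(*) from ORDERITEMS").fetchone()[0]
print(statements_sent, "statements for", inserted, "rows")
```

The right batch size depends on the database and driver limits, so treat the number here as a placeholder to tune.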
51. Index Types
► Unique
  ► Primary key or row ID
► Covering
  ► A collection of columns, indexed in an order that matches where clauses, that contains every column the query needs
► Clustered
  ► The way the data is physically stored
  ► Table can only have one
► NonClustered
  ► Only contain indexed data with a pointer back to source data
52. Seeking and Scanning
► Index SCAN (table scan)
  ► Touches all rows
  ► Useful only if the table contains a small number of rows
► Index SEEK
  ► Only touches rows that qualify
  ► Useful for large datasets or highly selective queries
► Even with an index, the optimizer may still opt to perform a scan
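To watch a scan turn into a seek, you can ask the database for its plan before and after adding an index. The sketch below uses Python and SQLite's "explain query plan" as a stand-in for a SQL Server execution plan (SQLite labels a seek as SEARCH); plan_for, the index names, and the PRICE column are hypothetical. The last step builds an index that also contains PRICE, so the query is answered from the index alone (a covering index).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ART (ARTID integer primary key, ARTNAME text, PRICE real)")
conn.executemany(
    "insert into ART (ARTNAME, PRICE) values (?, ?)",
    [(f"Art {i}", float(i)) for i in range(1, 1001)],  # made-up sample rows
)

def plan_for(conn, sql, params):
    """Return the plan steps SQLite reports for a statement, joined into one string."""
    rows = conn.execute("explain query plan " + sql, params).fetchall()
    return "; ".join(row[3] for row in rows)  # fourth column is the readable detail

query = "select ARTID, PRICE from ART where ARTNAME = ?"

# No index on ARTNAME yet: every row has to be checked.
plan_without_index = plan_for(conn, query, ("Art 500",))

# Index on the where-clause column: only qualifying rows are touched.
conn.execute("create index IX_ART_ARTNAME on ART (ARTNAME)")
plan_with_index = plan_for(conn, query, ("Art 500",))

# Swap it for an index that also holds PRICE, so the table itself is never read.
conn.execute("drop index IX_ART_ARTNAME")
conn.execute("create index IX_ART_ARTNAME_PRICE on ART (ARTNAME, PRICE)")
plan_covering = plan_for(conn, query, ("Art 500",))

print("no index:      ", plan_without_index)
print("with index:    ", plan_with_index)
print("covering index:", plan_covering)
```

The exact wording of the plan text varies by SQLite version, and SQL Server's optimizer makes its own choices, so use this only to get a feel for the difference.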
53. To index or not to index…
► DO INDEX
  ► Large datasets where 10 to 15% of the data is usually returned
  ► Columns used in where clauses with high cardinality
    ► User name column where values are unique
► DON’T INDEX
  ► Small tables
  ► Columns with low cardinality
    ► Any column with only a couple of values
58. Other things that can affect
performance
► Processor load
► Memory pressure
► Hard drive I/O
► Network
59. Processor
► Give SQL Server process CPU priority
► Watch for other processes on the server using
excessive CPU cycles
► Have enough cores to handle your database
activity
► Try to keep average processor load below
50% so the system can handle spikes
gracefully
60. Memory (RAM)
► Get a ton (RAM is cheap)
► Make sure you have enough RAM to keep
your server from doing excess paging
► Make sure your DB is using the RAM in the
server
► Allow the DB to use RAM for cache
► Watch for other processes using excessive
RAM
61. Drive I/O
► Drive I/O is usually the largest bottleneck on the
server
► Drives can only perform one operation at a time
► Make sure you don’t run out of space
► Purge log files
► Don’t store all DB and log files on the same physical
drives
► On Windows don’t put your DB on the C: drive
► If possible, use SSD drives for tempdb or other highly
transactional DBs
► Log drives should be in write priority mode
► Data drives should be in read priority mode
62. Network
► Only matters if App server and DB server are on
separate machines (they should be)
► Minimize network hops between servers
► Watch for network traffic spikes that slow data
retrieval
► Only retrieving data needed will speed up retrieval
from DB server to app server
► Split network traffic on SQL Server across multiple
NICs so that general network traffic doesn’t impact
DB traffic
64. Important stats
► Recompiles
  ► Recompile of a proc while running shouldn’t occur
  ► Caused by code in proc or memory issues
► Latch Waits
  ► Low level lock inside DB; should be sub 10ms
► Lock Waits
  ► Data lock wait caused by thread waiting for another lock to clear
► Full Scans
  ► Select queries not using indexes
65. Important stats continued…
► Cache Hit Ratio
  ► How often DB is hitting memory cache vs disk
► Disk Read / Write times
  ► Access time or write times to drives
► SQL Processor time
  ► SQL Server processor load
► SQL Memory
  ► Amount of system memory being used by SQL Server