SQL Performance Tuning for Developers
1. SQL SERVER 2005/2008
Performance tuning for
the developer
Michelle Gutzait
gutzait@pythian.com
michelle.gutzait@gmail.com
Blog: http://michelle-gutzait.spaces.live.com/default.aspx
2. Whoami?
SQL Server Team Lead @ www.pythian.com
24/7 Remote DBA services
I live in Montreal
3. Agenda – Part I
General concepts of performance and
tuning
• Performance bottlenecks
• Optimization tools
• Table and index
• The data page
• The optimizer
• Execution plans
4. Agenda – Part II
Development performance Tips
• T-SQL commands
• Views
• Cursors
• User-defined functions
• Working with temporary tables and table variables
• Stored Procedures and functions
• Data Manipulation
• Transactions
• Dynamic SQL
• Triggers
• Locks
• Table and database design issues
5. “The fact that I can does not mean that I should!”
Kimberly Tripp (?)
6. Always treat your code as if it's running:
• Frequently
• On large amounts of data
• In a very busy environment
7. The goal
• Minimum response time and maximum throughput
• Reduce network traffic, disk I/O and CPU time
• Start optimizing as early as possible; it only gets harder later.
9. Client/Server Tuning Levels
Client side: Presentation Layer, Application Logic, Client OS, Network
Network communication: Operating System and Network Communication Hardware
Server side: Network, OS/IO Subsystem, SQL Server, Database Applications
10. The Typical Performance Pyramid
• Application / Query / Database Design
• Operating Environment
• Hardware
Beware: in certain environments this pyramid may be upside down!
26. Heap table
• A table with no clustered index
• The RID (row identifier) is built from file:page:row
27. Table Scan
Will usually be faster using a clustered index
28. [Diagram: query processing and optimization flow. Stages shown: Parsing, Sequence Tree, Normalization, "Is SQL?" (SQL vs. T-SQL), Trivial Plan Optimization, Syntactic Transformation, SARG Selection, Index Selection, JOIN Selection, "Is Cheap Enough?" (Yes / No), Execution Plan, Caching, Memory Allocation, Execution. Side notes: cost-based optimization, optimizer hints, view optimizer info.]
30. Search ARGuments (SARG)
Always isolate columns
SARG                                        NOT SARG
where MonthlySalary > 600000/12             where MonthlySalary * 12 > 600000
where ID in (select ID from vw_Person)      where dbo.fu_IsPerson(ID) = 1234
where firstname like 'm%'                   where SUBSTRING(firstname,1,1) = 'm'
SARG:
=, BETWEEN, >, <, LIKE 'x%', EXISTS
Not SARGable:
LIKE '%x', NOT LIKE, NOT EXISTS, FUNCTION(column)
AND creates a single SARG
OR creates multiple SARGs
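The SARG rule can be tried out with a small sketch like the one below. The table `dbo.Person`, its columns and the index names are hypothetical, chosen only to mirror the slide's examples. On a very small table the optimizer may scan regardless, so compare the plans on a realistically sized table.

```sql
-- Hypothetical table and indexes, for illustration only.
CREATE TABLE dbo.Person (
    ID            INT IDENTITY(1,1) PRIMARY KEY,
    firstname     VARCHAR(100) NOT NULL,
    MonthlySalary MONEY        NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Person_MonthlySalary ON dbo.Person (MonthlySalary);
CREATE NONCLUSTERED INDEX IX_Person_firstname     ON dbo.Person (firstname);
GO

-- Not SARGable: the column is buried inside an expression,
-- so the index on MonthlySalary cannot be used for a seek.
SELECT ID FROM dbo.Person WHERE MonthlySalary * 12 > 600000;

-- SARGable: the column stands alone on one side of the comparison.
SELECT ID FROM dbo.Person WHERE MonthlySalary > 600000 / 12;

-- Not SARGable: a function is applied to the column.
SELECT ID FROM dbo.Person WHERE SUBSTRING(firstname, 1, 1) = 'm';

-- SARGable: a LIKE pattern with a fixed prefix.
SELECT ID FROM dbo.Person WHERE firstname LIKE 'm%';
```

Comparing the execution plans of each pair (seek vs. scan) is the quickest way to confirm whether a predicate is being treated as a search argument.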
31. Table, column and index statistics
[Diagram: the rows of a Sales table sorted by state (AL, AK, CA, CT, IL, MT, OR, PA, TX, WA, WI, WY) are sampled into histogram steps 0 to 7 (AL, CA, IL, IL, OR, TX, WA, WY). The histogram is stored as a statblob in sys.sysobjvalues (internal).]
32. Update statistics - “Rules of thumb”
• Use auto create and auto update statistics
• Update statistics when 5% of the table has changed
• Still a bad query?
  - Create statistics
  - Update statistics with FULLSCAN
• Use multi-column statistics when queries have multi-column conditions
• Use the AUTO_UPDATE_STATISTICS_ASYNC database option
• No stats for temporary objects and functions
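As a rough sketch, the rules of thumb above map to the following statements. The database `SalesDB`, the table `dbo.Sales` and its columns are placeholder names, not objects from the presentation.

```sql
-- Placeholder names: SalesDB, dbo.Sales, State and City are assumptions.

-- Let SQL Server create and refresh statistics on its own.
ALTER DATABASE SalesDB SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE SalesDB SET AUTO_UPDATE_STATISTICS ON;

-- Refresh statistics in the background instead of making the
-- triggering query wait (only has an effect when auto update is ON).
ALTER DATABASE SalesDB SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO

-- Multi-column statistics for queries that filter on both columns.
CREATE STATISTICS ST_Sales_State_City ON dbo.Sales (State, City);

-- For a query that is still slow: rebuild statistics from every row.
UPDATE STATISTICS dbo.Sales WITH FULLSCAN;

-- Inspect the histogram steps the optimizer will use.
DBCC SHOW_STATISTICS ('dbo.Sales', ST_Sales_State_City);
```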
34. Joins - Optimization tip
HASH joins are used when no useful index exists on one or both of the JOIN inputs. They can often be converted to MERGE or LOOP joins through effective indexing.
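A minimal illustration of the tip, assuming hypothetical `dbo.Customers` and `dbo.Orders` tables where `Orders.CustomerID` has no index yet. Whether the optimizer actually switches join type depends on row counts and statistics, so check the plan before and after.

```sql
-- Hypothetical tables: dbo.Customers (CustomerID PK) and dbo.Orders.
-- Without an index on Orders.CustomerID the join below may be
-- resolved with a HASH join over a scan of dbo.Orders.
SELECT c.CustomerID, o.OrderID
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
    ON o.CustomerID = c.CustomerID
WHERE c.CustomerID = 42;
GO

-- Give the optimizer an ordered, seekable access path on the join column.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);
GO

-- Re-run the same query and compare the join operator in the plan.
SELECT c.CustomerID, o.OrderID
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
    ON o.CustomerID = c.CustomerID
WHERE c.CustomerID = 42;
```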
37. Index tips
• MORE indexes for queries, FEWER indexes for updates
• More indexes give the optimizer more possibilities
• Having a CLUSTERED INDEX is almost always a good idea…
• Index columns used in sort operations (TOP, DISTINCT, GROUP BY, ORDER BY), in JOINs and in WHERE clauses
• Keep index keys as narrow as possible to avoid excessive I/O
• Use integer values rather than character values
• Values with low selectivity
• Covering index - faster than a clustered index
38. Index tips 2
• The CLUSTERED index key is stored in all non-clustered indexes (otherwise the RID is used)
• Frequently updated column and clustered index
• Drop costly UNUSED indexes
• High volume inserts – incremental clustered index
• Surrogate integer primary key (identity?)
• Clustered index for random modifications and index bottleneck
• CLUSTERED index on non-unique columns – a 4-byte uniquifier is added to duplicate key values
39. Index tips 3
• Create an index before rare, heavy operations
• When changing or dropping a CLUSTERED index, drop all NON-CLUSTERED indexes first. Don't forget to recreate them later
• Indexes are almost always in cache, and are therefore faster
• A column referenced by OR with no index on the column → table scan
• PRIMARY KEY and UNIQUE constraints create indexes
• Foreign keys do NOT create indexes
40. Index tips 4
• Fewer, wider indexes are sometimes better than many narrower indexes
• INCLUDE columns for a covering index
• Indexes are used to reduce the number of rows fetched; otherwise they are not necessary
• If TEMPDB resides on a different physical disk, you may use SORT_IN_TEMPDB
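Two of the tips above (INCLUDE columns and SORT_IN_TEMPDB) can be combined in a single statement. The sketch below assumes a hypothetical `dbo.Orders` table with `CustomerID`, `OrderDate` and `TotalDue` columns.

```sql
-- Hypothetical table and columns, for illustration only.
-- The key stays narrow (CustomerID); the INCLUDE columns are stored
-- only at the leaf level so the query below is fully covered.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Covering
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, TotalDue)
    WITH (SORT_IN_TEMPDB = ON);  -- worthwhile when tempdb is on its own disk
GO

-- Every referenced column lives in the index, so no lookup into
-- the base table should be needed.
SELECT CustomerID, OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42;
```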
44. Data modifications: in-place, direct
[Diagram: a data page with a 96-byte page header, 8,096 bytes of data rows (Rows A to E) and a row offset table of 2 bytes per row. Row A is replaced by Row A Ver 2.]
45. Data modifications: in-place, indirect
[Diagram: a data page with a 96-byte page header, 8,096 bytes of data rows (Rows A to E) and a row offset table of 2 bytes per row. Row A is shown twice on the page.]
46. Data modifications: deferred update, forwarded
[Diagram: a data page with a 96-byte page header, 8,096 bytes of data rows (Rows A to E) and a row offset table of 2 bytes per row.]
In a heap, rows are forwarded, leaving the old address in place.
49. Indexed Views
Possible bottleneck
SELECT t1.Col2, t2.Col3,
count(*) as Cnt
FROM Table_1 t1
INNER JOIN Table_2 t2
ON t1.Col1 = t2.Col1
GROUP BY t1.Col2, t2.Col3
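For reference, turning the aggregate query on this slide into an indexed view could look roughly like the sketch below. The view and index names are made up, and the tables are assumed to live in the `dbo` schema. Indexed views require SCHEMABINDING, two-part object names and, with GROUP BY, COUNT_BIG(*) instead of COUNT(*).

```sql
-- Assumed names: dbo.vw_Col2Col3Counts and IX_vw_Col2Col3Counts are
-- hypothetical; Table_1 and Table_2 are assumed to be in schema dbo.
CREATE VIEW dbo.vw_Col2Col3Counts
WITH SCHEMABINDING
AS
SELECT t1.Col2,
       t2.Col3,
       COUNT_BIG(*) AS Cnt      -- COUNT_BIG is mandatory in an indexed view
FROM dbo.Table_1 AS t1
INNER JOIN dbo.Table_2 AS t2
    ON t1.Col1 = t2.Col1
GROUP BY t1.Col2, t2.Col3;
GO

-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_vw_Col2Col3Counts
    ON dbo.vw_Col2Col3Counts (Col2, Col3);
GO

-- NOEXPAND forces use of the view's index (needed outside Enterprise edition).
SELECT Col2, Col3, Cnt
FROM dbo.vw_Col2Col3Counts WITH (NOEXPAND);
```

The "possible bottleneck" follows from the design: every INSERT, UPDATE or DELETE on either base table now also has to maintain the aggregated rows of the view.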
50. “Performance tuning SQL statements involves doing things to allow the optimizer to make better decisions”
Your options for performance tuning are indexing or rewriting
54. Presentation Layer
Application Logic
Client OS
Network
Network
Disk
OS/IO Subsystem
SQL Server
Database Applications
55. What could possibly be “wrong” with this query?
SELECT * FROM MyTable WHERE Col1 = 'x'
SELECT Col1 FROM MyTable1, MyTable2
SELECT TOP 2000000 Col1 FROM MyTable1
Looping on the client side:
WHILE @i < 10000
    Update tb1 WHERE Col = @i
    @i = @i + 1
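One way to read the last example: the client sends up to 10,000 separate UPDATE statements, one round trip each. A set-based alternative is sketched below. The slide does not show what the UPDATE sets or where `@i` starts, so the `Processed` column and the starting value 0 are assumptions.

```sql
-- Assumptions: the loop starts at @i = 0, and "Processed" is a
-- hypothetical column standing in for whatever the loop updates.
-- One statement, one round trip, one set-based operation.
UPDATE dbo.tb1
SET Processed = 1
WHERE Col >= 0
  AND Col < 10000;
```

The same reasoning applies to the other queries on the slide: name only the columns you need instead of `SELECT *`, give every join an explicit join condition, and avoid returning millions of rows to the client.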
56. What could possibly be wrong with this query (cont.)?
SELECT *
FROM MyTable t1
INNER JOIN MyTable_2 t2 on t1.Col1 = t2.Col1
INNER JOIN MyTable_3 t3 on t1.Col1 = t3.Col1
LEFT JOIN MyTable_4 t4 on t1.Col1 = t4.Col1
LEFT JOIN MyTable_5 t5 on t1.Col1 = t5.Col1
LEFT JOIN MyTable_6 t6 on t1.Col1 = t6.Col1
LEFT JOIN MyTable_7 t7 on t1.Col1 = t7.Col1
LEFT JOIN MyTable_8 t8 on t1.Col1 = t8.Col1
LEFT JOIN MyTable_9 t9 on t1.Col1 = t8.Col1
LEFT JOIN MyTable_10 t10 on t1.Col1 = t8.Col1
……
57. What is the difference?
Short                                   Long(er)?
IF EXISTS                               SELECT @rc = COUNT(*)
  (SELECT 1 FROM MyTable)               FROM MyTable
                                        IF @rc > 0

IF EXISTS                               IF EXISTS
  (SELECT 1 FROM MyTable)                 (SELECT * FROM MyTable)

IF EXISTS                               IF NOT EXISTS
  (SELECT 1 FROM MyTable)                 (SELECT 1 FROM MyTable)

SELECT MyTable1.Col1,                   SELECT MyTable1.Col1,
       MyTable1.Col2                           MyTable1.Col2
FROM MyTable1                           FROM MyTable1
INNER JOIN MyTable2                     WHERE MyTable1.Col1 IN
  ON MyTable1.Col1 = MyTable2.Col1        (SELECT MyTable2.Col1
                                           FROM MyTable2)
58. What is the difference?
Short                                   Long(er)?
SELECT MyTable1.Col1,                   SELECT MyTable1.Col1,
       MyTable1.Col2                           MyTable1.Col2
FROM MyTable1                           FROM MyTable1
WHERE MyTable1.Col1 IN                  WHERE EXISTS
  (SELECT MyTable2.Col1                   (SELECT 1
   FROM MyTable2)                          FROM MyTable2
                                           WHERE MyTable2.Col1 =
                                                 MyTable1.Col1)
60. What is the difference?
Sort                                    No sort
SELECT Col1                             SELECT Col1
FROM Table1                             FROM Table1
UNION                                   UNION ALL
SELECT Col2                             SELECT Col2
FROM Table2                             FROM Table2

SELECT DISTINCT Col1                    SELECT Col1
FROM Table1                             FROM Table1

SELECT Col1                             SELECT Col1
FROM Table1                             FROM Table1
WHERE col2 IN (SELECT DISTINCT Col3     WHERE col2 IN (SELECT Col3
               FROM Table2)                            FROM Table2)

CREATE VIEW VW1 AS                      CREATE VIEW VW1 AS
SELECT * FROM DB2..Table1               SELECT * FROM DB2..Table1
ORDER BY Col1
61. Which one is BETTER?
Sort                                    No sort
SELECT Col1                             SELECT Col1
FROM Table1                             FROM Table1
WHERE ModifiedDate                      WHERE ModifiedDate =
  IN (SELECT TOP 1 ModifiedDate           (SELECT MAX(ModifiedDate)
      FROM Table1                          FROM Table1)
      ORDER BY ModifiedDate
      DESC)
63. What is the difference?
OR                                      No OR
SELECT Col1                             SELECT Col1
FROM Table1                             FROM Table1
WHERE Col1 = 'x'                        WHERE Col1 = 'x'
   OR Col2 = 'y'                        UNION
                                        SELECT Col1
                                        FROM Table1
                                        WHERE Col2 = 'y'

SELECT Col1                             SELECT Col1
FROM Table1                             FROM Table1
WHERE Col1 IN                           WHERE EXISTS (SELECT 1 FROM Table2
  (SELECT C1 FROM Table2)                             WHERE Col1 = C1
   OR Col1 IN                                         UNION ALL
  (SELECT C2 FROM Table2)                             SELECT 1 FROM Table2
                                                      WHERE Col1 = C2)

SELECT *                                SELECT *
FROM Table1                             FROM Table1
WHERE Col1 IN                           ????
  (SELECT C1 FROM Table2)
   OR Col2 IN
  (SELECT C2 FROM Table2)
73. Isolation levels - example
USE pubs
GO
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
SELECT au_lname FROM authors WITH (NOLOCK)
GO
The locks generated are:
EXEC sp_lock
GO
74. EXEC Sp_lock
spid dbid ObjId IndId Type Resource Mode Status
51 5 0 DB S GRANT
51 10 85575343 2 KEY (a802b526c101) RangeS-S GRANT
51 10 85575343 2 KEY (54013f7c6be5) RangeS-S GRANT
51 10 85575343 2 KEY (b200dbb63a8d) RangeS-S GRANT
51 10 85575343 2 KEY (49014dc93755) RangeS-S GRANT
51 10 85575343 2 KEY (170130366f3d) RangeS-S GRANT
51 10 85575343 2 PAG 1:1482 IS GRANT
51 10 85575343 2 KEY (c300d27116cf) RangeS-S GRANT
51 10 85575343 0 TAB IS GRANT
51 10 85575343 2 KEY (1101ed75c8f8) RangeS-S GRANT
51 10 85575343 2 KEY (2802f6d3696b) RangeS-S GRANT
51 10 85575343 2 KEY (0701fdd03550) RangeS-S GRANT
51 10 85575343 2 KEY (7f00d0d5506b) RangeS-S GRANT
SELECT object_name(85575343)
GO
-----------------------------
authors
78. What are the benefits of Stored Procedures?
• Reduce network traffic
• Reusable execution plans
• Efficient client execution requests
• Code reuse
• Encapsulation of logic
• Client independence
• Security implementation
As a general rule of thumb, all Transact-SQL code should be called from stored procedures.
79. Stored Procedures tips
• SET NOCOUNT ON
• No sp_ prefix
• Owned by DBO
• Exec databaseowner.objectname
• Select from databaseowner.objectname
• Break down large SPs
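The tips above, put together in a small sketch. The procedure name `usp_GetOrdersByCustomer` and the `dbo.Orders` table are hypothetical.

```sql
-- Hypothetical procedure and table, for illustration only.
-- Name avoids the sp_ prefix and is created in the dbo schema.
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerID INT
AS
BEGIN
    -- Suppress the "n rows affected" messages sent to the client.
    SET NOCOUNT ON;

    -- Reference objects with their owner/schema, and list the columns.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END
GO

-- Call the procedure with its owner/schema as well.
EXEC dbo.usp_GetOrdersByCustomer @CustomerID = 42;
```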
80. SP Recompilations
• #temp tables instead of @Temp table variables
• DDL statements
• Some SET commands
• Use SQL Server Profiler to check recompilations
81. Which one is better and why?
IF @P = 0
    SQL Statement Block1
ELSE
    SQL Statement Block2

IF @P = 0
    Exec sp_Block1
ELSE
    Exec sp_Block2
82. What could be problematic here?
CREATE PROC MySP
    @p_FROM INT, @p_TO INT
AS
SELECT count(*) FROM MyTable
WHERE PK
    between @p_FROM and @p_TO

MyTable PK values: 0; 5; 10; 34; 87; …; 198,739; ….; 3,898,787 (7 million rows)

CREATE … WITH RECOMPILE
EXECUTE … WITH RECOMPILE
sp_recompile objname
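A likely reading of this slide is parameter sniffing: the plan compiled for the first range (say, a handful of rows) is cached and reused for a range covering millions of rows, or the other way round. The three recompile options listed could be written as below. `dbo.MySP_Recompile` is a hypothetical name used so the sketch does not clash with the original procedure, and `dbo.MySP` is assumed to already exist as defined on the slide.

```sql
-- Option 1: never cache a plan for this procedure.
-- dbo.MySP_Recompile is a hypothetical name for this sketch.
CREATE PROCEDURE dbo.MySP_Recompile
    @p_FROM INT, @p_TO INT
WITH RECOMPILE
AS
BEGIN
    SET NOCOUNT ON;
    SELECT COUNT(*) AS Cnt
    FROM dbo.MyTable
    WHERE PK BETWEEN @p_FROM AND @p_TO;
END
GO

-- Option 2: recompile only for this one call of the original procedure.
EXEC dbo.MySP @p_FROM = 0, @p_TO = 10 WITH RECOMPILE;
GO

-- Option 3: mark the procedure so its plan is rebuilt on the next run.
EXEC sp_recompile N'dbo.MySP';
```

Recompiling costs CPU on every call, so it tends to pay off only when the parameter ranges really do need different plans.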
88. What could possibly replace cursors?
• Loops?
• Temp tables
• Local variables (!)
• CTEs
• CASE statements
• Multiple queries
• AND…
89. Replacing cursor
Tip #1
Select Seq=identity(int,1,1),
       Fld1,
       Fld2,
       ……
Into #TmpTable
From Table1
Order by …

Resulting #TmpTable:
Seq     Fld1       Fld2     …..
1       Aaa        45.7
2       Absb       555.0
3       Adasd      12.8
4       oioiooi    0.0
…..     …..        …..      …..
90. Replacing cursor
Tip #2
declare @var int
set @var = 0
Update Table1
set @Var = Fld2 = Fld2 + @Var
From Table1 with (index=pk_MyExampleTable)
option (maxdop 1)
go
94. Optimizer Hints
Most common:
• WITH (ROWLOCK)
• WITH (NOLOCK)
• WITH (INDEX = IX_INDEX_NAME)
• WITH (HOLDLOCK)
• SET FORCEPLAN ON
• OPTION (MAXDOP 1)
• Join hints (MERGE/HASH/LOOP)
• Isolation levels: WITH (SERIALIZABLE, READCOMMITTED)
• Granularity level (UPDLOCK, TABLOCK, TABLOCKX)
95. What is possibly wrong here?
BEGIN TRAN
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 IN
    (SELECT Col1 from MyTable_2)
COMMIT TRAN

BEGIN TRAN
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 IN
    (SELECT Col1 from MyTable_2
     WITH (NOLOCK) )
COMMIT TRAN

MyTable Col1 values: x, x, y, y, y, …, m, …., z
98. What is wrong here?
BEGIN TRAN
UPDATE MyTable SET Col1 = 'x'
WHERE Col1 = 'y'
IF @@ROWCOUNT <> 10
    ROLLBACK TRAN
COMMIT TRAN

MyTable Col1 values: x, x, y, y, y, …, m, …., z
1000 rows with Col1 = 'y'
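Two issues stand out in this example: COMMIT TRAN runs even after the ROLLBACK (which raises an error, because no transaction is open any more), and all 1000 rows are updated only to be rolled back. A hedged sketch of one possible rewrite follows. It checks the row count before doing any work; under concurrent modifications the count could change between the check and the update, so the post-update check is kept as a safety net.

```sql
-- Sketch only; assumes MyTable lives in the dbo schema.
DECLARE @cnt INT;

-- Cheap check first: avoid updating rows just to roll them back.
SELECT @cnt = COUNT(*)
FROM dbo.MyTable
WHERE Col1 = 'y';

IF @cnt = 10
BEGIN
    BEGIN TRAN;

    UPDATE dbo.MyTable
    SET Col1 = 'x'
    WHERE Col1 = 'y';

    -- Exactly one of ROLLBACK or COMMIT runs, never both.
    IF @@ROWCOUNT <> 10
        ROLLBACK TRAN;
    ELSE
        COMMIT TRAN;
END
```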
99. What could possibly be wrong here?
BEGIN TRAN
DELETE MyTable
COMMIT TRAN

MyTable Col1 values: x, x, y, y, y, …, m, …., z (7 million rows)
• T-Log size
• Concurrency
How do we “solve” this?
What if we have a WHERE clause in the DELETE?
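One common answer, sketched here under assumptions, is to delete in small batches so that each batch commits on its own, keeping log usage and lock duration bounded. The batch size of 10,000 and the filter `Col1 = 'y'` are illustrative values, not taken from the slide.

```sql
-- Sketch only: batch size and WHERE clause are illustrative assumptions.
DECLARE @BatchSize INT;
SET @BatchSize = 10000;

WHILE 1 = 1
BEGIN
    -- Each DELETE is its own short transaction (autocommit).
    DELETE TOP (@BatchSize)
    FROM dbo.MyTable
    WHERE Col1 = 'y';

    -- Stop once nothing is left to delete.
    IF @@ROWCOUNT = 0
        BREAK;
END
```

When every row has to go and nothing references the table through a foreign key, `TRUNCATE TABLE dbo.MyTable` is usually far cheaper than any DELETE, since it deallocates pages instead of logging each row.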
100. Transaction Habits
• As short as possible
• Long transactions:
  - Reduce concurrency
  - Make blocking and deadlocks more likely
  - Hold space in the transaction log that cannot be freed until the transaction ends
  - Add T-log I/O
• No “logical” ROLLBACKS!
102. What is wrong here?
CREATE TRIGGER TRG_MyTable_UP
ON MyTable
AFTER INSERT
AS
UPDATE MyTable
SET InsertDate = getdate()
FROM MyTable
INNER JOIN inserted
    ON MyTable.PK = inserted.PK

MyTable columns: PK, InsertDate
PK values: 1, 5, 13, 67, 89, …, 1234, …., 345667
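The trigger turns every INSERT into an INSERT plus an UPDATE of the same rows. If the goal is simply to stamp the insertion time, a DEFAULT constraint does it in a single write. The constraint name below is hypothetical, and the sketch assumes `PK` is not an identity column.

```sql
-- Remove the extra write performed by the trigger.
DROP TRIGGER TRG_MyTable_UP;
GO

-- DF_MyTable_InsertDate is a hypothetical constraint name.
ALTER TABLE dbo.MyTable
ADD CONSTRAINT DF_MyTable_InsertDate
    DEFAULT (GETDATE()) FOR InsertDate;
GO

-- InsertDate is filled in by the default; no trigger, no second update.
-- Assumes PK is a plain (non-identity) column.
INSERT INTO dbo.MyTable (PK) VALUES (345668);
```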
103. Typical Trigger Applications
• Cascading modifications through related tables
• Rolling back changes that violate data integrity
• Enforcing restrictions that are too complex for
rules or constraints
• Maintaining duplicate data
• Maintaining columns with derived data
• Performing custom recording
• Try to use constraints instead of triggers,
whenever possible.
105. Employees table
Column name        Type             Property               Key/index
Employee ID        Int              NOT NULL, Identity     Clustered (values are unique)
First Name         Char(100)        NOT NULL
Last Name          Char(100)        NOT NULL
Hire Date          Datetime         NULL
Description        Varchar(8000)    NULL
ContractEndDate    Char(8)          NOT NULL               Index
SelfDescription    Varchar(8000)    NOT NULL default ''
Picture            Image            NULL
Comments           Text             NULL
Application rules:
• All queries fetch EmployeeID, FirstName, LastName and HireDate WHERE EmployeeID equals or is BETWEEN two values, and where ContractEndDate >= getdate()
• All other columns are fetched only when the user drills down from the application
• FirstName, LastName, HireDate and ContractEndDate rarely change
• Comments, Description and SelfDescription are rarely filled in and never appear in the WHERE clause
• The Picture column is ALWAYS updated after the row already exists
• Once the contract ends, the data should be kept but will not be queried by the application
106. Employees table, revised (old → new)
Column name        Type                        Property                       Key/index
Employee ID        Int                         NOT NULL, Identity             Clustered → Clustered UNIQUE (values are unique)
First Name         Char(100) → Varchar(100)    NOT NULL
Last Name          Char(100) → Varchar(100)    NOT NULL
Hire Date          Datetime                    NULL
Description        Varchar(8000)               NULL
ContractEndDate    Char(8) → Datetime          NOT NULL                       Index
SelfDescription    Varchar(8000)               NOT NULL default '' → NULL
Picture            Image → Varbinary(MAX)      NULL
Comments           Text → Varchar(MAX)         NULL
First…
107. This is vertical partitioning…
Employees (active employees)
Column name        Key/index
Employee ID        Clustered PK
First Name
Last Name
Hire Date
ContractEndDate    Index

Employees details (1:1 with Employees)
Column name        Key/index
Employee ID        Clustered PK
Description
SelfDescription
Picture
Comments

OldEmployees (inactive employees)
Column name        Key/index
Employee ID        Clustered PK
First Name
Last Name
Hire Date
ContractEndDate
Description
SelfDescription
Picture
Comments

4 different tables?
108. Horizontal partitioning
Every partition has the same columns:
Column name        Type
Employee ID        INT
First Name         Varchar(100)
Last Name          Varchar(100)
Hire Date          Datetime
ContractEndDate    Datetime
Partition 1: Contract Date < 2008-01-01
Partition 2: Contract Date >= 2008-01-01 and < 2009-01-01
Partition 3: Contract Date >= 2009-01-01
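The slide does not say how the three ranges are implemented (separate tables behind a view, or native table partitioning). As one possibility, the sketch below uses native partitioning, which in SQL Server 2005/2008 is an Enterprise edition feature. All object names are hypothetical and every partition is placed on the PRIMARY filegroup for simplicity.

```sql
-- Hypothetical names throughout; a sketch, not the presenter's design.
-- RANGE RIGHT puts each boundary value into the partition on its right:
--   partition 1: < 2008-01-01
--   partition 2: >= 2008-01-01 and < 2009-01-01
--   partition 3: >= 2009-01-01
CREATE PARTITION FUNCTION pf_ContractEndDate (DATETIME)
AS RANGE RIGHT FOR VALUES ('20080101', '20090101');
GO

CREATE PARTITION SCHEME ps_ContractEndDate
AS PARTITION pf_ContractEndDate ALL TO ([PRIMARY]);
GO

CREATE TABLE dbo.EmployeesPartitioned (
    EmployeeID      INT          NOT NULL,
    FirstName       VARCHAR(100) NOT NULL,
    LastName        VARCHAR(100) NOT NULL,
    HireDate        DATETIME     NULL,
    ContractEndDate DATETIME     NOT NULL,
    -- The partitioning column must be part of the unique clustered key.
    CONSTRAINT PK_EmployeesPartitioned
        PRIMARY KEY CLUSTERED (EmployeeID, ContractEndDate)
) ON ps_ContractEndDate (ContractEndDate);
```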
110. Beware of…
Server-side cursors prior to .NET 2.0
Sorts and grouping on the client
End-user reporting
Default Transaction isolation levels
Intensive communication with database
Connection pooling
Long transactions
Ad-hoc T-SQL
SQL injection…
111. Performance Audit Checklist
• Does the Transact-SQL code return more data than needed?
• Is the interaction between the application and the database server too frequent?
• Are cursors being used when they don't need to be? Does the application use server-side cursors?
• Are UNION and UNION ALL properly used?
• Is SELECT DISTINCT being used properly?
• Is the WHERE clause SARGable?
• Are temp tables being used when they don't need to be?
• Are hints being properly used in queries?
• Are views unnecessarily being used?
• Are stored procedures and sp_executesql being used whenever possible?
• Inside stored procedures, is SET NOCOUNT ON being used?
• Do any of your stored procedures start with sp_?
• Are all stored procedures owned by DBO, and referred to in the form of databaseowner.objectname?
• Are you using constraints or triggers for referential integrity?
• Are transactions being kept as short as possible? Does the application keep transactions open when the user is modifying data?
• Is the application properly opening, reusing, and closing connections?