5. No Serious…
What are these best practices
General rules and guidelines
Intended to improve:
Maintenance
Performance
Availability
Quality
Not always 100% implementable
But at least try
Document why
7. Coding Best Practices
There is no performance loss for documented code
Code Tells You How, Comments Tell You Why
Don't hardcode
SQL Server supports Variables ;-)
Format your code for Readability
There is No “right” method
But… make clear agreements
And follow them
8. Windows Authentication
Easier to administer
Centralized
Better auditing
More
Secure
Flexible
Always-On
Contained Databases
9. Normalize
Normal forms
Aim for 3rd normal form
Normalize first
DEnormalize when required
DEnormalization -> Sometimes OK
DEnormalization can be done using many techniques
Un-Normalized
10. Data Integrity
Top priority
Maintained by using constraints
Sometimes you'll have to rely on triggers
Never trust code outside of the table to assure data integrity of a table
Primary Keys Should ALWAYS exist
Even if you'll have to make a surrogate key
Declare your alternate keys
Declared Referential Integrity
Foreign Keys (Fast)
If there is absolutely no other choice -> Trigger code (Slow)
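Declared referential integrity is a single statement; a minimal sketch, assuming hypothetical dbo.Orders and dbo.Customers tables:

```sql
-- Declarative RI: the engine enforces the relationship, no trigger code needed.
ALTER TABLE dbo.Orders
    ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerID) REFERENCES dbo.Customers (CustomerID);
```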
11. Data Integrity
Limit your column data
Similar to Referential, but there is no table holding your values
Why not build that table?
Easier to manage, easier to scale
Use check constraints
Limit your data type
Why is everyone afraid of the “big” bad tinyint?
Or even worse, the bit…
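A small data type plus a check constraint limits the column at both levels; a sketch with a hypothetical dbo.Products table:

```sql
-- tinyint (0-255) already narrows the domain; the CHECK narrows it further.
ALTER TABLE dbo.Products
    ADD DiscountPct tinyint NOT NULL
        CONSTRAINT DF_Products_DiscountPct DEFAULT (0)
        CONSTRAINT CK_Products_DiscountPct CHECK (DiscountPct <= 90);
```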
12. Clustered Index
Your table should have one
Unless in very specific well documented cases, it will be faster
The primary key is usually NOT the best choice
It is the default
Best choice can only be determined by usage
If usage determines the PK to be the best choice, then it is!
Always keep partitioning in mind
Should be your (range)-scan-key
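Keeping the primary key while clustering on the range-scan key looks like this; table and column names are hypothetical:

```sql
-- PK stays unique but nonclustered; the clustered key is the date column
-- that range scans (and partitioning) actually use.
CREATE TABLE dbo.Sales
(
    SaleID   int IDENTITY NOT NULL,
    SaleDate datetime2(0) NOT NULL,
    Amount   decimal(10,2) NOT NULL,
    CONSTRAINT PK_Sales PRIMARY KEY NONCLUSTERED (SaleID)
);
CREATE CLUSTERED INDEX CIX_Sales_SaleDate ON dbo.Sales (SaleDate);
```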
13. Non-Clustered Indexes
OLTP vs. OLAP
Avoid having more indexes than data...
This is what makes a lot of databases SLOW²
Think about Scan vs. Seek
Think about entry points
Be careful with:
composite indexes with more than 2 columns
ABC <> BCA <> BAC -> If you're not careful you'll be creating all 3
Included columns
Don't include 90% of your table
Filtered Indexes
Know your logic and test!
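One index can combine the ideas above; a sketch against a hypothetical dbo.Orders table:

```sql
-- Column order matters: this seeks on CustomerID (then OrderDate), not on
-- OrderDate alone. The WHERE filter keeps the index small; INCLUDE covers
-- the query without widening the key.
CREATE NONCLUSTERED INDEX IX_Orders_Unshipped
    ON dbo.Orders (CustomerID, OrderDate)
    INCLUDE (TotalDue)
    WHERE ShippedDate IS NULL;
```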
14. Think about...
NULLs
Generate quite some overhead
Have a meaning <> 'None'
Datatypes
Don't overuse (n)varchar(max), think about the content
Examples
Telephone numbers (consist of 4 blocks that can all have prefix 0) – E.164 standard
Country code (max 3) | region code + number (max 15) | extension (max 4)
'00999-(0)1-123.12.23 ext1234' [varchar(33)] (2+33 bytes = 35 bytes)
'+99911231223','1234' [varchar(18)]+[varchar(4)] (2+18 + 2+4 bytes = 26 bytes)
tinyint, smallint | tinyint, tinyint | tinyint, int, int (1+2+1+1+1+4 (+4) = 10 + 4 bytes)
Length, Value | Length, Value | Length, Value | Extension -> other table (to avoid NULLs)
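One possible sketch of the compact layout above: the E.164 parts in small integer columns, with extensions moved to a side table so the main table never stores NULL. All table and column names are hypothetical:

```sql
-- Integer storage of phone-number blocks instead of one wide varchar.
CREATE TABLE dbo.Phone
(
    PhoneID     int IDENTITY PRIMARY KEY,
    CountryCode smallint NOT NULL,  -- max 3 digits
    RegionCode  tinyint  NOT NULL,
    Number      int      NOT NULL
);
-- Extension lives in its own table: rows exist only when an extension does.
CREATE TABLE dbo.PhoneExtension
(
    PhoneID   int NOT NULL REFERENCES dbo.Phone (PhoneID),
    Extension smallint NOT NULL
);
```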
15. Bad Data types -> Avoid
TEXT
String functions are limited
Indexing becomes useless
LARGE
NTEXT
… No Comment
FLOAT, REAL
Approximate numeric values
Not exact!
Can give "funny" errors: 1E-999 <> 0
16. Char vs. Varchar
Action                 | Char                               | Varchar
Length                 | Known                              | Unknown
Fragmentation          | Easier to control                  | Bad with updates
Flexibility            | None (from 1 to 8000)              | From 1 to MAX
Frequent updates       | Size is allocated                  | Needs to resize/split
Indexable              | Supports online                    | Depends
NULL size              | Full size is allocated + overhead  | Overhead
Avoid (when possible)  | Empty space / NULLs                | MAX
17. SET-based
SQL is a set based language
The optimizer is designed to do just that
Batch-mode
Typically represents 1000 rows of data.
Optimized for the multicore CPUs and increased memory throughput.
Batch mode processing spreads metadata costs and overhead.
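The set-based point in one statement; a sketch with a hypothetical dbo.Products table:

```sql
-- One set-based UPDATE instead of a cursor/loop over the rows:
-- the optimizer plans the whole set at once.
UPDATE p
SET    p.Price = p.Price * 1.10
FROM   dbo.Products AS p
WHERE  p.CategoryID = 4;
```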
18. UDFs
User defined functions
Make code easier to read
Make code easier to write
Be careful with non-deterministic functions
Can have a very negative impact on performance
19. Select *
Never use Select *
Avoid operators that don't use your indexes
Explicit column lists
Are less error prone
Easier to debug
Reduce Disk IO
More Maintainable
Columns can be added or re-positioned
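An explicit column list in practice; table, column, and parameter names are hypothetical:

```sql
-- Unaffected by added or reordered columns, and a covering index can
-- serve it without touching the rest of the row.
SELECT o.OrderID, o.OrderDate, o.TotalDue
FROM   dbo.Orders AS o
WHERE  o.CustomerID = @CustomerID;
```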
20. Always
Use BEGIN and END
Even if it only contains one statement
Use schema name
There is a slight performance improvement
Makes code more readable
Use table alias
Even when not joining
Eliminates ambiguity
Reduces the chance of typos
Assists IntelliSense
SET NOCOUNT ON
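All of the above in one small procedure; names are hypothetical:

```sql
CREATE PROCEDURE dbo.GetOpenOrders
    @CustomerID int
AS
BEGIN                          -- BEGIN/END even around a single statement
    SET NOCOUNT ON;            -- suppress "n rows affected" chatter

    SELECT o.OrderID, o.OrderDate
    FROM   dbo.Orders AS o     -- schema name + table alias, even without a join
    WHERE  o.CustomerID = @CustomerID;
END;
```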
21. Always
Use ANSI join syntax
The old-style T-SQL join syntax (*=, =*) can return incorrect results
Is deprecated
Easier to read
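Side by side, with hypothetical tables:

```sql
-- Legacy (deprecated, ambiguous for outer joins):
--   SELECT ... FROM dbo.Orders o, dbo.Customers c
--   WHERE o.CustomerID *= c.CustomerID
-- ANSI syntax keeps the join condition with the join:
SELECT o.OrderID, c.CustomerName
FROM   dbo.Orders AS o
INNER JOIN dbo.Customers AS c
        ON c.CustomerID = o.CustomerID;
```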
22. Avoid
Table Hints
Index Hints
Join Hints
Lock Hints (this should be done on a higher level)
Very rare for the optimizer not to choose the best plan
Triple check your query (and do so with the full dataset)
Hints break your DBA's ability to tune the database
23. Be careful with
Dynamic SQL
If used wrongly, it will perform slower
Increased security risks because it does not take part in ownership chaining
@@Identity
Can return wrong values if used in combination with triggers
Use SCOPE_IDENTITY or IDENT_CURRENT() instead
TRUNCATE
Minimally logged
Doesn't fire triggers
Cannot use schema binding
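The @@IDENTITY pitfall in two lines; the dbo.Orders table is hypothetical:

```sql
-- @@IDENTITY returns the last identity value in the session, including one
-- generated by a trigger inserting into another table.
-- SCOPE_IDENTITY() stays within the current scope.
INSERT INTO dbo.Orders (CustomerID, OrderDate)
VALUES (42, SYSDATETIME());

DECLARE @NewOrderID int = SCOPE_IDENTITY();
```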
24. Stored Procedures
Anticipate debug
You can add a @Debug flag that talks or logs more
Make sure your stored procedures return values
Call SPs with their parameter names
Easier to read
Less error prone, because parameter order no longer matters
Error handling
Handle your nested transactions!
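A sketch combining the @Debug flag, a return value, and a named-parameter call; all names are hypothetical:

```sql
CREATE PROCEDURE dbo.ArchiveOrders
    @CutoffDate date,
    @Debug      bit = 0       -- anticipate debugging: talk more when asked
AS
BEGIN
    SET NOCOUNT ON;

    IF @Debug = 1
        PRINT CONCAT('Archiving orders before ', @CutoffDate);

    DELETE FROM dbo.Orders WHERE OrderDate < @CutoffDate;

    RETURN 0;                 -- explicit return value for the caller
END;
GO
-- Named parameters: order-independent and self-documenting.
EXEC dbo.ArchiveOrders @CutoffDate = '2012-01-01', @Debug = 1;
```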
25. Temp Tables vs. Table Variables vs. Table Parameters
Size does matter
Test!
Consider derived tables or CTEs
Never forget IO and scaling
Check your query plans
Think carefully about the order of execution
Take indexing into consideration
Query plan regeneration
Default values
26. Avoid
String = "Expression"
Both in SELECTs and in WHERE clauses
Be careful with NULLs
A NULL value has a meaning
And it doesn't mean "default" or "not available"
27. ANSI/ISO Standards
Use ANSI standards where possible
ISNULL vs. Coalesce
CURRENT_TIMESTAMP vs. Getdate()
ROWVERSION vs. Timestamp
ANSI SETTINGS -> ON
ANSI_NULLS
ANSI_PADDING
ANSI_WARNINGS
ARITHABORT
CONCAT_NULL_YIELDS_NULL
QUOTED_IDENTIFIER
NUMERIC_ROUNDABORT -> Should be OFF
Always Format your date time using ISO standards
YYYY-MM-DDTHH:MM:SS
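ISO 8601 literals are unambiguous regardless of the session's DATEFORMAT or language settings; style 126 produces the same shape on output:

```sql
-- Output in ISO 8601 (varchar(19) keeps it to seconds precision).
SELECT CONVERT(varchar(19), CURRENT_TIMESTAMP, 126) AS IsoNow;

-- Parsed identically under any SET DATEFORMAT / SET LANGUAGE.
SELECT CAST('2012-05-01T13:45:30' AS datetime2(0));
```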
28. Part 2 - 2012 Specific
Always ON
ColumnStore Indexes
Contained Databases
FileTable
Always-On vs. Clustering vs. Mirroring
30. Always-ON
Superior to Mirroring (deprecated)
Pros
Good wizard
Good dashboards
Same responsiveness in failover
Only one IP address
Multiple replicas
Readable replicas
Drop the [#@!*] snapshots
Contra
Same overhead
Same maintenance problems
Even more sensitive to bad database design
31. Always-On
Be careful with
Snapshot Isolation
Repeatable-read (LOCKS!)
Logins
Creating indexes for reporting on live databases
Overhead
Backups on secondary
Copy-only for the time being
TF 9532 (enable multiple replicas in AlwaysOn)
Keep your settings compatible (e.g. TFs)
Bulk load isn‟t supported
32. Always-ON
Solutions
CRUD overhead
Partition!
Maintenance overhead
Partition !
No "good" indexes for reporting vs. overhead for OLTP
Partition !
Users/logins/SIDs
Partition ! (kidding)
Use Windows Authentication
Use sp_help_revlogin and automate it!
Careful with maintenance plans
33. AlwaysON
Performance benefits
Has huge benefits from combining it with:
Resource Governor
Compression
Non-Wizard maintenance
Read-only partitions
Dedicated data-network
Local (SSD) Storage
Documentation
PARTITIONING
34. Column Store Indexes
Fundamentals
Stores data in a highly compressed format, with each column kept in a separate group of pages
Uses the vector-based query execution method called "batch processing"
Segment Elimination
Engine pushes filters down into the scans
Makes the table/partition read-only
Key to performance is to make sure your queries process the large majority of data in batch mode
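Creating one is a single statement; a sketch against a hypothetical fact table:

```sql
-- SQL Server 2012 nonclustered columnstore: include every column.
-- The table (or partition) becomes read-only until the index is dropped.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, Quantity, Amount);
```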
35. Column Store Indexes
Dos & Don'ts
Dos
Only on large tables
Include every column
Star joins with grouping and aggregation
BATCH mode
On the OLAP part of your database
Don'ts
String Filters on column store indexes
OUTER/CROSS JOIN
NOT IN
UNION ALL
ROW mode
ON the OLTP part of your database
36. Column Store Indexes
Maximise Performance
Resource Governor
MAXDOP >= 2
CTEs
Work around NOT IN
Work around UNION ALL
Careful with
EXISTS / IN -> inner joins
Data Management
DROP/Rebuild approach on data updates
Queries can become complex, but focus on Batch mode
37. Contained Databases
Security
Disable the guest account
Duplicate Logins
Sysadmins
Different passwords
Initial catalog
Containment Status of a Database
Attaching (Restricted_User mode)
Kerberos
Restrict access to the database file
Don't use auto close -> DoS attacks
Escaping contained databases
38. Filetable
(Disable windows Indexing on these disk volumes)
Disable generation of 8.3 names (command: FSUTIL BEHAVIOR SET DISABLE8DOT3 1)
Disable last file access time tracking (command: FSUTIL BEHAVIOR SET DISABLELASTACCESS 1)
Keep some space empty (let us say 15% for reference) on drive if possible
Defragment the volume
Is supported in ALWAYSON!
If property is enabled on all servers
Using VNNs
39. AlwaysOn
Mirroring – Clustering – Logshipping
Contained Databases, Column Store Index
AlwaysOn complements these technologies
In a way, AlwaysOn replaces Mirroring (deprecated)
Clearly a step into a new direction
To optimally use these technologies
Part 1 best practices are very important
Your database design should be as optimal as possible
Partitioning becomes a MUST
Resource Governor becomes a MUST
You'll need the Enterprise edition
40. Call to action
Start giving feedback to your developers / 3rd party vendors NOW
Start thinking about
Data flows
Data retention
Data management
Partitioning
Filegroups/Files
Data-tiering
Don‟t
Restrict your view to the boundary of a database
Maximizing Performance and Working Around Columnstore Limitations
Follow the links to the topics listed below about how to maximize performance with columnstore indexes, and work around their functional and performance limitations in SQL Server 2012.
Ensuring Use of the Fast Batch Mode of Query Execution
  Parallelism (DOP >= 2) is Required to Get Batch Processing
  Use Outer Join and Still Get the Benefit of Batch Processing
  Work Around Inability to Get Batch Processing with IN and EXISTS
  Perform NOT IN and Still Get the Benefit of Batch Processing
  Perform UNION ALL and Still Get the Benefit of Batch Processing
  Perform Scalar Aggregates and Still Get the Benefit of Batch Processing
  Maintaining Batch Processing with Multiple Aggregates Including One or More DISTINCT Aggregates
  Using HASH JOIN Hint to Avoid Nested Loop Join and Force Batch Processing
Physical Database Design, Loading, and Index Management
  Adding Data Using a Drop-and-Rebuild Approach
  Adding Data Using Partition Switching
  Trickle Loading with Columnstore Indexes
  Avoid Using Nonclustered B-tree Indexes
  Changing Your Application to Eliminate Unsupported Data Types
  Achieving Fast Parallel Columnstore Index Builds
Maximizing the Benefits of Segment Elimination
  Understanding Segment Elimination
  Verifying Columnstore Segment Elimination
  Ensuring Your Data Is Sorted or Nearly Sorted by Date to Benefit from Date Range Elimination
  Multi-Dimensional Clustering to Maximize the Benefit of Segment Elimination
Additional Tuning Considerations
  Work Around Performance Issues for Columnstores Related to Strings
  Force Use or Non-Use of a Columnstore Index
  Workarounds for Predicates that Don't Get Pushed Down to Columnstore Scan (Including OR)
  Using Statistics with Columnstore Indexes