JDBC Best Practices - DB2 / IDUG - Orlando, May 10, 2004
1. JDBC Best
Practices for DB2
Programmers
Derek C. Ashmore,
Delta Vortex Technologies
Session B1 – 10:30am
May 10, 2004
2. JDBC Best Practices for DB2
Programmers
• Presentation Summary
– JDBC Coding for Performance
– JDBC Coding for Maintainability
– JDBC Coding for Portability
– Future Directions
3. Who Am I?
• Derek C. Ashmore
• Author of The J2EE™ Architect’s Handbook
– Downloadable at:
– http://www.dvtpress.com/javaarch
• Over 6 years of Java-related experience
• Over 10 years of database design/administration experience.
• Can be reached at dashmore@dvt.com
5. Why focus on JDBC?
• JDBC is the most commonly used database access method.
– Not a technical judgment – just an observation.
• Why is JDBC the most common access method?
– It was the first access method available.
– It works.
– It satisfies developers’ needs.
– Most DBMSs support it.
– JDBC skills are easy to find in the market
• JDBC tutorial at: http://java.sun.com/docs/books/tutorial/jdbc/
6. JDBC Coding for Performance
• Use PreparedStatements with host variable markers instead of
Statements.
• Beware that selects issue shared locks by default
• Consider query fetch sizing
• Utilize connection pooling features
7. Use PreparedStatements
• Use PreparedStatements with parameter markers instead of Statements
– Use “select name from Customer where id = ?”
– Instead of “... where id = ‘H23’”
• Statements are less typing, but:
– Extra String processing to assemble the where clause
– Circumvents dynamic caching
– Prevents reuse of the query access path by the database
– This means that Statements will be slower
• Dynamic caching improves the performance of dynamic SQL
– DB2/UDB (all environments) caches plans associated with dynamic SQL statements
– Statements with identical text reuse that dynamically stored plan
– Dynamic SQL gets many of the performance benefits that static SQL has
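The parameter-marker advice above can be sketched as follows. This is a minimal illustration, not code from the talk: the class and method names are hypothetical, the Customer table and its id and name columns are taken from the slide's sample query, and try-with-resources (Java 7 or later) is assumed for cleanup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
final class CustomerLookup {
    // One fixed SQL text with a parameter marker, so every call presents
    // the same statement text to the database.
    static final String CUST_SQL = "select name from Customer where id = ?";

    // Returns the customer's name, or null when no row matches.
    static String findName(Connection conn, String id) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(CUST_SQL)) {
            ps.setString(1, id); // bind the value instead of concatenating it
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```

Because the value travels as a bound parameter, looking up ‘H23’ and then ‘H24’ sends identical statement text both times, which is what lets a cached plan be reused.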
8. Beware of Shared Locks
• Beware of shared locking with Select statements
– Common Myth: Reading is harmless
– Cursor Stability is the default isolation level, and it takes shared locks
– When only reading: commit as early as possible (or use autocommit)
– Use “commit” or JDBC’s autocommit feature for selects, but don’t use both.
• Using both issues the commit twice, which lowers performance
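One way to apply the "commit early, but only once" advice is sketched below. The class and method names are hypothetical, and the Customer table is borrowed from the earlier sample query. The sketch checks the connection's autocommit setting so that an explicit commit is issued only when autocommit is off.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class ReadOnlyQuery {
    static List<String> customerNames(Connection conn) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement("select name from Customer");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                names.add(rs.getString("name"));
            }
        }
        // Nothing was changed, but ending the unit of work releases the
        // locks the select acquired. With autocommit on, the driver already
        // commits, so an explicit commit is skipped rather than doubled.
        if (!conn.getAutoCommit()) {
            conn.commit();
        }
        return names;
    }
}
```

If the query fails, the commit is never reached; in that case the caller is assumed to roll back or close the connection.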
9. Consider Setting the query fetch size
• Instruct database to return rows in batches of 10 to 100.
• Example Query Fetch Sizing
– statement.setFetchSize(10)
• Higher isn’t always better
– Can degrade performance if used for small ResultSets.
• Larger fetch sizes mean fewer network round-trips
– Most benefit using batches of 10 to 100 – diminishing returns after that.
• Larger benefit reducing network trips from 100,000 to 1,000 than from 1,000 to 100.
• The larger the batch, the more memory required.
• Needs to be tested on a case-by-case basis.
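The diminishing-returns point can be made concrete with a little arithmetic. The sketch below uses a deliberately simplified model, one network round-trip per batch of rows, which is an assumption for illustration rather than a description of how any particular driver behaves. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
final class FetchSizing {
    // Simplified model: one network round-trip per batch of fetchSize rows.
    static long roundTrips(long rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize; // ceiling division
    }

    // Runs a query with a fetch-size hint and counts the rows returned.
    static int countRows(Connection conn, String sql, int fetchSize) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(fetchSize); // a hint; the driver may ignore or adjust it
            int count = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    count++;
                }
            }
            return count;
        }
    }
}
```

Under this model, a 100,000-row result needs 100,000 round-trips at a fetch size of 1, 1,000 at a fetch size of 100, and 100 at a fetch size of 1,000. The first step saves 99,000 trips; the second saves only 900 more while holding ten times as many rows in memory per batch.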
10. Utilize Connection Pooling
• Connection Pools eliminate wait time for database
connections by creating them ahead of time.
– I’ve seen enough J2EE apps managing connection creation directly to warrant mentioning this practice.
– Creating a connection takes 30 – 50 ms, depending on platform.
– Allows for capacity planning of database resources
– Provides automatic recovery from database or network outages
• Issuing close() on a pooled connection merely returns it to the
pool for use by another request.
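From application code, a pool is normally reached through the standard javax.sql.DataSource interface. The sketch below assumes a pooling DataSource has been configured elsewhere (for example by the application server); the class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper class, for illustration only.
final class PooledAccess {
    // Borrows a connection, checks that it is usable, and gives it back.
    static boolean canConnect(DataSource pool) throws SQLException {
        try (Connection conn = pool.getConnection()) {
            return conn.isValid(2); // timeout in seconds
        } // close() runs here; with a pooling DataSource it returns the connection to the pool
    }
}
```

The calling code looks the same whether or not the DataSource pools, which keeps the pooling decision in configuration rather than in application logic.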
11. JDBC Coding for Maintainability
• Close JDBC Objects in a “finally” block.
• Consolidate SQL string formation.
• Always specify column names in select and insert statements.
12. Close all JDBC Objects
• Most JDBC Objects require a Statement Handle from DB2 Client.
– Includes PreparedStatement, Statement, and ResultSet
– This is a finite resource (e.g. either 600 or 1300 depending upon version).
• Close all JDBC Objects in a finally block
– Stranded JDBC objects consume scarce database resources
• They cause errors down the line
• With DB2/UDB (all environments) and DB2 Client, statement handles are consumed
• Close JDBC objects in the method that creates them.
– Easier to implement this habit.
– Easier to visually identify objects not being closed.
• The garbage collector eventually “closes” these objects, but there is no guarantee that it does so before the resource limit is reached.
– You may not see problems until stress testing or production.
13. Closure Issues
• Closing JDBC Objects is inconvenient
– close() can throw a SQLException
– Leads to nested try/catch logic in the finally block
– A lot to type
• Utility Support can make this easier
– Use generic close utility that logs SQLExceptions received, but
doesn’t throw an exception
– Gets the “close” down to one line.
– CementJ – http://sourceforge.net/projects/cementj
• org.cementj.util.DatabaseUtility
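A generic close utility of the kind described above might look like the following. This is a hypothetical sketch and not the CementJ API: the class and method names are made up for illustration. It relies on ResultSet, Statement, and Connection all implementing AutoCloseable, which holds on Java 7 and later.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class, for illustration only.
final class JdbcCloser {
    private static final Logger LOG = Logger.getLogger(JdbcCloser.class.getName());

    // Closes each resource in the order given, skipping nulls. A failure is
    // logged and swallowed so the remaining resources still get closed.
    static void closeQuietly(AutoCloseable... resources) {
        if (resources == null) {
            return;
        }
        for (AutoCloseable resource : resources) {
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
            } catch (Exception e) {
                LOG.log(Level.WARNING, "Failed to close JDBC object", e);
            }
        }
    }
}
```

With this in place, the cleanup in the method that created the objects shrinks to one line: `finally { JdbcCloser.closeQuietly(resultSet, statement); }`. On Java 7 and later, try-with-resources gives a similar effect without a helper, although it propagates close() failures instead of logging them.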
14. Penalty for Object Leaks
• Applies if you’re using DB2/Client (which most do)
• Each JDBC Object acquires a statement handle within
DB2/Client.
• Limited to between 600 and 1300 (depending on version of
DB2/Client)
15. Closure Issues (cont’d)
• Finding stranded JDBC objects is problematic
– This is especially difficult if you need to identify leaks in an application you
didn’t write.
– Use P6Spy with an extension library
– P6Spy is a JDBC Profiler that logs SQL statements, and their execution time.
– I’ve extended P6Spy so that it will identify all stranded objects and list SQL
statements associated with them.
– P6Spy available at http://www.p6spy.com/
– P6Spy Extension at “Resources” link from www.dvtpress.com/javaarch
16. Consolidate SQL String formation
• Some developers dynamically build the SQL string with scattered concatenation logic
    String sqlStmt = "select col1, col2 from tab1";
    // ... more application code ...
    sqlStmt = sqlStmt + " where col2 > 200";
    // ... more application code ...
    sqlStmt = sqlStmt + " and col3 < 5";
• A small number of apps need this, but most can consolidate the logic.
• Disadvantages
– Harder to use dynamic caching
– Harder to read
– More String processing
– More memory allocation
17. Consolidate SQL String Example
• Using “static” variables for SQL text
– Reduces string processing and memory allocation, because the String is created once, when the class is first referenced.
– Consolidates SQL text so that it’s easier to read.
• Example
    public static final String CUST_SQL =
        "select name from Cust where id = ?";
    // ...
    pStmt = conn.prepareStatement(CUST_SQL);
18. Specify Column Names
• Always specify column names in select and insert statements.
– Code won’t break if DBA changes column order
– Clearer for maintenance purposes
• Imagine a select or insert statement involving 20-30 columns
– Hard to tell which value pertains to which column
• Specify column name instead of offset when using ResultSets
– Use resultSet.getString("col1");
– Instead of resultSet.getString(3);
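Both halves of this advice, explicit column lists in the SQL and access by name on the ResultSet, are sketched below. The Cust table with id and name columns follows the earlier sample query; the class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class, for illustration only.
final class CustomerRows {
    // Column lists are spelled out, so neither statement depends on the
    // physical column order of the table.
    static final String INSERT_SQL = "insert into Cust (id, name) values (?, ?)";
    static final String SELECT_SQL = "select id, name from Cust where id = ?";

    static void insert(Connection conn, String id, String name) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, id);   // matches the first column in the list: id
            ps.setString(2, name); // matches the second column in the list: name
            ps.executeUpdate();
        }
    }

    // Reads the current row by column name, so reordering the select list
    // cannot silently swap the two values.
    static String describe(ResultSet rs) throws SQLException {
        return rs.getString("id") + ": " + rs.getString("name");
    }
}
```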
19. JDBC Coding for Portability
• Limit use of platform-specific features.
• Reference java.sql or javax.sql classes only
– Avoid DB2-specific classes
20. Limit use of Platform-specific features
• Portability == The ability to switch DBMSs.
• Use of platform-specific features creates portability obstacles
– Your code might live longer than you think (Y2K).
• Use them only when there is a clear benefit – not out of habit
• Examples
– Stored procedures using proprietary language
– Proprietary Column Functions
• ENCRYPT
• NULLIF
– Proprietary Operators
• CASE
• OLAP (e.g. RANK)
21. Reference java.sql or javax.sql classes only
• Avoid vendor-specific class implementations unless required for performance
– Usually not necessary now
– Was necessary in the early days, before formal support for
• Fetch sizing/array processing
• Statement batching
• Vendor-specific classes create a portability issue
– Harder to switch DBMSs
• They also create a maintenance issue
– The JDBC interfaces are familiar
– Proprietary objects may not be
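Statement batching is a good example of a feature that once needed vendor classes and is now available through the standard interfaces. The sketch below uses only java.sql types; the class name, method name, and the Cust table layout are assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class BatchInsert {
    static final String SQL = "insert into Cust (id, name) values (?, ?)";

    // Each String[] holds {id, name}. Returns the number of batched statements.
    static int insertAll(Connection conn, List<String[]> customers) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            for (String[] customer : customers) {
                ps.setString(1, customer[0]);
                ps.setString(2, customer[1]);
                ps.addBatch(); // queue this row's parameters
            }
            // Submit the queued rows; how they travel over the network is
            // up to the driver.
            return ps.executeBatch().length;
        }
    }
}
```

Nothing here names a driver class, so the same code should compile unchanged against any JDBC driver that supports batch updates.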
22. Latest Developments
• JDBC 3.0 Specification
– Return generated PK value on insert.
– ResultSet holdability – ResultSets can remain open across commits
– Support multiple ResultSets for stored procedure fans
– Standardizes Connection Pooling
– Adds PreparedStatement pooling
– Savepoint support
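As an illustration of the first item, retrieving a generated primary key through the JDBC 3.0 API can be sketched as follows. The class name, method name, and a Cust table whose key column is generated by the database are all assumptions; driver support for this feature varies.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class, for illustration only.
final class GeneratedKeys {
    // Assumes the table's key column is populated by the database.
    static final String INSERT_SQL = "insert into Cust (name) values (?)";

    // Inserts a row and returns the key the database generated for it.
    static long insertAndGetKey(Connection conn, String name) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement(INSERT_SQL, Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, name);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("No generated key returned");
                }
                // Read by position: the column name of a generated key is
                // driver-specific.
                return keys.getLong(1);
            }
        }
    }
}
```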
23. Future Directions
• JDBC is a maturing spec
– Expect frequency of change to slow considerably
• Use of Object-Relational mapping toolsets is increasing
– Hibernate (www.hibernate.org)
– JDO (www.jdocentral.com)
• Despite technical advances, entity beans are close to
becoming a part of history.
24. Stored Procedure Use
• Aren’t Stored Procedures better performing?
– Depends on platform
• Sybase – yes, Oracle/DB2 – not always
– As a general rule, CPU intensive actions are bad as stored procedures
– SQL is statically bound
• Used to be more significant before dynamic caching
– As a rule, stored procedures help performance by reducing the number of
network transmissions.
• Conditional selects or updates
• As a batch update surrogate (combining larger numbers of SQL statements)
• Ask: How many network transmissions will be saved by making this a
stored procedure? If the answer is “0”, performance is not likely to be
improved.
25. Questions
• JDBC Best Practices for DB2 Programmers
• Session B1
• Derek C. Ashmore
• Email: dashmore@dvt.com