IBM Troubleshooting Connection Pooling (J2C) Problems in WebSphere Application Server
Note: The troubleshooting steps in this document are applicable to JDBC connection
problems using a standard data source in WebSphere Application Server. This document is
not applicable to Version 4 data sources. You can also use this document for
troubleshooting JMS connection problems and enterprise information system (EIS)
connection problems.
1. Are you having a problem using a data source to establish a JDBC connection to a
database? Most exceptions that occur when a data source is used to connect to a
database will have the com.ibm.ws.rsadapter package in the stack trace of the
exception. This includes java.sql.SQLExceptions and WebSphere Application Server
messages that begin with DSRA.
Yes, go to the Troubleshooting JDBC connection problems section.
No, continue to question 2.
2. Are you having a problem using a JMS connection factory to establish a JMS
connection to WebSphere MQ or another messaging system? Most exceptions that
occur when a connection factory is used to connect to a messaging system will have
the com.ibm.ejs.jms package in the stack trace of the exception. This includes
javax.jms.JMSExceptions and WebSphere Application Server messages that begin
with WMSG.
Yes, go to the Troubleshooting JMS connection problems section.
No, continue to question 3.
3. Are you seeing J2CA0045E errors with ConnectionWaitTimeoutExceptions or slow
performance in getting a connection?
Yes, go to the Troubleshooting connection wait issues, connection
leaks, and performance problems section.
No, continue to question 4.
4. Are you seeing StaleConnectionExceptions or having problems recovering from
invalid, or stale, connections in the pool?
Yes, go to the Troubleshooting stale connection problems section.
No, continue to question 5.
5. Go to the Miscellaneous connection pooling problems section. If your problem
is not covered in that section or addressed elsewhere in this document, review the
WebSphere Application Server V8.0, V8.5.5, or V9.0 product documentation, or
the WebSphere Application Server Support site for additional information that
might help you to resolve the problem, or continue to the MustGather:
Connection pooling problems for WebSphere Application Server for
database connection and connection pooling problems.
Troubleshooting JDBC connection problems
To troubleshoot this type of problem, you should have access to the administrative
console and the SystemOut.log for your application server. When diagnosing a
database connection problem, the first step is to use the Test Connection button for
the data source in the administrative console to test the connection to the database.
You can find the Test Connection button in the data source configuration panel.
2. Was the attempt to connect to the database using the Test Connection button
successful?
Yes, continue to question 3.
No, continue to question 4.
3. If you can use the Test Connection button to successfully connect to the database,
but a failure occurs when your application tries to get a connection, the problem is
likely caused by a failed JNDI lookup. Check the SystemOut.log to see if a
NameNotFoundException occurs when the application tries to use the data source. Does
a NameNotFoundException occur?
Yes, the root cause of the problem is that the JNDI lookup of the data source by
the application fails. Check the data source JNDI name and ensure that it is
bound to the JNDI namespace when the application server starts. If the JNDI
name is jdbc/ds, you should see this entry in the SystemOut.log:
WSVR0049I: Binding ds as jdbc/ds
Also, check the application code and ensure that it is looking up the correct
JNDI name. If you are doing an indirect JNDI lookup (for example,
java:comp/env/jdbc/ds), ensure that the binding is correct in the data source
resource reference. The resource reference is configured in the web, EJB, or
application client deployment descriptor. Checking these items resolves many
JNDI problems. However, if you cannot determine the cause of the problem,
continue to Troubleshooting JNDI and naming problems.
No, continue to question 8.
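The lookup check in question 3 can also be sketched in application code. This is a minimal illustration, not a WebSphere API: the class and method names (DataSourceLookup, lookup) are hypothetical, jdbc/ds is the example name used above, and only the standard javax.naming and javax.sql interfaces are assumed.

```java
import javax.naming.Context;
import javax.naming.NameNotFoundException;
import javax.naming.NamingException;
import javax.sql.DataSource;

/** Sketch: look up a data source and report a failed JNDI lookup clearly. */
final class DataSourceLookup {

    private DataSourceLookup() {}

    /**
     * Looks up jndiName in the given context. Pass a direct name such as
     * "jdbc/ds", or an indirect name such as "java:comp/env/jdbc/ds" to go
     * through the resource reference (both names are examples only).
     */
    static DataSource lookup(Context ctx, String jndiName) throws NamingException {
        try {
            Object found = ctx.lookup(jndiName);
            if (!(found instanceof DataSource)) {
                throw new NamingException(
                        jndiName + " is bound, but not to a javax.sql.DataSource");
            }
            return (DataSource) found;
        } catch (NameNotFoundException e) {
            // Same symptom the text says to look for in SystemOut.log.
            NameNotFoundException clearer = new NameNotFoundException(
                    "JNDI name is not bound: " + jndiName
                    + " (check the data source JNDI name and the resource reference binding)");
            clearer.setRootCause(e);
            throw clearer;
        }
    }
}
```

In a deployed application the context would typically be `new InitialContext()`. Passing the indirect name exercises the resource reference binding rather than the global name, which helps separate the two failure causes described above.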
4. The error message shown in the administrative console when the Test Connection
button fails is important to understanding the cause of the problem. You can also
check the SystemOut.log to get more information about the error and see the stack
trace.
Do the error message and the accompanying SQLException indicate a problem with
the user ID and password that are used to connect to the database? The actual error
and exception will differ depending on the JDBC driver and database that you are
using.
Yes, the root cause of the problem is either that no user ID and password are
passed to the database, or that the user ID or password is not correct. Ensure that a
J2C authentication alias, containing the correct user ID and password, is
specified on the data source. Also check with your DBA to make sure that the
user ID and password that you are using are correct for connecting to the
database.
No, continue to question 5.
5. If you are using a Type 2 or Type 3 JDBC driver, does a
java.lang.UnsatisfiedLinkError occur when you use the Test Connection button? A
java.lang.UnsatisfiedLinkError occurs when the JVM is unable to load a native
library that is needed by the JDBC driver.
Yes, the root cause of the problem is that the application server JVM is not
properly configured to load the native libraries. For a DB2 database, see
documents java.lang.UnsatisfiedLinkError: SQLConnect error
connecting to DB2 type 2 datasource, Failure in loading native
library db2jcct2, UnsatisfiedLinkError: ERRORCODE=-4472, and
WebSphere database connection exception
java.lang.UnsatisfiedLinkError. If you are trying to connect to an Oracle
database, ensure that the ORACLE_HOME and the LIBPATH (on AIX®),
LD_LIBRARY_PATH (on Sun Solaris or Linux®), or SHLIB_PATH (on HP-UX) JVM
environment entries are set. The required JVM environment entries might be
different for other databases.
You should also make sure that the user that is being used to run the application
server has the proper permissions to access the native libraries. Incorrect
permissions are another common cause of the java.lang.UnsatisfiedLinkError.
Finally, remember that 32-bit native libraries cannot be used with a 64-bit
application server JVM and vice versa. Mismatches between the two can also
cause the java.lang.UnsatisfiedLinkError.
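To see what the JVM itself reports when a native library cannot be loaded, a small diagnostic along the following lines can help. This is a sketch under assumptions: the class name NativeLibraryCheck is hypothetical, and it uses only the standard System.loadLibrary call and the os.arch and java.library.path system properties. It would need to run under the same user and environment as the application server to be meaningful.

```java
/** Sketch: report why a native library fails to load in this JVM. */
final class NativeLibraryCheck {

    private NativeLibraryCheck() {}

    /** Returns a one-line diagnostic instead of throwing when loading fails. */
    static String diagnose(String libraryName) {
        String arch = System.getProperty("os.arch");
        String libraryPath = System.getProperty("java.library.path");
        try {
            // Searches java.library.path for the platform-specific file name.
            System.loadLibrary(libraryName);
            return "OK: loaded " + libraryName + " (os.arch=" + arch + ")";
        } catch (UnsatisfiedLinkError e) {
            // Typical causes per the text: path not set, file permissions,
            // or a 32-bit/64-bit mismatch between the library and the JVM.
            return "FAILED: " + e.getMessage()
                    + " (os.arch=" + arch + ", java.library.path=" + libraryPath + ")";
        }
    }
}
```

For example, `NativeLibraryCheck.diagnose("db2jcct2")` would show whether the DB2 library named above is reachable, along with the library path that the JVM actually searched.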
Troubleshooting JMS connection problems
1. This section will help you to troubleshoot problems that occur when using a JMS
connection factory to establish a connection to a messaging system such as
WebSphere MQ. Only JMS connection problems are covered here.
To troubleshoot this type of problem, you should have access to the SystemOut.log for
your application server. When troubleshooting a JMS connection problem to MQ, it is
useful to determine the MQ reason code associated with the JMSException. In the
SystemOut.log, find the JMSException and review the stack trace. You should see one
or more linked exceptions. Find the last linked exception. You should see a line like
this:
The four-digit number is the reason code. You can review this Messages document
to find out more about the reason code that you see. The reason code helps you to
understand the root cause of the problem.
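When many log lines must be scanned, the reason code can be pulled out of the exception text programmatically. The sketch below is an illustration only: the class and method names are hypothetical, it simply returns the first standalone four-digit number in a string, and the sample message in the usage note is an assumed example rather than text taken from this document. In application code, the text would come from the last linked exception of the JMSException.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: extract a four-digit reason code from an exception message. */
final class ReasonCodes {

    // Exactly four digits, not part of a longer run of digits.
    private static final Pattern FOUR_DIGITS =
            Pattern.compile("(?<!\\d)(\\d{4})(?!\\d)");

    private ReasonCodes() {}

    /** Returns the first standalone four-digit number, if there is one. */
    static OptionalInt firstFourDigitCode(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher matcher = FOUR_DIGITS.matcher(message);
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }
}
```

With an assumed message such as `"Completion Code 2, Reason 2009"`, the method returns 2009, which can then be looked up in the Messages document.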
Yes, the root cause of the problem is that the application caches or fails to close
JMS session objects. Review technote J2CA0020E: The Connection Pool
Manager could not allocate a Managed Connection:
java.lang.IllegalStateException: Internal Error: cannot find the
PoolManager Reference to resolve the problem.
No, continue to question 5.
5. Are either of the following authentication errors occurring when trying to establish a
JMS connection?
MSGS0508E: The JMS Server security service was unable to authenticate userid: myuser
Yes, the root cause of the problem is either that no user ID and password are
passed to the queue manager, or that the user ID or password is not correct. Ensure
that a J2C authentication alias is specified on your JMS connection factory.
No, continue to question 6.
6. Are you observing (using a tool such as netstat) more TCP/IP connections than
you expect between your application server process and an MQ queue manager?
Yes, this is not necessarily a problem. This condition is explained in detail
in the technote Explanation of connection pool and session pool settings
for JMS connection factories. Review the same technote if you want to
understand how JMS connection pooling works in WebSphere Application Server.
No, continue to question 7.
7. Finally, if none of the previous troubleshooting steps helped to resolve the problem,
continue to the MustGather: Connection pooling problems for WebSphere
Application Server for connection pooling problems.
Troubleshooting connection wait issues, connection leaks, and performance problems
1. This section will help you to troubleshoot connection wait issues, including those that
are caused by application connection leaks, and general performance issues that are
related to connection pooling. These types of problems could occur for any type of
backend system, including databases, messaging systems, and enterprise information
systems.
To troubleshoot this type of problem, you should have access to the administrative
console and the SystemOut.log for your application server. It is also preferable for you
to have access to the source code for your application(s). Additionally, obtaining
javacores (also known as thread dumps) may help you to resolve the problem.
Does the problem eventually clear up or does it persist until you restart the
application server?
If the problem eventually clears up, the root cause of the problem is likely that
Maximum Connections is not set high enough for the amount of load on the
application. You should increase Maximum Connections for the connection pool.
You should conduct thorough load testing to find the optimal value for Maximum
Connections. You can enable PMI and monitor the connection pool
counters in the Tivoli Performance Viewer to help you tune this.
If the problem does not clear up until you restart the application server,
continue to question 5.
3. Is the application calling close() on every connection object that it obtains from a
WebSphere Application Server connection pool? When an application gets a
connection from the pool, it is considered "in use" until the application calls close()
on the connection, which then returns the connection to the free pool. If the
application does not call close(), the connection is leaked and never returns to the
free pool. Eventually, the pool might become filled with leaked connections, causing
connection wait problems.
If the application is failing to call close() on all connection objects, that is the
root cause of the problem. You should fix the application code to resolve the
problem.
If you determine that the application is not leaking connections, continue
to question 6.
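One way to make sure close() is called on every path is to route all data access through a helper that uses try-with-resources. The following is a minimal sketch, not code from any product: the names GetUseClose, ConnectionWork, and withConnection are hypothetical, and only the standard java.sql and javax.sql interfaces are assumed.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Sketch: obtain a pooled connection, use it, and always close it. */
final class GetUseClose {

    /** Work to run against a connection (hypothetical helper type). */
    interface ConnectionWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    private GetUseClose() {}

    static <T> T withConnection(DataSource dataSource, ConnectionWork<T> work)
            throws SQLException {
        // try-with-resources calls close() on every exit path, including
        // when work.apply throws, so the connection cannot be leaked here.
        try (Connection connection = dataSource.getConnection()) {
            return work.apply(connection);
        }
    }
}
```

Because the connection never escapes the helper, callers cannot cache it or forget to close it. Statements and result sets created inside the work still need their own try-with-resources blocks.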
4. Connection wait problems and slow application performance when using
connection pooling can occur when the web container is not configured properly. If
the web container thread pool size is set too high relative to the Maximum
Connections setting for the connection pool, resource contention for the available
connections could occur. It is also strongly recommended not to check the isGrowable
checkbox for the web container thread pool.
Is the web container thread pool size set too high or is the isGrowable checkbox
checked in your configuration?
Yes, then this configuration is likely the root cause of the problem. Correct the
issue and then test to see if the problem is resolved.
No, continue to question 7.
5. Other issues might cause connection wait problems. Applications should follow the
"get/use/close" pattern and close all connections immediately after finishing using
them. If an application takes too long to close connections, or caches connections, the
connections will be in use longer, which could lead to connection wait issues. Even if
the "get/use/close" pattern is followed, connections will not return to the free pool
until the transaction in which the connection is obtained is committed. If the
transaction remains active for a long time after a connection is closed, connection wait
issues could occur. This frequently occurs in servlets when a shareable connection is
obtained in a local transaction containment (LTC). More details about this scenario
and solutions for it are documented in the article Default behavior of managed
connections in WebSphere Application Server.
Is your application following the "get/use/close" pattern and not caching connections,
without long periods of time elapsing before the transaction is committed?
Troubleshooting stale connection problems
1. This section will help you to troubleshoot problems with stale or invalid connections
in the connection pool. WebSphere Application Server determines that a connection is
stale based on the exception that the backend returns when the application server
tries to establish a connection. When a database connection is stale, WebSphere
Application Server throws a StaleConnectionException. For other backends, a fatal
connection error occurs. As a result, WebSphere Application Server purges the
connection pool according to the Purge Policy setting, and applications are able to
recover from the problem. The following two documents have more information on this
issue: WebSphere Application Server StaleConnectionExceptions and
Demystifying the WebSphere StaleConnectionException.
To troubleshoot this type of problem, you should have access to the administrative
console and the SystemOut.log for your application server. It is also preferable for you
to have access to the source code for your application(s).
2. Is the backend system that you are connecting to a database, a messaging system, or
an enterprise information system?
If it is a database, continue to question 3.
If it is a messaging system or an enterprise information system, continue to
question 6.
3. If you are seeing StaleConnectionExceptions in the SystemOut.log, do they occur for
every connection attempt or do they occur intermittently or only after a certain time
period has elapsed?
If it occurs for every connection attempt, the root cause of the problem is likely
that the data source properties are misconfigured. Follow the steps in the
Troubleshooting JDBC connections section.
If it occurs intermittently or only after a certain time period has elapsed,
continue to question 4.
4. Is the application handling the StaleConnectionException properly by catching the
exception and then retrying the connection?
Yes, continue to question 5.
No, then the root cause of the problem is that the application is not catching the
exception and retrying, so it cannot recover from the problem. Review the
WebSphere Application Server StaleConnectionException document
on how to handle the StaleConnectionException and implement this in your
application code.
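The catch-and-retry handling that question 4 asks about can be sketched as a small helper. This is an assumption-laden illustration rather than the approach from the referenced document: the names StaleRetry, SqlAction, and runWithRetry are hypothetical, and the caller supplies the test for staleness (for example, an instanceof check against the StaleConnectionException class). The action is expected to obtain a fresh connection from the data source on each attempt.

```java
import java.sql.SQLException;
import java.util.function.Predicate;

/** Sketch: retry a unit of work when the failure indicates a stale connection. */
final class StaleRetry {

    /** A unit of work that gets, uses, and closes its own connection. */
    interface SqlAction<T> {
        T run() throws SQLException;
    }

    private StaleRetry() {}

    static <T> T runWithRetry(SqlAction<T> action,
                              Predicate<SQLException> isStale,
                              int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (SQLException e) {
                if (!isStale.test(e)) {
                    throw e; // not a stale connection, so retrying will not help
                }
                last = e; // stale: loop and try again with a new connection
            }
        }
        throw last; // every attempt failed with a stale connection
    }
}
```

Keeping the retry limit small avoids masking a backend that is down, and any transaction that was in progress must be rolled back and restarted inside the action rather than resumed.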
5. A data source can be configured to pretest connections to ensure that they are valid
before they are allocated to the application. See the document on pretest
connections for details. Have you configured your data source to do this?
Yes, continue to question 6.
No, the problem can probably be resolved by configuring the data source to
pretest connections before they are allocated to the application.
6. Is the Purge Policy set to EntirePool for the connection pool? This means that when a
StaleConnectionException occurs, every free connection in the pool will be purged,
which makes it easier for the application to recover.
Yes, continue to question 7.
No, the problem symptoms can be alleviated by changing the Purge Policy to
EntirePool.
7. Is the Minimum Connections property in the connection pool set to 0? Setting it to 0
enables the pool maintenance thread to clean up all of the connections in the pool
after they are unused for more than the Unused Timeout number of seconds. If Minimum
Connections is set to a value greater than 0, WebSphere Application Server must keep
at least that number of connections in the pool indefinitely. The longer that a
connection remains in the pool, the more susceptible it is to becoming stale.
Yes, continue to question 8.
No, the problem symptoms can be alleviated by changing the value of Minimum
Connections to 0.
8. Does a firewall exist between the application server and the backend system that it is
connecting to?
Yes, the root cause of the problem is likely that the firewall is timing out and
dropping connections between the application server and the backend. To avoid
this possibility, set the Unused Timeout to half the value of the timeout setting on
the firewall. This way, WebSphere Application Server can clean up its unused
connections before the firewall drops them.
No, continue to question 9.
9. Finally, if none of the previous troubleshooting steps helped to resolve the problem,
continue to the MustGather: Connection pooling problems for WebSphere
Application Server for database connection and connection pooling problems.
Also, see the following documents for additional information on the
StaleConnectionException:
What to do next
If the preceding troubleshooting steps did not solve your problem, see
MustGather: Connection pooling problems for WebSphere Application
Server to continue the investigation.