Kennis Blogs Java LDAP connection pooling

Java LDAP connection pooling

One of our software platforms uses LDAP to manage users and organisations. All application servers and webservers connect to the LDAP servers to verify user credentials and such. Unfortunately, after migrating to a new datacenter with new hardware, we discovered some errors in our logging from time to time:

 

javax.naming.ServiceUnavailableException: ldaploadbalancer:389;
socket closed; remaining name '...'
 at com.sun.jndi.ldap.Connection.readReply(Connection.java:454)
 at com.sun.jndi.ldap.LdapClient.getSearchReply(LdapClient.java:639)
 at com.sun.jndi.ldap.LdapClient.search(LdapClient.java:562)
 at com.sun.jndi.ldap.LdapCtx.doSearch(LdapCtx.java:1985)

 

These errors occurred sometimes at random, with two interesting observations:

  • The were no similar errors for the webservers;
  • The error occurred much at night, when some scheduled jobs are active.

 

After some debugging, we found out that this exception occurs when a reference to a socket object in the com.sun.jndi.ldap.Connection class is null. Because the code had not been changed, but the environment did, the load balancer was suspicious. In our code, the connection pool was configured to drop connections after an idle time of 5 minutes. The connection pool was also configured to drop connections after 5 minutes, and the load balancer was also configured with an idle timeout of 5 minutes! The old load balancer gave connections a maximal idle time of one hour. So probably, the load balancer was just a bit faster closing idle connections than the application server did. It also explains why webservers did not have the same issue; they are just differently configured. We changed the load balancer's idle timeout to 7 minutes assuming the problem was fixed.

 

ldap

 

Next day, we looked at the logging to see if the error had occurred during night. Unfortunately, it did.. We had to start over all again. This time, we decided to verify our facts. Before the migration, nobody complained about anything that led us to this problem, but did it really not occur? With some research, we found out it did! Much less than it did after the migration, but it did. Much less sounded like much less because the idle timeout was much higher. But there was no reason why a timeout of 7 minutes would give other results than a timeout of an hour, as far as we knew.

 

Running the following command on test, where no load balancer is configured, running the following command gave some interesting result:

 

$ lsof -i :ldap | wc -l 
616

616 open connections on a test environment?! That is way too much. It was time to dig in the code. We discovered the following.

 

DirContext connection;
Map<String, String> context = new HashMap<>();
context.put("com.sun.jndi.connect.pool.timeout", "30000")
try {
    connection = new InitialLdapContext(new Hashtable<Object, Object>(context), null);
} catch (NamingException e) {
    throw new LdapException("Can't connect to LDAP.", e);
}

 

Looks great, but there is a mistake in here: Some properties have to be set on the InitialLdapContext environment. But the pool properties, have to be set as system properties. So the pool timeout was not set at all! We fixed this, and redeployed one of the applications on the test environment.

 

In the controlled test environment it was easy to test our change: Make sure with lsof -i :ldap that the application server has no outstanding connections to LDAP, and do some request. Verify some connections are made, and take a coffee to wait five minutes for the connections to disappear. And? Of course, no good luck again, we started to hate the problem..

 

The Java LDAP connection pool has some useful debug logging which can be turned on with setting the system property "com.sun.jndi.ldap.connect.pool.debug" to "fine" or "all" (for a ridiculous amount of logging). The logging showed for some of the connections the following:

 

2015-03-05 14:04:52,670 ERROR [stderr] (ajp--0.0.0.0-8409-5) Use com.sun.jndi.ldap.LdapClient@3f376452
2015-03-05 14:13:12,843 ERROR [stderr] (Finalizer) Release com.sun.jndi.ldap.LdapClient@3f376452

 

Some connections did not get closed in the application code, but later on by the Finalizer (garbage collection), a resource leak. After some search, one of the parts of code that contained a resource leak was the following (bit simplified):

 

protected String lookup(String dn, String query) throws LdapException {
    DirContext connection = provider.getConnection();
    NamingEnumeration results = null;
    String foundDn;
    try {
        results = connection.search(dn, query, getSearchControls());
        if (results.hasMore()) {
            foundDn = results.next().getNameInNamespace();
            if (results.hasMore()) {
                throw new LdapException("Multiple objects returned");
            }
        } else {
            foundDn = null;
        }
    } catch (NamingException e) {
        throw new LdapException("Can't lookup object DN", e);
    } finally {
        LdapUtil.closeQuietly(results);
        LdapUtil.closeQuietly(connection);
     }
    return foundDn; }

 

Looks good. But the getSearchControls method contains a dangerous setting:

 

SearchControls searchControls = new SearchControls();
searchControls.setReturningObjFlag(true);

This setting gives to access to underlying SearchResult object using searchResult.getObject(). For some reason you have to cast this to an LdapCtx, and than call close on it, if setReturningObjectFlag(true) is called.

 

Another resource leak we had was with requesting attributes, a quick overview of objects that have to be closed with the Java LDAP library:

  • Of course, the DirContext, which represents the connection, has to be closed.
  • Also, a NamingEnumeration has a close method which should be called.
  • As I said, a SearchResult's object (getObject()) has to be closed when returningObjFlag is set to true.
  • Finally, Attributes have a NamingEnumeration inside which has be closed and can be accessed by calling Attributes.getIDs().

 

To conclude, the standard Java LDAP library is old and contains much opportunities to create resource leaks. If you can, it might be a better idea to pick another LDAP library.