January 21, 2006

Dealing with long running synchronous operations

I've had this question several times now, so it's clearly not explained well in the docs. The scenario is typically a remote EJB method that takes minutes to run. The method may be running a very expensive database operation involving many rows, or it may be generating a large report.

The problem is basically that, with WebSphere's default settings, a call to this type of method will time out before it completes. You'll need to do some tuning first.

Tuning the client

The main problem on the client is that WebSphere applies a timeout to remote requests. The available parameters are described in the info center. The main one for our purposes is com.ibm.CORBA.RequestTimeout, which specifies how long a client will wait for a server to respond to a request before timing out. Set it to roughly double the expected average response time. Double is only an estimate; you'll need to verify that the value you use makes sense in your environment.

Unfortunately, this value can't be set per EJB or per EJB method, which would have been very useful. You can only set it at the JVM level. The value affects these basic cases:

  • Hung remote servers
    If no response arrives within the timeout, the remote server has probably hung. In these cases, we want the client to unblock and throw an exception.
  • Hard failed or disconnected remote servers
    This means a network issue or an actual server failure (power or component failure). The TCP socket to the server will eventually time out (this depends on the operating system's TCP keepalive tuning), but usually only after a considerable period of time. The CORBA timeout can also be used to unblock the client in these cases.
  • Hung short period methods take a long time to unblock
    Remember, you can only set the value once per JVM. If your JVM calls EJB methods with a big range of response times, then you'll only be able to tune it for the longest response time. Clearly, this isn't optimal for the more normal short remote calls, which will stay blocked just as long when they hang.
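
To make the trade-off in the list above concrete, here is a minimal sketch of the arithmetic. The class, the method names and the timings are all made up for illustration, and it assumes the property value is expressed in seconds (check the info center for your release). The value is printed in system property form only as one possible way of passing a JVM-level setting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a WebSphere API: it only does the arithmetic from the text.
final class RequestTimeoutPlanner {

    // The rule of thumb from the post: twice the expected average response time.
    static final int SAFETY_FACTOR = 2;

    // One value per JVM, so it has to cover the slowest remote method.
    static long jvmTimeoutSeconds(Map<String, Long> avgResponseSeconds) {
        long slowest = 0;
        for (long seconds : avgResponseSeconds.values()) {
            slowest = Math.max(slowest, seconds);
        }
        return slowest * SAFETY_FACTOR;
    }

    // How much longer a hung call to one method stays blocked, compared with
    // a timeout that had been tuned for that method alone.
    static long extraBlockingSeconds(long jvmTimeout, long methodAvgSeconds) {
        return jvmTimeout - methodAvgSeconds * SAFETY_FACTOR;
    }

    public static void main(String[] args) {
        Map<String, Long> avg = new LinkedHashMap<>();
        avg.put("lookupCustomer", 1L);    // made-up method names and timings
        avg.put("generateReport", 300L);

        long timeout = jvmTimeoutSeconds(avg);
        System.out.println("-Dcom.ibm.CORBA.RequestTimeout=" + timeout);
        System.out.println("extra wait for a hung lookupCustomer: "
                + extraBlockingSeconds(timeout, avg.get("lookupCustomer")) + "s");
    }
}
```

With these made-up numbers the JVM-wide value comes out at 600, so a hung one-second lookup would block for nearly ten minutes longer than it would with its own timeout.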

Tuning the server

The server will need the transaction timeout raised to a value large enough to allow the long running methods to complete successfully. You can find the documentation for these settings in the info center. Unfortunately, these are once again JVM settings, which means you can only tune them for the longest methods in a single JVM, except on WebSphere 6.0, where the timeout can be changed on a per EJB basis. If you can't set it per EJB, then I'd recommend splitting the EJBs into groups based on response time and deploying EJBs with similar response times to the same JVM. This at least will allow some fine tuning, at the expense of running more JVMs.

Obviously, you can't split up the app on the client side, as the client will probably be calling remote EJBs with different response times.

"Doctor, when I do this then it always hurts? Well, don't do that..."

Obviously, this isn't great. The lack of fine grained tuning makes the compromises hard to swallow. Long running synchronous operations are just tricky to get right if you're invoking them using remote EJB calls. You should really look at message driven beans and JMS to trigger these types of tasks. The client sends a JMS message to trigger the task. The task may still need a way to send results back to the client. Temporary JMS destinations would be one option; another is to write the results to a table using a client supplied UUID as the key. Either way, the client will need to check for messages or poll the table for the results. Table polling may be done on a fixed poll interval or when the user clicks a button to check the status.
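
The shape of the table polling variant can be sketched without a JMS provider. In the sketch below, an in-memory queue stands in for the JMS destination and a map stands in for the results table. Every name in it is hypothetical and nothing here is the JMS or WebSphere API. It only shows the flow: the client picks a UUID, sends the trigger, and later polls by that key.

```java
import java.util.Map;
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Stand-ins only: the queue plays the JMS destination and the map plays the
// results table. None of these names come from WebSphere or the JMS API.
final class QueuedTaskSketch {

    // Hypothetical message payload: the client supplied key plus the work size.
    static final class Request {
        final String uuid;
        final int rows;
        Request(String uuid, int rows) { this.uuid = uuid; this.rows = rows; }
    }

    static final Queue<Request> destination = new ConcurrentLinkedQueue<>();
    static final Map<String, String> resultsTable = new ConcurrentHashMap<>();

    // Client side: pick the UUID, send the trigger message, return immediately.
    static String submit(int rows) {
        String uuid = UUID.randomUUID().toString();
        destination.add(new Request(uuid, rows));
        return uuid;
    }

    // Consumer side: roughly what an MDB's onMessage would do for one message.
    // Returns false when there was nothing to process.
    static boolean consumeOne() {
        Request request = destination.poll();
        if (request == null) {
            return false;
        }
        long sum = 0;
        for (int i = 1; i <= request.rows; i++) {
            sum += i; // placeholder for the expensive work
        }
        resultsTable.put(request.uuid, "sum=" + sum);
        return true;
    }

    // Client side: one poll of the table; null means "not ready yet".
    static String check(String uuid) {
        return resultsTable.get(uuid);
    }

    public static void main(String[] args) {
        String key = submit(100);
        System.out.println("before: " + check(key)); // null, nothing consumed yet
        consumeOne();
        System.out.println("after: " + check(key));  // sum=5050
    }
}
```

In a real deployment the container would deliver the message to the MDB on its own thread, and the client would call something like check on a timer or from a status button.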

A good alternative to JMS is to use a remote EJB call to initiate the long running task. The EJB method picks a UUID for the results, starts an async bean thread for the long running task, and then returns the UUID to the client in the response. The async bean is given the UUID when it is started, so it can write partial results or status to a database table using the UUID as the key. The client can then use another remote EJB method, passing the UUID, to poll the status or receive the result. This is simpler than the JMS approach and is easier to make fault tolerant (no JMS server).
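
Here is a rough sketch of that ticket pattern in plain Java. It is not the async beans or WorkManager API: a fixed size ExecutorService stands in for the WorkManager, a map stands in for the database table, and the class and method names are invented for the example. Recording a FAILED status and bounding the pool are my own assumptions about what a robust version would want.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical service: start() and status() play the two remote EJB methods.
final class TicketedTaskSketch {

    enum Status { RUNNING, DONE, FAILED }

    // Stand-in for the database table keyed by UUID.
    private final Map<String, Status> statusTable = new ConcurrentHashMap<>();

    // Stand-in for the WorkManager; a bounded pool so runaway tasks cannot
    // keep creating threads.
    private final ExecutorService workers = Executors.newFixedThreadPool(2);

    // First remote method: record the ticket, hand off the work, return at once.
    String start(Runnable longRunningTask) {
        String uuid = UUID.randomUUID().toString();
        statusTable.put(uuid, Status.RUNNING);
        workers.execute(() -> {
            try {
                longRunningTask.run();
                statusTable.put(uuid, Status.DONE);
            } catch (RuntimeException e) {
                // Record the failure so a polling client can see it.
                statusTable.put(uuid, Status.FAILED);
            }
        });
        return uuid;
    }

    // Second remote method: the client polls with its ticket; null means unknown ticket.
    Status status(String uuid) {
        return statusTable.get(uuid);
    }

    // Demo helper: stop accepting work and wait for submitted tasks to finish.
    boolean shutdownAndWait(long seconds) {
        workers.shutdown();
        try {
            return workers.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        TicketedTaskSketch service = new TicketedTaskSketch();
        String ok = service.start(() -> { });
        String bad = service.start(() -> { throw new IllegalStateException("simulated failure"); });
        service.shutdownAndWait(5);
        System.out.println(service.status(ok));  // DONE
        System.out.println(service.status(bad)); // FAILED
    }
}
```

The point is that start returns as soon as the work is handed off, so the client's request timeout only has to cover two short calls instead of the whole task.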

Either way, this type of asynchronous approach is probably much easier to deal with and make robust than trying to run the whole operation inside a single synchronous remote EJB call.

Comments

You mention starting an "async bean thread". Isn't this a violation of the spec? Or is there something with WAS that allows this to be done with a resource adapter allowing the container to manage it?

Posted by: Scott Carlson | Jan 21, 2006 10:12:48 PM

WebSphere allows application threading through the use of async beans/WorkManager APIs. The resource adapter threading isn't really suitable for application use. The WorkManager APIs are currently being standardized by JSR 236/237.

Posted by: Billy | Jan 23, 2006 8:58:16 AM

Hi Sir,

My name is Varadaraj, and I work as a J2EE developer. I'd like a list of parameters for IBM WAS 6.0 to tune a web application so that its response time improves and the application responds faster.

Regards,
Varadaraj.M

Posted by: Varadaraj.M | Jan 25, 2006 12:40:10 AM

Billy

Yup - clients should not wait on long running requests. They should submit a request and receive some form of ticket (your UUID). Clients can then be notified of completion via some sort of push technology. I would not recommend increasing timeouts, as it's a road to trouble!

Andrew

Posted by: Andrew Ward | Jan 25, 2006 7:59:24 PM

Billy,

I'd like to clarify the following statements: “…The EJB method would pick a UUID for the results, start an async bean thread for the long running task. The ejb method then returns the UUID to the client in the response.”

Does it mean that the EJB method will start async bean thread and will not wait for its completion?

Let’s assume that this is true and the EJB method will start async bean thread and will not wait for its completion.

It is OK from the spec point of view, for example, in JSR 237 “…Short-lived Works may exceed the life of the submitting request method…”.

However, is it a good practice to follow?

We may have the following potential issues if the EJB method does not wait for completion of the async bean threads:

• One of the problems is that you do not know if your task, which was running on async bean thread, completed successfully or failed.

• WM thread pool should not be growable, because otherwise it could exhaust threads in the OS.

• I believe that Tivoli does not provide stats for WM thread pool usage. This means it could be difficult to detect whether we have “run away” processes that were initiated by the EJB method.

Posted by: Jeffrey Smith | Jan 27, 2006 8:33:42 PM
