April 01, 2007
Messaging-only APIs need to go, JPA needs to step up
I'm getting tired of the need for separate messaging and data access APIs. You could characterize these APIs as:
- Ordered accessible data (OAD)
- Key accessible data (KAD)
Key accessible data arguably has the most capability and has seen the most innovation over the last 5 years. Most notably, the JPA specification has arrived, which standardizes the POJO-oriented work done in O/R mappers like Hibernate and TopLink. We have automatic mapping, relationship management, query support, lots of nice stuff. Now let's look at JMS. It's about as stagnant as you can imagine. Why can't I do all the same stuff with OAD that I can with KAD? Do we now need to enhance JMS? NO, NO, NO. We need to force messaging to simply extend the JPA-like standards, not compete with them. Messaging is just state, and state should be managed the same way whether it's state from a message, state from an in-memory database like ObjectGrid, or state from a relational database.
For me, this status quo isn't acceptable at all. The JPA specification is unfortunately very short-sighted in this area and too relational-database focused. It should be a main specification focusing on entities and the relationships between entities, with a subspec specializing it to relational databases. Any product that works with persistent state should have been able to use the JPA APIs, but alas, that isn't the case, and the JPA 2.0 work doesn't look any different. Why couldn't we use the JPA spec for messaging, for non-relational databases, or as an API to an in-memory database directly?
JPA-style interfaces could easily support messaging semantics with one little change. OAD and KAD have different but not incompatible locking requirements. OAD needs a way to scan a filtered set of records (think query) looking for unlocked records, then lock the first one and return it to the application. The JMS receive method does just this. If many threads/consumers go after the top of a queue, the locking model allows this behavior: find the first unlocked message in FIFO order, lock it, and return it.
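That lock-scan semantic can be sketched in a few lines of plain Java. This is an illustrative, in-memory stand-in of my own (not ObjectGrid's or any JMS provider's actual API): scan in FIFO order, skip anything locked, then lock and return the first free record.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory sketch of "scan for the first unlocked record, lock it, return it".
// This illustrates the locking model only; it is not a real product API.
public class LockScanQueue {
    private final Map<String, String> records = new LinkedHashMap<>(); // FIFO order
    private final java.util.Set<String> locked = new java.util.HashSet<>();

    public synchronized void put(String key, String value) {
        records.put(key, value);
    }

    // JMS-receive-like: return the first record not locked by another consumer.
    public synchronized Map.Entry<String, String> receive() {
        for (Map.Entry<String, String> e : records.entrySet()) {
            if (!locked.contains(e.getKey())) {
                locked.add(e.getKey());   // lock it for this consumer
                return e;
            }
        }
        return null;                      // everything is locked, or the map is empty
    }

    // Commit: remove the record and release its lock.
    public synchronized void commit(String key) {
        records.remove(key);
        locked.remove(key);
    }

    public static void main(String[] args) {
        LockScanQueue q = new LockScanQueue();
        q.put("m1", "first");
        q.put("m2", "second");
        System.out.println(q.receive().getKey()); // m1: first unlocked in FIFO order
        System.out.println(q.receive().getKey()); // m2: m1 is now locked, so skip it
    }
}
```

With this one primitive, the same map serves queue-style consumers and keyed lookups alike, which is the whole point.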
Why then don't databases provide this locking model, scan-for-an-unlocked-record-and-lock-it? I don't know. We have added it to ObjectGrid for the next release, and this simple addition allows an ObjectGrid Map to be used as both a KAD and an OAD. This completely unifies data and persistent messaging, and it's very, very cool.
No more are messaging data and keyed data treated differently; they're the same, and they should be. Why can't I annotate a message with a relationship to data outside the message, in a TABLE or an ObjectGrid entity? It should be absolutely seamless. Messages should not be orphans in your data model: a message is just another record type, it should be able to have relationships with other records, and this should be supported first class.
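To make "a message is just another record type" concrete, here is a hedged sketch of what such an entity might look like. The annotations below are minimal stand-ins defined inline so the example compiles on its own; real JPA uses the javax.persistence annotations, and JPA today defines no messaging semantics at all, which is exactly the complaint.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-in annotations so the sketch is self-contained. In real JPA these
// would be javax.persistence.Entity, javax.persistence.Id, javax.persistence.ManyToOne.
@Retention(RetentionPolicy.RUNTIME) @interface Entity {}
@Retention(RetentionPolicy.RUNTIME) @interface Id {}
@Retention(RetentionPolicy.RUNTIME) @interface ManyToOne {}

// An ordinary record type that the message relates to.
@Entity
class Customer {
    @Id long id;
    String name;
}

// The message itself is just another entity: payload state plus a
// first-class relationship to data outside the message.
@Entity
public class OrderMessage {
    @Id long sequence;              // could double as the FIFO ordering key
    String payload;
    @ManyToOne Customer customer;   // relationship to a non-message record

    static boolean hasRelationship() {
        try {
            return OrderMessage.class.getDeclaredField("customer")
                                     .isAnnotationPresent(ManyToOne.class);
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasRelationship()); // true: tooling can see the relationship
    }
}
```

Nothing about the message is special: the same mapper, query language and tooling that handle Customer would handle OrderMessage.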
I think it's been too long now for this change to happen in the industry. Messaging APIs are dying on the vine; the WS-* stuff just redefines the existing APIs with WS-* APIs and transports but does nothing to simplify the life of a programmer. WS-* provides new syntactic sugar for the existing APIs and supports more protocols, but fundamentally does nothing new for the programmer. We need one data programming model, and it needs to support both KAD and OAD with the same data model, tooling and APIs. I say, split JPA into two specs and extend the base spec with an optional addition for efficient OAD support.
June 22, 2005
Total JMS Message ordering
A lot of customers think JMS products provide message order. I haven't found one yet that really does it.
Naively, you'd think that if a single publisher sends messages then a single receiver will process those messages in published order.
We make an MDB with a single listener thread and run it on a single server. If nothing goes wrong then this is possible. But, in the real world, sadly, things do go wrong.
The first problem is a crashed client where the socket to the JMS server remains open. This can happen when a network breaks or the server physically crashes. The message(s) buffered on that client stay locked until the socket is closed. This can take a while depending on TCP keep-alive settings; if you haven't tuned them, it's likely 2 hours. Meanwhile, if you restart your client on a different server and it asks for messages, it gets the next messages. Oops. Out of order. When the socket is eventually closed, the 'trapped' messages are unlocked and delivered to the client. Busted.
Next up are in-doubt transactions. If an application server was processing messages and then crashed, or the box it was running on failed, then we may have some in-doubt transactions: transactions interrupted during the XA commit phase. Normally, these transactions need to be recovered either by a peer cluster member (WAS v6 does this in 15 seconds) or by restarting that server on the same or a different box (minutes). During this time, if you started receiving messages on a different server, it will receive the messages after the ones trapped by the in-doubts. Oops. Out of order. Recovery may also roll those transactions back, in which case the old messages in use by them become available again and are delivered to your client out of order.
This kind of requirement comes up a lot, but it's complicated to get right, as you should be realizing from the above. There are also interesting edge conditions. JMS offers a time-to-live on messages, and most providers move a message to a dead-letter queue after some number of failed delivery attempts (say 5). But we want in-order messaging. When this happens, do we suspend the queue until an administrator looks at the suspect message and tells the JMS server to put it back or discard it? Poison messages have the same impact. Then, when we want message order, do we mean order per individual publisher, or the order in which the queue received the messages? What makes the order: commit time or publish time? It can turn into a rat hole pretty quickly.
I don't think people realize what 'I must have message order' means in terms of implementation. For now, you will need application logic to do this correctly; I don't know of any product that can reliably do it at the moment. Anyway, it's a heads up.
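For what it's worth, that application logic usually ends up being a resequencer: stamp each message with a publisher sequence number, and have the consumer hold back anything that arrives early until the gap fills. A minimal sketch, with names of my own choosing:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Resequencer sketch: releases messages strictly in sequence order,
// buffering any that arrive early (e.g. after a rollback causes redelivery).
public class Resequencer {
    private long nextExpected = 1;
    private final Map<Long, String> pending = new HashMap<>();

    // Returns the messages that can now be released, in order.
    public List<String> onMessage(long seq, String body) {
        List<String> released = new ArrayList<>();
        pending.put(seq, body);
        while (pending.containsKey(nextExpected)) {
            released.add(pending.remove(nextExpected));
            nextExpected++;
        }
        return released;
    }

    public static void main(String[] args) {
        Resequencer r = new Resequencer();
        System.out.println(r.onMessage(2, "b")); // []     - held, waiting for 1
        System.out.println(r.onMessage(1, "a")); // [a, b] - gap filled, both release
    }
}
```

Note this handles the reordering but not the hard policy questions above (how long to wait for a gap, what to do with poison messages); those still need answers per application.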
May 16, 2005
When is a rollback a commit...
Thought I'd raise this up to the collective conscience. Here's a scenario. A JVM is using a cache in front of a database. It uses the database in one-phase mode and issues commits to the database. Let's assume the database crashes before it processes the commit. This is fine: the app server interprets this as a rollback, so we roll back and inform the application.
Now, let's assume the database processed the commit but crashed before returning the ACK over JDBC to the application server. This looks the same as the previous scenario. So the server rolls back and tells the application it rolled back.
Problem is, when the application reconnects to the database after it restarts, the cache is no longer in sync with what's in the database, and general weirdness can now occur.
Basically, the moral here is that whenever the application sees a rollback due to a stale connection (i.e. a connectivity problem or a DB2/Oracle reconnect), it should flush/reload the cache, or otherwise figure out what the real story is when it gets a connection again. It's not clear which of the in-flight transactions that the app server rolled back actually committed on the database, so the cache cannot be relied upon to be up to date.
Of course, if your cache is optimistic then this doesn't apply, but if it's a write-through cache expecting exclusive access then this should definitely be on your radar as a scenario to handle.
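Here's the moral as a sketch, using hypothetical cache and flag names of my own: a definite rollback (the database answered) leaves the cache valid, while a rollback reported because the connection went stale must flush it, since the commit may in fact have happened.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: distinguish a definite rollback from an ambiguous one caused by a
// lost connection, where the database may actually have committed.
public class WriteThroughCache {
    private final Map<String, String> cache = new HashMap<>();

    public void put(String k, String v) { cache.put(k, v); }
    public int size() { return cache.size(); }

    // Called when the app server reports a rollback. staleConnection == true
    // means the DB may have committed before dying, so every cached value is
    // suspect and must be reloaded from the database.
    public void onRollback(boolean staleConnection) {
        if (staleConnection) {
            cache.clear(); // flush; repopulate lazily once the DB is back
        }
        // A definite rollback (DB answered "rolled back") leaves the cache valid.
    }

    public static void main(String[] args) {
        WriteThroughCache c = new WriteThroughCache();
        c.put("acct:42", "balance=100");
        c.onRollback(false);
        System.out.println(c.size()); // 1: definite rollback, cache still valid
        c.onRollback(true);
        System.out.println(c.size()); // 0: ambiguous outcome, cache flushed
    }
}
```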
April 01, 2005
Faster optimistic locking with DB2
Just saw this for the first time today. DB2 allows tables to have a hidden column that is basically a 'last updated timestamp'. This column can be used to do overqualified updates when doing optimistic locking. But an issue with this was that the application needed to read the updated row again to get the new timestamp. DB2 now has statement forms that allow the updated columns to be returned from the INSERT or UPDATE statement itself. This basically halves the number of statements sent to the database when using this kind of mechanism for optimistic locking.
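The statement form in question is DB2's data-change-table-reference: selecting from the FINAL TABLE of an UPDATE returns the new column values in the same round trip. A hedged sketch, with made-up table and column names:

```sql
-- Overqualified optimistic update and timestamp read-back in one statement.
-- 'account' and its columns are hypothetical; FINAL TABLE is DB2's
-- data-change-table-reference form.
SELECT updated_ts
  FROM FINAL TABLE (
       UPDATE account
          SET balance = 500
        WHERE id = 42
          AND updated_ts = ?   -- the timestamp read when the row was fetched
  );
```

If zero rows come back, the overqualified predicate failed and someone else updated the row first; otherwise you get the new timestamp without a second SELECT.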
Read about it here or google for data-change-table-reference
December 10, 2004
Tips for efficient frequent JMX calls to WebSphere.
I'm currently using JMX (through the wsadmin utility in WAS) to call MBeans to query runtime state, but it's too expensive to do frequently. Starting wsadmin takes too much CPU and it just takes too long to get the data.
My current workaround is to deploy a servlet to the server and the servlet does the JMX query or other commands against the MBean. The servlet parameters provide the necessary data for what I want to do.
I then use a perl script with the perl HTTP client to invoke the servlet and get the answer. This is significantly more efficient than using wsadmin for these scenarios: typically well under a second to get the data, versus 15-30 seconds with wsadmin, best case. I'll see if I can publish the code; the servlet is pretty straightforward to write and the perl script is also simple.
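Here is a stand-in for the idea using only JDK pieces. This is my approximation, not the WebSphere servlet itself (which would query the server's own MBeanServer): an in-process HTTP endpoint answers MBean queries so the caller pays one HTTP round trip instead of a JVM start.

```java
import java.io.OutputStream;
import java.lang.management.ManagementFactory;
import java.net.InetSocketAddress;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import com.sun.net.httpserver.HttpServer;

// Stand-in for the "JMX servlet" idea: expose an MBean attribute over HTTP
// so callers (a perl/curl script, say) avoid starting a JVM per query.
public class JmxHttpBridge {
    // Query one MBean attribute from the in-process MBeanServer.
    public static Object query(String objectName, String attribute) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        return mbs.getAttribute(new ObjectName(objectName), attribute);
    }

    // Convenience wrapper used for a quick sanity check; -1 signals failure.
    static int threadCount() {
        try {
            return (Integer) query("java.lang:type=Threading", "ThreadCount");
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/mbean", exchange -> {
            byte[] body;
            try {
                // e.g. /mbean?name=java.lang:type=Threading&attr=ThreadCount
                String q = exchange.getRequestURI().getQuery();
                String name = q.split("&")[0].split("=", 2)[1];
                String attr = q.split("&")[1].split("=", 2)[1];
                body = String.valueOf(query(name, attr)).getBytes();
            } catch (Exception e) {
                body = ("error: " + e).getBytes();
            }
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
        System.out.println("listening on :8080/mbean");
    }
}
```

A script can then do the equivalent of `GET /mbean?name=java.lang:type=Threading&attr=ThreadCount` and parse the one-line reply, which is where the perl client comes in.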
Worth remembering if you need to do this sort of checking frequently. It probably applies to everyone's application server, not just WebSphere. A perl script simply starts a hell of a lot faster than a JVM.
December 08, 2004
OSGi, not just for clients.
Andrew Binstock recently published an informative article on the uses of OSGi for clients. Andrew's article can be viewed here. It's a good introduction to OSGi, and you can find another good introduction at this link on Oscar, an OSGi implementation at SourceForge.
I think OSGi also has a big future on the server side. If we could get the open source and commercial application server vendors using OSGi for their server runtime frameworks, then a lot of interesting things could happen. Rather than the current situation where every vendor has its own runtime component mechanism (and the open source ones are no better here: JBoss with JMX, Apache with GBeans, ...), if we could all agree on OSGi, then third parties making plugins for application servers, such as cache providers and persistence providers, would have a much simpler time integrating those products. Maybe products like Tangosol or GigaSpaces could adopt an OSGi-based runtime.
Application programmers would benefit from having a standard service registry in which to find these third party plugins. Application servers would be easier to 'upgrade'/integrate with third party function. It has a lot of potential, if we can just get the vendors to agree. I think a positive initial step would be for the open source J2EE servers to adopt OSGi as their runtime framework in lieu of their current private runtimes.
Another thing to think about is how OSGi could impact J2EE. Could web modules/EJB modules be bundles? Would this be easier or harder than today? Could JCA RAs be bundles also? Lots of scope here.
If you want to play with OSGi, an OSGi 3.0 implementation is included in IBM Workplace Client Technology, Micro Edition 5.7.1, which is available for download. Eclipse also includes an OSGi R3+ runtime.
November 29, 2004
WebSphere Partitioning Facility (WPF) Programming guide online
The PDF guide to writing J2EE partitioned applications using the new WebSphere Partitioning Facility (WPF) APIs is online now at this link. These APIs are part of WebSphere Extended Deployment (XD) 5.1.
November 05, 2004
WebSphere XD ships
We just shipped WebSphere 5.1 XD a couple of weeks ago. XD has many features that should appeal to customers.
The On Demand Router (ODR) is a Java proxy server that sits in front of a set of HTTP servers. These can be WAS servers, servers from other vendors such as BEA, or servers on the LAMP stack (PHP etc.). The ODR can 'shape' the traffic onto those boxes so that service level agreements can be met. When used with WebSphere XD servers, it also supports dynamic clusters. A normal cluster is a set of nodes running the application. A dynamic cluster is an application deployed onto a node group: a set of nodes on which the application can 'potentially' run. Multiple applications can be deployed to the node group. The ODR needs to be configured with a service level agreement for each application. It then decides, based on how well the application is meeting its SLA, how many servers in the node group will run the application. If the application is too slow, it starts the application on more nodes in the node group. If all nodes are currently being used and an application still cannot meet its SLA, then lower priority applications will be stopped to 'make room' for the higher priority application. It's pretty cool.
XD also has the new WebSphere Partitioning Facility (WPF). This allows a J2EE application to split itself into named partitions. A partition is like a named daemon running in the cluster. Partitions are declared programmatically by the application at startup and can also be added/removed dynamically at runtime. The partition runs inside the app server. IIOP work can use content-based routing to route an IIOP request to the partition that can service it. This allows J2EE applications to have a singleton running in a cluster and have requests for that singleton routed to it. JMS requests, or requests from middleware like Tibco RV or Reuters Connect, can be similarly routed to the server handling that particular data stream. A partition can use threading to dynamically subscribe/unsubscribe to just the feeds that it requires.
If a cluster member fails then all partitions running there are failed over to the surviving cluster members very quickly. How quickly depends on tuning but 6 seconds isn't unreasonable.
Partitions can be grouped together using an application grouping scheme, and those groups can then be placed on particular cluster members. This allows 'busy' partitions to be placed on more powerful servers, or fewer busy partitions to be placed on a single server. Backup servers for these groups can also be specified, along with whether failback is desired if the original server restarts. This placement logic lets a rule-based mechanism determine simple placement: 5-10 rules can manage 9000 partitions in a cluster.
This lets XD be used for very demanding applications such as electronic trading systems or high-end batch processing, as these applications are typically stateful and either scale only vertically or don't perform well as stateless servers. WPF allows customers to partition such applications; each partition is stateful and runs on a cluster member. If the member fails then the partition fails over to another cluster member. WPF provides failover/restart events to the application so that it can react to them. When WPF is combined with the async beans features from WBI, some pretty serious applications can be built that leverage the J2EE environment, and stateful applications can be made to scale almost linearly horizontally while remaining highly available through the built-in failover mechanism.
WPF and the high availability manager can also integrate with BladeCenter hardware. This allows WebSphere to make sure that a server is really dead if it becomes unresponsive before WebSphere actually fails over the partition. WPF applications can thus be written without worrying about a partition still running on another server when unexpected events such as a network partition or split-brain syndrome occur.
Anyway, it's pretty cool and it's exciting working with some of the big Wall St firms on projects exploiting these features. Enjoy. For more information, check it out here.
September 19, 2004
Problems with JMS 1.1 and J2EE 1.3 app servers
I wish the JMS vendors would take care to ensure that their clients don't depend on a JMS 1.1 jar being in the classpath. For the next 2-3 years, I think most of the deployed servers (i.e. the ones JMS vendors can sell to) will be at J2EE 1.3.
It seems a lot of the JMS 1.1 products do depend on it, and as a result don't run out of the box on J2EE 1.3 servers, which only bundle the JMS 1.0.x jars and classes. The JMS classes are unfortunately inside j2ee.jar and thus are kind of hard to get at for most people, I think.
The current workaround is to expand j2ee.jar, unjar the JMS 1.1 files into it, and then re-jar. This would work from what I can tell, but from a support point of view, a J2EE vendor may punt and claim the configuration isn't supported. It may also be possible to simply add a JMS 1.1 jar to the classpath and hope it finds the extra classes, but I haven't tried this.
I guess the JMS JCP should have gone out of its way to insist that a JMS 1.1 compliant product must support JMS 1.0.2 clients, i.e. a J2EE 1.3 server.
November 07, 2003
Why are JMS APIs restricted in J2EE, and how to work around it.
The J2EE specification forbids certain JMS APIs; section 6.6 of the 1.3 specification documents which ones. When you declare a JMS provider to an application server and the application looks up this provider, it actually receives a wrapped JMS provider (wrapped connection, wrapped sessions, wrapped everything). The wrapper serves to add XA support at the appropriate times, implement connection/session pooling, and police which APIs are callable on the provider. The same mechanism used to be used on JDBC providers, but the JCA mechanism now takes care of this for JDBC and JCA connectors in general. JCA 1.5 should remove the need for this wrapper on JMS providers in the J2EE 1.4 time frame.
Thus, if you want to call restricted methods such as Connection.setExceptionListener, or APIs to do with ConnectionConsumer, these will fail with a compliance exception. Basically, the J2EE spec says you can't call these. When J2EE 1.4 comes along, JMS providers will have to implement this 'forbidden API' support when they run in a JCA connector.
There is also a restriction of a single JMS session per connection, and you can't call Session.setMessageListener. This basically means an application can't do async receiving of messages at all except using MDBs, and MDBs are typically too static for the kinds of customers/applications I'm alluding to here.
There are times when this just plain gets in the way if you are an advanced customer, typically a financial one. If you're using async beans in WAS-E 5.0 to build advanced messaging applications, you need to work around this problem. The reason the spec disallows these APIs is that they restrict the ability of an application server to provide a managed environment. The server may want to set its own exception listener. When Connection.start is called, the JMS provider starts a thread for each JMS session associated with the connection and calls message listeners on those threads. Clearly, these threads aren't made by the app server, hence the objection.
JCA 1.5 should help here, as the JMS provider would use threads obtained from the JCA WorkManager for these sessions. One issue with the JCA 1.5 specification is that the provider has no way to tell the app server that the thread is long-lived, so the app server will likely hand out threads from a pool for the session threads. That's a bad thing, as these are clearly daemon threads; when J2EE 1.4 comes along, size your connector thread pools appropriately.
But nevertheless, sometimes a developer wants to do this. The only workaround available to customers is to load the JMS provider directly: configure some properties in your application for "InitialContextFactory", "URL", "JNDI Name", "UserID" and "Password", and directly create a TopicConnectionFactory or QueueConnectionFactory. This gives you an unwrapped, unaltered JMS provider which will allow you to make the forbidden calls. BUT, you will lose the built-in XA support and pooling if you do this, because it's the wrappers that gave you those capabilities. From what I've seen, though, the typical application we're talking about here doesn't need that support anyway. This may also require your application to use a Java 2 security policy permitting it to instantiate the JMS provider, if Java 2 security is enabled.
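The direct-load workaround boils down to building your own JNDI environment and looking the connection factory up yourself, outside the server's wrappers. A hedged sketch; the factory class, URL and JNDI name below are placeholders for whatever your provider documents:

```java
import java.util.Hashtable;
import javax.naming.Context;

// Build the JNDI environment for a direct (unwrapped) JMS provider lookup.
// The factory class and URL are placeholders; use your provider's values.
public class DirectJmsLookup {
    public static Hashtable<String, String> buildEnv(String factoryClass,
                                                     String providerUrl,
                                                     String user, String password) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, factoryClass);
        env.put(Context.PROVIDER_URL, providerUrl);
        env.put(Context.SECURITY_PRINCIPAL, user);
        env.put(Context.SECURITY_CREDENTIALS, password);
        return env;
    }

    public static void main(String[] args) {
        Hashtable<String, String> env = buildEnv(
            "com.example.jms.InitialContextFactory", // placeholder class name
            "tcp://mqhost:7222", "appuser", "secret");
        System.out.println(env.get(Context.PROVIDER_URL));
        // With a real provider's client jar on the classpath you would continue:
        //   Context ctx = new InitialContext(env);
        //   TopicConnectionFactory tcf =
        //       (TopicConnectionFactory) ctx.lookup("jms/MyTCF");
        //   TopicConnection tc = tcf.createTopicConnection();
        //   tc.setExceptionListener(ex -> { /* reconnect logic */ });
        // ...but remember: no container XA enlistment or pooling on this path.
    }
}
```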
If you're using WebSphere Enterprise and a message listener is called on an unmanaged JMS session thread, you should use a WorkManager to process the message on a managed thread before accessing any WebSphere APIs. The pattern here is basically: use one or two sessions to pull the messages in, then execute them on a WorkManager using a pool, i.e. decouple the pulling of the messages from the message processing.
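The pull-then-dispatch pattern can be sketched with java.util.concurrent standing in for the async beans WorkManager (an assumption on my part; in WAS-E you would schedule Work on a real WorkManager so the processing threads stay managed):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// One "listener" loop drains the session (a BlockingQueue standing in for a
// JMS session) and hands each message to a pool: receive and processing are
// decoupled, so slow processing never stalls the pull.
public class PullDispatch {
    public static int run(BlockingQueue<String> session, int messages)
            throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4); // "WorkManager" stand-in
        for (int i = 0; i < messages; i++) {
            String msg = session.take();      // pull on the listener thread
            pool.execute(() -> {
                // business logic for msg would run here, on a pooled thread
                processed.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> session = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) session.add("msg-" + i);
        System.out.println(run(session, 5)); // 5
    }
}
```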
I'm going to shortly publish an article showing how JMS + async beans can be used to build some very common patterns in the kinds of advanced applications found in electronic trading systems. More later...
October 02, 2003
When MDBs aren't enough
J2EE 1.3 added support for J2EE applications to statically declare that they need to process incoming JMS messages arriving on a predeclared JMS topic or queue. This works well for a variety of applications but has problems with some advanced scenarios.
The topic/queue for the incoming message is not known when the application is deployed.
Here, MDBs don't work: they require this information to be known in advance, and there is no way around this in the spec. Some people think servlets can spawn threads in a J2EE application, but this is false; the J2EE spec clearly says that this behaviour may not be supported in J2EE servers and hence shouldn't be relied on.
An environment where queues/topics are added and removed frequently.
Here, even if the destinations were known up front, it's difficult to remove a queue/topic or add one without redeploying the application. MDBs don't support dynamic messaging environments.
MDBs only work for JMS messages.
There are a lot of legacy messaging transports which need to be integrated, and wrapping them with a JMS adapter or installing a gateway to bridge between the native and JMS transports may be undesirable. JCA 1.5, which is part of J2EE 1.4, addresses this aspect but doesn't help with the other issues.
The threading model isn't suitable for the application.
JMS internalizes the threading model used by the JMS client, and the MDB container further adds its own threading model on top of this. If that model doesn't match what you need, it won't be good enough. The principal scenarios here deal with message ordering and with when messages from a single queue/topic can and cannot be processed in parallel. Some non-JMS messaging products have more flexible APIs in this regard.
So, given these problems, there are two choices for a vendor trying to help here: support all of the above features generically, which is a lot of work but would be simpler for customers, or provide enough flexibility that the small percentage of customers with these requirements can build a solution using the toolkit/APIs provided with the application server. The latter is the approach taken with WAS 5.0E.
The async beans capability, which allows J2EE applications to take advantage of managed daemon threads as well as pooled threads, transient fast timers and managed callbacks/observers, allows these solutions to be implemented by 'advanced' customers using WAS-E 5.0.
Such applications can take full advantage of the J2EE programming model and run in a managed environment, whereas in the past they were forced to run in a J2SE JVM managed by the customer.
September 07, 2003
JDO and CMP 2.0
A lot of people say JDO is better; vendors say CMPs are the standard. To tell you the truth, I don't think it matters. The bottom line is that CMP 2.0 should perform as well as JDO in all but a single aspect.
CMP objects are managed J2EE components. They have a managed environment: declarative security, an env section in JNDI, etc. JDO objects don't have this. This environment is set up on every method call and removed when every method call returns.
This preinvoke/postinvoke doesn't seem to be required in the JDO model. My experience is that calling pre- and post-invoke on method calls costs around 4-5% CPU path length, depending on the application server. That isn't a lot to me, given the typical sorts of performance problems I've seen; taking this 4-5% out of the path length won't solve them.
As CMP and JDO implementations mature, I expect both to offer the same feature sets in terms of optimizations etc., i.e. both can be made identical in terms of JDBC/caching performance. The remaining difference is this 4-5% cost of supporting a J2EE entity bean versus a plain JavaBean as the component, and like I said, if it comes down to just this being your performance problem, you don't have a problem :)