March 01, 2006
A new way to build oltp apps
Many applications built today struggle to meet their performance requirements when they follow traditional best practices. Customers still put a stateless application in front of a database or host. They can spend a fortune making the database box bigger (more CPUs, more clock speed, more memory), but scaling it like that is the most expensive way to do it. It works up to a point, but cost-wise it becomes exponentially more expensive as the box gets larger.
I'm frequently given an application built using a conventional approach and then asked how the ObjectGrid will make it faster. This is a tough call. If the application runs on stateless servers but modifies data from all over the place, then caching is a tough thing to do given all the invalidation going on. The invalidation traffic and frequent cache misses mean the result will not be optimal when compared with an application designed for this from the ground up.
Partitioning is a way to build scalable applications, and now with the ObjectGrid customers have the opportunity to split their data into slices and run application code colocated with that data.
This is a very fast solution because no data is moving around: the business logic is colocated with the data, i.e. running in the same JVM. Not all applications can work this way. Some must be implemented the conventional way because the data model isn't amenable to partitioning. But many, if we stepped back, could be redesigned to use a partitioned pattern. If the application can be designed like this then huge speedups are possible on commodity hardware. Ideally, the application would be designed with this in mind from the beginning. It's hard to bolt this on.
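To make the pattern concrete, here is a minimal sketch in plain Java. It is not the ObjectGrid API; every name in it (PartitionedStore, partitionFor, execute) is made up for illustration. It assumes data is sliced by hashing a key, and it keeps all partitions in one process, whereas in a real grid each slice would live in its own JVM and the logic would be shipped to that JVM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of key-based partitioning with colocated logic.
// None of these names come from ObjectGrid.
final class PartitionedStore {

    // One partition holds one slice of the data (here: account balances).
    private static final class Partition {
        final Map<String, Long> balances = new HashMap<>();
    }

    private final List<Partition> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new Partition());
        }
    }

    // Route a key to the partition that owns it. floorMod keeps the
    // index non-negative even when hashCode() is negative.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Run the business logic against the owning partition's local map.
    // The logic goes to the data; the data is never copied to the caller.
    <R> R execute(String key, Function<Map<String, Long>, R> logic) {
        Partition owner = partitions.get(partitionFor(key));
        synchronized (owner) { // one writer at a time per slice
            return logic.apply(owner.balances);
        }
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore(4);
        store.execute("acct-17", m -> m.merge("acct-17", 100L, Long::sum));
        long balance = store.execute("acct-17", m -> m.merge("acct-17", -30L, Long::sum));
        System.out.println("acct-17 lives in partition " + store.partitionFor("acct-17")
                + " with balance " + balance);
    }
}
```

The thing to notice is that a given key always routes to the same slice, so work on different slices never contends. Logic that has to touch keys in two different slices is exactly the case where the data model stops being amenable to partitioning.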
Adding on performance at the end is a tough thing to do. It's much easier to think up front about building the application in a grid OLTP style, with partitioning and application code running colocated with the data. This isn't conventional wisdom of course, except in the old days when basically everything was colocated and, surprise, surprise, it was fast. If I look at applications I wrote back in the early 90s and compare them with what runs on the huge boxes of today (even a modern 2 way), it staggers me to see the low performance now achieved on such a high performance box when you look at what we did on an old 90s RISC Unix box with 256MB of memory.
One approach often peddled is stored procedures. I saw a story advocating this on the server side the other day. Stored procedures are an attempt to do this with databases, but I don't buy into it. The programming model is limited, it's still limited to vertical-only scaling, databases are an expensive license to buy, big boxes are big bucks, and usually you can't integrate stuff into a stored procedure. Developers want to use aspects, JMS, JMX, Spring and make threads, and database guys will cringe at the thought of that kind of code running in the database. The conventional application OLTP patterns are basically geared to sell more relational databases and bigger SMP boxes. Memory, CPU power and networks are getting cheaper every year, while SMP boxes, though falling in price, are still expensive.
If we can think differently then we can leverage the network infrastructure, the cheap CPU power, and the memory in farms to build applications that are distributed using well defined, robust patterns that work (implemented by middleware such as ObjectGrid) and that can deliver OLTP performance that crushes a conventional system. But we need developers to start thinking like this for it to take off. We need middleware that does all the hard stuff while the developers design the business logic.
I'm seeing companies with TB databases and week-long round trip times on queries that are looking to drop those round trip times from a week to real time. A farm with enough redundant memory to host a TB database is probably 250 2 ways with 8GB of memory each. That's about 2TB of memory and costs about one million dollars (4k per blade with discounting). That's 500 processors (3.6GHz Pentium processors). A one million dollar SMP will be nowhere near as fast or as available as this farm with the right middleware. If you need real time access to a TB of data for specific applications, this will likely work better than a conventional approach. This won't happen on a conventional architecture. We need to start thinking outside the box. Middleware is starting to evolve and we need application architects to start evolving with it :)
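The farm sizing above is easy to check. The short Java sketch below just multiplies out the figures from the post (250 blades, 2 CPUs and 8GB each, about 4k dollars per blade). The class and method names are made up, and it assumes 1TB = 1000GB to match the rough numbers in the post.

```java
// Hypothetical helper that multiplies out the farm figures quoted in the post.
final class FarmSizing {
    static final int BLADES = 250;
    static final int CPUS_PER_BLADE = 2;          // "2 way" blades
    static final int MEMORY_GB_PER_BLADE = 8;
    static final int PRICE_USD_PER_BLADE = 4_000; // "4k per blade with discounting"

    static int totalMemoryGb() {
        return BLADES * MEMORY_GB_PER_BLADE;
    }

    static int totalCpus() {
        return BLADES * CPUS_PER_BLADE;
    }

    static int totalCostUsd() {
        return BLADES * PRICE_USD_PER_BLADE;
    }

    // How many full in-memory copies of a dataset the farm can hold.
    // Assumption: the dataset size is given in decimal GB (1TB = 1000GB).
    static double copiesOf(double datasetGb) {
        return totalMemoryGb() / datasetGb;
    }

    public static void main(String[] args) {
        System.out.println("memory GB: " + totalMemoryGb());
        System.out.println("CPUs:      " + totalCpus());
        System.out.println("cost USD:  " + totalCostUsd());
        System.out.println("copies of a 1000GB dataset: " + copiesOf(1000.0));
    }
}
```

On those figures the farm holds 2000GB across 500 CPUs for one million dollars, which is room for two full copies of a 1TB dataset. That is one way to read "enough redundant memory": a primary plus one replica of every slice.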
Listed below are links to weblogs that reference A new way to build oltp apps:
» Scaling through Distribution and ObjectGrid from Dan Creswell's Weblog
I've been saying this for a while, as has Adam Bosworth. Seems like it's starting to get some mainstream acceptance judging by what Billy is saying. Of course, Cameron would say Coherence got there first, which is probably fair. No fighting, lads. [Read More]
Tracked on Mar 2, 2006 7:29:15 AM
Very interesting and thought-provoking piece. Somehow this reminded me of a presentation by Paul Strassmann ("Google: Model for the Systems Architecture of the Future") available at http://www.strassmann.com/ - have you seen it?
Google is a great example of a systems architecture that was built from scratch for a specific purpose, without any legacy - the ideal clean sheet. It's also the perfect example of always colocating data and processing.
Now the fascinating issue that Strassmann does not even begin to tackle is: how does the Google model apply, mutatis mutandis, to the modern enterprise? How can we apply this model on a smaller scale for solving complex data processing issues in, say, finance or telecom? What kind of new middleware do we need for that? Are application servers still relevant in that model? There are more questions than answers right now, which is why the topic is actually interesting.
Posted by: Alain Rogister | Mar 2, 2006 5:23:08 AM
We'll still have application servers but they will be different from today's. The middleware still has to contain the applications, but it's a different model.
Posted by: Billy | Mar 2, 2006 7:27:38 AM
"Now the fascinating issue that Strassmann does not even begin to tackle is: how does the Google model apply, mutatis mutandis, to the modern enterprise?"
The key thing about most of what Google has done is that it's based on a collection of architectural principles that aren't in common use across the enterprise. They are, in fact, almost at the other end of the spectrum. They have actually been around and discussed for quite some time.
I wrote a blog about this a while back that contains a whole bunch of interesting reference links that you might find useful:
I actually think a fair amount of the implementation puzzle that goes with it is already lying around. On app servers, well that term has become synonymous with J2EE IMHO. If you backed away from that and looked at the role the app server performs, I think it's clear that role will still be present in some form or another.
Posted by: Dan Creswell | Mar 2, 2006 7:27:46 AM
How can we learn more about these solutions, and how to apply them to massive OLTP? In particular how would this approach help in the classic banking transaction, Debit account A + Credit account B?
Posted by: Ken Norcross | Mar 13, 2006 9:50:23 AM