April 17, 2006

Heads up: There are many flavors of query

We're playing around with queries of various kinds right now. So far, we've identified three types of query which I'll quickly cover next:

Conventional one-shot queries

Most people are familiar with this one. It's a conventional SQL query executed against a database, or an EJBQL query executed against an O/R mapper. You run it, it executes, and it returns a result set. When you want new results, you run it again. Once you have the result set, that's it: the query is no longer running.
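To make the one-shot behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `trades` table, its columns and the sample rows are made up for illustration; the point is only that the first result set is a snapshot and you have to run the query again to see new data.

```python
import sqlite3

# In-memory database with a hypothetical `trades` table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, price REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [("IBM", 82.0), ("SUNW", 5.0)],
)

QUERY = "SELECT symbol, price FROM trades WHERE price > ? ORDER BY symbol"

# First run: the query executes once and hands back a finished result set.
first = conn.execute(QUERY, (10.0,)).fetchall()

# The data changes afterwards, but `first` is a snapshot and does not update.
conn.execute("INSERT INTO trades VALUES (?, ?)", ("MSFT", 27.0))

# To see the new row, the application has to run the query again.
second = conn.execute(QUERY, (10.0,)).fetchall()
```

Here `first` still holds only the IBM row after the insert, while `second` picks up MSFT as well.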

Continuous queries

This looks like the query above, but it's continuous. The query language is still SQL or EJBQL. You provide a callback when you execute the query, and it returns a result set, but the query keeps running in the background, updating the result set as the data changes. The callback is called whenever the data used by the query changes. You need to 'stop' the query when you're done to release its resources (CPU on every change and memory for intermediate tables). A continuous query can be cheaper in CPU terms than having the application poll by rerunning the same query every second, because the query engine can cache intermediate results rather than running the whole query from scratch every time.
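Here is a rough sketch of that shape in plain Python. It isn't any real product's API: `Store`, `ContinuousQuery`, `execute_continuous` and `put` are hypothetical names, and the "query" is hard-wired to "symbols priced above a threshold" rather than parsed from SQL. As a simplification, the callback only fires when a change actually alters the result set.

```python
from typing import Callable, Dict, List

Callback = Callable[[Dict[str, float]], None]


class ContinuousQuery:
    """Hypothetical handle for 'symbols priced above a threshold'."""

    def __init__(self, store: "Store", threshold: float, callback: Callback):
        self._store = store
        self.threshold = threshold
        self.callback = callback
        # Initial result set: the query evaluated against the current data.
        self.result = {s: p for s, p in store.data.items() if p > threshold}

    def on_change(self, symbol: str, price: float) -> None:
        # Incremental update: only the changed row is re-evaluated,
        # instead of re-running the whole query from scratch.
        before = self.result.get(symbol)
        if price > self.threshold:
            self.result[symbol] = price
        else:
            self.result.pop(symbol, None)
        if self.result.get(symbol) != before:
            self.callback(dict(self.result))

    def stop(self) -> None:
        # Release the resources held by the running query.
        if self in self._store.queries:
            self._store.queries.remove(self)


class Store:
    """Hypothetical in-memory data store that notifies running queries."""

    def __init__(self) -> None:
        self.data: Dict[str, float] = {}
        self.queries: List[ContinuousQuery] = []

    def execute_continuous(self, threshold: float, callback: Callback) -> ContinuousQuery:
        query = ContinuousQuery(self, threshold, callback)
        self.queries.append(query)
        return query

    def put(self, symbol: str, price: float) -> None:
        self.data[symbol] = price
        for query in list(self.queries):
            query.on_change(symbol, price)


store = Store()
store.put("IBM", 82.0)
events: List[Dict[str, float]] = []
query = store.execute_continuous(10.0, events.append)
store.put("MSFT", 27.0)   # enters the result set, callback fires
store.put("SUNW", 5.0)    # below the threshold, result set unchanged
query.stop()
store.put("ORCL", 14.0)   # query stopped, no further callbacks
```

The cached `result` dictionary stands in for the intermediate tables mentioned above: it costs memory for as long as the query runs, which is why `stop()` matters.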

Temporal queries

An example of this would be IBM's SMILE technology. Temporal queries use SQL but add time-based operators, which allow things like averages over sliding windows (the last 15 minutes, say) or the latest value. They are queries on streams of events where each event represents a version of the underlying data. Again, you provide a callback when submitting the query and receive events as the result set changes. Temporal queries are weird in that they require a full history for the answers to be accurate. For example, if you suddenly ask for "the average price over the last 15 minutes of the top 20 traded stocks by volume" as a continuous query, the answer will be wrong until 15 minutes have passed. It's wrong because the query engine had no way of knowing it should keep the last 15 minutes of history until you gave it the query. If it kept all versions of all the data forever, you'd obviously run out of memory pretty quickly. So it only keeps histories for objects that participate in queries currently running against the dataset. This is in contrast to continuous queries, where you don't need history and the first result set is simply the query run against the current data. Several press articles about SMILE can be found by simply doing a Google search for IBM and SMILE.
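The warm-up problem is easy to see in a toy version. The sketch below is not how SMILE works; `SlidingAverage` and its `complete` flag are hypothetical, and it averages a single price stream rather than the top 20 stocks. History only starts accumulating when the query object is created, so early answers cover less than a full window.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple


class SlidingAverage:
    """Hypothetical temporal query: average price over the last `window` seconds."""

    def __init__(self, window: float, callback: Callable[[float, bool], None]):
        self.window = window
        self.callback = callback
        # History is kept only from the moment the query is submitted.
        self.history: Deque[Tuple[float, float]] = deque()  # (timestamp, price)
        self.started_at: Optional[float] = None  # timestamp of first event seen

    def on_event(self, timestamp: float, price: float) -> None:
        if self.started_at is None:
            self.started_at = timestamp
        self.history.append((timestamp, price))
        # Drop events that have slid out of the window.
        while self.history[0][0] <= timestamp - self.window:
            self.history.popleft()
        average = sum(p for _, p in self.history) / len(self.history)
        # The answer only covers a full window once enough time has passed.
        complete = timestamp - self.started_at >= self.window
        self.callback(average, complete)


results: List[Tuple[float, bool]] = []
query = SlidingAverage(900.0, lambda avg, ok: results.append((avg, ok)))  # 15 minutes

# Timestamps are passed in explicitly (seconds) to keep the example deterministic.
query.on_event(0.0, 10.0)
query.on_event(600.0, 20.0)
query.on_event(900.0, 30.0)
query.on_event(1200.0, 40.0)
```

The first two callbacks arrive with `complete` set to False, because the engine has less than 15 minutes of history at that point. Only from the third event on does the average cover a full window.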


All three of these query types will be useful to customers building advanced applications using data. It's important that architects and developers understand these capabilities and are able to exploit engines that provide them as support for these query types is added to middleware products.
