Friday, 12 April 2013

Oracle Advanced Queues and Streams: A Definition in Plain English

A mathematician, an accountant and an economist apply for the same job. The interviewer calls in the mathematician and asks "What do two plus two equal?" 

The mathematician replies "Four." 

The interviewer asks "Four, exactly?" The mathematician looks at the interviewer incredulously and says "Yes, four, exactly." 

Then the interviewer calls in the accountant and asks the same question "What do two plus two equal?" The accountant says "On average, four - give or take ten percent, but on average, four." 

Then the interviewer calls in the economist and poses the same question "What do two plus two equal?" 

The economist gets up, locks the door, closes the shade, sits down next to the interviewer and says "What do you want it to equal?" 

And now for something completely different: 

What is AQ? Advanced Queuing (often called Advanced Queues), or AQ, is Oracle's messaging solution. AQ provides a persistent (or non-persistent, for specialty apps) queue mechanism that can guarantee delivery of a message. It has interfaces to PL/SQL, OCI and Java. It's Oracle's answer to IBM's MQSeries.

A message can be an XML document, a set of fields, an array of data or just about anything else you can think of.

AQ works on a publish/subscribe model. That means that someone (a publisher) puts a message on the queue and someone else (a subscriber) takes the message off. A queue can have multiple subscribers. Technically it can have multiple publishers too, but I haven't worked with that configuration and I'm not sure how useful it is. I think I would prefer multiple queues, one for each publisher.
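To make the publish/subscribe idea concrete, here is a minimal in-memory sketch in Python. It is not Oracle's API; the ToyQueue class, the subscriber names and the message fields are all made up for illustration. It only shows the behavior described above: one publisher enqueues a message once, and each subscriber dequeues its own copy.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a multi-subscriber queue (hypothetical, not AQ's API)."""

    def __init__(self):
        # subscriber name -> messages that subscriber has not dequeued yet
        self._pending = {}

    def subscribe(self, name):
        self._pending.setdefault(name, deque())

    def enqueue(self, message):
        # The publisher enqueues once; every subscriber sees the message.
        for backlog in self._pending.values():
            backlog.append(message)

    def dequeue(self, name):
        # Oldest undelivered message for this subscriber, or None if empty.
        backlog = self._pending[name]
        return backlog.popleft() if backlog else None


orders = ToyQueue()
orders.subscribe("billing")
orders.subscribe("shipping")
orders.enqueue({"order_id": 1, "amount": 25.0})

billing_msg = orders.dequeue("billing")
shipping_msg = orders.dequeue("shipping")
```

Unlike this toy, a real AQ queue is backed by a table, so messages survive a restart.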

What's AQ good for? What does it do? 

One example would be replication. In Oracle Advanced Replication, AQ is the mechanism that copies data from one instance to another. When the master site (the publisher) receives an update, it puts the update, along with before and after images of the data, on a queue. The slave sites (the subscribers) pull the data off the queue and apply it to the local database. Replication uses the before and after images to find the correct record and to see if there are any update conflicts.

Besides Oracle replication, or your own home-grown replication, there are a lot of other uses for AQ. 

You can drop a message on a queue for local usage. Say you have a transactional system that is getting backed up, but you don't want to turn away incoming transactions. You can implement a queue. The receiving procedure can drop the transactions on the queue, and a local dequeue procedure running in the background can pull them off when it has time.
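The buffering pattern above can be sketched with Python's standard library. This is only an analogy: queue.Queue stands in for the AQ queue (and, unlike AQ, is not persistent), and the receive and drain functions and the txn_id field are hypothetical names I picked for the example.

```python
import queue
import threading

backlog = queue.Queue()  # stands in for the AQ queue
processed = []           # stands in for the real transaction tables


def receive(txn):
    """Receiving procedure: accept the transaction and return right away."""
    backlog.put(txn)


def drain():
    """Background dequeue procedure: works through the backlog at its own pace."""
    while True:
        txn = backlog.get()
        if txn is None:  # sentinel value: no more work is coming
            break
        processed.append(txn)


worker = threading.Thread(target=drain)
worker.start()

for txn_id in range(5):
    receive({"txn_id": txn_id})

receive(None)  # tell the worker to stop
worker.join()
```

The point is that the receiver never waits on the slow part. It only pays the cost of the enqueue.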

You can use AQ to interface with Java. AQ supports the Java Message Service (JMS) API. Using Java, XML and AQ you can easily implement a SOA (service-oriented architecture) web service.

What is Streams? Here's a brief description of Streams and what you can use it for.

AQ and Replication both entail data movement. Streams is the current technology enabling that data movement. Streams is kind of like AQ, but with rules applied. 

Let's think about AQ. AQ is basically a table and some table maintenance code wrapped around Streams. When you enqueue a record, you're using AQ. Streams takes over and moves it to the next database, enqueuing it on a queue there. AQ then takes over again, dequeuing it for consumption.

Streams has some nice features. I think the most important is rule-based transformations. A transformation allows you to modify the payload in flight. A receiving application doesn't need to be aware of the sending application's formats; it just receives what it needs.

Think of the way mainframes send data down to a data warehouse. The mainframe doesn't send entire VSAM files down and let the warehouse figure out which pieces it needs. The warehouse group defines the fields it needs, and a mainframe programmer writes a COBOL (maybe) program that sends a new, specific file down.

With Streams, the receiving application can define what it needs and the sending application can define rules to match. The nice thing with Streams is that multiple consumers can receive the same payload, each with different rules applied. The sender sends one payload and it's transformed many times in different ways for multiple consumers. Rules are also easy to define compared with writing a program, scheduling a batch processing window, writing a load routine, etc. With Streams, you identify the source, define the rules and write a consumer dequeue.
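Here is a rough Python sketch of the "one payload, many transformations" idea. None of this is Streams syntax; the consumer names, the transformation functions and the payload fields are assumptions made up for the example. Each consumer registers the shape it wants, and the sender still sends a single payload.

```python
def for_billing(payload):
    # Billing only cares about the order and the amount.
    return {"order_id": payload["order_id"], "amount": payload["amount"]}


def for_shipping(payload):
    # Shipping wants the address, upper-cased for its label printer.
    return {"order_id": payload["order_id"], "address": payload["address"].upper()}


# consumer name -> transformation applied to the payload in flight
TRANSFORMS = {"billing": for_billing, "shipping": for_shipping}


def propagate(payload):
    """Send one payload; each consumer receives its own transformed version."""
    return {name: transform(payload) for name, transform in TRANSFORMS.items()}


delivered = propagate(
    {"order_id": 7, "amount": 99.5, "address": "1 Main St", "notes": "gift"}
)
```

Neither consumer ever sees the notes field, and neither needs to know the sender's full format.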

Speaking of data warehouses, another use of Streams is change data capture (CDC). You identify a source object, say your transaction detail table in the OLTP system. You can create a rule that says capture all transactions that are approved and billable. Define the billing warehouse as a consumer for that stream. That payload can be applied to a staging table for loading into a warehouse table. The apply to the staging table can be done without coding. With almost zero lines of code you can move the data you want, i.e. billable items, from your OLTP system directly to your warehouse.
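The "approved and billable" rule can be pictured as a simple filter over change records. This Python sketch is only a model of the idea: real Streams rules are defined in the database, and the change-record layout, column names and function names here are all hypothetical.

```python
def billable_rule(change):
    """Rule: capture only transactions that are approved and billable."""
    row = change["new"]
    return row["status"] == "APPROVED" and row["billable"]


def capture(changes, rule):
    """Return the new row images for changes that satisfy the rule."""
    return [change["new"] for change in changes if rule(change)]


# Made-up change records, standing in for what would come from the source table.
changes = [
    {"op": "INSERT", "new": {"txn_id": 1, "status": "APPROVED", "billable": True}},
    {"op": "INSERT", "new": {"txn_id": 2, "status": "PENDING", "billable": True}},
    {"op": "UPDATE", "new": {"txn_id": 3, "status": "APPROVED", "billable": True}},
    {"op": "INSERT", "new": {"txn_id": 4, "status": "APPROVED", "billable": False}},
]

# Rows that would land in the billing warehouse's staging table.
staging = capture(changes, billable_rule)
```

Only transactions 1 and 3 pass the rule, so only those two rows reach staging.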

Streams also puts little stress on the source database. It reads the redo logs and gathers its information from them, as opposed to running queries or DML against the source database's tables.

Streams and AQ are both pretty fascinating technologies, and there is a lot more to them than I've covered here. In the near future I want to show how to do some setup and build a little application using CDC. If you're using AQ in 10g, then you're already using Streams behind the scenes. I think that's the hallmark of good technology: it makes life easier and you never have to see it.

