Middleware, an appreciation…

Throughout my working life, middleware has played a crucial part.  For many people it’s something of a dark art that is not well understood.  This post sets out a simple use case, some of the solutions middleware offers, a taxonomy of the different types of middleware and a forward-looking analysis.

[For an introduction, Wikipedia has a decent page on middleware, and Peter Egli has written a more in-depth analysis.]
 
A simple use case
So, let’s begin with a simple use case.  Imagine a trading system with a simple topology: two traders, each running a fat client application, and one server that persists state.  The firm has a strict limit: there cannot be more than $3m worth of open orders at any time.  From a start-of-day net flat position, Trader A decides to buy $2m of ZYX Inc.  That is within bounds, so when his fat client passes the order to the server the limit check passes and the trade is completed.  Now Trader B decides to buy $1.5m of ZYX Inc.  He enters an order in his fat client and it is passed to the server.  The server sees that $2m + $1.5m = $3.5m would breach the $3m limit, so the order is rejected.

The message flow therefore looks like this:
  1. A==>Server – buy $2m of ZYX Inc
  2. Server==>A - buy $2m of ZYX Inc done.
  3. B==>Server – buy $1.5m of ZYX Inc
  4. Server==>B - buy $1.5m of ZYX Inc not done.
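
The flow above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Order and Server names are hypothetical, the "network" is a plain method call, and the only rule applied is the $3m limit from the example.

```python
from dataclasses import dataclass

LIMIT = 3_000_000  # firm-wide cap on open orders, in dollars (from the example)


@dataclass
class Order:
    trader: str
    symbol: str
    amount: int  # dollars


class Server:
    """Hypothetical server: keeps the open-order total and applies the limit."""

    def __init__(self, limit: int = LIMIT) -> None:
        self.limit = limit
        self.open_total = 0

    def handle(self, order: Order) -> tuple[str, str]:
        """Return (recipient, status); the reply goes back to the sender."""
        if self.open_total + order.amount > self.limit:
            return order.trader, "not done"
        self.open_total += order.amount
        return order.trader, "done"


server = Server()
reply_a = server.handle(Order("A", "ZYX Inc", 2_000_000))
reply_b = server.handle(Order("B", "ZYX Inc", 1_500_000))
print(reply_a)  # ('A', 'done')
print(reply_b)  # ('B', 'not done')
```

Note that the rejected order leaves the open total untouched at $2m, so a later order of up to $1m would still go through.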
So, we see that there is a use case here: order messages are passed from client to server, and acceptance/rejection messages are passed from server to client.  Now, let’s think a little more about this.  A more productive scenario would be for the server to notify all clients whenever a trade changes the state of the trading limit.  So in this scenario we would see:
  1. A==>Server – buy $2m of ZYX Inc
  2. Server==>A - buy $2m of ZYX Inc done by A.
  3. Server==>B - buy $2m of ZYX Inc done by A.
  4. B==>Server – buy $1.5m of ZYX Inc
  5. Server==>B - buy $1.5m of ZYX Inc not done.
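
A rough sketch of that notification step, again with hypothetical names: each client gets an in-memory inbox standing in for its connection, a completed trade is sent to every inbox, and a rejection goes only to the trader who sent the order.

```python
class BroadcastServer:
    """Hypothetical server that tells every client about each completed trade."""

    def __init__(self, limit: int, clients: list[str]) -> None:
        self.limit = limit
        self.open_total = 0
        # One inbox per client stands in for a network connection.
        self.inboxes: dict[str, list[str]] = {name: [] for name in clients}

    def handle(self, trader: str, symbol: str, amount: int) -> bool:
        if self.open_total + amount > self.limit:
            # A rejection only concerns the trader who sent the order.
            self.inboxes[trader].append(f"buy {amount} of {symbol} not done")
            return False
        self.open_total += amount
        # A completed trade changes the shared limit, so everyone is told.
        for inbox in self.inboxes.values():
            inbox.append(f"buy {amount} of {symbol} done by {trader}")
        return True


server = BroadcastServer(limit=3_000_000, clients=["A", "B"])
server.handle("A", "ZYX Inc", 2_000_000)
server.handle("B", "ZYX Inc", 1_500_000)
print(server.inboxes["A"])  # one message: A's own trade
print(server.inboxes["B"])  # two messages: A's trade, then B's rejection
```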
So, in this case we see more messages passing round the system, since once A has traded $2m of ZYX Inc all clients see an update message.  Extending this further, the server side might do additional calculation and message distribution:
  1. A==>Server – buy $2m of ZYX Inc
  2. Server==>A - buy $2m of ZYX Inc done by A.
  3. Server==>B - buy $2m of ZYX Inc done by A.
  4. Server==>CalculationServer - buy $2m of ZYX Inc done by A.
  5. CalculationServer==>Server - available limit for ZYX Inc is now $1m
  6. Server==>A - available limit for ZYX Inc is now $1m
  7. Server==>B - available limit for ZYX Inc is now $1m
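
To make the seven steps concrete, here is a small sketch that generates them. Everything in it is an assumption made for illustration: the CalculationServer class, the run_trade function and the fmt helper are made-up names, and the messages are just strings collected in a list.

```python
def fmt(amount: int) -> str:
    """Format dollars the way the message lists do, e.g. 1500000 -> '$1.5m'."""
    return f"${amount / 1_000_000:g}m"


class CalculationServer:
    """Hypothetical calculation service that tracks the remaining limit."""

    def __init__(self, limit: int) -> None:
        self.available = limit

    def on_trade(self, amount: int) -> int:
        self.available -= amount
        return self.available


def run_trade(clients: list[str], trader: str, symbol: str, amount: int,
              calc: CalculationServer) -> list[str]:
    """Return the messages generated by one accepted trade, in order."""
    trade = f"buy {fmt(amount)} of {symbol}"
    log = [f"{trader}==>Server - {trade}"]
    # Confirm to the trader first, then tell every other client.
    recipients = [trader] + [c for c in clients if c != trader]
    log += [f"Server==>{c} - {trade} done by {trader}" for c in recipients]
    log.append(f"Server==>CalculationServer - {trade} done by {trader}")
    available = calc.on_trade(amount)
    limit_msg = f"available limit for {symbol} is now {fmt(available)}"
    log.append(f"CalculationServer==>Server - {limit_msg}")
    log += [f"Server==>{c} - {limit_msg}" for c in clients]
    return log


log = run_trade(["A", "B"], "A", "ZYX Inc", 2_000_000, CalculationServer(3_000_000))
for number, message in enumerate(log, start=1):
    print(f"{number}. {message}")
```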
Now, let’s extend this even further to a scenario where any trade is validated against these rules before being executed:
  1. A==>Server – buy $2m of ZYX Inc
  2. Server==>A - buy $2m of ZYX Inc done by A.
  3. Server==>B - buy $2m of ZYX Inc done by A.
  4. Server==>CalculationServer - buy $2m of ZYX Inc done by A.
  5. CalculationServer==>Server - available limit for ZYX Inc is now $1m
  6. Server==>A - available limit for ZYX Inc is now $1m
  7. Server==>B - available limit for ZYX Inc is now $1m
  8. B==>Server – request check for buy $1.5m of ZYX Inc for B
  9. Server==>B – checking for buy $1.5m of ZYX Inc for B
  10. Server==>CalculationServer – check for buy $1.5m of ZYX Inc for B
  11. CalculationServer==>Server – checking for buy $1.5m of ZYX Inc for B
  12. Server==>B – check in progress for buy $1.5m of ZYX Inc for B
  13. CalculationServer==>Server – check failed for buy $1.5m of ZYX Inc for B
  14. Server==>B – check failed for buy $1.5m of ZYX Inc for B
Now, the above scenario is flawed.  While the order for $1.5m of ZYX Inc is being checked it’s possible that another order for ZYX Inc could be received.  So a more realistic scenario, in which every client is told about the pending check, would be:
  1. A==>Server – buy $2m of ZYX Inc
  2. Server==>A - buy $2m of ZYX Inc done by A.
  3. Server==>B - buy $2m of ZYX Inc done by A.
  4. Server==>CalculationServer - buy $2m of ZYX Inc done by A.
  5. CalculationServer==>Server - available limit for ZYX Inc is now $1m
  6. Server==>A - available limit for ZYX Inc is now $1m
  7. Server==>B - available limit for ZYX Inc is now $1m
  8. B==>Server – request check for buy $1.5m of ZYX Inc for B
  9. Server==>B – checking for buy $1.5m of ZYX Inc for B
  10. Server==>A – checking for buy $1.5m of ZYX Inc for B
  11. Server==>CalculationServer – check for buy $1.5m of ZYX Inc for B
  12. CalculationServer==>Server – checking for buy $1.5m of ZYX Inc for B
  13. Server==>B – check in progress for buy $1.5m of ZYX Inc for B
  14. Server==>A – check in progress for buy $1.5m of ZYX Inc for B
  15. CalculationServer==>Server – check failed for buy $1.5m of ZYX Inc for B
  16. Server==>B – check failed for buy $1.5m of ZYX Inc for B
  17. Server==>A – check failed for buy $1.5m of ZYX Inc for B
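
The check part of this flow (steps 8 to 17) can be sketched as below. The run_check function and fmt helper are hypothetical names, and the "check passed" status is my own assumption since the example only shows a failure. The point of the sketch is the message count: one check costs 4 + 3N messages for N clients.

```python
def fmt(amount: int) -> str:
    """Format dollars the way the message lists do, e.g. 1500000 -> '$1.5m'."""
    return f"${amount / 1_000_000:g}m"


def run_check(clients: list[str], requester: str, symbol: str,
              amount: int, available: int) -> list[str]:
    """Return the messages for one pre-trade check, broadcasting every status."""
    what = f"buy {fmt(amount)} of {symbol} for {requester}"
    # Requester first, then everyone else, matching the order in the list above.
    recipients = [requester] + [c for c in clients if c != requester]

    def broadcast(status: str) -> list[str]:
        return [f"Server==>{c} - {status} for {what}" for c in recipients]

    log = [f"{requester}==>Server - request check for {what}"]
    log += broadcast("checking")
    log.append(f"Server==>CalculationServer - check for {what}")
    log.append(f"CalculationServer==>Server - checking for {what}")
    log += broadcast("check in progress")
    # "check passed" is an assumed status; the example only shows a failure.
    result = "check passed" if amount <= available else "check failed"
    log.append(f"CalculationServer==>Server - {result} for {what}")
    log += broadcast(result)
    return log


log = run_check(["A", "B"], "B", "ZYX Inc", 1_500_000, available=1_000_000)
for number, message in enumerate(log, start=8):
    print(f"{number}. {message}")
```

With two clients that is 10 messages for a single check. With 50 clients the same check would generate 154.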

Now, this is getting to the point where it looks more like a real-life workflow.  One key point to note: you very quickly end up with a huge volume of messaging when you are building a trading platform.  Further, consider the case where A is running on Windows 7, B is running within Internet Explorer 11 on Windows 8, Server is C++ on Linux and CalculationServer is Java on Linux.  Writing all of the plumbing code to connect these platforms together would be onerous, and it would need frequent retesting as the individual platforms go through their own lifecycles of upgrades and patches.

So – what is middleware and what does it do here?
Each interaction between client and server can be managed in a number of different ways.  One way would be for a client to FTP a file to a server, where a server process picks up the file, processes it and then writes a response file to another directory.  That sounds pretty crazy but I have seen this done in real-world applications.

A more architecturally sound pattern is for the client applications to communicate with the server using middleware.  So what is this middleware of which I write without explanation?

Simply put: a client or server will have one or more processes running that may need to communicate with one or more other clients or servers.  The software that enables this communication is middleware.  In other words, middleware is the plumbing that gets your data moving around your network.
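
As a toy illustration of the plumbing idea, here is an in-process message bus. It is a sketch, not any real product: the MessageBus class and the "limits.ZYX" topic name are made up, and a real middleware layer would sit between machines rather than inside one process. What it does show is that the publisher never needs to know who is listening.

```python
from collections import defaultdict
from typing import Callable


class MessageBus:
    """Toy in-process stand-in for middleware: routes messages by topic."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to every subscriber of the topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
seen_by_a: list[str] = []
seen_by_b: list[str] = []
bus.subscribe("limits.ZYX", seen_by_a.append)
bus.subscribe("limits.ZYX", seen_by_b.append)

delivered = bus.publish("limits.ZYX", "available limit for ZYX Inc is now $1m")
print(delivered)   # 2
print(seen_by_b)   # ['available limit for ZYX Inc is now $1m']
```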

Where does middleware come from?
The history of vendor-based middleware is rather like that of Fleetwood Mac and progressive rock: lots of promiscuous relationships and people jumping from one firm to another.  A simple example: a firm called Talarian wrote a product called SmartSockets.  They sold out to TIBCO, and the former Talarian COO founded 29West.  29West was in turn sold to Informatica and renamed, while TIBCO EOL’d SmartSockets and migrated as many clients as they could to other TIBCO products.  As such, there is an ecosystem of vendor software, but one that is consolidating as firms are acquired, renamed, integrated and EOL’d.
On the open source side you have to look at two things: JMS and wire protocols.  JMS stands for “Java Message Service” (Wikipedia: http://en.wikipedia.org/wiki/Java_Message_Service).
As with so many things in open source land, JMS is split between interface and implementation.  Only the interfaces are defined, in Java Specification Request 914 (JSR 914).  The challenge for anyone using JMS is that the specification provides an API but no implementation and no wire format.  As such, pretty much all JMS implementations have their own distinct behaviours and message formats and are therefore not generally wire protocol compatible: your JMS is not the same as your neighbour’s.
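
The interface/implementation split is easier to see in code.  The sketch below is in Python rather than Java and does not use the JMS API at all: MessageCodec, JsonProvider and BinaryProvider are made-up names standing in for "one shared interface, two providers".  Both providers satisfy the same interface, yet neither can read the bytes the other puts on the wire.

```python
import json
import struct
from abc import ABC, abstractmethod


class MessageCodec(ABC):
    """Hypothetical shared interface, in the spirit of an API-only standard."""

    @abstractmethod
    def encode(self, destination: str, body: str) -> bytes: ...

    @abstractmethod
    def decode(self, data: bytes) -> tuple[str, str]: ...


class JsonProvider(MessageCodec):
    """Made-up provider that puts JSON text on the wire."""

    def encode(self, destination: str, body: str) -> bytes:
        return json.dumps({"dest": destination, "body": body}).encode("utf-8")

    def decode(self, data: bytes) -> tuple[str, str]:
        obj = json.loads(data.decode("utf-8"))
        return obj["dest"], obj["body"]


class BinaryProvider(MessageCodec):
    """Made-up provider that uses a length-prefixed binary layout."""

    def encode(self, destination: str, body: str) -> bytes:
        dest, payload = destination.encode("utf-8"), body.encode("utf-8")
        return struct.pack(">H", len(dest)) + dest + payload

    def decode(self, data: bytes) -> tuple[str, str]:
        (dest_len,) = struct.unpack(">H", data[:2])
        return data[2:2 + dest_len].decode("utf-8"), data[2 + dest_len:].decode("utf-8")


def can_read(reader: MessageCodec, data: bytes, expected: tuple[str, str]) -> bool:
    """True only if the reader recovers exactly what the writer sent."""
    try:
        return reader.decode(data) == expected
    except (ValueError, KeyError, TypeError, struct.error):
        return False


message = ("orders", "buy $2m of ZYX Inc")
json_bytes = JsonProvider().encode(*message)
binary_bytes = BinaryProvider().encode(*message)

print(can_read(JsonProvider(), json_bytes, message))      # True
print(can_read(BinaryProvider(), json_bytes, message))    # False
print(can_read(JsonProvider(), binary_bytes, message))    # False
```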

In order to resolve this, a number of firms pulled together to create the AMQP organisation; AMQP stands for Advanced Message Queuing Protocol (Wikipedia: http://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol).  The point of AMQP was to create a back-end standard to work with JMS: in other words, the wire protocol to sit behind the JMS interfaces.  With a standardised wire protocol it should be possible for clients and servers using different AMQP implementations to interoperate.
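
To show why an agreed wire format gives you interoperability, here is a sketch with a made-up frame layout.  To be clear, this is not the AMQP frame format; it is a toy layout invented for the example, and vendor_x_encode and vendor_y_decode are hypothetical names.  The two functions are written in deliberately different styles, as two vendors might, but because both follow the same byte layout one can read what the other writes.

```python
import struct

# Made-up frame layout for illustration only; this is NOT the AMQP format:
#   2-byte big-endian destination length | destination (UTF-8)
#   4-byte big-endian body length        | body (UTF-8)


def vendor_x_encode(destination: str, body: str) -> bytes:
    """One hypothetical implementation, written with struct."""
    dest, payload = destination.encode("utf-8"), body.encode("utf-8")
    return struct.pack(">H", len(dest)) + dest + struct.pack(">I", len(payload)) + payload


def vendor_y_decode(frame: bytes) -> tuple[str, str]:
    """A second hypothetical implementation, written with int.from_bytes."""
    dest_len = int.from_bytes(frame[0:2], "big")
    dest_end = 2 + dest_len
    body_len = int.from_bytes(frame[dest_end:dest_end + 4], "big")
    body_start = dest_end + 4
    if len(frame) != body_start + body_len:
        raise ValueError("frame length does not match its length fields")
    return frame[2:dest_end].decode("utf-8"), frame[body_start:].decode("utf-8")


frame = vendor_x_encode("orders", "buy $2m of ZYX Inc")
print(vendor_y_decode(frame))  # ('orders', 'buy $2m of ZYX Inc')
```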

The JMS/AMQP world and the vendor world both generally exist as software implementations written in C++, Java and so on.  There is a third way, which is hardware-based middleware.

We therefore have three distinct groups of middleware products: vendor software, open source software and hardware-based middleware.

Open Source Software
http://www.rabbitmq.com/features.html
http://activemq.apache.org/
http://kafka.apache.org/

[This is a list of "classic" open source middleware but within the open source world there are a number of interesting offerings that are integrating JSON with messaging.  This is of use for taking applications from a best efforts basis to incorporating reliable messaging.]

So – why choose one option over another? The answer to that depends on what you are trying to do…
Simple example: if you are a firm with a mountain of IBM MQ in-house and a great deal of experience with support issues, upgrades and keep-the-lights-on, then why choose anything else? If you are a start-up seeking to minimise expenditure, then why look beyond the open source software? So why look at hardware-based middleware? Simply put, look at it because it’s better. Solace and Tervela are both firms that this blog has looked at in the past.

In a follow-on post I will cover some of the benefits of the hardware-based approach in more detail…

[Updated 15th December, comments in square brackets.]
[Updated 30th December, added Apache Kafka]
