Excel RTD Server implementation in C#......Part 1

What is an Excel RTD Server?

If you put the search string "Excel RTD Server" into a popular search engine you'll probably get a first hit Microsoft KnowledgeBase Article: 285339

"Microsoft Excel provides a new worksheet function, RTD, that allows you to call a Component Object Model (COM) Automation server for the purpose of retrieving data real-time. This article describes how to use Visual Basic to create a RealTimeData Server for use with Excel's RTD function."

The interface for the COM server is described in the article as:

"ServerStart
Called when Excel requests the first RTD topic for the server. ServerStart should return a 1 on success, and a negative value or 0 on failure. The first parameter of the ServerStart method is a callback object that the RealTimeData server uses to notify Excel when it should gather updates from the RealTimeData server.

ServerTerminate
Called when Excel no longer requires RTD topics from the RealTimeData server.
ConnectData
Called whenever Excel requests a new RTD topic from the RealTimeData server.
DisconnectData
Called whenever Excel no longer requires a specific topic.
HeartBeat
Called by Excel if a given interval has elapsed since the last time Excel was notified of updates from the RealTimeData server.
RefreshData
Called when Excel is requesting a refresh on topics. RefreshData is called after the server notifies Excel that updates exist, and it returns a count of the topics to update along with the topic id and value for each topic."

The complexity is (in my view) around three of these methods.  RefreshData is called by Excel to PULL data into Excel.  But Excel only calls RefreshData after a PUSH notification has been sent to Excel by the RTD Server.  By way of analogy - the push notification is like receiving a postcard from the mailman telling you that there is a parcel waiting for you.  When you are ready to collect the parcel, you call RefreshData.  This hybrid push/pull model has benefits in that the single threaded user interface thread of Excel has to be respected.  Remember that if you click F2 on a cell and it enters edit mode then the entire cell based user interface is locked.  This means that code that sends data to Excel has to be aware of this reality and work within these constraints.

The other issue is around ConnectData/DisconnectData.  Each time a new call to RTD is made from a worksheet the ConnectData method implemented in your RTD Server will be called.  In C# this is created as:

IRtdServer.ConnectData(int TopicID, ref Array Strings, ref bool GetNewValues)

So, the values in topic1...topic 253 will be based as strings within the Array.  So potentially over time a large number of these calls will be made.  Now, think about how you would handle this when two cells are identical.  Let's take an edge case - I put the same formula into every cells in an Excel 2007 worksheet 
Rows=1048576
Columns=16384  
= 17,179,869,184 cells in total

The RTD Server code has to be able to handle this and not blow up.  Hence the way that the ConnectData array is processed is critical to performance. If you tried to subscribe to the same market data element 17 billion times you'd probably find that it acts like a massive memory leak or it simply crashes Excel.
The DisconnectData method has to simply be managed such that there is no chance to leak memory.  In one RTD implementation I have seen (not mine), each time a new call to ConnectData is made some memory was allocated.  This memory was not freed by DisconnectData.  Hence this would cause failure towards the end of the trading day as the PC running Excel ran out of memory.  This can be tested by adding a large number of RTD calls from a spreadsheet and then deleting the cells while monitoring memory usage.

This leads to the point that the connection from the Excel user interface via the RTD function and RTD Server must be optimised to ensure that it can scale up and down properly and that there is no path-dependent bug introduced by adding and removing RTD function calls in the user interface. 
At the bottom of  Microsoft KnowledgeBase Article: 285339  is a link to Microsoft KnowledgeBase Article:284883 entitled: "RTD Server does not send update notifications to multiple Excel instances". Quoting liberally from this:
"When you use multiple instances of Microsoft Office Excel together with your RealTimeData (RTD) server, you may receive the following message:
The real-time data server 'servername.classname' is not responding. Would you like Microsoft Excel to attempt to restart the server?
Therefore, your RTD server is unable to send update notifications to multiple instances of Excel."
Further:
"You may receive this message if your RTD server is an ActiveX EXE that is built for MultiUse instancing, which is a default setting for ActiveX EXE projects in Visual Basic. An ActiveX EXE that is MultiUse can be shared among clients. Separate Excel instances cannot share RTD servers. When Excel starts an RTD Server, Excel calls the RTD Server ServerStart method and then passes it a reference to the CallBack object for that instance of Excel. Therefore, if two instances of Excel try to share the same RTD server, the second instance replaces the CallBack object for the first instance. This behavior invalidates the CallBack object for the first instance."

And the resolution suggested:
"Use SingleUse instancing for ActiveX EXE components that will act as RTD servers for Excel. When you build ActiveX EXE components as SingleUse, each instance of Excel has its own instance of the RTD server.  Another solution is to use an ActiveX DLL for your RTD server instead of an ActiveX EXE. ActiveX DLLs load in the same process space as their clients, and each instance of Excel always has its own instance of the RTD server."

This seems reasonable but then pushes us to a different problem.  If the way to resolve the issue is to pull the RTD Server into the Excel process space then we end up with multiple instances of the RTD server running.  The problem with that model is that the code that sits behind the COM RTD Server is then duplicated for each RTD server instance. If that code connects to third party libraries such as TIB R/V, Solace, Tervela, Terracotta Universal Messaging, IBM MQ Series then you'd getting into a potential minefield.  The solution is to make sure that you understand not just the threading model of your implementation language but also that of your middleware and COM.  Hence why RTD Servers are easy to implement in a  reference implementation but quite a bit harder in a field implementation.
If you look within Excel at the formula helper you see the following:

Excel 2007 RTD Function Wizard User Interface


The above image shows the key parts of how to call an Excel RTD Server from the Excel user interface.
Excel 2007 RTD Function Help Description

The above extract from the help file shows that Excel 2007 supports between 1 and 253 parameters to uniquely identify a request to make to the RTD Server.
If we look at the Excel user interface we see some disagreement on the number of parameters:



Excel 2007 user interface suggested a maximum of 38 parameters






Take a classic use-case for Excel RTD Servers - that of publishing real-time trading information from a Trading Platform such as Fidessa, CRD, Orc or similar to an Excel spreadsheet to allow for a dashboard to be created by a someone in trading or middle office.  Imagine that we know the order identifier for an order to trade a particular stock.  So an Excel RTD function call may look like:=RTD("vendor.rtdserver","","ID2313451","Executed_Quantity") 

Equally a realistic implementation could look like this:
=RTD("MyComAddIn.Progid",,"ID2313451","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",
"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",
"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",
"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",
"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",
"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","Executed_Quantity")

This more subtle challenge is now becoming clearer - the namespace of the function needs to be defined in a sensible way.  What does this mean?
Both of the above cases could be the correct implementation for a particular case to see the executed quantity for an order with the ID 2313451.  The issue clearly is to allow the user to understand the topics that are listed and what can be done with them.
There are a number of ways to do this:
  • User documentation to show the user that, for example, to see the price/earnings ratio for a stock they need to put the instrument identifier in topic 1 and the string "p_e_ratio" in topic 75
  • Helper functions written in VBA to simplify this - so a new function could be written as =Get_PERatio_ByRIC("VOD.L")
The point here is that the technology challenge is part of the problem, the main part (in my experience) is understanding the use-cases and business requirements alongside the data that is available from the back-end system.
In summary:
  • The ConnectData/DisconnectData issues described above need to be addressed for a robust implementation. 
  • The RefreshData usage has to fit within the constraints of the Excel run-time as well as the RTD threading model and the 'back-end' threading model. 
  • The namespace model and/or helper user-defined-functions need to be documented fully.
  • The subtleties of bridging from C# (or Java or whatever other implementation language)
    to COM must be understood.
A follow-on post will show a simple implementation of an RTD Server in C#.


Further suggested reading:


Comments