Buy-side analytics: Beyond TCA

Most buy-side firms now recognise the need for transaction cost analysis (TCA), which we have covered here before, and for performance measurement and attribution.  This post covers both topics but focuses on a newer analytic area for the buy-side - behavioural analytics.

In essence, TCA helps a buy-side firm understand whether its process is working well, from making the investment decision, through the time taken to commence trading, to completion of execution.  It therefore measures the performance of the trading process and, indirectly, of the trading systems.  An extract from the previous post lists some providers:
  1. ITG Analytics
  2. Markit TCA
  3. Elkins McSherry TCA
  4. LiquidMetrix TCA
  5. Abel Noser Trade Analytics
[There are more - there's a nice list on the Greyspark website here.]

Performance Measurement and Attribution
Performance measurement and attribution is used to measure the performance of an investment portfolio and to attribute where that performance comes from.  As a simple example, the performance measurement part will tell you that a portfolio has outperformed a benchmark by a certain amount, and the attribution part will help you understand whether this outperformance was due to picking the right stocks, the right sectors, market timing or other factors.
  1. DST Anova
  2. Bi-Sam
  3. SS&C Sylvan
  4. Eagle Performance
  5. StatPro Revolution
  6. CIBC Mellon Performance Measurement
  7. Market Street Advisors Archer 
  8. Sungard
  9. MSCI
  10. Wilshire Analytics
Behavioural Analytics
  1. Essentia Analytics
  2. Cabot
  3. Inalytics
So how does this all work for TCA?
For equities... 
A recurrent theme here is that it's all about the data - the computer science basic rule of "garbage in equals garbage out" applies here to a great extent.  Let's work through a simple example.

At time t a portfolio manager decides to sell stock x at a limit price of $5.08.  At t the last traded price for x was $5.10 and the market bid/ask is $5.09/$5.11.

The portfolio manager works in a large asset management firm that has a chunky sized enterprise class order management system such as Fidessa Minerva, Charles River Trader, Eze OMS or similar.  So the order is created by the PM and routed to a trading desk within the firm.  The order is then accepted by a dealer at time t+a1 and the dealer then has to process the order, perhaps releasing it to a trading algorithm, DMA or to a care desk at time t+a2.
So there are a series of datapoints here.  First, the economic information of the order (stock, size, side, order conditions, limit price if any and so on).  Second, the non-economic information of the order - the timestamps generated at different points in the order lifecycle.  Third, the economic information of the trade - the data around the execution(s) for the order.
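As a rough illustration of those datapoints, the order could be held in a small record like the one below.  This is only a sketch: the class name, field names, quantity and timestamps are all hypothetical and do not come from any particular OMS.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class OrderRecord:
    # Economic information of the order
    symbol: str
    side: str                      # "buy" or "sell"
    quantity: int
    limit_price: Optional[float]   # None for a market order
    # Non-economic information: lifecycle timestamps
    created_at: datetime           # t: PM creates the order
    accepted_at: datetime          # t + a1: dealer accepts the order
    released_at: datetime          # t + a2: dealer releases to algo, DMA or care desk

    def acceptance_delay(self) -> timedelta:
        """a1: time the order waited before a dealer accepted it."""
        return self.accepted_at - self.created_at

    def release_delay(self) -> timedelta:
        """a2: time from order creation to release by the dealer."""
        return self.released_at - self.created_at


t = datetime(2024, 1, 15, 9, 30, 0)   # illustrative timestamp only
order = OrderRecord(
    symbol="X", side="sell", quantity=1000, limit_price=5.08,
    created_at=t,
    accepted_at=t + timedelta(seconds=45),
    released_at=t + timedelta(seconds=120),
)
print(order.acceptance_delay(), order.release_delay())
```

Keeping a1 and a2 as explicit timestamps rather than pre-computed delays means the same record can later be lined up against market data at each point in the lifecycle.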
In simple terms the process of TCA takes the datapoints and looks at the times of executions and the quantity executed versus the displayed market depth at that time.  One key issue here of course is the quality, consistency and completeness of the dataset: if a large percentage of volume in a market is executed off the order book then using order book only information may give a misleading picture of execution performance.  What TCA should be able to show is whether the interaction with the market is flawed in certain clear ways.  So, if a large (as a percentage of average daily volume) order is just sent to a central limit order book then one would expect that algorithmic traders would jump ahead of this order and reduce execution performance.
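To make this concrete, here is a minimal sketch of one of the simplest TCA measures: slippage of the average execution price versus the mid price at the time of the investment decision.  It is deliberately simpler than a comparison against displayed depth.  The fills are invented for illustration; only the $5.09/$5.11 quote (mid $5.10) comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class Fill:
    quantity: int
    price: float


def vwap(fills):
    """Volume-weighted average execution price of the fills."""
    total_qty = sum(f.quantity for f in fills)
    return sum(f.quantity * f.price for f in fills) / total_qty


def slippage_bps(fills, arrival_mid, side):
    """Signed cost versus the arrival mid in basis points (positive = cost)."""
    sign = 1 if side == "buy" else -1
    return sign * (vwap(fills) - arrival_mid) / arrival_mid * 1e4


# Hypothetical executions for the sell order in the example
fills = [Fill(600, 5.09), Fill(400, 5.08)]
cost = slippage_bps(fills, arrival_mid=5.10, side="sell")
print(f"average price {vwap(fills):.3f}, slippage {cost:.1f} bps")
```

A real system would compute this against several benchmarks (arrival, interval VWAP, close) and against the timestamps t, t+a1 and t+a2 separately, so that delay on the desk can be distinguished from cost in the market.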
The challenge is in taking a huge amount of data and rendering it into actionable information.  This is where many TCA vendors fall down by providing too much data and not enough information.  This is where good visualisation such as Datawatch (formerly Panopticon) or D3 can be extremely useful.

For other exchange traded instruments such as options and futures
These are similar to equities in the sense that there is an exchange and therefore a stream of pricing information.

For foreign exchange and fixed income
The problem here is simple - how do you compare an executed price level to a "market" when the market is decentralised and trades over-the-counter?  There are a number of ways that firms perform this function but there is a greater degree of uncertainty in the significance of the results.

And for Performance Measurement and Attribution?
For a regular index or composite benchmark equity portfolio
Another simple example.  An equity investment portfolio of size z has a mandate to invest in the FTSE 100 stocks and to maintain a cash balance of as near to zero as possible.  As such, a decent benchmark may be 99% of the performance of the FTSE 100 plus 1% of the performance of cash.
At start of day on day d the portfolio is valued at v1.
After one year the portfolio is valued at v365.
If there are no portfolio inflows or outflows during this year then the portfolio value has increased in percentage terms by 100 * (v365 - v1)/v1.
(If there are inflows and/or outflows then Modified Dietz is your friend)
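Both calculations can be sketched in a few lines.  The Modified Dietz version below uses the standard day-weighted form, in which each flow is weighted by the fraction of the period it was in the portfolio; the function names and the sample numbers are illustrative assumptions.

```python
def simple_return_pct(v_start, v_end):
    """Percentage return when there are no inflows or outflows."""
    return 100.0 * (v_end - v_start) / v_start


def modified_dietz_pct(v_start, v_end, flows, period_days):
    """Modified Dietz percentage return.

    flows is a list of (day, amount) pairs, where day is the number of days
    after the start of the period; inflows are positive, outflows negative.
    """
    net_flow = sum(amount for _, amount in flows)
    # Each flow is weighted by the fraction of the period it was invested
    weighted_flow = sum(
        amount * (period_days - day) / period_days for day, amount in flows
    )
    return 100.0 * (v_end - v_start - net_flow) / (v_start + weighted_flow)


# Hypothetical numbers: value 100 -> 120 with an inflow of 10 on day 73
print(simple_return_pct(100.0, 110.0))
print(modified_dietz_pct(100.0, 120.0, [(73, 10.0)], period_days=365))
```

With no flows the two functions agree, which is a handy sanity check when validating loaded data.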
That is (a simple example of) the performance measurement part.  The attribution part requires more data - specifically the index data for a benchmark or composite applicable to the portfolio. 
As an extreme example, taking the above example, let's imagine that the portfolio manager invested 100% of the portfolio into a single stock (yes, not likely, but it's just an example).  There are two investment decisions at play here:
  1. Asset Allocation.  The manager has invested 100% in equities and 0% in cash, whereas the benchmark is 99% equities and 1% cash.
  2. Stock Selection.  The manager has invested all of the portion invested in equities in one stock rather than spread between the stocks in the FTSE 100 index.  
Performance attribution breaks this down into two or three effects:
  1. Stock selection effect
  2. Asset allocation effect
  3. (Sometimes) Interaction effect
The interaction effect is problematic since it is not clearly tied back to a particular decision by the portfolio manager, whereas picking stocks and sectors is clearly regarded as a key deliverable of the role of a portfolio manager.
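One common way to compute these three effects is the arithmetic Brinson-Hood-Beebower formulation, sketched below for the single-stock example.  Which model a given vendor uses is not stated here, so treat the choice of formulation as an assumption; the returns (20% for the single stock, 8% for the index, 1% for cash) are invented purely for illustration.

```python
def brinson_effects(segments):
    """Arithmetic Brinson-Hood-Beebower attribution.

    segments maps a name to (w_port, w_bench, r_port, r_bench), with weights
    and returns as decimals.  Returns total (allocation, selection, interaction).
    """
    allocation = selection = interaction = 0.0
    for w_p, w_b, r_p, r_b in segments.values():
        allocation += (w_p - w_b) * r_b           # over/underweighting the segment
        selection += w_b * (r_p - r_b)            # picking within the segment
        interaction += (w_p - w_b) * (r_p - r_b)  # the cross term
    return allocation, selection, interaction


# Hypothetical returns; the weights follow the 100%/0% vs 99%/1% example
segments = {
    "equities": (1.00, 0.99, 0.20, 0.08),
    "cash":     (0.00, 0.01, 0.01, 0.01),
}
allocation, selection, interaction = brinson_effects(segments)

# The three effects should sum to the active return (portfolio minus benchmark)
r_port = sum(w_p * r_p for w_p, _, r_p, _ in segments.values())
r_bench = sum(w_b * r_b for _, w_b, _, r_b in segments.values())
print(allocation, selection, interaction, r_port - r_bench)
```

The interaction term is the product of the two active decisions, which is exactly why it cannot be pinned on either one.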
In terms of the number crunching required, the main datapoints for equity performance measurement and attribution are:
  1. Stock and cash positions at start of period (period being defined by the degree of granularity required - nowadays this may be at the level of intraday, historically often on a month-by-month basis).
  2. Stock and cash positions at end of period
  3. Cash inflows during the period
  4. Cash outflows during the period
  5. Stock prices at start of period
  6. Stock prices at end of period
  7. Cash interest rate over the period
  8. Index/indices weights at start of period
  9. Index/indices weights at end of period
  10. Index/indices changes during period
  11. Index/indices level at start of period
  12. Index/indices level at end of period
  13. Composite index weights at start of period
  14. Composite index weights at end of period
  15. Composite index changes during period
Provided all this data can be gathered then it's a case of number crunching to get the answer at the end. Typical systems in this space I have seen and worked with have worked on a gather/clean & validate/calculate basis.  By this I mean, data is sourced from multiple systems and loaded into a relational database backend.  The application is then configured to run on a scheduled/batch basis to clean any data items, calculate the required results and then persist them back to the relational database. As such these applications are prone to the well known issues of batch processing such as batch windows, batch failures and so on.
A more interesting solution would be to move to a real-time basis and/or a SaaS platform where compute could be hired at low price points during the time between the end of a reporting period and the time when reports are due to be sent to clients.

For a fixed income portfolio
There are a few challenges here:
  1. Bond valuation.  For an equity a simple measure is used - the mid-price on a particular date.  That's often flawed since if the holding of the equity is large relative to the available market liquidity then there would be market impact if a portfolio manager actually tried to trade at that level.  However, this is an accepted limitation.  The challenge for fixed income is that there is not really a worthwhile mid price available for the majority of bonds in size. As such this is a matter for debate and in some cases this is a modelling exercise rather than a true valuation exercise.
  2. Universe size.  For an equity portfolio the universe is generally clear and quite small. For fixed income portfolios the universe can be huge (see for example here).
  3. The nature of fixed income returns is different.  An equity portfolio has stock selection and asset allocation effects. A fixed income portfolio has returns from changes in spreads, changes in the slope of the yield curve and overall yield curve level.
  4. In some cases, rather than benchmark to a constituent level index a Fixed Income portfolio may look to other measures such as yield or duration.
The overall effect is that while there is a thriving Fixed Income attribution ecosystem, a series of different caveats is required when looking at the output reports.

And for Behavioural Analytics?
This is a newer field and so merits a bit more explanation.  In essence, the point here is to try to systematically analyse the overall capabilities of an investment manager.  Typically this involves looking at the trades made by the portfolio manager in light of the situation at the time the investment decision was made.  This is slightly different to performance attribution.  Let's look again at the example above.  Why did the portfolio manager put 100% of the portfolio into one stock?  One could see that as being a great big bet by the portfolio manager.  Suppose that in the above example it paid off - is this repeatable?  In other words, did the portfolio manager effectively trade based on the flip of a coin or was there method and reason to the trade?
Now clearly, one could ask the portfolio manager "why did you do that?" but that is a method fraught with biases from the portfolio manager and not good science.  A more reasoned way is to look at the numbers and the facts. 
So, in a normal investment portfolio one could look at the trades done over a year and compare the performance at a general level.  So rather than trying to establish if the portfolio manager is a good stock picker and/or sector selector, look at the performance of the stock before and after selling.  Again, an example.  Say a portfolio manager holds stock W at the start of the year and after six months sells W and places the funds raised into stock X, which is held until the end of the year.  So, at the end of the year we have six data points to examine.
  1. Price of W at start of year = W0
  2. Price of W at six months into the year  = W6
  3. Price of W at end of the year  = W12
  4. Price of X at start of year  = X0
  5. Price of X at six months into the year = X6
  6. Price of X at end of the year = X12
If the portfolio manager is good, then one would hope to see that after selling W the price of W falls (or rises by less than X) and that after buying X the price of X rises (or falls by less than W).  The point here is that the analysis of the portfolio manager covers not just the current constituents of the portfolio but also what used to be in the portfolio.
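The six data points reduce to a simple comparison, sketched below with invented prices.  The function names are hypothetical; the idea is just to compare what X did after the switch with what W would have done had it been kept.

```python
def period_return(start_price, end_price):
    """Simple price return over a period, as a decimal."""
    return end_price / start_price - 1.0


def switch_value_added(w6, w12, x6, x12):
    """Return of X (bought) minus return of W (sold) over months 6 to 12.

    Positive means the switch from W into X added value.
    """
    return period_return(x6, x12) - period_return(w6, w12)


def hindsight_value_added(w0, w6, x0, x6):
    """Same comparison over months 0 to 6, before the switch.

    Positive means X was already beating W while W was still held.
    """
    return period_return(x0, x6) - period_return(w0, w6)


# Hypothetical prices for W0, W6, W12 and X0, X6, X12
W0, W6, W12 = 8.0, 10.0, 9.0
X0, X6, X12 = 19.0, 20.0, 23.0
added = switch_value_added(W6, W12, X6, X12)
print(f"value added by switching from W to X: {added:.2%}")
```

A full treatment would also account for dividends and corporate actions, which are ignored in this sketch.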
This sort of analysis on a per stock level is interesting but not sufficiently illuminating to assist high level analysis.  The way to proceed is to aggregate the data into a visualisation.  So, for every stock sold or bought, look at the performance of the stock before and after the trade.  I would then create a heatmap to show under/over performance and the percentage of portfolio traded at the time.  It's important to scale this data - if a portfolio manager makes great decisions for small trades but bad decisions for large trades then there is an identifiable issue which can be looked at in more depth.  An ideal system would have a visualisation library to allow individual users to see data in a way that is intellectually rigorous but also intuitive for the observer.
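The numbers behind such a heatmap could be produced along the following lines.  This is a sketch only: the bucket boundaries, labels and trade data are arbitrary assumptions, and a real system would feed the result into a visualisation library rather than print it.

```python
from collections import defaultdict


def bucket_label(pct_of_portfolio):
    """Assign a trade to a size bucket (boundaries are arbitrary)."""
    if pct_of_portfolio < 1.0:
        return "small (<1%)"
    if pct_of_portfolio < 5.0:
        return "medium (1-5%)"
    return "large (>=5%)"


def average_by_size(trades):
    """Average post-trade relative return per size bucket.

    trades is an iterable of (pct_of_portfolio, relative_return) pairs, where
    relative_return is the value added by the trade as a decimal.
    """
    grouped = defaultdict(list)
    for pct, rel_return in trades:
        grouped[bucket_label(pct)].append(rel_return)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


# Hypothetical trades: good small decisions, poor large ones
trades = [(0.5, 0.04), (0.8, 0.02), (3.0, 0.01), (7.5, -0.06), (6.0, -0.02)]
summary = average_by_size(trades)
for label, avg in summary.items():
    print(f"{label:>14}: {avg:+.2%}")
```

In this made-up dataset the small trades add value and the large ones destroy it, which is the kind of pattern the scaling is meant to expose.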

There are a range of further analytics possible here - one could be to judge performance of a holding versus an ex-ante price target.  In some cases, portfolio managers sell when a target is hit; sometimes they will hold, anticipating further gains.  Can an analysis of sales made and not made when targets are hit reveal a behavioural bias?  That could be holding a position for too long or selling too quickly.  If the analysis shows that the positions are sold too soon, then perhaps the evidence points to research and analytics providing target prices that are not punchy enough.
A further piece here is to incorporate news into the model.  Take a normalised market news feed (ThomsonReuters and others have them in XML format) and analyse trades in terms of news flow at the time that the trade is proposed and for a period before that time. If a portfolio manager sells at the time of bad news flow it may be too late to sell before the market has corrected and so perhaps it would be more sensible to hold the position? Let the analysis show some guidance on this...
The same is true for stock level price data: does a portfolio manager pay too much attention to stock price movements in a micro-period before the trade and not enough attention to the fundamental analysis?  Many years back I worked with a portfolio manager who would issue a sell order to the central dealing desk which he would cancel nigh on immediately if there was an uptick.  I never had the chance to analyse the dataset but it would have been very interesting to see the impact on portfolio performance of his 'nervous tick' as it was termed by the trading desk.

Another way to view this is that the "anti-monde" construct (the counterfactual case - what would have happened instead of what did happen) for this analysis is based predominantly upon the behaviour of the portfolio manager rather than an index or composite.

So how does this work?  The role of the data scientist is key to the delivery of meaningful behavioural analytics but this requires all of the groundwork to be completed - the data requirements for a sensible anti-monde can be onerous for an absolute return portfolio with no index weight style benchmark.  Consider the case where the price of X halves due to a two for one stock split - unless the data is adjusted to compensate this will lead to incorrect conclusions. Hence the GIGO problem.
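The stock split case is easy to demonstrate.  The sketch below back-adjusts prices before the split date so the series is comparable; the price series and function name are invented for illustration.

```python
def adjust_for_split(prices, split_index, ratio):
    """Divide all prices before split_index by the split ratio.

    For a two-for-one split ratio is 2.0, so pre-split prices are halved and
    become comparable with post-split prices.
    """
    return [p / ratio if i < split_index else p for i, p in enumerate(prices)]


def total_return(prices):
    """Simple return from the first to the last price in the series."""
    return prices[-1] / prices[0] - 1.0


# Hypothetical prices for X; a two-for-one split takes effect at index 2
raw = [10.0, 10.2, 5.1, 5.2]
adjusted = adjust_for_split(raw, split_index=2, ratio=2.0)

print(f"unadjusted return: {total_return(raw):+.1%}")      # looks like a collapse
print(f"adjusted return:   {total_return(adjusted):+.1%}")  # the economic reality
```

Without the adjustment the analysis would conclude that buying X was a disastrous decision, when in fact the holder lost nothing on the split.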

One can almost argue that TCA and PMA are rather like crosswords (finite solution set) whereas behavioural analytics can take the form of a murder mystery - tracing the decision making process through the firm.
My view: TCA, Performance Measurement and Attribution and Behavioural Analytics are all in part big data problems that should be looked at collectively.  This is to ensure that overall portfolio outcomes are monitored and managed by the portfolio manager AND that the portfolio manager performance is managed by his or her management team. 
Beyond that, I believe that the combined platform to measure portfolio management should exist as a real-time platform that scales and can be outsourced to a third party datacentre.