Bond fund distance, liquidity and trading

Distance is a concept of which we are all aware but few ever quantify except in the easiest way.  I know that London is close to Paris and Berlin is more distant.  We can measure these distances in miles, kilometres, flying time, time taken by train and so on...
[If you have time, a quick read of "Matching engine for corporate bonds?" will provide some background.]
Take two words, "dig" and "dog".  We know that by replacing the letter "i" in "dig" with the letter "o" we get "dog", so they are one letter apart.  For "door" and "roar" we would have to replace "d" with "r" and the second "o" with "a", so they are two letters apart.  More generally, the smallest number of single-letter substitutions, insertions or deletions needed to turn one word into another is called the "Levenshtein distance".  It is used in many internet technology applications: think of the way that search engines suggest corrected spellings to you.
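For anyone who wants to see the idea in code, here is a minimal sketch in Python of the standard dynamic-programming way to compute Levenshtein distance.  The function name `levenshtein` is my own choice for illustration; it is not taken from any particular library.

```python
def levenshtein(a: str, b: str) -> int:
    """Smallest number of single-character insertions, deletions
    or substitutions needed to turn string a into string b."""
    # previous[j] = distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("dig", "dog"))    # 1
print(levenshtein("door", "roar"))  # 2
```

The two printed values agree with the letter counts worked out by hand above.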
So, what's the relevance to bond funds?  Let's take a worked example.  Imagine that the table below shows the holdings of two different bond funds.  On the left is Portfolio One and on the right Portfolio Two.  The letters represent different ISINs (and, at the bottom of the table, currencies) and the numbers are the value of each holding expressed in a common base currency, here GBP.
We can see that both portfolios are valued at £360m and a visual inspection will show that both funds hold £10m each of bonds C and E.  The rest of the holdings are different.
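The same "how far apart are they?" question can be asked of two sets of holdings.  Below is a minimal sketch in Python.  Please note that the holdings are made-up numbers of my own, chosen only so that each portfolio totals £360m and both hold £10m of C and of E; they are not the actual figures from the table.  The two function names are also just illustrative.  The distance used here is half the sum of the absolute differences in holdings, i.e. the value that would have to be sold and re-bought to turn one portfolio into the other.

```python
# Hypothetical holdings in GBP millions, keyed by ISIN letter.
# Illustrative only: chosen to total 360 each with 10 of C and E in both.
portfolio_one = {"A": 100, "B": 90, "C": 10, "D": 80, "E": 10, "F": 70}
portfolio_two = {"C": 10, "E": 10, "G": 120, "H": 110, "I": 110}


def overlap(p1: dict, p2: dict) -> float:
    """Value held in common: for each shared ISIN, the smaller holding."""
    return sum(min(p1[k], p2[k]) for k in p1.keys() & p2.keys())


def holdings_distance(p1: dict, p2: dict) -> float:
    """Half the sum of absolute differences across every ISIN in either
    portfolio: the value that must change hands to make p1 look like p2."""
    isins = p1.keys() | p2.keys()
    return sum(abs(p1.get(k, 0) - p2.get(k, 0)) for k in isins) / 2


print(overlap(portfolio_one, portfolio_two))            # 20
print(holdings_distance(portfolio_one, portfolio_two))  # 340.0
```

With these illustrative numbers only £20m of £360m is held in common, so the two funds are almost as far apart as they can be.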
So, could these bond funds trade with each other?  Put to one side any regulatory issues and look at this from a pure portfolio management perspective.  To trade requires a difference of opinion such that there is a willing seller and a willing buyer.  It would appear that there is a degree of difference of opinion here, in that there is not much commonality of holdings.  Perhaps ownership or investment style could serve as a guide: retail + institutional = trades!
Of course, there is not enough information here to draw a meaningful conclusion; two sets of positions are not enough.  It could be that Portfolio One only invests in very short dated instruments and Portfolio Two only invests in long dated instruments, and that C and E are only in Portfolio Two because they have been bought and held from inception and have not yet hit a lower limit at which the positions must be sold.
So a measure of Levenshtein distance is useful, but not the complete answer.  A real answer is a data analysis of the entire portfolio versus any other portfolio.  The data set to be analysed must include as much data as possible, including:
  • acquisition dates
  • any stock borrow/loan
  • any compliance/regulatory constraints on the portfolio
  • any rules such as no derivatives
The challenge here, of course, is that this is a massive culture shock for a buy-side firm: actually opening up its books and showing someone what it holds.  Of course, the way to do this would be to have a highly compliant utility body that performs this analysis with high levels of encryption, access control, a full audit log and so on.
In essence, each buy-side that wants to participate would expose as much or as little as they want from their in-house systems at a position level and at the order level to the utility.  The utility then has a wonderful big data problem which is challenging but manageable.  And at the end, the buy-side participants will have orders suggested to them on a fuzzy logic matching basis.
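To make the idea of suggested orders a little more concrete, here is a very rough sketch in Python.  Everything in it is an assumption of mine for illustration: the `Order` fields, the participant names, the example orders and the scoring rule.  It pairs opposite-side orders in the same ISIN from different participants and gives each pair a score between 0 and 1 based on how closely the sizes agree, so a match is a matter of degree rather than yes or no.  A real utility would score on far more than size (maturity, issuer, constraints and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    participant: str  # hypothetical participant name
    side: str         # "BUY" or "SELL"
    isin: str
    size: float       # GBP millions, assumed to be positive


def suggest_matches(orders):
    """Return (score, buy, sell) tuples, best score first."""
    buys = [o for o in orders if o.side == "BUY"]
    sells = [o for o in orders if o.side == "SELL"]
    suggestions = []
    for b in buys:
        for s in sells:
            # Only pair the same ISIN, and never a participant with itself.
            if b.isin != s.isin or b.participant == s.participant:
                continue
            # Illustrative score: 1.0 when sizes are equal, lower otherwise.
            score = min(b.size, s.size) / max(b.size, s.size)
            suggestions.append((score, b, s))
    suggestions.sort(key=lambda t: t[0], reverse=True)
    return suggestions


# Made-up example orders.
orders = [
    Order("Fund One", "SELL", "C", 10),
    Order("Fund Two", "BUY", "C", 10),
    Order("Fund Three", "BUY", "C", 4),
    Order("Fund One", "BUY", "G", 5),
]
matches = suggest_matches(orders)
for score, buy, sell in matches:
    print(f"{sell.participant} -> {buy.participant}: {buy.isin} score {score:.2f}")
```

The pairwise loop is fine for a handful of orders; at real scale the utility would index orders by ISIN first, which is where the big data problem starts to bite.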
Conclusion: The more data the buy-side is willing to release to a compliant, agency-basis utility, the greater the likelihood of trading.  This is counterintuitive but, once considered, makes perfect sense.