by Robert Huebscher, AdvisorPerspectives.com
Mankind has landed a spacecraft on a comet 300 million miles away. Yet, after decades of academic research, the challenge of distinguishing skill from luck among actively managed mutual funds has remained largely unsolved.
Much is at stake in this challenge. If skill can be identified, then it is likely to persist, affording clients superior performance. But a manager who is merely lucky will eventually succumb to underperformance.
If rocket science has a counterpart in financial analysis, it is in the quantitative analytics from companies like Boston-based Northfield Information Services. Last week, I spoke with Dan di Bartolomeo, founder and CEO, to see if he could detect skill or luck among the two biggest fixed-income managers: Bill Gross, when he managed the PIMCO Total Return Fund (PTTRX), and Jeffrey Gundlach, manager of the DoubleLine Total Return Fund (DBLTX).
Northfield has been providing risk analysis and tools for portfolio construction to institutional asset managers for 30 years. Among its noteworthy accomplishments, Northfield’s analysis was used by Harry Markopolos to confirm that Bernie Madoff was engaged in a massive Ponzi scheme.
Scores of academicians and commercial vendors have attempted to identify skillful managers. The problem, di Bartolomeo said, is that most people “do this badly” and don’t deal with all the issues in sufficient detail.
Northfield’s methodology was originally published in this 2006 paper. Di Bartolomeo said the published results documented predictive power that was three times stronger than what was previously reported in the academic literature. That predictive power was both economically and statistically significant.
I’ll discuss the results of the Gross versus Gundlach analysis, but first let’s review Northfield’s system for distinguishing skill from luck.
A four-step process
Northfield’s methodology is based on the assumption that skill, once identified, will persist. If a manager’s performance is due to skill, that skill (or lack thereof) will continue. If a manager’s performance is due to luck, however, the best guess for future performance is the average of an appropriately constructed peer group. In other words, if a manager’s outperformance is due to luck, it will eventually revert to the mean.
According to di Bartolomeo, the academic literature has found that performance is persistent over a relatively short time horizon, “one to three years, depending on who you believe.” Northfield tested its results over a one-year time horizon.
Each fund is analyzed using a four-step process. Northfield first determines the appropriate peer group for each fund, using an iterative methodology built on returns-based analysis, a tool first developed by William Sharpe. Di Bartolomeo described this as a “very numerically intensive” process, which uses a large group of funds to find ones that act similarly. For every fund, Northfield determines a distinct, custom peer group.
“Unless you correctly classify funds, there is no persistence in fund performance,” di Bartolomeo said. “If you don’t, you might as well be throwing darts.”
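To make the idea of returns-based analysis concrete, here is a minimal sketch, not Northfield’s procedure. It assumes a deliberately simplified case with only two candidate indices, where the weight that minimizes the variance of the unexplained return has a closed form. The function name `two_index_style_weight` and all return figures are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    # Sample covariance (n - 1 denominator).
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def two_index_style_weight(fund, index_a, index_b):
    """Weight on index_a, constrained to [0, 1], that minimizes the variance
    of fund - (w * index_a + (1 - w) * index_b). Hypothetical helper."""
    y = [f - b for f, b in zip(fund, index_b)]
    x = [a - b for a, b in zip(index_a, index_b)]
    var_x = covariance(x, x)
    if var_x == 0:
        raise ValueError("indices are identical; the weight is not identified")
    w = covariance(y, x) / var_x
    # Clip to the long-only range used in Sharpe-style analysis.
    return min(1.0, max(0.0, w))


# Hypothetical monthly returns for two indices and a fund that blends them.
index_a = [0.010, -0.004, 0.007, 0.002, -0.006, 0.009]
index_b = [0.003, 0.001, -0.002, 0.004, 0.002, -0.001]
fund = [0.3 * a + 0.7 * b for a, b in zip(index_a, index_b)]

weight_a = two_index_style_weight(fund, index_a, index_b)
print(round(weight_a, 4))
```

A real implementation would fit many indices at once with a constrained optimizer, then group funds whose fitted weights look alike. The two-index case is only meant to show what “best fit combination” means.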
The second step is to identify how much history should be used in that fund’s analysis. Northfield does this with a tool known as CUSUM. Developed in the 1950s, CUSUM is a sequential probability test that was first used to measure quality control on assembly lines. It looks for trends in the number of rejects. Bad performance for a mutual fund is like a reject on an assembly line.
The CUSUM analysis answers a very precise question: “At what time in the past was it least likely that the subsequent performance would have occurred, given the precedent performance?” Stated less precisely, “How much of the fund’s history is relevant?” Everything before that point in time, when a “regime change” essentially occurred in the fund’s management, is ignored by Northfield’s analysis.
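As a rough illustration of how a CUSUM-style test can locate a regime change, the sketch below uses the textbook change-point estimator: accumulate deviations from the overall mean and take the point where the running sum is farthest from zero. This is an assumption for illustration, not Northfield’s sequential probability test. The function name `cusum_change_point` and the sample data are hypothetical.

```python
def cusum_change_point(excess_returns):
    """Return the index of the first observation in the most recent regime,
    estimated as the point where the cumulative sum of deviations from the
    overall mean is farthest from zero. Hypothetical helper."""
    n = len(excess_returns)
    if n < 2:
        raise ValueError("need at least two observations")
    avg = sum(excess_returns) / n
    running = 0.0
    best_index, best_abs = 0, -1.0
    # Skip the last point: the full cumulative sum is always zero.
    for i, value in enumerate(excess_returns[:-1]):
        running += value - avg
        if abs(running) > best_abs:
            best_abs = abs(running)
            best_index = i
    return best_index + 1


# Hypothetical monthly excess returns (in percent): six good months,
# then six poor ones.
history = [0.5] * 6 + [-0.5] * 6
start = cusum_change_point(history)
relevant_history = history[start:]
print(start, relevant_history)
```

Everything before `start` would be discarded, which mirrors the article’s description of ignoring performance that predates the regime change.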
Sometimes, di Bartolomeo said, nothing changes in a fund’s performance. Other times, however, identifying the change point is incredibly relevant. For example, when Peter Lynch left the Magellan fund, its performance declined rapidly.
The third step is relatively straightforward. Northfield determines the risk-adjusted performance of the fund relative to a market index and relative to its peer group.
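One common risk-adjusted measure is the information ratio, which this article later cites for the DoubleLine fund as alpha divided by tracking error. The sketch below is a simplified version that treats alpha as the average return in excess of the benchmark. It is not Northfield’s calculation. The function name `information_ratio` and the return series are hypothetical.

```python
import math


def information_ratio(fund, benchmark, periods_per_year=12):
    """Annualized mean active return divided by annualized tracking error.
    Hypothetical helper; alpha is approximated by the mean active return."""
    active = [f - b for f, b in zip(fund, benchmark)]
    n = len(active)
    if n < 2:
        raise ValueError("need at least two periods")
    avg = sum(active) / n
    variance = sum((a - avg) ** 2 for a in active) / (n - 1)
    tracking_error = math.sqrt(variance)
    if tracking_error == 0:
        raise ValueError("tracking error is zero; ratio is undefined")
    annual_alpha = avg * periods_per_year
    annual_te = tracking_error * math.sqrt(periods_per_year)
    return annual_alpha / annual_te


# Hypothetical monthly returns: the fund beats its benchmark by
# 20 or 40 basis points in alternating months.
benchmark = [0.001, 0.001, 0.001, 0.001]
fund = [0.003, 0.005, 0.003, 0.005]
ir = information_ratio(fund, benchmark)
print(round(ir, 2))
```

The same function can be run against a peer-group average instead of an index to get the fund-sector-relative figure.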
In the fourth step, Northfield asks whether the performance over that time period was due to skill or luck. To do so, it uses four pieces of information: the risk-adjusted performance of the fund; the volatility of that performance; the average return of the fund’s peer group; and the dispersion of returns among funds in the peer group.
The dispersion is critical to distinguishing skill from luck. If all the returns for a peer group are clustered around a mean, then it is statistically more likely that an outlier is due to luck, rather than skill. Alternatively, if returns are more dispersed, then outliers are more likely to represent skillful management.
The statistical method by which those four pieces of information are analyzed is known as Bayes’ Theorem. Bayes’ Theorem enjoyed recent popularity when Nate Silver of FiveThirtyEight.com used it to correctly predict the winner in 49 states in the November 2008 presidential election. Di Bartolomeo said this theorem was first applied in mutual fund performance analysis in 2004.
The end result of the four-step process is what di Bartolomeo calls the PWER score, Northfield’s best estimate of fund performance going forward. It is a mathematical compromise between the skill and luck assumptions and is the result of applying Bayes’ Theorem.
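The compromise between skill and luck can be sketched with a textbook normal-normal Bayesian update, in which the peer group supplies the prior and the fund’s own record supplies the evidence. This is an assumed simplification, not Northfield’s PWER formula. The function name `shrunk_estimate` and every number below are hypothetical.

```python
def shrunk_estimate(fund_alpha, fund_std_error, peer_mean, peer_dispersion):
    """Posterior mean of a fund's true alpha under a normal prior centered
    on the peer-group mean. Hypothetical helper.

    fund_std_error is the uncertainty in the fund's measured alpha, and
    peer_dispersion is the standard deviation of alphas across the peers.
    """
    prior_var = peer_dispersion ** 2
    noise_var = fund_std_error ** 2
    if prior_var + noise_var == 0:
        raise ValueError("both variances are zero; estimate is undefined")
    # Weight on the fund's own record (the "skill" assumption).
    skill_weight = prior_var / (prior_var + noise_var)
    return peer_mean + skill_weight * (fund_alpha - peer_mean)


# Same fund record (2% alpha, 1% standard error) against two peer groups
# that both average 0% but differ in dispersion.
tight_peers = shrunk_estimate(2.0, 1.0, peer_mean=0.0, peer_dispersion=1.0)
wide_peers = shrunk_estimate(2.0, 1.0, peer_mean=0.0, peer_dispersion=3.0)
print(tight_peers, round(wide_peers, 2))
```

With tightly clustered peers the estimate is pulled halfway back to the peer average. With widely dispersed peers it stays close to the fund’s own record, which matches the article’s point that dispersion tilts the result toward skill.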
Gross versus Gundlach
I asked di Bartolomeo to discuss the results of this analysis for the PIMCO and DoubleLine total return funds.
According to the CUSUM process, the regime change point for the PIMCO fund was in October of 2008. Put formally, at this point subsequent performance is least likely to have occurred by random coincidence, conditional on the performance prior to that point. Since that time, the PIMCO fund performance has been mediocre. From October 2008 forward, the average fund in the same sector outperformed its respective “style benchmark” (best fit combination of relevant market indices) by 40 basis points a year, while the PIMCO fund had relative performance of between +5 basis points per year and -68 basis points per year for the same period, depending on share class (which carry different fees and expense ratios).
Northfield’s best estimate of expected performance going forward is +25 basis points annually for the Class A shares.
The same analysis for the DoubleLine funds was more positive. The most likely regime change point was between November 2011 and January 2012 (depending on whether you look at index-relative or fund-sector-relative performance). Since then the fund performance has been very strong, with annualized outperformance of around 2% and an information ratio (alpha/tracking error) greater than 1.0 against both indices and the fund sector.
Most funds in the category (lower quality corporate bonds) have done well over that period, outperforming their respective best fit index combination by 86 basis points. However, there was a wide dispersion of results across the set of competing peer funds, so the peer group average of 86 basis points is only very weakly reliable. As such, the compromise between the luck and skill assumptions leans toward the skill side.
Northfield’s best estimate of future outperformance for the DoubleLine fund is between 1.8% and 2% per annum relative to the historic “best fit” combination of market indices (allowing for differences across share classes arising from different fees and expense ratios).
Given that the most likely regime change time point for the PIMCO fund was in the middle of the global financial crisis (GFC), di Bartolomeo said that it’s possible that a lot of the comparable funds would be in the same boat. The GFC was a huge deal for fixed income markets. The PIMCO fund is also very large and invested in a lot of relatively illiquid instruments, so it would be hard to maneuver through a volatile period. Di Bartolomeo said he would have to do additional statistical work to determine if there are liquidity-related effects observable in the data.
Since Gundlach began managing the DoubleLine fund in 2009, we also looked to see if his prior fund, the TCW Total Return fund (TGLMX), had undergone a regime change during the financial crisis. It did not. The relevant time period for its performance, based on the CUSUM analysis, began in May 1994, shortly after the TCW fund was launched.
The future of fund analysis
While Northfield has commercialized portions of this analysis, di Bartolomeo uses it mostly to support consulting engagements with his clients, including the largest public pension funds. He has helped those funds in their external manager selection process and in the decision whether or not to use active managers at all.
One interesting application, he said, would be to look at aggregate scores for fund companies. Di Bartolomeo said that some fund companies have been known to “game” the fund classification process. By intentionally misclassifying funds, they can ensure that at least some funds are at the top of their respective peer groups. His methodology would expose such misclassifications, and could illuminate situations where fund companies were repeatedly engaging in such activities.
He cautioned that his results are only one piece of data that an advisor or investor should use when selecting an actively managed fund. Other considerations, such as the investor’s goals and risk tolerance, are paramount.
Advisors relying on active management must recognize and appreciate the level of sophistication in Northfield’s analysis, as well as the fact that its results were tested over a time horizon of only one year. Even if you could select skillful managers with Northfield’s tools, you would have to reassess your decisions every year. If nothing else, Northfield’s methodology sets the benchmark for the degree of “rocket science” an analytic framework must employ when attempting to identify a skillful manager.