Questions about the statistical validity of the Foolish Four are still being asked. Meet the Loyal Opposition, those who think the Foolish Four's past success was due to a random correlation. The only way to determine who's right is to test the strategy on a different set of stocks.
Several professional statisticians, mathematicians, and others of like mind have questioned the validity of the strategy. Today I want to try to discuss the major points on both sides, so that those who don't regularly read the discussion board can get some idea of what all the fuss is about.
The data mining contingent -- let's call them the Loyal Opposition (they may find that amusing) -- contend that the Foolish Four and the other Dow strategies are the product of data mining and, therefore, the high past returns we cite are merely the product of a random correlation that occurred during the 1970s and 1980s and that has no predictive power. The fact that the Foolish Four has not beaten the market by a significant margin since it was discovered three years ago is seen as evidence supporting their argument.
The term "data mining" is often used in a pejorative sense to describe the process of sifting through large amounts of data looking for correlations, without expending enough care to ensure that they actually make sense. However, it is not necessarily an illegitimate means of research. It can provide a valid hypothesis worthy of further research. The Loyal Opposition doesn't argue that data mining per se is invalid -- just that it has a high potential for abuse. (We agree.) They argue that we have not proven the strategy works and that, since it is likely to be a product of bad data mining, our statistical test showing the RP outperforming the Dow at a high confidence level was not an appropriate test.
Many of their arguments have merit. There is no doubt that much of the "research" we did in developing the Foolish Four was not up to academic standards. We've learned a lot since then -- much of it taught by discussion group participants.
I am not convinced that they are right, although they may turn out to be. One of their strongest arguments is that, if you try a large number of hypotheses out on a database, you will discover several that seem to be statistically significant. If you accept a 95% confidence level as your criterion for accepting a hypothesis, and you test 20 strategies that have no real merit, you should expect about one of them (5%) to produce high enough returns to earn a 95% confidence level quite by chance. That follows from what a 95% confidence level means: a strategy with no real edge still has a 5% chance, or one in 20, of passing the test on luck alone.
Now, if you just try one hypothesis and you get a confidence level of 95%, you might feel pretty good about it. But, if you try 20 and one produces a confidence level that high, well, that's to be expected even if all 20 are totally invalid.
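For readers who like to see the arithmetic, here is a minimal sketch of that argument. It assumes the strategies are worthless and statistically independent of one another, which real Dow strategies would not be, and the function name is made up for illustration. Under those assumptions, the chance that at least one of 20 strategies clears a 95% test by luck is 1 - 0.95^20, or roughly 64%.

```python
def chance_of_false_hit(num_strategies, significance=0.05):
    """Chance that at least one worthless strategy passes the test by luck.

    Assumes every strategy has no real edge and that the tests are
    independent of one another (an idealization, not a property of
    actual Dow strategies).
    """
    # Each worthless strategy fails the test with probability 1 - significance;
    # all of them fail together with that probability raised to the count.
    return 1.0 - (1.0 - significance) ** num_strategies


for n in (1, 5, 20, 50):
    print(f"{n:>2} strategies tested: "
          f"{chance_of_false_hit(n):.1%} chance of at least one lucky pass")
```

The more strategies that get tried, the closer that chance creeps toward certainty, which is why the number of strategies tested matters so much to the Loyal Opposition.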
The Loyal Opposition's argument is that, since they don't know how many strategies were tested while various people were coming up with the various Dow strategies, there is no way to show that the link between high yield/low price and return is not random.
That's true. But, there is no evidence that the strategies were designed the way the Loyal Opposition seems to believe they were. Certainly, if Michael O'Higgins designed Beating the Dow by searching through mountains of data, then his conclusions would be suspect. But I don't see any reason to assume that.
So far, I am willing to assume that Michael O'Higgins used good sense when testing his Beating the Dow theory. If Beating the Dow is legitimate, then the tweaks and changes we have made to it in devising the various Foolish Four strategies are not founded on a random correlation. (And, as we discussed Friday, it had five years of market-doubling, post-publication returns... which goes a long way to prove to me that low price/high yield/high return was a legitimate association, even if market conditions have changed since.)
The only thing that will prove that the Foolish Four worked (past tense) is to test it on a completely different data set. If the original correlation was random, it won't show up when you use a different set of stocks. That's what we will be doing with the CRSP database. As far as I am concerned, that is really the only way to know for sure whether the strategy is or was valid.
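The logic of testing on a fresh data set can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the CRSP test itself: it invents 20 strategies with zero true edge, gives each 25 years of normally distributed excess returns with 20% annual volatility, picks the best past performer, and then checks how that "winner" does on a second, independent set of returns. All of those numbers and names are hypothetical.

```python
import random


def simulate(trials=1000, num_strategies=20, years=25, volatility=0.20, seed=0):
    """Average excess return of the best-looking strategy, past vs. fresh data.

    Every strategy is pure noise (true excess return of zero), so any
    edge that shows up in the past data is a random correlation.
    """
    rng = random.Random(seed)
    past_total = 0.0
    fresh_total = 0.0
    for _ in range(trials):
        best_past = None
        best_fresh = None
        for _ in range(num_strategies):
            # Average yearly excess return over the original data set.
            past = sum(rng.gauss(0.0, volatility) for _ in range(years)) / years
            # Same strategy measured on an independent data set.
            fresh = sum(rng.gauss(0.0, volatility) for _ in range(years)) / years
            if best_past is None or past > best_past:
                best_past, best_fresh = past, fresh
        past_total += best_past
        fresh_total += best_fresh
    return past_total / trials, fresh_total / trials


past_edge, fresh_edge = simulate()
print(f"Best strategy on the original data: {past_edge:+.2%} per year")
print(f"Same strategy on fresh data:        {fresh_edge:+.2%} per year")
```

In this toy world the winner looks impressive on the data it was picked from and shows essentially no edge on the new data. A strategy with a real edge should keep at least some of it when the stocks change.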
Then we can tackle the question of why it's such a mess right now.
Fool on and prosper!