financial time series is famously nonstationary, due to all of the reasons given earlier.
It is possible to incorporate such regime shifts into a sophisticated “super”-model (as I will discuss in Example 7.1), but it is much simpler if we just demand that our model deliver good performance on recent data.
Does the Strategy Suffer from Data-Snooping Bias?
If you build a trading strategy that has 100 parameters, it is very likely that you can optimize those parameters in such a way that the historical performance will look fantastic. It is also very likely that the future performance of this strategy will look nothing like its historical performance and will turn out to be very poor. By having so many parameters, you are probably fitting the model to historical accidents that will not repeat themselves in the future. Actually, this so-called data-snooping bias is very hard to avoid even if you have just one or two parameters (such as entry and exit thresholds), and I will leave the discussion on how to minimize its impact to Chapter 3. But, in general, the more rules the strategy has, and the more parameters the model has, the more likely it is to suffer from data-snooping bias. Simple models are often the ones that will stand the test of time. (See the sidebar on my views on artificial intelligence and stock picking.)
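To make the danger of many parameters concrete, here is a minimal sketch in Python (using NumPy). It is a toy illustration, not a method from this book: the "market" is pure random noise, and each of the 100 hypothetical "parameter settings" is just a random sequence of long/short positions. The function name snooping_demo and all the numbers in it are assumptions chosen for illustration. Picking the setting with the best in-sample Sharpe ratio typically yields an impressive backtest, while the same setting's out-of-sample Sharpe ratio is scattered around zero.

```python
import numpy as np

def annualized_sharpe(daily_returns):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed to be zero)."""
    return np.sqrt(252) * daily_returns.mean() / daily_returns.std()

def snooping_demo(n_param_sets=100, n_days=1000, seed=0):
    """Pick the best of many random 'strategies' in-sample, then test it out-of-sample."""
    rng = np.random.default_rng(seed)
    # Pure-noise market: zero-mean daily returns, so no strategy has a real edge.
    market = rng.normal(0.0, 0.01, size=n_days)
    # Each row stands in for one parameter setting: a random long/short position per day.
    positions = rng.choice([-1.0, 1.0], size=(n_param_sets, n_days))
    strategy_returns = positions * market
    half = n_days // 2
    in_sample = strategy_returns[:, :half]
    out_sample = strategy_returns[:, half:]
    # "Optimize" by choosing the setting with the highest in-sample Sharpe ratio.
    in_sharpes = np.array([annualized_sharpe(r) for r in in_sample])
    best = int(in_sharpes.argmax())
    return float(in_sharpes[best]), float(annualized_sharpe(out_sample[best]))

if __name__ == "__main__":
    best_in, same_out = snooping_demo()
    print(f"best in-sample Sharpe: {best_in:.2f}")
    print(f"same setting out-of-sample Sharpe: {same_out:.2f}")
```

Because every setting here is random by construction, any gap between the two numbers is due entirely to selection, which is the essence of data-snooping bias.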
ARTIFICIAL INTELLIGENCE AND STOCK PICKING1
There was an article in the New York Times a short while ago about a new hedge fund launched by Mr. Ray Kurzweil, a pioneer in the field of artificial intelligence. (Thanks to my fellow blogger, Yaser Anwar, who pointed it out to me.) According to Kurzweil, the stock-picking decisions in this fund are supposed to be made by machines that “… can observe billions of market transactions to see patterns we could never see” (quoted in Duhigg, 2006).
While I am certainly a believer in algorithmic trading, successfully applying artificial intelligence to trading is a lot more difficult than it sounds.
At the risk of oversimplification, we can characterize artificial intelligence (AI) as trying to fit past data points into a function with many, many parameters. This is the case for some of the favorite tools of AI: neural networks, decision trees, and genetic algorithms. With many parameters, we can certainly capture small patterns that no human can see. But do these patterns persist? Or are they random noise that will never repeat? Experts in AI assure us that they have many safeguards against fitting the function to transient noise. And indeed, such tools have been very effective in consumer marketing and credit card fraud detection. Apparently, the patterns of consumer behavior and of theft are quite consistent over time, allowing such AI algorithms to work even with a large number of parameters. However, in my experience, these safeguards work far less well in financial market prediction, and overfitting to the noise in historical data remains a rampant problem. As a matter of fact, I have built financial predictive models based on many of these AI algorithms in the past. Every time I came up with a carefully constructed model that seemed to work marvels in backtest, it inevitably performed miserably going forward. The main reason for this seems to be that the amount of statistically independent financial data is far more limited than the billions of independent consumer and credit transactions available. (You may think that there is a lot of tick-by-tick financial data to mine, but such data is serially correlated and far from independent.)
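The point about serial correlation can be put in rough numbers. The sketch below is an illustration under a strong assumption, not a description of real tick data: it supposes the series follows a simple AR(1) process with lag-1 autocorrelation rho, in which case a standard large-sample approximation says that n observations carry about n(1 - rho)/(1 + rho) independent observations' worth of information about the mean. The function names are hypothetical.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float((d[:-1] * d[1:]).sum() / (d * d).sum())

def effective_sample_size(x):
    """Approximate count of independent observations, ASSUMING AR(1) dependence."""
    rho = lag1_autocorr(x)
    return len(x) * (1.0 - rho) / (1.0 + rho)

def simulate_ar1(n, rho, seed=0):
    """Simulate x[t] = rho * x[t-1] + noise, a toy serially correlated series."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=n)
    x = np.empty(n)
    x[0] = shocks[0]
    for t in range(1, n):
        x[t] = rho * x[t - 1] + shocks[t]
    return x

if __name__ == "__main__":
    for rho in (0.0, 0.9):
        series = simulate_ar1(50_000, rho)
        print(rho, round(effective_sample_size(series)))
```

Under this assumption, a series with rho = 0.9 is worth only about one-nineteenth as many independent points as its raw length suggests, which is one way to see why a large tick database can still be a small sample.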
This is not to say that no methods based on AI will work in prediction. The ones that work for me are usually characterized by these properties:
The targets are nonreflexive—targets that will not change their values in response to too many people successfully predicting them. If returns can be predicted, returns will change in response to the prediction. On the other hand, if weather can be predicted, weather will not change in response. Yet accurate weather prediction can benefit agricultural futures traders. Examples of financial targets that are nonreflexive include earnings surprises and nonfarm payroll surprises, both of which my research team has been successful in predicting (see predictnow.ai/blog/us-nonfarm-employment-prediction-using-riwi-corp-alternative-data/ for the latter).
The features (predictors) that are used as input for predictions are meaningful, numerous, and carefully scrubbed and engineered. For example, many fundamental stock databases have embedded look-ahead bias because they report “restated” financials, not “point-in-time” financials. This look-ahead bias will make the backtest look great, but will cause live trading performance to be much worse.
The prediction is applied to private instead of public targets. For example, instead of predicting the returns of SPY, AI should be used to predict whether your proprietary trading signals will be profitable. This way, you can avoid competing with many of the world's best financial machine learners in predicting the exact same target. This application of AI is called metalabeling. See how we applied metalabeling successfully at predictnow.ai/blog/what-is-the-probability-of-profit-of-your-next-trade-introducing-predictnow-ai/.
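The difference between “restated” and “point-in-time” financials mentioned in the list above can be sketched in a few lines of Python. The records, the field layout, and the function names below are all hypothetical and chosen only for illustration; real fundamental databases have their own schemas. The idea is that a backtest should only see the value that had actually been published on or before the simulated date.

```python
from datetime import date

# Hypothetical records: (fiscal_period, publication_date, reported_eps).
# The second row is a later restatement of the same quarter.
REPORTS = [
    ("2020Q4", date(2021, 2, 15), 1.10),  # as originally reported
    ("2020Q4", date(2021, 8, 1), 0.85),   # restated months later
]

def restated_value(reports, period):
    """What a 'restated' database shows: only the latest version, whatever its date."""
    versions = [(pub, val) for per, pub, val in reports if per == period]
    return max(versions, key=lambda pv: pv[0])[1] if versions else None

def point_in_time_value(reports, period, as_of):
    """Latest value for `period` published on or before `as_of`, or None if none existed."""
    known = [(pub, val) for per, pub, val in reports
             if per == period and pub <= as_of]
    return max(known, key=lambda pv: pv[0])[1] if known else None

if __name__ == "__main__":
    backtest_date = date(2021, 3, 1)
    # A backtest dated March 2021 should see 1.10, not the later restatement.
    print(point_in_time_value(REPORTS, "2020Q4", backtest_date))
    print(restated_value(REPORTS, "2020Q4"))
```

Feeding restated_value into a backtest dated March 2021 would hand the model a number that nobody knew at the time, which is exactly the look-ahead bias described above.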
Does the Strategy “Fly under the Radar” of Institutional Money Managers?
Since this book is about starting a quantitative trading business from scratch, and not about starting a hedge fund that manages multiple millions of dollars, we should not be concerned whether a strategy is one that can absorb multiple millions of dollars. (Capacity is the technical term for how much a strategy can absorb without negatively impacting its returns.) In fact, quite the opposite—you should look for those strategies that fly under the radar of most institutional investors, for example, strategies that have very low capacities because they trade too often, strategies that trade very few stocks every day, or strategies that have very infrequent positions (such as some seasonal trades in commodity futures described in Chapter 7). Those niches are the ones that are likely to still be profitable because they have not yet been completely arbitraged away by the gigantic hedge funds.
SUMMARY
Finding prospective quantitative trading strategies is not difficult. There are:
Business school and other economic research websites.
Financial websites and blogs focusing on the retail investors.
Trader forums where you can exchange ideas with fellow traders.
Twitter!
After you have done a sufficient amount of Net surfing or scrolling through your Twitter feed, you will find a number of promising trading strategies. Whittle them down to just a handful, based on your personal circumstances and requirements, and by applying the screening criteria (more accurately described as healthy skepticism) that I listed earlier:
How much time do you have for babysitting your trading programs?
How good a programmer are you?
How much capital do you have?
Is your goal to earn steady monthly income or to strive for a large, long-term capital gain?
Even before doing an in-depth backtest of the strategy, you can quickly filter out some unsuitable strategies if they fail one or more of these tests:
Does it outperform a benchmark?
Does it have a high enough Sharpe ratio?
Does it have a small enough drawdown and short enough drawdown duration?
Does the backtest suffer from survivorship bias?
Does