Minkey Chang

Data Scientist / AI Engineer

Can we really get alpha from market data?

The standard view is that you cannot reliably beat the market using public data, yet competitions and funds keep trying. Here’s how the Efficient Market Hypothesis, the Micro Alphas idea, and probabilistic forecasting fit together.

Figure: Exploratory analysis (EDA) of the market and macro data used for return prediction and position sizing: missing values, correlations, volatility, and target distribution.

Efficient Market Hypothesis

It has long been believed that you cannot consistently earn excess returns (alpha) from publicly available market data. If prices already incorporate all relevant information, any edge should be arbitraged away. The Efficient Market Hypothesis (EMH) formalizes this: in its semi-strong form, asset prices reflect all publicly available information, so predictable patterns in returns should not persist. (The strong form goes further and covers private information as well.) Empirical work on equity premium prediction has often supported this skeptical view: many variables that look good in-sample fail out-of-sample. Goyal and Welch (2008), for example, show that a wide set of popular predictors (dividend yield, earnings yield, term spread, etc.) did not reliably forecast equity returns out of sample and generally did no better than the historical average return; the predictive power that showed up in earlier samples largely disappeared. So the baseline view in much of academic finance is that getting alpha from market data is very hard, and that we should be cautious about overfitting and look-ahead bias.
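The in-sample versus out-of-sample gap can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the procedure of any cited paper: it compares the in-sample R² of a one-predictor regression with an out-of-sample R² measured against an expanding historical-mean benchmark. The function names (`in_sample_r2`, `oos_r2`), the simulated data, and the convention that `x[t]` is already known before `y[t]` is realized are all choices made for this example.

```python
import random


def in_sample_r2(x, y):
    """R^2 of a one-predictor OLS fit, evaluated on the data it was fitted to."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def oos_r2(x, y, start):
    """Out-of-sample R^2 versus the expanding historical mean.

    At each step t >= start, the regression is refitted using only
    observations 0..t-1, so no future data leaks into the forecast.
    Assumes x[t] is observable before y[t] is realized.
    """
    sse_model = 0.0
    sse_mean = 0.0
    for t in range(start, len(y)):
        xs, ys = x[:t], y[:t]
        mx, my = sum(xs) / t, sum(ys) / t
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        beta = sxy / sxx if sxx > 0 else 0.0
        forecast = my + beta * (x[t] - mx)
        sse_model += (y[t] - forecast) ** 2   # regression forecast error
        sse_mean += (y[t] - my) ** 2          # historical-mean forecast error
    return 1.0 - sse_model / sse_mean


# Simulated example: returns are pure noise, so the predictor has no true power.
rng = random.Random(0)
returns = [rng.gauss(0.005, 0.04) for _ in range(480)]
predictor = [rng.gauss(0.0, 1.0) for _ in range(480)]

print("in-sample R^2:    ", in_sample_r2(predictor, returns))
print("out-of-sample R^2:", oos_r2(predictor, returns, start=120))
```

The in-sample R² can never be negative, so even a useless predictor looks at least slightly helpful there. The out-of-sample R² has no such floor: a value at or below zero means the predictor did no better than simply forecasting the historical average.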


Micro Alphas: weak signals that can still turn into money

In the paper “Micro Alphas” (Hull et al., 2024), the authors argue that accumulating a large enough number of small, weak signals can still yield usable predictability, and that this predictability can be turned into money. No single signal needs to be statistically strong; the idea is that many weak, possibly non-linear and time-varying effects can add up when combined carefully. The paper uses machine learning (e.g. feature transformations and feature selection) and an elastic net–style model to aggregate these micro signals into a forecast. According to the authors, the approach has been implemented in a live fund and has delivered excess returns over the S&P 500. So the claim is not that one magic variable beats the market, but that many small edges, combined in a disciplined way, can. The Kaggle Hull Tactical Market Prediction competition is explicitly inspired by this framework: you get a large set of daily features (market, macro, volatility, sentiment, etc.) and must both forecast S&P 500 excess returns and choose a portfolio position (a weight between 0 and 2) under a volatility constraint, i.e. turn signals into a tradable strategy.
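Why many weak signals can add up is easiest to see in an idealized setting. The sketch below is not the paper's pipeline: it replaces the elastic net with a plain equal-weighted average and assumes every signal has the same small correlation `rho` with the target and fully independent noise. Under those assumptions, the correlation of the average of `n` signals with the target is `rho / sqrt(rho^2 + (1 - rho^2) / n)`. The function names and parameter values are hypothetical.

```python
import math
import random


def combined_corr(rho, n):
    """Theoretical correlation between the target and the average of n signals,
    each with correlation rho to the target and independent noise."""
    return rho / math.sqrt(rho * rho + (1.0 - rho * rho) / n)


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def simulate(rho, n_signals, n_days, seed=0):
    """Simulate n_signals weak signals and return the correlation of their
    equal-weighted average with the target."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - rho * rho)
    target, combo = [], []
    for _ in range(n_days):
        r = rng.gauss(0.0, 1.0)
        # Each signal is mostly noise plus a small loading on the target.
        signals = [rho * r + noise_scale * rng.gauss(0.0, 1.0)
                   for _ in range(n_signals)]
        target.append(r)
        combo.append(sum(signals) / n_signals)
    return pearson(target, combo)


for n in (1, 10, 100):
    print(n, round(combined_corr(0.05, n), 3), round(simulate(0.05, n, 2000), 3))
```

Real signals are correlated with one another and their strength drifts over time, so the gain in practice is much smaller than this idealized formula suggests. That is one reason a regularized model such as an elastic net is a natural choice for the aggregation step.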


Getting good signals

So forecasting models matter, and probabilistic forecasting in particular. Point forecasts alone don’t tell you how uncertain the prediction is; for position sizing and risk limits you need a sense of the distribution. Models like the Temporal Fusion Transformer (TFT) (Lim et al., 2021) output quantile forecasts (e.g. 5th, 50th, 95th percentile), so you get both a central forecast and a spread. That spread can drive conservative sizing when uncertainty is high and more aggressive sizing when it’s low. TFT also uses interpretable attention over time and over variables, so you can see which inputs the model relies on, which is useful when combining many micro signals. In a Hull Tactical–style setup, you typically (1) forecast excess returns (often with a probabilistic model like TFT), (2) extract regime or factor information from the same or related series, and (3) map forecasts and factors into positions that respect leverage and volatility constraints. The “getting good signals” part is thus not just about the accuracy of the point forecast, but about reliable quantification of uncertainty and integration with a policy that turns forecasts into positions without blowing through risk limits.
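As a rough illustration of how quantile forecasts might feed position sizing, the sketch below shows two pieces. `pinball_loss` is the standard quantile loss used to train and evaluate quantile forecasts. `size_position` is a hypothetical sizing rule invented for this example: it starts from a neutral weight of 1.0, tilts by the median forecast divided by the width of the 5%-95% interval, and clips to the 0 to 2 range. The `sensitivity` parameter and the choice of 1.0 as neutral are assumptions, and the rule does not enforce any volatility constraint.

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for a single forecast at quantile level q."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)


def size_position(q05, q50, q95, sensitivity=2.0, lo=0.0, hi=2.0):
    """Hypothetical rule mapping quantile forecasts to a weight in [lo, hi].

    The tilt away from the neutral weight of 1.0 grows with the median
    forecast and shrinks as the forecast interval widens.
    """
    spread = q95 - q05
    if spread <= 0.0:
        return 1.0  # degenerate interval: fall back to the neutral weight
    tilt = sensitivity * q50 / spread
    return min(hi, max(lo, 1.0 + tilt))


# Same median forecast, different uncertainty.
narrow = size_position(q05=-0.010, q50=0.001, q95=0.012)
wide = size_position(q05=-0.030, q50=0.001, q95=0.032)
print("narrow interval:", narrow)
print("wide interval:  ", wide)
```

With the same median forecast, the wider interval produces a weight closer to neutral, which is the "conservative when uncertain" behaviour described above. A fuller policy would also scale the weight so that realized portfolio volatility stays within the allowed limit.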


References

  1. Equity premium prediction (EMH skepticism) — Goyal, A., & Welch, I. (2008). A Comprehensive Look at the Empirical Performance of Equity Premium Prediction. Review of Financial Studies, 21(4), 1455–1508.

  2. Micro Alphas — Hull, B., Bakosova, P., Cocquemas, F., Sinclair, E., & Fast, P. (2024). Micro Alphas. Working Paper, Hull Tactical Asset Allocation. SSRN 5035294.

  3. Temporal Fusion Transformers (TFT) — Lim, B., Arik, S. Ö., Loeff, N., & Pfister, T. (2021). Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting. International Journal of Forecasting, 37(4), 1748–1764.

  4. Hull Tactical Market Prediction — Kaggle competition inspired by the Micro Alphas framework: Hull Tactical - Market Prediction.
