Chapter 01 · May 16, 2026 · 9 min read

What is a quant factor?

From Fama-French to the modern style factor zoo. Value, momentum, quality, low-volatility — what each one measures and how to compute it in pandas.

A factor is a number you compute for each asset, on each date, that you believe contains information about the asset's future return. That definition does almost all the work, and every word in it matters. "A number" means a scalar: not a forecast, not a recommendation, just a value you can rank across instruments. "For each asset, on each date" means a factor is a panel. In the code in this chapter, rows are dates, columns are instruments, and every cell is either filled or explicitly missing. "Future return" makes the claim falsifiable: a factor is good if assets with a high value of it tend to outperform assets with a low value over the next day, week, or month. If they do not, it is not a factor; it is a coincidence.
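To make the panel idea concrete, here is a minimal sketch of a factor panel in pandas, using the orientation of the code later in this chapter (dates as rows, instruments as columns). The tickers and values are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical three-asset universe over four business days.
dates = pd.bdate_range("2024-01-02", periods=4)
factor_panel = pd.DataFrame(
    [[0.8, -0.1, np.nan],   # CCC has no value on the first date
     [0.7, 0.0, 0.3],
     [0.9, -0.2, 0.4],
     [0.6, 0.1, 0.5]],
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)

# Rank across instruments on each date: 1 is the lowest value that day.
# Missing cells stay NaN instead of being ranked.
ranks = factor_panel.rank(axis=1)
print(ranks)
```

Ranking along `axis=1` is what "a value you can rank across instruments" looks like in practice: each date is ranked on its own, and an explicitly missing cell never receives a rank.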

The earliest factor most people learn about is market beta, the regression coefficient of a stock's return on the market's return. Sharpe (1964) used it to explain expected returns through the Capital Asset Pricing Model, and for the better part of two decades it was the only factor academic finance believed in. Then came the cracks. Basu (1977) noticed that stocks with low price-to-earnings ratios earned more than CAPM said they should. Banz (1981) noticed the same for small-cap stocks. Fama and French (1992) showed that size and book-to-market together explain much of the cross-section of average stock returns, and their follow-up work packaged the result as the famous three-factor model: market, size (small minus big, "SMB"), and value (high book-to-market minus low, "HML"). They argued, controversially at the time, that two new factors were needed because the market alone could not price the cross-section of equity returns.

Fama and French's paper is the one to read first if you want to understand where the modern factor zoo comes from. The methodology has not changed much in thirty years: sort assets on a candidate variable, form long-short portfolios at the extremes, and measure whether the long-short portfolio earns a return that the existing factor model cannot explain. Carhart (1997) added momentum. Pástor and Stambaugh (2003) added liquidity. By the mid-2010s the number of "discovered" factors in the academic literature exceeded 300, and Harvey, Liu, and Zhu (2016) wrote a sobering paper called "...and the Cross-Section of Expected Returns" arguing that most of them were artifacts of multiple testing.
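The sort-and-spread recipe can be sketched in a few lines. This is a simplified illustration, not the exact procedure of any paper: the function name `long_short_returns`, the equal weighting, and the synthetic data are all assumptions, and the final step (regressing the spread on an existing factor model) is left out.

```python
import numpy as np
import pandas as pd

def long_short_returns(factor: pd.DataFrame, fwd_returns: pd.DataFrame,
                       n_buckets: int = 10) -> pd.Series:
    """Top-bucket minus bottom-bucket equal-weighted forward return per date."""
    # Percentile rank of each asset within its date (row).
    pct = factor.rank(axis=1, pct=True)
    top = pct > 1 - 1 / n_buckets
    bottom = pct <= 1 / n_buckets
    long_leg = fwd_returns.where(top).mean(axis=1)
    short_leg = fwd_returns.where(bottom).mean(axis=1)
    return long_leg - short_leg

# Synthetic example: forward returns contain a small dose of the factor.
rng = np.random.default_rng(0)
factor = pd.DataFrame(rng.normal(size=(250, 100)))
noise = pd.DataFrame(rng.normal(scale=0.02, size=(250, 100)))
fwd_returns = 0.01 * factor + noise

spread = long_short_returns(factor, fwd_returns)
t_stat = spread.mean() / (spread.std() / np.sqrt(len(spread)))
print(f"mean spread {spread.mean():.4f}, t-stat {t_stat:.1f}")
```

Because the synthetic returns are built to depend on the factor, the spread is positive by construction. On real data the interesting question is whether it stays positive after controlling for the factors you already have.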

The four factors you actually need to know

Working quants tend to use a small subset. Four are unavoidable.

Value asks whether the asset is cheap relative to a fundamental anchor. Book-to-market (Fama and French's original), price-to-earnings, EV-to-EBITDA, dividend yield, free-cash-flow yield — all variants of the same idea. A "cheap" stock has a high value-factor score; the historical premise is that cheap stocks outperform expensive ones over horizons of one to five years, on average.
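As a rough sketch, a book-to-market factor is one division plus some care around bad inputs. The panels `book_equity` and `market_cap` are hypothetical inputs, assumed to share the same dates and instruments and to be point-in-time (book equity lagged to when it was actually published).

```python
import pandas as pd

def book_to_market(book_equity: pd.DataFrame, market_cap: pd.DataFrame) -> pd.DataFrame:
    """Book-to-market ratio; higher means cheaper."""
    ratio = book_equity / market_cap
    # Non-positive book equity or market cap makes the ratio meaningless,
    # so those cells are marked missing rather than ranked.
    return ratio.where((book_equity > 0) & (market_cap > 0))

# Invented numbers for two assets on two dates.
dates = pd.bdate_range("2024-01-02", periods=2)
book_equity = pd.DataFrame({"AAA": [50.0, 50.0], "BBB": [-5.0, 20.0]}, index=dates)
market_cap = pd.DataFrame({"AAA": [100.0, 125.0], "BBB": [40.0, 40.0]}, index=dates)

value = book_to_market(book_equity, market_cap)
print(value)
```

The other value measures listed above follow the same pattern: a fundamental in the numerator, a price-based quantity in the denominator, so that a cheap stock scores high.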

Momentum asks whether the asset is trending. The canonical specification, which grew out of Jegadeesh and Titman (1993), is the 12-month-minus-1-month return: rank assets by their total return over the prior twelve months, excluding the most recent month to avoid short-term reversal contamination. Stocks in the top decile have historically outperformed those in the bottom decile over the following one to twelve months. It is one of the most studied anomalies in finance, and one of the most painful to trade in real life because it crashes spectacularly in regime changes (2009, 2020).

Quality asks whether the firm is well-run. Definitions vary; Asness, Frazzini, and Pedersen (2019) use a composite of profitability (return on equity, gross profits to assets), growth (five-year change in profitability), and safety (low leverage, low earnings volatility, low beta). Quality is the factor that most often survives transaction costs, because quality scores change slowly and a quality portfolio therefore turns over slowly.
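A composite like this is usually built by standardising each ingredient and averaging. The sketch below is a deliberately simplified stand-in, not the construction in Asness, Frazzini, and Pedersen: it uses only two profitability measures and one safety measure, for a single date, and the column names (`roe`, `gross_profit_to_assets`, `leverage`) and figures are hypothetical.

```python
import pandas as pd

def zscore(values: pd.Series) -> pd.Series:
    """Standardise a cross-section to mean 0 and standard deviation 1."""
    return (values - values.mean()) / values.std()

def quality_score(fundamentals: pd.DataFrame) -> pd.Series:
    """Average of a profitability score and a safety score for one date."""
    profitability = (zscore(fundamentals["roe"])
                     + zscore(fundamentals["gross_profit_to_assets"])) / 2
    # Lower leverage is safer, so the sign is flipped.
    safety = -zscore(fundamentals["leverage"])
    return (profitability + safety) / 2

# Invented fundamentals for four firms on one date.
fundamentals = pd.DataFrame(
    {"roe": [0.25, 0.15, 0.05, 0.10],
     "gross_profit_to_assets": [0.40, 0.30, 0.20, 0.10],
     "leverage": [0.2, 0.5, 0.9, 1.5]},
    index=["AAA", "BBB", "CCC", "DDD"],
)

scores = quality_score(fundamentals)
print(scores.sort_values(ascending=False))
```

Standardising first matters because the ingredients live on different scales; without it, whichever measure has the widest spread would dominate the composite.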

Low volatility is the empirical observation, going back at least to Black, Jensen, and Scholes (1972), that low-beta or low-realized-volatility stocks have historically earned higher risk-adjusted returns than high-volatility ones. This is the most theoretically uncomfortable factor — it contradicts the basic prediction that more risk should earn more return — but it has been replicated enough times across enough markets that most multi-factor models include it.
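One common way to express this as a factor is the negative of trailing realized volatility, so that calmer stocks score higher. The sketch below assumes a panel of adjusted daily closes (dates as rows, instruments as columns); the one-year window and the 252-day annualisation are conventional choices, not requirements.

```python
import numpy as np
import pandas as pd

def low_volatility(prices: pd.DataFrame, window: int = 252) -> pd.DataFrame:
    """Negative annualised realized volatility; higher means calmer."""
    daily_returns = prices / prices.shift(1) - 1
    realized_vol = daily_returns.rolling(window).std() * np.sqrt(252)
    return -realized_vol

# Synthetic example: one calm asset and one volatile asset.
rng = np.random.default_rng(1)
log_returns = pd.DataFrame({
    "CALM": rng.normal(0, 0.005, 300),
    "WILD": rng.normal(0, 0.03, 300),
})
prices = 100 * np.exp(log_returns.cumsum())

factor = low_volatility(prices, window=60)
print(factor.iloc[-1])
```

Flipping the sign keeps the convention used for every other factor in this chapter: a high score is the side you expect to outperform.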

Almost every long-short equity strategy you encounter is some combination of those four, plus perhaps size and a sector or country control. The art of factor design is in the details: which fundamental measure of value, what lookback for momentum, whether to neutralise the factor against industry or market-cap, how to handle missing data, how to winsorise extreme values.
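Two of those details, winsorising and industry neutralisation, are short enough to sketch for a single date's cross-section. The helper names, the quantile cut-offs, and the industry labels below are illustrative assumptions; real pipelines often use tighter cut-offs (for example the 1st and 99th percentiles) on much larger universes.

```python
import pandas as pd

def winsorise(cross_section: pd.Series, lower: float = 0.01,
              upper: float = 0.99) -> pd.Series:
    """Clip values outside the given quantiles to the quantile values."""
    low_cut, high_cut = cross_section.quantile([lower, upper])
    return cross_section.clip(lower=low_cut, upper=high_cut)

def neutralise_by_group(cross_section: pd.Series, groups: pd.Series) -> pd.Series:
    """Subtract each group's mean so every group averages to zero."""
    return cross_section - cross_section.groupby(groups).transform("mean")

# Invented factor values with one extreme outlier.
raw = pd.Series({"A": 1.0, "B": 2.0, "C": 3.0, "D": 100.0})
industry = pd.Series({"A": "tech", "B": "tech", "C": "energy", "D": "energy"})

clipped = winsorise(raw, lower=0.25, upper=0.75)   # wide cut-offs for a tiny sample
neutral = neutralise_by_group(clipped, industry)
print(clipped)
print(neutral)
```

After neutralisation the factor only compares each stock with its industry peers, so a long-short portfolio built from it carries no net bet on any one industry's average score.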

Computing a momentum factor

Enough theory. Here is the smallest piece of working code that produces a momentum factor for any panel of daily closing prices.

import pandas as pd

def momentum_12_1(prices: pd.DataFrame) -> pd.DataFrame:
    """
    Compute the 12-month minus 1-month total-return momentum factor.

    Parameters
    ----------
    prices : DataFrame
        Daily close prices. Rows are dates (DatetimeIndex). Columns are
        instruments. Assumes prices are already adjusted for splits and
        dividends; if they are not, this function silently lies.

    Returns
    -------
    DataFrame
        Momentum factor values, same shape as the input. The value on date
        t for asset i is the total return from t-252 to t-21. Cells where
        not enough history exists are NaN.
    """
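    # 252 trading days ~ 12 months, 21 trading days ~ 1 month.
    # P[t-21] / P[t-252] - 1 is the return over the lookback window that
    # skips the most recent month. shift() is used instead of pct_change()
    # so that missing prices are never forward-filled and gaps stay NaN.
    return prices.shift(21) / prices.shift(252) - 1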


# Example: build the factor on a universe of daily closes you already have.
factor = momentum_12_1(close_prices)
factor_today = factor.iloc[-1].dropna().sort_values(ascending=False)
print(factor_today.head(10))   # ten strongest momentum names
print(factor_today.tail(10))   # ten weakest

Three things to notice. First, the function takes no parameters about which universe or which dates: it computes the factor for whatever panel you hand it. Keeping factor functions universe-agnostic makes them composable later. Second, the return on the most recent month is removed from the lookback. The most recent month's return tends to reverse over the following month (short-term reversal), and including it pollutes the signal. Third, the function does not normalise, neutralise, or rank. Those are downstream operations; the factor itself is just a number.
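For reference, the two most common downstream operations are a cross-sectional z-score and a cross-sectional percentile rank. The sketch below applies both to a generic factor panel; the function names are hypothetical and the input is random data standing in for real factor values.

```python
import numpy as np
import pandas as pd

def cross_sectional_zscore(factor: pd.DataFrame) -> pd.DataFrame:
    """Standardise each date (row) to mean 0 and standard deviation 1."""
    mean = factor.mean(axis=1)
    std = factor.std(axis=1)
    return factor.sub(mean, axis=0).div(std, axis=0)

def cross_sectional_rank(factor: pd.DataFrame) -> pd.DataFrame:
    """Percentile rank within each date, shifted so the top asset scores 0.5."""
    return factor.rank(axis=1, pct=True) - 0.5

# Random stand-in for a factor panel, with one missing cell.
rng = np.random.default_rng(2)
raw = pd.DataFrame(rng.normal(size=(5, 20)))
raw.iloc[0, 3] = np.nan

z = cross_sectional_zscore(raw)
r = cross_sectional_rank(raw)
print(z.mean(axis=1).round(12))
```

Keeping these as separate functions means the same normalisation can be applied to any factor, which is the composability argument made above.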

Adjusted prices, not raw prices

If close_prices are raw trade prices and not adjusted for splits and dividends, this code will produce nonsense the day after every corporate action. AlphaHub's data API returns adjusted prices by default. If you are pulling from a free source like Yahoo Finance, request the adjusted close column and double-check at least one split event in your history before trusting any factor that touches returns.
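A quick sanity check for unadjusted data is to look for one-day moves that are too large to be ordinary trading. The sketch below is a crude screen, not a substitute for a corporate-actions feed; the function name and the 40% threshold are assumptions, and a genuine crash would be flagged too.

```python
import pandas as pd

def suspicious_jumps(prices: pd.DataFrame, threshold: float = 0.4) -> pd.Series:
    """One-day returns larger than the threshold, indexed by (instrument, date)."""
    daily_returns = prices / prices.shift(1) - 1
    flagged = daily_returns.where(daily_returns.abs() > threshold)
    # unstack() on a single-level index gives a Series keyed by (column, row).
    return flagged.unstack().dropna()

# Invented raw prices: SPLIT has an unadjusted 2-for-1 split on the third day.
dates = pd.bdate_range("2024-01-02", periods=4)
raw_prices = pd.DataFrame(
    {"SPLIT": [100.0, 101.0, 50.5, 51.0],
     "PLAIN": [20.0, 20.2, 20.1, 20.3]},
    index=dates,
)

flags = suspicious_jumps(raw_prices)
print(flags)
```

If a known split date shows up in this list, the prices are raw. If it does not, the series has most likely been adjusted.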

What makes a factor good

A factor's value is whatever predictive content it has for future returns. The simplest measure is the information coefficient, or IC: the cross-sectional rank correlation between the factor today and the asset return tomorrow (or next week, or next month). Grinold and Kahn (2000) — the textbook every working quant has on their desk — derive the famous fundamental law of active management:

Information ratio ≈ IC × √breadth

Information ratio is the strategy's annualised excess return divided by its annualised tracking error. Breadth is the number of independent bets per year. IC for a real factor on real data is usually between 0.02 and 0.06. A "great" factor in the academic literature has an IC around 0.05; with a breadth of 500 independent bets per year, that translates to an information ratio of 0.05 × √500 ≈ 1.1, which would be considered excellent by any institutional investor. Note that rebalancing a 500-stock universe every day does not give you 500 × 252 independent bets, because the positions held on consecutive days are almost the same bet. The fundamental law explains why hedge funds care so much about either improving IC by tenths of a percentage point or expanding the universe: both are inputs to the same product.
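The IC itself is straightforward to compute. The sketch below is one way to do it, assuming a factor panel and an adjusted price panel with matching dates and instruments: rank both the factor and the forward return within each date, then take the ordinary correlation of the ranks. The function name and the "perfect foresight" test factor are illustrative only.

```python
import numpy as np
import pandas as pd

def information_coefficient(factor: pd.DataFrame, prices: pd.DataFrame,
                            horizon: int = 1) -> pd.Series:
    """Per-date rank correlation between the factor and the forward return."""
    # Return from date t to t + horizon, stored on date t.
    fwd = prices.shift(-horizon) / prices - 1
    # Only rank assets that have both a factor value and a forward return.
    valid = factor.notna() & fwd.notna()
    factor_rank = factor.where(valid).rank(axis=1)
    return_rank = fwd.where(valid).rank(axis=1)
    # Pearson correlation of ranks is the Spearman rank correlation.
    return factor_rank.corrwith(return_rank, axis=1)

# Sanity check: a factor equal to tomorrow's return must have an IC of 1.
rng = np.random.default_rng(3)
steps = rng.normal(0, 0.01, size=(50, 30))
prices = pd.DataFrame(100 * np.exp(np.cumsum(steps, axis=0)))
perfect_foresight = prices.shift(-1) / prices - 1

ic = information_coefficient(perfect_foresight, prices)
print(ic.mean())

# Fundamental law with IC = 0.05 and a breadth of 500 independent bets.
implied_ir = 0.05 * np.sqrt(500)
print(round(implied_ir, 2))   # about 1.12
```

The last date has no forward return, so its IC is NaN. In practice you would look at the mean of the daily IC series and at how stable it is over time.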

Other diagnostics worth running on a candidate factor:

Decay profile. Compute the IC at horizons of one, five, twenty, sixty days. A factor whose IC peaks at day one and is zero by day five is hard to trade because of transaction costs. A factor whose IC is positive for one to three months is much friendlier.
Quintile spread. Sort assets into five buckets by factor value each day; track the forward return of each bucket. A real factor produces a monotonic spread — the top bucket beats the second, which beats the third, and so on. If only the extremes work, the factor may be picking up something else, like distress or extreme illiquidity.
Turnover. How much of the top quintile changes from one day to the next. High turnover kills strategies via transaction costs faster than anything else; we will spend half of chapter 3 on this.
Stationarity across regimes. Run the IC year by year. A factor that worked from 2003-2007 and again from 2014-2018 but lost money in between is not the same factor as one that produced a 0.03 IC every year for twenty years.
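To make the first three diagnostics concrete, here is a rough sketch of each on synthetic data. All function names are hypothetical, the code assumes panels with no missing values, and the returns are constructed so that the factor predicts only the next day's return. It is meant to show what the numbers are, not how any particular backtester computes them.

```python
import numpy as np
import pandas as pd

def ic_decay(factor, prices, horizons=(1, 5, 20, 60)) -> pd.Series:
    """Mean rank IC between today's factor and the return over each horizon."""
    factor_rank = factor.rank(axis=1)
    out = {}
    for h in horizons:
        fwd = prices.shift(-h) / prices - 1
        out[h] = factor_rank.corrwith(fwd.rank(axis=1), axis=1).mean()
    return pd.Series(out, name="mean_ic")

def bucket_forward_returns(factor, fwd_returns, n_buckets=5) -> pd.Series:
    """Average forward return per factor bucket (1 = lowest factor values)."""
    pct = factor.rank(axis=1, pct=True)
    buckets = np.ceil(pct * n_buckets)   # percentile ranks -> labels 1..n_buckets
    out = {b: fwd_returns.where(buckets == b).mean(axis=1).mean()
           for b in range(1, n_buckets + 1)}
    return pd.Series(out, name="mean_forward_return")

def top_bucket_turnover(factor, n_buckets=5) -> pd.Series:
    """Fraction of today's top bucket that was not in it yesterday."""
    pct = factor.rank(axis=1, pct=True)
    in_top = pct > 1 - 1 / n_buckets
    entered = in_top & ~in_top.shift(1, fill_value=False)
    # Drop the first date, which has no previous day to compare with.
    return (entered.sum(axis=1) / in_top.sum(axis=1)).iloc[1:]

# Synthetic data: each day's return depends on the previous day's factor.
rng = np.random.default_rng(4)
factor = pd.DataFrame(rng.normal(size=(300, 100)))
noise = pd.DataFrame(rng.normal(scale=0.02, size=(300, 100)))
daily_returns = 0.01 * factor.shift(1).fillna(0.0) + noise
prices = 100 * (1 + daily_returns).cumprod()
fwd_1d = prices.shift(-1) / prices - 1

decay = ic_decay(factor, prices)
quintiles = bucket_forward_returns(factor, fwd_1d)
turnover = top_bucket_turnover(factor)
print(decay, quintiles, turnover.mean(), sep="\n")
```

Because the synthetic factor is redrawn at random every day, its top-quintile turnover is very high and its IC fades quickly with horizon, which is exactly the profile described above as hard to trade.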

We will not run these diagnostics by hand. The strategy templates in AlphaHub do them automatically when you backtest a factor, and chapter 2 will show how to read the resulting report.

Try it in AlphaHub

Build your first factor.

Compute a 12-1 momentum factor on the S&P 500 universe over the last five years. Show me the top-decile-minus-bottom-decile spread as an equity curve.

The next chapter takes a factor and turns it into a trading strategy: how the signal becomes a portfolio, how the portfolio becomes a sequence of trades, and which numbers tell you whether any of it was worth doing.

References

Asness, C., Frazzini, A., and Pedersen, L. (2019). Quality minus junk. Review of Accounting Studies, 24(1), 34–112.
Banz, R. (1981). The relationship between return and market value of common stocks. Journal of Financial Economics, 9(1), 3–18.
Carhart, M. (1997). On persistence in mutual fund performance. Journal of Finance, 52(1), 57–82.
Fama, E. and French, K. (1992). The cross-section of expected stock returns. Journal of Finance, 47(2), 427–465.
Grinold, R. and Kahn, R. (2000). Active Portfolio Management, 2nd ed. McGraw-Hill.
Harvey, C., Liu, Y., and Zhu, H. (2016). ...and the cross-section of expected returns. Review of Financial Studies, 29(1), 5–68.
Jegadeesh, N. and Titman, S. (1993). Returns to buying winners and selling losers: implications for stock market efficiency. Journal of Finance, 48(1), 65–91.
Sharpe, W. (1964). Capital asset prices: a theory of market equilibrium under conditions of risk. Journal of Finance, 19(3), 425–442.
