Factor-based equity investing: is the magic gone?

Factor-based equity investing has shown remarkable results against passive buy-and-hold strategies. However, our research shows that the magic may have diminished over the years.

Equity factor models are used by many successful hedge funds and asset management firms. Their ability to generate fairly consistent alpha has been the driving force behind their widespread adoption. As we will show, the magic seems to be under pressure.

Our four-factor model is based on the well-researched equity factors value, quality, momentum and low volatility, with the S&P 1200 Global as the universe. Value is defined by 12-month trailing EV/EBITDA, or price-to-book if the former is not available (which is often the case for financials). Quality is defined as return on invested capital (ROIC), or return on equity (ROE) if the former is not available (which is the case for financials). Momentum is defined as the three-month change in price (the window has not been optimized). Low volatility is measured by beta: the stocks’ betas are estimated by linear regression against the market return based on 18 months of monthly observations (the estimation window has not been optimized). All factors have been lagged properly to avoid look-ahead bias.
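As a rough illustration of the beta and momentum definitions above, here is a minimal sketch on simulated data. The return series, the variable names and the "true" beta of 1.2 are all hypothetical; only the 18-month regression window and the three-month momentum window come from the text.

```r
# Simulated monthly returns (hypothetical data, for illustration only)
set.seed(1)
mkt.ret <- rnorm(24, mean = 0.005, sd = 0.04)      # market return
stock.ret <- 1.2 * mkt.ret + rnorm(24, sd = 0.02)  # stock return, assumed beta 1.2
prices <- 100 * cumprod(1 + stock.ret)             # implied price path

beta.window <- 18  # months used in the regression (from the text)
mom.window <- 3    # months used for momentum (from the text)
n <- length(stock.ret)

# Beta: slope of a linear regression of stock returns on market returns
idx <- (n - beta.window + 1):n
y <- stock.ret[idx]
x <- mkt.ret[idx]
beta.est <- unname(coef(lm(y ~ x))[2])

# Momentum: three-month change in price
mom <- prices[n] / prices[n - mom.window] - 1

print(c(beta = beta.est, momentum = mom))
```

In a backtest both values would be read with a one-period lag, so that the portfolio formed for month n only uses information available at n - 1.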

To avoid concentration risk, as certain industries often dominate a factor in a given period, the factor model spreads its exposure systematically across the equity market. A natural choice is to use industry classifications such as GICS, but from a risk management perspective we are more interested in correlation clusters. We find these clusters by applying hierarchical clustering to excess returns (the stocks’ returns minus the market return) looking back 24 months (again, this window has not been optimized). We require at least 50 stocks in each cluster: the algorithm keeps increasing the number of clusters until the smallest one would fall below 50 stocks, and then stops.
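The clustering step can be sketched on a toy universe as follows. Everything here is simulated and scaled down: 60 hypothetical stocks instead of the index, a minimum cluster size of 10 instead of 50, and the market approximated by the equal-weighted average return, which is an assumption made for this sketch.

```r
# Toy universe: 60 simulated stocks driven by three hypothetical group factors
set.seed(42)
n.months <- 24
n.stocks <- 60
group.id <- rep(1:3, each = 20)
drivers <- matrix(rnorm(n.months * 3, sd = 0.05), n.months, 3)
mkt <- rnorm(n.months, sd = 0.04)
noise <- matrix(rnorm(n.months * n.stocks, sd = 0.02), n.months, n.stocks)
ret <- mkt + drivers[, group.id] + noise

# Excess returns: stock return minus the (equal-weighted) market return
excess <- ret - rowMeans(ret)

# Correlation -> distance -> hierarchical agglomerative clustering
dm <- as.dist((1 - cor(excess)) / 2)
fit <- hclust(dm, method = "average")

# Increase the number of clusters until the smallest one gets too small
min.size <- 10  # scaled-down stand-in for the 50-stock minimum
k.final <- 1
for (k in 2:20) {
  if (min(table(cutree(fit, k = k))) < min.size) break
  k.final <- k
}
groups <- cutree(fit, k = k.final)
print(table(groups))
```

The loop stops at the first cut that produces an undersized cluster and keeps the previous cut, so every cluster in the final grouping meets the minimum size.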

The median number of clusters across 1,000 bootstraps is stable at around 6 over time, compared to the 10 GICS sectors that represent the highest level of segmentation. Surprisingly, this holds even during the financial crisis of 2008, despite a dramatic increase in cross-sectional correlation over that period. Only the period from June 2000 to June 2003 sees a big change in market structure, with fewer distinct clusters.

Clusters across time

For each cluster, at every point in time in the backtesting period, we rescale the raw equity factors to the range [0, 1]. Another approach would be to normalize the factors, but that approach would be more sensitive to outliers, which are quite pronounced in our equity factors. The average of the rescaled factors is then ranked, and the stocks with the highest combined rank (high earnings yield, high return on capital, strong price momentum and low beta) are chosen.
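The rescale, average and rank procedure can be sketched for a single cluster like this. The five tickers and their factor values are made up, and rescale01 is a helper name introduced only for this example.

```r
# Min-max rescaling to [0, 1] (hypothetical helper)
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Made-up factor values for one cluster
factors <- data.frame(
  ticker  = c("AAA", "BBB", "CCC", "DDD", "EEE"),
  value   = c(0.10, 0.05, 0.08, 0.02, 0.30),   # earnings yield, higher is better
  quality = c(0.15, 0.20, 0.05, 0.10, 0.12),   # return on capital, higher is better
  mom     = c(0.03, -0.02, 0.10, 0.01, 0.04),  # price momentum, higher is better
  beta    = c(1.1, 0.8, 1.5, 0.9, 1.0)         # lower is better
)

# Rescale each factor; beta is negated so that low beta scores high
scores <- cbind(
  value    = rescale01(factors$value),
  quality  = rescale01(factors$quality),
  mom      = rescale01(factors$mom),
  low.beta = rescale01(-factors$beta)
)

# Average the rescaled factors, rank, and keep the top names
comb.score <- rowMeans(scores, na.rm = TRUE)
comb.rank <- rank(comb.score, na.last = "keep")
n.pick <- 2
picks <- factors$ticker[order(comb.score, decreasing = TRUE)][seq_len(n.pick)]
print(data.frame(ticker = factors$ticker, score = comb.score, rank = comb.rank))
print(picks)
```

Note how the single extreme value in the value column (0.30) compresses the other value scores toward zero, which is the kind of outlier effect the choice of rescaling method has to deal with.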

This is probably the simplest method in factor investing, but why increase complexity if the increase in risk-adjusted returns is not significant? Another approach is to use the factors in regressions and compute expected returns based on the stocks’ factors.

Our data goes back to 1996, but with 24 months reserved for the estimation window for calculating clusters, the actual results start in 1998 and end in November 2015. We have set the portfolio size to 40 stocks based on research showing that beyond that point the marginal reduction in risk is not meaningful (in a long-only framework). Increasing the portfolio size reduces monthly turnover and hence cost, but we have not optimized this parameter. We have set trading costs (one-way average bid-ask spread and commission) to 0.15%, which is probably on the high side in today’s market but on the low side for 1998. Averaged over the period, it is likely a fair trade cost for investors in global equities with no access to prime brokerage services.
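To make the cost assumption concrete, here is a small worked example of how a one-way cost of 0.15% could be charged against a monthly return. The two portfolios and the gross return are hypothetical, and turnover is measured here as the share of last month's names that were sold, which is an assumption of this sketch.

```r
tc <- 0.15  # one-way trade cost in % (bid-ask spread plus commission)

# Hypothetical equal-weighted portfolios in two consecutive months
prev.port <- c("AAA", "BBB", "CCC", "DDD")
curr.port <- c("AAA", "BBB", "EEE", "FFF")
gross.ret <- 0.012  # made-up gross monthly return

# Turnover: share of last month's names no longer held
turnover <- 1 - mean(prev.port %in% curr.port)

# Each replaced position costs one sale and one purchase, hence the factor 2
net.ret <- gross.ret - tc * turnover * 2 / 100
print(c(turnover = turnover, net.ret = net.ret))
# turnover = 0.5, net.ret = 0.0105
```

Half the portfolio is replaced, so the round-trip cost is 0.15% x 0.5 x 2 = 0.15% of the portfolio, taking the return from 1.20% to 1.05%.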

The backtest results are based on 1,000 bootstraps in order to get more robust estimates of performance measures such as annualized return. In a follow-up post we will explain the importance of bootstrapping when testing cross-sectional equity strategies. One may object to the replacement requirement because it creates situations where the portfolio will select the same stock more than once, which increases concentration risk and departs from an equal-weight framework. However, sampling without replacement would reduce the universe from 1,200 securities.
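The effect of sampling the universe with replacement can be seen in a short sketch. The ticker labels are placeholders and only five resamples are drawn to keep it quick.

```r
set.seed(7)
universe <- paste0("S", 1:1200)  # placeholder tickers
n <- length(universe)
B <- 5                           # a handful of resamples for illustration

# Number of distinct tickers in each bootstrap resample
n.unique <- sapply(seq_len(B), function(b) {
  draw <- sample(universe, n, replace = TRUE)
  length(unique(draw))
})

# Expected share of distinct tickers when drawing n times from n with replacement
expected.share <- 1 - (1 - 1 / n)^n

print(n.unique)
print(round(expected.share, 3))
```

Each resample contains roughly 63% of the distinct names, with the rest of the slots taken by duplicates. That is why the same stock can be selected more than once within a bootstrap.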

Despite high trading costs and high turnover, our four-factor model delivers alpha against random portfolios, with both the Kolmogorov-Smirnov test (comparing the two distributions) and the t-test (on excess annualized returns across bootstraps) showing it to be highly significant. The Kolmogorov-Smirnov statistic (D) is 0.82 and the t-statistic on excess returns is 60. The four-factor model delivers on average an 8.8% annualized return over the 1,000 bootstraps, compared to a 5.8% annualized return for the S&P 1200 Global (orange vertical line) and an average 5.3% annualized return for random portfolios.

Four-factor model annualised returns vs random portfolios
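The two significance tests mentioned above can be run with base R as sketched below. The inputs are simulated: the means are set to the reported averages (8.8% and 5.3%), but the standard deviations are invented, so the resulting statistics will not match the figures in the post.

```r
set.seed(123)
B <- 1000

# Simulated annualized returns per bootstrap (dispersion is hypothetical)
model.ret <- rnorm(B, mean = 0.088, sd = 0.010)
random.ret <- rnorm(B, mean = 0.053, sd = 0.015)

# Kolmogorov-Smirnov test: do the two distributions differ?
ks <- ks.test(model.ret, random.ret)

# One-sample t-test on excess returns: is the mean excess return zero?
tt <- t.test(model.ret - random.ret, mu = 0)

print(ks$statistic)  # D, the maximum distance between the two empirical CDFs
print(tt$statistic)  # t-statistic on excess annualized returns
```

D is bounded between 0 and 1, so a value such as the reported 0.82 means the two return distributions barely overlap.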

In our previous post we showed that random portfolios beat buy-and-hold for the S&P 500 index, but in this case they do not. The reason is the small portfolio size, roughly three percent of the universe, which leads to excessive turnover and hence costs.

The Sharpe ratio is also decent at 0.56 on average across the 1,000 bootstraps compared to 0.28 for random portfolios, a very significant improvement. A natural extension would be to explore ways to improve the risk-adjusted return further. Shorting the stocks with the worst total score, thereby creating a market-neutral portfolio, is one possible solution.

Sharpe Ratio annualized vs random portfolio
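For reference, an annualized Sharpe ratio can be computed from monthly returns as in the sketch below. The post does not spell out the exact convention used, so the zero risk-free rate and the square-root-of-12 scaling are assumptions, and the return series is simulated.

```r
set.seed(99)
monthly.ret <- rnorm(120, mean = 0.007, sd = 0.04)  # simulated monthly returns
rf.monthly <- 0                                     # assumed risk-free rate

# Mean excess return over its standard deviation, scaled to annual terms
sharpe.ann <- (mean(monthly.ret) - rf.monthly) / sd(monthly.ret) * sqrt(12)
print(round(sharpe.ann, 2))
```

In the bootstrap setting this calculation would be repeated for each of the 1,000 return paths and then averaged.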

However, our research shows that a market-neutral version does not deliver consistent alpha. In general we find that our equity models do not capture alpha equally well on the long and short side, indicating that drivers of equity returns are not symmetric. So far, we have found market-neutral equity strategies to have more merit on shorter time-frames (intraday or daily), but encourage readers to share any findings on longer time-frames (weekly or monthly frequency).

Cumulative excess performance vs S&P 1200 Global

Interestingly, the cumulative excess performance chart shows exactly what Cliff Asness, co-founder of AQR Capital Management, has described on multiple occasions about the firm’s early days: persistent underperformance as the momentum effect dominated the late 1990s and catapulted the U.S. equity market into a historic bubble. Value investing, together with Warren Buffett, was ridiculed. Basically, stocks that were richly valued, had low or negative return on capital, high beta and high momentum performed very well.

However, starting in 2000 our four-factor model enjoys a long streak of outperformance, similar to the fortunes of AQR, which continues until summer 2009. Since then, excess return against the S&P 1200 Global (we choose this as the benchmark here because it beats random portfolios) has been more or less flat. In other words, the model performs similarly to the market and does not generate alpha to reward our active approach.

Why are traditional equity factor models not producing alpha to the same degree as in the period 2000-2009? Two possible explanations come to mind. Competition in financial markets has gone up, and with cheap access to computers, widespread adoption of open-source code and well-researched equity factors, the alpha has been competed away. Alternatively, the standard way of creating a four-factor model has run out of juice: factors can still work, but have to be applied in different ways. Maybe the factors should not be blended into a combined score, but instead the best stocks from each factor should be selected. There are endless ways to construct an equity factor model.

### risk.factors is an array with dimensions 239, 2439 and 7 (months, unique stocks in S&P 1200 Global over the whole period, equity factors).
### variables such as cluster.window etc. are specified in our data handling script
### hist.tickers is a matrix with dimensions 239, 2439 (months, unique stocks in S&P 1200 Global over the whole period) - basically a matrix with ones or NAs indicating whether a ticker was part of the index or not at a given point in time
### tr is a matrix containing historical total returns (including reinvesting of dividends and adjustments for corporate actions) with same dimensions as hist.tickers 

# required packages (xts objects are used below; progress is called via progress::)
library(xts)

# variables
no.pos <- 40 # number of positions in portfolio
strategy <- "long" # long or long-short?
min.stock.cluster <- 50 # minimum stocks per cluster
B <- 1000 # number of bootstraps
tc <- 0.15 # one-way trade cost in % (including bid-ask and commission)

# pre-allocate list with length of dates to contain portfolio info over dates
strat <- vector("list", length(dates))
names(strat) <- dates

# pre-allocate xts object for portfolio returns
port.ret <- xts(matrix(NA, N, B), order.by = dates)

# pre-allocate xts object for random portfolio returns
rand.port.ret <- xts(matrix(NA, N, B), order.by = dates)

# number of clusters over time
no.clusters <- xts(matrix(NA, N, B), order.by = dates)

# initialise text progress bar
pb <- progress::progress_bar$new(format = "calculating [:bar] :percent eta: :eta",
 total = B, clear = FALSE, width = 60)

# loop of bootstraps
for(b in 1:B) {
  # loop over dates
  for(n in (cluster.window + 2):N) {
    # rolling dates window
    dw <- (n - cluster.window):(n - 1)
    # index members at n time
    indx.memb <- which(hist.tickers[n - 1, ] == 1)
    # if number of bootstraps is above one
    if(b > 1) {
      # sample with replacement of index members at period n
      indx.memb <- sample(indx.memb, length(indx.memb), replace = TRUE)
    }
    # tickers with complete return history in the rolling window
    complete.obs <- which(apply(is.na(tr[dw, indx.memb]), 2, sum) == 0)
    # update index members at n time by complete observations for correlation
    indx.memb <- indx.memb[complete.obs]
    # temporary total returns
    temp.tr <- tr[dw, indx.memb]
    # normalised returns
    norm.ret <- scale(temp.tr)
    # fit PCA on normalised returns
    pca.fit <- prcomp(norm.ret)
    # estimate market returns from first PCA component
    x <- (norm.ret %*% pca.fit$rotation)[, 1]
    # estimate beta
    betas <- as.numeric(solve(t(x) %*% x) %*% t(x) %*% norm.ret)
    # estimate residuals (normalised return minus market)
    res <- norm.ret - tcrossprod(x, betas)
    # correlation matrix
    cm <- cor(res)
    # distance matrix
    dm <- as.dist((1 - cm) / 2)
    # fit a hierarchical agglomerative clustering
    fit <- hclust(dm, method = "average")
    for(i in 2:20) {
      # assign tickers into clusters
      groups <- cutree(fit, k = i)
      # minimum number of tickers in a cluster
      group.min <- min(table(groups))
      # if smallest cluster has less than minimum required number of stocks break loop
      if(group.min < min.stock.cluster) {
        groups <- cutree(fit, k = i - 1)
        break
      }
    }
    # number of clusters
    G <- length(unique(groups))
    # insert number of clusters
    no.clusters[n, b] <- G
    # stocks per cluster
    cluster.size <- table(groups)
    # number of positions per cluster
    risk.allocation <- round(table(groups) / sum(table(groups)) * no.pos)
    # pre-allocate list for containing all trade info on each cluster
    cluster.info <- vector("list", G)
    # loop over clusters
    for(g in 1:G) {
      # find the ticker positions in the specific cluster
      cluster.pos <- indx.memb[which(groups == g)]
      # which tickers have total returns for period n
      has.ret <- which(!is.na(tr[n, cluster.pos]))
      # adjust stock's position for g cluster based on available forward return
      cluster.pos <- cluster.pos[has.ret]
      # rescale quality risk factor
      quality.1 <- risk.factors[n - 1, cluster.pos, "quality.1"]
      quality.2 <- risk.factors[n - 1, cluster.pos, "quality.2"]
      quality.1.rank <- (quality.1 - min(quality.1, na.rm = TRUE)) /
        (max(quality.1, na.rm = TRUE) - min(quality.1, na.rm = TRUE))
      quality.2.rank <- (quality.2 - min(quality.2, na.rm = TRUE)) /
        (max(quality.2, na.rm = TRUE) - min(quality.2, na.rm = TRUE))
      quality.rank <- ifelse(!is.na(quality.2.rank), quality.2.rank, quality.1.rank)
      # rescale value risk factor
      value.1 <- risk.factors[n - 1, cluster.pos, "value.1"]
      value.2 <- risk.factors[n - 1, cluster.pos, "value.2"]
      value.1.rank <- (value.1 - min(value.1, na.rm = TRUE)) /
        (max(value.1, na.rm = TRUE) - min(value.1, na.rm = TRUE))
      value.2.rank <- (value.2 - min(value.2, na.rm = TRUE)) /
        (max(value.2, na.rm = TRUE) - min(value.2, na.rm = TRUE))
      value.rank <- ifelse(!is.na(value.2.rank), value.2.rank, value.1.rank)
      # rescale momentum risk factor
      mom <- risk.factors[n - 1, cluster.pos, "mom"]
      mom.rank <- (mom - min(mom, na.rm = TRUE)) /
        (max(mom, na.rm = TRUE) - min(mom, na.rm = TRUE))
      # rescale beta risk factor (sign flipped so that low beta ranks high)
      beta <- risk.factors[n - 1, cluster.pos, "beta"] * -1
      beta.rank <- (beta - min(beta, na.rm = TRUE)) /
        (max(beta, na.rm = TRUE) - min(beta, na.rm = TRUE))
      # rescale reversal risk factor (computed but not used in the combined rank)
      reversal <- risk.factors[n - 1, cluster.pos, "reversal"] * -1
      reversal.rank <- (reversal - min(reversal, na.rm = TRUE)) /
        (max(reversal, na.rm = TRUE) - min(reversal, na.rm = TRUE))
      # combine all normalised risk factor ranks into one matrix
      ranks <- cbind(quality.rank, value.rank, mom.rank, beta.rank)#, reversal.rank)
      # if too few stocks have all factors, use only well-populated factors
      if(sum(complete.cases(ranks)) < risk.allocation[g]) {
        col.obs <- apply(!is.na(ranks), 2, sum)
        col.comp <- which(col.obs > (cluster.size[g] / 2))
        comb.rank <- rank(apply(ranks[, col.comp], 1, mean), na.last = "keep")
      } else {
        comb.rank <- rank(apply(ranks, 1, mean), na.last = "keep")
      }
      if(strategy == "long") {
        long.pos <- cluster.pos[which(comb.rank > max(comb.rank, na.rm = TRUE) - risk.allocation[g])]
        cluster.info[[g]] <- data.frame(Ticker = tickers[long.pos],
                                        Ret = as.numeric(tr[n, long.pos]),
                                        stringsAsFactors = FALSE)
      }
      if(strategy == "long-short") {
        long.pos <- cluster.pos[which(comb.rank > max(comb.rank, na.rm = TRUE) - risk.allocation[g])]
        short.pos <- cluster.pos[which(comb.rank < risk.allocation[g] + 1)]
        long.data <- data.frame(Ticker = tickers[long.pos],
                                Sign = rep("Long", risk.allocation[g]),
                                Ret = as.numeric(tr[n, long.pos]),
                                stringsAsFactors = FALSE)
        short.data <- data.frame(Ticker = tickers[short.pos],
                                 Sign = rep("Short", risk.allocation[g]),
                                 Ret = as.numeric(tr[n, short.pos]) * -1,
                                 stringsAsFactors = FALSE)
        cluster.info[[g]] <- rbind(long.data, short.data)
      }
    }
    # rbind data.frames across clusters and insert into strat list
    strat[[n]] <- do.call("rbind", cluster.info)
    # insert portfolio return
    if(n == cluster.window + 2) {
      port.ret[n, b] <- mean(strat[[n]][, "Ret"]) - tc * 2 / 100
    } else {
      # turnover in % (only selling)
      strat.turnover <- 1 - sum(!is.na(match(strat[[n]][, "Ticker"], strat[[n - 1]][, "Ticker"]))) /
        length(strat[[n - 1]][, "Ticker"])
      port.ret[n, b] <- mean(strat[[n]][, "Ret"]) - tc * strat.turnover * 2 / 100
    }
    # insert random portfolio return
    rand.pos <- sample(indx.memb[which(!is.na(tr[n, indx.memb]))],
                       size = no.pos, replace = TRUE)
    if(n == cluster.window + 2) {
      rand.port.ret[n, b] <- mean(tr[n, rand.pos]) - tc * 2 / 100
    } else {
      rand.turnover <- 1 - sum(!is.na(match(prev.rand.pos, rand.pos))) / length(prev.rand.pos)
      rand.port.ret[n, b] <- mean(tr[n, rand.pos]) - tc * rand.turnover * 2 / 100
    }
    # remember this period's random portfolio for next period's turnover
    prev.rand.pos <- rand.pos
  }
  # update progress bar
  pb$tick()
  Sys.sleep(1 / 100)
}

5 thoughts on “Factor-based equity investing: is the magic gone?”

  1. Over at Quantopian there was a question related to “correlation clusters” – what are those exactly?

    Well, the concept is simpler than it sounds. Basically we calculate the correlation matrix on the residual returns (each stock’s returns minus the market, which in this case is approximated by the first PCA component) using some time window (in this case 24 monthly observations, not optimized).

    Then the correlation matrix is converted to a distance matrix, which is then used to fit the hierarchical agglomerative clustering (see code). The clustering is then “cut” into groups such that each group has at least 50 stocks. As we write in the post, this produces a fairly stable system over time with around 6 clusters.

    Why use clusters instead of GICS or any other industry classification?

    Foremost because several studies show that correlation clustering beats industry classification.

    Another reason is that an industry classification is a human judgement and as such is not necessarily reflected in the underlying statistical relationships (correlations).

    We have a follow-up post to our random portfolio research that will expand on the above concept and see whether the random portfolio can be improved further.

    The performance of clustering vs industry classification is something we will likely pursue in the future, essentially checking the existing research on the topic against our own.

