Towards a better equity benchmark: random portfolios

Random portfolios deliver alpha relative to a buy-and-hold position in the S&P 500 index – even after allowing for trading costs. Random portfolios will therefore serve as the benchmark for our future quantitative equity models.

The evaluation of quantitative equity portfolios typically involves a comparison with a relevant benchmark, usually a broad index such as the S&P 500. This is an easy and straightforward approach, but it also – we believe – sets the bar too low, resulting in too many favorable research outcomes. We want to hold ourselves to a higher standard, and at a bare minimum that must include being able to beat a random portfolio – the classic dart-throwing monkey portfolio.

A market capitalization-weighted index, such as the S&P 500 index, is inherently a size-based portfolio where those equities which have done well in the long run are given the highest weight. It is, in other words, dominated by equities with long-run momentum. Given that an index is nothing more than a size-based trading strategy, we should ask ourselves whether there exist other simple strategies that perform better. The answer to this is in the affirmative and one such strategy is to generate a portfolio purely from chance. Such a strategy will choose, say, 20 equities among the constituents of an index each month and invest an equal amount in each position. After a month the process is repeated, 20 new equities are randomly picked, and so forth.
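The monthly draw described above can be sketched in a few lines of R; the `constituents` vector here is a hypothetical stand-in for the actual S&P 500 member list:

```r
# A minimal sketch of one month's draw: pick 20 equities at random from the
# index constituents and weight them equally
constituents <- paste0("STOCK", 1:500) # hypothetical stand-in for the member list
set.seed(42)
portfolio <- sample(constituents, size = 20, replace = FALSE)
weights <- rep(1 / length(portfolio), length(portfolio)) # equal weights
```

At the end of the month the positions are liquidated and a fresh draw is made.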

The fact that random portfolios outperform their corresponding indices is nothing new. David Winton Harding, the founder of the hedge fund Winton Capital, even went on CNBC last year to explain the concept, and he is far from alone in highlighting the outperformance of random portfolios (e.g. here and here).

To demonstrate the idea we have carried out the research ourselves. We use the S&P 500 index – free of survivorship bias and accounting for corporate actions such as dividends – and focus on the period from January 2000 to November 2015. We limit ourselves in this post to this time span as it largely covers the digitization period, but data from January 1990 onward yield similar results. Starting on 31 December 1999 we randomly select X equities from the list of S&P 500 constituents in that particular month, assign equal weights, and hold this portfolio through January 2000. We then repeat the process on the final day of January, hold through February, and so on for every subsequent month until November 2015.

[Figure: distributions of annualized returns for 1000 random portfolios of size 10, 20 and 50; the orange line marks the S&P 500 index]

The chart above depicts the annualized returns of 1000 random portfolios containing 10, 20, and 50 equities (the orange line represents the S&P 500 index). A few things are readily apparent:

  • The mean of the annualized returns is practically unchanged across portfolio sizes
  • The standard deviation of the annualized returns narrows as the portfolio size increases
  • All three portfolio sizes beat the S&P 500 index
  • 6.5% of the 1000 random portfolios of size 10 have an annualized return below that of the S&P 500 index (4.2%). For portfolio sizes 20 and 50 the percentages are 0.9% and 0%, respectively. In other words, not a single one of the 1000 random portfolios of size 50 delivers an annualized return below 4.2%.
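The underperformance percentages above boil down to a simple proportion; here is a sketch with hypothetical numbers (`returns_ann` is illustrative, not the actual backtest output):

```r
# Share of random portfolios whose annualized return falls below the index
returns_ann <- c(0.03, 0.05, 0.08, 0.02, 0.10) # hypothetical annualized returns
index_ann <- 0.042                             # S&P 500 annualized return (4.2%)
share_below <- mean(returns_ann < index_ann) * 100 # percent underperforming
```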

So far we have talked about one or many random portfolios without being too specific, but for random portfolios to work as a benchmark we need a large number of samples (i.e. portfolios) so that performance statistics, such as the annualized return, tend toward stable values. The chart below shows the cumulative mean over the number of random portfolios (size = 10), which stabilizes as the number of random portfolios increases (the orange line represents the mean across all 1000 portfolios).
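The cumulative mean in the chart can be sketched as follows, using simulated returns in place of the actual backtest output:

```r
# Cumulative mean of annualized returns as more random portfolios are added
set.seed(1)
returns_ann <- rnorm(1000, mean = 0.09, sd = 0.03) # simulated, for illustration
cum_mean <- cumsum(returns_ann) / seq_along(returns_ann)
# cum_mean[k] is the mean over the first k portfolios; it settles near the
# overall mean as k grows
```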

[Figure: cumulative mean of annualized returns versus number of random portfolios (size = 10); the orange line marks the mean across all 1000 portfolios]

All three portfolio sizes beat the S&P 500 index in terms of annualized return, but we must keep in mind that these portfolios' turnover is high and hence we need to allow for trading costs. The analysis is therefore repeated below for 1000 random portfolios with 50 positions and round-trip trading costs of 0%, 0.1% and 0.2%.
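Because the whole portfolio is replaced every month, the cost adjustment amounts to a flat deduction from each monthly return; a sketch with hypothetical monthly returns:

```r
# Round-trip trading cost subtracted from every monthly gross return
gross_monthly <- c(0.02, -0.01, 0.015) # hypothetical monthly returns
cost <- 0.2 / 100                      # 0.2% round trip, applied each month
net_monthly <- gross_monthly - cost
```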

[Figure: distributions of annualized returns for 1000 random portfolios of size 50 with round-trip trading costs of 0%, 0.1% and 0.2%]

Unsurprisingly, the mean annualized return declines as trading costs increase. Whereas not a single one of the 1000 random portfolios of size 50 delivered an annualized return below 4.2% without trading costs, 2 and 40 portfolios fall below that level when trading costs of 0.1% and 0.2%, respectively, are added. Put differently, even with round-trip trading costs of 0.2% every month, a portfolio of 50 random stocks outperformed the S&P 500 index in terms of annualized return in 960 of 1000 instances.

[Figure: annualized returns of random portfolios versus the S&P 500 index]

Weighting by size is simple, and we like simple. But when it comes to equity portfolios we demand more. Put differently, if our upcoming quantitative equity models cannot beat a randomly generated portfolio, what is the point? Therefore, going forward, we will refrain from using the S&P 500 index – or any other index – as our benchmark and instead compare our equity models to the results presented above. We want to beat not only the index, we want to beat random. We want to beat the dart-throwing monkey!

ADDENDUM: A couple of comments noted that we should use an equal-weight index to be consistent with the random portfolio approach. This can, for example, be achieved by investing in the Guggenheim S&P 500 Equal Weight ETF, which has yielded 9.6% annualized since its inception in April 2003 (with an expense ratio of 0.4%). The ETF has delivered a higher annualized return than the mean of the random portfolios once trading costs are added.

In other words, if you want to invest equally in the S&P 500 constituents, there is an easy way to do it. We will nevertheless continue to use random portfolios as a benchmark: it is a simple approach that our models must beat, and it lets us choose the starting date as we please.

###########################################################
#                                                         #
# INPUT:                                                  #
#   prices:       (months x equities) matrix of close     #
#                 prices                                  #
#   prices_index: (months x 1) matrix of index close      #
#                 prices                                  #
#   prices_ETF:   (months x 1) matrix of equal-weight     #
#                 ETF close prices                        #
#   tickers:      (months x equities) binary matrix       #
#                 indicating whether a stock was in the   #
#                 index in that month                     #
#                                                         #
###########################################################

library(PerformanceAnalytics) # Provides StdDev.annualized

draws <- 1000 # Number of random portfolios per size/cost combination
start_time <- "1999-12-31"
freq_cal <- "MONTHLY"
percent_exponent <- 12 # Periods per year (monthly data), used for annualization

N <- NROW(prices) # Number of months
J <- NCOL(prices) # Number of constituents in the S&P 500 index

prices <- as.matrix(prices)
prices_index <- as.matrix(prices_index)
prices_ETF <- as.matrix(prices_ETF)

# Narrow the window
#prices <- prices[-1:-40, , drop = FALSE]
#prices_index <- prices_index[-1:-40, , drop = FALSE]
#prices_ETF <- prices_ETF[-1:-40, , drop = FALSE]

#N <- NROW(prices)

# Combinations
sizes <- c(10, 20, 50) ; nsizes <- length(sizes) # Portfolio sizes
costs <- c(0, 0.1, 0.2) ; ncosts <- length(costs) # Trading costs (round trip)

# Array that stores performance statistics
perf <- array(NA, dim = c(nsizes, ncosts, 3, draws),
              dimnames = list(paste("Size", sizes, sep = " "),
                              paste("Cost", costs, sep = " "),
                              c("Ret(ann)", "SD(ann)", "SR(ann)"), NULL))

# Loop across portfolio sizes
for (m in 1:nsizes) {

  # Storage array: (months x positions x draws) of forward returns
  ARR <- array(NA, dim = c(N, sizes[m], draws))

  # Loop across time (months)
  for (n in 1:(N - 1)) {

    # Which equities are in the index this month?
    cols <- which(tickers[n, ] == 1)

    # Forward return for the available equities
    fwd_returns <- prices[n + 1, cols] / prices[n, cols] - 1

    # Keep only equities that also have a price at n + 1
    fwd_returns <- fwd_returns[!is.na(fwd_returns)]

    # Loop across portfolios
    for (i in 1:draws) {

      # Sample a portfolio of size 'sizes[m]' without replacement
      samp <- sample(x = seq_along(fwd_returns), size = sizes[m], replace = FALSE)

      # Store the vector of forward returns in ARR
      ARR[n, , i] <- fwd_returns[samp]

    } # End of i loop

  } # End of n loop

  # Months without data contribute a zero return
  ARR[is.na(ARR)] <- 0

  # Loop across trading costs
  for (m2 in 1:ncosts) {

    # Equal-weight portfolio return per month, net of round-trip costs
    returns_mean <- apply(ARR, c(1, 3), mean) - costs[m2] / 100
    returns_cum <- apply(returns_mean + 1, 2, cumprod)
    returns_ann <- tail(returns_cum, 1)^(percent_exponent / N) - 1

    std_ann <- exp(apply(log(1 + returns_mean), 2, StdDev.annualized,
                         scale = percent_exponent)) - 1
    sr_ann <- returns_ann / std_ann

    perf[m, m2, "Ret(ann)", ] <- returns_ann * 100
    perf[m, m2, "SD(ann)", ] <- std_ann * 100
    perf[m, m2, "SR(ann)", ] <- sr_ann

  } # End of m2 loop

} # End of m loop

# Index and ETF returns
returns_index <- prices_index[-1, 1]/prices_index[-N, 1] - 1
returns_ava_index <- sum(!is.na(returns_index))
returns_index[is.na(returns_index)] <- 0
returns_cum_index <- c(1, cumprod(1 + returns_index))
returns_ann_index <- tail(returns_cum_index, 1)^(percent_exponent/returns_ava_index) - 1

returns_ETF <- prices_ETF[-1, 1]/prices_ETF[-N, 1] - 1
returns_ava_ETF <- sum(!is.na(returns_ETF))
returns_ETF[is.na(returns_ETF)] <- 0
returns_cum_ETF <- c(1, cumprod(1 + returns_ETF))
returns_ann_ETF <- tail(returns_cum_ETF, 1)^(percent_exponent/returns_ava_ETF) - 1

# Print medians to screen
STAT_MED <- apply(perf, c(1, 2, 3), median, na.rm = TRUE)
rownames(STAT_MED) <- paste("Size ", sizes, sep = "")
colnames(STAT_MED) <- paste("Cost ", costs, sep = "")
print(round(STAT_MED, 2))
