*We investigate whether two clustering techniques, k-means clustering and hierarchical clustering, can improve the risk-adjusted return of a random equity portfolio. We find that both techniques yield significantly higher Sharpe ratios than random portfolios, with hierarchical clustering coming out on top.*

Our debut blog post **Towards a better benchmark: random portfolios** resulted in a lot of feedback (thank you) and also triggered some thoughts about the diversification aspect of random portfolios. Inspired by the approach in our **four-factor equity model**, we want to investigate whether it is possible to lower the variance of the portfolios by taking into account the correlation clustering present in the data. At the same time, we do not want to stray too far from the random portfolios idea due to its simplicity.

We will investigate two well-known clustering techniques today, **k-means clustering** and **hierarchical clustering** (HCA), and see how they stack up against the results from random portfolios. We proceed by calculating the sample correlation matrix over five years (60 monthly observations) across the S&P 500 index’s constituents every month in the period 2000-2015, i.e. 192 monthly observations. We then generate 50 clusters using either k-means (100 restarts) or HCA on the correlation matrix and assign all equities to a cluster. From each of these 50 clusters we randomly pick one stock and include it in our portfolio for the following month. This implies that our portfolios will have 50 holdings every month. (None of the parameters used have been optimized.)
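The monthly cluster-then-sample step can be sketched as follows. This is a minimal, self-contained illustration on synthetic data with our own variable names (not the authors' exact code): cluster the correlation matrix, then draw one stock per cluster.

```r
set.seed(1)
n.stocks <- 200; n.clusters <- 50
R <- matrix(rnorm(60 * n.stocks), nrow = 60)  # 60 monthly returns per stock
colnames(R) <- paste0("stock", 1:n.stocks)
cor.mat <- cor(R)

# k-means on the rows of the correlation matrix (100 restarts)
km <- kmeans(scale(cor.mat), centers = n.clusters, iter.max = 100, nstart = 100)

# hierarchical clustering on the distance 1 - correlation, cut into 50 groups
hc     <- hclust(as.dist(1 - cor.mat), method = "average")
hc.cut <- cutree(hc, k = n.clusters)

# one random pick from each cluster -> a 50-stock portfolio for next month
pick_one <- function(members) sample(members, 1)
port.km  <- sapply(split(colnames(R), km$cluster), pick_one)
port.hca <- sapply(split(colnames(R), hc.cut), pick_one)
```

With real data, `R` would hold the trailing 60 monthly returns of the index constituents available that month.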

*The vertical dark-grey line represents the annualized return of a buy-and-hold strategy in the S&P 500 index.*

The mean random portfolio returns 9.1% on an annualized basis compared to the benchmark S&P 500 index, which returns 4.1%. The k-means algorithm improves a tad on random portfolios with 9.2% while HCA delivers 9.5%. Not a single annualized return from the 1,000 portfolios is below the buy-and-hold return no matter whether we look at random portfolios, k-means or HCA.

We have added a long-only minimum variance portfolio (MinVar) to the mix. The 1,000 MinVar portfolios use the exact same draws as the random portfolios, but then proceed to assign weights so they minimize the portfolios’ variances using the 1,000 covariance matrices rather than simply use equal weights.
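A minimal long-only minimum-variance sketch using `quadprog` (synthetic inputs, our own names): minimize w'Sw subject to the weights summing to one and being non-negative.

```r
library(quadprog)

set.seed(1)
k <- 50
X <- matrix(rnorm(60 * k, sd = 0.05), nrow = 60)  # 60 months of returns
S <- cov(X)

# solve.QP minimizes (1/2) w' D w - d' w  s.t.  A' w >= b0 (first meq are equalities)
Amat <- cbind(rep(1, k), diag(k))  # sum-to-one constraint, then w_i >= 0
bvec <- c(1, rep(0, k))
w <- solve.QP(Dmat = S, dvec = rep(0, k), Amat = Amat, bvec = bvec, meq = 1)$solution
```

Because only the inequality constraints keep the weights non-negative, the solution typically concentrates in a handful of low-variance, low-correlation assets, which is exactly the concentration effect described below.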

Unsurprisingly, MinVar delivers a mean return well below the others at 8.1%. The weights in the MinVar portfolios are only constrained to 0-100%, implying that in any given MinVar portfolio a few equities may dominate in terms of weights while the rest have weights very close to 0%. As a result, the distribution of the annualized mean returns of the MinVar portfolios is much wider, and 1.2% of the portfolios (12) are below the S&P 500 index’s annualized mean.

*The vertical dark-grey line represents the annualized Sharpe ratio of a buy-and-hold strategy in the S&P 500 index.*

Turning to the annualized Sharpe ratio (using, for simplicity, a risk-free rate of 0%), we find that a buy-and-hold strategy yields a Sharpe ratio of 0.25. Random portfolios, meanwhile, yield 0.46, while k-means and HCA yield 0.49 and 0.58, respectively.
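For a series of monthly returns, the annualized Sharpe ratio with a 0% risk-free rate can be computed along these lines (a simple geometric-return-over-annualized-volatility helper of our own, not the authors' exact code):

```r
sharpe_ann <- function(r.monthly) {
  n       <- length(r.monthly)
  ret.ann <- prod(1 + r.monthly)^(12 / n) - 1  # geometric annualized return
  sd.ann  <- sd(r.monthly) * sqrt(12)          # annualized standard deviation
  ret.ann / sd.ann
}

set.seed(1)
sharpe_ann(rnorm(192, mean = 0.008, sd = 0.04))
```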

The k-means and HCA algorithms result in lower volatility compared to standard random portfolios (see table below). While the random portfolios have a mean annualized standard deviation of 19.8%, k-means and HCA have mean standard deviations of 18.7% and 16.3%, respectively (in addition to the slightly higher mean annualized returns). Put differently, it seems from the table below and charts above that k-means and in particular HCA add value relative to random portfolios, but do they do so significantly?

Using a t-test on the 1,000 portfolios’ annualized means, we find p-values of 0.7% and ~0%, respectively, supporting the view that significant improvements can be made by using k-means and HCA. The p-values fluctuate a bit for the k-means algorithm with every run of the code, but the HCA’s outperformance (in terms of annualized mean return) is always highly significant. Testing the Sharpe ratios rather than mean returns yields highly significant p-values for both k-means and HCA.

We have looked at two ways to potentially improve random portfolios by diversifying across correlation clusters. The choice of k-means and HCA as clustering techniques was made to keep it as simple as possible, but this choice does come with some assumptions. **Variance Explained** – for example – details why using k-means clustering is not a free lunch. In addition, the choice of distance function for the HCA was not explored in this post; we simply opted for *1 − correlation*. We leave it to the reader to test other, more advanced clustering methods, and/or change the settings used in this blog post.

**Portfolio size**

So far we have only looked at a portfolio size of 50, but in the original post on **random portfolios** we also included portfolios of size 10 and 20. For completeness we provide the results below, noting simply that the pattern is the same across portfolio sizes. In other words, HCA is the best, followed by k-means and random portfolios – though the outperformance decreases with portfolio size.

**Excess returns**

The approach above uses the historical correlation matrix of the returns, but there is a case to be made for using excess returns instead. We have therefore replicated the analysis above using excess returns instead of returns, where excess returns are defined as the residuals from regressing the available constituent equity returns on the market returns (i.e. S&P 500 index returns) using the same lookback window of five years (60 observations).
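The excess-return variant can be sketched as below (synthetic data, our own names): regress each stock's trailing 60 monthly returns on the market return, keep the residuals, and build the correlation matrix from the residual matrix before clustering.

```r
set.seed(1)
n.stocks <- 100
mkt <- rnorm(60, mean = 0.005, sd = 0.04)  # market (index) returns
R   <- sapply(1:n.stocks, function(j) 0.9 * mkt + rnorm(60, sd = 0.03))

# lm() with a matrix response fits one regression per column in a single call
resid.mat <- residuals(lm(R ~ mkt))
cor.resid <- cor(resid.mat)  # feed this into the clustering step in place of cor.mat
```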

Using the matrix of residuals we construct a correlation matrix and repeat the analysis above; we follow the same steps and use the same parameter settings. Disappointingly, the results do not improve. In fact, the results deteriorate to such a degree that neither k-means nor HCA outperform regardless of whether we look at the mean return or Sharpe ratio.

Postscript: while we use k-means clustering on the correlation matrix, we have seen it used in a variety of ways in the finance blogosphere, ranging from **Intelligent Trading Tech** and **Robot Wealth**, both of which look at candlestick patterns, to **Turing Finance**’s focus on countries’ GDP growth and MKTSTK’s **trading volume prediction**, to mention a few.

```r
###########################################################
#                                                         #
# INPUT:                                                  #
# prices: (months x equities) matrix of close prices      #
# prices.index: (months x 1) matrix of index close prices #
# tickers: (months x equities) binary matrix indicating   #
# whether a stock was present in the index in that month  #
#                                                         #
# Parameters assumed to be set beforehand: draws,         #
# size.port, start.port, lookback and cost.port           #
#                                                         #
###########################################################

library(TTR)
library(PerformanceAnalytics)
library(quadprog)
library(xtable)

# Function for finding the minimum-variance portfolio
min_var <- function(cov.mat, short = FALSE) {
  if (short) {
    aMat <- matrix(1, size.port, 1)
    res  <- solve.QP(cov.mat, rep(0, size.port), aMat, bvec = 1, meq = 1)
  } else {
    aMat <- cbind(matrix(1, size.port, 1), diag(size.port))
    b0   <- as.matrix(c(1, rep(0, size.port)))
    res  <- solve.QP(cov.mat, rep(0, size.port), aMat, bvec = b0, meq = 1)
  }
  return(res)
}

N <- NROW(prices)  # Number of months
J <- NCOL(prices)  # Number of constituents in the S&P 500 index

prices       <- as.matrix(prices)
prices.index <- as.matrix(prices.index)

port.names <- c("Random", "K-means", "HCA", "MinVar")

# Array that stores performance statistics
perf <- array(NA, dim = c(3, draws, length(port.names)),
              dimnames = list(c("Ret (ann)", "SD (ann)", "SR (ann)"),
                              NULL, port.names))

# Storage array
ARR <- array(NA, dim = c(N, draws, size.port, length(port.names)),
             dimnames = list(rownames(prices), NULL, NULL, port.names))

rows <- which(start.port == rownames(prices)):N

# Loop across time (months)
for (n in rows) {
  cat(rownames(prices)[n], "\n")

  # Which equities are available?
  cols <- which(tickers[n, ] == 1)

  # Forward and backward return for available equities
  fwd.returns <- prices[n, cols]/prices[n - 1, cols] - 1
  bwd.returns <- log(prices[(n - lookback):(n - 1), cols]/
                     prices[(n - lookback - 1):(n - 2), cols])

  # Are these equities also available at n + 1?
  cols <- which(is.na(fwd.returns) == FALSE &
                apply(is.na(bwd.returns) == FALSE, 2, all))

  # Returns for available equities
  fwd.returns <- fwd.returns[cols]
  bwd.returns <- bwd.returns[, cols]

  bwd.returns.index <- log(prices.index[(n - lookback):(n - 1), 1]/
                           prices.index[(n - lookback - 1):(n - 2), 1])

  # Covariance and correlation matrices
  cov.mat <- cov(bwd.returns)
  cor.mat <- cor(bwd.returns)

  # K-means on the correlation matrix, HCA on the distance 1 - correlation
  km <- kmeans(x = scale(cor.mat), centers = size.port,
               iter.max = 100, nstart = 100)
  hc <- hclust(d = as.dist(1 - cor.mat), method = "average")

  for (d in 1:draws) {
    samp <- sample(x = 1:length(cols), size = size.port, replace = FALSE)
    opt  <- min_var(cov.mat[samp, samp], short = FALSE)

    ARR[n, d, , "Random"] <- fwd.returns[samp]
    ARR[n, d, , "MinVar"] <- opt$solution * fwd.returns[samp] * size.port
  }

  hc.cut <- cutree(hc, k = size.port)

  for (q in 1:size.port) {
    ARR[n, , q, "K-means"] <- sample(x = fwd.returns[which(km$cluster == q)],
                                     size = draws, replace = TRUE)
    ARR[n, , q, "HCA"]     <- sample(x = fwd.returns[which(hc.cut == q)],
                                     size = draws, replace = TRUE)
  }
}

ARR <- ARR[rows, , , ]

# Performance calculations
returns.mean <- apply(ARR, c(1, 2, 4), mean) - cost.port/100
N2           <- NROW(ARR)
returns.cum  <- apply(returns.mean + 1, c(2, 3), cumprod)
returns.ann  <- returns.cum[N2, , ]^(12/N2) - 1
std.ann      <- exp(apply(log(1 + returns.mean), c(2, 3),
                          StdDev.annualized, scale = 12)) - 1
sr.ann       <- returns.ann / std.ann

returns.index     <- c(0, prices.index[-1, 1]/prices.index[-N, 1] - 1)
returns.index     <- returns.index[rows]
returns.cum.index <- c(1, cumprod(returns.index + 1))
returns.ann.index <- tail(returns.cum.index, 1)^(12/N2) - 1
std.ann.index     <- exp(StdDev.annualized(log(1 + returns.index[-1]),
                                           scale = 12)) - 1
sr.ann.index      <- returns.ann.index / std.ann.index

perf["Ret (ann)", , ] <- returns.ann * 100
perf["SD (ann)", , ]  <- std.ann * 100
perf["SR (ann)", , ]  <- sr.ann

results.mean <- as.data.frame(apply(perf, c(1, 3), mean))
results.mean$Benchmark <- c(returns.ann.index * 100,
                            std.ann.index * 100,
                            sr.ann.index)
print(round(results.mean, 2))

t.test(returns.ann[, 2], returns.ann[, 1], alternative = "greater")
t.test(returns.ann[, 3], returns.ann[, 1], alternative = "greater")
```

This is very interesting and very well presented. I am concerned, however, about the introduction of a lookahead bias by your mention of using 50 clusters that were constructed from a point in the future and then sampled from in the past to draw your random equity from each month. I may have misunderstood, but can you please clarify?


Very interesting article! Thank you. Basically I have the same thoughts about the lookback/survivorship bias. Therefore it would be interesting to use an equal weighted basket of the underlying stocks as benchmark.
