Hidden Markov Models – Trend Following – Part 4 of 4

Update: There was a look-forward bug in the code when calculating outOfSampleLongReturns and outOfSampleShortReturns. This has been corrected, and sadly it reduced the Sharpe ratio from 3.08 to 0.857. This is why I always publish the code: it provides greater transparency than other sites.
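The class of bug mentioned above is worth illustrating. A minimal Python sketch (the post's actual code is in R; these toy numbers are invented for illustration) of why a trading signal must be lagged before it is multiplied by returns:

```python
# Hypothetical daily returns and a signal decided using day-i information
# (1 = long, 0 = flat). The signal for day i is only known at day i's close.
returns = [0.02, -0.01, 0.03, -0.02]
signal = [1, 0, 1, 1]

# WRONG: applies day i's signal to day i's return -> look-ahead bias.
biased = [s * r for s, r in zip(signal, returns)]

# RIGHT: lag the signal by one day, mirroring R's Lag(); the first day
# has no prior signal, so it earns nothing.
lagged = [0] + signal[:-1]
unbiased = [s * r for s, r in zip(lagged, returns)]
```

The biased version trades on information it could not have had at the time, which is exactly how an out-of-sample Sharpe ratio gets inflated.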

Part 3 of this series demonstrated how to train an HMM on a toy model; this post focuses on how to go about modelling real-life data. A trend-following strategy will be developed for trading the S&P 500.

In most machine learning classification problems you need a set of training data with class labels. While we have the market data, we don't have class labels, so the first task is to generate class labels for the training data.

Since we want to develop a trend-following strategy, we need to select parts of the S&P 500 time series and label them as either in an up trend or a down trend (a no-trend label could also be used). In theory you could do this by hand, but that would not be feasible over the whole universe of stocks. Instead we can write a program to classify the data automatically; how you classify data as in a trend depends upon your definition of a trend.

The labels are seen in the chart below: any price with green underneath it has been labelled as a long signal, and any price with red above it has the short label.

In the code provided I check whether the current price is the lowest price over the next 10 periods (variable NDayLookforwardLowHigh); if it is, go long. If it is instead the highest over the next 10 periods, close the long. Do the reverse for shorts. It's a little crude and I imagine there are significantly better ways to classify trends; it might be useful to put a volatility constraint on the trends (which should hopefully filter through and improve the Sharpe ratio).
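The labelling rule above can be sketched compactly. This is a Python transliteration of the logic (the post's code is in R; `n` plays the role of NDayLookforwardLowHigh), covering both the open/close markers and the expansion into a full in-trade signal that mirrors the post's ConvertTofullSignal():

```python
def label_long_trend(closes, n=10):
    """Label each day 1 if inside a long trend, else 0."""
    raw = [0] * len(closes)
    for i in range(len(closes) - n):
        block = closes[i:i + n + 1]   # today plus the next n closes
        if closes[i] == min(block):
            raw[i] = 1                # today is the local low: open long
        elif closes[i] == max(block):
            raw[i] = -1               # today is the local high: close long

    # Expand open/close markers into a full 0/1 in-trade signal.
    signal, in_trade = [], False
    for s in raw:
        if s == -1:
            in_trade = False
        if s == 1 or in_trade:
            signal.append(1)
            in_trade = True
        else:
            signal.append(0)
    return signal
```

Swapping min and max gives the short-trend labels, as in the R code.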

In addition to labelling the data, a vector of "features" must be generated for each observation; the features should contain variables that are beneficial to detecting trends. The feature vector I used has the ratios of open to close price, open to high price, open to low price, and all the intermediate combinations. It is often desirable to model the dynamics of these variables by also putting their one-period changes inside the feature vector.
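For concreteness, a small Python sketch of the seven-dimensional feature vector that the R code later builds (daily return plus six OHLC ratios; the function name and sample numbers here are invented for illustration):

```python
def make_features(o, h, l, c, prev_c):
    """Seven features per bar, matching the cbind(...) in the R code."""
    return [
        (c - prev_c) / prev_c,  # one-day arithmetic return (Delt)
        h / l,                  # high/low
        h / o,                  # high/open
        h / c,                  # high/close
        o / c,                  # open/close
        l / c,                  # low/close
        l / o,                  # low/open
    ]
```

Because these are ratios of same-day prices, they are scale-free, which is what lets one model generalise across price levels.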

The above image shows the likelihood of each market regime given an HMM trained on the same data set. It is reassuring to see that the long regime became very unlikely during the crash of 2008.
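The regime decision itself is simple once each model can score a window of recent features: trade in the direction of the higher log-likelihood. A minimal sketch with stand-in scoring functions (in the post's R code the scores come from `forwardBackward(...)$LLH` over a rolling window):

```python
def rolling_decision(windows, llh_long, llh_short):
    """Return +1 (long) or -1 (short) per window by comparing model scores.

    windows: list of recent-window feature matrices, one per day.
    llh_long / llh_short: functions returning each model's log-likelihood.
    """
    return [1 if llh_long(w) > llh_short(w) else -1 for w in windows]
```

The scoring functions here are placeholders; any pair of trained models that can assign a log-likelihood to a window slots in.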

One of the excellent properties of HMMs is that they allow the modelling of situations that have different durations but belong to the same class. For example, one trend might last for 10 days and another for 35 days; we can pass both of these examples to the HMM and it will try to model the duration difference through its internal state-transition probabilities.
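Concretely, variable-length training sequences just mean a collection of feature matrices with different row counts. RHmm's HMMFit accepts a list of matrices directly; hmmlearn-style Python APIs instead take one stacked array plus per-sequence lengths. A sketch of that representation (random data, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
trend_a = rng.normal(size=(10, 7))   # a 10-day trend, 7 features per day
trend_b = rng.normal(size=(35, 7))   # a 35-day trend, same features

X = np.vstack([trend_a, trend_b])        # stacked observations
lengths = [len(trend_a), len(trend_b)]   # sequence boundaries
```

Either way, the trainer sees each trend as a separate sequence and fits a single set of state-transition probabilities across all of them.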

Out of sample results:

Sharpe Ratio of 0.857
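The figure above comes from PerformanceAnalytics' SharpeRatio.annualized on the daily out-of-sample returns. As a rough sketch (assuming a zero risk-free rate and 252 trading days; PerformanceAnalytics' default differs slightly in how it annualises the mean return):

```python
import math

def sharpe_annualized(returns, periods=252):
    """Approximate annualised Sharpe: sqrt(252) * mean(r) / sd(r)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
    return math.sqrt(periods) * mean / math.sqrt(var)
```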

Model Fit:

Long Model

```
HMMFit(obs = inSampleLongFeaturesList, nStates = 3)

Model:
------
3 states HMM with 7-d gaussian distribution

Baum-Welch algorithm status:
----------------------------
Number of iterations : 21
Last relative variation of LLH function: 0.000001

Estimation:
-----------

Initial probabilities:
Pi 1      Pi 2      Pi 3
1.690213e-47 0.3734194 0.6265806

Transition matrix:
State 1   State 2    State 3
State 1 0.4126480 0.3419075 0.24544450
State 2 0.1116068 0.8352273 0.05316591
State 3 0.5475525 0.2303324 0.22211504

Conditionnal distribution parameters:

Distribution parameters:
State 1
mean    cov matrix
0.01565943  1.922717e-04  1.724953e-04  1.785035e-04 -7.870798e-06 -1.764319e-04 -1.687845e-04  6.898374e-06
1.02210441  1.724953e-04  1.920546e-04  1.736241e-04  2.327852e-06 -1.615346e-04 -1.768012e-04 -1.651055e-05
1.01805768  1.785035e-04  1.736241e-04  1.777602e-04  2.663971e-06 -1.653924e-04 -1.595871e-04  5.067094e-06
1.00264545 -7.870798e-06  2.327852e-06  2.663971e-06  1.095711e-05  8.195588e-06  8.473222e-06  3.178401e-07
0.98502360 -1.764319e-04 -1.615346e-04 -1.653924e-04  8.195588e-06  1.644647e-04  1.589815e-04 -4.749521e-06
0.98113485 -1.687845e-04 -1.768012e-04 -1.595871e-04  8.473222e-06  1.589815e-04  1.732240e-04  1.542815e-05
0.99605695  6.898374e-06 -1.651055e-05  5.067094e-06  3.178401e-07 -4.749521e-06  1.542815e-05  2.064964e-05

State 2
mean    cov matrix
0.001670502  3.103878e-05  3.555352e-06  1.781044e-05 -1.336108e-05 -3.092612e-05 -1.670114e-05  1.410578e-05
1.009361131  3.555352e-06  1.249497e-05  9.181451e-06  5.644685e-06 -3.464184e-06 -6.714792e-06 -3.238512e-06
1.005638565  1.781044e-05  9.181451e-06  1.714606e-05 -7.256446e-07 -1.770669e-05 -9.748050e-06  7.905439e-06
1.003957940 -1.336108e-05  5.644685e-06 -7.256446e-07  1.271107e-05  1.335557e-05  7.009564e-06 -6.279198e-06
0.998346405 -3.092612e-05 -3.464184e-06 -1.770669e-05  1.335557e-05  3.081873e-05  1.660615e-05 -1.409313e-05
0.994653572 -1.670114e-05 -6.714792e-06 -9.748050e-06  7.009564e-06  1.660615e-05  1.353501e-05 -3.017033e-06
0.996315167  1.410578e-05 -3.238512e-06  7.905439e-06 -6.279198e-06 -1.409313e-05 -3.017033e-06  1.101157e-05

State 3
mean    cov matrix
-0.01296153  1.481273e-04 -8.848326e-05  2.231101e-05 -1.286390e-04 -1.503760e-04 -3.945685e-05  1.032046e-04
1.02416584 -8.848326e-05  1.349494e-04  2.353928e-05  1.150009e-04  9.081677e-05 -1.853468e-05 -1.027771e-04
1.00458706  2.231101e-05  2.353928e-05  3.596043e-05  1.384302e-05 -2.234068e-05 -9.215909e-06  1.259424e-05
1.01746162 -1.286390e-04  1.150009e-04  1.384302e-05  1.485395e-04  1.338477e-04  3.324363e-05 -9.313984e-05
1.01283801 -1.503760e-04  9.081677e-05 -2.234068e-05  1.338477e-04  1.555764e-04  4.225851e-05 -1.053565e-04
0.99347206 -3.945685e-05 -1.853468e-05 -9.215909e-06  3.324363e-05  4.225851e-05  5.013390e-05  8.770049e-06
0.98098355  1.032046e-04 -1.027771e-04  1.259424e-05 -9.313984e-05 -1.053565e-04  8.770049e-06  1.075952e-04

Log-likelihood: 64226.28
BIC criterium: -127633.2
AIC criterium: -128226.6
```

Short Model

```
HMMFit(obs = inSampleShortFeaturesList, nStates = 3)

Model:
------
3 states HMM with 7-d gaussian distribution

Baum-Welch algorithm status:
----------------------------
Number of iterations : 20
Last relative variation of LLH function: 0.000001

Estimation:
-----------

Initial probabilities:
Pi 1      Pi 2      Pi 3
3.784166e-15 0.1476967 0.8523033

Transition matrix:
State 1    State 2   State 3
State 1 0.4760408 0.12434214 0.3996170
State 2 0.3068272 0.54976794 0.1434049
State 3 0.5243733 0.06315371 0.4124730

Conditionnal distribution parameters:

Distribution parameters:
State 1
mean    cov matrix
-0.009827328  5.102050e-05 -2.137629e-05  1.047989e-05 -3.999886e-05 -5.043558e-05 -1.834830e-05  3.090145e-05
1.016726031 -2.137629e-05  2.915402e-05  4.001168e-06  2.551672e-05  2.145395e-05 -3.484775e-06 -2.426317e-05
1.002951135  1.047989e-05  4.001168e-06  1.070077e-05  4.343448e-07 -1.032399e-05 -3.489498e-06  6.607115e-06
1.012808350 -3.999886e-05  2.551672e-05  4.343448e-07  4.073945e-05  4.019962e-05  1.502427e-05 -2.422162e-05
1.009838504 -5.043558e-05  2.145395e-05 -1.032399e-05  4.019962e-05  5.047663e-05  1.847048e-05 -3.082551e-05
0.996150195 -1.834830e-05 -3.484775e-06 -3.489498e-06  1.502427e-05  1.847048e-05  1.816514e-05 -3.045290e-08
0.986475577  3.090145e-05 -2.426317e-05  6.607115e-06 -2.422162e-05 -3.082551e-05 -3.045290e-08  2.992073e-05

State 2
mean    cov matrix
-0.005393441  0.0008501205 -1.231927e-04  3.413652e-04 -0.0004927836 -0.0008165848 -3.496732e-04  4.379093e-04
1.037824136 -0.0001231927  3.546602e-04  7.820615e-05  0.0002160609  0.0001422901 -1.206772e-04 -2.508658e-04
1.013889133  0.0003413652  7.820615e-05  2.198099e-04 -0.0001068837 -0.0003166324 -1.713360e-04  1.374691e-04
1.019557602 -0.0004927836  2.160609e-04 -1.068837e-04  0.0003949026  0.0004952245  1.733093e-04 -2.998924e-04
1.005903113 -0.0008165848  1.422901e-04 -3.166324e-04  0.0004952245  0.0007960268  3.345499e-04 -4.319633e-04
0.982515481 -0.0003496732 -1.206772e-04 -1.713360e-04  0.0001733093  0.0003345499  2.726893e-04 -5.242376e-05
0.977179046  0.0004379093 -2.508658e-04  1.374691e-04 -0.0002998924 -0.0004319633 -5.242376e-05  3.611514e-04

State 3
mean    cov matrix
0.003909801  2.934983e-05  1.656072e-05  2.335206e-05 -5.814278e-06 -2.877715e-05 -2.196575e-05  6.761736e-06
1.010126341  1.656072e-05  1.900251e-05  1.943331e-05  2.909245e-06 -1.625496e-05 -1.574899e-05  4.694645e-07
1.007300746  2.335206e-05  1.943331e-05  2.554298e-05  2.309188e-06 -2.288733e-05 -1.676155e-05  6.090567e-06
1.003413014 -5.814278e-06  2.909245e-06  2.309188e-06  8.147512e-06  5.796633e-06  5.191743e-06 -5.902698e-07
0.996163167 -2.877715e-05 -1.625496e-05 -2.288733e-05  5.796633e-06  2.830213e-05  2.164898e-05 -6.603805e-06
0.993369564 -2.196575e-05 -1.574899e-05 -1.676155e-05  5.191743e-06  2.164898e-05  2.055797e-05 -1.040800e-06
0.997202266  6.761736e-06  4.694645e-07  6.090567e-06 -5.902698e-07 -6.603805e-06 -1.040800e-06  5.564118e-06

Log-likelihood: 47728.08
BIC criterium: -94666.79
AIC criterium: -95230.16

```

On to the code:

```
library("quantmod")
library("PerformanceAnalytics")
library('RHmm') #Load HMM package
library('zoo')

#Specify dates for downloading data
startDate = as.Date("2000-01-01") #Specify what date to get the prices from
trainingEndDate = as.Date("2010-01-01") #Specify the date we take as the end of our training sample
NDayLookforwardLowHigh <- 10 #Parameter used when classifying in-sample data as in a trend or not
HmmLikelihoodTestLength <- 5 #How many days of data to calculate the likelihood ratio on to compare models

symbolData <- new.env() #Make a new environment for quantmod to store data in
symbol <- "^GSPC" #S&P 500

getSymbols(symbol, env = symbolData, src = "yahoo", from = startDate)
mktdata <- eval(parse(text=paste("symbolData$",sub("^","",symbol,fixed=TRUE))))
mktdata <- head(mktdata,-1) #Hack to fix a duplicate date problem with Yahoo data
inSampleMktData <- window(mktdata,start=startDate,end=trainingEndDate)
outOfSampleMktData <- window(mktdata,start=trainingEndDate+1)
dailyRet <- Delt(Cl(mktdata),k=1,type="arithmetic") #Daily returns
dailyRet[is.na(dailyRet)] <- 0.00001

inSampleDailyRet <- window(dailyRet,start=startDate,end=trainingEndDate)
outOfSampleDailyRet <- window(dailyRet,start=trainingEndDate+1)

#Expand open/close (+1/-1) markers into a full 0/1 in-trade signal
ConvertTofullSignal <- function(signal){
  results <- rep(0,length(signal))
  intrade <- F
  for(i in seq(1,length(signal))){
    if(signal[i]==-1){
      intrade <- F
    }
    if(signal[i]==1 || intrade){
      results[i] <- 1
      intrade <- T
    }
  }
  return(results)
}

#Generate long trend signal
LongTrendSignal <- rep(0,nrow(inSampleMktData))
for(i in seq(1,nrow(inSampleMktData)-NDayLookforwardLowHigh)){
  dataBlock <- Cl(inSampleMktData[seq(i,i+NDayLookforwardLowHigh),])
  if(coredata(Cl(inSampleMktData[i,])) == min(coredata(dataBlock))){
    LongTrendSignal[i] <- 1
  }
  if(coredata(Cl(inSampleMktData[i,])) == max(coredata(dataBlock))){
    LongTrendSignal[i] <- -1
  }
}
LongTrendSignal <- ConvertTofullSignal(LongTrendSignal)

#Generate short trend signal
ShortTrendSignal <- rep(0,nrow(inSampleMktData))
for(i in seq(1,nrow(inSampleMktData)-NDayLookforwardLowHigh)){
  dataBlock <- Cl(inSampleMktData[seq(i,i+NDayLookforwardLowHigh),])
  if(coredata(Cl(inSampleMktData[i,])) == max(coredata(dataBlock))){
    ShortTrendSignal[i] <- 1
  }
  if(coredata(Cl(inSampleMktData[i,])) == min(coredata(dataBlock))){
    ShortTrendSignal[i] <- -1
  }
}
ShortTrendSignal <- ConvertTofullSignal(ShortTrendSignal)

#Plot our signals
LongTrendSignalForPlot <- LongTrendSignal
LongTrendSignalForPlot[LongTrendSignalForPlot==0] <- NaN
LongTrendSignalForPlot <- Cl(inSampleMktData)*LongTrendSignalForPlot - 100
inSampleLongTrendSignalForPlot <- LongTrendSignalForPlot

ShortTrendSignalForPlot <- ShortTrendSignal
ShortTrendSignalForPlot[ShortTrendSignalForPlot==0] <- NaN
ShortTrendSignalForPlot <- Cl(inSampleMktData)*ShortTrendSignalForPlot + 100
inSampleShortTrendSignalForPlot <- ShortTrendSignalForPlot

dev.new()
layout(1:2)
plot(Cl(inSampleMktData), main="S&P 500 Trend Follow In Sample Training Signals")
lines(inSampleLongTrendSignalForPlot,col="green",type="l")
lines(inSampleShortTrendSignalForPlot,col="red",type="l")
legend(x='bottomright', c("S&P 500 Closing Price","Long Signal","Short Signal"), fill=c("black","green","red"), bty='n')

#Calculate in-sample returns (signals lagged one day to avoid look-ahead)
LongReturns <- Lag(LongTrendSignal)*(inSampleDailyRet)
LongReturns[is.na(LongReturns)] <- 0
ShortReturns <- Lag(-1*ShortTrendSignal)*(inSampleDailyRet)
ShortReturns[is.na(ShortReturns)] <- 0
TotalReturns <- LongReturns + ShortReturns
plot(cumsum(TotalReturns),main="S&P 500 Trend Follow In Sample HMM Training Signals Strategy Returns")
lines(cumsum(LongReturns),col="green")
lines(cumsum(ShortReturns),col="red")
legend(x='bottomright', c("Total Returns","Long Trend Returns","Short Trend Returns"), fill=c("black","green","red"), bty='n')

#Extracts a list of varying-length feature matrices, one per signal/class label
CreateListOfMatrixFeatures <- function(features,signal){
  results <- list()
  extract <- F
  for(i in seq(1,length(signal))){
    if(signal[i]==1 && !extract){
      startIndex <- i
      extract <- T
    }
    if(signal[i]==0 && extract){
      endIndex <- i-1
      dataBlock <- features[startIndex:endIndex,]
      extract <- F
      results[[length(results)+1]] <- as.matrix(dataBlock)
    }
  }
  return(results)
}

#HMM training
#Generate the features that describe the data & split into training and out-of-sample sets
features <- cbind(dailyRet,Hi(mktdata)/Lo(mktdata),Hi(mktdata)/Op(mktdata),Hi(mktdata)/Cl(mktdata),Op(mktdata)/Cl(mktdata),Lo(mktdata)/Cl(mktdata),Lo(mktdata)/Op(mktdata))
inSampleTrainingFeatures <- window(features,start=startDate,end=trainingEndDate)
outOfSampleFeatures <- window(features,start=trainingEndDate+1)

#For each long/short position extract the corresponding feature data and create a list of them
inSampleLongFeaturesList <- CreateListOfMatrixFeatures(inSampleTrainingFeatures,LongTrendSignal)
inSampleShortFeaturesList <- CreateListOfMatrixFeatures(inSampleTrainingFeatures,ShortTrendSignal)

#Train the HMM models
LongModelFit = HMMFit(inSampleLongFeaturesList, nStates=3)
ShortModelFit = HMMFit(inSampleShortFeaturesList, nStates=3)

#Take HmmLikelihoodTestLength days of data and calculate the rolling log-likelihood for each HMM model
inSampleLongLikelihood <- rollapply(inSampleTrainingFeatures,HmmLikelihoodTestLength,align="right",na.pad=T,by.column=F,function(x) {forwardBackward(LongModelFit,as.matrix(x))$LLH})
inSampleShortLikelihood <- rollapply(inSampleTrainingFeatures,HmmLikelihoodTestLength,align="right",na.pad=T,by.column=F,function(x) {forwardBackward(ShortModelFit,as.matrix(x))$LLH})
outOfSampleLongLikelihood <- rollapply(outOfSampleFeatures,HmmLikelihoodTestLength,align="right",na.pad=T,by.column=F,function(x) {forwardBackward(LongModelFit,as.matrix(x))$LLH})
outOfSampleShortLikelihood <- rollapply(outOfSampleFeatures,HmmLikelihoodTestLength,align="right",na.pad=T,by.column=F,function(x) {forwardBackward(ShortModelFit,as.matrix(x))$LLH})

#Create signals for plot / trading
outOfSampleLongTrendSignalForPlot <- 1*(outOfSampleLongLikelihood > outOfSampleShortLikelihood)
outOfSampleLongTrendSignalForPlot[outOfSampleLongTrendSignalForPlot==0] <- NaN
outOfSampleLongTrendSignalForPlot <- outOfSampleLongTrendSignalForPlot*Cl(outOfSampleMktData)-100

outOfSampleShortTrendSignalForPlot <- 1*(outOfSampleLongLikelihood < outOfSampleShortLikelihood)
outOfSampleShortTrendSignalForPlot[outOfSampleShortTrendSignalForPlot==0] <- NaN
outOfSampleShortTrendSignalForPlot <- outOfSampleShortTrendSignalForPlot*Cl(outOfSampleMktData)+100

dev.new()
layout(1:2)
plot(Cl(inSampleMktData), main="S&P 500 Trend Follow In Sample HMM Training Signals")
lines(inSampleLongTrendSignalForPlot,col="green",type="l")
lines(inSampleShortTrendSignalForPlot,col="red",type="l")
legend(x='bottomright', c("S&P 500 Closing Price","Long Signal","Short Signal"), fill=c("black","green","red"), bty='n')

plot(inSampleLongLikelihood,main="Log Likelihood of each HMM model - In Sample")
lines(inSampleLongLikelihood,col="green")
lines(inSampleShortLikelihood,col="red")
legend(x='bottomright', c("Long HMM Likelihood","Short HMM Likelihood"), fill=c("green","red"), bty='n')

dev.new()
layout(1:3)
plot(Cl(outOfSampleMktData), main="S&P 500 HMM Trend Following Out of Sample")
lines(outOfSampleLongTrendSignalForPlot,col="green",type="l")
lines(outOfSampleShortTrendSignalForPlot,col="red",type="l")
legend(x='bottomright', c("S&P 500 Closing Price","Long Signal","Short Signal"), fill=c("black","green","red"), bty='n')

plot(outOfSampleLongLikelihood,main="Log Likelihood of each HMM model - Out Of Sample")
lines(outOfSampleLongLikelihood,col="green")
lines(outOfSampleShortLikelihood,col="red")
legend(x='bottomright', c("Long HMM Likelihood","Short HMM Likelihood"), fill=c("green","red"), bty='n')

#Calculate out-of-sample returns (likelihood-comparison signals lagged one day)
outOfSampleLongReturns <- Lag(1*(outOfSampleLongLikelihood > outOfSampleShortLikelihood))*(outOfSampleDailyRet)
outOfSampleLongReturns[is.na(outOfSampleLongReturns)] <- 0
outOfSampleShortReturns <- Lag(-1*(1*(outOfSampleLongLikelihood < outOfSampleShortLikelihood)))*(outOfSampleDailyRet)
outOfSampleShortReturns[is.na(outOfSampleShortReturns)] <- 0
outOfSampleTotalReturns <- outOfSampleLongReturns + outOfSampleShortReturns
outOfSampleTotalReturns[is.na(outOfSampleTotalReturns)] <- 0

plot(cumsum(outOfSampleTotalReturns),main="S&P 500 HMM Trend Following Out of Sample Strategy Returns")
lines(cumsum(outOfSampleLongReturns),col="green")
lines(cumsum(outOfSampleShortReturns),col="red")
legend(x='bottomright', c("Total Returns","Long Trend Returns","Short Trend Returns"), fill=c("black","green","red"), bty='n')

print(SharpeRatio.annualized(outOfSampleTotalReturns))
```

13 thoughts on “Hidden Markov Models – Trend Following – Part 4 of 4”

1. There is a bug in this code – I put random data in and it still gave me an out-of-sample Sharpe ratio of 2.5

• Thank you, I have updated the article. My Sharpe ratio has evaporated 🙁

This is why I hate R: it's slow, not object oriented, and most of all it's too easy to introduce look-forward bias!

• Which is why I look forward to seeing this code converted to Python. 🙂
Seriously, I'd love to see your machine-learning work as examples in Python. And likely many other folks would too.

All the best and thanks for sharing,

M

• What are you missing in R that is present in other languages that would prevent look ahead bias?

(It is object-oriented by the way, key words are generic functions, S3- and S4-objects)

• Gekko,

I’d recommend breaking up the code a little bit more and adding in a bit more explanation per chunk of code. The horizontal scrolling also makes the code very difficult to read. I believe style guidelines recommend fewer than 80 characters per line.

Also, when using xts data, lag takes a positive number; if you're not using xts, I believe it goes the other way.

2. Gekko – RHmm doesn't seem to work anymore – any suggestions for a suitable replacement? Novice quant here, so I appreciate your advice... awesome blog btw!

3. Gekko,

Thanks for sharing the code; without it I probably wouldn't have figured out how to use HMMs.

4. Thank you for sharing the code. I have a question: why do you fit two models with three states each? To my mind that trains two models to find three latent states within either a long or a short trend, but it does not tell you whether the features are bullish or bearish. The LLH just gives you a measure of how well the three states fit the test features in one model or the other.