This post explores applying NEAT to trading the S&P. The learned strategy significantly out performs buying and holding both in and out of sample.
Features:
A key part of any machine learning problem is defining the features and ensuring that they’re normalised in some fashion.
The features will be rolling percentiles of the following economic data, a rolling percentile takes the last n data points and calculates what % of data point the latest data point is greater than.
- Non-farm payrolls
- Unemployment Rate
- GDP
Fitness Function
The fitness function is final equity, and aims to maximise the final equity
Termination Function
Any genome that has a 20% draw down, or attempts to use a leverage greater than +/- 2 is terminated. In practise you wouldn’t want to make your system machine learn the risk controls as there is potential that they don’t get learned. The reason they are embedded inside the strategy is to speed up the learning process as we can kill genomes early before the simulation is complete based upon breaking the risk rules.
Plot of all data / features
It appears that when non-farms fall to their lower percentiles / unemployment reaches it’s highest percentiles the day to day returns in the S&P become more volatile. It is hoped that the learning can take advantage of this.
Training Results
The learning has identified a strategy that out performs simply buying and holding. The proposed strategy has a max drawdown around 20% vs the buy and hold having a draw down of 40%. Additionally the strategy shorted the index between 2000-2003 as it was selling off before going long to 2007. Generating a return of 80% vs buy and hold of 7%!
Out of sample results
In the out of sample data (not used during the training) the strategy significantly out performed buying and holding, approx 250% return vs 50% with a max drawdown close to 20% vs buy and hold draw down of 50%.
Onto the code:
install.packages("devtools") library("devtools") install_github("RNeat","ahunteruk") #Install from github as not yet on CRAN library("RNeat") library("quantmod") marketSymbol <- "^GSPC" econmicDataSymbols <- c("UNRATE","PAYEMS","GDP") mktData <- new.env() #Make a new environment for quantmod to store data in economicData <- new.env() #Make a new environment for quantmod to store data in #Specify dates for downloading data, training models and running simulation dataDownloadStartDate <- as.Date("2000-06-01") trainingStartDate = as.Date("2001-01-01") #Specify the date to start training (yyyy-mm-dd) trainingEndDate = as.Date("2006-12-31") #Specify the date to end training outOfSampleStartDate = as.Date("2007-01-01") outOfSampleEndDate = as.Date("2016-07-15") #Download Data getSymbols(marketSymbol,env=mktData,from=dataDownloadStartDate) #S&P 500 getSymbols(econmicDataSymbols,src="FRED",env=economicData,from=dataDownloadStartDate) #Payems is non-farms payrolls nEconomicDataPercentileLookbackShort <- 20 nEconomicDataPercentileLookbackMedium <- 50 nEconomicDataPercentileLookbackLong <- 100 rollingPercentile <- function(data,n){ percentile <- function(dataBlock){ last(rank(dataBlock)/length(dataBlock)) } return (as.zoo(rollapply(as.zoo(data),width=n,percentile,align="right",by.column=TRUE))) } stockCleanNameFunc <- function(name){ return(sub("^","",name,fixed=TRUE)) } clClRet <- as.zoo((lag(Cl(get(stockCleanNameFunc(marketSymbol),mktData)),-1)/Cl(get(stockCleanNameFunc(marketSymbol),mktData))-1)) payemsShortPercentile <- rollingPercentile(economicData$PAYEMS,nEconomicDataPercentileLookbackShort) payemsMediumPercentile <- rollingPercentile(economicData$PAYEMS,nEconomicDataPercentileLookbackMedium) payemsLongPercentile <- rollingPercentile(economicData$PAYEMS,nEconomicDataPercentileLookbackLong) unrateShortPercentile <- rollingPercentile(economicData$UNRATE,nEconomicDataPercentileLookbackShort) unrateMediumPercentile <- rollingPercentile(economicData$UNRATE,nEconomicDataPercentileLookbackMedium) unrateLongPercentile <- rollingPercentile(economicData$UNRATE,nEconomicDataPercentileLookbackLong) gdpShortPercentile <- rollingPercentile(economicData$GDP,nEconomicDataPercentileLookbackShort) gdpMediumPercentile <- rollingPercentile(economicData$GDP,nEconomicDataPercentileLookbackMedium) gdpLongPercentile <- rollingPercentile(economicData$GDP,nEconomicDataPercentileLookbackLong) #join the data sets, fill in any missing dates with the previous none NA value mergedData <- na.locf(merge(economicData$PAYEMS,merge(Cl(get(stockCleanNameFunc(marketSymbol),mktData)), economicData$PAYEMS,payemsShortPercentile,payemsMediumPercentile,payemsLongPercentile, economicData$UNRATE,unrateShortPercentile,unrateMediumPercentile,unrateLongPercentile, economicData$GDP,gdpShortPercentile,gdpMediumPercentile,gdpLongPercentile ,all.x=T),all=T)) mergedData <- mergedData[,-1] ClClRet <- as.zoo(lag(mergedData[,1],-1)/mergedData[,1]-1) ClTZero <- as.zoo(mergedData[,1]) ClTOne <- as.zoo(lag(mergedData[,1],-1)) mergedData <- merge(ClClRet,ClTOne,ClTZero,mergedData) mergedData <- window(mergedData,start=dataDownloadStartDate) colnames(mergedData) <- c("ClClRet","ClTOne","ClTZero","Price","Payems","Payems.short","Payems.medium","Payems.long", "Unrate","Unrate.short","Unrate.medium","Unrate.long", "Gdp","Gdp.short","Gdp.medium","Gdp.long","all.x") dev.new() par(mfrow=c(4,2)) plot(mergedData[,"Price"], main="S&P Close Price",ylab="Close Price") plot(mergedData[,"ClClRet"], main="S&P Close Price",ylab="Close Price") plot(mergedData[,"Payems"], main="Non-Farm Payrolls",ylab="Thousands of Persons") plot(mergedData[,"Payems.short"], main="Non-Farm Payrolls Rolling Percentile",ylab="Percentile") lines(mergedData[,"Payems.medium"], col="red") lines(mergedData[,"Payems.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') plot(mergedData[,"Unrate"], main="Unemployment Rate",ylab="Percent") plot(mergedData[,"Unrate.short"], main="Unemployment Rate Rolling Percentile",ylab="Percentile") lines(mergedData[,"Unrate.medium"], col="red") lines(mergedData[,"Unrate.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') plot(mergedData[,"Gdp"], main="GDP",ylab="Billions of USD") plot(mergedData[,"Gdp.short"], main="GBP Rolling Percentile",ylab="Percentile") lines(mergedData[,"Gdp.medium"], col="red") lines(mergedData[,"Gdp.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') featuresTrainingData <- window(mergedData,start=trainingStartDate,end=trainingEndDate) featuresOutOfSampleData <- window(mergedData,start=outOfSampleStartDate,end=outOfSampleEndDate) #Genetic algo setup simulationData <- featuresTrainingData trading.InitialState <- function(){ state <- list() state[1] <- 100 #Equity state[2] <- 0 #% of Equity allocated to share (-ve for shorts) state[3] <- state[1] #Maximum equity achieved state[4] <- 1 #Trading day number state[5] <- simulationData[1,"Price"] state[6] <- simulationData[1,"Payems.short"] state[7] <- simulationData[1,"Payems.medium"] state[8] <- simulationData[1,"Payems.long"] state[9] <- simulationData[1,"Unrate.short"] state[10] <- simulationData[1,"Unrate.medium"] state[11] <- simulationData[1,"Unrate.long"] state[12] <- simulationData[1,"Gdp.short"] state[13] <- simulationData[1,"Gdp.medium"] state[14] <- simulationData[1,"Gdp.long"] return(state) } trading.ConvertStateToNeuralNetInputs <- function(currentState){ return (currentState) } trading.UpdateState <- function(currentState,neuralNetOutputs){ #print(currentState) equity <- currentState[[1]] equityAllocation <- neuralNetOutputs[[1]] maxEquityAchieved <- currentState[[3]] tradingDay <- currentState[[4]] pctChange <- as.double((simulationData[tradingDay+1,"Price"]))/as.double((simulationData[tradingDay,"Price"]))-1 #print(paste("pctChange",pctChange)) #print(paste("equityAllocation",equityAllocation)) pnl <- equity * equityAllocation * pctChange equity <- equity + pnl maxEquityAchieved <- max(maxEquityAchieved,equity) tradingDay <- tradingDay + 1 currentState[1] <- equity currentState[2] <- equityAllocation currentState[3] <- maxEquityAchieved currentState[4] <- tradingDay currentState[5] <- simulationData[tradingDay,"Price"] currentState[6] <- simulationData[tradingDay,"Payems.short"] currentState[7] <- simulationData[tradingDay,"Payems.medium"] currentState[8] <- simulationData[tradingDay,"Payems.long"] currentState[9] <- simulationData[tradingDay,"Unrate.short"] currentState[10] <- simulationData[tradingDay,"Unrate.medium"] currentState[11] <- simulationData[tradingDay,"Unrate.long"] currentState[12] <- simulationData[tradingDay,"Gdp.short"] currentState[13] <- simulationData[tradingDay,"Gdp.medium"] currentState[14] <- simulationData[tradingDay,"Gdp.long"] return (currentState) } trading.UpdateFitness <- function(oldState,updatedState,oldFitness){ return (as.double(updatedState[1])) #equity achieved } trading.CheckForTermination <- function(frameNum,oldState,updatedState,oldFitness,newFitness){ equity <- updatedState[[1]] equityAllocation <- updatedState[[2]] maxEquityAchieved <- updatedState[[3]] tradingDay <- updatedState[[4]] if(tradingDay >= nrow(simulationData)){ return(T) } if(abs(equityAllocation) > 2){ #Too much leverage return(T) } if(equity/maxEquityAchieved < 0.8){ #20% draw down return(T) } else { return (F) } } trading.PlotState <-function(updatedState){ equity <- currentState[[1]] equityAllocation <- currentState[[2]] maxEquityAchieved <- currentState[[3]] plot(updatedState) } plotStateAndInputDataFunc <- function(stateData, inputData, titleText){ buyandholdret <- inputData[,"Price"]/coredata(inputData[1,"Price"]) strategyret <- stateData[,"Equity"]/100 maxbuyandholdret <- cummax(buyandholdret) buyandholddrawdown <- (buyandholdret/maxbuyandholdret-1) strategydrawdown <- (stateData[,"Equity"]/stateData[,"MaxEquity"]-1) dev.new() par(mfrow=c(4,2),oma = c(0, 0, 2, 0)) plot(stateData[,"Price"],main="Price",ylab="Price") plot(buyandholdret,main="Performance (Return on Initial Equity)", ylab="Return", ylim=c(min(buyandholdret,strategyret),max(buyandholdret,strategyret))) lines(strategyret,col="red") legend(x='bottomright', c('Buy & Hold','Strategy'), fill=c("black","red"), bty='n') plot(inputData[,"ClClRet"],main="Stock Returns", ylab="Return") plot(maxbuyandholdret*100,main="Max Equity", ylim=c(min(maxbuyandholdret*100,stateData[,"MaxEquity"]),max(maxbuyandholdret*100,stateData[,"MaxEquity"])),ylab="Equity $") lines(stateData[,"MaxEquity"],col="red") legend(x='bottomright', c('Buy & Hold','Strategy'), fill=c("black","red"), bty='n') plot(inputData[,"Payems.short"], main="Payrolls Rolling Percentile",ylab="Percentile") lines(inputData[,"Payems.medium"], col="red") lines(inputData[,"Payems.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') plot(buyandholddrawdown,main="Draw Down",ylab="Percent (%)") lines(strategydrawdown,col="red") legend(x='bottomright', c('Buy & Hold','Strategy'), fill=c("black","red"), bty='n') plot(stateData[,"Allocation"],main="Allocation",ylab="Allocation") mtext(titleText, outer = TRUE, cex = 1.5) } config <- newConfigNEAT(14,1,500,50) tradingSimulation <- newNEATSimulation(config, trading.InitialState, trading.UpdateState, trading.ConvertStateToNeuralNetInputs, trading.UpdateFitness, trading.CheckForTermination, trading.PlotState) tradingSimulation <- NEATSimulation.RunSingleGeneration(tradingSimulation) for(i in seq(1,35)){ save.image(file="tradingSim.RData") #So we can recover if we crash for any reason tradingSimulation <- NEATSimulation.RunSingleGeneration(tradingSimulation) } dev.new() plot(tradingSimulation) stateHist <- NEATSimulation.GetStateHistoryForGenomeAndSpecies(tradingSimulation) colnames(stateHist) <- c("Equity","Allocation","MaxEquity","TradingDay","Price", "Payems.short","Payems.medium","Payems.long", "Unrate.short","Unrate.medium","Unrate.long", "Gdp.short","Gdp.medium","Gdp.long") row.names(stateHist)<-row.names(as.data.frame(simulationData[1:nrow(stateHist),])) stateHist <- as.zoo(stateHist) plotStateAndInputDataFunc(stateHist,simulationData,"Training Data") simulationData <- featuresOutOfSampleData stateHist <- NEATSimulation.GetStateHistoryForGenomeAndSpecies(tradingSimulation) colnames(stateHist) <- c("Equity","Allocation","MaxEquity","TradingDay","Price", "Payems.short","Payems.medium","Payems.long", "Unrate.short","Unrate.medium","Unrate.long", "Gdp.short","Gdp.medium","Gdp.long") row.names(stateHist)<-row.names(as.data.frame(simulationData[1:nrow(stateHist),])) stateHist <- as.zoo(stateHist) plotStateAndInputDataFunc(stateHist,simulationData,"Out of Sample Data") |