This post explores applying NEAT to trading the S&P. The learned strategy significantly out performs buying and holding both in and out of sample.
Features:
A key part of any machine learning problem is defining the features and ensuring that they’re normalised in some fashion.
The features will be rolling percentiles of the following economic data, a rolling percentile takes the last n data points and calculates what % of data point the latest data point is greater than.
- Non-farm payrolls
- Unemployment Rate
- GDP
Fitness Function
The fitness function is final equity, and aims to maximise the final equity
Termination Function
Any genome that has a 20% draw down, or attempts to use a leverage greater than +/- 2 is terminated. In practise you wouldn’t want to make your system machine learn the risk controls as there is potential that they don’t get learned. The reason they are embedded inside the strategy is to speed up the learning process as we can kill genomes early before the simulation is complete based upon breaking the risk rules.
Plot of all data / features
It appears that when non-farms fall to their lower percentiles / unemployment reaches it’s highest percentiles the day to day returns in the S&P become more volatile. It is hoped that the learning can take advantage of this.
Training Results
The learning has identified a strategy that out performs simply buying and holding. The proposed strategy has a max drawdown around 20% vs the buy and hold having a draw down of 40%. Additionally the strategy shorted the index between 2000-2003 as it was selling off before going long to 2007. Generating a return of 80% vs buy and hold of 7%!
Out of sample results
In the out of sample data (not used during the training) the strategy significantly out performed buying and holding, approx 250% return vs 50% with a max drawdown close to 20% vs buy and hold draw down of 50%.
Onto the code:
install.packages("devtools") library("devtools") install_github("RNeat","ahunteruk") #Install from github as not yet on CRAN library("RNeat") library("quantmod") marketSymbol <- "^GSPC" econmicDataSymbols <- c("UNRATE","PAYEMS","GDP") mktData <- new.env() #Make a new environment for quantmod to store data in economicData <- new.env() #Make a new environment for quantmod to store data in #Specify dates for downloading data, training models and running simulation dataDownloadStartDate <- as.Date("2000-06-01") trainingStartDate = as.Date("2001-01-01") #Specify the date to start training (yyyy-mm-dd) trainingEndDate = as.Date("2006-12-31") #Specify the date to end training outOfSampleStartDate = as.Date("2007-01-01") outOfSampleEndDate = as.Date("2016-07-15") #Download Data getSymbols(marketSymbol,env=mktData,from=dataDownloadStartDate) #S&P 500 getSymbols(econmicDataSymbols,src="FRED",env=economicData,from=dataDownloadStartDate) #Payems is non-farms payrolls nEconomicDataPercentileLookbackShort <- 20 nEconomicDataPercentileLookbackMedium <- 50 nEconomicDataPercentileLookbackLong <- 100 rollingPercentile <- function(data,n){ percentile <- function(dataBlock){ last(rank(dataBlock)/length(dataBlock)) } return (as.zoo(rollapply(as.zoo(data),width=n,percentile,align="right",by.column=TRUE))) } stockCleanNameFunc <- function(name){ return(sub("^","",name,fixed=TRUE)) } clClRet <- as.zoo((lag(Cl(get(stockCleanNameFunc(marketSymbol),mktData)),-1)/Cl(get(stockCleanNameFunc(marketSymbol),mktData))-1)) payemsShortPercentile <- rollingPercentile(economicData$PAYEMS,nEconomicDataPercentileLookbackShort) payemsMediumPercentile <- rollingPercentile(economicData$PAYEMS,nEconomicDataPercentileLookbackMedium) payemsLongPercentile <- rollingPercentile(economicData$PAYEMS,nEconomicDataPercentileLookbackLong) unrateShortPercentile <- rollingPercentile(economicData$UNRATE,nEconomicDataPercentileLookbackShort) unrateMediumPercentile <- rollingPercentile(economicData$UNRATE,nEconomicDataPercentileLookbackMedium) unrateLongPercentile <- rollingPercentile(economicData$UNRATE,nEconomicDataPercentileLookbackLong) gdpShortPercentile <- rollingPercentile(economicData$GDP,nEconomicDataPercentileLookbackShort) gdpMediumPercentile <- rollingPercentile(economicData$GDP,nEconomicDataPercentileLookbackMedium) gdpLongPercentile <- rollingPercentile(economicData$GDP,nEconomicDataPercentileLookbackLong) #join the data sets, fill in any missing dates with the previous none NA value mergedData <- na.locf(merge(economicData$PAYEMS,merge(Cl(get(stockCleanNameFunc(marketSymbol),mktData)), economicData$PAYEMS,payemsShortPercentile,payemsMediumPercentile,payemsLongPercentile, economicData$UNRATE,unrateShortPercentile,unrateMediumPercentile,unrateLongPercentile, economicData$GDP,gdpShortPercentile,gdpMediumPercentile,gdpLongPercentile ,all.x=T),all=T)) mergedData <- mergedData[,-1] ClClRet <- as.zoo(lag(mergedData[,1],-1)/mergedData[,1]-1) ClTZero <- as.zoo(mergedData[,1]) ClTOne <- as.zoo(lag(mergedData[,1],-1)) mergedData <- merge(ClClRet,ClTOne,ClTZero,mergedData) mergedData <- window(mergedData,start=dataDownloadStartDate) colnames(mergedData) <- c("ClClRet","ClTOne","ClTZero","Price","Payems","Payems.short","Payems.medium","Payems.long", "Unrate","Unrate.short","Unrate.medium","Unrate.long", "Gdp","Gdp.short","Gdp.medium","Gdp.long","all.x") dev.new() par(mfrow=c(4,2)) plot(mergedData[,"Price"], main="S&P Close Price",ylab="Close Price") plot(mergedData[,"ClClRet"], main="S&P Close Price",ylab="Close Price") plot(mergedData[,"Payems"], main="Non-Farm Payrolls",ylab="Thousands of Persons") plot(mergedData[,"Payems.short"], main="Non-Farm Payrolls Rolling Percentile",ylab="Percentile") lines(mergedData[,"Payems.medium"], col="red") lines(mergedData[,"Payems.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') plot(mergedData[,"Unrate"], main="Unemployment Rate",ylab="Percent") plot(mergedData[,"Unrate.short"], main="Unemployment Rate Rolling Percentile",ylab="Percentile") lines(mergedData[,"Unrate.medium"], col="red") lines(mergedData[,"Unrate.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') plot(mergedData[,"Gdp"], main="GDP",ylab="Billions of USD") plot(mergedData[,"Gdp.short"], main="GBP Rolling Percentile",ylab="Percentile") lines(mergedData[,"Gdp.medium"], col="red") lines(mergedData[,"Gdp.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') featuresTrainingData <- window(mergedData,start=trainingStartDate,end=trainingEndDate) featuresOutOfSampleData <- window(mergedData,start=outOfSampleStartDate,end=outOfSampleEndDate) #Genetic algo setup simulationData <- featuresTrainingData trading.InitialState <- function(){ state <- list() state[1] <- 100 #Equity state[2] <- 0 #% of Equity allocated to share (-ve for shorts) state[3] <- state[1] #Maximum equity achieved state[4] <- 1 #Trading day number state[5] <- simulationData[1,"Price"] state[6] <- simulationData[1,"Payems.short"] state[7] <- simulationData[1,"Payems.medium"] state[8] <- simulationData[1,"Payems.long"] state[9] <- simulationData[1,"Unrate.short"] state[10] <- simulationData[1,"Unrate.medium"] state[11] <- simulationData[1,"Unrate.long"] state[12] <- simulationData[1,"Gdp.short"] state[13] <- simulationData[1,"Gdp.medium"] state[14] <- simulationData[1,"Gdp.long"] return(state) } trading.ConvertStateToNeuralNetInputs <- function(currentState){ return (currentState) } trading.UpdateState <- function(currentState,neuralNetOutputs){ #print(currentState) equity <- currentState[[1]] equityAllocation <- neuralNetOutputs[[1]] maxEquityAchieved <- currentState[[3]] tradingDay <- currentState[[4]] pctChange <- as.double((simulationData[tradingDay+1,"Price"]))/as.double((simulationData[tradingDay,"Price"]))-1 #print(paste("pctChange",pctChange)) #print(paste("equityAllocation",equityAllocation)) pnl <- equity * equityAllocation * pctChange equity <- equity + pnl maxEquityAchieved <- max(maxEquityAchieved,equity) tradingDay <- tradingDay + 1 currentState[1] <- equity currentState[2] <- equityAllocation currentState[3] <- maxEquityAchieved currentState[4] <- tradingDay currentState[5] <- simulationData[tradingDay,"Price"] currentState[6] <- simulationData[tradingDay,"Payems.short"] currentState[7] <- simulationData[tradingDay,"Payems.medium"] currentState[8] <- simulationData[tradingDay,"Payems.long"] currentState[9] <- simulationData[tradingDay,"Unrate.short"] currentState[10] <- simulationData[tradingDay,"Unrate.medium"] currentState[11] <- simulationData[tradingDay,"Unrate.long"] currentState[12] <- simulationData[tradingDay,"Gdp.short"] currentState[13] <- simulationData[tradingDay,"Gdp.medium"] currentState[14] <- simulationData[tradingDay,"Gdp.long"] return (currentState) } trading.UpdateFitness <- function(oldState,updatedState,oldFitness){ return (as.double(updatedState[1])) #equity achieved } trading.CheckForTermination <- function(frameNum,oldState,updatedState,oldFitness,newFitness){ equity <- updatedState[[1]] equityAllocation <- updatedState[[2]] maxEquityAchieved <- updatedState[[3]] tradingDay <- updatedState[[4]] if(tradingDay >= nrow(simulationData)){ return(T) } if(abs(equityAllocation) > 2){ #Too much leverage return(T) } if(equity/maxEquityAchieved < 0.8){ #20% draw down return(T) } else { return (F) } } trading.PlotState <-function(updatedState){ equity <- currentState[[1]] equityAllocation <- currentState[[2]] maxEquityAchieved <- currentState[[3]] plot(updatedState) } plotStateAndInputDataFunc <- function(stateData, inputData, titleText){ buyandholdret <- inputData[,"Price"]/coredata(inputData[1,"Price"]) strategyret <- stateData[,"Equity"]/100 maxbuyandholdret <- cummax(buyandholdret) buyandholddrawdown <- (buyandholdret/maxbuyandholdret-1) strategydrawdown <- (stateData[,"Equity"]/stateData[,"MaxEquity"]-1) dev.new() par(mfrow=c(4,2),oma = c(0, 0, 2, 0)) plot(stateData[,"Price"],main="Price",ylab="Price") plot(buyandholdret,main="Performance (Return on Initial Equity)", ylab="Return", ylim=c(min(buyandholdret,strategyret),max(buyandholdret,strategyret))) lines(strategyret,col="red") legend(x='bottomright', c('Buy & Hold','Strategy'), fill=c("black","red"), bty='n') plot(inputData[,"ClClRet"],main="Stock Returns", ylab="Return") plot(maxbuyandholdret*100,main="Max Equity", ylim=c(min(maxbuyandholdret*100,stateData[,"MaxEquity"]),max(maxbuyandholdret*100,stateData[,"MaxEquity"])),ylab="Equity $") lines(stateData[,"MaxEquity"],col="red") legend(x='bottomright', c('Buy & Hold','Strategy'), fill=c("black","red"), bty='n') plot(inputData[,"Payems.short"], main="Payrolls Rolling Percentile",ylab="Percentile") lines(inputData[,"Payems.medium"], col="red") lines(inputData[,"Payems.long"], col="blue") legend(x='bottomright', c(paste(nEconomicDataPercentileLookbackShort,"Points"), paste(nEconomicDataPercentileLookbackMedium,"Points"), paste(nEconomicDataPercentileLookbackLong,"Points")), fill=c("black","red","blue"), bty='n') plot(buyandholddrawdown,main="Draw Down",ylab="Percent (%)") lines(strategydrawdown,col="red") legend(x='bottomright', c('Buy & Hold','Strategy'), fill=c("black","red"), bty='n') plot(stateData[,"Allocation"],main="Allocation",ylab="Allocation") mtext(titleText, outer = TRUE, cex = 1.5) } config <- newConfigNEAT(14,1,500,50) tradingSimulation <- newNEATSimulation(config, trading.InitialState, trading.UpdateState, trading.ConvertStateToNeuralNetInputs, trading.UpdateFitness, trading.CheckForTermination, trading.PlotState) tradingSimulation <- NEATSimulation.RunSingleGeneration(tradingSimulation) for(i in seq(1,35)){ save.image(file="tradingSim.RData") #So we can recover if we crash for any reason tradingSimulation <- NEATSimulation.RunSingleGeneration(tradingSimulation) } dev.new() plot(tradingSimulation) stateHist <- NEATSimulation.GetStateHistoryForGenomeAndSpecies(tradingSimulation) colnames(stateHist) <- c("Equity","Allocation","MaxEquity","TradingDay","Price", "Payems.short","Payems.medium","Payems.long", "Unrate.short","Unrate.medium","Unrate.long", "Gdp.short","Gdp.medium","Gdp.long") row.names(stateHist)<-row.names(as.data.frame(simulationData[1:nrow(stateHist),])) stateHist <- as.zoo(stateHist) plotStateAndInputDataFunc(stateHist,simulationData,"Training Data") simulationData <- featuresOutOfSampleData stateHist <- NEATSimulation.GetStateHistoryForGenomeAndSpecies(tradingSimulation) colnames(stateHist) <- c("Equity","Allocation","MaxEquity","TradingDay","Price", "Payems.short","Payems.medium","Payems.long", "Unrate.short","Unrate.medium","Unrate.long", "Gdp.short","Gdp.medium","Gdp.long") row.names(stateHist)<-row.names(as.data.frame(simulationData[1:nrow(stateHist),])) stateHist <- as.zoo(stateHist) plotStateAndInputDataFunc(stateHist,simulationData,"Out of Sample Data") |
Pingback: Quantocracy's Daily Wrap for 10/23/2016 | Quantocracy
hi, when i ran this script, I got Error in line 259,
> row.names(stateHist)<-row.names(as.data.frame(simulationData))
Error in `row.names<-.data.frame`(`*tmp*`, value = c("2007-01-01", "2007-01-03", :
'row.names''length is not right.
I got stateHist only 392 rows, them i ran it again,stateHist only 534 rows.
I think the right rows maybe 41548.
why I got 392 or 534 rows ?
I’ve updated the script, making the change:
row.names(stateHist)<-row.names(as.data.frame(simulationData[1:nrow(stateHist),])) What was happening was that the simulation was terminating early due to either over leverage or max draw down being exceeded. Hence the length of the stateHist didn't match the length of the simulationData. Thanks
Hi Gekko,
Thank you for posting this series on the NEAT algorithm, as well as porting the code into R. I’m having some trouble installing the RNeat package. When I run:
library(“devtools”)
install_github(“ahunteruk/RNeat”)
I get the following errors:
Error: object ‘add_edges’ is not exported by ‘namespace:graph’
ERROR: lazy loading failed for package ‘RNeat’
Error: Command failed (1)
I’m using R version 3.1.3 on Platform: x86_64-apple-darwin10.8.0 (64-bit).
Any idea on how I can get the package to install properly?
Thanks in advance, Neil
There was a typo in the first error I wrote. It should read:
‘namespace:igraph’ rather than ‘namespace:graph’
Thanks.
hi!
trying tutorial and getting “Error: could not find function “newConfigNEAT”
Any idea whats going on?
Full log here: http://pastebin.com/PRUJe5tu
Pingback: Trading and Neural Network | Ing. Lele's Blog - HeadQuarter
Hey GekkoQuant,
first of all thank you for creating the NEAT code.
I tried to run it but unfortunately the simulation of training and out of sample sets doesn´t work over the defined length.
The stateHist takes the value “zoo series from 2007-01-01 (only) to 2009-01-02” (instead of 2016-07-15) in the out of sample simulation and a similar problem occurs when simulating the training set.
Has this problem already been discovered?
Is there any solution?
Thanks in advance!
d> install_github(“ahunteruk/RNeat”) #Install from github as not yet on CRAN
Error in curl::curl_fetch_disk(url, x$path, handle = handle) :
Problem with the SSL CA cert (path? access rights?)
I get this error when trying to install the package. Anyway around this?