This post investigates a strategy called Buy on Gap that was discussed by E.P. Chan in his blog post “the life and death of a strategy”. It is a mean-reverting strategy that buys the weakest stocks in the S&P 500 at the open and liquidates the positions at the close. The performance of the strategy is shown below (**Annualized Sharpe Ratio (Rf=0%) 2.129124**).

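All numbers in this table are % (i.e. 12.6 is 12.6%).

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | BuyOnGap | S&P500 |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|----------|--------|
| 2005 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -0.4 | -0.6 | 1.1 | 0.7 | 1.0 | -0.2 | 0.3 | 1.8 | -0.1 |
| 2006 | 0.2 | -0.6 | -0.3 | 0.0 | 1.1 | 0.1 | 0.0 | 0.4 | 0.1 | 0.1 | 0.2 | -0.2 | 1.1 | -2.0 |
| 2007 | 0.8 | 0.9 | 0.1 | -1.1 | 0.3 | -0.2 | -1.5 | 0.2 | -0.2 | 0.9 | -0.4 | 0.3 | 0.0 | 1.0 |
| 2008 | 4.3 | -1.9 | 0.8 | -0.5 | 0.0 | 0.7 | 0.2 | -0.7 | 2.0 | 3.3 | 2.0 | 2.0 | 12.6 | 6.0 |
| 2009 | -2.9 | 0.1 | 1.4 | -1.1 | 1.3 | -0.8 | 0.4 | 0.0 | -0.3 | -1.9 | 0.8 | -1.0 | -4.0 | -7.3 |
| 2010 | -1.0 | -0.1 | 0.0 | -1.1 | -0.7 | -0.6 | 1.1 | 0.7 | -0.5 | 0.6 | 0.5 | 0.1 | -1.1 | -5.9 |
| 2011 | 0.6 | 0.3 | 0.2 | -0.1 | 0.2 | 0.4 | 0.7 | 0.1 | -1.4 | -1.2 | 1.4 | -0.2 | 1.0 | 2.1 |
| 2012 | -0.3 | -0.5 | -0.1 | 0.0 | -1.0 | NA | NA | NA | NA | NA | NA | NA | -1.8 | -0.8 |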

The post mentions two trading criteria:

- Buy the 100 stocks out of the S&P 500 constituents that have the lowest returns from the previous day's low to the current day's opening price
- Provided that the above return is less than 1 times the 90-day standard deviation of close-to-close returns

The script exposes three input variables:

- nStocksBuy – How many stocks to buy
- stdLookback – How many days to look back for the standard deviation calculation
- stdMultiple – Number to multiply the standard deviation by (1 in criterion 2); the larger this variable, the more stocks will satisfy criterion 2
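To make the two criteria concrete, here is a minimal sketch for a single day using base R only. The tickers, returns and thresholds are made-up numbers for illustration, and the threshold is passed in directly rather than computed from a 90-day window.

```r
# Toy illustration of the two entry criteria for one day (base R only).
# All tickers and numbers below are invented for the example.
nStocksBuy  <- 2   # how many stocks to buy
stdMultiple <- 1   # multiplier applied to the standard deviation

# Return from yesterday's low to today's open, one entry per stock
lowOpenRet <- c(AAA = -0.030, BBB = -0.005, CCC = 0.010, DDD = -0.020)

# Per-stock threshold; assumed here, the full script derives it from a rolling window
threshold <- stdMultiple * c(AAA = 0.02, BBB = 0.02, CCC = 0.02, DDD = 0.02)

# Criterion 1: the stock is among the nStocksBuy lowest gap returns
inLowest <- rank(lowOpenRet, ties.method = "first") <= nStocksBuy

# Criterion 2: the gap return is below the threshold
belowThreshold <- lowOpenRet < threshold

# Buy only the stocks that pass both criteria
buy <- names(lowOpenRet)[inLowest & belowThreshold]
print(buy)
```

Raising nStocksBuy widens criterion 1, and raising stdMultiple widens criterion 2.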

The code is split into 5 distinct sections.

**Section 1**: Loop through all the stocks loaded from the data file; for each stock calculate the return from the previous day's low to the current day's open (lowOpenRet). Calculate the close-to-close return and its rolling standard deviation (stdClClRet). Also calculate the open-to-close return for every day (dayClOpRet); if we decide to trade on a given day, this is the strategy's return for that day.
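The per-stock calculations can be sketched on a tiny made-up price series. This version uses plain vectors and a hand-rolled rolling window instead of quantmod and caTools, and a lookback of 2 instead of 90, so it is only an illustration of the idea, not the script itself.

```r
# Toy OHLC data for one stock (invented prices), oldest row first
px <- data.frame(
  Open  = c(10.0, 10.2,  9.8, 10.1),
  Low   = c( 9.9, 10.0,  9.7, 10.0),
  Close = c(10.1, 10.0, 10.0, 10.3)
)
n <- nrow(px)

# Return from yesterday's low to today's open
prevLow    <- c(NA, px$Low[-n])
lowOpenRet <- (px$Open - prevLow) / prevLow

# Close-to-close return, then lagged one day so that today's value
# only uses information available before today's open
prevClose   <- c(NA, px$Close[-n])
clClRet     <- (px$Close - prevClose) / prevClose
clClRetPrev <- c(NA, clClRet[-n])

# Rolling standard deviation over the last stdLookback lagged returns
stdLookback <- 2   # 90 in the post; shortened for the toy data
rollSd <- sapply(seq_len(n), function(i) {
  if (i < stdLookback) return(NA_real_)
  sd(clClRetPrev[(i - stdLookback + 1):i])
})

# Open-to-close return: what a trade taken at the open would earn that day
dayClOpRet <- (px$Close - px$Open) / px$Open
```

The first rows are NA because there is no previous day (or not enough history) to compute from.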

**Section 2**: This section combines columns from each of the individual stock data frames into large matrices that cover all the stocks. retMat contains the lowOpenRet for each stock. stdMat contains the stdClClRet for all stocks, dayretMat contains the dayClOpRet for all stocks.

Essentially instead of having lots of variables, we combine them into a big matrix.
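As a small illustration of the idea, the sketch below collects one column from each stock into a days-by-stocks matrix. The `perStock` list and its contents are hypothetical stand-ins for the per-stock data, and `sapply` is used here in place of the script's repeated `cbind`.

```r
# Hypothetical per-stock results: a named list with one data frame per symbol
perStock <- list(
  AAA = data.frame(LowOpenRet = c(-0.01,  0.02), DayClOpRet = c(0.005, -0.002)),
  BBB = data.frame(LowOpenRet = c( 0.00, -0.03), DayClOpRet = c(0.001,  0.010))
)

# Pull the same column out of every stock; sapply returns a matrix with
# one row per day and one column per symbol
retMat    <- sapply(perStock, function(s) s$LowOpenRet)
dayretMat <- sapply(perStock, function(s) s$DayClOpRet)

print(retMat)
```

With everything in matrices, each day is a row and the later sections can work on whole rows at once.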

**Section 3**: This checks whether the matrices from section 2 meet the trade entry criteria. It produces two matrices (conditionOne and conditionTwo), which contain a 1 for a passed entry criterion and a 0 for a failed one.
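A minimal sketch of the two tests on toy matrices follows. The values are invented, and condition two uses a vectorised comparison rather than the element-by-element `apply` in the script; the result should be the same 0/1 matrix.

```r
# Toy inputs: 2 days x 3 stocks (invented values, NA = missing data)
retMat <- rbind(c(-0.03,  0.01, -0.02),
                c( 0.02, -0.01,    NA))
stdMat <- matrix(0.015, nrow = 2, ncol = 3)
colnames(retMat) <- colnames(stdMat) <- c("AAA", "BBB", "CCC")
nStocksBuy <- 2

# Condition one: flag the nStocksBuy smallest returns in each row
conditionOne <- matrix(0, nrow(retMat), ncol(retMat), dimnames = dimnames(retMat))
for (i in seq_len(nrow(retMat))) {
  idx <- order(retMat[i, ], decreasing = FALSE)[1:nStocksBuy]  # order() puts NAs last
  conditionOne[i, idx] <- 1
}

# Condition two: the return is below the threshold, i.e. the ratio is below 1
ratio <- retMat / stdMat
ratio[is.na(ratio)] <- 2          # treat missing data as a failed test
conditionTwo <- (ratio < 1) * 1   # logical matrix converted to 0/1

print(conditionOne)
print(conditionTwo)
```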

**Section 4**: This multiplies conditionOne by conditionTwo to give conditionsMet. Since those matrices are binary, multiplying them together identifies the entries where both conditions passed (1*1=1, i.e. a pass), which means enter a trade.

The script assumes capital is split equally between all the stocks that are bought at the open; if fewer than 100 stocks meet the entry criteria, it is acceptable to buy fewer.
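The equal-weighting step can be sketched as follows, using invented 0/1 masks and returns for three days and three stocks. The daily return is the average open-to-close return of the stocks traded that day, and a day with no trades is set to zero.

```r
# Toy 0/1 condition matrices and open-to-close returns (3 days x 3 stocks, invented)
conditionOne <- rbind(c(1, 0, 1), c(1, 1, 0), c(0, 0, 1))
conditionTwo <- rbind(c(1, 1, 1), c(0, 1, 0), c(1, 1, 0))
dayretMat    <- rbind(c(0.010, -0.020, 0.030),
                      c(0.005,  0.004, 0.000),
                      c(0.010,  0.010, 0.010))

# 1 only where both conditions passed
conditionsMet <- conditionOne * conditionTwo

# Equal capital per traded stock: sum of traded returns / number of trades
nTrades  <- rowSums(conditionsMet)
dailyRet <- rowSums(dayretMat * conditionsMet) / nTrades
dailyRet[nTrades == 0] <- 0   # no trades that day, so nothing made or lost

print(dailyRet)
```

On the third day no stock passes both tests, so the division is 0/0 and the result is replaced with 0.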

**Section 5**: This section does simple performance analytics and plots the equity curve against the S&P 500 index.
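For orientation, a simple annualised Sharpe ratio with a 0% risk-free rate can be computed by hand as below. This is a sketch on simulated returns; the 252 trading days per year and the arithmetic (non-compounded) convention are assumptions, and PerformanceAnalytics may use a different convention, so the figure need not match its output exactly.

```r
# Simulated daily strategy returns (random numbers, not real results)
set.seed(1)
dailyRet <- rnorm(252, mean = 0.0005, sd = 0.005)

# Non-compounded equity curve, matching the cumsum() plot in the script
equity <- cumsum(dailyRet)

# Annualised Sharpe ratio with Rf = 0, assuming 252 trading days per year
periodsPerYear <- 252
sharpe <- mean(dailyRet) / sd(dailyRet) * sqrt(periodsPerYear)

print(sharpe)
```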

Onto the code (note the datafile is generated in Stock Data Download & Saving R):


```r
#install.packages("quantmod")
library("quantmod")
#install.packages("caTools") #for rolling standard deviation
library("caTools")
#install.packages("PerformanceAnalytics")
library("PerformanceAnalytics") #Load the PerformanceAnalytics library

datafilename = "stockdata.RData"
stdLookback <- 90 #How many periods to lookback for the standard deviation calculation
stdMultiple <- 1 #A Number to multiply the standard deviation by
nStocksBuy <- 100 #How many stocks to buy

load(datafilename)

#CONDITION 1
#Buy 100 stocks with lowest returns from their previous days lows
#To the current days open

#CONDITION 2
#Provided returns are lower than one standard deviation of the
#90 day moving standard deviation of close close returns

#Exit long positions at the end of the day

#SECTION 1
symbolsLst <- ls(stockData)

#Loop through all stocks in stockData and calculate required returns / stdev's
for (i in 1:length(symbolsLst)) {
  cat("Calculating the returns and standard deviations for stock: ",symbolsLst[i],"\n")
  sData <- eval(parse(text=paste("stockData$",symbolsLst[i],sep="")))
  #Rename the columns, there is a bug in quantmod if a stock is called Low then Lo() breaks!
  #Ie if a column is LOW.x then Lo() breaks
  oldColNames <- names(sData)
  colnames(sData) <- c("S.Open","S.High","S.Low","S.Close","S.Volume","S.Adjusted")

  #Calculate the return from low of yesterday to the open of today
  lowOpenRet <- (Op(sData)-lag(Lo(sData),1))/lag(Lo(sData),1)
  colnames(lowOpenRet) <- paste(symbolsLst[i],".LowOpenRet",sep="")

  #Calculate the n day standard deviation from the close of yesterday to close 2 days ago
  stdClClRet <- runsd((lag(Cl(sData),1)-lag(Cl(sData),2))/lag(Cl(sData),2),k=stdLookback,endrule="NA",align="right")
  #Note: runmean below is taken over the price ratio Cl(t-1)/Cl(t-2), which is close to 1, not over the return
  stdClClRet <- stdMultiple*stdClClRet + runmean(lag(Cl(sData),1)/lag(Cl(sData),2),k=stdLookback,endrule="NA",align="right")
  colnames(stdClClRet) <- paste(symbolsLst[i],".StdClClRet",sep="")

  #Not part of the strategy but want to calculate the Close/Open ret for current day
  #Will use this later to evaluate performance if a trade was taken
  dayClOpRet <- (Cl(sData)-Op(sData))/Op(sData)
  colnames(dayClOpRet) <- paste(symbolsLst[i],".DayClOpRet",sep="")

  colnames(sData) <- oldColNames
  eval(parse(text=paste("stockData$",symbolsLst[i]," <- cbind(sData,lowOpenRet,stdClClRet,dayClOpRet)",sep="")))
}

#SECTION 2
#Have calculated the relevant returns and standard deviations
#Now need to work out what 100 (nStocksBuy) stocks have the lowest returns
#Make a returns matrix
for (i in 1:length(symbolsLst)) {
  cat("Assigning stock: ",symbolsLst[i]," to the returns table\n")
  sDataRET <- eval(parse(text=paste("stockData$",symbolsLst[i],"[,\"",symbolsLst[i],".LowOpenRet\"]",sep="")))
  sDataSTD <- eval(parse(text=paste("stockData$",symbolsLst[i],"[,\"",symbolsLst[i],".StdClClRet\"]",sep="")))
  sDataDAYRET <- eval(parse(text=paste("stockData$",symbolsLst[i],"[,\"",symbolsLst[i],".DayClOpRet\"]",sep="")))
  if(i == 1){
    retMat <- sDataRET
    stdMat <- sDataSTD
    dayretMat <- sDataDAYRET
  } else {
    retMat <- cbind(retMat,sDataRET)
    stdMat <- cbind(stdMat,sDataSTD)
    dayretMat <- cbind(dayretMat,sDataDAYRET)
  }
}

#SECTION 3
#CONDITION 1 test output (0 = failed test, 1 = passed test)
#Now will loop over the returns matrix finding the nStocksBuy smallest returns
conditionOne <- retMat #copying the structure and data, only really want the structure
conditionOne[,] <- 0 #set all the values to 0
for (i in 1:length(retMat[,1])){
  orderindex <- order((retMat[i,]),decreasing=FALSE) #order row entries smallest to largest
  orderindex <- orderindex[1:nStocksBuy] #want the smallest n (nStocksBuy) stocks
  conditionOne[i,orderindex] <- 1 #1 Flag indicates entry is one of the n smallest
}

#CONDITION 2
#Check Low to Open return is less than 90day standard deviation
conditionTwo <- retMat #copying the structure and data, only really want the structure
conditionTwo[,] <- 0 #set all the values to 0
conditionTwo <- retMat/stdMat #If LowOpen ret is < StdRet the ratio will be < 1
conditionTwo[is.na(conditionTwo)] <- 2 #GIVE IT FAIL CONDITION JUST STRIPPING NAs here
conditionTwo <- apply(conditionTwo,1:2, function(x) {if(x<1) { return (1) } else { return (0) }})

#SECTION 4
#CHECK FOR TRADE output (1 = passed conditions for trade, 0 = failed test)
#Can just multiply the two conditions together since they're boolean
conditionsMet <- conditionOne * conditionTwo
colnames(conditionsMet) <- gsub(".LowOpenRet","",names(conditionsMet))

#Lets calculate the results
tradeMat <- dayretMat
colnames(tradeMat) <- gsub(".DayClOpRet","",names(tradeMat))
tradeMat <- tradeMat * conditionsMet
tradeMat[is.na(tradeMat)] <- 0
tradeVec <- as.data.frame(apply(tradeMat, 1,sum) / apply(conditionsMet, 1,sum)) #Calculate the mean for each row
colnames(tradeVec) <- "DailyReturns"
tradeVec[is.nan(tradeVec[,1]),1] <- 0 #Didnt make or lose anything on this day

plot(cumsum(tradeVec[,1]),xlab="Date", ylab="EPCHAN Buy on Gap",xaxt = "n")

#SECTION 5
#### Performance Analysis ###

#Get the S&P 500 index data
indexData <- new.env()
startDate = as.Date("2005-01-13") #Specify what date to get the prices from
getSymbols("^GSPC", env = indexData, src = "yahoo", from = startDate)

#Calculate returns for the index
indexRet <- (Cl(indexData$GSPC)-lag(Cl(indexData$GSPC),1))/lag(Cl(indexData$GSPC),1)
colnames(indexRet) <- "IndexRet"
zooTradeVec <- cbind(as.zoo(tradeVec),as.zoo(indexRet)) #Convert to zoo object
colnames(zooTradeVec) <- c("BuyOnGap","S&P500")

#Lets see how all the strategies fared against the index
dev.new()
charts.PerformanceSummary(zooTradeVec,main="Performance of EPCHAN Buy on Gap",geometric=FALSE)

#Lets calculate a table of monthly returns by year and strategy
cat("Calendar Returns - Note 13.5 means a return of 13.5%\n")
table.CalendarReturns(zooTradeVec)

dev.new()
#Lets make a boxplot of the returns
chart.Boxplot(zooTradeVec)

dev.new()
#Set the plotting area to a 2 by 2 grid
layout(rbind(c(1,2),c(3,4)))

#Plot various histograms with different overlays added
chart.Histogram(zooTradeVec, main = "Plain", methods = NULL)
chart.Histogram(zooTradeVec, main = "Density", breaks=40, methods = c("add.density", "add.normal"))
chart.Histogram(zooTradeVec, main = "Skew and Kurt", methods = c("add.centered", "add.rug"))
chart.Histogram(zooTradeVec, main = "Risk Measures", methods = c("add.risk"))
```

Possible Future Modifications

- Add shorting the strongest stocks so that the strategy is market neutral
- Vary how many stocks to hold
- Vary the input variables (discussed above)
- Try a different asset class, does this work for forex?
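As a rough sketch of the first modification, the snippet below goes long the weakest gap and short the strongest gap on a single day. The tickers and returns are invented, `nEachSide` is a hypothetical parameter, and splitting capital half long and half short is an assumption; none of this has been back tested.

```r
# Invented gap returns and open-to-close returns for four stocks on one day
lowOpenRet <- c(AAA = -0.030, BBB = -0.005, CCC =  0.010, DDD =  0.025)
dayClOpRet <- c(AAA =  0.012, BBB =  0.001, CCC = -0.002, DDD = -0.008)
nEachSide  <- 1   # hypothetical: number of stocks on each side

# Rank stocks by gap return: weakest first, strongest last
ord    <- order(lowOpenRet)
longs  <- names(lowOpenRet)[head(ord, nEachSide)]
shorts <- names(lowOpenRet)[tail(ord, nEachSide)]

# Assumed: half the capital long, half short; a short earns minus the stock's return
dayRet <- (mean(dayClOpRet[longs]) - mean(dayClOpRet[shorts])) / 2

print(dayRet)
```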

In E.P. Chan's blog he talks about this strategy collapsing; the above code must be slightly different from his implementation, since the performance still looks OK post 2008.

Another plausible explanation might be survivorship bias: the list of S&P constituents is from 2011, whereas E.P. Chan went live in 2007, when the constituents were different. For example, we know that Lehman Brothers folded during this period, but it isn’t included in the back test.

Great analysis! This data has survivorship bias, but only back to 2005, I wonder how much that would really change the results…

Hi GekkoQuant,

It’s really weird that your results are different from Chan’s. I commented out the line where you add the average to the standard deviation and the results don’t change much.

Then, I applied the same strategy to Bovespa (^BVSP) stocks since I live in Brazil and work with that market. It should yield similar results in comparison with S&P, since “this strategy exploits a particular inefficiency in the opening auction price of equities” (Chan’s words).

We don’t have as many stocks that are conveniently liquid to “safely” trade, so I tested a maximum of 10 and 20 stocks being held during the day. For the period of Jan 2007 up to today, I got a cumulative return of 7.7 and 4.6, respectively.