This weekend I was spammed for a “binary option trading system with 90% accuracy”. The advert caught my curiosity in essence it detailed a method that was a variation of the well known roulette playing strategy that mathematically guarantees a profit (assuming infinite money, and no table limit).

Roulette Strategy

If you double your bet size after a loss and repeat the same bet you are guaranteed a profit, your next winning will cover all the preceding loses.

e.g Bet $1 on red, lose, Bet $2 on red, Win get $4 back ($2 is your stake, and $1 covers the loss from your first bet) giving $1 profit.

Exponential growth of lot size, no thanks 🙂

Binary options are analogous to betting on red, they offer virtually fixed odds for up or down directional bets. Naturally I want to know whats is the maximum number of consecutive up or down days in the market, how much pain would I have to suffer with this strategy.

**Occurrences of n Consecutive Up or Down Days**

Analysing the last 12 years of returns data for the S&P 500, the maximum consecutive number of up days is 9 (occurred in 2004-2005), the maximum consecutive number of down days is -8 which, you guessed it, occurred in 2008-2009.

So 9 days of pain should we always short the direction of the market.

Instead of enduring the 9 days of draw down, it is interesting to see what the consecutive number of up/down days says about the probability the next day is an up day. The maximum likelihood probability of an up day is count(up days)/count(up and down days). Naturally we will condition this data on the consecutive number of up/down days.

**Consecutive Up or Down Days vs Maximum Likelihood Probability the next day is up**

This data is fairly nice looking, for example in 2012-2013 there is a clear relationship. The more down days in a row the higher the likelihood of an up day. 6 down days implied the probability of an up day is 80%! I must raise a note of caution here, 6 down days in a row was seen less than 5 times in the year. Hence the probability estimate is based on 5 points and not statistically significant, perhaps looking at 5 years of returns might be better.

I appreciate that most people don’t trade binary options can we trade the index/stock out right, it is interesting to see what the consecutive number of up/down days says about future Open to Close Returns. The image below regresses Open to Close Returns (time t) with Consecutive Up/Down days (time t-1).

**Consecutive Up or Down Days vs Next Day Open to Close Returns**

**Prediction Accuracy**

The plot below shows the accuracy of using this maximum likelihood estimate approach. The model takes the last 250 days of returns and calculates the probability of an up move given that the current day has seen n consecutive days of trading. If the prob of an up move or is over a certain threshold go long, if its below a certain threshold go short.

**Heatmap of Accuracy vs Model Parameters**

The histogram on the heatmap shows that approximately half of the parameter combinations can predict the direction with accuracy greater than 50%.

**Final Comments**

The beauty of this approach is that it’s simple, it can be applied to any asset class but most importantly it can be applied across different time frames.

Onto the code:

^{?}View Code RSPLUS

library("quantmod") library("reshape") library("gplots") #Control Parameters dataStartDate = as.Date("2000-01-01") symbol<- "^GSPC" longThreshold <- 0.70 #If the probability of an upday is greater than this limit go long shortThreshold <- 1-longThreshold #If the probability of an upday is lower than this limit go short nLookback<-250 #Days to look back when generating the rolling probability distribution (approx 250 trading days in a year) #Function to turn a boolean vector into a vector containing the consecutive num of trues or falses seen #Will be used to calculate the consecutive number of up and down days consecutiveTruesExtractor <- function(data){ genNumOfConsecutiveTrues <- function(x, y) { (x+y)*y } #Y is either 0 or 1 upDaysCount <- Reduce(genNumOfConsecutiveTrues,data,accumulate=TRUE) upDaysCount <- as.vector(Lag(upDaysCount)) upDaysCount[is.na(upDaysCount)] <- 0 downDaysCount <- Reduce(genNumOfConsecutiveTrues,!data,accumulate=TRUE) downDaysCount <- as.vector(Lag(downDaysCount)) downDaysCount[is.na(downDaysCount)] <- 0 consecutiveTruesExtractor <- upDaysCount-downDaysCount } #Function to plot data and add regression line doPlot <- function(x,y,title,xlabel,ylabel,ylimit){ x<-as.vector(x) y<-as.vector(y) boxplot(y~x,main=title, xlab=xlabel,ylab=ylabel,ylim=ylimit) abline(lm(y~x),col="red",lwd=1.5) } #Function to calculate the percentage of updays from returns set calcPercentageOfUpdaysFromReturnsSet <- function(data){ calcPercentageOfUpdaysFromReturnsSet <- sum(data>0)/length(data) } #Function takes a set of returns and consecutive up down day data and aggregates it into a probability distribution #Generated a matrix of consecutive Direction vs prob of up move generateProbOfUpDayDistribution <- function(dataBlock){ y <- as.matrix(by(dataBlock[,"OpClRet"],list(ConsecutiveDir = dataBlock[,"ConsecutiveDir"]),calcPercentageOfUpdaysFromReturnsSet)) #Prob of upmove x <- as.matrix(as.numeric(as.matrix(rownames(y)))) res <- cbind(x,y) colnames(res) <- c("ConsecutiveDir","Prob") generateProbOfUpDayDistribution <- res } #Given current consecutive up down day data, what is the probability the current day is an up day #For use with the rollapply function (since needs to use the past n days worth of data for generating probability distribution) probOfUpDayForUseWithRollApply <- function(dataBlock){ dist <- generateProbOfUpDayDistribution(head(dataBlock,-1)) #Use head to drop the last row, prevents a lookforward issue currentConsecutiveRun <- last(dataBlock[,"ConsecutiveDir"]) probOfUpDay <- dist[dist[,"ConsecutiveDir"] == rep(coredata(currentConsecutiveRun), length(dist[,"ConsecutiveDir"])),"Prob"] if(!is.numeric(probOfUpDay)) {probOfUpDay <- 0.5 } #Never this many consecutive days before, dont know what will happen make up and down events equally likely #print(paste("Current Run:",coredata(currentConsecutiveRun),"Prob of up day:",probOfUpDay )) probOfUpDayForUseWithRollApply <- probOfUpDay } #Just a quick test to check that the consecutiveTruesExtractor is working as expected #Define the input data, and define the expected output #Check that the output of the function equals the expected output data <- c(0, 0, 0, 1,0, 1,1,0,0) #0 is down day, 1 is up day expectedOutput <- c(0,-1,-2,-3,1,-1,1,2,-1) res <- consecutiveTruesExtractor(data) if( identical(res,expectedOutput)){ print("Match consecutiveTruesExtractor is correct") } else { print("Error consecutiveTruesExtractor contains bugs") } #Download the data symbolData <- new.env() #Make a new environment for quantmod to store data in getSymbols(symbol, env = symbolData, src = "yahoo", from = dataStartDate) mktdata <- eval(parse(text=paste("symbolData$",sub("^","",symbol,fixed=TRUE)))) opClRet <- (Cl(mktdata)/Op(mktdata))-1 consecutiveDir <- consecutiveTruesExtractor(as.matrix(opClRet>0)) completeData<- cbind(opClRet,consecutiveDir) colnames(completeData) <- c("OpClRet","ConsecutiveDir") #Plot of consecutive up down days vs next day Open Close Returns dev.new() par(oma=c(0,0,2,0)) par(mfrow=c(3,3)) for(i in seq(2012,2004,-1)){ windowedData <- window(completeData,start=as.Date(paste(i,"-01-01",sep="")),end=as.Date(paste(i+1,"-01-01",sep=""))) doPlot(windowedData$ConsecutiveDir,windowedData$OpClRet,paste("Consecutive Up / Down days vs Open close Return (",i,",",i+1,")"),"Consecutive Up or Down days","Open Close Returns",c(-0.07,0.07)) } title(main=paste("Symbol",symbol),outer=T) #Plot of consecutive up down days vs the maximum likelihood probability that the next day is an up day dev.new() par(oma=c(0,0,2,0)) par(mfrow=c(3,3)) for(i in seq(2012,2004,-1)){ windowedData <- window(completeData,start=as.Date(paste(i,"-01-01",sep="")),end=as.Date(paste(i+1,"-01-01",sep=""))) y <-as.matrix(by(as.vector(windowedData$OpClRet),list(ConsecutiveDir = windowedData$ConsecutiveDir),calcPercentageOfUpdaysFromReturnsSet)) #Prob of upmove x <- as.matrix(as.numeric(as.matrix(rownames(y)))) #Consecutive up or down days plot(cbind(x,y),main=paste("Consecutive Up / Down days vs Next day Dir (",i,",",i+1,")"), xlab="Consecutive Up or Down days",ylab="Conditional Probability of an Up day") abline(lm(y~x),col="red",lwd=1.5) } title(main=paste("Symbol",symbol),outer=T) #Plot of consecutive up down days vs the number of occurences seen dev.new() par(oma=c(0,0,2,0)) par(mfrow=c(3,3)) for(i in seq(2012,2004,-1)){ windowedData <- window(completeData,start=as.Date(paste(i,"-01-01",sep="")),end=as.Date(paste(i+1,"-01-01",sep=""))) y <-abs(as.matrix(by(windowedData$ConsecutiveDir,list(ConsecutiveDir = windowedData$ConsecutiveDir),sum))) #Count the number of occurences of each consecutive run x <- as.matrix(as.numeric(as.matrix(rownames(y)))) plot(y,xaxt="n",main=paste("Consecutive Up / Down days vs Next day Dir (",i,",",i+1,")"), xlab="Consecutive Up or Down days",ylab="Occurences of consecutive run",type="l") axis(1, at=1:length(x),labels=x) } title(main=paste("Symbol",symbol),outer=T) predictionPerformance <- function(completeData,longThreshold,shortThreshold,nLookback,displayPlot){ #Calcuate the probabiliy of an up day using a nLookback day window rollingProbOfAnUpday <- rollapply(completeData,FUN=probOfUpDayForUseWithRollApply,align="right",fill=NA,width=nLookback,by.column=FALSE) rollingProbOfAnUpday <- as.matrix(rollingProbOfAnUpday) print(head(rollingProbOfAnUpday)) colnames(rollingProbOfAnUpday) <- "ProbTodayIsAnUpDay" completeData <- cbind(completeData,rollingProbOfAnUpday) suggestedTradeDir <- rollingProbOfAnUpday #Just to copy the structure colnames(suggestedTradeDir) <- "SuggestedTradeDir" suggestedTradeDir[rollingProbOfAnUpday>longThreshold] <- 1 #Long Trade suggestedTradeDir[rollingProbOfAnUpday<shortThreshold] <- -1 #Short Trade suggestedTradeDir[rollingProbOfAnUpday<longThreshold & rollingProbOfAnUpday>shortThreshold] <- 0 #Do nothing completeData <- cbind(completeData,suggestedTradeDir) isPredictionCorrect <- suggestedTradeDir #Just to copy structure isPredictionCorrect <- sign(completeData$SuggestedTradeDir * completeData$OpClRet) #sign(0) is 0 so will capture no trades as well isPredictionCorrect[is.na(isPredictionCorrect)] <- 0 isPredictionCorrect[is.nan(isPredictionCorrect)] <- 0 if(displayPlot){ dev.new() plot(cumsum(isPredictionCorrect), main=paste("Market Direction Prediction Performance for",symbol,"(Probability Threshold, Long=",longThreshold,"Short=",shortThreshold,"Lookback=",nLookback,")"),xlab="Date",ylab="Cumulative Sum of Correct (+1) and Wrong(-1) Predictions") msgIncorrectPred <- (paste("Incorrect Predictions (out of the days when a prediction was made)",100*abs(sum(isPredictionCorrect[isPredictionCorrect==-1]))/length(isPredictionCorrect[isPredictionCorrect!=0]),"%")) msgCorrectPred <- (paste("Correct Predictions (out of the days when a prediction was made)",100*sum(isPredictionCorrect[isPredictionCorrect==1])/length(isPredictionCorrect[isPredictionCorrect!=0]),"%")) msgPercOfDaysWithPred <- (paste("Percent of days when a prediction was made",100*sum(abs(isPredictionCorrect[isPredictionCorrect!=0]))/length(isPredictionCorrect),"%")) legend("topleft",c(msgIncorrectPred,msgCorrectPred,msgPercOfDaysWithPred),bg="lightblue") } predictionPerformance <- sum(isPredictionCorrect[isPredictionCorrect==1])/length(isPredictionCorrect[isPredictionCorrect!=0]) } predictionPerformance(completeData,longThreshold,shortThreshold,nLookback,TRUE) #Parameter search resultsMat <- matrix(numeric(0), 0,3) colnames(resultsMat) <- c("Lookback","LongProbThreshold","Accuracy") for(nLookback in seq(30,240,30)){ for(longThreshold in seq(0.55,1,0.05)){ shortThreshold <- 1-longThreshold accuracy <- predictionPerformance(completeData,longThreshold,shortThreshold,nLookback*2,FALSE) resultsMat <- rbind(resultsMat,as.matrix(cbind(nLookback,longThreshold,accuracy))) print(resultsMat) } } print(resultsMat) tt <-(as.matrix(as.data.frame(cast(as.data.frame(resultsMat),Lookback~LongProbThreshold)))) rownames(tt)<-tt[,"Lookback"] heatmap.2(tt[,-1],key=TRUE,Rowv=FALSE,Colv=FALSE,xlab="Prob of Up Day Threshold",ylab="Lookback",trace="none",main=paste("Prediction Accuracy (correct predictions as % of all predictions) for ",symbol)) |

What exactly does the Lookback LongProbThreshold Accuracy mean? Its a matrix that takes a long time to print at the end of the program. I am trying to really understand stat arb and quant trading techniques and would like to start with this one.

Thanks

It is interesting how we could come up with the same idea, I like it.

I have been providing the same analysis daily for each of the s&p500 stocks with intuitive graphs. Check it out. http://www.fotetah.com

You have brought up a very excellent details , regards for the post. deebfgdekefa

Interesting article. On your market direction prediction performance chart using MLE for the probabilities what are your confidence intervals? So what are the 43% and 56% incorrect and correct, respectively, bound by?

I wonder what the odds for consecutive up days are for a single stock? A stock I just bought is up 13 straight trading days since my purchase. This says very little about how it will work out in the long run, but it seems *very* unusual to me.

So I bet my friend 50 to 1 that he couldn’t pick a stock that would trade up. I would guess that he has less than a 1% chance of winning this bet, but I wonder what the actual historical odds for this kind of trend are?

To be clear, the bet was he would get $50 for every $1 he put up if his pick traded up for 13 days straight.