Analysis of returns after n consecutive up/down days – Predicting the Sign of Open to Close Returns

This weekend I was spammed with an advert for a “binary option trading system with 90% accuracy”. The advert caught my curiosity: in essence it detailed a method that was a variation of the well-known roulette playing strategy that mathematically guarantees a profit (assuming infinite money and no table limit).

Roulette Strategy

If you double your bet size after a loss and repeat the same bet, you are guaranteed a profit: your next win will cover all the preceding losses.

e.g. bet $1 on red and lose, then bet $2 on red and win, getting $4 back ($2 is your stake, and $1 covers the loss from your first bet), leaving $1 profit.

Exponential growth of lot size, no thanks 🙂
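To see just how quickly the doubling gets out of hand, here is a minimal sketch in plain R (hypothetical $1 starting stake; the 9-loss run matches the worst S&P 500 streak discussed below):

```r
#Martingale doubling: the stake doubles after every loss
startingStake <- 1
lossesInARow <- 9 #The worst consecutive run seen in this post
stakes <- startingStake * 2^(0:lossesInARow) #Stake for bets 1,2,...,10
totalRisked <- sum(stakes[1:lossesInARow])   #Money lost on the first 9 bets
nextStake <- stakes[lossesInARow + 1]        #Size of the 10th bet
profitIfWin <- nextStake - totalRisked       #Always equal to the starting stake
print(c(TotalRisked=totalRisked, NextStake=nextStake, ProfitIfWin=profitIfWin))
```

After 9 consecutive losses you are $511 down and must put $512 at risk just to win back a single dollar.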

Binary options are analogous to betting on red: they offer virtually fixed odds for up or down directional bets. Naturally I want to know what the maximum number of consecutive up or down days in the market is, i.e. how much pain I would have to suffer with this strategy.

Occurrences of n Consecutive Up or Down Days

Analysing the last 12 years of returns data for the S&P 500, the maximum number of consecutive up days is 9 (it occurred in 2004-2005), and the maximum number of consecutive down days is 8 which, you guessed it, occurred in 2008-2009.

So up to 9 days of pain if we always bet against the direction of the market.

Instead of enduring 9 days of drawdown, it is interesting to see what the consecutive number of up/down days says about the probability that the next day is an up day. The maximum likelihood estimate of the probability of an up day is count(up days)/count(up and down days). Naturally we condition this estimate on the consecutive number of up/down days.
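On a toy vector of returns (made-up numbers, purely illustrative), the conditional maximum likelihood estimate is just a grouped mean of the up-day indicator:

```r
#Toy illustration: P(up day | consecutive run observed at yesterday's close)
ret  <- c(0.01, -0.02, 0.005, 0.01, -0.01, 0.02, -0.005, 0.01) #Open to close returns
runs <- c(1, 2, -1, 1, 2, -1, 1, -1) #Consecutive up(+)/down(-) days going into each return
probUpGivenRun <- tapply(ret > 0, runs, mean) #count(up days)/count(all days) per run length
print(probUpGivenRun)
```

The real script below does exactly this via `by(...)`, conditioned on the ConsecutiveDir column.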

Consecutive Up or Down Days vs Maximum Likelihood Probability the next day is up

This data is fairly nice looking; for example in 2012-2013 there is a clear relationship: the more down days in a row, the higher the likelihood of an up day. 6 down days implied the probability of an up day is 80%! I must raise a note of caution here: 6 down days in a row was seen fewer than 5 times in the year, so the probability estimate is based on a handful of points and is not statistically significant. Perhaps looking at 5 years of returns would be better.
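To put a number on that caution, base R's binom.test reports an exact Clopper-Pearson confidence interval; with, say, 4 up days observed out of 5 occurrences (illustrative figures), the interval is enormous:

```r
#Exact (Clopper-Pearson) 95% confidence interval for the probability of an up day
#when the estimate is built from only 5 observations of the run
ci <- binom.test(x=4, n=5)$conf.int
print(ci) #Roughly (0.28, 0.99): far too wide to trade on
```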

I appreciate that most people don’t trade binary options, but we can trade the index/stock outright, so it is interesting to see what the consecutive number of up/down days says about future Open to Close returns. The image below regresses Open to Close returns (time t) against consecutive up/down days (time t-1).

Consecutive Up or Down Days vs Next Day Open to Close Returns

A very disappointing chart; it doesn’t show much of a relationship between returns and consecutive up/down days. For some of the data points an up move is more probable than a down move, but the magnitude of the up moves is significantly smaller than that of the down moves. These charts vary greatly by asset class and by security; single stocks have much more favourable plots.

Prediction Accuracy

The plot below shows the accuracy of this maximum likelihood estimate approach. The model takes the last 250 days of returns and calculates the probability of an up move given the number of consecutive up or down days the market has just seen. If the probability of an up move is over a certain threshold go long; if it is below a certain threshold go short.
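The trade rule itself is just a threshold map on the estimated probability; a minimal standalone sketch (same default thresholds as in the script below):

```r
#Map P(up day) to a trade direction: 1 = long, -1 = short, 0 = no trade
probToTradeDir <- function(probUp, longThreshold=0.70, shortThreshold=1-longThreshold){
  ifelse(probUp > longThreshold, 1, ifelse(probUp < shortThreshold, -1, 0))
}
print(probToTradeDir(c(0.80, 0.50, 0.10))) #Long, no trade, short
```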

Heatmap of Accuracy vs Model Parameters

 

The histogram on the heatmap shows that approximately half of the parameter combinations can predict the direction with accuracy greater than 50%.

Final Comments

The beauty of this approach is its simplicity: it can be applied to any asset class, but most importantly it can be applied across different time frames.

Onto the code:

library("quantmod")
library("reshape")
library("gplots")
 
#Control Parameters
dataStartDate = as.Date("2000-01-01")
symbol<- "^GSPC"
longThreshold <- 0.70  #If the probability of an upday is greater than this limit go long
shortThreshold <- 1-longThreshold  #If the probability of an upday is lower than this limit go short
nLookback<-250 #Days to look back when generating the rolling probability distribution (approx 250 trading days in a year)
 
#Function to turn a boolean vector into a vector containing the consecutive num of trues or falses seen
#Will be used to calculate the consecutive number of up and down days
consecutiveTruesExtractor <- function(data){
        genNumOfConsecutiveTrues <- function(x, y) { (x+y)*y  } #Y is either 0 or 1
        upDaysCount <- Reduce(genNumOfConsecutiveTrues,data,accumulate=TRUE)
        upDaysCount <- as.vector(Lag(upDaysCount))
        upDaysCount[is.na(upDaysCount)] <- 0
 
 
        downDaysCount <- Reduce(genNumOfConsecutiveTrues,!data,accumulate=TRUE)
        downDaysCount <- as.vector(Lag(downDaysCount))
        downDaysCount[is.na(downDaysCount)] <- 0
        consecutiveTruesExtractor <- upDaysCount-downDaysCount
}
 
#Function to plot data and add regression line
doPlot <- function(x,y,title,xlabel,ylabel,ylimit){
  x<-as.vector(x)
  y<-as.vector(y)
  boxplot(y~x,main=title, xlab=xlabel,ylab=ylabel,ylim=ylimit)
  abline(lm(y~x),col="red",lwd=1.5)
}
 
#Function to calculate the percentage of updays from returns set
calcPercentageOfUpdaysFromReturnsSet <- function(data){
  calcPercentageOfUpdaysFromReturnsSet <- sum(data>0)/length(data)
}
 
#Function takes a set of returns and consecutive up down day data and aggregates it into a probability distribution
#Generates a matrix of consecutive direction vs prob of up move
generateProbOfUpDayDistribution <- function(dataBlock){
 
 y <- as.matrix(by(dataBlock[,"OpClRet"],list(ConsecutiveDir = dataBlock[,"ConsecutiveDir"]),calcPercentageOfUpdaysFromReturnsSet)) #Prob of upmove
 x <- as.matrix(as.numeric(as.matrix(rownames(y))))
 
 res <- cbind(x,y)
 colnames(res) <- c("ConsecutiveDir","Prob")
 generateProbOfUpDayDistribution <- res
}
 
#Given current consecutive up down day data, what is the probability the current day is an up day
#For use with the rollapply function (since it needs the past n days' worth of data to generate the probability distribution)
probOfUpDayForUseWithRollApply <- function(dataBlock){
  dist <- generateProbOfUpDayDistribution(head(dataBlock,-1)) #Use head to drop the last row, prevents a lookforward issue
 
  currentConsecutiveRun <- last(dataBlock[,"ConsecutiveDir"])
  probOfUpDay <- dist[dist[,"ConsecutiveDir"] == rep(coredata(currentConsecutiveRun), length(dist[,"ConsecutiveDir"])),"Prob"]
  if(length(probOfUpDay) == 0) {probOfUpDay <- 0.5 } #Never seen this many consecutive days before; don't know what will happen, so make up and down moves equally likely (note: !is.numeric(numeric(0)) is FALSE, so a length check is needed here)
  #print(paste("Current Run:",coredata(currentConsecutiveRun),"Prob of up day:",probOfUpDay ))
  probOfUpDayForUseWithRollApply <- probOfUpDay
}
 
#Just a quick test to check that the consecutiveTruesExtractor is working as expected
#Define the input data, and define the expected output
#Check that the output of the function equals the expected output
data <-           c(0, 0, 0, 1,0, 1,1,0,0)  #0 is down day, 1 is up day
expectedOutput <- c(0,-1,-2,-3,1,-1,1,2,-1)
res <- consecutiveTruesExtractor(data)
if( identical(res,expectedOutput)){
  print("Match consecutiveTruesExtractor is correct")
} else {
  print("Error consecutiveTruesExtractor contains bugs")
}
 
 
 
 
#Download the data
symbolData <- new.env() #Make a new environment for quantmod to store data in
getSymbols(symbol, env = symbolData, src = "yahoo", from = dataStartDate)
mktdata <- get(sub("^","",symbol,fixed=TRUE), envir=symbolData) #Fetch the downloaded series from the quantmod environment
opClRet <- (Cl(mktdata)/Op(mktdata))-1
consecutiveDir <- consecutiveTruesExtractor(as.matrix(opClRet>0))
completeData<- cbind(opClRet,consecutiveDir)
colnames(completeData) <- c("OpClRet","ConsecutiveDir")
 
 
#Plot of consecutive up down days vs next day Open Close Returns
dev.new()
par(oma=c(0,0,2,0))
par(mfrow=c(3,3))
 
for(i in seq(2012,2004,-1)){
  windowedData <- window(completeData,start=as.Date(paste(i,"-01-01",sep="")),end=as.Date(paste(i+1,"-01-01",sep="")))
  doPlot(windowedData$ConsecutiveDir,windowedData$OpClRet,paste("Consecutive Up / Down days vs Open close Return (",i,",",i+1,")"),"Consecutive Up or Down days","Open Close Returns",c(-0.07,0.07))
}
title(main=paste("Symbol",symbol),outer=T)
 
#Plot of consecutive up down days vs the maximum likelihood probability that the next day is an up day
dev.new()
par(oma=c(0,0,2,0))
par(mfrow=c(3,3))
 
for(i in seq(2012,2004,-1)){
  windowedData <- window(completeData,start=as.Date(paste(i,"-01-01",sep="")),end=as.Date(paste(i+1,"-01-01",sep="")))
  y <-as.matrix(by(as.vector(windowedData$OpClRet),list(ConsecutiveDir = windowedData$ConsecutiveDir),calcPercentageOfUpdaysFromReturnsSet)) #Prob of upmove
  x <- as.matrix(as.numeric(as.matrix(rownames(y)))) #Consecutive up or down days
  plot(cbind(x,y),main=paste("Consecutive Up / Down days vs Next day Dir (",i,",",i+1,")"), xlab="Consecutive Up or Down days",ylab="Conditional Probability of an Up day")
  abline(lm(y~x),col="red",lwd=1.5)
}
title(main=paste("Symbol",symbol),outer=T)
 
#Plot of consecutive up down days vs the number of occurrences seen
dev.new()
par(oma=c(0,0,2,0))
par(mfrow=c(3,3))
 
for(i in seq(2012,2004,-1)){
  windowedData <- window(completeData,start=as.Date(paste(i,"-01-01",sep="")),end=as.Date(paste(i+1,"-01-01",sep="")))
  y <- as.matrix(by(windowedData$ConsecutiveDir,list(ConsecutiveDir = windowedData$ConsecutiveDir),length)) #Count the number of occurrences of each consecutive run (length, not abs(sum), so the zero-run bucket is counted correctly)
  x <- as.matrix(as.numeric(as.matrix(rownames(y))))
  plot(y,xaxt="n",main=paste("Consecutive Up / Down days vs Number of Occurrences (",i,",",i+1,")"), xlab="Consecutive Up or Down days",ylab="Occurrences of consecutive run",type="l")
  axis(1, at=1:length(x),labels=x)
}
title(main=paste("Symbol",symbol),outer=T)
 
predictionPerformance <- function(completeData,longThreshold,shortThreshold,nLookback,displayPlot){
    #Calculate the probability of an up day using an nLookback-day window
    rollingProbOfAnUpday <- rollapply(completeData,FUN=probOfUpDayForUseWithRollApply,align="right",fill=NA,width=nLookback,by.column=FALSE)
    rollingProbOfAnUpday <- as.matrix(rollingProbOfAnUpday)
    print(head(rollingProbOfAnUpday))
    colnames(rollingProbOfAnUpday) <- "ProbTodayIsAnUpDay"
    completeData <- cbind(completeData,rollingProbOfAnUpday)
 
    suggestedTradeDir <- rollingProbOfAnUpday #Just to copy the structure
    colnames(suggestedTradeDir) <- "SuggestedTradeDir"
    suggestedTradeDir[rollingProbOfAnUpday>longThreshold] <- 1  #Long Trade
    suggestedTradeDir[rollingProbOfAnUpday<shortThreshold] <- -1 #Short Trade
    suggestedTradeDir[rollingProbOfAnUpday<=longThreshold & rollingProbOfAnUpday>=shortThreshold] <- 0 #Do nothing (inclusive bounds so values exactly on a threshold don't slip through)
    completeData <- cbind(completeData,suggestedTradeDir)
 
    isPredictionCorrect <- suggestedTradeDir #Just to copy structure
    isPredictionCorrect <- sign(completeData$SuggestedTradeDir * completeData$OpClRet) #sign(0) is 0 so will capture no trades as well
    isPredictionCorrect[is.na(isPredictionCorrect)] <- 0
    isPredictionCorrect[is.nan(isPredictionCorrect)] <- 0
    if(displayPlot){
      dev.new()
      plot(cumsum(isPredictionCorrect), main=paste("Market Direction Prediction Performance for",symbol,"(Probability Threshold, Long=",longThreshold,"Short=",shortThreshold,"Lookback=",nLookback,")"),xlab="Date",ylab="Cumulative Sum of Correct (+1) and Wrong(-1) Predictions")
      msgIncorrectPred <- (paste("Incorrect Predictions (out of the days when a prediction was made)",100*abs(sum(isPredictionCorrect[isPredictionCorrect==-1]))/length(isPredictionCorrect[isPredictionCorrect!=0]),"%"))
      msgCorrectPred <- (paste("Correct Predictions (out of the days when a prediction was made)",100*sum(isPredictionCorrect[isPredictionCorrect==1])/length(isPredictionCorrect[isPredictionCorrect!=0]),"%"))
      msgPercOfDaysWithPred <- (paste("Percent of days when a prediction was made",100*sum(abs(isPredictionCorrect[isPredictionCorrect!=0]))/length(isPredictionCorrect),"%"))
      legend("topleft",c(msgIncorrectPred,msgCorrectPred,msgPercOfDaysWithPred),bg="lightblue")
    }
    predictionPerformance <- sum(isPredictionCorrect[isPredictionCorrect==1])/length(isPredictionCorrect[isPredictionCorrect!=0])
}
 
 
predictionPerformance(completeData,longThreshold,shortThreshold,nLookback,TRUE)
 
#Parameter search
resultsMat <- matrix(numeric(0), 0,3)
colnames(resultsMat) <- c("Lookback","LongProbThreshold","Accuracy")
for(nLookback in seq(30,240,30)){
  for(longThreshold in seq(0.55,1,0.05)){
    shortThreshold <- 1-longThreshold
    accuracy <- predictionPerformance(completeData,longThreshold,shortThreshold,nLookback,FALSE)
    resultsMat <- rbind(resultsMat,as.matrix(cbind(nLookback,longThreshold,accuracy)))
    #print(resultsMat) #Uncomment to monitor the search progress
  }
}
print(resultsMat)
tt <-(as.matrix(as.data.frame(cast(as.data.frame(resultsMat),Lookback~LongProbThreshold))))
rownames(tt)<-tt[,"Lookback"]
heatmap.2(tt[,-1],key=TRUE,Rowv=FALSE,Colv=FALSE,xlab="Prob of Up Day Threshold",ylab="Lookback",trace="none",main=paste("Prediction Accuracy (correct predictions as % of all predictions) for ",symbol))

 

Skewness Revisited

In my last post http://gekkoquant.com/2013/03/03/trend-following-skewness-signal/ I suggested that measuring the skewness of asset returns can possibly be used to identify trends. Or, put more precisely, it can identify when the return distribution is symmetric and hence NOT in a trending environment (assuming returns are Gaussian with 0 mean).

This post presents a simple regression of skewness vs asset returns. The skewness at time t is calculated using OpenCloseReturns[t-1, t-2, ….., t-lookback] and is then regressed against OpenCloseReturn[t], where t denotes today and t-1 denotes yesterday.

The data set is 2000 to present.

The red line in the above plot is a linear regression. It’s clear that the skewness doesn’t explain the Open Close returns. I must eat humble pie: the returns seen in the previous post are down to a lucky parameter selection. Thanks must go to Pietro for testing over a longer period.

Onto the code:

library("quantmod")
library("PerformanceAnalytics")
library(e1071)     #For the skewness command
 
 
#Script parameters
symbol <- "^GSPC"     #Symbol
 
#Specify dates for downloading data
startDate = as.Date("2000-01-01") #Specify what date to get the prices from
symbolData <- new.env() #Make a new environment for quantmod to store data in
getSymbols(symbol, env = symbolData, src = "yahoo", from = startDate)
mktdata <- get(sub("^","",symbol,fixed=TRUE), envir=symbolData) #Fetch the downloaded series from the quantmod environment
mktdata <- head(mktdata,-1) #Hack to fix some stupid duplicate date problem with yahoo
OpClRet  <- (Cl(mktdata)/Op(mktdata))    - 1
colnames(OpClRet) <- "OpClRet"
mktdata <- cbind(mktdata,OpClRet)
 
 
getSkewnessFromReturns <- function(mktdata,skewLookback){
   #Calculate the skew
  rollingSkew <- Lag(rollapply(mktdata$OpClRet ,FUN="skewness",width=skewLookback, align="right",na.pad=TRUE),1)
   colnames(rollingSkew) <- "Skew"
  getSkewnessFromReturns <- na.omit(cbind(mktdata$OpClRet,rollingSkew))
}
 
 
 
dev.new()
par(mfrow=c(3,3))
startSkew <-30
stepSize <- 30
numOfSteps <- 9
 
doPlot <- function(x,y,title,xlabel,ylabel){
  #Function to plot data and add regression line
  plotData <-cbind(x,y)
  print(head(plotData))
  colnames(plotData) <-c("x","y")
  plot(coredata(plotData),type="p",main=title, xlab=xlabel,ylab=ylabel)
  abline(lm(y~x),col="red",lwd=1.5)
}
 
 
for(i in seq(startSkew,numOfSteps*stepSize,stepSize)){
dat <- getSkewnessFromReturns(mktdata,i)
colnames(dat) <- c("OpClRet","Skew")
doPlot(dat$Skew,dat$OpClRet,paste("Skewness vs Open Close Returns (lookback=",i,")"),"Skewness","Open Close Returns")
}

 

$20mm seed capital tournament, “The Sharpe Ratio Shootout”

Hi, I’ve been asked to advertise this seed capital tournament. I get asked to advertise all sorts of things, but this one looks very promising for beginner quant traders looking to launch their careers.

Looking forward to competing against some of you 🙂

Announcing: $20mm seed capital tournament, “The Sharpe Ratio Shootout” Registration ends March 15th, 2013.

Battle-Fin is launching its 4th systematic trading tournament.  Tournament Sponsors have pledged up to $20mm in allocations.

Battle-Fin uses cutting edge real-time tournaments to democratically identify tomorrow’s best trading strategies across asset classes.

The idea is to harness the power of the internet to identify trading strategies.

If you are a quantitative trader with a successful investment strategy, register for our next tournament by March 15th, 2013. There is no cost to enter and we do not ask to own trader IP.

REGISTER AT: www.battlefin.com

You can also contact us at (212) 201-5376 or info@battlefin.com

BATTLE-FIN IN THE NEWS

Business Week recently wrote a feature article on Battle-Fin highlighting the 4 winners that are currently trading their allocations.
Click: http://www.businessweek.com/articles/2012-12-20/the-hedge-fund-hunger-games

Here is a recent Bloomberg TV interview on Market Makers with Battle Fin.

http://bloom.bg/XFIluz
Like us on Facebook at: http://www.facebook.com/BattleFin

Trend Following – Skewness Signal

Oxford Dictionary Definition of a trend

  • noun: a general direction in which something is developing or changing
Often people try to capture a trend using technical indicators such as moving average crossovers. If there is a trend, this technique is generally good at capturing it. However, if there isn’t a trend, it becomes difficult to identify sideways markets with moving averages (see http://gekkoquant.com/2012/08/29/parameter-optimisation-backtesting-part-2/ where moving averages performed excellently during 2009 when there was a real trend).

Is there a better way to identify a trend? Potentially. A common modelling assumption in finance is to say that stock returns follow Brownian motion and that over short periods the mean return is 0; in other words, daily returns are Gaussian distributed with 0 mean.

This post will use the same assumption: asset returns are Gaussian and hence symmetrically distributed around their mean (assumed to be 0). Symmetric distributions have a skewness of 0; any deviation from 0 indicates that one of the tails is beginning to get bigger, and possibly that the stock is trending.

Skewness is a measure of asymmetry for a probability distribution; it tells us the amount and direction of the skew. This strategy will take a rolling distribution of returns, calculate the skew of those returns, and use the skew to take trades.
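For reference, the moment-based sample skewness can be written in a few lines of base R (this is the classic moment formula; e1071::skewness, used in the code below, differs only by small-sample corrections depending on its type argument):

```r
#Moment-based sample skewness: third central moment scaled by variance^(3/2)
sampleSkewness <- function(x){
  devs <- x - mean(x)
  mean(devs^3) / mean(devs^2)^1.5
}
print(sampleSkewness(c(1, 2, 3, 4, 5))) #Symmetric data: skewness is 0
print(sampleSkewness(c(1, 1, 1, 10)))   #Long right tail: skewness is positive
```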

Strategy:
  • If skewness > UptrendSkewLimit then go Long
  • If skewness < DowntrendSkewLimit then go Short
  • If skewness is between UptrendSkewLimit and DowntrendSkewLimit then returns are symmetrical; it’s a sideways market, do nothing
 

 

Trend Following – Annualized Sharpe Ratio (Rf=0%) 0.7162438

S&P500 Long Open Close Returns – Annualized Sharpe Ratio (Rf=0%) 0.2129997

library("quantmod")
library("PerformanceAnalytics")
library(e1071)     #For the skewness command
 
#Model parameters
nLookback <- 30 #When calculating the rolling skew use the nLookback number of days
UptrendSkewLimit <- 0.3 #If the rolling skew is greater than this value go long
DowntrendSkewLimit <- -0.5 #If the rolling skew is lower than this value go short
#If the rolling skew is between the two limits do nothing, the skew is too weak to indicate a trend
 
#Script parameters
symbol <- "^GSPC"     #Symbol
 
#Specify dates for downloading data
startDate = as.Date("2005-01-01") #Specify what date to get the prices from
symbolData <- new.env() #Make a new environment for quantmod to store data in
getSymbols(symbol, env = symbolData, src = "yahoo", from = startDate)
mktdata <- get(sub("^","",symbol,fixed=TRUE), envir=symbolData) #Fetch the downloaded series from the quantmod environment
mktdata <- head(mktdata,-1) #Hack to fix some stupid duplicate date problem with yahoo
dayOpClRet <- Cl(mktdata)/Op(mktdata)    - 1
cat("About to calculate the rolling skew")
 
#Lets calculate the rolling skew
#Lag the rolling skew by one day so the skew measured at the close of day T is shifted to day T+1
#The skew will be used to determine the trade at the open
rollingSkew <- Lag(rollapply(dayOpClRet,FUN="skewness",width=nLookback, align="right"),1)
 
#Possible improvement - Do an exponential moving average on the skew signal to smooth it
#rollingSkew <- EMA(rollingSkew,n=nLookback)
 
longSignals <- (rollingSkew>UptrendSkewLimit)
longReturns <- longSignals*dayOpClRet
shortSignals <- (rollingSkew<DowntrendSkewLimit)
shortReturns <- -1*shortSignals*dayOpClRet
totalReturns <- longReturns + shortReturns
#Uncomment the line below to increase the position size for larger skews
#totalReturns <- totalReturns * (abs(rollingSkew)+1)
totalReturns[is.na(totalReturns)] <- 0
 
GEKKORed <- rgb(255/255, 0/255, 0/255, 0.1)
GEKKOGreen <- rgb(0/255, 255/255, 0/255, 0.1)
 
dev.new()
par(mfrow=c(3,1))
plot(Cl(mktdata), main="Close of S&P 500")
lines(longSignals*max(Cl(mktdata)), col=(GEKKOGreen),type="h")
lines(shortSignals*max(Cl(mktdata)), col=(GEKKORed),type="h")
plot(rollingSkew)
abline(UptrendSkewLimit,0,col="green")
abline(DowntrendSkewLimit,0,col="red")
plot(cumsum(totalReturns), main="Cumulative Returns - Trend Following")
 
 
#### Performance Analysis ###
colnames(dayOpClRet) <- "Long IndexOpCloseRet"
zooTradeVec <- cbind(as.zoo(totalReturns),as.zoo(dayOpClRet)) #Convert to zoo object
colnames(zooTradeVec) <- c("Trend Following","S&P500 Long Open Close Returns")
zooTradeVec <- na.omit(zooTradeVec)
#Lets see how all the strategies faired against the index
dev.new()
charts.PerformanceSummary(zooTradeVec,main="Performance of Trend Following",geometric=FALSE)
 
#Calculate the sharpe ratio
cat("Sharpe Ratio")
print(SharpeRatio.annualized(zooTradeVec))