Stock Data Download & Saving R

The quantmod library will be used to download prices from Yahoo. This script loads a CSV of tickers and downloads the price history for each one automatically. It also has primitive error handling and can retry a failed download. Yahoo sometimes returns error 404 at random for stock downloads, so the script retries each ticker up to maxretryattempts times (5 by default) to get around this problem.
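The retry idea can be sketched in isolation. This is only a minimal illustration, not the script itself: `with_retry` and `flaky_download` are hypothetical names, and the "download" is a stand-in that fails twice before succeeding, so the example runs without a network connection.

```r
# Hypothetical helper: call f() up to max_attempts times and return the
# first successful result, or NULL if every attempt fails.
with_retry <- function(f, max_attempts = 5) {
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(
      list(ok = TRUE, value = f()),
      error = function(e) {
        message("Attempt ", attempt, "/", max_attempts,
                " failed: ", conditionMessage(e))
        list(ok = FALSE, value = NULL)
      }
    )
    if (result$ok) return(result$value)
  }
  NULL
}

# Stand-in for a flaky download: fails on the first two calls, then succeeds.
# The call counter lives in an environment so the function can update it.
state <- new.env()
state$calls <- 0
flaky_download <- function() {
  state$calls <- state$calls + 1
  if (state$calls < 3) stop("HTTP 404")
  "price data"
}

prices <- with_retry(flaky_download, max_attempts = 5)
```

In the real script the body of `f` would be the `getSymbols` call; the full listing further down inlines the same loop rather than using a helper.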

Any data downloaded is saved to an RData file so that it can be analysed by another script. Rather than having each trading strategy maintain its own stock data, it is preferable to have one script (this one) download the data, which can then be shared across strategies.

From now on the strategies examined on this site will load their price data from the output of this script.

Attached is a list of S&P 500 tickers (taken from the brilliant blog.quanttrader.org).

Onto the code:

#install.packages("quantmod")
library("quantmod")
#Script to download prices from yahoo
#and Save the prices to a RData file
#The tickers will be loaded from a csv file
 
#Script Parameters
tickerlist <- "sp500.csv"  #CSV containing tickers on rows
savefilename <- "stockdata.RData" #The file to save the data in
startDate <- as.Date("2005-01-13") #Specify what date to get the prices from
maxretryattempts <- 5 #If there is an error downloading a price how many times to retry
 
#Load the list of ticker symbols from a csv, each row contains a ticker
stocksLst <- read.csv(tickerlist, header = FALSE, stringsAsFactors = FALSE)
stockData <- new.env() #Make a new environment for quantmod to store data in
nrstocks <- length(stocksLst[,1]) #The number of stocks to download
 
#Download all the stock data
for (i in 1:nrstocks){
    for(t in 1:maxretryattempts){
 
       tryCatch(
           {
               #This is the statement to Try
               #Check to see if the variable already exists in the environment
               #exists() looks the name up directly, so tickers such as "BRK-B" are handled
               #(the older string-to-variable trick using eval(parse()) is described at
               #http://www.r-bloggers.com/converting-a-string-to-a-variable-name-on-the-fly-and-vice-versa-in-r/)
                if(exists(stocksLst[i,1], envir = stockData, inherits = FALSE)){
                    #The variable exists so dont need to download data for this stock
                    #So lets break out of the retry loop and process the next stock
                    #cat("No need to retry")
                    break
                }
 
              #The stock wasnt previously downloaded so lets attempt to download it
              cat("(",i,"/",nrstocks,") ","Downloading ", stocksLst[i,1] , "\t\t Attempt: ", t , "/", maxretryattempts,"\n")
              getSymbols(stocksLst[i,1], env = stockData, src = "yahoo", from = startDate)
           }
        #Specify the catch function, and the finally function
       , error = function(e) print(e))
     }
}
 
#Lets save the stock data to a data file
tryCatch(
    {
    save(stockData, file=savefilename)
    cat("Successfully saved the stock data to", savefilename, "\n")
    }
    , error = function(e) print(e))
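
To read the data back in a strategy script, something along the following lines should work. It is a sketch under stated assumptions: the tickers "AAA", "BBB" and "CCC" and their prices are made up, the objects are plain data frames rather than the xts objects quantmod stores, and the file is written to a temporary directory so the example runs without downloading anything.

```r
# Stand-in for the download script: build an environment holding one object
# per ticker and save it. The tickers and prices here are made up.
stockData <- new.env()
assign("AAA", data.frame(Close = c(10, 11, 12)), envir = stockData)
assign("BBB", data.frame(Close = c(20, 19, 21)), envir = stockData)
savefilename <- file.path(tempdir(), "stockdata.RData")
save(stockData, file = savefilename)
rm(stockData)

# In the strategy script: load() restores the object under its saved name.
load(savefilename)
tickers <- ls(stockData)   # names of everything that was saved
print(tickers)

# Fetch one series by name.
aaa <- get("AAA", envir = stockData)

# Loop over every series, for example to plot each one.
for (ticker in tickers) {
  series <- get(ticker, envir = stockData)
  plot(series$Close, type = "l", main = ticker, ylab = "Close")
}

# Tickers that were requested but are not in the saved environment.
requested <- c("AAA", "BBB", "CCC")
failed <- setdiff(requested, tickers)
```

With real quantmod data each object is an xts series, so `Cl(series)` would take the place of `series$Close`, and comparing the ticker list against `ls(stockData)` shows which downloads never succeeded.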

5 thoughts on “Stock Data Download & Saving R”

  1. Pingback: Trading Strategy – Buy on Gap (EPChan) | Gekko Quant – Quantitative Trading

  2. Thanks a lot for sharing.

    Before seeing your blog post I had seen the one by quanttrader that you refer to. Below I copy the quanttrader code adapted to run on both windows and linux (doMC is not available on Windows, so I used doSNOW). I copy it here for the record because quanttrader appears to have stopped posting comments and new blog posts.

    The code below worked. Your code worked too. Both rely on having a list of the tickers. However, this list appears to change every couple of months (mergers, bankruptcies, downgrades, etc., I guess), and currently the download fails for half a dozen tickers or so.

    Would you have a way to create a list of the SP500 tickers “on the fly” ? That would be great to have. Just a thought.

    Also, if you would please post a one- or two-liner to show how to retrieve the data saved, as you have done, in an “environment”. I’m not familiar with new.env(), so while I have saved the data, I don’t know how to access it afterwards — I suspect it’s trivial. If you have time, it would be great.

    ### Define directories
    if(.Platform$OS.type == "windows") {
    currentdir <- "c:/R/sp500"
    } else {
    currentdir <- "~/R/sp500"
    }
    setwd(currentdir)

    ### Download S&P 500 Data to R
    ### Download stock prices of companies included in S&P 500 index.
    ### Download from finance.yahoo
    ### sp500.csv contains a list of nearly all 500 companies
    ### loop to download all the data for every company in the list.
    ### linux version by: quanttrader, April 2012
    ### http://blog.quanttrader.org/2012/04/download-prices-from-yahoo-in-parallel/
    ### windows version by: annoporci, March 2013

    ### Parallel implementation
    ### package doMC not available on Windows

    library(quantmod)
    library(tseries)
    library(timeDate)
    library(foreach)
    symbols = read.csv("sp500.csv", header = FALSE, stringsAsFactors = FALSE)
    nStocks = length(symbols[,1])
    dateStart = "1999-12-31"

    # doSNOW library also available on LINUX
    # efficiency untested

    if(.Platform$OS.type == "windows") {

    # WINDOWS

    library(doSNOW)
    cl <- makeCluster(8) # number of CPU cores to be used
    registerDoSNOW(cl)
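    # snow workers start with no packages attached, so name the ones the loop body needs
    z <- foreach(i = 1:nStocks, .combine = merge.xts, .packages = c("tseries", "xts")) %dopar%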
    {
    cat("Downloading ", i, " out of ", nStocks , "\n")
    x <- try(get.hist.quote(instrument = symbols[i,],
    start = dateStart,
    quote = "AdjClose",
    retclass = "zoo",
    quiet = TRUE),
    TRUE)
    colnames(x) <- symbols[i,1]
    x <- as.xts(x)
    ### as.xts(x) is more efficient
    }
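    save(z,file="sp500.RData") # z holds the merged result; x only exists inside the loop body
    stopCluster(cl)
    registerDoSEQ() # switch back to the sequential backend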

    } else {

    # LINUX

    library(doMC)
    ncores <- parallel::detectCores() # query number of cores (getDoParWorkers() reports 1 before a backend is registered)
    registerDoMC(cores=ncores) # number of CPU cores to be used
    z <- foreach(i = 1:nStocks, .combine = merge.xts) %dopar%
    {
    cat("Downloading ", i, " out of ", nStocks , "\n")

    x <- try(get.hist.quote(instrument = symbols[i,],
    start = dateStart,
    quote = "AdjClose",
    retclass = "zoo",
    quiet = TRUE),
    TRUE)
    colnames(x) <- symbols[i,1]
    x <- as.xts(x)
    ### as.xts(x) is more efficient
    }
    save(z,file="sp500linux.RData") # z holds the merged result; x only exists inside the loop body
    registerDoMC()

    }

  3. After reading all the info into the stockData environment, how can I retrieve the series iteratively? For example, to show them all in a plot?

    for all t in env,
    plot t?

    thanks

  4. Pingback: Retreiving data from LSE – R in the UK Market
