%\documentclass[article,nojss]{jss} %\DeclareGraphicsExtensions{.pdf,.eps} %%\newcommand{\mysection}[2]{\subsubsection[#2]{\textbf{#1}}} %\let\mysection=\subsubsection %\renewcommand{\jsssubsubsec}[2][default]{\vskip \preSskip% % \pdfbookmark[3]{#1}{Subsubsection.\thesubsubsection.#1}% % \refstepcounter{subsubsection}% % {\large \textbf{\textit{#2}}} \nopagebreak % \vskip \postSskip \nopagebreak} %% -*- encoding: utf-8 -*- %\VignetteIndexEntry{xts FAQ} %\VignetteDepends{zoo} \documentclass{article} % \usepackage{Rd} \usepackage{Sweave} \usepackage{hyperref} \hypersetup{colorlinks,% citecolor=black,% linkcolor=blue,% urlcolor=blue,% } %%\encoding{UTF-8} %%\usepackage[UTF-8]{inputenc} % \newcommand{\q}[1]{\section*{#1}\addcontentsline{toc}{subsection}{#1}} \author{xts Deveopment Team% \footnote{Contact author: Joshua M. Ulrich \email{josh.m.ulrich@gmail.com}} \footnote{Thanks to Alberto Giannetti and Michael R. Weylandt for their many contributions.} } \title{\bf xts FAQ} %\Keywords{irregular time series, time index, daily data, weekly data, returns} %\Abstract{ % This is a collection of frequently asked questions (FAQ) about the % \pkg{xts} package together with their answers. %} \begin{document} \SweaveOpts{concordance=TRUE, engine=R, eps=FALSE} %\SweaveOpts{engine=R, eps=FALSE} <>= library("xts") Sys.setenv(TZ="GMT") @ \makeatletter \makeatother \maketitle \tableofcontents \q{What is \pkg{xts}?} % \pkg{xts} is an \pkg{R} package offering a number of functionalities to work on time-indexed data. \pkg{xts} extends \pkg{\pkg{zoo}}, another popular package for time-series analysis. % should point to the zoo FAQ here (or at some early point) \q{Why should I use \pkg{xts} rather than \pkg{zoo} or another time-series package?} % The main benefit of \pkg{xts} is its seamless compatibility with other packages using different time-series classes (\pkg{timeSeries}, \pkg{zoo}, ...). In addition, \pkg{xts} allows the user to add custom attributes to any object. See the main \pkg{xts} vignette for more information. \q{How do I install \pkg{xts}?} % \pkg{xts} depends on \pkg{zoo} and suggests some other packages. You should be able to install \pkg{xts} and all the other required components by simply calling \code{install.packages('pkg')} from the \pkg{R} prompt. \q{I have multiple .csv time-series files that I need to load in a single \pkg{xts} object. What is the most efficient way to import the files?} % If the files have the same format, load them with \code{read.zoo} and then call \code{rbind} to join the series together; finally, call \code{as.xts} on the result. Using a combination of \code{lapply} and \code{do.call} can accomplish this with very little code: <>= filenames <- c("a.csv", "b.csv", "c.csv") sample.xts <- as.xts(do.call("rbind", lapply(filenames, read.zoo))) @ \q{Why is \pkg{xts} implemented as a matrix rather than a data frame?} % \pkg{xts} uses a matrix rather than data.frame because: \begin{enumerate} \item \pkg{xts} is a subclass of \pkg{zoo}, and that's how \pkg{zoo} objects are structured; and \item matrix objects have much better performance than data.frames. \end{enumerate} \q{How can I simplify the syntax when referring to \pkg{xts} object column names?} % \code{with} allows you to use the colmn names while avoiding the full square brackets syntax. For example: <>= lm(sample.xts[, "Res"] ~ sample.xts[, "ThisVar"] + sample.xts[, "ThatVar"]) @ can be converted to <>= with(sample.xts, lm(Res ~ ThisVar + ThatVar)) @ \q{How can I replace the zeros in an \pkg{xts} object with the last non-zero value in the series?} % Convert the zeros to \code{NA} and then use \code{na.locf}: <<>>= sample.xts <- xts(c(1:3, 0, 0, 0), as.POSIXct("1970-01-01")+0:5) sample.xts[sample.xts==0] <- NA cbind(orig=sample.xts, locf=na.locf(sample.xts)) @ \q{How do I create an \pkg{xts} index with millisecond precision?} % Milliseconds in \pkg{xts} indexes are stored as decimal values. This example builds an index spaced by 100 milliseconds, starting at the current system time: <<>>= data(sample_matrix) sample.xts <- xts(1:10, seq(as.POSIXct("1970-01-01"), by=0.1, length=10)) @ \q{I have a millisecond-resolution index, but the milliseconds aren't displayed. What went wrong?} % Set the \code{digits.secs} option to some sub-second precision. Continuing from the previous example, if you are interested in milliseconds: <<>>= options(digits.secs=3) head(sample.xts) @ \q{I set \code{digits.sec=3}, but \pkg{R} doesn't show the values correctly.} % Sub-second values are stored with approximately microsecond precision. Setting the precision to only 3 decimal hides the full index value in microseconds and might be tricky to interpret depending how the machine rounds the millisecond (3rd) digit. Set the \code{digits.secs} option to a value higher than 3 or convert the date-time to numeric and use \code{print}'s \code{digits} argument, or \code{sprintf} to display the full value. For example: <<>>= dt <- as.POSIXct("2012-03-20 09:02:50.001") print(as.numeric(dt), digits=20) sprintf("%20.10f", dt) @ \q{I am using \code{apply} to run a custom function on my \pkg{xts} object. Why does the returned matrix have different dimensions than the original one?} % When working on rows, \code{apply} returns a transposed version of the original matrix. Simply call \code{t} on the returned matrix to restore the original dimensions: <>= sample.xts.2 <- xts(t(apply(sample.xts, 1, myfun)), index(sample.xts)) @ \q{I have an \pkg{xts} object with varying numbers of observations per day (e.g., one day might contain 10 observations, while another day contains 20 observations). How can I apply a function to all observations for each day?} % You can use \code{apply.daily}, or \code{period.apply} more generally: <<>>= sample.xts <- xts(1:50, seq(as.POSIXct("1970-01-01"), as.POSIXct("1970-01-03")-1, length=50)) apply.daily(sample.xts, colMeans) period.apply(sample.xts, endpoints(sample.xts, "days"), colMeans) period.apply(sample.xts, endpoints(sample.xts, "hours", 6), colMeans) @ \q{How can I process daily data for a specific time subset?} % First use time-of-day subsetting to extract the time range you want to work on (note the leading \code{"T"} and leading zeros are required for each time in the range: \code{"T06:00"}), then use \code{apply.daily} to apply your function to the subset: <>= apply.daily(sample.xts['T06:00/T17:00',], colMeans) @ \q{How can I analyze my irregular data in regular blocks, adding observations for each regular block if one doesn't exist in the origianl time-series object?} % Use \code{align.time} to round-up the indexes to the periods you are interested in, then call \code{period.apply} to apply your function. Finally, merge the result with an empty xts object that contains all the regular index values you want: <<>>= sample.xts <- xts(1:6, as.POSIXct(c("2009-09-22 07:43:30", "2009-10-01 03:50:30", "2009-10-01 08:45:00", "2009-10-01 09:48:15", "2009-11-11 10:30:30", "2009-11-11 11:12:45"))) # align index into regular (e.g. 3-hour) blocks aligned.xts <- align.time(sample.xts, n=60*60*3) # apply your function to each block count <- period.apply(aligned.xts, endpoints(aligned.xts, "hours", 3), length) # create an empty xts object with the desired regular index empty.xts <- xts(, seq(start(aligned.xts), end(aligned.xts), by="3 hours")) # merge the counts with the empty object head(out1 <- merge(empty.xts, count)) # or fill with zeros head(out2 <- merge(empty.xts, count, fill=0)) @ \q{Why do I get a \pkg{zoo} object when I call \code{transform} on my \pkg{xts} object?} % There's no \pkg{xts} method for \code{transform}, so the \pkg{zoo} method is dispatched. The \pkg{zoo} method explicitly creates a new \pkg{zoo} object. To convert the transformed object back to an \pkg{xts} object wrap the \code{transform} call in \code{as.xts}: <>= sample.xts <- as.xts(transform(sample.xts, ABC=1)) @ You might also have to reset the index timezone: <>= tzone(sample.xts) <- Sys.getenv("TZ") @ \q{Why can't I use the \code{\&} operator in \pkg{xts} objects when querying dates?} % \code{"2011-09-21"} is not a logical vector and cannot be coerced to a logical vector. See \code{?"\&"} for details. \pkg{xts}' ISO-8601 style subsetting is nice, but there's nothing we can do to change the behavior of \code{.Primitive("\&")}. You can do something like this though: <>= sample.xts[sample.xts$Symbol == "AAPL" & index(sample.xts) == as.POSIXct("2011-09-21"),] @ or: <>= sample.xts[sample.xts$Symbol == "AAPL"]['2011-09-21'] @ \q{How do I subset an \pkg{xts} object to only include weekdays (excluding Saturday and Sundays)?} % Use \code{.indexwday} to only include Mon-Fri days: <<>>= data(sample_matrix) sample.xts <- as.xts(sample_matrix) wday.xts <- sample.xts[.indexwday(sample.xts) %in% 1:5] head(wday.xts) @ \q{I need to quickly convert a data.frame that contains the time-stamps in one of the columns. Using \code{as.xts(Data)} returns an error. How do I build my \pkg{xts} object?} % The \code{as.xts} function assumes the date-time index is contained in the \code{rownames} of the object to be converted. If this is not the case, you need to use the \code{xts} constructor, which requires two arguments: a vector or a matrix carrying data and a vector of type \code{Date}, \code{POSIXct}, \code{chron}, \ldots, supplying the time index information. If you are certain the time-stamps are in a specific column, you can use: <<>>= Data <- data.frame(timestamp=as.Date("1970-01-01"), obs=21) sample.xts <- xts(Data[,-1], order.by=Data[,1]) @ If you aren't certain, you need to explicitly reference the column name that contains the time-stamps: <<>>= Data <- data.frame(obs=21, timestamp=as.Date("1970-01-01")) sample.xts <- xts(Data[,!grepl("timestamp",colnames(Data))], order.by=Data$timestamp) @ \q{I have two time-series with different frequency. I want to combine the data into a single \pkg{xts} object, but the times are not exactly aligned. I want to have one row in the result for each ten minute period, with the time index showing the beginning of the time period.} % \code{align.time} creates evenly spaced time-series from a set of indexes, \code{merge} ensure two time-series are combined in a single \pkg{xts} object with all original columns and indexes preserved. The new object has one entry for each timestamp from both series and missing values are replaced with \code{NA}. <>= x1 <- align.time(xts(Data1$obs, Data1$timestamp), n=600) x2 <- align.time(xts(Data2$obs, Data2$timestamp), n=600) merge(x1, x2) @ \q{Why do I get a warning when running the code below?} <<>>= data(sample_matrix) sample.xts <- as.xts(sample_matrix) sample.xts["2007-01"]$Close <- sample.xts["2007-01"]$Close + 1 #Warning message: #In NextMethod(.Generic) : # number of items to replace is not a multiple of replacement length @ % This code creates two calls to the subset-replacement function \code{xts:::`[<-.xts`}. The first call replaces the value of \code{Close} in a temporary copy of the first row of the object on the left-hand-side of the assignment, which works fine. The second call tries to replace the first \emph{element} of the object on the left-hand-side of the assignment with the modified temporary copy of the first row. This is the problem. For the command to work, there needs to be a comma in the first subset call on the left-hand-side: <>= sample.xts["2007-01",]$Close <- sample.xts["2007-01"]$Close + 1 @ This isn't encouraged, because the code isn't clear. Simply remember to subset by column first, then row, if you insist on making two calls to the subset-replacement function. A cleaner and faster solution is below. It's only one function call and it avoids the \code{\$} function (which is marginally slower on xts objects). <>= sample.xts["2007-01","Close"] <- sample.xts["2007-01","Close"] + 1 @ %%% What is the fastest way to subset an xts object? \end{document}