Plotting forecast() objects in ggplot part 2: Visualize Observations, Fits, and Forecasts
In my last post I presented a function for extracting data from a forecast() object and formatting the data so that it can be plotted in ggplot. The scenario is that you are fitting a model to a time series object with training data, then forecasting out, and then visually evaluating the fit with the observations that your forecast tried to duplicate. Then you want a plot that includes: the original observations, the fitted values, the forecast values, and the observations in the forecast period. The function I presented in the last post extracts all that information in a nice ggplot ready data.frame. In this post I simulate data from an Arima process, fit an incorrect model, use the function from the last post to extract the data, and then plot in ggplot.
#----------Simulate an Arima (2,1,1) Process------------- library(forecast)
set.seed(1234) y<-arima.sim(model=list(order=c(2,1,1),ar=c(0.5,.3),ma=0.3),n=144) y<-ts(y,freq=12,start=c(2000,1)) #-- Extract Training Data, Fit the Wrong Model, and Forecast yt<-window(y,end=2009.99) yfit<-Arima(yt,order=c(1,0,1)) yfor<-forecast(yfit) #---Extract the Data for ggplot using funggcast() pd<-funggcast(y,yfor) #---Plot in ggplot2 0.9 library(ggplot2) library(scales) p1a<-ggplot(data=pd,aes(x=date,y=observed)) p1a<-p1a+geom_line(col='red') p1a<-p1a+geom_line(aes(y=fitted),col='blue') p1a<-p1a+geom_line(aes(y=forecast))+geom_ribbon(aes(ymin=lo95,ymax=hi95),alpha=.25) p1a<-p1a+scale_x_date(name='',breaks='1 year',minor_breaks='1 month',labels=date_format("%b-%y"),expand=c(0,0)) p1a<-p1a+scale_y_continuous(name='Units of Y') p1a<-p1a+opts(axis.text.x=theme_text(size=10),title='Arima Fit to Simulated Data\n (black=forecast, blue=fitted, red=data, shadow=95% conf. interval)') #p1a
Created by Pretty R at inside-R.org
Reader Comments (5)
Hi while trying to execute the above code... It gave me an error while I was fitting the incorrect Arima model i.e. c(1,0,1). The error was: non-stationary AR part from CSS. This gets fixed if you change the method to "ML". So new code:
yfit<-Arima(yt,order=c(1,0,1),method="ML")
I wonder if you can explain this though ..... Thanks !
Hi Ankur- Thanks for the comment. I ran the code again a few times and was not able to reproduce the error but did get a warning on some occasions (but not all). My guess is that sometimes data that gets generated from
arima.sim()
has properties that do not lend themselves well to using conditional sum of squares to find the starting value for the maximum likelihood. Beyond that I don't know. I found that when I set the random number seedset.seed(1234)
the problem does not occur. My R session info is below, in case it has something to do with your version of R or forecast().R version 2.14.2 (2012-02-29)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] forecast_3.19 RcppArmadillo_0.2.35 Rcpp_0.9.10
[4] fracdiff_1.4-0 tseries_0.10-28 zoo_1.7-7
[7] quadprog_1.5-4 rj_1.0.0-3
loaded via a namespace (and not attached):
[1] grid_2.14.2 lattice_0.20-0 rj.gd_1.0.0-1 tools_2.14.2
My husband is a progarmmer, so I'll show him this article and I know for sure he'll be pleased and use it for his project.
Great post!!!
I am new to time series and found your function to be very helpful when dealing with dates along the x axis in ggplot2. I have a question if you don't mind, how can I apply funggcast to be able to show weekly breaks along the x axis?
I am pretty impressed by the things that you tell here about the forecast, observations and fits.