Plotting forecast() objects in ggplot part 1: Extracting the Data
Lately I've been using Rob J Hyndman's excellent forecast package. The package comes with some built-in plotting functions, but I found I wanted to customize and make my own plots in ggplot. To do that, I needed a generalizable function that extracts all the data I want (forecasts, fitted values, training data, actual observations in the forecast period, confidence intervals, et cetera) and places it into a data.frame with a properly formatted date field (i.e., not a ts() object).
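To make the list above concrete, here is a small sketch of where those pieces sit inside a forecast object. It assumes the forecast package is installed and uses the built-in AirPassengers series with an arbitrary ARIMA(1,1,1) order chosen only for illustration; other model types may populate these components differently.

```r
library(forecast)

# Illustrative model only: the order is arbitrary, not a recommendation
fit <- Arima(AirPassengers, order = c(1, 1, 1))
fc  <- forecast(fit, h = 12)   # default prediction levels are 80 and 95

fc$x        # the series the model was fitted to (a ts object)
fc$fitted   # in-sample fitted values (ts)
fc$mean     # point forecasts (ts)
fc$lower    # lower interval bounds, one column per level
fc$upper    # upper interval bounds, one column per level
fc$level    # the levels themselves

# as.data.frame() gives point forecasts and intervals, with dates as row names
head(as.data.frame(fc))
```

All of these are still ts objects or carry their dates as row names, which is why a conversion step is needed before handing them to ggplot.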
The function below does all that and should work for any forecast object (though I've only tested it on Arima() outputs). The only arguments it takes are the original observations and the forecast object (whatever results from calling forecast()). In my next post I'll give some examples of plotting the results using ggplot and explain why I wanted more than the default plot.forecast() function.
#--Produces a data.frame with the Source Data+Training Data, Fitted Values+Forecast Values, forecast data Confidence Intervals
funggcast<-function(dn,fcast){
require(zoo) #needed for the 'as.yearmon()' function
en<-max(time(fcast$mean)) #extract the max date used in the forecast
#Extract Source and Training Data
ds<-as.data.frame(window(dn,end=en))
names(ds)<-'observed'
ds$date<-as.Date(time(window(dn,end=en)))
#Extract the Fitted Values (need to figure out how to grab confidence intervals)
dfit<-as.data.frame(fcast$fitted)
dfit$date<-as.Date(time(fcast$fitted))
names(dfit)[1]<-'fitted'
ds<-merge(ds,dfit,all.x=T) #Merge fitted values with source and training data
#Extract the Forecast values and confidence intervals
dfcastn<-as.data.frame(fcast)
dfcastn$date<-as.Date(as.yearmon(row.names(dfcastn)))
names(dfcastn)<-c('forecast','lo80','hi80','lo95','hi95','date')
pd<-merge(ds,dfcastn,all.x=T) #final data.frame for use in ggplot
return(pd)
}
Reader Comments (6)
Thanks for showing how to do this, Frank. Just a couple of comments. You shouldn't need to pass the original observations, as they are stored as component x in the forecast object. Also, if the forecast object contains prediction intervals other than the 80% and 95% intervals, this function will cause an error. Finally, rather than use as.Date() on the row names, it would be simpler (and probably more robust) to use time() on the mean component.
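One way those three suggestions might be combined is sketched below. The helper name forecast_to_df and its actual argument are hypothetical (they are not part of the forecast package), the date conversion assumes monthly data, and it has only been reasoned through for Arima() output, so treat it as a starting point rather than a drop-in replacement.

```r
library(forecast)
library(zoo)   # as.yearmon() and its as.Date() method

# Hypothetical helper: builds the plotting data.frame from the forecast object
forecast_to_df <- function(fcast, actual = NULL) {
  # Assumes monthly data: ts time index -> first-of-month Date
  to_date <- function(x) as.Date(as.yearmon(as.numeric(time(x))))

  # Training data and fitted values are stored in the forecast object itself
  hist <- data.frame(date     = to_date(fcast$x),
                     observed = as.numeric(fcast$x),
                     fitted   = as.numeric(fcast$fitted))

  # Name the interval columns from whatever levels are actually present
  lower <- as.data.frame(fcast$lower)
  upper <- as.data.frame(fcast$upper)
  names(lower) <- paste0("lo", fcast$level)
  names(upper) <- paste0("hi", fcast$level)

  # Dates come from time() on the mean component, not from row names
  fut <- cbind(data.frame(date     = to_date(fcast$mean),
                          forecast = as.numeric(fcast$mean)),
               lower, upper)

  # Optionally attach held-out observations for the forecast period
  if (is.null(actual)) {
    fut$observed <- NA_real_
  } else {
    act <- data.frame(date = to_date(actual), observed = as.numeric(actual))
    fut <- merge(fut, act, by = "date", all.x = TRUE)
  }

  # all = TRUE keeps both the historical rows and the forecast rows
  out <- merge(hist, fut, by = c("date", "observed"), all = TRUE)
  out[order(out$date), ]
}

# Usage with non-default levels; the ARIMA order is arbitrary
train <- window(AirPassengers, end = c(1959, 12))
fc    <- forecast(Arima(train, order = c(1, 1, 1)), h = 12, level = c(50, 90))
pd    <- forecast_to_df(fc, actual = AirPassengers)
tail(pd)
```

Passing the full series as actual is optional; without it, observed is simply NA over the forecast period.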
Hi Rob-
Thanks for the feedback! The idea behind including the original observations was that they also contain the observations during the forecast period. I suppose, to make it simpler, I could just have the user pass those observations rather than the whole series.
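For readers unsure what "observations during the forecast period" means in practice, the following sketch splits a series into a training part and a held-out part using window() from base R. The AirPassengers data and the 1959/1960 split point are just an illustration.

```r
# AirPassengers (monthly, Jan 1949 to Dec 1960) ships with base R
full    <- AirPassengers
train   <- window(full, end   = c(1959, 12))  # what the model is fitted on
holdout <- window(full, start = c(1960, 1))   # actuals in the forecast period

length(train)    # 132
length(holdout)  # 12
```

The model only ever sees train, so holdout is the part that has to be supplied separately if it is to appear in the plot.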
Hello Rob, I'm using your function in R 3.1 on 64-bit Windows 8.
Your function is very helpful, since I'm also running ARIMA forecasts.
However, there seems to be a bug in the function.
After running it on my time series data,
the function left NA in all the forecast values,
so I changed
pd<-merge(ds,dfcastn,all.x=T)
to
pd<-merge(ds,dfcastn,all=TRUE)
and everything works fine.
Here is the debugged function:
funggcast<-function(dn,fcast){
require(zoo) #needed for the 'as.yearmon()' function
en<-max(time(fcast$mean)) #extract the max date used in the forecast
#Extract Source and Training Data
ds<-as.data.frame(window(dn,end=en))
names(ds)<-'observed'
ds$date<-as.Date(time(window(dn,end=en)))
#Extract the Fitted Values (need to figure out how to grab confidence intervals)
dfit<-as.data.frame(fcast$fitted)
dfit$date<-as.Date(time(fcast$fitted))
names(dfit)[1]<-'fitted'
ds<-merge(ds,dfit,all.x=T) #Merge fitted values with source and training data
#Extract the Forecast values and confidence intervals
dfcastn<-as.data.frame(fcast)
dfcastn$date<-as.Date(as.yearmon(row.names(dfcastn)))
names(dfcastn)<-c('forecast','lo80','hi80','lo95','hi95','date')
#pd<-merge(ds,dfcastn,all.x=T) #final data.frame for use in ggplot
#Changed
pd<-merge(ds,dfcastn,all=TRUE)
return(pd)
}
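The difference between the two merge() calls can be seen on a toy example in base R. The two small data frames below are made-up stand-ins for ds and dfcastn in the case where the series passed in stops where the forecast begins, which is presumably the situation described in this comment.

```r
# Made-up stand-ins: history only in ds, forecast only in dfcastn
ds <- data.frame(date     = as.Date(c("2020-01-01", "2020-02-01")),
                 observed = c(10, 12))
dfcastn <- data.frame(date     = as.Date(c("2020-03-01", "2020-04-01")),
                      forecast = c(13, 14))

left <- merge(ds, dfcastn, all.x = TRUE)  # keeps only dates present in ds
both <- merge(ds, dfcastn, all = TRUE)    # keeps dates from either frame

nrow(left)             # 2: forecast rows dropped
all(is.na(left$forecast))  # TRUE: forecast column is entirely NA
nrow(both)             # 4: history rows plus forecast rows
```

With all.x=T the forecast rows only survive if ds already contains their dates, which is the case when the series passed in also covers the forecast period.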
Hello Frank,
With some extra research, I think it's easier to use autoplot(), since it is a ggplot2 generic. It gives you a ggplot2-compatible plot that you can customize as much as ggplot2 allows.
Cheers,
Islam
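A minimal sketch of the autoplot() route, assuming reasonably recent versions of forecast and ggplot2 are installed (the forecast package supplies the autoplot() method for forecast objects). The model order, title, and axis labels are arbitrary choices for illustration.

```r
library(forecast)
library(ggplot2)

# Illustrative model only: the order is arbitrary
fit <- Arima(AirPassengers, order = c(1, 1, 1))
fc  <- forecast(fit, h = 12)

# autoplot() returns a ggplot object, so the usual layers can be added
p <- autoplot(fc) +
  labs(title = "AirPassengers forecast", x = "Year", y = "Passengers") +
  theme_minimal()

p  # draws the plot in an interactive session
```

This is quicker than extracting the data by hand, though building your own data.frame still gives full control over which series and intervals are drawn.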