Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.
See also: definition of ARIMA models
Time Series - Establishing ARIMA models

The process of finding appropriate ARIMA models has been studied intensively,
and detailed guidelines exist as a result. The method described in
[Box and Jenkins, 1970] is referred to as the "Box-Jenkins approach." It
consists of three steps: model selection, parameter estimation, and
performance checking.
1. Model Selection

In the model selection phase, a single model is chosen. This requires determining the values p, d, and q of an ARIMA[p,d,q]-model. In this phase, it is important to collect as much relevant information on the time series as possible. The first steps involve filtering out trends and removing seasonal effects. The autocorrelation function can be inspected to reveal the best choice of d. Heuristics exist as to which filter to use, depending on the shape of the autocorrelation function: for instance, whether it is descending, alternates between positive and negative values, shows peaks, or is periodic.

Heuristics also guide the selection of the appropriate model for time series with p <= 2 and q <= 2. Surprisingly, the majority of time series can be modeled very well with such simple models.

The autocorrelation function (ACF) and the partial autocorrelation function (PACF) can be used for determining p and q of the ARIMA[p,d,q]-model. Both are calculated for a limited number of time lags τ, e.g. 20, together with confidence intervals (e.g. 95% intervals). The time lags τ at which the function values lie outside the confidence interval are candidates for p and q: lags at which the ACF lies outside the interval indicate that an MA[τ] model should be used, and lags at which the PACF lies outside the interval indicate that an AR[τ] model may be applicable.

2. Parameter Estimation

In order to estimate the time series value x(t) with an ARIMA[p,d,q]-model, p, d, and q have to be selected first. The number of differencing steps d determines how often the original time series is differenced before the respective formula is applied. This procedure is required for filtering trends.

When p, d, and q of an ARIMA model are given, the parameters αi and βj can be estimated. This is done by minimizing (some function of) the error, i.e. the distance between the original time series and the time series produced by the model. When differencing is used, i.e. d > 0, the errors for the d-th difference of the time series are taken. The "least squares approach", which minimizes the sum of the squared errors, is the most common technique and is often used as the default. Depending on the overall task, however, other performance measures may be more reasonable for a given application.
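
As a minimal sketch of the estimation step, the following Python code differences a series d times and then fits a single AR coefficient by least squares. It assumes the simplest case, an ARIMA[1,1,0]-model without MA terms and without a constant, so that the least squares solution has a closed form. The function names and the noise-free synthetic series are illustrative assumptions and are not taken from the Box-Jenkins text.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (removes polynomial trends)."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


def fit_ar1_least_squares(y):
    """Least squares estimate of alpha in y[t] = alpha * y[t-1] + e[t].

    Minimizing sum((y[t] - alpha * y[t-1])**2) over alpha gives
    alpha = sum(y[t] * y[t-1]) / sum(y[t-1]**2).
    """
    numerator = sum(curr * prev for prev, curr in zip(y, y[1:]))
    denominator = sum(prev * prev for prev in y[:-1])
    return numerator / denominator


def sum_squared_errors(y, alpha):
    """Sum of squared one-step errors of the AR(1) model on y."""
    return sum((curr - alpha * prev) ** 2 for prev, curr in zip(y, y[1:]))


# Synthetic example (assumption): the first difference of the series
# follows an exact AR(1) process with alpha = 0.5 and no noise.
diffs = [8.0]
for _ in range(9):
    diffs.append(0.5 * diffs[-1])

series = [10.0]
for step in diffs:
    series.append(series[-1] + step)

y = difference(series, d=1)        # errors are taken on the differenced series
alpha = fit_ar1_least_squares(y)
sse = sum_squared_errors(y, alpha)
print(alpha, sse)
```

For models with several AR terms or with MA terms the minimization no longer reduces to a single ratio, and a numerical optimizer or a statistics library would normally be used instead.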
3. Performance Checking

To check the performance, it is important to use independent test sets consisting of time series which have not yet been involved in the modeling process. The error on these independent test sets is compared to that obtained with other models. Usually, the error is a value obtained by applying some function to the difference between the observed and the forecast values.
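
The comparison on an independent test set might look like the sketch below. It assumes one-step-ahead forecasts, the mean squared error as the error function, and a naive "last value" forecast as the competing model. The test series, the coefficient alpha = 0.5, and all names are hypothetical values chosen for illustration.

```python
def one_step_forecasts(y, predict):
    """Forecast y[t] from y[t-1] for every t >= 1."""
    return [predict(prev) for prev in y[:-1]]


def mean_squared_error(observed, forecast):
    """Mean of the squared differences between observed and forecast values."""
    return sum((o - f) ** 2 for o, f in zip(observed, forecast)) / len(observed)


# Hypothetical held-out series that was not used for fitting.
test_series = [1.0, 0.6, 0.3, 0.2, 0.1, 0.05]

# Assumed to have been estimated beforehand on separate training data.
alpha = 0.5

models = {
    "AR(1), alpha=0.5": lambda prev: alpha * prev,
    "naive (last value)": lambda prev: prev,
}

# Evaluate every candidate model on the same independent test series.
errors = {
    name: mean_squared_error(test_series[1:], one_step_forecasts(test_series, predict))
    for name, predict in models.items()
}
best = min(errors, key=errors.get)
print(errors, best)
```

Any other error function can be substituted for mean_squared_error, as long as all models are scored with the same function on the same test data.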
Box and Jenkins advise taking a look at the autocorrelation
functions of the time series and of the errors. If the latter contains
any suspicious peaks, the model does not exploit all the available information.
Moreover, it is reasonable to evaluate the performance of ARIMA models
of higher order: ARIMA[p+1,d,q] and ARIMA[p,d,q+1]. This shows whether
models of higher order improve the forecasts. If a model does not provide
better forecasts, the model of lower order is preferred, because it has
fewer parameters. In order to avoid under- and overdifferencing, the
models with lower and higher d (ARIMA[p,d-1,q] and ARIMA[p,d+1,q]) should
also be tested. Finally, more complex models may be checked.
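
The residual check recommended by Box and Jenkins can be sketched as follows. The code computes the sample autocorrelation of the forecast errors and flags every lag whose value lies outside an approximate 95% band of ±1.96/√n, which is the usual large-sample band for uncorrelated errors. The function names and the alternating residual series are assumptions made for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [value - mean for value in x]
    variance_sum = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - lag] for t in range(lag, n)) / variance_sum
        for lag in range(1, max_lag + 1)
    ]


def suspicious_lags(residuals, max_lag=20):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(residuals))
    acf = autocorrelation(residuals, max_lag)
    return [lag for lag, r in enumerate(acf, start=1) if abs(r) > bound]


# Hypothetical residuals with an obvious leftover pattern (alternating sign),
# i.e. a model that has not exploited all the available information.
residuals = [1.0, -1.0] * 20
flagged = suspicious_lags(residuals, max_lag=5)
print(flagged)
```

An empty list of flagged lags is consistent with errors that carry no remaining structure; a non-empty list suggests trying one of the neighboring models mentioned above.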