Autoregressive Integrated Moving Average (ARIMA) Model
In the realm of time series forecasting, the Autoregressive Integrated Moving Average (ARIMA) model serves as a fundamental tool. Whether you're predicting stock prices, weather patterns, or economic indicators, ARIMA offers a systematic approach to modeling and forecasting time-dependent data.
ARIMA is a class of statistical models designed for analyzing and forecasting time series data, including non-stationary series that can be made stationary by differencing. Its name reflects its essential components:
AutoRegressive (AR): This component focuses on the relationship between an observation and several previous observations (lagged values).
Integrated (I): This aspect involves differencing the raw observations (subtracting the previous observation from the current one) to render the time series stationary.
Moving Average (MA): This component models the relationship between an observation and the residual error from a moving average model applied to lagged observations.
By combining these components, ARIMA effectively models time series data that exhibit trends and non-stationarity.
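The differencing step behind the "Integrated" component can be sketched in a few lines of plain Python. The helper name `difference` and both example series are made up for illustration; they are not part of any library.

```python
def difference(series, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series with a steady upward (linear) trend.
trend = [10, 12, 14, 16, 18, 20]
first_diff = difference(trend)
print(first_diff)  # [2, 2, 2, 2, 2] -> the trend is gone after one difference

# Hypothetical series with a quadratic trend: one difference is not enough,
# so the series is differenced twice.
quadratic = [t * t for t in range(6)]  # 0, 1, 4, 9, 16, 25
second_diff = difference(difference(quadratic))
print(second_diff)  # [2, 2, 2, 2]
```

The number of times the series is differenced before it looks stationary corresponds to the "I" order of the model.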
The value of a time series at a particular point in time is often highly correlated with the values that precede and follow it. In simple terms, the observations are not independent. This can be checked using the Durbin-Watson statistic:
\(d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}\)
where \(e_t\) = error term (residual) at time t and n = number of observations
- The Durbin-Watson (d) test can be used to test for first-order autocorrelation in time-series data.
- A smaller d (near 0) indicates positive autocorrelation.
- A larger d (near 4) indicates negative autocorrelation.
- A d of approximately 2 indicates no autocorrelation in the time series.
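As a rough illustration of how d behaves, the statistic can be computed directly from a list of residuals. This sketch assumes the usual definition, with squared successive differences in the numerator and squared residuals in the denominator; the function name and both residual lists are made up for the example.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that keep the same sign for several periods
# (positive autocorrelation): d falls well below 2.
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
print(round(durbin_watson(persistent), 2))   # 0.67

# Hypothetical residuals that flip sign every period
# (negative autocorrelation): d rises towards 4.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(round(durbin_watson(alternating), 2))  # 3.33
```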
If autocorrelation is present in the time series, we can model it using Autoregressive Integrated Moving Average (ARIMA) models.
First-Order Autoregressive Model (association between two consecutive values in the series)
\(Y_i = A_0 + A_1 Y_{i-1} + \varepsilon_i\)
Second-Order Autoregressive Model (association between values that are two periods apart)
\(Y_i = A_0 + A_1 Y_{i-1} + A_2 Y_{i-2} + \varepsilon_i\)
pth-Order Autoregressive Model (association between values that are p periods apart)
\(Y_i = A_0 + A_1 Y_{i-1} + A_2 Y_{i-2} + \ldots + A_p Y_{i-p} + \varepsilon_i\)
\(Y_i\) = observed value of the time series at time i
\(Y_{i-1}\) = observed value of the time series at time i - 1
\(A_0\) = fixed parameter (intercept) to be estimated using least-squares regression
\(A_1, A_2, \ldots, A_p\) = autoregressive parameters to be estimated using least-squares regression
\(\varepsilon_i\) = random error at time i
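For the first-order case, the least-squares estimates of A0 and A1 can be sketched as a simple linear regression of each value on the value before it. The function `fit_ar1` and the noise-free series below are hypothetical and only illustrate the idea; higher-order models need a multiple regression, which is usually left to a statistics library.

```python
def fit_ar1(series):
    """Estimate A0 and A1 in Y_i = A0 + A1 * Y_{i-1} + error
    by ordinary least squares (simple linear regression)."""
    x = series[:-1]  # lagged values Y_{i-1}
    y = series[1:]   # current values Y_i
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical noise-free series generated from Y_i = 2 + 0.5 * Y_{i-1},
# so the fit should recover A0 close to 2 and A1 close to 0.5.
series = [10.0]
for _ in range(7):
    series.append(2.0 + 0.5 * series[-1])

a0, a1 = fit_ar1(series)
print(a0, a1)
```

Note that the first observation has no lagged value, so a first-order fit uses n - 1 of the n observations.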
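Forecast: \(\hat{Y}_{n+j} = A_0 + A_1 \hat{Y}_{n+j-1} + A_2 \hat{Y}_{n+j-2} + \ldots + A_p \hat{Y}_{n+j-p}\)
Here n is the number of observations and j is the number of periods ahead. For lags that fall within the observed data, the observed value is used in place of a forecast.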
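The forecast equation is recursive: each new forecast is fed back in as a lagged value for the next one. A minimal sketch, assuming the coefficients have already been estimated; the function name, the coefficient values, and the history are made up for illustration.

```python
def forecast_ar(history, coeffs, steps):
    """Forecast `steps` periods ahead with an AR(p) model.

    coeffs is [A0, A1, ..., Ap]; history must hold at least p observations.
    """
    a0, lags = coeffs[0], coeffs[1:]
    p = len(lags)
    if len(history) < p:
        raise ValueError("need at least p observations to start forecasting")
    values = list(history)
    for _ in range(steps):
        # lags[k] multiplies the value k + 1 periods back; once the observed
        # data run out, earlier forecasts take the place of observations.
        next_value = a0 + sum(lags[k] * values[-(k + 1)] for k in range(p))
        values.append(next_value)
    return values[len(history):]


# Hypothetical second-order model: A0 = 1.0, A1 = 0.5, A2 = 0.25.
history = [10.0, 12.0]
print(forecast_ar(history, [1.0, 0.5, 0.25], steps=2))  # [9.5, 8.75]
```

The first forecast uses only observed values (1 + 0.5 * 12 + 0.25 * 10 = 9.5); the second already depends on the first forecast (1 + 0.5 * 9.5 + 0.25 * 12 = 8.75).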
Notes:
It is important to select the appropriate order for an autoregressive model. You can use a t-test to check the significance of the highest-order autoregressive parameter. In the Eastman Kodak revenue example, the third-order autoregressive term is not significant; hence, a second-order model is fitted to the time series and used to forecast the values for the years 2000 and 2001.
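The source does not state the test statistic, so the following is an assumption: a common textbook form of the t-test for the highest-order parameter. The symbols \(\hat{A}_p\) (the least-squares estimate of \(A_p\)) and \(S_{\hat{A}_p}\) (its standard error) are introduced here for illustration.

\(H_0: A_p = 0 \quad \text{versus} \quad H_1: A_p \neq 0\)
\(t = \frac{\hat{A}_p - 0}{S_{\hat{A}_p}}\)

Under this form, the statistic is compared with a t distribution with \(n - 2p - 1\) degrees of freedom: p observations are lost to the lags, leaving \(n - p\) usable rows, and \(p + 1\) parameters are estimated. If \(H_0\) is not rejected, the pth-order term is dropped and the model of order \(p - 1\) is tested in the same way.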
You should be equally cautious about selecting a high-order model, because it requires estimating more parameters. This can cause problems, especially when the number of observations is small.