Time to start talking about some of the most popular models in time series - ARIMA models. First things first, let's look at the AR piece - autoregressive models!
I like the way you convey the intuition behind AR and MA models. One thing that might be confusing, however, is the terminology, in particular with regard to short and long memory, which is used differently in the common literature. There, AR, MA and ARMA models are all considered short-memory models, because their autocovariances are summable. Even AR models, whose autocovariance function (ACVF) decays quite quickly towards zero for increasing lags without ever fully reaching zero, have summable autocovariances. In contrast, long-memory behavior is indicated by a hyperbolically decaying ACVF, whose elements are no longer summable. A popular example is the fractionally integrated ARMA model, often denoted by either FARIMA or ARFIMA, which can still have ACVF values of notable magnitude at large lags.
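A rough numerical illustration of the summability point in the comment above. This is only a sketch: it compares partial sums of a geometrically decaying ACVF shape (phi**h, as for an AR(1)) with a hyperbolically decaying one (h**(2d - 1), the asymptotic shape usually quoted for ARFIMA with 0 < d < 0.5). The values PHI = 0.9 and D = 0.3, the unit scaling constants, and all function names are assumptions chosen for illustration.

```python
def partial_sums(acvf, max_lag, checkpoints):
    """Running sum of acvf(h) for h = 1..max_lag, recorded at the checkpoint lags."""
    total = 0.0
    recorded = {}
    for h in range(1, max_lag + 1):
        total += acvf(h)
        if h in checkpoints:
            recorded[h] = total
    return recorded

PHI = 0.9  # assumed AR(1) coefficient
D = 0.3    # assumed fractional differencing parameter

def ar1_shape(h):
    # Geometric decay: the shape of an AR(1) ACVF, up to a constant factor.
    return PHI ** h

def long_memory_shape(h):
    # Hyperbolic decay: the large-lag shape of a long-memory ACVF, up to a constant.
    return h ** (2 * D - 1)

checkpoints = {100, 10_000, 100_000}
short_memory = partial_sums(ar1_shape, 100_000, checkpoints)
long_memory = partial_sums(long_memory_shape, 100_000, checkpoints)

for lag in sorted(checkpoints):
    print(lag, round(short_memory[lag], 4), round(long_memory[lag], 1))
```

The geometric sums settle near PHI / (1 - PHI), while the hyperbolic sums keep growing as more lags are added, which is the distinction between short and long memory described above.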
Thank you, I had seen this equation when I was studying reinforcement learning, it's like the value function weighted by a discount factor... Great explanation!!!
A lot of overlap here with an infinite impulse response filter from DSP. I'm about to watch the moving average model video, but am wondering if that is the finite impulse response equivalent :)
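For the filter analogy raised here, a small sketch may help. It feeds a unit impulse through an AR(1)-style recursion and an MA(1)-style sum. The coefficient 0.5 and the function names are assumptions for illustration, and the noise term of the time series model plays the role of the filter input.

```python
def ar1_impulse_response(phi, n):
    """Response of y[t] = phi * y[t-1] + x[t] to a unit impulse x = [1, 0, 0, ...]."""
    response, previous = [], 0.0
    for t in range(n):
        x = 1.0 if t == 0 else 0.0
        previous = phi * previous + x  # the output feeds back into itself
        response.append(previous)
    return response

def ma1_impulse_response(theta, n):
    """Response of y[t] = x[t] + theta * x[t-1] to the same unit impulse."""
    x = [1.0] + [0.0] * (n - 1)
    return [x[t] + (theta * x[t - 1] if t > 0 else 0.0) for t in range(n)]

ar_response = ar1_impulse_response(0.5, 6)
ma_response = ma1_impulse_response(0.5, 6)
print(ar_response)
print(ma_response)
```

In this sketch the AR(1) response shrinks geometrically but never becomes exactly zero, while the MA(1) response is exactly zero after one lag, which matches the infinite versus finite impulse response picture.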
Hi Aric! This was such a splendidly explained video. I have a doubt though about NARX. Do NARX models function the same way as the one explained in the video, since NARX is also an autoregressive model? If not, could you please explain NARX as well?
Hi Aric, thanks for the explanatory video. Can it be said that AR(1) is equivalent to the Single Exponential Smoothing algorithm, because it too depends on the previous forecast and error?
Actually, a single exponential smoothing model is equivalent to a moving average of order 1 after taking a single time difference (more formally called an ARIMA(0,1,1) model or sometimes an IMA(1,1))! This is because of the structure of the single exponential smoothing model. Each forecast is a combination of the latest observation and the previous forecast, but that previous forecast is itself a combination of earlier observations, and so on. Hope this helps!
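The equivalence mentioned in this reply can be checked numerically. The sketch below assumes the ARIMA(0,1,1) model is written as y[t] = y[t-1] + e[t] + theta * e[t-1] (sign conventions differ between textbooks and software), so that theta = alpha - 1 matches a smoothing weight alpha. The data, alpha = 0.3, the starting forecast, and the function names are made up for illustration.

```python
def ses_forecasts(y, alpha, first_forecast):
    """Single exponential smoothing: f[t+1] = alpha * y[t] + (1 - alpha) * f[t]."""
    forecasts = [first_forecast]
    for observation in y:
        forecasts.append(alpha * observation + (1 - alpha) * forecasts[-1])
    return forecasts

def ima11_forecasts(y, theta, first_forecast):
    """ARIMA(0,1,1) one-step forecasts: f[t+1] = y[t] + theta * e[t], e[t] = y[t] - f[t]."""
    forecasts = [first_forecast]
    for observation in y:
        error = observation - forecasts[-1]  # one-step forecast error
        forecasts.append(observation + theta * error)
    return forecasts

y = [12.0, 15.0, 11.0, 14.0, 16.0, 13.0]  # made-up series
alpha = 0.3                               # assumed smoothing weight

ses = ses_forecasts(y, alpha, first_forecast=y[0])
ima = ima11_forecasts(y, theta=alpha - 1, first_forecast=y[0])
max_gap = max(abs(a - b) for a, b in zip(ses, ima))
print(max_gap)
```

Substituting e[t] = y[t] - f[t] into the second recursion gives (1 + theta) * y[t] - theta * f[t], which is the smoothing recursion when theta = alpha - 1, so the two forecast paths agree up to floating-point rounding.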
The underlying assumption is that we know the data up to time t-1, and we use the observed data to estimate the parameters (ϕ1, ϕ2, …, ϕp and e_t), right?
Love your videos! I am on a quest to find out why we need stationarity for ARIMA models (there are many explanations online, but I cannot say I have a very clear understanding). Is stationarity necessary for Simple Exponential Smoothing?
We need stationarity because the structure of ARIMA models is such that they revert to the average of the series if you predict out far enough. That wouldn't work very well at all if we have trending or seasonal data! Simple ESMs don't need stationarity, but they do require no trend or seasonality to work best. Stationarity is a more mathematically rigorous condition than just having no trend or seasonality. Hope this helps!
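A minimal sketch of the mean reversion described in this reply, assuming a stationary AR(1) written as y[t] = omega + phi * y[t-1] + e[t] with future shocks replaced by their expected value of zero. The values omega = 2.0, phi = 0.8, and the last observation of 25.0 are assumptions for illustration.

```python
def ar1_forecast_path(last_value, omega, phi, horizon):
    """Iterate the AR(1) forecast recursion with future shocks set to zero."""
    path, current = [], last_value
    for _ in range(horizon):
        current = omega + phi * current
        path.append(current)
    return path

omega, phi = 2.0, 0.8               # assumed parameters, |phi| < 1
long_run_mean = omega / (1 - phi)   # mean of a stationary AR(1)

path = ar1_forecast_path(last_value=25.0, omega=omega, phi=phi, horizon=50)
print(long_run_mean, path[0], path[9], path[-1])
```

Each step closes a fixed fraction of the gap to the long-run mean, so forecasts far enough out are essentially that mean. A series with a trend has no such fixed level to return to.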
Does anyone know where the line between autoregression and regression is? For example, the lowess and loess functions are called local regression, yet it looks like "local regression" is a form of autoregression from a 10,000 ft view. My guess at the moment is that local regression does not add stochastic noise, making it just barely miss the definition, but I am only guessing here. It could also be that local regression is a form of autoregression but everyone is too lazy to write it all out. Whatever it is, I would like to know!
I could not understand how you calculate the φ, because I've seen a lot of correlation types and I do not know which one to use. Thank you for your time.
It actually isn't a correlation directly (unless it is an AR(1) model and then it is the Pearson correlation if the variables are standardized). The best way to think about it is that it is a weight in a regression model. The model chooses the weight that maximizes the likelihood (MLE) of the model and predictions. Hope this helps!
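To make the "weight in a regression model" idea concrete, here is a sketch that simulates an AR(1) series and estimates φ by regressing each value on its lag. Least squares is used as a stand-in for the likelihood fit mentioned in the reply (with Gaussian errors, and conditioning on the first observation, the two give the same estimate). The true value 0.6, the sample size, the seed, and the function names are assumptions for illustration.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate y[t] = phi * y[t-1] + e[t] with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def lagged_moments(y):
    """Centered cross-product and sums of squares for the pairs (y[t-1], y[t])."""
    lagged, current = y[:-1], y[1:]
    mean_lagged = sum(lagged) / len(lagged)
    mean_current = sum(current) / len(current)
    sxz = sum((a - mean_lagged) * (b - mean_current) for a, b in zip(lagged, current))
    sxx = sum((a - mean_lagged) ** 2 for a in lagged)
    szz = sum((b - mean_current) ** 2 for b in current)
    return sxz, sxx, szz

def least_squares_phi(y):
    """Slope of the regression of y[t] on y[t-1] (with an intercept)."""
    sxz, sxx, _ = lagged_moments(y)
    return sxz / sxx

def lag1_correlation(y):
    """Pearson correlation between y[t] and y[t-1]."""
    sxz, sxx, szz = lagged_moments(y)
    return sxz / (sxx * szz) ** 0.5

y = simulate_ar1(phi=0.6, n=5000)
phi_hat = least_squares_phi(y)
r1 = lag1_correlation(y)
print(phi_hat, r1)
```

For an AR(1) the regression weight and the lag-1 correlation come out nearly identical, which is the special case noted in the reply. For higher orders the weights are partial regression coefficients rather than simple correlations.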
At 3:51, the manipulation being done should be explained a little. Since I am not from this background, it is difficult for me to follow what is happening and how.
@ArunKumar-yb2jn u r so smart, that's why I am asking... if he had given some references or explained the manipulation a bit... if I already had some background, then I definitely would not be here
If I am using an AR(1) model, and I have data for Y(t-1), do I need to recurse all the way back to the starting point to predict Y(t)? Or can I just use the formula shown at 1:17?
You just use the formula! The recursive piece is to just show what is happening in concept if you keep plugging in what each lag truly represents. All you need for an AR(1) is just the lagged values (for each time point) to build the model!
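A small sketch of the point in this reply. It assumes the formula in question has the form y[t] = omega + phi * y[t-1] + e[t] (the exact notation in the video may differ) and that the parameters are already known. It computes the latest value once from the single lagged value and once from the fully unrolled recursion. The parameter values, shocks, and function names are made up for illustration.

```python
def one_step(previous_value, omega, phi, shock):
    """AR(1) formula: only the immediately preceding value is needed."""
    return omega + phi * previous_value + shock

def unrolled(start_value, omega, phi, shocks):
    """Same value, written out by substituting every lag back to the start."""
    k = len(shocks)
    value = phi ** k * start_value
    for j, shock in enumerate(reversed(shocks)):  # j = 0 is the most recent shock
        value += phi ** j * (omega + shock)
    return value

omega, phi = 1.0, 0.5            # assumed parameters
shocks = [0.3, -0.2, 0.1, 0.4]   # made-up shocks

series = [2.0]                   # made-up starting value
for shock in shocks:
    series.append(one_step(series[-1], omega, phi, shock))

direct = one_step(series[-2], omega, phi, shocks[-1])
expanded = unrolled(series[0], omega, phi, shocks)
print(direct, expanded)
```

Both routes give the same number, so the lagged value already carries everything the earlier history contributes. The unrolled form is useful for intuition, not for computation.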
Yes they can! AR models are long memory models, but there are also short memory models (think quick shocks that don't last long in time) called Moving Average (MA) models. That is the next video about to come out! If you are talking about normal predictors (think X's in linear regression) then this class of model is called an ARIMAX model. I'll have a video on these coming soon!
@AricLaBarr Thanks for the quick reply! I had to review a paper last week which used predictors (like X's) to examine stock prices in a time series model. I really had no clue, so if and when you make a video, please do include how to run these models and how to evaluate them. Thanks a lot. Stay safe.