
Time Series Anomaly Detection with LSTM Autoencoders using Keras & TensorFlow 2 in Python 

Venelin Valkov
26K subscribers
64K views

Subscribe: bit.ly/venelin-...
Complete tutorial + source code: www.curiousily...
GitHub: github.com/cur...
📖 Read Hacker's Guide to Machine Learning with Python: bit.ly/Hackers-...
Detect anomalies in the S&P 500 daily closing price. Build an LSTM autoencoder neural net for anomaly detection using Keras and TensorFlow 2.

Published: 21 Aug 2024

Comments: 90
@MichalMonday 2 years ago
If anyone has a problem with the plot statements at the end, it helped when I used: scaler.inverse_transform(test[TIME_STEPS:].close.values.reshape(1,-1)).reshape(-1) and scaler.inverse_transform(anomalies.close.values.reshape(1,-1)).reshape(-1)
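A minimal sketch of that fix (assuming the tutorial's scaler, test, anomalies and TIME_STEPS variables). Note that a scaler fitted on a single column expects input of shape (n_samples, 1), so reshape(-1, 1) is the safer form:

# Sketch only: `scaler`, `test`, `anomalies` and TIME_STEPS come from the tutorial notebook.
close_orig = scaler.inverse_transform(
    test[TIME_STEPS:].close.values.reshape(-1, 1)   # (n_samples, 1), matching how the scaler was fitted
).reshape(-1)

anomalies_orig = scaler.inverse_transform(
    anomalies.close.values.reshape(-1, 1)
).reshape(-1)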
@boomtrack5176 2 years ago
Great, thank you
@Wissam-rk7tv 1 year ago
Thank you so much. Do you have an idea of how to prepare our data in the case of a multivariate analysis but with redundant dates, for example predicting the temperature in several regions? (We don't have a unique key.)
@CodeEmporium 4 years ago
This is gold. I'm doing something similar for work. Glad I discovered this channel. Subscribed! Looking forward to more content!
@leoparada69 4 years ago
Great tutorial. Just wanted to point out that the problem at 24:50 is in the way that the mean absolute error is calculated: np.abs(X_train_pred, X_train) is not the same as np.abs(X_train_pred - X_train)
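A small illustration of why the two differ (numbers are made up): the second positional argument of np.abs is the `out` buffer, not another operand.

import numpy as np

x_pred = np.array([[0.5], [1.2], [0.9]])
x_true = np.array([[0.4], [1.0], [1.3]])

# Correct: element-wise absolute error between prediction and target
mae = np.mean(np.abs(x_pred - x_true), axis=1)

# Buggy: np.abs(x_pred, x_true) treats x_true as the output array,
# so it returns |x_pred| and silently overwrites x_true in place.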
@franziskahuber9664 4 years ago
Doing my bachelor's thesis on this. Very helpful for gaining an overview of the topic, thank you!
@christalone1693 2 years ago
Appreciate all your help man, it's really made a difference in how quickly I've learned a lot of these concepts! You are the best.
@mikhailb8026 3 years ago
Dear Venelin, you are training your model using labels (y_train) which are the t+1 timestamps for each training sequence (X_train), but an autoencoder is supposed to be trained with labels that are the same as the input sequence, i.e. you should use model.fit(X_train, X_train), I guess. Could you kindly explain why you use this training scheme and still call it an autoencoder?
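For comparison, a minimal sketch of a pure reconstruction-style LSTM autoencoder trained that way (layer sizes and data are illustrative placeholders, not necessarily the video's):

import numpy as np
from tensorflow import keras

TIME_STEPS, N_FEATURES = 30, 1
X_train = np.random.rand(100, TIME_STEPS, N_FEATURES).astype("float32")  # placeholder data

model = keras.Sequential([
    keras.layers.LSTM(64, input_shape=(TIME_STEPS, N_FEATURES)),      # encoder -> single bottleneck vector
    keras.layers.RepeatVector(TIME_STEPS),                            # repeat it for every time step
    keras.layers.LSTM(64, return_sequences=True),                     # decoder
    keras.layers.TimeDistributed(keras.layers.Dense(N_FEATURES)),     # reconstruct each time step
])
model.compile(optimizer="adam", loss="mae")

# Reconstruction objective: the input sequence is also the target
model.fit(X_train, X_train, epochs=10, batch_size=32, validation_split=0.1)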
@pratiksingh2840 4 years ago
Great work Venelin. Very clean step-by-step explanation. Keep it up
@vigneshpadmanabhan 4 years ago
This is exactly what I wanted to learn. Would you be able to do the same for a multistep multivariate time series and identify the anomaly and forecast? Thanks!
@xRandom112 4 years ago
Great tutorial, it's really noticeable that you know what you're doing. Keep it up
@martintabikh494 4 years ago
Hi, why do we use y_train to fit the model and not X_train? It is an autoencoder, right? So we train the model to be able to reproduce the input, which is X_train
@maziarkasaeiroodsari6473 3 years ago
Yes, that is indeed very confusing in this tutorial. Why create a target label when you are doing an unsupervised analysis?
@arielgroisman4724 2 years ago
Same question here, this doesn't look like an autoencoder architecture. It should be model.fit(X_train, X_train, ...)
@BeCorbie 4 years ago
Very helpful tutorial! I have to do something similar for university and this helps a lot! :)
@hipphipphurra77 4 years ago
I am wondering a little bit what we gain from detecting historical anomalies. It is like knowing that last week's weather probably had an anomaly. What we need is a prediction (not of the future price, that is not enough) of the future performance. If we can't have this, then we would at least like to have a prediction of future anomalies.
@conduit242 3 years ago
Uhh... this is non-stationary data, you need to remove the trend or you'll get these bogus results. LSTMs assume stationarity. Convert it to daily percentage change for stock data.
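A rough sketch of that preprocessing step (assuming a DataFrame df with a close column, as in the tutorial; the prices below are placeholders):

import pandas as pd

df = pd.DataFrame({"close": [100.0, 101.0, 99.5, 102.0]})  # placeholder prices

# Daily percentage change (returns) removes most of the trend in the raw price level
df["return"] = df["close"].pct_change()
df = df.dropna()

# The windows fed to the model would then be built from df["return"] instead of df["close"]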
@dafliwalefromiim3454 3 years ago
Exactly, in the case of time series data the samples are too autocorrelated; it can't be modelled directly without removing the trend. Hi Rob, can I talk to you one to one, please? My contact: gautamk2017@email.iimcal.ac.in
@maziarkasaeiroodsari6473 3 years ago
Thanks for the tutorial. Question: why do you create a target label (y) when you are doing an unsupervised analysis?
@cedricvillani8502 3 years ago
Because the 🐐 Goat told him so
@xenophon167 2 years ago
Excellent video, thanks a lot! However I would like to see an extension of this example using multiple features. I tried to extend it with more features with no luck so far.
@shyamkarthikrameshbabumis5367 2 years ago
Really, really helpful for my time series problem related to climate change, thank you!
@mariaclaradantas5419 3 years ago
This tutorial helped me a lot! Thank you!!
@prathameshpradipdatar2003 4 years ago
Great walkthrough with the code!
@AnirudraDiwakar 4 years ago
This is very nicely explained. Thank you sir.
@Nofakeable 2 years ago
That was a really well done video, thank you!
@ayushpantdeptofcs1635 4 years ago
How do we take multivariate features to perform anomaly detection? I.e. X1, X2, X3 as the input and we want to predict the future Y value.
@vigneshpadmanabhan 4 years ago
I would like to know the same thing
@massiivelli4267 4 years ago
Create a train/test set with all features. Then fit the scaler separately for train[[X1, X2, X3]] and train[y]. Then when you call create_dataset you will call it like: X_train, y_train = create_dataset(train, train[y], time_steps). Note that train (X_train) contains X1, X2, X3 and y (of the past N time steps), while y_train contains only the y to predict. The rest should be pretty much the same
@iLoveBrezels 4 years ago
@massiivelli4267 Could you further explain where I get train[y] from? Let's say my three features are cpu, ram and hd usage. I did train['cpu'] = scaler.fit_transform(train['cpu']) and the same for the other two. Where do I get train[y] from? What do I pass into X_train, y_train = create_dataset(train, ?????, time_steps)?
@massiivelli4267 4 years ago
@iLoveBrezels train[y] is the variable you want to predict. In other words, it is the thing you want to know. So in a time series setting, it is normally the value of something at a specific time in the future, based on the past values.
@maziarkasaeiroodsari6473 3 years ago
@massiivelli4267 The thing is: in autoencoders you are not predicting anything. You shouldn't need any target, as this is unsupervised!
@wijdanchoukri775 2 years ago
Thank you so much for this, this is what I wanted to learn
@donwoodlock15 2 years ago
Thank you for the tutorial. There is one piece I didn't understand. The shape of y_test is 380, so I was thinking that the model would make 380 predictions, but the shape of the predictions (y_test_pred) is 380*30. Is it making 30 predictions per date? For example, does it use the prior 30 days as the input sequence and predict the next 30 days? I was also thinking, since the shape of y_train is a single closing price per day, that the model would be trained to predict only one value per date, not 30. Can you clarify?
@Breno9629 1 month ago
Hey Mr. Venelin, thank you for the video. If you allow me to ask some questions: why do we pass both X and y while training the model? Is the model reconstructing the original sequence and trying to predict the next value based on the 30 values provided? (I am asking because I was expecting that we would pass the same sequence, similar to what we do with a vanilla autoencoder.) It seems that we input a sequence and try to predict the next value for the given sequence while reconstructing the initial sequence. When we calculate the error, the error is based on the reconstruction process, am I right? Thank you in advance!
@2guestuser 4 years ago
Fantastic tutorial!
@gcvictorgc 3 years ago
Thanks for this! Could you elaborate on your choice of loss function? Would you do things differently if you had more than one feature (multivariate time series)? Cheers
@marouaslafa7571 4 years ago
Very good tutorial. Can you do another one about anomaly detection in images? It would be very interesting
@nguyenanhnguyen7658 2 years ago
This is cool!
@FRUXT 2 years ago
I get an ad every 3 minutes... Apart from that, excellent video. However, the anomalies detected don't seem abnormal to me. It's more abnormal when the change is big and sudden
@vahidjoudakian8649 2 years ago
Very informative, thank you
@studyhub3950 1 year ago
Firstly, thanks. My question is: when the input is 30*1, i.e. 30 values, how can the output be 64? In an autoencoder we compress the data and then decode, for example 30 to 15 to 10 and then decode back.
@priyankadas7102 3 years ago
Excellent content on your channel. Thanks
@idotsuk 4 years ago
Standard scaling doesn't work well here since the S&P 500 is increasing (test samples are strictly larger). But I guess batch normalization makes up for it when only looking at 30 days. Maybe it'd make sense to scale with a logarithmic function of the date?
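A related variant of that idea, as a sketch of my own (log-transforming the price rather than the date) so the scaler sees a roughly linear series:

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({"close": [100.0, 120.0, 150.0, 200.0]})  # placeholder prices

# Log prices grow roughly linearly where raw prices grow exponentially over long horizons
df["log_close"] = np.log(df["close"])

scaler = StandardScaler()
df["scaled"] = scaler.fit_transform(df[["log_close"]]).ravel()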
@Cyberfako 2 years ago
You are great! That helped a lot, so thanks 🙏
@doudi0101 4 years ago
Very interesting, thank you!
@tangibleoxygen1986 4 years ago
Note: the shared Colab notebook also gives the exact same error. Hence I double-checked my lines against yours. Any help would be super beneficial
@vishwasgowda 4 years ago
First of all, thank you for the video tutorial. I am curious if you can do a video on how to set up an email alert once the value reaches the anomaly threshold. The idea is to raise an alarm before something bad happens. You could also point me to where I can get an idea of how to set up an alarm system. Thank you
@DanBarbatti 3 years ago
Hi, great tutorial. Trying to use your code with some of my data. The only change was the number of time steps. I'm getting shape incompatibility errors when I try to use y_train in the fit method. Also using Keras 2.2.4 and TensorFlow 1.13.1... Any advice?
@maxmag76 3 years ago
Thank you so much for the nice video.
@nataliagromova961 2 years ago
Very cool 👍 Was it helpful for you to predict the stock price in real life?
@blackisfav7222 4 years ago
Consider the behaviour of user logins and find the anomalies
@mohammedghouse235 3 years ago
Amazing video. Could you also do the same anomaly detection on oil production profiles?
@suyashsonawane4690 4 years ago
I tried to implement this on a multi-variable dataset but it doesn't work; the last layer gives an incompatible shape error
@AdityalikeThe 4 years ago
Same with me, did you find a solution to that?
@MrProzaki 4 years ago
Same here xD, still looking for a solution... that I can understand.
@shilpashivamallu9056 3 years ago
In order to predict the next 8 hours, what needs to be changed in the code? Should TIME_STEPS be 8? How does the model know whether it is working in hours or days? Thanks
@vamsikrishnabhadragiri402 3 years ago
Why did we use a TimeDistributed dense layer? Any specific reason?
@douglaszechin3233 4 years ago
Wouldn't it perform better if you used return_state in the encoder and passed it as the initial_state of the decoder? It seems that your approach passes just the last output of the LSTM, which doesn't carry much information...
@abhijeet6989 3 years ago
Dear Sir, greetings! Thank you very much for guiding us through the tutorial. Kindly help with the error below. I am getting it here:

THRESHOLD = 1.9
test_score_df = pd.DataFrame(index=test[TIME_STEPS:].index)
test_score_df['loss'] = test_mae_loss
test_score_df['threshold'] = THRESHOLD
test_score_df['anomaly'] = test_score_df.loss > test_score_df.threshold
test_score_df['close'] = test[TIME_STEPS:].close

The traceback points at the line test_score_df['loss'] = test_mae_loss and ends inside pandas (frame.py / construction.py) with:

ValueError: Length of values (7752) does not match the length of index (380).
@mindbodyzaid7814 2 years ago
Why do you need to create a "y" dataset if for autoencoders "x" should be mapped to "x"?
@rhithickm2689 3 years ago
At around 24:05, it should be np.abs(x - y) and not np.abs(x, y), right?
@nielspalmans6237 2 years ago
Is there a way to do what you did on data consisting of multiple attributes rather than just one?
@blackisfav7222 4 years ago
How can this be converted into a script where we give it data and the output is the anomalies? I mean an end-to-end, function-based script rather than Jupyter notebooks
@farhanjavid6474 4 months ago
😍😍😍😍😍😍😍
@awaisumar5125 3 years ago
How can we actually feed real-time test data to this model to get real-time predictions? Is there any tutorial or link for this?
@adityahpatel 2 years ago
In an autoencoder you should do .fit(x, x), not .fit(x, y)
@skorpio3110 1 year ago
Did you get any answer for that? I'm confused too
@ogochukwuujunwa4680 2 years ago
Please can you change the font of your system to make the text legible
@mp3311 2 years ago
I get "ValueError: Expected 2D array, got 1D array instead" at scaler.inverse_transform(test[TIME_STEPS:].close). How could I fix this?
@gnn816 2 years ago
Hello there, did you manage to solve this problem? I am facing the same issue.
@harshitbhargav 3 years ago
Does that also work for multiple time series of varying lengths?
@TOM-cd1zb 4 years ago
Hi there, my val_loss is constant from the first epoch, so it is overfitting. Any tips?
@shreyasshinde5451 2 years ago
@Venelin Valkov Thanks for the video and the great explanation. I am working with multiple features (multiple attributes) for anomaly detection. Could you provide any sample code or a reference for that? Would be great. Thanks :)
@Bruno.FERGANI 4 years ago
Thanks Venelin for the tutorial! 👍 The preferred way to use TensorFlow 2.x on Colab is via the %tensorflow_version magic: colab.research.google.com/notebooks/tensorflow_version.ipynb
@tangibleoxygen1986 4 years ago
Hi, thanks for such a detailed and compact explanation. I need some help with the .fit() method. I am getting an error: expected time_distributed_ to have 3 dimensions, but got array with shape (7752, 1). I checked GitHub and the error logs. The shape of X_train is still (7752, 30, 1). Is there any solution?
@ismailwafaadenwar3254 4 years ago
In the fit() method, change the y_train to X_train as well. Theoretically, in autoencoders, X is the input and X is the output
@sagar8460830871 3 years ago
Can we do this for multiple variables?
@AlonAvramson 1 year ago
If it were profitable, would you still invest the time to create and publish a video?
@dibyakantaacharya4104 4 years ago
Can I run this on image datasets? What would the code look like?
@blackisfav7222 4 years ago
Any help for login-time based anomaly detection?
@cyrusazamfar6220 2 years ago
You are copying stuff from another screen and STILL, you messed it up :) 🤣
@cedricvillani8502 3 years ago
And then TradingView stomped on him. Go there, learn Pine Script, make money. Then go on his Patreon and give him money. OK GO GO GO
@susantisisteminformasi4154 4 years ago
Hello, noob here. Why GPU?
@FrancescoLucrezia 3 years ago
There is a course on Coursera with content identical to this video, so someone is plagiarizing. The course on Coursera is a paid one: www.coursera.org/projects/anomaly-detection-time-series-keras
@rajarams3722 11 months ago
Sorry, this is fundamentally wrong... An autoencoder should try to reconstruct the 30 time steps from the input 30 time-step values... Here you are mixing a forecast of the 31st value with an autoencoder... It should be trained with target values the same as the input values.
@kacperogorek3958 2 years ago
You are close
22:02