Finally a new PyTorch tutorial. I hope you enjoy it :) Also, check out Tabnine, the FREE AI-powered code completion tool I used in this video: www.tabnine.com/?.com&PythonEngineer
* ---------------------------------------------------------------------------------------------------------- *
This is a sponsored link. You will not have any additional costs; instead, you will support me and my project. Thank you so much for the support! 🙏
Thank you for making this, I've been struggling with this stuff on and off for months. These videos on PyTorch made things click; I really appreciate you taking the time to make them. They've helped me immensely.
Hi, thank you for sharing your valuable information through this channel. I am a new follower working on time series. If possible, could you create a series on how to implement Transformers on time series data, covering both univariate and multivariate approaches? Focusing on a task like forecasting, classification, or anomaly detection, even just one of these, would be greatly appreciated. There are no videos available on YouTube that have implemented this before. It would be extremely helpful for students and new researchers in the field of time series.
Amazing content! Quick question, though: I noticed you called 'self.hidden' at 29:48, but I didn't see a corresponding parameter for self.hidden, i.e. self.n_hidden has n_hidden parameters, while I can't see the number of parameters for self.hidden.
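For reference, a small standalone check of where the parameters actually live when a hidden size n_hidden is used (toy numbers, not the exact code from the video): the learnable parameters belong to the LSTMCell and are sized by n_hidden, while the hidden state itself is just a plain tensor of shape (batch, n_hidden) with no parameters of its own.

    import torch
    import torch.nn as nn

    n_hidden = 51
    cell = nn.LSTMCell(input_size=1, hidden_size=n_hidden)

    # the learnable parameters belong to the cell and are sized by n_hidden ...
    for name, p in cell.named_parameters():
        print(name, tuple(p.shape))   # weight_ih (204, 1), weight_hh (204, 51), bias_ih (204,), bias_hh (204,)

    # ... while the hidden and cell states are plain tensors of shape (batch, n_hidden)
    # and carry no learnable parameters themselves
    h_t = torch.zeros(8, n_hidden)
    c_t = torch.zeros(8, n_hidden)
    h_t, c_t = cell(torch.zeros(8, 1), (h_t, c_t))
    print(h_t.shape)   # torch.Size([8, 51])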
Is it better for prediction performance to pass the output of one LSTM to the next or to pass the previous hidden state (as done in the video)? I've seen both methods used and don't know which is better. Do you have any advice on when to use each approach?
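One thing worth noting when comparing the two: for an LSTM, the per-step output is exactly the hidden state of its top layer, so "passing the output to the next LSTM" and "passing the hidden state to the next LSTM" usually describe the same wiring. A small sketch with toy sizes (not from the video):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    lstm = nn.LSTM(input_size=1, hidden_size=4, num_layers=1, batch_first=True)
    x = torch.randn(2, 5, 1)                       # (batch, seq_len, features)
    out, (h_n, c_n) = lstm(x)

    # the output at the last step equals the final hidden state of the (top) layer,
    # so feeding "the output" or "the hidden state" into a second LSTM is the same connection
    print(torch.allclose(out[:, -1, :], h_n[0]))   # True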
This is amazing! How do you always know exactly what I need and make a tutorial about it? Any chance you could make a tutorial about how to make an estimator that can give out the width of the given sine function and the x-shift of the 3 sine functions relative to each other? That would quite literally save my life. I know it should be possible to do with a similar method employed in the video, but I just can't do it...
Hi, great video! Just want to ask: why do we have 2 LSTM cells and not a single one? Also, I'm not sure I get it... in the forward() func we split samples along dim=1 to feed a sequence of elements, right? So if target_inputs has, say, 1000 elements (columns in this case), it means that our LSTM knows what happened 1000 points back and "uses" all of them to make the very next prediction? Thanks!
It could be a single LSTM cell; he just wanted to make the network deeper. He splits the tensor along the sequence axis so that each iteration of the for loop processes one time step.
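To make the splitting concrete, a tiny standalone example (toy numbers, not the video's data) of what split(1, dim=1) does:

    import torch

    # toy batch: 3 sequences of length 6 -> shape (batch, seq_len)
    x = torch.arange(18.).reshape(3, 6)

    # split(1, dim=1) yields one column at a time, i.e. one time step for the whole
    # batch per loop iteration, which is what lets the LSTM cells walk through the
    # sequence step by step
    for t, input_t in enumerate(x.split(1, dim=1)):
        print(t, input_t.shape)   # every chunk has shape (3, 1)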
Thanks for this video. Such a great help, and it cleared up some confusion. One question I had: for the training, why are you only using the values from y and not the x?
Don't you "destroy" some of the knowledge learned during training by initialising the hidden and cell state as zeros in each forward pass? Or is this a better approach than initialising the states once in the beginning? Maybe you could elaborate on that? :)
I understand that your videos are "code along" style ones, BUT for the implementation there is too much of the HOW and, sadly, nothing of the WHY.
Hey! I was wondering why there are multiple colors at the end when at the start there was only 1 sine wave? I'm confused about where all the additional red and green lines came from. Thanks :)
I am trying to run this code on my GPU. It should work, but it doesn't. device = torch.device("cuda" if torch.cuda.is_available() else "cpu") returns 'cuda', so my GPU is being detected. I also copied the training and test inputs and targets to the GPU with .to(device), as well as the model (model = LSTMPredictor(hiddenstates).to(device)). But I still get the error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_mm). It occurs in the optimizer step (optimizer.step(closure)). What do you think?
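A likely culprit, assuming a forward() roughly like the one built in the video: the initial hidden and cell states are created with torch.zeros(...) inside forward(), which allocates them on the CPU by default even when the model and inputs live on the GPU. A rough sketch of the model (details assumed, not the video's exact code) with the states created on the same device and dtype as the input:

    import torch
    import torch.nn as nn

    class LSTMPredictor(nn.Module):
        def __init__(self, n_hidden=51):
            super().__init__()
            self.n_hidden = n_hidden
            self.lstm1 = nn.LSTMCell(1, n_hidden)
            self.lstm2 = nn.LSTMCell(n_hidden, n_hidden)
            self.linear = nn.Linear(n_hidden, 1)

        def forward(self, x, future=0):
            outputs = []
            n_samples = x.size(0)

            # allocate the initial states where the input lives (CPU or cuda:0),
            # so nothing silently stays on the CPU
            def zeros():
                return torch.zeros(n_samples, self.n_hidden,
                                   device=x.device, dtype=x.dtype)

            h_t, c_t = zeros(), zeros()
            h_t2, c_t2 = zeros(), zeros()

            # walk through the observed sequence one time step at a time
            for input_t in x.split(1, dim=1):
                h_t, c_t = self.lstm1(input_t, (h_t, c_t))
                h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))
                output = self.linear(h_t2)
                outputs.append(output)

            # keep predicting beyond the observed data, feeding predictions back in
            for _ in range(future):
                h_t, c_t = self.lstm1(output, (h_t, c_t))
                h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))
                output = self.linear(h_t2)
                outputs.append(output)

            return torch.cat(outputs, dim=1)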
I don't know, I am a beginner and I use Jupyter notebooks. I copied the code exactly (it runs with no errors), but I did not get any predictions or loss. Any idea what the problem might be?
Would we not want to initialize the hidden state and cell state outside of the forward, so they capture long term features? Since they are in forward, aren't we removing all notions of long-term connectivity as they get cleared on every forward call?
Usually, you only want the LSTM to keep the memory during one sequence. For example, if I have an LSTM that recognizes activity in videos, I want it to keep the memory while processing the frames of one video, but then I want it to forget it for the next video.
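For the case where you do want the memory to persist across forward() calls (e.g. streaming one long series in chunks, as in truncated backpropagation through time), a common pattern is to keep the state on the module and detach it between updates. A rough sketch with made-up names (StatefulLSTM, reset_state are for illustration only):

    import torch
    import torch.nn as nn

    class StatefulLSTM(nn.Module):
        # hypothetical sketch: keeps the hidden/cell state across forward() calls
        def __init__(self, n_hidden=51):
            super().__init__()
            self.cell = nn.LSTMCell(1, n_hidden)
            self.n_hidden = n_hidden
            self.state = None                  # (h, c), carried between calls

        def forward(self, x):
            if self.state is None:
                z = torch.zeros(x.size(0), self.n_hidden, device=x.device)
                self.state = (z, z.clone())
            h, c = self.state
            for input_t in x.split(1, dim=1):
                h, c = self.cell(input_t, (h, c))
            # detach so gradients do not flow back through earlier chunks
            self.state = (h.detach(), c.detach())
            return h

        def reset_state(self):
            # call this at sequence boundaries (e.g. a new video / new time series)
            self.state = None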
Can you dive deeper into the various PyTorch package functions in a future video? E.g. detach vs. item, .Tensor vs. .tensor, when to use the LongTensor dtype, ...? Thanks and best.
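Until such a video exists, a tiny self-contained illustration of the differences being asked about (the values are made up):

    import torch

    t = torch.tensor([1.0, 2.0], requires_grad=True)
    loss = (t ** 2).sum()

    print(loss.item())     # .item(): a plain Python number, only works for 1-element tensors
    print(loss.detach())   # .detach(): still a tensor, but cut off from the autograd graph

    print(torch.tensor([1, 2]).dtype)   # torch.tensor infers the dtype -> int64 (a LongTensor)
    print(torch.Tensor([1, 2]).dtype)   # torch.Tensor constructs float32 by default

    # integer class indices, e.g. targets for nn.CrossEntropyLoss, must be int64 ("long")
    labels = torch.tensor([0, 2, 1], dtype=torch.long)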
LSTMs are recurrent networks: you need the result of the previous iteration to get the next one. That's the way they work, and it is also one of their main weaknesses.