
LSTM Networks - The Math of Intelligence (Week 8) 

Siraj Raval · 770K subscribers
176K views · Published Aug 29, 2024

Comments: 270
@IgorKaplun-kj9ud · 6 years ago
I have spent hours reading scientific papers trying to clear up some minor misunderstandings of mine. In the end, it took me just 20 minutes of watching this video to understand everything. Siraj, you are a great lecturer!
@skipintro9988 · 3 years ago
I've learnt more from you in the last month than from all my teachers during my whole university life
@filipcoja · 5 years ago
Please start using longer variable names; it drastically helps with understanding the code
@IgorAherne · 7 years ago
The timing is just right! Thanks Siraj, and don't listen too much to the harsh comments - your style and presenting are very good. You don't talk down to the audience, and it really helps in grasping the material.
@aliaghabeigiha2275 · 5 years ago
Siraj, I love you. I can literally say that you are a man who found his way, and that made you incredible. I can tell from your passion and the way you speak that you love yourself too, which is so goddamn beautiful. I deeply appreciate your work and explanation and, more importantly, your energy... keep it up, buddy.
@richard9332 · 6 years ago
For those who want to learn to derive the backpropagation for LSTMs, I would avoid using this code; I already found a couple of bugs, in addition to the syntax errors and the confusing variable naming. I don't know LSTMs (yet), but I think my calculus is right. In the RNN backprop function, when the error variable is redefined as w*error, it's missing a dsigmoid(oa[i]) factor, and I don't know why it's still called error: it's actually dL/dH, which gets propagated into the LSTM cell. In the LSTM backprop function, when computing ou (which should be dL/dwxo, the weight gradient for the output gate), the derivative should be dsigmoid, not dtangent. These are bugs I found while trying to figure out the code.
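For readers following that bug report, here is a minimal sketch (not the repo's code) of the two activation derivatives being discussed, assuming the common convention that each takes the already-activated value as input:
```python
import numpy as np

# Derivatives expressed in terms of the activation's output y,
# i.e. y = sigmoid(x) or y = tanh(x).

def dsigmoid(y):
    return y * (1 - y)   # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))

def dtangent(y):
    return 1 - y * y     # d/dx tanh(x) = 1 - tanh(x)^2
```
Since the output gate uses a sigmoid activation, its weight gradient must go through dsigmoid, which is the substance of the report above.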
@altugunal5529 · 5 years ago
Hi, could you solve the issues and make it run? I can't get the error to decrease.
@sisyphus_619 · 7 years ago
Thank you so much. You're an awesome teacher. For me, you're like a light of hope in the darkness of despair.
@musti81uk · 4 years ago
This dude is a bit over the top!
@datasciencewithr1039 · 7 years ago
Most people are ungrateful, jealous bacteria and fungus who can't say anything nice. Siraj is doing such a good job putting out good content on otherwise hard-to-understand AI topics, in simple terms, and that too for free. Some mistakes happen, and that is ok. All the jealous fellows find such problems with Siraj's content... then why the hell aren't they pumping out quality videos instead of complaining here anonymously?? Siraj will keep learning and teaching, so he will improve, and so will his followers, who will be more job-ready! God bless you, Siraj. Although it is grammatically correct to say 'It's Siraj', I think 'I'm Siraj' or 'Siraj here' makes more sense; 'it' is usually used to refer to inanimate objects :p
@SirajRaval · 7 years ago
thanks!
@gzitterspiller · 4 years ago
This guy plagiarized hard work.
@architgupta6037 · 7 years ago
You are awesome, man. I have been watching your videos and learning a lot. I am completing a master's programme in probability theory, and by watching your videos I am really understanding the core of programming mathematical models in Python. I sincerely appreciate your effort; please keep making more videos.
@Fernandinh00 · 7 years ago
I was hoping you would rap the generated lyrics at the end haha. Great video!
@SirajRaval · 7 years ago
haha i should've, rap is coming tho, thanks
@z17seattle · 6 years ago
WHERE IS THE RAP
@AfifatulMukaroh · 5 years ago
I wish all my lecturers in college were like you!!! OMG!!! OMG!!! If they were, I wouldn't want to leave college or graduate anytime soon... Love yaaaaaaaaa so much!!!
@sanomi4492 · 5 years ago
This is a very trivial thing about the notation, but in the equation for f_t, the bias should be written b_f, not b_t, since he uses W_f (15:55). Thank you for your informative videos!!
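For reference, the standard forget-gate equation with the consistent subscripts the comment suggests (as in Olah's Understanding LSTMs post):
```latex
f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)
```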
@tanevw · 5 years ago
Whenever I need to learn anything about AI, you are my man Siraj. You're my first query all the time.
@MrAztrup · 5 years ago
Thanks for teaching in such an enthusiastic way, soooo much better than class or any of the papers! :D
@codethings271 · 7 years ago
Ur making the world a better place :)
@Arti1234-b7x · 4 years ago
Very well explained... thank you for sharing
@ao9779 · 6 years ago
Some of you asked some basic questions here, and I had to search a bit too. The original code comes from Kevin Bruhwiler on GitHub; search for his "Simple Vanilla LSTM". It's exactly the same code but without Siraj's typos. Jupyter isn't needed; you can just execute the code like you usually do. Also, note that it doesn't provide the text file, which can be anything (Kevin used a text by Shakespeare). It is also DAMN SLOW! One iteration takes a few seconds; reaching 100 iterations takes about five minutes on my poor laptop, and that's not sufficient to get an interesting result. While writing this, I reached the 735th iteration with a small text (maybe too small), and none of the generated words is readable. So Siraj, you probably did a good job on your explanation, but that example needs some improvement or better guidelines
@lacorreia65 · 3 years ago
Agreed, it is too slow. The performance would probably be much better in C++. I'm wondering how TensorFlow manages to perform the same task in much less time.
@simontyroll5470 · 4 years ago
4:56 "The problem is that, uhh, there are 99 of these problems" Dude you fucking had me! You are hilarious! Thank you for some great content for an aspiring machine learner!
@rookiedrummer6838 · 5 years ago
Got it, bro, top class! Thanks for all your efforts :)
@jurajmuran5274 · 6 years ago
man, this is the best video about LSTM, thank you very much
@291cicinho · 6 years ago
My first comment ever on youtube ... You rock man !! Thank you
@craigmatthews4517 · 4 years ago
Thanks. Nice presentation.
@sudhanshuchadha · 5 years ago
Siraj you are a great motivator, keep it up.
@lucien04100 · 6 years ago
You're my neural net hero!
@kyoshinronin · 7 years ago
So excited for Blockchain!!!
@olgakonoval5147 · 6 years ago
Thanks so much for the beautiful videos and lots of useful code for understanding. It really helps!
@quickdudley · 7 years ago
There are actually two ways to handle the initial values of the memory cells. If I understood your code correctly, you initialized them all to zero, but in some implementations of backpropagation through time, once the initial state is reached, the remaining error signal is used to adjust the initialization vector.
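A minimal sketch of the second approach described above, treating the initial states as trainable parameters; the names and the learning rate are illustrative, not from the video's code:
```python
import numpy as np

hidden_size, lr = 64, 0.1

# Trainable initial states (instead of fixed zeros).
h0 = np.zeros(hidden_size)
c0 = np.zeros(hidden_size)

def train_step(run_bptt, h0, c0):
    # run_bptt is assumed to run the forward pass from (h0, c0),
    # backpropagate through time, and return the leftover gradients
    # dL/dh0 and dL/dc0 once t = 0 is reached.
    dh0, dc0 = run_bptt(h0, c0)
    # Update the initialization vectors like any other parameter.
    return h0 - lr * dh0, c0 - lr * dc0
```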
@yhisi1031 · 7 years ago
We need a "draw my life" episode, I like your channel man, it's awesome ;)
@SirajRaval · 7 years ago
its coming
@yhisi1031 · 7 years ago
Siraj Raval, really, I want to do the French version of SIRAJ, data scientist
@user-xs6hr1ol7d · 7 years ago
Thanks for your work.
@SirajRaval · 7 years ago
thanks for listening
@vito135c · 6 years ago
Thank you man, I am studying machine learning and your channel is a great resource for me. I subscribed. Thanks 👍
7 years ago
Thank you! I really appreciate your work.
@SirajRaval · 7 years ago
thanks Marek!
@TomRiecken · 7 years ago
Don't listen to the haters, you're an epic vid creator, a real learning accelerator, master matrix manipulator, a real hidden state operator. BOOM! Keep it up. I do like that you're going back to basics with numpy here. Your only flaw is using Python 3. JK
@TomRiecken · 7 years ago
Also, where is eminem.txt ?
@SirajRaval · 7 years ago
thanks Tom!
@arslanelahmer2729 · 4 years ago
Great!
@rishabhbhatnagar6795 · 5 years ago
Good video, math explained well. YouTube needs to add a 3x speed option as well.
@truliapro7112 · 7 years ago
Siraj, this LSTM video was really, really great. You rock :)
@ranaameer7988 · 6 years ago
The video was awesome, thank you so much Siraj
@larryteslaspacexboringlawr739 · 7 years ago
thank you for the LSTM network video
@kanharusia9399 · 6 years ago
The code is not working for me; it generated random text without any meaning, e.g. 'dd,r,ufr,d'. The error is not decreasing on my computer
@Abhisingh-cl9xm · 2 years ago
Great video
@DigitalWerkz · 7 years ago
There are 2 lines of code in the GitHub source for this project that are incorrect: 'for i range(yada, yada, yada)' should read 'for i in range(yada, yada, yada)'. The other line that chokes is the one that says 'alt text' in the source. I installed Jupyter Notebook and tried to run it, and couldn't find any sort of 'run' function. What are those 'In [ ]:' thingies anyways? :-) How did you get the code to run and produce the output you showed at the end of the lesson? I may be doing something wrong; I hope to figure this out soon. I have an idea.... :D I still enjoy the lesson and am still learning from it. Is this forum the only way to contact you? I'd rather not post such lengthy messages publicly; I tend to be verbose. Keep at it, I really appreciate your contributions. Cheers. :-)
@enriquegalceran5873 · 6 years ago
Quite a late response, but it's when I saw this video ^^" Jupyter Notebook uses "cells", which you can execute independently. If you have one of the cells active, press Shift+Enter or the "Run" icon at the top and it executes the active cell. The '[ ]' is empty when the cell hasn't been run, and fills with a number once it has. Hope that helps!
@jony7779 · 6 years ago
I love this guy. Thanks Siraj :)
@Felipedotcom · 5 years ago
Great video Siraj, as always!
@luizgarciaaa · 7 years ago
Hi, Siraj! In RNNs used for stock prices, people often normalize the prices within a given window. Why is this necessary? Is there another (better) method for it?
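One common scheme the question refers to (a sketch, not from the video): each window is rescaled relative to its first price, so the network sees relative changes rather than absolute price levels, keeping inputs in a similar range across windows.
```python
import numpy as np

def normalize_window(prices):
    # prices: 1-D array of raw prices for one window
    return prices / prices[0] - 1.0

print(normalize_window(np.array([100.0, 105.0, 95.0])))  # [ 0.    0.05 -0.05]
```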
@metincanduruel4850 · 7 years ago
Keep it up Siraj! Thanks!
@lacorreia65 · 3 years ago
Both routines, Siraj's and Kevin's, appear to have a problem in 'ExportText()': when calling 'np.random.choice(data, p=prob)' we get 'ValueError: probabilities contain NaN'
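One way to guard that sampling call, sketched under the assumption that prob comes from dividing a network output row by its sum (which produces NaN when the sum is zero):
```python
import numpy as np

def safe_probs(output_row):
    prob = np.nan_to_num(output_row)            # NaN -> 0
    total = prob.sum()
    if total <= 0:
        return np.ones_like(prob) / len(prob)   # fall back to uniform
    return prob / total                         # renormalize to sum to 1

row = np.array([0.2, np.nan, 0.5, 0.1])
idx = np.random.choice(len(row), p=safe_probs(row))
```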
@venugopalaraomanneni547 · 7 years ago
Great Explanation
@anony88 · 7 years ago
You're hilarious lol. Love your videos. Thanks
@SirajRaval · 7 years ago
thanks!
@lindsaymillard2811 · 5 years ago
Love the enthusiasm.
@electrookosh · 7 years ago
Thanks again for the vids & tuts Siraj! Quick heads up tho, be sure to update your MOI playlist, seems like you've been forgetting to do that since week 6!
@SirajRaval · 7 years ago
thanks oktay just updated!
@visheshkumar2785 · 5 years ago
Amazing explanation... thank you very much...
@sifiso5055 · 7 years ago
Great video Siraj
@ankitshr21 · 7 years ago
Siraj, what kind of model would you suggest for the following:
1) Predict daily sales for, let's say, the Google Play store.
2) We have past data for every line item of purchase.
3) Given this, can we predict each line item for the next day?
4) I'm interested in the line-item level, as it could be used to generate other sorts of insights: which category will sell more, the demographics of the customer, and so on.
5) Is something like this possible? Looking forward to your thoughts on this.
@TheFunkiGuy · 5 years ago
Do LSTM cells consider both the previous input and the previous output, or only the previous output? In some use cases the output alone would be confusing and we would need the previous input features.
@countstep000 · 6 years ago
Great tutorial, thanks!
@RoxanaNoe · 6 years ago
I'm your fan Siraj!
@craigmatthews4517 · 4 years ago
Ran the program to trace through it, and the error value never changed. Also had the following spewed out: 'RuntimeWarning: invalid value encountered in double_scalars' at 'prob[j] = output[i][j] / np.sum(output[i])'. I never saw any real output statements in the video. It was like you were showing some fictitiously made-up output.txt file??? Please explain.
@marcogannetti1592 · 7 years ago
Where can we find this PDF?
@RahulBonnerjee · 7 years ago
I'm excited about blockchain too!! V soon ^^
@cybergame. · 3 years ago
My doubt: in an RNN, let's take input data such as
[[[2,3,4],[2,4,6],[5,6,8],[3,6,8]],
 [[3,5,7],[1,2,4],[5,7,8],[75,4,]],
 [[4,4,6],[0,6,0],[2,5,3],[2,2,2]],
 [[1,1,1],[4,8,0],[2,4,3],[6,4,0]],
 [[3,5,7],[1,2,4],[5,7,8],[75,4,]]]
This is a 3D array of shape (5,4,3), am I right? My question is: how is this input data processed in an RNN? How is the dimension of the weight matrix between the input and the hidden neurons determined? How is the dimension of the other weight matrix, between the hidden neurons and the previous hidden state, determined? How does the input data undergo computation? I know the equations:
s(t) = tanh[U x(t) + W s(t-1)]
y(t) = softmax[V s(t)]
I need to know how the matrices are multiplied and the bias is added. I don't know how to determine the shapes of the bias and the other weight vectors (U, V, W). I need to know the computation in detail. You can suggest some related articles to read. Thank you
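A minimal sketch answering the shape question for the vanilla-RNN equations quoted above; the hidden size is illustrative:
```python
import numpy as np

seq_len, input_dim = 4, 3        # one (4, 3) sequence from the (5, 4, 3) batch
hidden_dim, output_dim = 8, 3

U = np.random.randn(hidden_dim, input_dim)    # input -> hidden
W = np.random.randn(hidden_dim, hidden_dim)   # previous hidden -> hidden
V = np.random.randn(output_dim, hidden_dim)   # hidden -> output
b = np.zeros(hidden_dim)                      # hidden bias
c = np.zeros(output_dim)                      # output bias

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x_seq = np.random.randn(seq_len, input_dim)
s = np.zeros(hidden_dim)                      # initial hidden state
for t in range(seq_len):
    s = np.tanh(U @ x_seq[t] + W @ s + b)     # shape (hidden_dim,)
    y = softmax(V @ s + c)                    # shape (output_dim,)
```
The weight shapes fall out of the equations: U maps an input_dim vector to hidden_dim, W maps hidden_dim to hidden_dim, V maps hidden_dim to output_dim, and each bias matches the layer it is added to.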
@nickpapagiorgio8738 · 7 years ago
Never effing change dude
@arekunakashima9646 · 7 years ago
I've been waiting for blockchain :D please make it soon
@aidenstill7179 · 5 years ago
Very great video. Please answer me: what do I need to know to build a deep learning library? Tell me the courses and books.
@JeremyCoppin · 5 years ago
Assuming you have been through the previous videos, check out DeepLizard and Udemy
@aidenstill7179 · 5 years ago
@@JeremyCoppin and what courses on Udemy should I take for this?
@JeremyCoppin · 5 years ago
@@aidenstill7179 Python Bootcamp, Machine Learning A-Z
@aidenstill7179 · 5 years ago
@@JeremyCoppin Thank you! Will this knowledge be sufficient to develop a deep learning framework?
@JeremyCoppin · 5 years ago
@@aidenstill7179 It will be sufficient to use the deep learning frameworks that have already been developed, e.g. Keras and TensorFlow. Once you are comfortable developing applications with these, you could look into building your own, but I doubt you will feel the need, aside from being interested in doing it.
@emorange · 6 years ago
Nice work, bro!
@liemh9290 · 4 years ago
Great for business people to get an idea. Check out the MIT open course; the professor explains it down to the math, which is useful for coding.
@karpenkovarya9796 · 6 years ago
Fantastic! Thank you!
@kristophererickson1865 · 5 years ago
I may have set my learning rate too high: I got to a point where the error per iteration kept going up and then back down to the same values. I also had to use the tweaks in someone's fork of Siraj's code to get it to work, plus another "rounding" tweak of my own, because I was getting the error "np.random.choice: probabilities do not sum to 1", so I had to rescale the probabilities to sum to one. Needless to say, the output was junk after 500 iterations. Learning a ton though.
@turingtechacademy · 6 years ago
Really awesome!!!!!
@andyroider97 · 7 years ago
Let's give this little girl a like. @4:43 Just kidding, but I pressed like.
@SirajRaval · 7 years ago
haha thanks
@UgoGbenimachor928 · 6 years ago
Good job, thanks
@andreistoica7981 · 6 years ago
Great video!
@clark87 · 5 years ago
thank you siraj
@Trafulgoth · 7 years ago
"I spit a plant, the star and the motherfuckers, but I got a bottle of the streets in the back" --Eminem Bot
@ltc0060 · 7 years ago
Hey Siraj, I downloaded the code and tried to run it, but it failed. I checked and tried to fix the problems (missing variables, outputs and such), then ran it again for a day. I used a 42 KB text file (Turkish text) for training, but the output was gibberish, as in ". . . vVVvAV:,,VVDAAVVVVvvvV:.k . . .", which is, trust me, not Turkish. I mean, I'm not expecting well-written text, but I hoped to get at least some words; there was not a single word. Can you fix the code and upload it again when you get a chance? It seems like I messed it up while trying to fix it. Thank you!
@avishkabaddage1243 · 6 years ago
I faced the same problem, have u found a solution??
@AnanyaChadha · 5 years ago
same problem over here!
@ajaysurendranath7023 · 5 years ago
In def LoadText(), replacing text = list(data) with text = list(data.split(' ')) should give better predictions. The earlier version predicts at the character level, the new one at the word level.
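A toy illustration of the difference described above (not the repo's code):
```python
data = "the cat sat"

chars = list(data)             # ['t', 'h', 'e', ' ', 'c', 'a', 't', ' ', 's', 'a', 't']
words = list(data.split(' '))  # ['the', 'cat', 'sat']
```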
@prakashkafle454 · 3 years ago
How can LSTMs be used for multi-class classification? Can you make a video about that with an explanation as clear as this one?
@taiaalaoui1500 · 4 years ago
Hello, many thanks for this video. I have a question: how do we know that the forget gate is really forgetting? I mean, how do we mathematically set it up to make sure its function is to learn what to forget and not something else?
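One standard way to see this: the gate is never told to forget; its role comes purely from where it sits in the cell-state update,
```latex
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
```
Since f_t multiplies the previous cell state elementwise with values in (0, 1), the only effect it can have is to scale old memory down. Gradient descent shapes it into a "forget" gate because that is the only role available to it.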
@j1nchuika · 7 years ago
4:43 this is me every time I see a new Siraj video
@SirajRaval · 7 years ago
haha
@jeevanel44 · 7 years ago
Awesome! \m/ Please make videos on GRUs and reinforcement learning as well!
@tonydenion3557 · 7 years ago
He already did a vid on RL, called Q-learning
@jeevanel44 · 7 years ago
Oh didn't know that, thanks for the info Tony!
@tonydenion3557 · 7 years ago
My pleasure, enjoy :D
@GlazeAndMaren · 7 years ago
Hey there, Mr. Siraja sauce. You need to go over the forwardProp method in the RecurrentNeuralNetwork class. You missed an "in" in the for loop and an equal sign in the second to last line in the for loop.
@GlazeAndMaren · 7 years ago
Other than that, your vids are sublime
@williemaddox9919 · 7 years ago
Also, forwardProp doesn't catch the correct number of return values from LSTM.forwardProp. I'm guessing the line should be `cs, hs, f, inp, c, o = self.LSTM.forwardProp()`. Is this right?
@SirajRaval · 7 years ago
great feedback thanks
@duncanduke5966 · 6 years ago
Which program do you use to put your computer screen (or slides) in the background? I really like how you pull this off.
@alphacharith · 7 years ago
Thank you Siraj. Keep up the great work; ignore the idiotic comments of TheDiscoMole
@alexmckinney5761 · 7 years ago
I may have misunderstood your code, but in this blog, colah.github.io/posts/2015-08-Understanding-LSTMs/, the creation of the new candidate values to add to the cell state is done within the LSTM cell, whereas in yours it is done (I think) using an RNN. Is there anything wrong with creating the candidate values within the cell, or am I going down the wrong route?
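For reference, the candidate-value equation from the linked post, computed inside the cell in the standard formulation:
```latex
\tilde{c}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)
```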
@bryanphang5686 · 4 years ago
Hey, is anyone getting this error: `ValueError: too many values to unpack (expected 5)`? Please help!
@charlescrawford1103 · 7 years ago
So, I've been experimenting with LSTMs, learning both letter by letter and word by word. Why did you choose word by word? My networks seem to predict better when learning letter by letter. Also, maybe I missed it, but how do you input your data? Is it in one-hot tensor form, like (num_batches, sequence_len, vocab_size)? Thanks, your vids are good.
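A quick sketch of the one-hot tensor shape the question mentions, with toy sizes:
```python
import numpy as np

num_batches, sequence_len, vocab_size = 2, 4, 5
tokens = np.random.randint(vocab_size, size=(num_batches, sequence_len))

one_hot = np.zeros((num_batches, sequence_len, vocab_size))
for b in range(num_batches):
    for t in range(sequence_len):
        one_hot[b, t, tokens[b, t]] = 1.0  # exactly one 1 per (batch, step) pair
```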
@petter9078 · 7 years ago
If I wanted, could I use neural networks to manipulate sound? For instance, to recreate old non-linear analogue synthesizers or compressors?
@edenaut · 7 years ago
Hey Raval, do you know whether neural nets learn better if (for example, with a man standing in front of a greenscreen) you replace the green with random noise, or whether it is better to replace it with real backgrounds? Maybe with random noise the net doesn't learn features from the background?
@caueZero · 7 years ago
Very nice video, again! Hey, do you know any material for better understanding some of the quant concepts applied in the code? Thanks.
@DP-bl7nh · 6 years ago
Annoying; less mastery and more noise
@TheFunkiGuy · 5 years ago
What is this hidden state? Is it only the previous output, or both the previous output and the previous input?
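In the standard formulation, the hidden state is also the cell's output, derived from the cell state via the output gate:
```latex
h_t = o_t \odot \tanh(c_t)
```
So what flows to the next step is the previous output h_{t-1}, but every gate then concatenates it with the current input x_t as [h_{t-1}, x_t], so both are seen.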
@Ak1990-e6z · 6 years ago
Why do we not update the weight U that connects the input layer and the hidden layer during BPTT in an RNN?
@sikor02 · 6 years ago
Siraj, what would you recommend to someone who wants to evolve recurrent LSTM networks? TensorFlow by itself does not provide evolutionary algorithms, as far as I know.
@dimitriosbizopoulos8226 · 6 years ago
Cool video once again. Your shirt needs some ironing
@DigitalWerkz · 7 years ago
Hi... I'm a big fan. I have only one piece of feedback for now: I get halfway through typing the Python code, and then your body blocks the tail end of a critical piece of code. I'm still learning, so filling in the blanks is a little difficult; sometimes I have to find just the right frame in order to see the actual proper code. Could you please work on keeping your body out of the way of the source? Other than that, I think you're doing a great job here! Thank you Siraj. :-)
@SirajRaval · 7 years ago
will do thank u
@DigitalWerkz · 7 years ago
Thanks, and keep up the great work. :-)
@cliccme · 5 years ago
Hi Siraj, make a video on bidirectional LSTMs for emotion detection like this one. It would help a lot for students like us.
@yiy11005 · 6 years ago
Great video.
@13MrMusic · 4 years ago
Disclaimer: new at all of this stuff. I have a question: I don't see any bias vector in the LSTM cell. Am I looking at it wrong? On Wikipedia I've read that to calculate the output of each gate, you not only need to add a bias, but there's also another weight matrix that multiplies the previous hidden state, and the sum of all of those is the output of the gate. For example, for the forget gate: f = sigmoid(Wf*x(t) + Uf*h(t-1) + bf), where W and U are the weight matrices and bf is the bias vector. Am I reading your code wrong and this is taken into account? Or is this a simpler version of a perfectly valid LSTM, just different?
@clownheino9607 · 3 years ago
It is possible to incorporate the bias into the weight matrix. Not sure whether he is doing that here, but consider the following example of a plane: 0 = a*x1 + b*x2 + c*x3 + d (d being your bias and x your input). That is the same as: 0 = [a b c] * [x1 x2 x3]^T + d. Now, incorporating d: 0 = [a b c d] * [x1 x2 x3 1]^T. Meaning you can always fold the bias into your weights by appending a 1 to your input vector; you then have a single multiplication instead of an additional addition. It's just more compact, basically.
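A tiny numpy demonstration of the folding trick above (values are illustrative):
```python
import numpy as np

W = np.array([[1.0, 2.0, 3.0]])        # weights [a b c]
b = np.array([4.0])                    # bias d
x = np.array([0.5, -1.0, 2.0])         # input [x1 x2 x3]

explicit = W @ x + b                   # multiply, then add bias
W_aug = np.hstack([W, b[:, None]])     # [a b c d]
x_aug = np.append(x, 1.0)              # [x1 x2 x3 1]
folded = W_aug @ x_aug                 # single multiplication

assert np.allclose(explicit, folded)
```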
@shankar2chari · 7 years ago
Awesome video, but the variable naming could be much better than this. It's quite confusing; every time, I have to go back and recall what exactly each variable is.
@omarlopezrincon · 7 years ago
beautiful
@I77AGIC · 7 years ago
In your video on different gradient descent optimizers, didn't you say Adam was usually the best choice? Just curious whether there is a reason you chose RMSprop for this
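For context, a minimal sketch of the RMSprop update being asked about (hyperparameters are illustrative, not the video's):
```python
import numpy as np

lr, decay, eps = 0.001, 0.9, 1e-8

def rmsprop_step(w, grad, cache):
    # cache: running average of squared gradients, same shape as w
    cache = decay * cache + (1 - decay) * grad**2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```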
@josiah42 · 7 years ago
@Siraj Raval you haven't added the last 5 videos to your "Math of Intelligence" playlist yet. Thanks man!
@SirajRaval · 7 years ago
just added, thanks!