
Neural Networks - The Math of Intelligence #4 

Siraj Raval
770K subscribers
53K views

Published: 29 Aug 2024

Comments: 131
@danielparrado3605 6 years ago
wow... this 11 min video took me 2 hours to understand most of it. You did a really good job putting ALL that information in such a short amount of time. Great job Siraj, keep up the good work!
@PaulGoux 7 years ago
This video went from 0 - 100 real quick.
@basharjaankhan9326 7 years ago
Hey Siraj, you're AWESOME! Nothing less. I am watching your videos to learn Machine Learning while my college admissions are going on. Never stop, cuz i too want to see AI solved in my lifetime.
@basharjaankhan9326 7 years ago
Did I forget to mention that your videos are easy to understand. Sorry for that.
@lionelt.9124 5 years ago
Those beats... deserved a rewind all on their own. A beat souffle I would say.
@jefkearns 5 years ago
The flute beat is mine. Hurricane. Video is on my channel.
@Throwingness 7 years ago
I really appreciate all the work you're doing with these videos. Sorry for my caustic comments before. I am a rank amateur. Your videos are getting better and better.
@Murderface666 7 years ago
All the talk about neural networks, from conferences to individual series, is cool, but what a lot of people aren't clearing up is exactly how to apply them to real-world examples. It's like giving a person an engine and showing how the engine itself works, but one person may want a car engine, another may want a boat engine, another may want a jet engine, and another may want whatever engine the Starship Enterprise uses. So in actuality, there is not really any information on how to use neural networks so that programmers can apply them to whatever problem they have.
@SirajRaval 7 years ago
+Partisan Black see my intro to deep learning playlist
@Murderface666 7 years ago
***dammit man, your explanations are awesome!
@wibiyoutube6173 5 years ago
Thanks for the amazing info mate. In the Fast.ai course they say one should learn the code first, then the theory, but you prove them wrong in my opinion. Thanks again my friend.
@JapiSandhu 6 years ago
this channel is so underrated
@simetry6477 6 years ago
Japi Sandhu he is a great communicator.
@davidutra2304 7 years ago
hello Siraj, make a video talking about what is necessary to start learning machine learning, like the basic math and the programming languages to learn before starting. Sorry for my English, I'm Brazilian. Thanks
@saminchowdhury7995 5 years ago
Take a shot every time he says function. Great vid btw
@BiranchiNarayanNayak 7 years ago
I liked the "LOVE" equation, it was too good.... Thanks Siraj :)
@SirajRaval 7 years ago
lol thanks
@novansyahherman6788 5 years ago
Thank you, mas Siraj, I got an assignment because of you
@RyanRamadhanii 5 years ago
You're welcome, mas Novan. -mas Siraj
@Tozziz 9 months ago
This video is awesome!!!! Thank you so much :)
@SubarnoPal 7 years ago
Explaining LSTM and Conv Net implementations would be very helpful in upcoming tutorials!
@SirajRaval 7 years ago
i will
@ebimeshkati4729 7 years ago
Siraj, could you kindly provide us with an example (tutorial) on how to properly update a trained deep learning model based on new data (let's say from a sensor)?
@ManojChoudhury99 6 years ago
Learning from you is amazing.
@tanmayrauth7367 6 years ago
can anyone please explain why the derivative of the sigmoid function is taken as x*(1-x)?
@Wherrimy 6 years ago
Can someone clarify the part at 2:26 about dot product and matrix multiplication? It says that they're the same, while they're completely different, dot product producing a scalar, and matrix multiplication producing a matrix.
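A quick NumPy check of the distinction raised here (the arrays are made up for illustration): np.dot acts as a true dot product on 1-D arrays and as matrix multiplication on 2-D arrays, which is likely why the video treats the two interchangeably.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
scalar = np.dot(a, b)           # 1-D inputs: a true dot product, yields a scalar

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])
product = np.dot(A, B)          # 2-D inputs: ordinary matrix multiplication
print(scalar)                   # 32.0
print(product)                  # [[19. 22.]
                                #  [43. 50.]]
```

So both statements in this thread are right: "dot product" and "matrix multiplication" are different operations in general, but the same NumPy call covers both depending on input shape.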
@j1nchuika 7 years ago
Kind of late but, could somebody explain why the random weight matrix at 2:15 is multiplied by 2 and minus 1? I tried without them and it worked pretty much the same, but I'm doing the simple AF one...
@ericalcaidealdeano7674 7 years ago
Hey, I was unable to install PIL via pip, so I changed 3 lines and it worked:
import matplotlib.pyplot as plt
from scipy.misc import toimage  # from pillow import Image
def show(self):
    plt.imshow(toimage(self.weights.astype('uint8'), mode='RGB'))
    plt.show()
@khalidjaradat 5 years ago
Siraj Raval, thank you for all these great videos. Could you speak a little bit slower? Because our first language isn't English
@robinranabhat3125 7 years ago
hey siraj, are you a speedreader/speedlearner? If yes, please try to make a video series on your fast learning style too
@anastasia_onion 7 years ago
King of memology!
@computersciencebasis6051 6 years ago
When time matters in the input sequence, that's when RNNs come in. Good.
@manojnagthane 7 years ago
Hi... Could you please explain the difference and relation between big data, data science, machine learning and neural networks? Please make a video on that.
@venu589 6 years ago
Excellent lecture bro, but I have some doubts... Why do neural networks need a hidden layer with multiple neurons? Why can't they get by with one neuron in the hidden layer? Moreover, the same inputs are connected to each neuron in the hidden layer, which would give the same output. Do we need to give a different set of weights to each input so that one neuron is differentiated from another? What is every neuron in the hidden layer computing?
@yasar723 6 years ago
This video is GOLD!!!!
@ranojoybarua6468 7 years ago
Hey, how do we optimize the total number of hidden layers required and the number of neurons in each layer for a model? E.g., an image recognition problem that can be solved with 2 hidden layers of 100 neurons each can also be solved with 5 layers of 400 neurons each. So how do we optimize these numbers?
@MrDominosify 7 years ago
Siraj, I wonder. The sigmoid function is y = 1 / (1 + e^-x). Its derivative is equal to e^x / (e^x + 1)^2. Why in this video are you using a different function as the derivative, x*(1 - x)?
@simonmandlik910 7 years ago
that is exactly what I was thinking as well. The derivative can be rewritten as s(x)*(1-s(x)), where s(x) is sigmoid function, but definitely not as x*(1-x). His training seems to be working though :O
@simonmandlik910 7 years ago
I get it now. I am probably used to a different order of computation. The error is defined as the partial derivative of the cost function w.r.t. the weighted input z (W*x + b). If you want to calculate the error in the last layer, according to the chain rule you have: error = dC/da * ds/dz, where C is the cost function, a is the activation in the last layer and s is the sigmoid/activation function. To compute the exact value of the second term you should plug z into sigmoid prime, but Siraj plugs in the activation (sigmoid already applied), and that's why we don't have to apply sigmoid inside the function
@XRobotexEditz 7 years ago
@Simon Mandlik I still do not understand. See, activation(np.array([2.0,1.0,-1.0]),True) and np.array([2.0,1.0,-1.0])*(1-np.array([2.0,1.0,-1.0])) generate the same result. I do not see how x*(1-x) is the same as S(x)*(1-S(x))?
@rahls7 6 years ago
So, it does appear that nonlin returns x*(1-x) when deriv=True; however, when it is called, the x that is passed to it is itself already a sigmoid output (layer l1), effectively making it the same thing. I guess it just helps to represent it as x instead of typing it again.
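The conclusion of this thread can be verified numerically. The sketch below mirrors the notebook's convention with a made-up nonlin helper and made-up test values: x*(1-x) matches the analytic sigmoid derivative only when x is already a sigmoid output, not when x is a raw pre-activation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nonlin(x, deriv=False):
    # As in the video's code: with deriv=True, x is assumed to already be
    # a sigmoid OUTPUT, so x*(1-x) equals s(z)*(1-s(z)).
    if deriv:
        return x * (1 - x)
    return sigmoid(x)

z = np.array([2.0, 1.0, -1.0])              # raw pre-activations
a = nonlin(z)                               # activations s(z)
analytic = np.exp(z) / (np.exp(z) + 1)**2   # textbook s'(z)

same = np.allclose(nonlin(a, deriv=True), analytic)   # True: x*(1-x) on activations
wrong = np.allclose(nonlin(z, deriv=True), analytic)  # False: x*(1-x) on raw z
print(same, wrong)
```

This also settles the exchange above: applying x*(1-x) directly to raw values like [2.0, 1.0, -1.0] does not reproduce the true derivative; the trick only works because backprop in the notebook always passes in the already-activated layer.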
@DrewBive 7 years ago
I have a problem in the last line of code. In your notebook you have this:
#testing
print(activate(np.dot(array([0, 1, 1]), syn0)))
[ 0.99973427  0.98488354  0.01181281  0.96003643]
When I just copy-pasted this I got a NameError. Then I did 'from numpy import array' and got a different result from the activation function, something like [ 0.36375058]. What's the problem?
PS. You have a mistake in this code: github.com/llSourcell/neural_networks/blob/master/simple_af_network.ipynb
#Use it to compute the gradient
layer2_gradient = l2_error*activate(layer2,deriv=True)
In this line we have the l2_error parameter. Instead of this you need to use layer2_error. Thank you
@akashvaidsingh 7 years ago
Hey Siraj, please make lots of tutorial videos on neural networks, for students (just like me) who want to learn about ANNs.
@pinkiethesmilingcat2862 7 years ago
akash vaid perhaps if you support him on Patreon. I don't know.
@SirajRaval 7 years ago
i have countless neural network videos, see my intro to deep learning playlist. i will make more
@ThomasHauck 7 years ago
You got it goin on ...
@JordanShackelford 7 years ago
breh why are people still using sigmoid? I thot ReLu was superior
@nandanp.c.7775 7 years ago
As far as I know, sigmoid is still used because it gives probabilities (values in [0,1]), which is not the case with ReLU (for binary classification problems; softmax in the case of multiclass classification problems). So it is used just in the last layer. ReLU doesn't suffer from the vanishing gradient problem, so it is used in the hidden layers so that errors can be propagated back effectively.
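A small numerical illustration of the vanishing-gradient point (the helper names and test values are made up): the sigmoid's gradient shrinks toward zero for large |z|, while ReLU's gradient stays exactly 1 for any positive input.

```python
import numpy as np

def sigmoid_grad(z):
    # derivative of 1/(1+e^-z), i.e. s(z)*(1-s(z)); never exceeds 0.25
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1 - s)

def relu_grad(z):
    # derivative of max(0, z): 0 for negative inputs, 1 for positive ones
    return (z > 0).astype(float)

z = np.array([-10.0, -1.0, 0.5, 10.0])
sg = sigmoid_grad(z)   # tiny at |z| = 10 (~4.5e-5): gradients vanish when stacked
rg = relu_grad(z)      # exactly 0 or 1: positive signals pass through unshrunk
print(sg)
print(rg)
```

Multiplying many sigmoid gradients (each at most 0.25) across layers is what makes deep sigmoid stacks slow to train, which is the motivation for ReLU in hidden layers.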
@SirajRaval 7 years ago
what Nandan said is true.
@Tagraff 7 years ago
Why don't we use each channel/layer as a form of captured time? Say a 5-second length as the captured time. Then use that as a channel/layer and apply it to the system. Action as a symbol.
@LogOfOne 7 years ago
great vid again.....really helpful
@leonnowsden7802 6 years ago
This video is very helpful
@UsmanAhmed-sq9bl 7 years ago
awesome siraj
@ongjiarui6961 7 years ago
Hi Siraj! Here's my solution for this week's coding challenge: github.com/jrios6/Math-of-Intelligence/tree/master/4-Self-Organizing-Maps
@hammadshaikhha 7 years ago
I read over your notebook; I liked the nice and simple vectorized code. I am trying to understand the general intuition behind how you did the MNIST example. Correct me if I am wrong, but your output lattice of nodes is 20x20, so you have 400 weight vectors lying in dimension 784 (the number of pixels in an image). You then represented this information as a 3D matrix of size 20x20x784, which after training holds the finalized weights. It's not clear to me what you're doing next. Are you now using these 400 weights to form 400 clusters in your data, and then plotting each image in the clusters on the 20x20 lattice to get the visualization?
@ongjiarui6961 7 years ago
hammad shaikh Yeah, you're right. To visualise the 3D tensor, we have to transform it to a 2D matrix first. So each 784-dimensional weight vector is converted to a 28x28 matrix and aligned according to the parent node in the lattice.
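A minimal sketch of the reshaping described above, assuming the shapes from this thread (a 20x20 lattice of nodes with 784-dimensional weights; the random tensor stands in for trained SOM weights):

```python
import numpy as np

lattice, side = 20, 28   # 20x20 SOM nodes, 28x28 MNIST-sized tiles
weights = np.random.random((lattice, lattice, side * side))  # stand-in for trained weights

# Each node's 784-vector becomes a 28x28 tile; tiles are laid out on the lattice
grid = weights.reshape(lattice, lattice, side, side)
canvas = grid.transpose(0, 2, 1, 3).reshape(lattice * side, lattice * side)
print(canvas.shape)  # one big 2D image, one tile per parent node
```

The transpose interleaves the per-tile rows with the lattice rows, so pixel (r, c) of the node at lattice position (i, j) lands at canvas position (i*28+r, j*28+c).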
@SirajRaval 7 years ago
u rule Ong
@JordanShackelford 7 years ago
Wow amazing!
@larryteslaspacexboringlawr739 7 years ago
thank you for math intelligence video
@Nightphil1 7 years ago
Hey Siraj, why do we add the gradients after we backprop them instead of subtracting? We are going for the minima, right?!
@javerenzoaugustinejao4729 7 years ago
I think it's because the results are negative? or am I just dumb :D
@superchefliumaohsing 7 years ago
Hi Siraj, all of your videos are playable offline except this one. I'm trying to learn machine learning and I downloaded all of your videos to watch while I'm travelling to work. Hope that in a few weeks I can send an entry for your GitHub contests. Anyway, can you change the setting so it can be saved offline?
@SirajRaval 7 years ago
hmm use keepvid dot com
@0newhero0 7 years ago
love ur vids, keep up the good work!
@SirajRaval 7 years ago
thank u!
@Leon-pn6rb 7 years ago
In IN[43] at 2:22, can someone tell me what this line means: *synaptic_weights = 2 * np.random.random((3,1)) - 1* What is the significance of (3,1) and the -1, and why was his code working without prefixing 'np' at the beginning (like I did)? And why random.random (random 2 times)?
@MultiverseHacker 7 years ago
np was "affixed" (you mean imported) by "import numpy as np". It sounds to me like you are a total beginner, but I'm going to answer anyway.
(3,1) is a Python data structure called a tuple; it packs 2 values into one variable. It's supposed to describe the dimensions of the output matrix, which will be 3 rows and one column. By default random() returns a matrix with random values between 0 and 1; the matrix size is specified by this tuple.
np is the numpy module you imported.
np.random is the random number generator inside numpy.
np.random.random((3,1)) calls the function random() on the random number generator, requesting matrix dimensions 3x1.
2*np.random.random((3,1)) multiplies all values in this 3x1 matrix by 2, resulting in a matrix with random values between 0 and 2.
2*np.random.random((3,1))-1 subtracts one from each value, making a matrix with random values between -1 and 1.
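The explanation above can be checked directly. The sketch below mirrors the video's line, with a seed added only so the run is repeatable (the seed is not in the original code):

```python
import numpy as np

np.random.seed(1)  # for repeatability only; not in the video's notebook
synaptic_weights = 2 * np.random.random((3, 1)) - 1

shape_ok = synaptic_weights.shape == (3, 1)                              # 3 rows, 1 column
in_range = synaptic_weights.min() >= -1 and synaptic_weights.max() < 1   # scaled to [-1, 1)
print(shape_ok, in_range)
```

Centering the initial weights around zero (rather than in [0, 1)) means roughly half start negative, which tends to give better-behaved early gradients; that is why the 2x and -1 scaling is there even though small networks often train fine without it.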
@Leon-pn6rb 7 years ago
No no, I knew np was numpy; my question was that he didn't import it and yet his code worked. I always consider myself a beginner in everything, but I am brand new to Python. Ohhh I get it now, u r a BOSS, danke very mush
@MultiverseHacker 7 years ago
He simply didn't bother to show the import. If you look at his code on github you'll see it's there
@marr73 6 years ago
12345a scroll up, there is his import
@sivaprasad-pw3xt 7 years ago
hi Siraj, kindly provide links to learn machine learning, I am new to this field
@tyhuffman5447 7 years ago
Simplified AF network, not familiar with that one.
@SirajRaval 7 years ago
Hmm, yeah. The technical definition is a single-layer feedforward network; the older terminology is perceptron. I should've said that instead. Thanks
@tyhuffman5447 7 years ago
No, keep it. You're entertaining AF! Best channel for learning AI. Keep up the good work.
@kondziossj3 7 years ago
@Siraj Raval can you create a video about overfitting? And of course solutions for those problems... I tried many times to do your previous challenge, but sometimes I have big overfitting problems: with the training data I get 100% accuracy, but with the test data only ~10% -_- (of course I check the best prediction in TensorBoard, but it isn't a great solution for this. Correct me if I am wrong :D) Do you have any better solution for overfitting problems?
@412kev2 7 years ago
Was crackin up at 1:15
@transitioningtech 1 year ago
I'm a business programmer and I just have one thing to say. If I ever have to program like this to keep a job ... I'm screwed. What the hell is a "sigmoid function?"
@SirajRaval 1 year ago
The sigmoid function is a type of activation function for neural networks. Search "activation functions siraj" on YouTube and watch that vid, you'll love it
@rishikksh20 7 years ago
Please make a video on how to configure, train and use TensorFlow's new Object Detection API with your own dataset and model
@Leon-pn6rb 7 years ago
*On Sigmoid*: I was just reading about it. The derivative of a sigmoid function is S'(x) = S(x) * (1-S(x)). But here you did: S'(x) = x * (1-x). *Can someone please explain?*
@Leon-pn6rb 7 years ago
oly shit i got it now. u were being cheeky smart with puttin those 2 important parts in one function. or mayb i am a dumb shit. god, i m so laggin behind you
@Christian-mn8dh 5 years ago
They are the same things just in different syntaxes.
@RAHULGUPTA-ce6zb 6 years ago
Can anyone tell me how we calculated the gradient?
@nickellis1553 7 years ago
when you realize siraj knows several languages and probably actually went to go practice his Dutch
@morkovija 7 years ago
why is the print function parameter censored? hehe =)
@Gioeufshi 7 years ago
It is not censored, it is a reference to a "black box"
@rgrimoldi 6 years ago
yo - what's the name of the song? it's amazing!
@jefkearns 6 years ago
Hurricane - Jef Kearns
@rajscuba 7 years ago
yay
@ozzzer 5 years ago
Ngl, siraj has the weirdest sense of humour
@dimitrilambrou 7 years ago
Why would you need to practice dutch?
@SirajRaval 7 years ago
i live in amsterdam
@AkashMishra23 7 years ago
is this Deep Learning or is this Art?
@SirajRaval 7 years ago
both
@emperorjustinianIII4403 7 years ago
I heard practising Dutch, which triggered me because I'm a Dutchie.
@emperorjustinianIII4403 7 years ago
BTW, if you want a suggestion for learning material I'm using Duolingo and I like it.
@bobcrunch 7 years ago
What's the correct way to say: Met een windmolen in het hoofd slaan or Door een windmolen in het hoofd slaan ?
@pinkiethesmilingcat2862 7 years ago
Duolingo is good, but I think practice is a better way to learn than a boring course in whatever language you wish. For example, my English is not perfect, but I learned a lot making the English and Spanish subs for Math of Intelligence. Before that, I was a collaborator on other videos about philosophy, memes, reviews etc. without basic English, and now I'm here, typing to you.
@emperorjustinianIII4403 7 years ago
Bob Crunch, I have never heard either of those sentences. But that's possible because even I sometimes don't know an idiom that's not used very often. But I'd choose 'Met een windmolen in het hoofd slaan', because you can hit someone in the head with a windmill (as in a windmill-toy), but one can't literally 'through a windmill hit in the head'. Note that the word 'door' means 'through' in this sentence.
@bobcrunch 7 years ago
I heard from a Dutch native speaker that "Hit in the head by a windmill" was an idiom for someone who is crazy or maybe someone with a bad idea. Thanks for the reply.
@abhisheksinghchauhan6115 7 years ago
what are the prerequisites for ML?
@user-ic7ii8fs2j 7 years ago
Calculus + basic knowledge of programming
@nickellis1553 7 years ago
Abhishek Singh Chauhan patience
@abhisheksinghchauhan6115 7 years ago
Nick Ellis I meant to say: which programming language?
@nickellis1553 7 years ago
Abhishek Singh Chauhan well, Python. But honestly that doesn't matter as much as having the patience to go line by line and equation by equation, and trusting that your brain will make sense of it all. Also a good "statistics vocabulary", and familiarity with linear algebra and matrix operations lol.
@abhisheksinghchauhan6115 7 years ago
Aditya Abhyankar voice responsive automated system like assistant
@kryptoshi4706 7 years ago
Hi Siraj
@SirajRaval 7 years ago
hi
@phuccoiinkorea3341 7 years ago
@MIGuy 7 years ago
saved4
@Zerksis79 7 years ago
I came here looking to learn the math of intelligence and left looking for a math tutor. :|
@SirajRaval 7 years ago
:( i will do better
@Zerksis79 7 years ago
Thanks! I’m looking forward to your future videos!
@alverlopez2736 7 years ago
BTW your short-haired girlfriend is beautiful
@SirajRaval 7 years ago
thx shes not my gf u guys r my gf
@anti_globalista 2 months ago
"Clowning the explanation of neural networks". Is this some kind of American talk show?
@manishadwani386 7 years ago
siraj please dont upload a videos per week. please upload maybe like 2 or 3 in a week.
@SirajRaval 7 years ago
'don't upload a videos?' was this a typo i dont understand
@Leon-pn6rb 7 years ago
what he meant was, upload more videos in a week, cause we are hooked now, we need a dose of your mind
@manishadwani386 7 years ago
Siraj Raval yup that was a typo I meant 1 video per week
@SirajRaval 7 years ago
kk thx