Back propagation is to neural networks what negative feedback is to closed-loop systems. The understanding comes pretty much naturally to people who studied automation and control engineering. However, many articles tend to mix things up, in this case back propagation and gradient descent. Back propagation is the process of passing the error back through the layers and using it to recalculate the weights. Gradient descent is the algorithm used for the recalculation; there are other algorithms for recalculating the weights.
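To make the distinction concrete, here is a rough sketch in Python/NumPy (all names are invented, just for illustration): the backward pass only produces gradients, and the update rule that consumes them is a separate, swappable piece.

```python
import numpy as np

# Back propagation: compute the gradient of the loss with respect to
# the weights. Here, for a single linear layer y = x @ W with
# squared-error loss L = sum((y - y_true)^2).
def backprop_linear(x, W, y_true):
    y_pred = x @ W
    grad_y = 2 * (y_pred - y_true)   # dL/dy
    grad_W = np.outer(x, grad_y)     # dL/dW, via the chain rule
    return grad_W

# Gradient descent: one possible algorithm for recalculating the weights...
def sgd_step(W, grad_W, lr=0.1):
    return W - lr * grad_W

# ...but not the only one. Momentum is another update rule that consumes
# the exact same gradients from the exact same backward pass.
def momentum_step(W, grad_W, velocity, lr=0.1, beta=0.9):
    velocity = beta * velocity - lr * grad_W
    return W + velocity, velocity
```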
The gradient is passed backward using the chain rule from calculus. The gradient is just the multivariable form of the derivative. It is an actual numerical quantity for each "atomic" part of the network, usually a neuron's weights and bias.
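As a tiny worked example (a sketch with made-up numbers): for a single sigmoid neuron with z = w*x + b, a = sigmoid(z), and loss L = (a - t)^2, the chain rule just multiplies the local derivatives together.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, w, b, t = 2.0, 0.5, 0.1, 1.0   # input, weight, bias, target

# Forward pass through one neuron
z = w * x + b                     # pre-activation
a = sigmoid(z)                    # activation
L = (a - t) ** 2                  # squared-error loss

# Backward pass: multiply the local derivatives along the chain
dL_da = 2 * (a - t)               # dL/da
da_dz = a * (1 - a)               # derivative of the sigmoid
dL_dw = dL_da * da_dz * x         # dz/dw = x, so this is dL/dw
dL_db = dL_da * da_dz * 1.0       # dz/db = 1, so this is dL/db

print(dL_dw, dL_db)               # the numerical gradients for this neuron
```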
I can't get enough of your brilliant videos. Thank you for making what seemed complicated to me before easy to understand. Could you please post a video about loss functions and gradient descent?
It probably would have been good to mention that this is supervised learning, as this would not translate well for a beginner trying to apply it to other forms of NNs.
5:36 Actually, convergence does not necessarily mean the network is able to do its task reliably. It just means that its reliability has reached a plateau. We hope that the plateau is high, i.e. that the network does a good job of predicting the right outputs. For many applications, NNs are currently able to reach a good level of performance. But in general, what is optimal is not always very good. For example, a network with just 1 layer of 2 nodes is not going to be successful at handwriting recognition, even if its model converges.
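A quick sketch of that point (using scikit-learn; the dataset and hyperparameters are just illustrative): a deliberately tiny network can converge, i.e. its loss curve flattens, while still performing poorly on the handwriting task.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A network with a single hidden layer of only 2 nodes
tiny = MLPClassifier(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
tiny.fit(X_train, y_train)

# The loss has plateaued (training has converged) ...
print("last losses:", [round(l, 4) for l in tiny.loss_curve_[-3:]])
# ... but the plateau is low: accuracy stays far below a capable model's.
print("test accuracy:", tiny.score(X_test, y_test))
```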
Hi, you seem to have good knowledge of this, so can I ask you a question, please? Do you know if neural networks would be good for recognizing handwritten math expressions (digits, operators, variables, all elements separated so they can be recognized individually)? I need a program that would do that, and I tried a neural network: it is good for images from the dataset but terrible for stuff from outside the dataset. Would you have any tips? I would be really grateful.
Thank you for the information. Could you please tell me if BP is only available and applicable to supervised models, since we have to have a precomputed result to compare against? Certainly, unsupervised models could also use this in theory, but does / could it affect them in a positive way? Additionally, how is the comparison actually performed, especially for information that can't be quantised?
When doing the loss function, how is the "correct" output given? Is it training data that is then compared against another data file with the desired outcomes? In the example of "Martin", how does the neural network get to know that your name was not Mark?
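For what it's worth: in supervised training, the "correct" output is stored alongside each input as a label in the training data, so a human (or an already-labelled dataset) supplied the answer up front. A minimal sketch of how a loss function uses that label (the names and numbers here are made up):

```python
import numpy as np

classes = ["Mark", "Martin", "Mary"]

# Hypothetical network output: one probability per class
y_pred = np.array([0.2, 0.7, 0.1])

# The label travels with the training sample; the network never has
# to "figure out" that the name was Martin and not Mark.
y_true = classes.index("Martin")

# Cross-entropy loss: penalizes a low probability on the correct class
loss = -np.log(y_pred[y_true])
print(f"loss = {loss:.3f}")
```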
Isn't back propagation used to lower the computation needed to adjust the weights? I understand that doing it in a "forward" fashion is much more expensive than in a "backward" fashion?
It’s done by calculating the derivatives of the y-hats with respect to the weights, working backwards through the network and applying the chain rule of calculus.
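To expand on that reply (a rough NumPy sketch, with invented shapes): the backward sweep is cheap because each layer reuses the gradient already computed for the layer after it, so one pass yields every weight's gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # input
W1 = rng.normal(size=(3, 4))      # layer-1 weights
W2 = rng.normal(size=(4, 2))      # layer-2 weights
t = np.array([1.0, 0.0])          # target

# Forward pass (ReLU hidden layer, squared-error loss)
h = np.maximum(0, x @ W1)
y = h @ W2
L = np.sum((y - t) ** 2)

# Backward pass: ONE sweep from the output toward the input
grad_y = 2 * (y - t)              # dL/dy
grad_W2 = np.outer(h, grad_y)     # dL/dW2
grad_h = W2 @ grad_y              # dL/dh -- reused for the layer below
grad_z = grad_h * (h > 0)         # through the ReLU
grad_W1 = np.outer(x, grad_z)     # dL/dW1

# A naive "forward" approach would need roughly one extra network pass
# per weight (3*4 + 4*2 = 20 here) to obtain the same gradients.
```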
A neural network cannot be connected by weights; that is nonsense. It can be connected by synapses, that is, by resistances. The way the network learns is incredibly tricky: not only does it have to remember the correct result, which is not easy in itself, but it has to keep remembering that result while learning a new correct one. This is what distinguishes a neural network from a fishing net.