Wow. What a great talk. Substantial, yet easy to follow. And that seems to be a characteristic of papers he co-authored as well. Great research, but also great education. 🎉
Does anyone have more insight into what happens in the phase where the loss does not improve but it is still important to keep training (i.e. to have patience)? It seems to me that some significant weight reorganization is still taking place, which does not directly yield an improvement but sets the stage for the next lr phase.