Awesome as always! Worth noting that the added columns ("Sex_male", "Sex_female", etc.) are now bools rather than ints, so you need to explicitly coerce df[indep_cols] at around 25:42 -- t_indep = tensor(df[indep_cols].astype(float).values, dtype=torch.float)
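If anyone wants it spelled out, here's a minimal sketch of both workarounds (the tiny DataFrame and column names are placeholders standing in for the lesson's Titanic data):

```python
import pandas as pd
import torch

df = pd.DataFrame({"Sex": ["male", "female"], "Age": [22.0, 38.0]})

# Option 1: ask pandas for float dummies up front
# (recent pandas versions return bool dummy columns by default)
df = pd.get_dummies(df, columns=["Sex"], dtype=float)

# Option 2: coerce when building the tensor
indep_cols = ["Age", "Sex_male", "Sex_female"]
t_indep = torch.tensor(df[indep_cols].astype(float).values, dtype=torch.float)
```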
Great lesson! I found myself a bit confused by the predictions and loss.backward() at ~37:00. I did some digging to clear up my confusion, which might be helpful for others:
- At 37:00, when we're creating the predictions, Jeremy says we're going to add up (each independent variable * coef) over the columns. There's nothing wrong with how he said this, it just didn't click for my brain: we're creating a prediction for each row by adding up indep_var * coeff across that row's columns. So at the end we have a predictions vector with the same number of predictions as we have rows of data.
- This is what we then calculate the loss on. Then, using the loss, we do backprop to see how much changing each coef would have changed the loss. Then we apply those changes (that's the gradient descent step) to update the coefs, and that's one epoch.
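If it helps to see those steps as code, here's a minimal sketch of one epoch in the lesson's style (the toy shapes and the 0.1 learning rate are my own picks, not the notebook's exact values):

```python
import torch

torch.manual_seed(42)
n_rows, n_coeff = 10, 5
t_indep = torch.rand(n_rows, n_coeff)  # independent variables, one row per passenger
t_dep = torch.rand(n_rows)             # dependent variable (survived or not)

coeffs = (torch.rand(n_coeff) - 0.5).requires_grad_()

preds = (t_indep * coeffs).sum(axis=1)  # one prediction per row
loss = torch.abs(preds - t_dep).mean()  # mean absolute error over all rows
loss.backward()                         # backprop: d(loss)/d(coeff) for each coeff
with torch.no_grad():
    coeffs.sub_(coeffs.grad * 0.1)      # step each coeff against its gradient
    coeffs.grad.zero_()                 # reset gradients for the next epoch
```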
I might be cheating a lil because I've already done a deep learning subject at Uni, but this course so far is fantastic. It's really helping me flesh out what I didn't fully understand before.
Semi off topic: What I really dislike about Python is the lack of types (or that type hints are optional). It really makes it difficult to understand things when you're learning complicated new stuff like this. Is that argument a float or a tensor? What is the shape of the tensor? If that were in the type of the function argument, it would make reading the code much easier when learning this stuff.
It drives me absolutely batty to do matrix work in Python because it's so difficult to get the dimension stuff right. I always end up adding asserts and tests everywhere, which is sort of fine but I would rather not need them. I really want to have dependent types, meaning that the tensor dimensions would be part of the type checker and invalid operations would fail at compile time instead of run time. Then you could add smart completion, etc. to help get everything right quickly.
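In the meantime I settle for hints plus asserts. A minimal sketch of the pattern (the function is illustrative, not from the lesson):

```python
import torch

def calc_preds(coeffs: torch.Tensor, indeps: torch.Tensor) -> torch.Tensor:
    """coeffs: shape (n_coeff,); indeps: shape (n_rows, n_coeff); returns shape (n_rows,)."""
    assert coeffs.ndim == 1 and indeps.ndim == 2, "expected a coeff vector and a matrix of rows"
    assert indeps.shape[1] == coeffs.shape[0], "column count must match coeff count"
    return (indeps * coeffs).sum(axis=1)
```

Libraries like jaxtyping go further and let you put the shapes into the annotations themselves, which is about as close to dependent types as Python gets today.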
What helped me was reading the PyTorch source code with the `??` operator and thinking about the operations in terms of linear algebra. It's hard to keep all of the ranks in mind. At the end of the day I just have to keep hacking through the errors.
If the coefficients are too large or too small, they create gradients that are either too steep or too gentle. When the gradient is too gentle, a small horizontal step won't take you down very far, and gradient descent will take a long time. If the gradient is too steep, a small horizontal step corresponds to a big vertical drop and a big vertical swoop up the other side of the valley, so you might even end up further away from the minimum. What you want is something in between.
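You can see both failure modes on a toy function. A minimal sketch of gradient descent on f(x) = x², where the step size stands in for the horizontal step above (the specific values are arbitrary picks for illustration):

```python
# Gradient descent on f(x) = x^2; the gradient is 2x.
def descend(step, n=20, x=3.0):
    for _ in range(n):
        x -= step * 2 * x  # move against the gradient
    return x

print(descend(0.001))  # too gentle: after 20 steps x has barely moved toward 0
print(descend(1.1))    # too steep: each step overshoots the valley and diverges
print(descend(0.1))    # in between: converges toward the minimum at 0
```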
Adding a dimension at ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-_rXzeWq4C6w.html is very important, as otherwise the minus in the loss function would broadcast incorrectly, leading to a model that achieves at most 0.55 accuracy. The error is silent, as the mean in the loss function hides it.
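Here's a minimal sketch of the trap (the shapes are illustrative):

```python
import torch

preds = torch.rand(5, 1)  # column vector, shape (5, 1)
deps = torch.rand(5)      # flat vector, shape (5,)

# (5, 1) - (5,) broadcasts to (5, 5): every prediction minus every target.
bad = torch.abs(preds - deps).mean()            # silently averages 25 numbers

good = torch.abs(preds - deps[:, None]).mean()  # (5, 1) - (5, 1): 5 numbers, as intended
```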
coeffs = torch.rand(n_coeff) - 0.5 -- What is the use of subtracting 0.5 from the coefficients? Is there a problem if the values are just between 0 and 1? Thanks a lot.
torch.rand() generates random numbers in the range 0 to 1; subtracting 0.5 from the random coefficients is a simple technique to center the values around zero, which I believe helps gradient descent optimize.
Shifting the range to [-0.5, 0.5] lets the coefficients take both positive and negative values. There are different strategies; you can google "weight initialization strategy". Libraries do this automatically for ReLU, tanh, etc.
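A quick sketch of the difference (kaiming_uniform_ is just one example of the built-in schemes):

```python
import torch

uncentered = torch.rand(5)    # all in [0, 1): every coefficient starts positive
coeffs = torch.rand(5) - 0.5  # in [-0.5, 0.5): positive and negative starts

# Libraries ship smarter schemes, e.g. Kaiming init for ReLU layers:
layer = torch.empty(5, 20)
torch.nn.init.kaiming_uniform_(layer)
```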
This is going to sound very pedantic, but you use the word "rank" where I think "order" would be more correct. Rank usually means the number of linearly independent columns in a matrix. At about 1:02:00, you say that the coefficients are a rank-2 matrix, but I would say its rank is 1 and its order is 2.
@howardjeremyp I haven't examined the best-performing Gender Surname Model for the Titanic dataset in detail, but something seems rather strange to me. Doesn't using the survival status of other family members constitute a data leak? After all, at inference time, which is before the Titanic incident, I would not have this information.
Depends on how you look at it. If you're trying to predict whether a person survived or not, and you already have a list of confirmed survivors and casualties, then it's probably a good way to make the prediction: if Mrs X has died, then it's safe to assume that Mr X has died as well. Or if their children have died, then it's safe to assume that both parents are dead, considering that women and children board the lifeboats first.
27:18 Why don't we have a constant in our model? How can we know that there's not going to be a constant in the equation? Can someone explain this to me?
Simply brilliant workshop... I had to change/add dtype=float, e.g. pd.get_dummies(tst_df, columns=["Sex","Pclass","Embarked"], dtype=float), to get it to work -- maybe due to a later version of pandas?
torch.rand(n_coeff, n_hidden) -- How does one set of coeffs output 20 (n_hidden) values? I mean, mathematically, a single set of coefficients multiplied by a specific set of values will always equal the same thing, right?
I'm assuming you are in the section about neural networks (before deep learning). The term n_hidden is a bad variable name. It's only 1 hidden layer, but the hidden layer is the linear combination of n_hidden ReLUs. Each of the ReLUs has coefficients to learn, which we store in a matrix of size n_coeff by n_hidden.
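In other words, it's not one set of coefficients but n_hidden of them, one column each. A minimal sketch (n_coeff=12 is my own pick; 20 is the n_hidden from the question):

```python
import torch

n_coeff, n_hidden = 12, 20
layer1 = torch.rand(n_coeff, n_hidden) - 0.5  # 20 separate coefficient columns

row = torch.rand(n_coeff)  # one row of independent variables
hidden = row @ layer1      # matrix multiply: 20 different weighted sums
print(hidden.shape)        # torch.Size([20]) -- one value per ReLU
```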