I recently tried to explain something of this sort to colleagues using the analogy of a fishing net trying to catch a particular medium-sized fish. The gap size needed to guarantee you never miss a single fish you're actually after happens to be small enough that plenty of other marine life gets caught too. This lowers the quality of the catch and introduces the cost of having to pick through it during processing, potentially even ruining some batches when the wrong thing gets scooped up. On the other hand, the gap size that guarantees you never catch anything besides what you want is large enough that you'll also occasionally lose out on the fish you set out for. In my real work example, we can afford a bit of junk mixed in that gets picked out, but we can't afford to miss anything, so I went with the smallest net gap I could that still never missed the legitimate targets.
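To put that analogy in code: picking the smallest net gap that never misses a legitimate fish is like picking the lowest classification threshold that still achieves 100% recall, then accepting whatever precision comes with it. A minimal sketch of that rule, assuming scikit-learn; the synthetic data and logistic regression are just stand-ins for the real pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve

# Synthetic imbalanced data and a simple model, standing in for the real pipeline.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

scores = clf.predict_proba(X_val)[:, 1]                  # P(positive class)
precision, recall, thresholds = precision_recall_curve(y_val, scores)
precision, recall = precision[:-1], recall[:-1]          # align with thresholds

# "Smallest net gap that never misses a fish": among the thresholds with 100% recall,
# take the one with the best precision.
full_recall = recall >= 1.0
best = np.argmax(np.where(full_recall, precision, -np.inf))
print(f"threshold={thresholds[best]:.3f}  "
      f"precision={precision[best]:.3f}  recall={recall[best]:.3f}")
```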
Could you make a video on useful resources like blogs for inspiration? Also, a Kaggle series would be 😎 You are the best teacher of advanced statistics topics on YT ♥️
Absolutely love the video! Would be interested to see a "code with me" on this topic. How does one go about exploring the precision-recall frontier? Is it just hyperparameter tuning via grid search, or a more deliberate method I'm not aware of? I have a neural net trained with decent accuracy for what I want, but it deals with stocks, so I'd much rather have no signal than a false signal. Not necessarily asking about neural nets; it could be logistic regression or random forest. Just more clarity on the question would be wonderful! Thanks for all you do, man!!
“How does one go about exploring the precision-recall frontier?” is an excellent question. If the question is about how we construct the frontier, in binary classification problems that is usually done by varying the probability threshold for assigning an example to the positive class. Low thresholds give strong recall but poor precision, and vice versa for high thresholds.
@ritvikmath that was my question and your answer makes good sense, thanks! I'll iterate over different classification thresholds to establish the curve and then pick the level that works best for me :)
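A rough sketch of that threshold sweep, assuming scikit-learn; the data and model here are placeholders, and in practice you'd score a held-out set rather than the training data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

# Placeholder data and model; in practice, evaluate on held-out data.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]

# Sweep thresholds: low thresholds -> high recall / low precision, and vice versa.
for t in np.linspace(0.05, 0.95, 10):
    preds = (scores >= t).astype(int)
    p = precision_score(y, preds, zero_division=0)
    r = recall_score(y, preds)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```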
Hi Jesse, my understanding is that while searching for the best model, you hyper-parameter tune for the area under the PR curve (or, similarly, the ROC AUC, i.e., the area under the ROC curve), both of which are independent of the choice of threshold probability. Then, given a model with a fixed area under the ROC or PR curve, you find the threshold that suits your problem description. Hope that makes sense.
@abhigyandatta2008 for sure, thanks! Make sure I've got the best model going in, then and only then test for an appropriate threshold. Makes sense to me!
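A sketch of that two-step workflow, assuming scikit-learn: tune on a threshold-free score first (average precision approximates the area under the PR curve; "roc_auc" would give the ROC AUC instead), and only then pick an operating threshold. The parameter grid, the recall floor, and the data are illustrative assumptions, not anything from the video:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import precision_recall_curve

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

# Step 1: model selection on a threshold-independent score.
search = GridSearchCV(LogisticRegression(max_iter=1000),
                      param_grid={"C": [0.01, 0.1, 1, 10]},
                      scoring="average_precision")   # or "roc_auc"
search.fit(X_tr, y_tr)

# Step 2: only then choose the operating threshold on held-out data,
# e.g. the highest-precision point that still meets a recall floor.
scores = search.best_estimator_.predict_proba(X_val)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_val, scores)
ok = recall[:-1] >= 0.90                             # recall floor is an arbitrary example
best = np.argmax(np.where(ok, precision[:-1], -np.inf))
print(f"C={search.best_params_['C']}  threshold={thresholds[best]:.3f}  "
      f"precision={precision[best]:.3f}  recall={recall[best]:.3f}")
```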
I think marketing cost and market response is a better example of the law of diminishing marginal returns. This carrot-and-apple thing doesn't feel too intuitive, does it? Also, the precision-recall curve may not really be a good example of the law of diminishing marginal returns either, since the precision-recall curve's shape is also a function of class balance, I think.
Totally a valid point. I think “most important curve in data science” depends on a lot of things, so this is just my take on it. I feel this one is important for understanding higher-level data science trade-offs, but the bell curve is certainly crucial for understanding things like statistical behavior in the presence of large sample sizes, for example.