
Sara Hooker - The Hardware Lottery, Sparsity and Fairness 

Machine Learning Street Talk
143K subscribers
5K views

Published: 18 Sep 2024

Comments: 18
@iestynne 3 years ago
For someone interested in getting into ML, having access to this kind of high level discussion from top experts - for free! - is an amazing privilege. Thank you so much for sharing!!
@mamotivated 2 years ago
This is the only show where I would listen to a long introduction and get so excited about it. Great work, team.
@sandraviknander7898 3 years ago
This was a really interesting episode. Sara is amazing! I have thought about this hardware lottery a lot, since I have been really interested in processing in memory (PIM). I do hope that the development of PIM will help expand the space of possible ideas. Or maybe it's going to be as you discuss with R&D resources, and it will never take off, and what we really need is a really good software stack for FPGAs, so that everyone could easily make the hardware that best fits the idea they want to develop. However, FPGAs would not be the entire answer for capsule networks, since the processing is sequential. Although you can come a long way in latency through effective caching; perhaps you could even leverage some higher-order prefetching of data tailored to these sequential models. Amazing episode and great insights from all of you!
@quebono100 3 years ago
Yippee, new video. I don't get why such good content doesn't have more subscribers.
@MachineLearningStreetTalk 3 years ago
We are working on it 🙌😂
@quebono100 3 years ago
This is really a great one. Thank you
@florianhonicke5448 3 years ago
Good work!
@Hexanitrobenzene 2 years ago
Great guest - warm personality, wide knowledge, just great overall :)
@TheReferrer72 3 years ago
The paper is a good read, and it's hard to find fault. $15M to train a model is peanuts, especially as the weights can be copied at near-zero cost; it's the inference cost that's the big worry. It is also a good thing that commercial hardware is used, because it means the technology is easier to get into the hands of society, as opposed to being the preserve of the military.
@shanepeckham8566 3 years ago
Fantastic intro Tim!
@machinelearningdojowithtim2898 3 years ago
Thanks Shane! Miss you bro!
@ratsukutsi 3 years ago
I have a feeling that Sara's work on the Hardware Lottery, as well as Kenneth Stanley's, and maybe even Max Welling's, are almost like socio-political arguments isolated from politics by a transparent, thick membrane of technical knowledge.
@jasdeepsinghgrover2470 3 years ago
Great work... A nice deep discussion. But honestly, is a neural network with a billion parameters even as smart as a jellyfish with 5.5k neurons? (They seem to do multi-label object detection, motion planning, group behaviour and much more at the same time.)
@datta97 3 years ago
great intro!
@DavenH 3 years ago
God, the popups of the show's main cast have me giggling like a schoolboy! Those buzzy sound effects... lol! Facetious question: how does one become a "named Bayesian" like Keith? And where are your aviators, Tim?

I haven't watched through yet, so this could be redundant, but your questions in the intro about how to stop models compressing out protected attributes that are in the long tail -- often a feature we prize in neural nets (i.e. robustness to noise and outliers) -- well, could you oversample those instances so that the model's exposure to them is closer to uniform, if those attributes are indeed important or constitutionally protected? Or maybe it's as simple as a per-instance learning rate, where the rare instances get higher rates... (on second thought, this would cause spikes for momentum-based optimizers). How would it know what's rare? Hmm -- maybe an autoencoder side-model to estimate likelihood? But how to decide between outliers and underrepresented yet important instances?

I don't think an automatic process will ever be able to know what those "protected" attributes are, as they are more a reflection of the somewhat arbitrary history of atrocities contingent on some attributes but not others (has there ever been a genocide based on eye color, say? No -> not protected. Though perhaps in the Stormlight Archive universe a liberal society would do so). If that oversampling overfits the long-tail instances because there's only a handful of each, it will become obvious; get more data for them specifically. This process will inevitably degrade accuracy for a constant model size, but you can always make the model bigger to represent more of the long tail. No-free-lunch kind of thing. Looking forward to the episode.
@machinelearningdojowithtim2898 3 years ago
Hey DavenH! Thanks for commenting! Oversampling those protected instances and/or augmenting them with semantically equivalent mutations might help. Cool idea on the per-instance learning rate! Clearly some tweaks to the SGD algorithm would be required, perhaps select mini-batches with protected instances and increase the learning rate on those batches dynamically.
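[Editor's note: the oversampling idea discussed in this thread can be sketched in a few lines. This is an illustrative toy, not anything from the episode: it weights each instance inversely to its group's frequency, so a rare "protected" group is drawn about as often as the majority group. The group labels and dataset are made up for the example.]

```python
import random
from collections import Counter

def oversampling_weights(groups):
    """Per-instance sampling weights inversely proportional to group
    frequency, so every group is drawn roughly equally often overall."""
    counts = Counter(groups)
    return [1.0 / counts[g] for g in groups]

# Toy dataset: a majority group and a long-tail minority group.
groups = ["majority"] * 90 + ["minority"] * 10
weights = oversampling_weights(groups)

# Sample with replacement using the computed weights.
random.seed(0)
draws = random.choices(groups, weights=weights, k=10_000)
frac_minority = draws.count("minority") / len(draws)
print(frac_minority)  # close to 0.5: the minority is now sampled ~uniformly
```

In a real training loop the same weights would feed a weighted sampler (e.g. a framework's weighted random sampler) rather than `random.choices`; the caveat from the comment above still applies -- with only a handful of minority instances, oversampling repeats them and can overfit the tail.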
@machinelearningdojowithtim2898 3 years ago
First!
@XOPOIIIO 3 years ago
A DNN's conclusions, even from biased data, are far more reliable than human conclusions from the same data, because the algorithm is unbiased, unlike the human brain.