Monarch Mixer: Making Foundation Models More Efficient - Dan Fu | Stanford MLSys #86

Episode 86 of the Stanford MLSys Seminar Series!
Monarch Mixer: Making Foundation Models More Efficient
Speaker: Dan Fu
Abstract:
Machine learning models are increasingly being scaled in both sequence length and model dimension to reach longer contexts and better performance. However, existing architectures like Transformers scale quadratically along both these axes. In this talk I'll discuss Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension. M2 mixes information along the sequence and model dimensions using Monarch matrices, a simple class of expressive structured matrices that captures many linear transforms, achieves high hardware efficiency on GPUs, and scales sub-quadratically.
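To make the sub-quadratic claim concrete, here is a minimal pure-Python sketch of one common way to apply a Monarch-style matrix to a vector: a block-diagonal multiply, a fixed permutation, a second block-diagonal multiply, and the permutation again. It assumes the length is n = b * b with b blocks of size b x b. The function names (`block_diag_apply`, `permute`, `monarch_apply`, `multiply_count`) are hypothetical, and the permutation convention and parameterization used in the paper may differ from this illustration.

```python
def block_diag_apply(blocks, x):
    """Multiply a block-diagonal matrix by x.

    `blocks` is a list of b square blocks, each b x b (lists of rows);
    x has length b * b and is split into b contiguous chunks.
    """
    b = len(blocks)
    out = []
    for i, block in enumerate(blocks):
        chunk = x[i * b:(i + 1) * b]
        out.extend(sum(w * v for w, v in zip(row, chunk)) for row in block)
    return out


def permute(x, b):
    """View x as a row-major b x b grid and transpose it."""
    return [x[j * b + i] for i in range(b) for j in range(b)]


def monarch_apply(L, R, x):
    """Apply (permutation . L . permutation . R) to x.

    L and R are block-diagonal factors (assumed layout, see lead-in).
    """
    b = len(R)
    y = block_diag_apply(R, x)   # mix within each chunk
    y = permute(y, b)            # regroup so chunks exchange entries
    y = block_diag_apply(L, y)   # mix within the regrouped chunks
    return permute(y, b)         # restore the original ordering


def multiply_count(b):
    """Scalar multiplications used by monarch_apply: 2 * b blocks * b^2 each."""
    return 2 * b ** 3


if __name__ == "__main__":
    b = 4
    n = b * b
    identity = [[1 if r == c else 0 for c in range(b)] for r in range(b)]
    L = [identity for _ in range(b)]
    R = [identity for _ in range(b)]
    x = list(range(n))
    print(monarch_apply(L, R, x) == x)   # identity factors leave x unchanged
    print(multiply_count(b), "vs dense", n * n)
```

With n = b * b, the two block-diagonal multiplies cost 2 * b^3 = 2 * n^1.5 scalar multiplications, compared with n^2 for a dense matrix-vector product, which is where the sub-quadratic scaling in this sketch comes from.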
Bio:
Dan Fu is a PhD student in the Computer Science Department at Stanford University, where he is co-advised by Christopher Ré and Kayvon Fatahalian. His research is at the intersection of systems and machine learning and focuses on developing algorithms and architectures to make machine learning more efficient.
Monarch Mixer arXiv: arxiv.org/abs/...
FlashFFTConv arXiv: arxiv.org/abs/...
--
Stanford MLSys Seminar hosts: Simran Arora, Dan Fu
Twitter:
/ simran_s_arora
/ realdanfu
--
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google....
#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford

Published: Oct 1, 2024

Comments: 4

@kvotheosem-sangue 3 months ago

Explained so clearly! The paper gets confusing once it gets into the math because the material is so dense. Thanks for extending it to a video format.

@jawadmansoor6064 9 months ago

arXiv link please?

@backtofocused438 9 months ago

Indeed! It is such wonderful work and such a fantastic way to learn, and I would have expected that for such a fantastic scientific exploration about this

@StanfordMLSysSeminars 8 months ago

Added to the description!