Deep dive into Mixture of Experts (MOE) with the Mixtral 8x7B paper 

Oxen

Arxiv Dives is a group of engineers, researchers, and practitioners from Oxen.ai that gets together every Friday to dig into state-of-the-art research related to Machine Learning and Artificial Intelligence. If you would like to join the live discussion, we would love to have you!
Join here:
lu.ma/oxenbook...
Each week we dive deep into a topic in ML/AI. Whether it is a research paper, a blog post, a book, or a YouTube video, we break down the content into a digestible format and have an open discussion with the Oxen.ai team and anyone else who wants to join. We try to cover the content at a high level so that anyone can understand it, then dive into deeper technical details to get a clearer understanding.
This week we cover the Mixtral paper from the team at Mistral.ai. The paper describes how Mistral used Mixture of Experts (MoE) in their Mixtral-8x7B-Instruct-v0.1 model to achieve better performance than larger models, as well as performance competitive with GPT-3.5.
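To make the routing concrete, below is a minimal sketch (in PyTorch) of the top-2 Mixture of Experts layer the paper describes: a learned router scores each token against 8 expert feed-forward networks, keeps the two highest-scoring experts, and sums their outputs with softmax-renormalized weights. The class name MoELayer, the toy sizes, and the plain-MLP experts are illustrative assumptions, not Mistral's released code; the real model uses SwiGLU feed-forward experts.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Illustrative sketch of Mixtral-style top-2 MoE routing."""

    def __init__(self, dim, hidden, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: one linear layer produces a score per expert for each token.
        self.gate = nn.Linear(dim, n_experts, bias=False)
        # Experts: independent feed-forward nets. Plain MLPs here for brevity;
        # the real model uses SwiGLU blocks.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
             for _ in range(n_experts)]
        )

    def forward(self, x):  # x: (n_tokens, dim)
        logits = self.gate(x)                              # (n_tokens, n_experts)
        weights, chosen = logits.topk(self.top_k, dim=-1)  # keep the top-2 experts
        weights = F.softmax(weights, dim=-1)               # renormalize over those 2
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e                   # tokens whose k-th pick is e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

# Toy sizes for a quick check; Mixtral itself uses dim=4096, hidden=14336.
layer = MoELayer(dim=32, hidden=64)
print(layer(torch.randn(5, 32)).shape)  # torch.Size([5, 32])
```

Because each token only passes through 2 of the 8 experts, the active parameter count per token is far smaller than the total parameter count, which is how the model stays competitive with larger dense models at a fraction of the inference cost.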
All the notes and previous dives can all be found on the Oxen.ai blog:
blog.oxen.ai/t...

Published: Sep 17, 2024

Comments: 4
@Pingu_astrocat21 · 6 months ago
thank you for these videos
@oxen-ai · 6 months ago
You are very welcome!
@gJonii · 8 months ago
This channel is bizarre. Seemingly really small and really high quality stuff. No idea what's going on, but I like it. If you had like 1000x more subscribers I'd have ideas for improving the videos, but for this viewer count, these are way overproduced. Again, nothing against it, but man.
@oxen-ai · 8 months ago
😂 love this comment. Happy to share our paper club with the people.