The paper for Mistral AI's second Large Language Model (LLM), Mixtral 8x7B, has been released. In this video we explore the code of Mixtral 8x7B, learn how it works, and see why it is called "Mixtral". We also cover some interesting history of Mixture of Experts (MoE), the key architectural component of Mixtral 8x7B.
Paper: arxiv.org/abs/2401.04088
Hugging Face: huggingface.co/mistralai/Mixt...
Chapters:
0:00 Intro
1:34 Architecture (Mixture of Experts)
3:08 Code Walkthrough
7:51 Papers
10:10 Routing Analysis
11:03 Instruct Model
11:27 Outro
Mixtral 8x7B GitHub: github.com/mistralai/mistral-src
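For a quick intuition of the MoE layer covered in the video, here is a minimal PyTorch sketch of sparse top-2 expert routing. This is not Mistral's actual implementation: the class name, dimensions, and the simple feed-forward experts are illustrative (Mixtral's real experts are SwiGLU blocks, but it does route each token to 2 of 8 experts as shown here).

# Minimal sketch of a sparse Mixture of Experts layer with top-2 routing,
# in the spirit of Mixtral 8x7B. Names and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim=32, hidden=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Gating network: one logit per expert for every token.
        self.gate = nn.Linear(dim, num_experts, bias=False)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, dim)
        logits = self.gate(x)                                     # (tokens, experts)
        weights, chosen = torch.topk(logits, self.top_k, dim=-1)  # top-2 per token
        weights = F.softmax(weights, dim=-1)                      # renormalise over the chosen 2
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            token_idx, slot = torch.where(chosen == i)            # tokens routed to expert i
            if token_idx.numel():                                 # run expert only on its tokens
                out[token_idx] += weights[token_idx, slot, None] * expert(x[token_idx])
        return out

tokens = torch.randn(5, 32)      # 5 tokens, model dim 32
print(MoELayer()(tokens).shape)  # torch.Size([5, 32])

The key point is sparsity: although the layer holds 8 experts' worth of parameters, each token only pays the compute cost of 2 of them.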
Check out our socials:
Website: jarvislabs.ai/
X: jarvislabsai
LinkedIn: jarvislabsai
Instagram: jarvislabs.ai
Medium: jarvislabs
Connect with Vishnu:
X: vishnuvig
LinkedIn: vishnusubramanian
31 May 2024