Unpacking how large language models work under the hood
Early view of the next chapter for patrons: 3b1b.co/early-attention
Special thanks to these supporters: 3b1b.co/lessons/gpt#thanks
To contribute edits to the subtitles, visit translate.3blue1brown.com/
Other recommended resources on the topic:
Richard Turner's introduction is one of the best starting places:
arxiv.org/pdf/2304.10557.pdf
Coding a GPT with Andrej Karpathy
• Let's build GPT: from ...
Introduction to self-attention by John Hewitt
web.stanford.edu/class/cs224n...
History of language models by Brit Cruise:
• ChatGPT: 30 Year Histo...
Paper about examples like the “woman - man” one presented here:
arxiv.org/pdf/1301.3781.pdf
------------------
Timestamps
0:00 - Predict, sample, repeat
3:03 - Inside a transformer
6:36 - Chapter layout
7:20 - The premise of Deep Learning
12:27 - Word embeddings
18:25 - Embeddings beyond words
20:22 - Unembedding
22:22 - Softmax with temperature
26:03 - Up next
------------------
These animations are largely made using a custom Python library, manim. See the FAQ comments here:
3b1b.co/faq#manim
github.com/3b1b/manim
github.com/ManimCommunity/manim/
All code for specific videos is visible here:
github.com/3b1b/videos/
The music is by Vincent Rubinetti.
www.vincentrubinetti.com
vincerubinetti.bandcamp.com/a...
open.spotify.com/album/1dVyjw...
------------------
3blue1brown is a channel about animating math, in all senses of the word animate. If you're reading the bottom of a video description, I'm guessing you're more interested than the average viewer in lessons here. It would mean a lot to me if you chose to stay up to date on new ones, either by subscribing here on YouTube or otherwise following on whichever platform below you check most regularly.
Mailing list: 3blue1brown.substack.com
Twitter: / 3blue1brown
Instagram: / 3blue1brown
Reddit: / 3blue1brown
Facebook: / 3blue1brown
Patreon: / 3blue1brown
Website: www.3blue1brown.com
May 3, 2024