
Transformers, explained: Understand the model behind ChatGPT 

Leon Petrou
10K subscribers
5K views

🚀 Learn AI Prompt Engineering: bit.ly/3v8O4Vt
In this technical overview, we dissect the architecture of Generative Pre-trained Transformer (GPT) models, drawing parallels between artificial neural networks and the human brain.
From the foundational GPT-1 to the advanced GPT-4, we explore the evolution of GPT models, focusing on their learning processes, the significance of data in training, and the revolutionary Transformer architecture.
This video is designed for curious non-technical people looking to understand the complexities of GPT models in a way that's easy to understand.
🔗 SOCIAL LINKS:
🌐 Website/Blog: www.futurise.com/
🐦 Twitter/X: / joinfuturise
🔗 LinkedIn: / futurisealumni
📘 Facebook: profile.php?...
📣 Subscribe: www.youtube.com/@leonpetrou?s...
⏰ Timestamps:
0:00 - Intro
0:27 - The Importance of Modeling The Human Brain
1:10 - Basics of Artificial Neural Networks (ANNs)
2:26 - Overview of GPT Models Evolution
3:34 - Training Large Language Models
7:05 - Transformer Architecture
7:45 - Understanding Tokenization
10:19 - Explaining Token Embeddings
17:03 - Deep Dive into Self-Attention Mechanism
18:53 - Multiheaded Self-Attention Explained
19:55 - Predicting the Next Word: The Process
22:33 - De-Tokenization: Converting Token IDs Back to Words
#llm #ml #chatgpt #nvidia #elearning #futurise #promptengineering #futureofwork #leonpetrou #anthropic #claude #claude3 #gemini #openai #transformers #techinsights

Science

Published: 14 Jun 2024

Comments: 38
@ravindranshanmugam782 · 2 months ago
Excellent. I went through multiple videos on the basics of Transformers, and this is the one I could grasp most quickly. Effortlessly explained, well done!
@LeonPetrou · 2 months ago
Thank you Ravindran! I try my best to teach things the same way that I'd like to be taught, which is simple and step-by-step. Let me know what other videos you'd like to see from my channel.
@ravindranshanmugam782 · 2 months ago
Hi Leon, it would be great if you could make videos on LangChain and its applications, which are trending now. You could also cover topics like vector databases, embeddings, word2vec, and so on. Anything on GenAI is hot in the tech space right now. Thanks.
@ovidioe.cabeza4750 · 19 days ago
Same for me. I'm a Python backend dev and getting transformers was tough, but you helped me a lot, thank you!
@vj7668 · 13 days ago
Excellent! Thanks for simplifying it. Loved it!
@LeonPetrou · 12 days ago
Appreciate that, thank you!
@michaelzap8528 · 19 days ago
Best. I finally understand how GPT works now. Thanks mate, you're the champion.
@programminglover2976 · 12 days ago
Thank you so much, really well explained.
@wp1300 · 29 days ago
1:12 ANN · 2:26 GPT-1 to GPT-4 · 3:34 LLM · 7:09 Transformer architecture · 7:45 Tokenization & Detokenization · 8:17 Step 1 · 10:14 Step 2 · 10:20 Token embeddings · 14:48 Step 3 · 15:10 Position Embedding · 16:58 Step 4 · 17:17 Self-Attention · 18:52 Multi-headed self-attention · 19:55 Step 5 · 20:27 Feed-Forward · 22:02 Step 6 · 22:32 De-Tokenization
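The six steps indexed in this comment can be sketched end to end. This is a toy illustration, not the video's actual code: the vocabulary, dimensions, and weight matrices are made up, and random values stand in for what a real model learns during training.

```python
import numpy as np

# Toy vocabulary (hypothetical; real models learn large subword vocabularies)
vocab = {"the": 0, "cat": 1, "sat": 2, "<end>": 3}
inv_vocab = {i: w for w, i in vocab.items()}
d = 8  # tiny embedding dimension for illustration

rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), d))   # token embedding matrix
P = rng.normal(size=(16, d))           # position embeddings

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Step 1: tokenization (words -> token IDs)
ids = [vocab[w] for w in ["the", "cat", "sat"]]
# Step 2: token embeddings; Step 3: add position embeddings
x = E[ids] + P[:len(ids)]
# Step 4: single-head self-attention (random projections stand in for learned ones)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
attn = softmax(q @ k.T / np.sqrt(d)) @ v
# Step 5: feed-forward layer, then project to vocabulary logits
Wff = rng.normal(size=(d, d))
h = np.maximum(attn @ Wff, 0)          # ReLU feed-forward
logits = h @ E.T                       # tied output weights, for simplicity
# Step 6: pick the most likely next token ID and de-tokenize it back to a word
next_id = int(logits[-1].argmax())
print(inv_vocab[next_id])
```

With trained weights, the final argmax would pick a plausible continuation; here it just picks whichever toy token the random projections happen to score highest.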
@anibeto7 · 2 months ago
It was indeed a very informative video. It cleared a lot of the important ideas. Thanks a lot.
@JohnCohen-ur5hk · 18 days ago
Very good explanation. Thank you!
@rhktech · 2 days ago
very well explained (Y)
@MotulzAnto · 25 days ago
THANK YOU! Easy explanation.
@LeonPetrou · 24 days ago
Appreciate it!
@karannesh7700 · 19 days ago
Thanks for this great video!
@LeonPetrou · 19 days ago
Appreciate it!
@Clammer999 · 1 month ago
Wow, this is one of the easiest-to-understand videos on how transformers work. You also explained tokens and embeddings very well, which is exactly what I was searching for. I'm a complete newbie and I kept hearing about neurons and neural networks. Is a neuron a physical device/hardware, or is it actually an algorithm? And a neural network is not a physical network?
@LeonPetrou · 1 month ago
Thank you! Neural networks, and everything explained in this video, are all software (apart from biological neurons, which are in the human brain); it's all algorithms, basically just code. The hardware the code runs on usually just needs high processing power and RAM, typically a CPU or GPU.
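Concretely, a single artificial neuron really is just a small function: a weighted sum of its inputs followed by a nonlinearity. A minimal sketch, with arbitrary (untrained) weights and inputs:

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum followed by a sigmoid."""
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))  # sigmoid squashes the sum into (0, 1)

# Example values chosen for illustration; training would adjust weights and bias.
out = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.2], bias=0.1)
print(round(out, 3))  # → 0.599
```

A neural network is just many of these functions wired together, so "network" here means a graph of computations in code, not physical wiring.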
@sudhanshusaxena8134 · 24 days ago
Great explanation.
@LeonPetrou · 24 days ago
Thank you very much!
@Omniassassin7 · 3 months ago
This is amazing, thanks a lot man! Quick question, how are the self-attention layers produced? Does the model dynamically “decide” which contextual layer to use depending on the prompt, or is the set of layers learnt during training?
@LeonPetrou · 3 months ago
My pleasure man, glad you like it. That's a great question. The structure and behavior of these self-attention layers are determined during the model's training phase, not during inference. Simply put, the model learns which words in a sentence should pay attention to which other words to better understand the sentence's meaning. This learning process is fixed once the model is fully trained; it does not change or decide on a different structure when it's given new prompts to process.
@abooaw4588 · 2 months ago
Bravo 🇨🇵 It's a shame this excellent level of explanation is reserved only for those of us who understand English. LeCun and Bengio deserve much of the credit. Fortunately this nutshell isn't translated by some half-baked GPT!
@LeonPetrou · 2 months ago
Merci beaucoup for your thoughtful comment! I'm glad you found the video informative. Your point about language accessibility is very important to us. We're actively exploring options to include subtitles in multiple languages in our future videos to ensure more viewers can benefit from our content.
@kamal9991999 · 21 days ago
This video is a lot better ☝️
@LeonPetrou · 21 days ago
Appreciate that!
@Keshi-lz3ef · 3 months ago
Great session!
@LeonPetrou · 2 months ago
Thank you!
@d96002 · 19 days ago
not 175 trillion parameters but 1.75 trillion
@LeonPetrou · 19 days ago
Thanks for clarifying, my bad.
@NavdeepVarshney-ep4ck · 1 month ago
Sir, are you a researcher or an ML enthusiast?
@LeonPetrou · 1 month ago
I'm an ML enthusiast with an engineering background. :)
@dragonwood-hc4sw · 13 days ago
Ed Stafford?
@LeonPetrou · 12 days ago
I see it! haha
@MaduraiKallan · 17 days ago
1.76 trillion for GPT-4
@LeonPetrou · 17 days ago
Indeed, thanks for clarifying!
@saeidnazemi1312 · 3 months ago
What happened to your hair?
@LeonPetrou · 3 months ago
New year new me 😂