
How AI Learns to Talk - Recurrent Neural Networks & Transformers 

Defense Acquisition University Media

Discover the fascinating world of AI language processing as you take a journey documenting how AI learned to talk. This insightful episode is part of our ongoing series delving into the intricacies of neural networks in an easy-to-understand way, made for all career fields.

We start by contrasting previously discussed network types, like feed-forward and convolutional networks, with the unique ability of Recurrent Neural Networks (RNNs) to analyze sequential data. Witness the evolution of Natural Language Processing (NLP), beginning with Michael Jordan's pioneering work on simple RNNs in the mid-1980s. Discover how these networks, by learning sequences, navigate through state space, laying the groundwork for future advancements. Continue that journey through the 1990s with Elman's RNNs, which showed remarkable skill in word partitioning and text generation, albeit with limitations. Enter Yoshua Bengio's groundbreaking 2003 paper advocating for neural networks in NLP and introducing the concept of word embeddings. Then leap to the transformative work of Geoffrey Hinton, Ilya Sutskever, Andrej Karpathy, and others, exploring how RNNs evolved into powerful text generators, as demonstrated by OpenAI's relatively massive RNN that was trained on Amazon reviews and developed an emergent capability to detect sentiment. Our journey culminates in the game-changing paper "Attention Is All You Need", which introduced Transformers. Discover how OpenAI leveraged this architecture to create models like GPT-1, GPT-2, GPT-3, InstructGPT, ChatGPT, and GPT-4, a journey marked by emergent properties and in-context learning capabilities.

Finally, we tease our next video, promising a deeper dive into the capabilities of Large Language Models (LLMs) and other Transformer-based models in the ongoing AI gold rush. Enjoy this video and stay tuned to explore how AI is redefining the way we understand language and communication!

The link to a course playlist of DAU-recommended AI courses is: dau.csod.com/u...
You will need a DAU account to access these resources. If you are a DoD member and need a DAU account, you can request one here: www.dau.edu/fa...
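The simple RNNs mentioned above share one core mechanism: a hidden state that is updated at every step of a sequence, letting the network "remember" what came before. Here is a minimal sketch of that recurrent update in plain Python; the weights are illustrative values chosen for this example, not trained parameters from any model discussed in the video.

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One recurrent step: the new hidden state mixes the current input x
    with the previous hidden state h_prev, squashed through tanh."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_sequence(inputs):
    h = 0.0                  # initial hidden state
    states = []
    for x in inputs:         # process the sequence one element at a time
        h = rnn_step(x, h)   # the hidden state moves through "state space"
        states.append(h)
    return states

states = run_sequence([1.0, 0.0, 1.0])
```

Because each new state depends on the previous one, the third state here reflects the entire input history, which is exactly what feed-forward and convolutional networks cannot do on their own.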

Published: Oct 5, 2024

Comments: 4
@MP-nc1su · 7 months ago
Tell me if I have this right. The entire context window, which is the stuff I type as well as the stuff the model spits out, goes into the GPT, and it uses that to predict the next word in the sequence. The autoregressive part means the predicted word is fed back around, added to the context window, and the cycle repeats. But what is really going into the GPT? You talked about learned word embeddings (which I know really means tokens); are the "input embeddings" at the beginning of the ChatGPT diagram the word embeddings you talked about? It is hurting my brain trying to visualize what is really running through this model.
@AIatDAU-px5vx · 7 months ago
I think you've got it. Yes, the entire context window includes both what you type (the prompt) and what the model generates in response. This whole chunk of text is used by GPT to figure out what may come next in the sequence. Spot on about the autoregressive nature: the model predicts one word (or token) at a time, and each new word it predicts is fed back into the model as part of the input for predicting the next word. This process keeps repeating, creating a coherent sequence word by word. As to what really goes into the GPT (i.e., the model), you are right: before anything goes into the Transformer architecture, the text is split into smaller pieces called tokens. These tokens are not always whole words; they can be parts of words (for bigger words, you can think of tokens as roughly root words) or even punctuation. And yes, the "Input Embeddings" shown on the GPT and Transformer diagrams in the video represent the learned word/token embeddings. It's these embeddings that actually go into the Transformer architecture. Did that answer your question? If it didn't help, ask again and I'll take another pass. Thanks for asking.
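The autoregressive loop described in this reply can be sketched in a few lines of Python. The tokenizer-free "vocabulary" and the `toy_model` function below are hypothetical stand-ins for illustration only; a real GPT would map tokens to learned embeddings and run them through the Transformer to produce the next-token distribution.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat", ".", "<eos>"]

def toy_model(context_tokens):
    """Stand-in for GPT: returns a probability for each token in VOCAB.
    A real model would embed the context and run it through the Transformer."""
    random.seed(len(context_tokens))          # deterministic toy behavior
    scores = [random.random() for _ in VOCAB]
    total = sum(scores)
    return [s / total for s in scores]

def generate(prompt_tokens, max_new_tokens=5):
    context = list(prompt_tokens)             # the context window
    for _ in range(max_new_tokens):
        probs = toy_model(context)            # predict next-token distribution
        next_token = VOCAB[probs.index(max(probs))]  # greedy pick
        if next_token == "<eos>":             # model signals it is done
            break
        context.append(next_token)            # feed the prediction back in
    return context

print(generate(["the", "cat"]))
```

The key point is the last line of the loop body: each predicted token is appended to the context before the next prediction, which is exactly the feedback cycle the comment describes.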
@MachineLearning-o8d · 9 months ago
Time Index:
00:00 Introduction
02:22 Simple Single Neuron RNN Example
04:26 RNNs Show Early Progress for Natural Language Processing
07:55 Pioneers Like Yoshua Bengio Help to Thaw the AI Winter
11:10 Progress Resumes for RNNs & Natural Language
14:36 Limitations of Scaling RNNs
15:00 Transformers / Attention Is All You Need
16:09 Self Attention / Multi-Headed Self Attention and the FFN
19:22 OpenAI's Modified Transformer and GPTs (GPT-1 through ChatGPT 4)
26:58 Other Large Language Models and next video's topics