How Can We Generate BETTER Sequences with LLMs? 

The ML Tech Lead!
404 views

We know that LLMs are trained to predict the next token. When we decode the output sequence, we condition on the prompt tokens and the previously generated tokens to predict the next one. With greedy decoding or multinomial sampling, we use those per-token predictions to emit the output one token at a time, autoregressively. But is the resulting sequence the one we are looking for, given the prompt? Do we actually care about the probability of each next token on its own? What we want is for the whole sequence to maximize its probability conditioned on the prompt, not for each token to do so separately.
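To make the autoregressive loop concrete, here is a minimal sketch of both decoding modes using Hugging Face transformers' generate API (the "gpt2" checkpoint and the prompt are illustrative assumptions, not from the video):

```python
# Minimal sketch of autoregressive decoding with Hugging Face transformers.
# The "gpt2" checkpoint and the prompt are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids

# Greedy decoding: at every step, emit the single most probable next token.
greedy = model.generate(input_ids, max_new_tokens=20, do_sample=False)

# Multinomial sampling: draw the next token from the predicted distribution.
sampled = model.generate(input_ids, max_new_tokens=20, do_sample=True)

print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(sampled[0], skip_special_tokens=True))
```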
So let's look at why the next-token probability is not the quantity we ultimately care about, and how we can do better than simply autoregressing on the probability of the next token.
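The description doesn't spell out the method, but the quantity we want to maximize is the full-sequence probability P(y | x) = prod_t p(y_t | x, y_<t), and maximizing each factor greedily does not, in general, maximize the product. Exact maximization is intractable (the number of candidate sequences grows as vocabulary size to the power of the length), and beam search is one standard way to approximate it. Below is a minimal hand-rolled beam-search sketch that ranks partial sequences by their summed token log-probabilities; the checkpoint, prompt, beam width, and step count are all illustrative assumptions:

```python
# Minimal beam-search sketch: score whole sequences by their summed token
# log-probabilities instead of committing greedily to each next token.
# Checkpoint, prompt, beam width, and step count are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids[0]

beam_width, num_steps = 3, 10
beams = [(prompt_ids, 0.0)]  # (token ids, cumulative log-probability)

with torch.no_grad():
    for _ in range(num_steps):
        candidates = []
        for ids, score in beams:
            logits = model(ids.unsqueeze(0)).logits[0, -1]  # next-token logits
            log_probs = torch.log_softmax(logits, dim=-1)
            top_lp, top_tok = log_probs.topk(beam_width)    # expand each beam
            for lp, tok in zip(top_lp, top_tok):
                candidates.append(
                    (torch.cat([ids, tok.view(1)]), score + lp.item())
                )
        # Keep the beam_width sequences with the highest total log-probability:
        # we rank whole sequences, not individual next tokens.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]

best_ids, best_score = beams[0]
print(tokenizer.decode(best_ids, skip_special_tokens=True))
print(f"sequence log-probability: {best_score:.2f}")
```

In practice you would not roll this by hand: Hugging Face's generate supports it directly via num_beams (e.g., model.generate(input_ids, num_beams=5, do_sample=False)), and greedy decoding is just the special case of a beam of size one.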

Published: 21 Sep 2024

Comments: 2
@tripathi26 · 4 months ago
Informative! 🙏
@SHAILENDRAUPADHYAY-ok4yz · 4 months ago
Absolute masterpiece