
New course with Google Cloud: Reinforcement Learning from Human Feedback (RLHF) 

DeepLearningAI

Enroll now: bit.ly/48aqPrK
Large language models (LLMs) are trained on human-generated text, but additional methods are needed to align an LLM with human values and preferences. Reinforcement Learning from Human Feedback (RLHF) is currently the main method for aligning LLMs to make them more helpful, honest, and safe.
In this course, you will gain a conceptual understanding of the RLHF training process, and then practice applying RLHF to tune an LLM. You will:
Explore the two datasets (“preference” and “prompt”) that are used in RLHF training.
Use the open source Google Cloud Pipeline Components Library to fine-tune the Llama 2 model with RLHF.
Assess the tuned LLM against the original base model by comparing loss curves and using the “Side-by-Side (SxS)” method.
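The two datasets mentioned above can be sketched as JSON Lines records. This is a minimal illustration of the idea; the field names (`input_text`, `candidate_0`, `candidate_1`, `choice`) are assumptions about the schema, not a confirmed specification of what the Google Cloud Pipeline Components Library expects.

```python
import json

# Preference dataset: each record pairs a prompt with two candidate
# completions plus a human label for which one is preferred.
# This data trains the reward model.
# NOTE: field names are illustrative assumptions, not a confirmed schema.
preference_example = {
    "input_text": "Summarize: The quick brown fox jumped over the lazy dog.",
    "candidate_0": "A fox jumps over a dog.",
    "candidate_1": "Foxes are mammals.",
    "choice": 0,  # the human annotator preferred candidate_0
}

# Prompt dataset: prompts only, no completions. During the reinforcement
# learning step, the LLM generates completions for these prompts and the
# reward model scores them.
prompt_example = {
    "input_text": "Summarize: Climate change refers to long-term shifts...",
}

# Both datasets are commonly stored as JSON Lines (one record per line).
with open("preference.jsonl", "w") as f:
    f.write(json.dumps(preference_example) + "\n")
with open("prompt.jsonl", "w") as f:
    f.write(json.dumps(prompt_example) + "\n")
```

Keeping the two files separate reflects their different roles: the preference data shapes the reward model, while the prompt data drives the reinforcement-learning loop that tunes the LLM itself.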
Join instructor Nikita Namjoshi, Developer Advocate for Generative AI at Google Cloud, for this learning adventure. Prepare to include Reinforcement Learning from Human Feedback in your skillset.
Learn more: bit.ly/48aqPrK

Category: Entertainment

Published: Oct 3, 2024

Comments: 32