Master LLMs: Top Strategies to Evaluate LLM Performance

What's AI by Louis-François Bouchard

60K subscribers

4.1K views

In this video, we look into how to evaluate and benchmark Large Language Models (LLMs) effectively. Learn about perplexity, other evaluation metrics, and curated benchmarks to compare LLM performance. Uncover practical tools and resources to select the right model for your specific needs and tasks. Dive deep into examples and comparisons to empower your AI journey!
► Jump on our free LLM course from the Gen AI 360 Foundational Model Certification (Built in collaboration with Activeloop, Towards AI, and the Intel Disruptor Initiative): learn.activeloop.ai/courses/l...
►My Newsletter (My AI updates and news clearly explained): louisbouchard.substack.com/
With the great support of Cohere & Lambda.
► Course Official Discord: / discord
► Activeloop Slack: slack.activeloop.ai/
► Activeloop YouTube: / @activeloop
►Follow me on Twitter: / whats_ai
►Support me on Patreon: / whatsai
How to start in AI/ML - A Complete Guide:
►www.louisbouchard.ai/learnai/
Become a member of the YouTube community, support my work, and get a cool Discord role:
/ @whatsai
Chapters:
0:00 Why and How to evaluate your LLMs!
0:50 The perplexity evaluation metric.
3:20 Benchmarks and leaderboards for comparing performances.
4:12 Benchmarks for coding.
5:33 Benchmarks for Reasoning and common sense.
6:32 Benchmark for mitigating hallucinations.
7:35 Conclusion.
#ai #languagemodels #llm
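
For readers who want the perplexity metric from the 0:50 chapter in concrete form, here is a minimal sketch. It uses the standard definition (the exponential of the average negative log-probability per token) and is not taken from the video itself. The function name `perplexity` and the sample log-probabilities are hypothetical, chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p)) for a list of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical example: a model that assigns probability 0.25 to each of 4 tokens
# is as uncertain as a uniform choice among 4 options.
uniform_log_probs = [math.log(0.25)] * 4
print(perplexity(uniform_log_probs))
```

Lower values mean the model found the text less surprising. In practice the log-probabilities would come from the model being evaluated, and scores are only comparable between models that share a tokenizer.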

Published: Aug 5, 2024

Comments: 3

@tranphihung6096 5 months ago

The video is helpful for me! Thanks 🥰🥰🥰🥰

@incameet 6 months ago

Speak too fast!

@WhatsAI 6 months ago

Oh really?!
