Master LLMs: Top Strategies to Evaluate LLM Performance 

What's AI by Louis-François Bouchard
60K subscribers
4.1K views

In this video, we look into how to evaluate and benchmark Large Language Models (LLMs) effectively. Learn about perplexity, other evaluation metrics, and curated benchmarks to compare LLM performance. Uncover practical tools and resources to select the right model for your specific needs and tasks. Dive deep into examples and comparisons to empower your AI journey!
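For readers who want to try the perplexity metric mentioned above, here is a minimal sketch, assuming you already have the natural-log probability a model assigned to each token of a text. The function name `perplexity` and the sample probabilities are illustrative assumptions, not taken from the video.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_negative_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_negative_log_likelihood)


# Hypothetical per-token probabilities, made up for illustration only.
confident_model = [math.log(0.5)] * 4   # each token predicted with p = 0.5
uncertain_model = [math.log(0.1)] * 4   # each token predicted with p = 0.1

print(perplexity(confident_model))  # about 2: like choosing among 2 options
print(perplexity(uncertain_model))  # about 10: like choosing among 10 options
```

Lower perplexity means the model was less "surprised" by the text. Scores are only comparable between models that use the same tokenizer and the same evaluation text.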
► Jump on our free LLM course from the Gen AI 360 Foundational Model Certification (Built in collaboration with Activeloop, Towards AI, and the Intel Disruptor Initiative): learn.activeloop.ai/courses/l...
► My Newsletter (my AI updates and news, clearly explained): louisbouchard.substack.com/
With the great support of Cohere & Lambda.
► Course Official Discord: / discord
► Activeloop Slack: slack.activeloop.ai/
► Activeloop YouTube: / @activeloop
► Follow me on Twitter: / whats_ai
► Support me on Patreon: / whatsai
How to start in AI/ML - A Complete Guide:
► www.louisbouchard.ai/learnai/
Become a member of the YouTube community, support my work, and get a cool Discord role:
/ @whatsai
Chapters:
0:00 Why and How to evaluate your LLMs!
0:50 The perplexity evaluation metric.
3:20 Benchmarks and leaderboards for comparing performances.
4:12 Coding benchmarks.
5:33 Benchmarks for Reasoning and common sense.
6:32 Benchmark for mitigating hallucinations.
7:35 Conclusion.
#ai #languagemodels #llm

Category: Science

Published: Aug 5, 2024

Comments: 3
@tranphihung6096, 5 months ago
The video is helpful for me! Thanks 🥰🥰🥰🥰
@incameet, 6 months ago
Speak too fast!
@WhatsAI, 6 months ago
Oh really?!