Corrections + Few Shot Examples (Part 2) | LangSmith Evaluations 

LangChain
3.4K views

Evaluation is the process of continuously improving your LLM application. It requires a way to judge your application's outputs, which are often natural language. Using an LLM to grade natural language outputs (e.g., for correctness relative to a reference answer, tone, or conciseness) is a popular approach, but it requires prompt engineering and careful auditing of the LLM judge.
Our new release of LangSmith presents a solution to this growing problem, allowing a user to (1) correct LLM-as-a-Judge outputs and then (2) pass those corrections back to the judge as few-shot examples for future iterations. This creates LLM-as-a-Judge evaluators that are grounded in human feedback and better encode your preferences, without the need for challenging prompt engineering.
Here we show how to apply Corrections + Few Shot to online evaluators that are pinned to a dataset.
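As a rough illustration of step (2), the sketch below shows one way human corrections could be folded back into a judge prompt as few-shot examples. It is plain Python and does not use the LangSmith SDK; `Correction`, `build_judge_prompt`, `BASE_INSTRUCTIONS`, and the prompt wording are all hypothetical names chosen for this example, not LangSmith's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Correction:
    """A human-corrected judge verdict (hypothetical structure, not a LangSmith type)."""
    question: str
    answer: str
    judge_score: int      # score the LLM judge originally assigned
    corrected_score: int  # score the human reviewer changed it to
    explanation: str      # reviewer's reason for the correction


# Assumed grading instructions; a real evaluator prompt would be more detailed.
BASE_INSTRUCTIONS = (
    "You are grading an answer for correctness. "
    "Reply with 1 if the answer is correct and 0 otherwise."
)


def build_judge_prompt(corrections, question, answer, max_examples=5):
    """Build a judge prompt that includes recent human corrections as few-shot examples."""
    # Keep only the most recent corrections so the prompt stays short.
    recent = corrections[-max_examples:] if max_examples > 0 else []
    parts = [BASE_INSTRUCTIONS]
    for c in recent:
        # Show the judge the human-approved score, not its own earlier mistake.
        parts.append(
            f"Example\nQuestion: {c.question}\nAnswer: {c.answer}\n"
            f"Score: {c.corrected_score}\nReason: {c.explanation}"
        )
    parts.append(f"Now grade this.\nQuestion: {question}\nAnswer: {answer}\nScore:")
    return "\n\n".join(parts)
```

The resulting string would then be sent to whichever model acts as the judge. Each new correction appended to the list changes the prompt for later evaluations, which is the feedback loop the description refers to.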

Published: Oct 4, 2024

Comments: 4
@nidhir9107 1 month ago
This video explores the same content as the previous video (Part 1) in the LangSmith playlist. Why?
@lukem121 3 months ago
Great content as usual! I'm really excited for the new advanced customer support agent that will be using TypeScript. Do you have any updates on when the video will be published?
@darkmatter9583 3 months ago
Thank you for everything. I'm not doing well right now, but I'm still following your channel and supporting it. All the best for your efforts, good luck and best wishes ❤
@93simongh 3 months ago
In my experience, and from some articles I have read, asking an LLM for a numeric evaluation score is very susceptible to nondeterministic results that vary every time the evaluation prompt is run. Can the numeric scores shown in LangSmith be trusted?