
How to Improve LLMs with RAG (Overview + Python Code) 

Shaw Talebi
28K subscribers · 24K views
👉 Need help with AI? Reach out: shawhintalebi.com/
In this video, I give a beginner-friendly introduction to retrieval augmented generation (RAG) and show how to use it to improve a fine-tuned model from a previous video in this LLM series.
👉 Series Playlist: • Large Language Models ...
🎥 Fine-tuning with QLoRA: • QLoRA-How to Fine-tune...
📰 Read more: medium.com/towards-data-scien...
💻 Colab: colab.research.google.com/dri...
💻 GitHub: github.com/ShawhinT/RU-vid-B...
🤗 Model: huggingface.co/shawhin/shawgp...
Resources
[1] github.com/openai/openai-cook...
[2] • LlamaIndex Webinar: Bu...
[3] docs.llamaindex.ai/en/stable/...
[4] • LlamaIndex Webinar: Ma...
--
Book a call: calendly.com/shawhintalebi
Homepage: shawhintalebi.com/
Socials
/ shawhin
/ shawhintalebi
/ shawhint
/ shawhintalebi
The Data Entrepreneurs
🎥 YouTube: / @thedataentrepreneurs
👉 Discord: / discord
📰 Medium: / the-data
📅 Events: lu.ma/tde
🗞️ Newsletter: the-data-entrepreneurs.ck.pag...
Support ❤️
www.buymeacoffee.com/shawhint
Intro - 0:00
Background - 0:53
2 Limitations - 1:45
What is RAG? - 2:51
How RAG works - 5:03
Text Embeddings + Retrieval - 5:35
Creating Knowledge Base - 7:37
Example Code: Improving YouTube Comment Responder with RAG - 9:34
What's next? - 20:58

Published: 16 Jun 2024

Comments: 50
@ShawhinTalebi · 3 months ago
Check out more videos in this series 👇 👉 Series Playlist: ru-vid.com/group/PLz-ep5RbHosU2hnz5ejezwaYpdMutMVB0 🎥 Fine-tuning with QLoRA: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-XpoKB3usmKc.html -- 📰 Read more: medium.com/towards-data-science/how-to-improve-llms-with-rag-abdc132f76ac?sk=d8d8ecfb1f6223539a54604c8f93d573 💻 Colab: colab.research.google.com/drive/1peJukr-9E1zCo1iAalbgDPJmNMydvQms?usp=sharing 💻 GitHub: github.com/ShawhinT/RU-vid-Blog/tree/main/LLMs/rag 🤗 Model: huggingface.co/shawhin/shawgpt-ft Resources [1] github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb [2] ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-efbn-3tPI_M.html [3] docs.llamaindex.ai/en/stable/understanding/loading/loading.html [4] ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Zj5RCweUHIk.html
@bangarrajumuppidu8354 · 2 months ago
Superb explanation, Shaw! 😍
@saadowain3511 · 3 months ago
Thank you Talebi. No one explains the subject like you
@ShawhinTalebi · 2 months ago
Thanks :) Glad it was clear!
@GetPaidToLivePodcast · 2 months ago
Incredible breakdown Shaw!
@ifycadeau · 2 months ago
This is so helpful! Thanks Shaw, you never miss!
@ShawhinTalebi · 2 months ago
Glad it was helpful!
@jagtapjaidip8 · 27 days ago
Very nice. Thank you for explaining in detail.
@deadlyecho · 1 month ago
Very good explanation 👏 👌
@firespark804 · 2 months ago
Awesome video, thanks! I'm wondering: instead of using the top_k documents/batches, could one define a similarity threshold/distance for the batches that get used?
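A threshold-based cutoff like the one asked about can be sketched in a few lines. Everything below (the function name, the toy 2-D vectors, the 0.7 cutoff) is illustrative, not code from the video:

```python
import numpy as np

def retrieve(query_emb, doc_embs, threshold=0.7):
    """Return indices of documents whose cosine similarity to the
    query exceeds a fixed threshold, instead of a fixed top_k."""
    query_emb = query_emb / np.linalg.norm(query_emb)
    doc_embs = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = doc_embs @ query_emb          # cosine similarities
    idx = np.where(sims >= threshold)[0]
    return idx[np.argsort(-sims[idx])]   # best matches first

# toy 2-D "embeddings" standing in for real model output
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
print(retrieve(np.array([1.0, 0.0]), docs, threshold=0.7))  # -> [0 2]
```

One trade-off: a threshold can return zero chunks for out-of-domain queries, so many systems combine both (top_k with a minimum-similarity floor).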
@examore-lite · 2 months ago
Thank you very much!
@lplp6961 · 2 months ago
good work!
@tomasbusse2410 · 1 month ago
Very useful indeed
@nistelbergerkurt5309 · 3 months ago
Great video as always 👍 Does a reranker improve the quality of the output for a RAG approach? That way we could take the output directly from the reranker, right? What is your experience with rerankers?
@ShawhinTalebi · 2 months ago
Great questions! That's the idea. A reranker is typically applied to the top-k (say k=25) search results to further refine the chunks. The reason you wouldn't use a reranker directly on the entire knowledge base is that it is (much) more computationally expensive than the text embedding-based search described here. I haven't used a reranker in any use case, but it seems most beneficial when working with a large knowledge base. This video may be helpful: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Uh9bYiVrW_s.html&ab_channel=JamesBriggs
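The two-stage setup described in the reply (cheap embedding search first, reranker over the top-k) can be sketched as follows. `score_pair` is a stand-in for a real cross-encoder (e.g. from the sentence-transformers library), and the word-overlap scorer is purely illustrative:

```python
def rerank(query, chunks, score_pair, keep=3):
    """Re-score candidate chunks with a (query, chunk) scorer and
    keep the highest-scoring ones. In a real system `score_pair`
    would be a cross-encoder; here it is any callable."""
    scored = sorted(chunks, key=lambda c: score_pair(query, c), reverse=True)
    return scored[:keep]

def overlap(q, c):
    """Toy scorer: count of words shared by query and chunk."""
    return len(set(q.lower().split()) & set(c.lower().split()))

candidates = [
    "RAG adds retrieval to LLMs",
    "Bananas are yellow",
    "Retrieval helps ground LLM answers",
]
print(rerank("how does retrieval help LLMs", candidates, overlap, keep=2))
```

The structure is the point: embedding search narrows thousands of chunks to ~25, then the expensive pairwise scorer only runs 25 times.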
@ariel-dev · 2 months ago
Really great
@michaelpihosh5904 · 2 months ago
Thanks Shaw!
@zahrahameed4098 · 1 month ago
Thank you so much. Becoming a fan of yours! Please do a video on RAG with LlamaIndex + Llama3, if it's free and not paid.
@ShawhinTalebi · 1 month ago
Great suggestion. That's a good excuse to try out Llama3 :)
@nayem5330 · 3 months ago
Very useful.
@ShawhinTalebi · 2 months ago
Glad it was helpful!
@nirmalhasantha986 · 1 day ago
Thank you so much sir :)
@ridg2806 · 24 days ago
Solid video
@Blogservice-Fuerth · 2 months ago
Great 🙏
@candidlyvivian · 4 days ago
Hey Shaw, thanks so much for such a helpful video. I'd love to seek your advice on something :) Currently we are using OpenAI to build out a bunch of insights that will be refreshed using business data (i.e. X users land on your page, Y make a purchase). Right now we are doing a lot of data preparation and feeding the specific numbers into the user/system prompt before passing to OpenAI, but have had issues with consistency of output and incorrect numbers. Would you recommend a fine-tuning approach for this? Or RAG? Or would the context itself be small enough to fit into the "context window", given it's a very small dataset we are adding to the prompt? Thanks in advance 🙂
@ShawhinTalebi · 12 hours ago
Glad it was helpful! Based on the info provided here, it sounds like a RAG system would make the most sense. More specifically, you could connect your data preparation pipeline to a database which would dynamically inject the specific numbers into the user/system prompt. If you have more questions, feel free to email me here: www.shawhintalebi.com/contact
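The "dynamically inject the specific numbers" idea from the reply can be sketched as a tiny prompt builder. The metric names, values, and prompt wording below are hypothetical, just to show the shape of the approach:

```python
# Hypothetical metrics refreshed by a data-preparation pipeline;
# in practice these would come from a database query.
metrics = {"page_visits": 1200, "purchases": 85}

def build_prompt(question, metrics):
    """Inject exact figures into the prompt so the model quotes them
    rather than inventing numbers of its own."""
    figures = "\n".join(f"- {name}: {value}" for name, value in metrics.items())
    return (
        "Answer using ONLY the figures below. "
        "If a figure is missing, say so.\n\n"
        f"Figures:\n{figures}\n\n"
        f"Question: {question}"
    )

print(build_prompt("What is the conversion rate?", metrics))
```

Keeping the numbers out of the model's weights and in the prompt is what makes the answers refreshable: rerun the pipeline, rebuild the prompt, and nothing needs retraining.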
@Pythonology · 2 months ago
Happy Nowruz, kheyli khoob! Question: how would you propose to evaluate a document on the basis of certain guidelines? I mean, to see how far it complies with the guidelines or regulations for writing a certain document. Is RAG any good? shall we just embed the guidelines in the prompt right before the writing? or shall we store the guidelines as a separate document and do RAG? Or ...?
@ShawhinTalebi · 2 months ago
Happy New Year! That's a good question. It sounds like you want the model to evaluate a given document based on some set of guidelines. If the guidelines are static, you can fix them into the prompt. However, if you want the guidelines to be dynamic, you can house them in a database which is dynamically integrated into the prompt based on the user's input.
@edsleite · 27 days ago
Hi Talebi. Thanks for all you show us. One question: I ran your code with my own database, without the fine-tuning, and it works: very quick answers, but poor content. Is that the point of fine-tuning, to make better answers?
@ShawhinTalebi · 21 days ago
It sounds like you may need to do some additional optimizations to improve your system. I discuss some finer points here: towardsdatascience.com/how-to-improve-llms-with-rag-abdc132f76ac?sk=d8d8ecfb1f6223539a54604c8f93d573#bf88
@halle845 · 2 months ago
Thanks!
@ShawhinTalebi · 2 months ago
Thank you! Glad it was helpful 😁
@TheLordSocke · 1 month ago
Nice video, any ideas for doing this on PowerPoints? I want to build a kind of knowledge base from previous projects, but the graphics are a problem. Even GPT4V is not always interpreting them correctly. 😢
@ShawhinTalebi · 1 month ago
If GPT4V is having issues you may need to either 1) wait for better models to come out or 2) parse the knowledge from the PPT slides in a more clever way. Feel free to book office hours if you want to dig into it a bit more: calendly.com/shawhintalebi/office-hours
@susdoge3767 · 21 hours ago
Great channel, subscribed!
@vamsitharunkumarsunku4583 · 2 months ago
So we get the top 3 similar chunks from RAG, right? And we are adding those 3 chunks to the prompt template?
@ShawhinTalebi · 2 months ago
Yes exactly!
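The flow confirmed above can be sketched as a simple prompt template. The wording and chunk contents are illustrative, not taken from the video's code:

```python
def rag_prompt(query, chunks, k=3):
    """Drop the top-k retrieved chunks into the instruction prompt."""
    context = "\n\n".join(chunks[:k])
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# stand-ins for the top 3 chunks returned by retrieval
top_chunks = [
    "Fat-tailed distributions put more mass in extreme events.",
    "Kurtosis is one way to quantify tail heaviness.",
    "Sample means converge slowly under fat tails.",
]
print(rag_prompt("What is fat-tailedness?", top_chunks))
```

The assembled string then goes to the (fine-tuned) model as its input; only the context section changes from query to query.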
@halle845 · 2 months ago
Any recommendations or experience on which embeddings database to use?
@ShawhinTalebi · 2 months ago
Good question! Performance of embedding models will vary by domain, so some experimentation is always required. However, I've found the following 2 resources helpful as a starting place. HF Leaderboard: huggingface.co/spaces/mteb/leaderboard SentenceTransformers: www.sbert.net/docs/pretrained_models.html
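The "experimentation" mentioned in the reply can be as simple as measuring recall@1 on a handful of (query, relevant document) pairs from your own domain. `char_embed` below is a toy stand-in for a real embedding model (e.g. one from the MTEB leaderboard); the harness, not the embedding, is the point:

```python
import numpy as np

def recall_at_1(embed, pairs):
    """Fraction of queries whose true document is the nearest
    neighbour under the candidate embedding function `embed`."""
    queries, docs = zip(*pairs)
    q = np.array([embed(t) for t in queries], dtype=float)
    d = np.array([embed(t) for t in docs], dtype=float)
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    hits = (q @ d.T).argmax(axis=1) == np.arange(len(pairs))
    return float(hits.mean())

# toy embedding: letter-frequency vector (a stand-in for a real model)
def char_embed(text):
    v = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - ord("a")] += 1
    return v

pairs = [("cats purr", "the cat purrs"), ("dogs bark", "a dog barks")]
print(recall_at_1(char_embed, pairs))
```

Swapping `char_embed` for each candidate model and comparing scores on the same pairs gives a quick, domain-specific ranking before committing to one.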
@orvirt8385 · 21 days ago
Great video! What is fat-tailedness?
@ShawhinTalebi · 21 days ago
😉 ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Wcqt49dXtm8.htmlsi=E_R7A7IrkbAUVaOs
@TheRcfrias · 1 month ago
RAG is great for semi-static or static content as a knowledge base, but which path do you use for dynamic, time-relevant data, like current sales from a database?
@ShawhinTalebi · 1 month ago
That's a great question. The short answer is RAG can handle this sort of data (at least in principle). The longer answer involves taking a step back and asking oneself "why do I want to use RAG/LLMs/AI for this use case?" This helps get to the root of the problem you are trying to solve and hopefully give more clarity about potential solutions.
@TheRcfrias · 1 month ago
@ShawhinTalebi It's a common use case at work to know how sales have been improving during the current day or week. It would be nice to know how to link the LLM with the corporate database for current information and reporting.
@jjen9595 · 3 months ago
Hello, do you have a video showing how to make a dataset and upload it to Hugging Face?
@ShawhinTalebi · 2 months ago
Not currently, but the code to do that is available on GitHub: github.com/ShawhinT/RU-vid-Blog/blob/main/LLMs/qlora/create-dataset.ipynb
@cexploreful · 9 days ago
What do you mean by "not to scale"? Isn't the book the size of the earth?
@ShawhinTalebi · 7 days ago
LOL 😂
@yameen3448 · 9 days ago
Vector retrieval alone is quite poor. Trust me. To improve retrieval accuracy, you need to combine multiple methods.