What is RAG, or Retrieval-Augmented Generation?


RAG, or Retrieval-Augmented Generation, is a technique used in natural language processing (NLP) and machine learning. It combines two key components: retrieval of information and generation of text. The idea behind RAG is to enhance the generation of human-like text by first retrieving relevant documents or information from a large dataset, and then using this retrieved information to inform the generation process.
In a typical RAG setup, when a query or prompt is given, the system first searches a large database (such as Wikipedia or a similar corpus) for relevant information. That information is then passed to a language generation model, such as a transformer-based model like GPT (Generative Pre-trained Transformer), which uses it to generate a response informed by the retrieved data.
This approach allows the system to produce more accurate, relevant, and informed responses, especially for queries that require specific knowledge or factual information. RAG models are particularly useful in chatbots, question-answering systems, and other AI applications where providing contextually relevant and accurate information is crucial.
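The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular RAG library works: the three-sentence `CORPUS`, the bag-of-words cosine similarity used for retrieval, and every function name here are made up for the example, and `generate` is a hypothetical stand-in for a real language model call.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for a large database such as Wikipedia (illustrative only).
CORPUS = [
    "RAG combines retrieval of documents with text generation.",
    "spaCy is a Python library for natural language processing.",
    "Transformers are neural networks built around attention.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=1):
    """Step 1: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents):
    """Step 2: place the retrieved documents ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt):
    """Step 3: hypothetical stand-in for a language model; it only echoes the prompt."""
    return f"[model output would go here]\n{prompt}"


def answer(query, corpus=CORPUS, k=1):
    """Run the full retrieve -> prompt -> generate pipeline."""
    return generate(build_prompt(query, retrieve(query, corpus, k)))


if __name__ == "__main__":
    print(answer("What is spaCy?"))
```

In a real system, the retrieval step would more likely use embeddings and a vector index rather than word counts, and `generate` would call an actual model, whether hosted behind an API or run locally.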
If there's a specific video you would like to see or a tutorial series, let me know in the comments and I will try and make it.
If you liked this video, check out www.PythonHumanities.com, where I have Coding Exercises, Lessons, on-site Python shells where you can experiment with code, and a text version of the material discussed here.
You can follow me at:
/ wjb_mattingly

Science

Published:

3 Oct 2024


Comments: 10

@JH-jy1ye 5 months ago

Really good explanation, thank you

@gerardorafaelquirossandova3115 8 months ago

This was a very precise and easy-to-watch explanation video. Thanks!

@alexdavis9324 10 months ago

Great video! This was very interesting. I am going to try to add this to my workflow at some point.

@python-programming 10 months ago

Glad to hear it! My video next week will show you how to get started.

@halibrahim 10 months ago

What about data privacy when you share your documents with these open sources? Is there any solution for it?

@python-programming 10 months ago

This is a very good question. One solution is to use open-source models and host them locally. With this approach, you would not rely on any API and your data could sit and remain local. A good place for models is HuggingFace. This is something I intend to cover in this series on RAG.

@ArunKumar-bp5lo 10 months ago

great short precise idea thanks

@python-programming 10 months ago

Thanks so much! Glad you liked the video!

@sangramkesariray 10 months ago

With LLMs, what's the future of spaCy? It currently does NER using CNNs; is it going to switch to LLMs and prompt engineering?

@python-programming 10 months ago

This is a great question. I see spaCy remaining highly relevant for the foreseeable future. spaCy already has built-in components to work directly with LLMs via APIs for OpenAI, HuggingFace, etc. You can use them via spacy-llm (video coming soon!). Also, they have great recipes for their annotation software Prodigy. LLMs are really good for certain NER tasks. I have experimented with this extensively for months. They struggle in a few areas, namely consistency and cost. It is far cheaper, more consistent, and more reliable to use an LLM for assisting in annotations and training a smaller model that can be run locally. In the next year or so, I imagine we will see this change, especially as open-source LLMs become smaller and more affordable.