
RAG From Scratch: Part 2 (Indexing) 

LangChain
42K subscribers
22K views

This is the second video in our series on RAG. The aim of this series is to build up an understanding of RAG from scratch, starting with the basics of indexing, retrieval, and generation. This video focuses on indexing, covering the process of document loading, splitting, and embedding.
Code:
github.com/langchain-ai/rag-f...
Slides:
docs.google.com/presentation/...
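
For readers following along in text, here is a minimal sketch of the indexing steps the video covers (load, split, embed, store). Package names assume a recent LangChain release and the URL is only a placeholder; the code in the linked repo may differ:

    # pip install langchain-community langchain-text-splitters langchain-openai langchain-chroma beautifulsoup4
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_openai import OpenAIEmbeddings
    from langchain_chroma import Chroma

    # 1. Load: pull raw documents from a source (placeholder URL).
    docs = WebBaseLoader("https://example.com/some-article").load()

    # 2. Split: break the documents into overlapping chunks small enough to embed.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    splits = splitter.split_documents(docs)

    # 3. Embed and store: compute one embedding per chunk and index them in a vector store.
    vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

    # The store can now be queried by embedding a question and comparing it
    # against the stored chunk embeddings (covered in Part 3).
    retriever = vectorstore.as_retriever()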

Published: Jun 1, 2024

Comments: 11
@user-wj7ys7mk2d · 3 months ago
For the very first time in my mechanical engineering life, I think I learned something in detail about software engineering! Thank you!
@alexenax1109 · 3 months ago
Amazing series! Thank you Lance!
@hamzafarouk1538 · 3 months ago
Thanks. Nice video, short and clear. But why do you need to store the embeddings in a vector DB?
@Orcrambo · 3 months ago
For efficient similarity searches
@horyekhunley · 3 months ago
When performing RAG, you need to encode your input data into embeddings because that is what the LLM understands; it is from these embeddings that the model decodes the output and gives you the result you asked for. These embeddings need to be stored somewhere, and they can be small or very large. Some vector DBs are free, open source, and in-memory, like FAISS and Chroma; others are paid and hosted, like Pinecone.
@clivejefferies · 2 months ago
@horyekhunley Thanks for the insight.
@sepia_tone · 2 months ago
@horyekhunley The process is a little different. The LLM doesn't touch the embeddings. The embeddings are used to convert the documents to a form that can more quickly and accurately be compared to the question (which is also converted to an embedding). This is done by an embeddings model (in this example, an embeddings model from OpenAI, referred to as OpenAIEmbeddings() in the code, is used). These embeddings and their associated documents need to be stored somewhere (in this example, Chroma is used). This is the indexing phase. After the comparison between the embedding of the question and the stored documents, a subset of the documents with high similarity (in embedding space) to the question is given to the LLM. This is the retrieval phase. Finally, the LLM uses the returned documents and its own knowledge to reason and give an answer to the user. This is the generation phase.
@tongweiwang9775 · 1 month ago
@sepia_tone Still didn't quite get what the role of indexing is here. Relevant documents are retrieved based on the embeddings of the split documents and the embedding of the question. So what is indexing doing here?
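
To make the distinction discussed in this thread concrete, here is a rough sketch of how the index built in this video is used afterwards. The vector store, model name, and question are illustrative assumptions, continuing the setup sketched under the description; the index is what makes the similarity search over chunk embeddings fast:

    from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    from langchain_chroma import Chroma

    # Assume this Chroma collection was populated during the indexing phase.
    vectorstore = Chroma(embedding_function=OpenAIEmbeddings())

    question = "What does the indexing phase do in RAG?"

    # Retrieval: the question is embedded and compared against the stored chunk
    # embeddings; the pre-built index is what makes this search efficient.
    relevant_docs = vectorstore.similarity_search(question, k=4)

    # Generation: the retrieved chunks are handed to the LLM as context.
    context = "\n\n".join(doc.page_content for doc in relevant_docs)
    llm = ChatOpenAI(model="gpt-4o-mini")  # model name is just an example
    answer = llm.invoke(f"Answer using this context:\n{context}\n\nQuestion: {question}")
    print(answer.content)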
Up next
RAG From Scratch: Part 3 (Retrieval) · 5:14 · 15K views
What is Retrieval-Augmented Generation (RAG)? · 6:36 · 515K views
RAG From Scratch: Part 1 (Overview) · 5:13 · 57K views
LangGraph 101: it's better than LangChain · 32:26 · 42K views
RAG From Scratch: Part 4 (Generation) · 6:25 · 12K views