
Graph RAG: Improving RAG with Knowledge Graphs 

Prompt Engineering
162K subscribers · 13K views

Discover Microsoft’s groundbreaking GraphRAG, an open-source system combining knowledge graphs with Retrieval Augmented Generation to improve query-focused summarization. I’ll guide you through setting it up on your local machine, demonstrate its functions, and evaluate its cost implications.
LINKS:
Graph RAG: tinyurl.com/y3f9cbnd
GitHub: github.com/microsoft/graphrag
GraphRAG flowchart: tinyurl.com/2s4ytcur
RAG flowchart: tinyurl.com/56mwzkk7
Community Creation: tinyurl.com/54n88c2j
💻 RAG Beyond Basics Course:
prompt-s-site.thinkific.com/c...
Let's Connect:
🦾 Discord: / discord
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Patreon: / promptengineering
💼Consulting: calendly.com/engineerprompt/c...
📧 Business Contact: engineerprompt@gmail.com
Become Member: tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Sign up for the newsletter, localGPT:
tally.so/r/3y9bb0
00:00 Introduction to GraphRAG and Its Cost Issue
00:44 Understanding Traditional RAG
01:46 Limitations of Traditional RAG
02:22 Introduction to GraphRAG
02:39 Technical Details of GraphRAG
05:46 Setting Up GraphRAG on Your Local Machine
06:22 Running the Indexing Process
12:00 Running Queries with GraphRAG
14:26 Cost Implications and Alternatives
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...

Science

Published: 6 Jul 2024
Comments: 75
@roguesecurity · 3 days ago
Thanks for sharing this video. It was very informative. I would love to see a comparison between different Graph RAG solutions.
@aa-xn5hc · 3 days ago
Great. Please more on this topic
@user-wr4yl7tx3w · 3 days ago
This looks great. Yes, I would be interested in a comparison against LlamaIndex.
@VivekHaldar · 2 days ago
Love the focus on cost tradeoff. Thanks!
@kennethpinpin9053 · 2 days ago
Excellent concise introduction on GraphRAG and why, how, when it is needed. Please compare this with Neo4j and Llama Index.
@positivevibe142 · 1 day ago
Thanks for stating that Microsoft Graph RAG is limited in contextual understanding, I already noticed that but thought that I messed up somewhere in my setup/building. Excellent, informative, and precise details! Thanks again!
@danielshurman1061 · 3 days ago
Another great choice of important topics and video to show the current best implementations. Thank you!
@engineerprompt · 3 days ago
Thank you. Always glad to see your comments.
@Ayush-tl3ny · 3 days ago
Can you please tell us how we can change the embedding model to use an open-source embedding?
@robboerman9378 · 2 days ago
Very useful, also good to share the associated cost. It would be very interesting to see a video comparing the different GraphRAG implementations as you mentioned.
@AliKazemi-h1b · 2 days ago
Great job. Please publish more videos about GraphRAG with other competitors.
@CollinParan · 3 days ago
Interested in the Neo4J version
@andrewandreas5795 · 2 days ago
Great video! Could this be used with other models like Claude or maybe even OS models?
@jeffsteyn7174 · 2 days ago
Excellent. Going to subscribe for more info. Maybe I missed it, but it seems like there's no specialised DB used, like with Neo4j. So effectively we can use any data store?
@MPReilly2010 · 2 days ago
Excellent! Outstandingly thorough and clear details. Subscribing now.
@engineerprompt · 2 days ago
thank you!
@armans4494 · 3 days ago
Great video! Thank you
@paulmiller591 · 2 days ago
Great video and very timely. Please do more like these.
@engineerprompt · 2 days ago
thank you, planning on doing that :)
@DayLearningIT-hz5kj · 2 days ago
Please share more about this topic 👍👍 So the most expensive part is creating the graph? Do you think it really improves the accuracy of the response?
@DePhpBug · 2 days ago
Been wondering, especially about book.txt: let's say it's a .json object instead, will RAG work well with JSON when we feed that data in for embedding through the RAG process?
@ganesh_kumar · 2 days ago
Barring the cost, can we use it for a large number of docs, and will it have good accuracy? What about the time taken for the complete indexing? And latency?
@jekkleegrace · 3 days ago
Yes, please compare with LlamaIndex.
@henkhbit5748 · 2 days ago
Thanks. If we can use an open-source LLM then the cost will be less, or use Groq, right? Are there no examples using Python code instead of the command line?
@BACA01 · 3 days ago
Can we use it with VLLMs?
@mr.anonymus6110 · 2 days ago
Comparing them, and a short video showcasing it on a local LLM, would be nice for sure.
@tasfiulhedayet · 3 days ago
Interested to know about the difference between those different frameworks
@engineerprompt · 3 days ago
Will do.
@DEEPANMN · 16 hours ago
Is it possible to use a networkx graph library instead of an LLM-generated graph? I have a ready-made graph on my private dataset.
@lesptitsoiseaux · 2 days ago
How can I add metadata that can be valuable to use as entities? I don't mind buying your course if it helps my use case: I am building a recommendation engine for classes based off transcripts (multiple per class).
@gregsLyrics · 3 days ago
Always amazing vid and wisdom. I do wish there was a Star Trek Vulcan mind meld available. There is so much I do not understand, and I think I am jumping in way ahead of the learning curve, which makes this task that much more difficult. I have a vast amount of scanned PDFs that I think are perfect for this, except the "scanned" part. These documents are not machine-readable and I greatly fear OCR (garbage in, garbage out). Plus I have tens of thousands of pages of bank statements, credit card statements, investment statements, and tax returns that I need to analyze and tie to the data. Currently doing all of this manually with printed pages, post-it notes, and highlighters. Will your RAG course get me up to speed and teach a design process with the various tools that I will be able to grasp?
@engineerprompt · 3 days ago
The RAG course is focused on different techniques for building better RAG applications. In this course, I don't go too much into the ingestion part. For your problem, you might want to look into multimodal models like Claude or Gemini, where you can directly process these scanned pages, but the cost can add up quickly.
@BACA01 · 3 days ago
For this you'll need a RAG with local vision models VLLMs
@dcmumby · 3 days ago
Thanks. For what uses would traditional RAG be more effective? Would love to see a more detailed implementation, as well as using alternative providers. Also very interested in comparing the competition.
@engineerprompt · 3 days ago
In cases where you have a well-structured QA dataset, e.g. HR policies. In those cases you don't really need advanced techniques like knowledge graphs.
@JatinKashyap-Innovision · 2 days ago
Please create the comparison video. Thanks.
@KristianSchou1 · 3 days ago
Have you experimented with a custom document loader? I'm working on a RAG system at work, and I've found that the prebuilt loaders are severely lacking when it comes to handling text extraction from PDF files from "the real world". Would be nice to hear your thoughts.
@IamMarcusTurner · 3 days ago
LlamaParse from LlamaIndex could be a good choice. They noticed the same.
@BACA01 · 3 days ago
You would need a RAG with local VLLM.
@josephroman2690 · 1 day ago
Interested in the comparison video of all of them: Neo4j, LlamaIndex, and this one.
@awakenwithoutcoffee · 2 days ago
Question though: can GraphRAG work with multi-modal data? I would actually think why not, as a picture can be classified/linked to a specific node.
@engineerprompt · 2 days ago
I think it will be able to. Usually, for multimodal RAG, you use a vision model to generate descriptions for each image and preserve the links. Then you can embed the image descriptions just like you do for text.
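A minimal sketch of the pattern described in this reply: the vision-model description is what gets embedded and searched, while a metadata field preserves the link back to the raw image for retrieval time. The `index_image` helper and field names are hypothetical, and `embed` is a stub standing in for a real embedding model.

```python
def embed(text):
    # Stub embedding: a real system would call an embedding model here.
    return [float(sum(map(ord, text)) % 97)]

def index_image(image_id, description, image_uri, store):
    """Index an image by its text description; keep a link to the raw image."""
    store.append({
        "id": image_id,
        "text": description,            # what gets embedded and searched
        "vector": embed(description),
        "image_uri": image_uri,         # fetched separately at runtime
    })

store = []
index_image("fig-1", "A knowledge graph with entity communities.",
            "s3://bucket/fig1.png", store)
```

At query time you would search over the description vectors and, for any image hit, fetch `image_uri` from wherever the raw images live (object storage, a separate table, etc.), as the follow-up comment suggests.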
@awakenwithoutcoffee · 2 days ago
@engineerprompt Gotcha. How and where do you think the images would be stored/retrieved? Let's say we attach a summary of a picture + metadata about the page/chapter/subject/id: maybe we could store the images in a separate database and retrieve them at runtime. Cheers!
@aa-xn5hc · 2 days ago
What about Ollama?
@lavamonkeymc · 1 day ago
How did you make the flowchart with Sonnet?
@engineerprompt · 21 hours ago
I provided the paper to it and then asked it to create a visual representation of the proposed method. Usually, it takes a couple of iterations for it to do a good job.
@sebastianmanns5391 · 3 days ago
Would be super nice if you could compare the Microsoft versus LlamaIndex versus Neo4j implementations :)
@ilianos · 3 days ago
One thing that I don't understand is: why is it limited to only 300 tokens, when nearly all of the current best models have 128k context now?
@aaronhhill · 3 days ago
Yeah, Claude has a 200,000 token limit and Gemini now has a 2,000,000 token limit.
@engineerprompt · 3 days ago
During RAG, you have to chunk your documents into sub-documents. The idea is that you only want to use the parts of the document that are relevant to the user query, since each token costs money. You get to choose the chunk size. In the paper they used 300, but you can set it to any value you want. Larger chunks will incur more charges if you are using an API.
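A simplified sketch of that chunking step, using whitespace-split words in place of a real BPE tokenizer (GraphRAG counts model tokens, so real chunk boundaries will differ). The 300-token size mirrors the paper; the 100-token overlap is an illustrative value, not the author's setting.

```python
def chunk_tokens(text, chunk_size=300, overlap=100):
    """Split text into overlapping chunks of roughly chunk_size tokens.

    Whitespace tokens stand in for real model tokens here; swap in a
    tokenizer such as tiktoken for accurate token counts.
    """
    tokens = text.split()
    step = chunk_size - overlap          # how far the window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break                        # last window already covers the tail
    return chunks
```

Each chunk is then embedded (and, in GraphRAG, passed to the LLM for entity/relationship extraction), which is why chunk size directly drives API cost.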
@jefframpe5075 · 2 days ago
Greg Kamradt (@Dataindependent) has a good video on 5 levels of text splitting for RAG, which explains the trade-offs between different methods. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-8OJC21T2SL4.htmlsi=Go7eAYu0kkL_exiv
@GoldenkingGT101 · 3 days ago
Can it be done with Ollama?
@CollinParan · 3 days ago
He said you can in the video, by changing the endpoint in your settings file.
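As a sketch of what that settings change might look like: the key names below are based on GraphRAG's `settings.yaml` as of mid-2024, and Ollama's OpenAI-compatible endpoint is assumed at its default port. Verify both against the versions you are running.

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}          # any non-empty value for a local server
  type: openai_chat
  model: llama3                         # hypothetical local model name
  api_base: http://localhost:11434/v1   # Ollama's OpenAI-compatible endpoint
```

As noted elsewhere in the thread, the embeddings endpoint is a separate story and may not work the same way.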
@paulmiller591 · 2 days ago
Yes, he did mention that was an option.
@IslandDave007 · 3 days ago
It would be very interesting to see the results of this in global mode compared to dropping the whole book into the prompt of Gemini or Claude if it fit under their token limit. Obviously once you get beyond those limits RAG is required. Also would be great to see this run fully locally against a standard RAG local solution using the exact same LLM.
@unclecode · 2 days ago
You know, I wonder if you could check the cost breakdown and token consumption for the indexing and inference phases. How much of the seven dollars is for inference? If it's low, we could just index once, right? But seven dollars is huge, so it doesn't work. Also, I saw the tokens moved were 900,000+ context tokens and around 160,000 generated tokens. There's room for improvement. I don't think we need 900,000 context tokens for that question. Maybe those context tokens are from the initial prompts used for indexing. If that's the case, rerunning it and noting the costs for indexing vs. inference would be useful. By the way, great content as usual. The knowledge graph concept really clicked for me, unlike linear, flat naive RAG; it's a longstanding computer science idea. Its connection to ontology, which interests me, is amazing. Remember Stanford's WordNet and VerbNet? They were golden datasets for extracting relationships and similarities. Now, using large language models to generate knowledge bases from text lets us use those good old algorithms safely.
@engineerprompt · 2 days ago
Just checked: there were only 25 calls to the embedding API and about 570-ish calls to GPT-4o, so probably 90-95% of the cost is coming from the GPT-4o calls. That's during the index creation process, so this will be needed. In any case, I think this process runs only once to create the graph, and then the retrieval part is pretty much like normal RAG. I will try to run the indexing with something like Haiku, which will be much lower cost but hopefully still good enough at indexing. I agree with you. I think the LLM community needs to look back at some of the techniques built earlier for IR and reuse them; we will get much better performance out of these LLM-based applications.
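The arithmetic behind that breakdown can be checked directly. The token counts are the ~900K context / ~160K generated figures mentioned earlier in the thread; the per-million-token rates are assumed GPT-4o list prices at the time, so verify current pricing before relying on this.

```python
# Assumed GPT-4o list prices at the time ($ per 1M tokens) - verify current rates.
INPUT_RATE = 5.00 / 1_000_000
OUTPUT_RATE = 15.00 / 1_000_000

# Token counts reported in the thread for the indexing run.
context_tokens = 900_000
generated_tokens = 160_000

cost = context_tokens * INPUT_RATE + generated_tokens * OUTPUT_RATE
print(f"${cost:.2f}")  # lands close to the ~$7 figure discussed above
```

With almost all calls going to the chat model rather than the embedding API, swapping in a cheaper model for the extraction prompts is where the savings would come from.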
@unclecode · 2 days ago
@engineerprompt Interesting. In that case, a viable production-level solution is to fine-tune a small model (~1B) specifically for generating knowledge graphs, with different flavors for topics like medicine, math, and business. This would create accurate knowledge graphs at a much lower cost. It's a great business opportunity. Honestly, I never thought of naive RAG as a whole pipeline! To answer complex questions, creating a knowledge graph is essential; perfect for a startup. Secondly, if inference is like normal RAG, that's good. But we need to dig into the GraphRAG library to see how much context it fetches from the knowledge graph and injects into the prompt. If it's too much, we'll have an issue. It's about heuristically picking the minimal context needed to produce answers, so there's an optimization scenario here. Summary: use a small language model for generating the knowledge graph, and optimize how much context the GraphRAG library injects from the knowledge graph to answer questions. Please check the latter one if you find time.
@ilianos · 3 days ago
Regarding the cost implications (last part of the video): I think either a benchmark of different LLMs with GraphRAG, or some request router specialised for GraphRAG, would be suitable for picking the right LLM for the right kind of request.
@engineerprompt · 3 days ago
Agree, I might do a video comparing different models when it comes to cost. It will be an interesting comparison.
@bigqueries · 2 days ago
Please compare all the Graph RAG implementations.
@svenandreas5947 · 2 days ago
Ollama is not working for embeddings.
@engineerprompt · 2 days ago
Were you able to use the LLM via Ollama? I think for embeddings there is no standard API, so that's why it's not working.
@malikrumi1206 · 3 days ago
Why have knowledge graphs become all the rage? How is that any better than a traditional database? Note that PostgreSQL already has a vector store extension, and since it has been out for a while now, I would assume all the others do, too. Since a number of others are requesting a head-to-head comparison with LlamaIndex, why not include a comparison with a traditional RDBMS? In one version, the model uses an SQL tool to find (read) the query answer in Postgres, and in the second, it looks for vectors in pgvector?!
@mrchongnoi · 3 days ago
Thank you for the video, very good. Very expensive solution. It seems to me that this solution would not work for a company that has thousands of documents.
@engineerprompt · 3 days ago
Yes, in its current form it's going to be expensive. Now, if you were to replace GPT-4o with an open-weights model or a less expensive model, then it could be a different story, IF they provide the same level of accuracy.
@mohanmadhesiya3116 · 2 days ago
We want comparison videos.
@fintech1378 · 2 days ago
Cost will come down significantly in the near future; as long as it works, people should start initializing the workflow.
@j4cks0n94 · 3 days ago
That is SO expensive for just a book
@qicao7769 · 14 hours ago
After seeing the cost caused by GraphRAG, I am going to change the embedding settings to use my local LLM... too expensive.
@engineerprompt · 14 hours ago
Agreed, but you definitely want to watch my next video on the topic.
@marconeves9018 · 2 days ago
Are you not familiar with local hosting? Just wondering why you don't opt to showcase these tool integrations by going local instead of using paid APIs. You closed the video with the premise of this being more expensive, but that's greatly reduced if you're hosting it yourself.
@engineerprompt · 2 days ago
Check out the other videos on my channel :)