Another first class tutorial. Very generous of you to share your knowledge with us all. I'm sure you have a big list of tutorials that you have planned, but I thought I'd ask whether you were going to explore vision? It would be fantastic to upload images to a RAG system. It would also be amazing to upload videos to then question and retrieve sections from... a great tool for teaching others with extractions from videos. Again, thank you so much. Your tutorials are accessible by so many and you'll be hitting millions of subscribers in no time!
Thank you so much Jonathan, I appreciate the kind words a lot! I love your suggestion too and vision is definitely something I need to be covering for my content as well. I am planning on creating content around multi-modal RAG and things like that in the future!
Thanks Cole, awesome stuff. One of my favorite channels. I realized what makes a great channel that you watch every video from: one that focuses on one very specific thing of great value. Note to self. It seems like how different file types, or complex ones like Excel or Sheets, get ingested really matters for the output results you want? Would love to see a video on that for Sheets or xlsx specifically, to understand the different ways to process these, the reasons behind them, and how to think about this based on your goals. Honestly RAG is still really confusing given its limitations and when and how to use it (or not), especially for files like Sheets. For example, if I wanted to later extract and process a larger Excel file across multiple rows at once based on a column filter, perhaps to do an overall sheet analysis, how would that work? More complex Sheets RAG workflows would be really interesting, since they'd have a lot of value for the many people who use Sheets, so how to ensure the best results for bigger, more complex queries. Thanks, appreciate you.
Thank you very much Jarad, that means a lot! I love your thoughts here! This is exactly what I was getting at in the video when I said there are a million ways to handle CSVs/Excels especially and that I'll be covering that in a later video. What you mentioned about wanting to process a whole file based on a column filter to do overall sheet analysis is definitely something RAG wouldn't be good at since it does smaller lookups, not analysis over many chunks at once. But what you can do is have RAG combined with other tools (usually with Python code generation) to do this kind of thing to make a powerhouse of an agent! That's what I'll be making a video on in the future.
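To give a rough idea of what I mean by combining RAG with code generation, here's a minimal sketch of the kind of Python an agent could write and execute for that column-filter analysis (the file name and column names are hypothetical placeholders):

```python
# Minimal sketch of agent-generated analysis code (hypothetical file/columns).
# Requires pandas, plus openpyxl for .xlsx files.
import pandas as pd

# Load the whole sheet - full-file analysis, so no chunking/RAG lookup needed.
df = pd.read_excel("sales.xlsx")  # hypothetical file

# Filter on a column, then aggregate across all matching rows at once,
# which is exactly what RAG's chunk-by-chunk retrieval can't do.
west = df[df["region"] == "West"]  # hypothetical column and value
summary = west.groupby("product")["revenue"].agg(["sum", "mean", "count"])
print(summary)
```

The agent's job is really just to write the filter and aggregation lines from the user's question; the surrounding harness stays fixed.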
Thank you man! And I appreciate the suggestion a lot! I have a video on my channel that goes over frontends for n8n agents but I can certainly put out more content on it. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-JyolNYRbAcs.html
Awesome video! I would love to see the video specifically on CSV and Excel files, or even hooking into SQL databases… That is something I have been struggling with. Keep up the awesome content!
@ColeMedin What about using SQL to drive the accuracy up when searching your documents, to basically make the AI follow directions, then having Pinecone for few-shot vectors for the conversational part?
@@BirdManPhil I like where your head is at with this! That is certainly possible depending on the use case! One use case I am working on right now actually involves having the agent write SQL to perform calculations across a large amount of records that wouldn't work well with RAG. But the table structure is also simple enough where I know the agent won't mess up on the SQL generation.
Thank you so much! I was banging my head against the wall from some other tutorials and your video helped me get my first bot up and running. RAG, Asana, Atlassian Confluence, oh my! I'd love to see a slack video where you hook n8n up directly to slack without runbear. Or is that a terrible idea? :)
I'm so glad I could help!! My pleasure :) I am going to be doing more Slack integrations in the future! Runbear is great but there is always a time and place for custom implementations so I'll certainly cover that still.
Hello Cole, thank you for keeping up with these RAG videos. I am trying to do something similar for my work, but we have a lot of files on local shares and SharePoint. I am wondering if you know whether this would work for local files instead of cloud services like Google Drive or SharePoint?
Of course! And fantastic question! This would indeed work for local files. You can trigger the workflow when files are created or updated locally just like with Google Drive, and then processing them would be exactly the same. Check this n8n documentation out! docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
@@ColeMedin Hello Cole, thank you! I got PDF, Excel, and text working, but I have not found a way to get doc or docx files extracted. Do you know how this can be done?
Thanks Cole, always learning something from you! How would you handle this challenge: having a Google spreadsheet (not Excel), feeding it to the agent, and asking it questions that depend on the data, like what is the sum of the hours worked, etc. I guess since LLMs are awful as calculators, you'd have it convert the spreadsheet to an SQL database and make the AI write and execute the query, but I'd appreciate it if you could show the exact steps. Thanks!
Thanks Sergey, I'm really glad to hear that! I appreciate this question a lot because you're definitely hitting on one of the limitations of RAG. Since RAG is meant more for specific lookups, it doesn't handle larger file analysis well, like determining averages, sums, maximums, minimums, etc. over a large CSV/Excel file. However, you can create a BEAST of an agent by combining RAG with other tools (like Python code generation, or SQL generation if your table is in a database like you mentioned) so it can handle both lookups and calculating values across your files. I will be making a video on exactly that in the near future!
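To preview the exact steps Sergey asked about, here's a minimal sketch, assuming the sheet is downloaded as CSV first; the column name and the SQL are hypothetical stand-ins for what the LLM would generate:

```python
# Minimal sketch: load a Google Sheet (downloaded as CSV) into SQLite,
# then run SQL that an LLM generated from the user's question.
import sqlite3
import pandas as pd

df = pd.read_csv("timesheet.csv")  # hypothetical export of the Google Sheet

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
df.to_sql("timesheet", conn, index=False)

# Stand-in for LLM-generated SQL answering "what is the sum of hours worked?"
llm_sql = "SELECT SUM(hours_worked) FROM timesheet"  # hypothetical column
total = conn.execute(llm_sql).fetchone()[0]
print(f"Total hours worked: {total}")
```

This way the LLM never does the arithmetic itself; the database does, which sidesteps the "LLMs are awful as calculators" problem.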
Thanks, Cole. Really great videos that you do! 😊 @16:00 you mentioned ingesting documents like Excel or Google Sheets in different ways. Do you have a video on that yet? God bless you
Thank you very much - my pleasure :) I do not yet but I am planning to do this video pretty soon! There's a lot of experimentation I'm doing behind the scenes for this actually!
Great job! Can you make a video of a CRM agent that can manage clients? E.g. update status, search records and get information, assign tasks based on criteria. That would be amazing 😊
Thank you and good question! Both are great, I just used Qdrant in one of my other videos because it was a part of the local AI starter kit, so I wanted to leverage it for RAG to use the whole package. I'm a big fan of Supabase since I can use it both for the SQL DB for conversation history and for RAG, so it's a double whammy. But again, Qdrant is great for RAG too!
Thanks for the great video! I'm considering switching to Pinecone for the vector store to see if it improves similarity search accuracy, while keeping Supabase for PostgreSQL to manage chat memory. I’m also encountering out-of-memory errors on my self-hosted n8n whenever I try inserting large PDF files into Supabase’s vector store. Any thoughts or tips would be much appreciated!
Yeah that is certainly worth a try, especially if your knowledgebase is huge. Pinecone is fantastic, though PGVector should work just as well as long as you don't have millions of records. Out of memory errors probably means you'll need to upgrade your instance that is hosting n8n! How large are your PDFs?
Fantastic question! I'm using Claude 3.5 Sonnet in this video (not sticking to local in this case). Llama 3.1 8b doesn't do well with RAG specifically in n8n I should mention, since n8n does RAG as a tool and Llama 3.1 8b doesn't handle tool calling well.
Hi! I just discovered n8n due to a great interest in self-hosting. May I ask, is there a way to connect n8n to a front-end? Like, I don't need to go to n8n to access the chat and do things within it, but instead have a front end while my n8n is the back end, if that makes sense? If possible, is there such a thing already built that I just need to deploy? Thank you!! Super fun to learn about it, and thanks to Cole for most of what I know about n8n! Looking forward to more!
If you want to just use the chat feature, n8n gives the option to embed the chat widget on any website, or to use the chat directly at a publicly available URL too
Thank you very much and that is a great question! As @NishanthA32 pointed out, n8n gives you the ability to embed the chat widget on your website. Not the most customization but for basic use cases it works well. Otherwise, I made a video on my channel recently for how to use an n8n agent as an API endpoint for a frontend built with something like Streamlit or Next.js! ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-JyolNYRbAcs.html
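For the API-endpoint route, the frontend side can be as small as this sketch (the webhook URL and payload shape are hypothetical; yours depend on how you configure the Webhook node in your workflow):

```python
# Minimal sketch: a frontend/backend calling an n8n agent workflow
# exposed via a Webhook node (URL and payload fields are hypothetical).
import requests

resp = requests.post(
    "https://your-n8n-instance.com/webhook/invoke-agent",  # hypothetical URL
    json={"chatInput": "What are our vacation policies?", "sessionId": "user-123"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # the agent's reply, as returned by the workflow
```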
Before anything, thanks for the video, it helped me a lot! I would like to ask two questions: 1. When I delete a file from the folder, it keeps existing on Supabase; is that supposed to happen? 2. I'm getting "Error inserting: unsupported Unicode escape sequence 400 Bad Request" when uploading docx files; do you have any insight about it? Again, thank you for the video, you explain things very well.
Interestingly, I got accurate/complete results with less hallucination (AND saved prompts) after setting the Supabase Vector Store node to 'Get Many' instead of 'Retrieve Documents' (thus having to switch the agent to 'conversational agent'). Wondering if that's because the task I hand to the agent is quite complex (I had to split it into more than one agent, and token usage was skyrocketing while output quality plummeted).
Wow that is super interesting... thanks for sharing!! That's definitely counterintuitive that making that switch would help, but I'm glad it worked for you! Did you do anything custom with the prompt for the 'Get Many' node? Maybe something you did to set it up more custom (since the Retrieve Documents gives less control and 'Get Many' gives more) made the results better for your more complex use case.
@@ColeMedin Just "search for the docs on, or most related to this query: {{ $json.chatInput }} I wish I knew why Get Many is performing better lol... Since we're here: do you happen to know how can we 'filter' queries? I wish to add similar data but with different contexts inside the same supabase table
Thanks for this, you are awesome! Do you know of any content on the web that follows up on PDF analysis agents? The results are very bad for big documents with tables, like financial statements
Of course, thank you very much! I will be making more content specifically on processing PDFs (and Excel docs) in the future. But off the top of my head, there are tools out there like AWS Textract that are meant for working with a lot of complicated data in PDFs. I would check that out to start!
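As a rough starting point, here's a minimal Textract sketch. It assumes boto3 with AWS credentials already configured, and uses the synchronous API, which only handles single-page PDFs and images; multi-page PDFs need the async S3-based calls instead:

```python
# Minimal sketch: extract text (with table detection) from a one-page
# PDF/image using AWS Textract's synchronous API. Assumes AWS creds are set.
import boto3

textract = boto3.client("textract", region_name="us-east-1")

with open("financial_statement.pdf", "rb") as f:  # hypothetical file
    response = textract.analyze_document(
        Document={"Bytes": f.read()},
        FeatureTypes=["TABLES"],  # ask Textract to detect table structure
    )

# Print detected lines; TABLE/CELL blocks in the response carry the structure.
for block in response["Blocks"]:
    if block["BlockType"] == "LINE":
        print(block["Text"])
```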
Ah, this is what I was stuck on last time, the different file types in the data loader, thanks so much for clarifying that! About PDFs: let's say we are putting through appointment or meeting reminders, is it better to summarize with an AI and convert to text, or can it handle just dumping the whole PDF in there the way you're showing in the video?
You're so welcome! The point of RAG is to let you dump huge documents and still look up specific pieces of information within them. So to answer your question, I would just put the whole PDF in the vector DB instead of summarizing it first!
I cannot find the field "file_type" to drag into the Set File ID node. I only see mimeType. Is this equivalent, or did n8n change the functionality? I also cannot open your linked workflow: "not a json file". A helpful hint would be appreciated.
You can use mimeType, it is equivalent! Make sure when you import the JSON file you import from file, not URL! I checked and it is a valid JSON file I can import!
My node that should be inserting into the Supabase vector store isn't working... I'm getting no error from the node, the extracted input data is waiting at the input stage, the node settings are per the video, the SB project is set up, the 'documents' table exists, the credentials are right... but nothing gets inserted. It just says 'node successfully executed' without doing anything. Any ideas? Thanks :)
That's really weird! It must be inserting somewhere if you aren't getting an error. Do you have any other tables in your Supabase? Also I've had this happen a couple times where it seems nothing is getting inserted but then when I refresh my page the records are there.
Do you have any issues with the Google Drive trigger having trouble when you add a lot of documents to the watched folder at once? I find that it sometimes submits the same file that was triggered earlier, or it misses a lot of them. I see this trigger used in tutorials all over YouTube, so I'm either doing something wrong or most people only feed it a small number of documents at once. I'm curious if you've noticed anything similar.
I have had some issues in the past! I believe what it comes down to is if you upload many documents at once, the workflow will only trigger once but with multiple items going through the workflow at once. So you just have to modify what I have here to handle items in a loop within the workflow. Otherwise, if you're actually seeing the workflow entirely miss a trigger, it could potentially be an issue with rate limits for the actual Google Drive API and not because of n8n.
Interesting... it has worked for me before but I know once in a while it can have issues! Could be some sort of limitation with the Google Drive API as well, or a short-term rate limit being hit.
Thank you so much for this ❤. Question: how would you work with this if you are using something like a Notion database? And if, for instance, you are running a company and have multiple departments with different knowledge bases, are you adding them all to the same table? Or can you add them to different tables in Supabase? Will adding them to the same table mess with the quality of retrieval?
+1 Thanks for these questions, and thanks to Cole for his tutorials ❤. I would love to be able to use the same RAG agent, but from a Notion database, with the ability to update properties AND content blocks within database pages, all in sync with Supabase. I tried to create this workflow from Cole's previous tutorial, but it's much more complicated with Notion than with Google Docs.
My pleasure and great questions! To your first question, Notion has triggers in n8n for when a page is added or updated. So it should be very similar to Google Drive! I haven't used Notion much before so I'm not sure how exactly you would extract the text from a page to put it into Supabase, but I believe you could use the "Get Many Child Blocks" node to break the created/updated page into blocks and insert each of those into the vector database in a loop. For your second question - when you have multiple companies/departments you want to manage knowledge for, one easy solution is to use a separate table for each. That doesn't scale very well though. The better option that I typically go with is to set a piece of metadata on each record for the company/department the record is tied to. Then, when I query the knowledge base (assuming I know the company/department) I can do a metadata filter on that first and then only query on the subset of documents! So I would just take a dive into metadata filtering in general for this kind of use case!
@@ColeMedin Thank you for this response. I am not sure I get the answer though lol, on the Supabase one 😆. I will add your response to Claude to get more clarification
@@FarisandFarida Of course and I'm sorry! Hopefully Claude can help haha but yeah lmk if you want more clarification! I'd ask Claude about metadata filtering and it could give a really good breakdown for you!
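To make the metadata filtering idea a bit more concrete, here's a minimal sketch against the standard 'documents' table from the video (the connection string, department value, and embedding are all placeholders):

```python
# Minimal sketch: metadata filtering with pgvector in Supabase/Postgres.
# Assumes the default 'documents' table (content, metadata jsonb, embedding
# vector). Connection string and department value are hypothetical.
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@db.example.com:5432/postgres")
cur = conn.cursor()

# Stand-in for a real query embedding from your embedding model.
query_embedding = "[" + ",".join(["0.01"] * 1536) + "]"

# Narrow to one department's records first, THEN rank by vector similarity,
# so retrieval only ever looks at that department's knowledge base.
cur.execute(
    """
    SELECT content
    FROM documents
    WHERE metadata->>'department' = %s
    ORDER BY embedding <=> %s::vector
    LIMIT 5
    """,
    ("engineering", query_embedding),
)
for (content,) in cur.fetchall():
    print(content[:100])
```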
Good questions! I believe RAG with n8n doesn't support images by default without creating a more custom workflow to do something like extract text from an image and then store that. You could certainly do that though! Multi-modal RAG is something I'd have to look into more with n8n. Yes, you can use the OpenAI API in n8n!
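If you wanted to go the custom route for images, the extraction step could be as simple as this OCR sketch (assumes the Tesseract binary is installed locally along with the pytesseract and Pillow packages; the file name is hypothetical):

```python
# Minimal sketch: OCR text out of an image so it can go into a vector store.
# Requires the Tesseract binary plus the pytesseract and Pillow packages.
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open("diagram.png"))  # hypothetical
print(text)  # this text can then be chunked and embedded like any document
```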
Great vid. Does the PDF scanner extract the text from flattened PDF files? That was always the difficulty: in flattened PDFs the text is embedded in an image, so some OCR is involved. Thx
I would love to know your opinion about something I've been researching. What do you consider the best tool for the production stage when it comes to n8n automation flows? In other words, what would be the best approach to "orchestrating" (publishing/cloning/managing/monetizing) the n8n flows that you create for clients? BTW, thanks for putting out one of the top channels for straightforward and hands-on training in this field.
Thanks for the kind words! And great question! I've been thinking about this myself actually and I'm planning on just creating something custom to manage all of my n8n workflows for production. Just some sort of frontend that I can build really fast with tools like Bolt.new/Claude dev/v0 and then managing the orchestration of n8n workflows with FastAPI endpoints.
@@ColeMedin that's a good approach. It's weird how no one is building an all-in-one solution. I mean that's a pain point for many AI devs. Good luck in that project. And thanks for the great content.
I'm glad it sounds good to you! I've been putting a lot of thought recently into how to make it work super well. An end to end solution is definitely a big pain point for a lot of devs right now! Thank you and my pleasure!
Still working on this flow. May I ask: did you ever encounter a problem where the 'document retrieval' output/completion only goes up to around 300-500 tokens? I've tested it across the whole RAG token-use range, from 8k to 15k to 20k+ (by tweaking file limits)... my output always seems to be 'truncated' or limited to 300-400 tokens... any clue? Thanks already
Hmmm... well the documents are split into chunks when inserted into the vector DB, so maybe you are just seeing a single chunk and that is why it appears to be truncated? Or maybe I'm not quite understanding the issue you are encountering so if you could clarify that would be sweet!
@@ColeMedin so… It’s not that the retrieval was truncated (actually was pretty extensive/complete). But that my output tokens were consistently too short. Well, for now, breaking 1 big task into 2-3 smaller ones did the trick for me
Ohh so the actual response from the LLM was cut off? That's super weird! I haven't seen that happen before. I'm glad breaking it up worked well for you though! Maybe you were actually hitting the context limit with whatever LLM you were using.
Great question - it sure does! You just have to make sure the instance you are hosting n8n on is powerful enough to handle the number of concurrent requests you are looking for. But it doesn't take that much compute power so you should be good regardless.
@ColeMedin I ended up spinning up a 3-node Kubernetes cluster with a node balancer and persistent volumes. It's not the best at all, but it will scale, and that's what matters. I've got Traefik and Let's Encrypt all set up with PostgreSQL and pgvector for hybrid semantic search. n8n is in queue mode with the main node set to 2 replicas for now, a dedicated worker node that can spawn up to 5 replicas, and a webhook node that can spawn 2, with Redis handling all the inter-node communication, caching, and queueing. So all that works perfectly, but I can't figure out how to get other tools to properly ingress, and I still need at least Flowise and Langflow and a proper monitoring service working before I can really dive in. It's hard; I'm barely OK with Docker, and Kubernetes is like Docker on steroids from Thanos
Thank you! n8n is indeed open source itself, but yes I have a lot of content on my channel for LangChain already and will continue to put out more! As an example, I created a video recently for how to use LangChain and n8n together to create some really neat agents: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-8hAMASB-RpM.html
Why is installing Supabase so hard though? So many different parts, I’m stuck on launching the supabase-vector container, it just exits with code 78 and I’ve spent a night on it with no luck
It's really just because there is the vector DB that needs to be enabled and created and then there is the regular SQL DB for conversation history. In this video I go into more detail on getting it set up! ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-PEI_ePNNfJQ.html
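For reference, the core of what has to exist in Supabase is just the standard pgvector pieces; this sketch assumes 1536-dimensional OpenAI embeddings, and the same SQL can be pasted straight into the Supabase SQL editor instead of run from Python:

```python
# Minimal sketch of the pgvector setup the vector store needs (assumes
# 1536-dim OpenAI embeddings; connection string is a placeholder).
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@db.example.com:5432/postgres")
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute(
    """
    CREATE TABLE IF NOT EXISTS documents (
        id BIGSERIAL PRIMARY KEY,
        content TEXT,            -- the chunk text
        metadata JSONB,          -- file id, department, etc.
        embedding VECTOR(1536)   -- OpenAI embedding dimension
    );
    """
)
conn.commit()
```

The regular SQL table for conversation history is separate from this, which is the other part that trips people up.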
Hi Cole, another great video! You just solved my biggest struggle of the last 3 days. But now I have a new problem. How can I upload files from a specific folder in bulk? I have a folder in Drive with 122 PDF files. The current automation only allows me to upload one file at a time, and it's the same file every time. What am I missing?
Thank you very much - I'm glad I could help! If you want to upload an entire folder for RAG at once, you would want to create a separate folder for that. This workflow is meant to ingest files that are created/updated after the workflow is made active, not to go retroactively through a folder. You can create a workflow that uses the Google Drive "List files and folders" to get your list of your 122 PDF files and then loop through each one to extract the text and add into the RAG knowledge base similar to what I do in my workflow.
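If it helps, here's a rough sketch of that bulk-ingest loop, assuming the Drive folder is synced locally (e.g. with Google Drive for desktop); insert_chunk is a placeholder for however you write to your vector store (Supabase, Pinecone, etc.):

```python
# Rough sketch: one-time bulk ingest of a folder of PDFs into a RAG store.
# Assumes the Drive folder is synced to a local path; insert_chunk is a
# placeholder for your embedding + vector DB insert step.
from pathlib import Path
from pypdf import PdfReader

def insert_chunk(text: str, source: str) -> None:
    # Placeholder: embed `text` and insert it with metadata into your store.
    print(f"Would insert {len(text)} chars from {source}")

for pdf_path in Path("drive_folder").glob("*.pdf"):  # hypothetical folder
    reader = PdfReader(pdf_path)
    full_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    # Naive fixed-size chunking; a real setup would use smarter splitting.
    for i in range(0, len(full_text), 1000):
        insert_chunk(full_text[i : i + 1000], pdf_path.name)
```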
I'm struggling to understand the value of n8n over simple, well-proven utilities like pdftotext, pdfimages, xlsx2csv, csvtool, tesseract-ocr, goocr, rclone, syncthing, etc., glued together with simple scripts. I was hoping to find some AI that did all this with minimal building, but n8n still needs you to do the same heavy work, just with other tools.
You raise a fair point! You're definitely right that n8n is really just a way to piece a lot of tools together that you could do yourself with libraries and simple scripting. But for a lot of use cases and for a lot of people, n8n just makes that easier. Especially for those who don't know how to do that kind of scripting to piece things together themselves. I will say though that I quite enjoy piecing tools together with Python code myself!
Hey Cole, having an issue with my Supabase vector database. I created the DB in Supabase with the basic script they provide in the docs, with a different name from "documents", and I adapted everything else so I can call the new vector store. The retrieval only works sometimes; weirdly, when I change the model on the vector store tool to a different one, it starts working again! But then it stops working again. Super weird, any thoughts on why this is happening? Keep up the good content!
Thank you and wow that is super weird! I'm curious if restarting the conversation would also help the retrieval just like resetting the vector store. Because one thing I see happen a lot is when the AI thinks it doesn't know the answer to something, it gets stuck in the mindset within the same conversation that it doesn't know, even if you ask a different question. Happens a lot with the weaker models especially. Though I guess some clarification would help too! When you say it doesn't work what does that mean exactly?
@@ColeMedin thanks for the reply mate! So basically I got an error with the embedding node; it throws "Cannot read properties of undefined (reading 'replace')" and there is no more info on the error at all, just some generic rubbish. If I go to the logs in the "embeddings" part, there is no input for the LLM, so maybe the model is not able to query the vector store? But once I change the model, it works again 3-4 times, then eventually breaks again. I have changed the formatting, am using split code to markdown, and am using 4o instead of 4o-mini now and it's been working... but how can I put something in production with an error like this lol, and 4o is not cost effective so I would like to keep using 4o-mini hehe. I have opened a bug report on GitHub, but no response still; hopefully it will be resolved soon. Link - github.com/n8n-io/n8n/issues/11173 Cheers!
@@supergpt_tv That's really weird... I hope you get a reply with the bug report! I haven't seen this before so I'm not totally sure what it could be either.
@@tecnopadre Great question! The main difference is you can take this n8n agent and extend it/use it with other tools/frontends in a way you couldn't do with OpenAI. The extra customization is the main plus!
Hi Cole, love your content! I have an Alienware m18 R2 with an Intel i9-14900HX, NVIDIA RTX 4090 (24GB), 64GB RAM, and 8TB storage, but I struggle to run LLaMA 70B models. If you ever find the time, could you create a video for users like me on optimizing setups (8-bit quantization, mixed precision, etc.) to run large models efficiently? Your help would be greatly appreciated. Many Thanks!
Thank you very much! Yes, even for an awesome computer like yours you'll have to do 4/8-bit quantization to run 70b models efficiently. Running local LLMs and optimizing those setups (with quantization, fine-tuning, etc.) is something I will be putting a LOT more content out around in the near future! Actually working on building my next PC next week to run some more models!
@@ColeMedin Hey Cole! Thanks so much for your kind reply! More content in the near future would be awesome! But would it be safe to say that with the suggested fine-tuning it would be possible to get it running efficiently? If you don't mind, what type of performance could I expect when doing so?
@@theuniversityofthemind6347 Of course! And fine-tuning won't make the model run faster, but it makes any given LLM work better for your use case. So I meant more that you can use fine-tuning to make a smaller model (like 11b or 32b) work just as well for your use case as a 70b model! Not guaranteed of course, but with good data you certainly can.
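In the meantime, on the quantization side, here's a minimal sketch of 4-bit loading with Hugging Face transformers and bitsandbytes (the model ID is just an example and is gated on Hugging Face; a 70b model at 4-bit still wants roughly 35-40GB of memory, so expect layers to spill off a 24GB card):

```python
# Minimal sketch: load a model in 4-bit with transformers + bitsandbytes.
# device_map="auto" splits layers across GPU VRAM and CPU RAM as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-70B-Instruct"  # example; gated on HF

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spill layers to CPU RAM if VRAM isn't enough
)
```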
Great question!! I assume you meant Pinecone instead of Postgres since we are talking about vector DBs. Supabase with PGVector is the simplest and will perform just as well as Pinecone and Qdrant until you get to millions of records. Once your vector DB is absolutely massive, that's when it makes sense to go with a dedicated vector datastore like Pinecone or Qdrant. I'd choose Pinecone if you want something set up fast and Qdrant if you want to self-host since it is open source!
MS Word files come out as all gibberish when extracted. Only the .doc extension works a little, but it still has a lot of gibberish words. Can you check this with a proper document file please?
This setup just works with Google Docs files right now! For MS Word you might need a custom extract step to parse the format for those files specifically, something like the sketch below.
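For anyone hitting the same thing, here's a minimal sketch of such a custom extract step using the python-docx library; it handles .docx only, and legacy binary .doc files need converting first (e.g. with LibreOffice):

```python
# Minimal sketch: extract plain text from a .docx file with python-docx.
# Works for .docx only; legacy .doc needs conversion (e.g. via LibreOffice).
from docx import Document  # pip install python-docx

doc = Document("meeting_notes.docx")  # hypothetical file
text = "\n".join(paragraph.text for paragraph in doc.paragraphs)
print(text)  # this text can then go into the vector store like any document
```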