Another first class tutorial. Very generous of you to share your knowledge with us all. I'm sure you have a big list of tutorials that you have planned, but I thought I'd ask whether you were going to explore vision? It would be fantastic to upload images to a RAG system. It would also be amazing to upload videos to then question and retrieve sections from... a great tool for teaching others with extractions from videos. Again, thank you so much. Your tutorials are accessible by so many and you'll be hitting millions of subscribers in no time!
Thank you so much Jonathan, I appreciate the kind words a lot! I love your suggestion too and vision is definitely something I need to be covering for my content as well. I am planning on creating content around multi-modal RAG and things like that in the future!
Thanks Cole, awesome stuff. One of my favorite channels. I realized what makes a great channel that you watch every video from: one that focuses on one very specific thing of great value. Note to self. It seems like how different file types, or complex ones like Excel or Sheets, get ingested really matters for the output results you want? Would love to see a video on that for Sheets or xlsx specifically, to understand the different ways to process these, the reasons behind them, and how to think about this based on your goals. Honestly RAG is still really confusing given its limitations and when and how to use it (or not), especially for files like Sheets. For example, if I wanted to later extract and process a larger Excel file across multiple rows at once based on a column filter, perhaps to do an overall sheet analysis, how would that work? More complex Sheets RAG workflows would be really interesting, since they'd have a lot of value for the many people who use Sheets, so how to ensure the best results for bigger, more complex queries. Thanks, appreciate you.
Thank you very much Jarad, that means a lot! I love your thoughts here! This is exactly what I was getting at in the video when I said there are a million ways to handle CSVs/Excels especially and that I'll be covering that in a later video. What you mentioned about wanting to process a whole file based on a column filter to do overall sheet analysis is definitely something RAG wouldn't be good at since it does smaller lookups, not analysis over many chunks at once. But what you can do is have RAG combined with other tools (usually with Python code generation) to do this kind of thing to make a powerhouse of an agent! That's what I'll be making a video on in the future.
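To give a rough idea of what I mean by combining RAG with code generation, here's a minimal sketch of the kind of Python an agent could write and execute for that column-filter analysis (the file name and column names are hypothetical placeholders):

```python
# Minimal sketch of agent-generated analysis code (hypothetical file/columns).
# Requires pandas, plus openpyxl for .xlsx files.
import pandas as pd

# Load the whole sheet - full-file analysis, so no chunking/RAG lookup needed.
df = pd.read_excel("sales.xlsx")  # hypothetical file

# Filter on a column, then aggregate across all matching rows at once,
# which is exactly what RAG's chunk-by-chunk retrieval can't do.
west = df[df["region"] == "West"]  # hypothetical column and value
summary = west.groupby("product")["revenue"].agg(["sum", "mean", "count"])
print(summary)
```

The agent's job is really just to write the filter and aggregation lines from the user's question; the surrounding harness stays fixed.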
Thank you man! And I appreciate the suggestion a lot! I have a video on my channel that goes over frontends for n8n agents but I can certainly put out more content on it. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-JyolNYRbAcs.html
Awesome video! I would love to see the video specifically on CSV and Excel files, or even hooking into SQL databases… That is something I have been struggling with. Keep up the awesome content!
@ColeMedin What about using SQL to drive the accuracy up when searching your documents, to basically make the AI follow directions, then having Pinecone for few-shot vectors for the conversational part?
@@BirdManPhil I like where your head is at with this! That is certainly possible depending on the use case! One use case I am working on right now actually involves having the agent write SQL to perform calculations across a large amount of records that wouldn't work well with RAG. But the table structure is also simple enough where I know the agent won't mess up on the SQL generation.
Thank you so much! I was banging my head against the wall from some other tutorials and your video helped me get my first bot up and running. RAG, Asana, Atlassian Confluence, oh my! I'd love to see a slack video where you hook n8n up directly to slack without runbear. Or is that a terrible idea? :)
I'm so glad I could help!! My pleasure :) I am going to be doing more Slack integrations in the future! Runbear is great but there is always a time and place for custom implementations so I'll certainly cover that still.
Hello Cole, thank you for keeping up with these RAG videos. I am trying to do something similar for my work, but we have a lot of files on local shares and SharePoint. I am wondering if you know whether this would work for local files instead of cloud services like Google Drive or SharePoint?
Of course! And fantastic question! This would indeed work for local files. You can trigger the workflow when files are created or updated locally just like with Google Drive, and then processing them would be exactly the same. Check this n8n documentation out! docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger/
@@ColeMedin Hello Cole, thank you! I got PDF, Excel, and text working, but I have not found a way to get doc or docx files extracted. Do you know how this can be done?
Thanks Cole, always learning something from you! How would you handle this challenge: having a Google spreadsheet (not Excel), feeding it to the agent, and asking it questions that depend on the data, like what is the sum of the hours worked, etc. I guess since LLMs are awful as calculators, you'd have it convert the spreadsheet to an SQL database and make the AI write and execute the query, but I'd appreciate it if you could show the exact steps. Thanks!
Thanks Sergey, I'm really glad to hear that! I appreciate this question a lot because you're definitely hitting on one of the limitations of RAG. Since RAG is meant more for specific lookups, it doesn't handle larger file analysis well, like determining averages, sums, maximums, minimums, etc. over a large CSV/Excel file. However, you can create a BEAST of an agent by combining RAG with other tools (like Python code generation, or SQL generation if your table is in a database like you mentioned) so it can handle both lookups and calculating values across your files. I will be making a video on exactly that in the near future!
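To preview the exact steps Sergey asked about, here's a minimal sketch, assuming the sheet is downloaded as CSV first; the column name and the SQL are hypothetical stand-ins for what the LLM would generate:

```python
# Minimal sketch: load a Google Sheet (downloaded as CSV) into SQLite,
# then run SQL that an LLM generated from the user's question.
import sqlite3
import pandas as pd

df = pd.read_csv("timesheet.csv")  # hypothetical export of the Google Sheet

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
df.to_sql("timesheet", conn, index=False)

# Stand-in for LLM-generated SQL answering "what is the sum of hours worked?"
llm_sql = "SELECT SUM(hours_worked) FROM timesheet"  # hypothetical column
total = conn.execute(llm_sql).fetchone()[0]
print(f"Total hours worked: {total}")
```

This way the LLM never does the arithmetic itself; the database does, which sidesteps the "LLMs are awful as calculators" problem.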
Thanks, Cole. Really great videos that you do! 😊 @16:00 you mentioned ingesting documents like Excel or Google Sheets in different ways. Do you have a video on that yet? God bless you
Thank you very much - my pleasure :) I do not yet but I am planning to do this video pretty soon! There's a lot of experimentation I'm doing behind the scenes for this actually!
Great job! Can you make a video of a CRM agent that can manage clients? E.g. update status, search records and get information, assign tasks based on criteria. That would be amazing 😊
Thank you and good question! Both are great, I just used Qdrant in one of my other videos because it was a part of the local AI starter kit, so I wanted to leverage it for RAG to use the whole package. I'm a big fan of Supabase since I can use it both for the SQL DB for conversation history and for RAG, so it's a double whammy. But again, Qdrant is great for RAG too!
Thanks for the great video! I'm considering switching to Pinecone for the vector store to see if it improves similarity search accuracy, while keeping Supabase for PostgreSQL to manage chat memory. I’m also encountering out-of-memory errors on my self-hosted n8n whenever I try inserting large PDF files into Supabase’s vector store. Any thoughts or tips would be much appreciated!
Yeah that is certainly worth a try, especially if your knowledgebase is huge. Pinecone is fantastic, though PGVector should work just as well as long as you don't have millions of records. Out of memory errors probably means you'll need to upgrade your instance that is hosting n8n! How large are your PDFs?
Fantastic question! I'm using Claude 3.5 Sonnet in this video (not sticking to local in this case). Llama 3.1 8b doesn't do well with RAG specifically in n8n I should mention, since n8n does RAG as a tool and Llama 3.1 8b doesn't handle tool calling well.
Hi! I just discovered n8n due to a great interest in self-hosting. May I ask, is there a way to connect n8n to a front-end? Like, I don't need to go to n8n to access the chat and do things within it, but instead have a front end while my n8n is the back end, if that makes sense? If possible, is there such a thing already built that I just need to deploy? Thank you!! Super fun to learn about it, and thanks to Cole for most of what I know about n8n! Looking forward to more!
If you want to just use the chat feature, n8n gives the option to embed the chat widget on any website, or to use the chat directly at a publicly available URL too
Thank you very much and that is a great question! As @NishanthA32 pointed out, n8n gives you the ability to embed the chat widget on your website. Not the most customization but for basic use cases it works well. Otherwise, I made a video on my channel recently for how to use an n8n agent as an API endpoint for a frontend built with something like Streamlit or Next.js! ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-JyolNYRbAcs.html
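For the API-endpoint route, the frontend side can be as small as this sketch (the webhook URL and payload shape are hypothetical; yours depend on how you configure the Webhook node in your workflow):

```python
# Minimal sketch: a frontend/backend calling an n8n agent workflow
# exposed via a Webhook node (URL and payload fields are hypothetical).
import requests

resp = requests.post(
    "https://your-n8n-instance.com/webhook/invoke-agent",  # hypothetical URL
    json={"chatInput": "What are our vacation policies?", "sessionId": "user-123"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # the agent's reply, as returned by the workflow
```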
Before anything, thanks for the video, it helped me a lot! I would like to ask two questions: 1. When I delete a file from the folder, it keeps existing on Supabase; is that supposed to happen? 2. I'm getting "Error inserting: unsupported Unicode escape sequence 400 Bad Request" when uploading docx files; do you have any insight about it? Again, thank you for the video, you explain things very well.
Interestingly, I got accurate/complete results with less hallucination (AND saved prompts) after setting the Supabase Vector Store node to 'Get Many' instead of 'Retrieve Documents' (thus having to switch the agent to 'conversational agent'). Wondering if that's because the task I hand to the agent is quite complex (I had to split it into more than one agent, and token usage was skyrocketing while output quality plummeted).
Wow that is super interesting... thanks for sharing!! That's definitely counterintuitive that making that switch would help, but I'm glad it worked for you! Did you do anything custom with the prompt for the 'Get Many' node? Maybe something you did to set it up more custom (since the Retrieve Documents gives less control and 'Get Many' gives more) made the results better for your more complex use case.
@@ColeMedin Just "search for the docs on, or most related to this query: {{ $json.chatInput }} I wish I knew why Get Many is performing better lol... Since we're here: do you happen to know how can we 'filter' queries? I wish to add similar data but with different contexts inside the same supabase table
Thanks for this, you are awesome! Do you know of any content on the web that follows up on PDF analysis agents? The results are very bad for big documents with tables, like financial statements
Of course, thank you very much! I will be making more content specifically on processing PDFs (and Excel docs) in the future. But off the top of my head, there are tools out there like AWS Textract that are meant for working with a lot of complicated data in PDFs. I would check that out to start!
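As a rough starting point, here's a minimal Textract sketch. It assumes boto3 with AWS credentials already configured, and uses the synchronous API, which only handles single-page PDFs and images; multi-page PDFs need the async S3-based calls instead:

```python
# Minimal sketch: extract text (with table detection) from a one-page
# PDF/image using AWS Textract's synchronous API. Assumes AWS creds are set.
import boto3

textract = boto3.client("textract", region_name="us-east-1")

with open("financial_statement.pdf", "rb") as f:  # hypothetical file
    response = textract.analyze_document(
        Document={"Bytes": f.read()},
        FeatureTypes=["TABLES"],  # ask Textract to detect table structure
    )

# Print detected lines; TABLE/CELL blocks in the response carry the structure.
for block in response["Blocks"]:
    if block["BlockType"] == "LINE":
        print(block["Text"])
```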
Ah, this is what I was stuck on last time, the different file types in the data loader, thanks so much for clarifying that! About PDFs: let's say we are putting through appointment or meeting reminders, is it better to summarize with an AI and convert to text, or can it handle just dumping the whole PDF in there the way you're showing in the video?
You're so welcome! The point of RAG is to let you dump huge documents and still look up specific pieces of information within them. So to answer your question, I would just put the whole PDF in the vector DB instead of summarizing it first!
I cannot find the field "file_type" to drag into the Set File ID node. I only see mimeType. Is this equivalent, or did n8n change the functionality? I also cannot open your linked workflow: "not a json file". A helpful hint would be appreciated.
You can use mimeType, it is equivalent! Make sure when you import the JSON file you import from file, not URL! I checked and it is a valid JSON file I can import!
My node that should be inserting into the Supabase vector store isn't working... I'm getting no error from the node, the extracted input data is waiting at the input stage, the node settings are per the video, the SB project is set up, the 'documents' table exists, the credentials are right... but nothing gets inserted. It just says 'node successfully executed' without doing anything. Any ideas? Thanks :)
That's really weird! It must be inserting somewhere if you aren't getting an error. Do you have any other tables in your Supabase? Also I've had this happen a couple times where it seems nothing is getting inserted but then when I refresh my page the records are there.
Do you have any issues with the Google Drive trigger having trouble when you add a lot of documents to the watched folder at once? I find that it sometimes submits the same file that was triggered earlier, or it misses a lot of them. I see this trigger used in tutorials all over YouTube, so I'm either doing something wrong or most people only feed it a small number of documents at once. I'm curious if you've noticed anything similar.
I have had some issues in the past! I believe what it comes down to is if you upload many documents at once, the workflow will only trigger once but with multiple items going through the workflow at once. So you just have to modify what I have here to handle items in a loop within the workflow. Otherwise, if you're actually seeing the workflow entirely miss a trigger, it could potentially be an issue with rate limits for the actual Google Drive API and not because of n8n.
Interesting... it has worked for me before but I know once in a while it can have issues! Could be some sort of limitation with the Google Drive API as well, or a short-term rate limit being hit.
Thank you so much for this ❤. Question: how would you work with this if you are using something like a Notion database? And if, for instance, you are running a company and have multiple departments with different knowledge bases, are you adding them all to the same table? Or can you add them to different tables in Supabase? Will adding them to the same table mess with the quality of retrieval?
+1 Thanks for these questions, and thanks to Cole for his tutorials ❤. I would love to be able to use the same RAG agent, but from a Notion database, with the ability to update properties AND content blocks within database pages, all in sync with Supabase. I tried to create this workflow from Cole's previous tutorial, but it's much more complicated with Notion than with Google Docs.
My pleasure and great questions! To your first question, Notion has triggers in n8n for when a page is added or updated. So it should be very similar to Google Drive! I haven't used Notion much before so I'm not sure how exactly you would extract the text from a page to put it into Supabase, but I believe you could use the "Get Many Child Blocks" node to break the created/updated page into blocks and insert each of those into the vector database in a loop. For your second question - when you have multiple companies/departments you want to manage knowledge for, one easy solution is to use a separate table for each. That doesn't scale very well though. The better option that I typically go with is to set a piece of metadata on each record for the company/department the record is tied to. Then, when I query the knowledge base (assuming I know the company/department) I can do a metadata filter on that first and then only query on the subset of documents! So I would just take a dive into metadata filtering in general for this kind of use case!
@@ColeMedin Thank you for this response. I am not sure I get the answer though lol, on the Supabase one 😆. I will add your response to Claude to get more clarification
@@FarisandFarida Of course and I'm sorry! Hopefully Claude can help haha but yeah lmk if you want more clarification! I'd ask Claude about metadata filtering and it could give a really good breakdown for you!
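To make the metadata filtering idea a bit more concrete, here's a minimal sketch against the standard 'documents' table from the video (the connection string, department value, and embedding are all placeholders):

```python
# Minimal sketch: metadata filtering with pgvector in Supabase/Postgres.
# Assumes the default 'documents' table (content, metadata jsonb, embedding
# vector). Connection string and department value are hypothetical.
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@db.example.com:5432/postgres")
cur = conn.cursor()

# Stand-in for a real query embedding from your embedding model.
query_embedding = "[" + ",".join(["0.01"] * 1536) + "]"

# Narrow to one department's records first, THEN rank by vector similarity,
# so retrieval only ever looks at that department's knowledge base.
cur.execute(
    """
    SELECT content
    FROM documents
    WHERE metadata->>'department' = %s
    ORDER BY embedding <=> %s::vector
    LIMIT 5
    """,
    ("engineering", query_embedding),
)
for (content,) in cur.fetchall():
    print(content[:100])
```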
Good questions! I believe RAG with n8n doesn't support images by default without creating a more custom workflow to do something like extract text from an image and then store that. You could certainly do that though! Multi-modal RAG is something I'd have to look into more with n8n. Yes, you can use the OpenAI API in n8n!
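If you wanted to go the custom route for images, the extraction step could be as simple as this OCR sketch (assumes the Tesseract binary is installed locally along with the pytesseract and Pillow packages; the file name is hypothetical):

```python
# Minimal sketch: OCR text out of an image so it can go into a vector store.
# Requires the Tesseract binary plus the pytesseract and Pillow packages.
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open("diagram.png"))  # hypothetical
print(text)  # this text can then be chunked and embedded like any document
```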
Great vid. Does the PDF scanner extract the text from flattened PDF files? That was always the difficulty: in flattened PDFs the text is embedded in an image, so some OCR is involved. Thx
I would love to know your opinion about something I've been researching. What do you consider the best tool for the production stage when it comes to n8n automation flows? In other words, what would be the best approach to "orchestrating" (publishing/cloning/managing/monetizing) the n8n flows that you create for clients? BTW, thanks for putting out one of the top channels for straightforward and hands-on training in this field.
Thanks for the kind words! And great question! I've been thinking about this myself actually and I'm planning on just creating something custom to manage all of my n8n workflows for production. Just some sort of frontend that I can build really fast with tools like Bolt.new/Claude dev/v0 and then managing the orchestration of n8n workflows with FastAPI endpoints.
@@ColeMedin that's a good approach. It's weird how no one is building an all-in-one solution. I mean that's a pain point for many AI devs. Good luck in that project. And thanks for the great content.
I'm glad it sounds good to you! I've been putting a lot of thought recently into how to make it work super well. An end to end solution is definitely a big pain point for a lot of devs right now! Thank you and my pleasure!
Still working on this flow. May I ask: did you ever encounter a problem where the 'document retrieval' output/completion only goes up to around 300-500 tokens? I've tested it across the whole RAG token-use range, from 8k to 15k to 20k+ (by tweaking file limits)... my output always seems to be 'truncated' or limited to 300-400 tokens... any clue? Thanks already
Hmmm... well the documents are split into chunks when inserted into the vector DB, so maybe you are just seeing a single chunk and that is why it appears to be truncated? Or maybe I'm not quite understanding the issue you are encountering so if you could clarify that would be sweet!
@@ColeMedin so… It’s not that the retrieval was truncated (actually was pretty extensive/complete). But that my output tokens were consistently too short. Well, for now, breaking 1 big task into 2-3 smaller ones did the trick for me
Ohh so the actual response from the LLM was cut off? That's super weird! I haven't seen that happen before. I'm glad breaking it up worked well for you though! Maybe you were actually hitting the context limit with whatever LLM you were using.
Great question - it sure does! You just have to make sure the instance you are hosting n8n on is powerful enough to handle the number of concurrent requests you are looking for. But it doesn't take that much compute power so you should be good regardless.
@ColeMedin I ended up spinning up a 3-node Kubernetes cluster with a node balancer and persistent volumes. It's not the best at all, but it will scale, and that's what matters. I've got Traefik and Let's Encrypt all set up with PostgreSQL and pgvector for hybrid semantic search. n8n is in queue mode with the main node set to 2 replicas for now, a dedicated worker node that can spawn up to 5 replicas, and a webhook node that can spawn 2, with Redis handling all the inter-node communication, caching, and queueing. So all that works perfectly, but I can't figure out how to get other tools to properly ingress, and I still need at least Flowise and Langflow and a proper monitoring service working before I can really dive in. It's hard; I'm barely OK with Docker, and Kubernetes is like Docker on steroids from Thanos
Thank you! n8n is indeed open source itself, but yes I have a lot of content on my channel for LangChain already and will continue to put out more! As an example, I created a video recently for how to use LangChain and n8n together to create some really neat agents: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-8hAMASB-RpM.html
Why is installing Supabase so hard though? So many different parts, I’m stuck on launching the supabase-vector container, it just exits with code 78 and I’ve spent a night on it with no luck
It's really just because there is the vector DB that needs to be enabled and created and then there is the regular SQL DB for conversation history. In this video I go into more detail on getting it set up! ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-PEI_ePNNfJQ.html
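For reference, the core of what has to exist in Supabase is just the standard pgvector pieces; this sketch assumes 1536-dimensional OpenAI embeddings, and the same SQL can be pasted straight into the Supabase SQL editor instead of run from Python:

```python
# Minimal sketch of the pgvector setup the vector store needs (assumes
# 1536-dim OpenAI embeddings; connection string is a placeholder).
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@db.example.com:5432/postgres")
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute(
    """
    CREATE TABLE IF NOT EXISTS documents (
        id BIGSERIAL PRIMARY KEY,
        content TEXT,            -- the chunk text
        metadata JSONB,          -- file id, department, etc.
        embedding VECTOR(1536)   -- OpenAI embedding dimension
    );
    """
)
conn.commit()
```

The regular SQL table for conversation history is separate from this, which is the other part that trips people up.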
Hi Cole, another great video! You just solved my biggest struggle of the last 3 days. But now I have a new problem. How can I upload files from a specific folder in bulk? I have a folder in Drive with 122 PDF files. The current automation only allows me to upload one file at a time, and it's the same file every time. What am I missing?
Thank you very much - I'm glad I could help! If you want to upload an entire folder for RAG at once, you would want to create a separate folder for that. This workflow is meant to ingest files that are created/updated after the workflow is made active, not to go retroactively through a folder. You can create a workflow that uses the Google Drive "List files and folders" to get your list of your 122 PDF files and then loop through each one to extract the text and add into the RAG knowledge base similar to what I do in my workflow.
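If it helps, here's a rough sketch of that bulk-ingest loop, assuming the Drive folder is synced locally (e.g. with Google Drive for desktop); insert_chunk is a placeholder for however you write to your vector store (Supabase, Pinecone, etc.):

```python
# Rough sketch: one-time bulk ingest of a folder of PDFs into a RAG store.
# Assumes the Drive folder is synced to a local path; insert_chunk is a
# placeholder for your embedding + vector DB insert step.
from pathlib import Path
from pypdf import PdfReader

def insert_chunk(text: str, source: str) -> None:
    # Placeholder: embed `text` and insert it with metadata into your store.
    print(f"Would insert {len(text)} chars from {source}")

for pdf_path in Path("drive_folder").glob("*.pdf"):  # hypothetical folder
    reader = PdfReader(pdf_path)
    full_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    # Naive fixed-size chunking; a real setup would use smarter splitting.
    for i in range(0, len(full_text), 1000):
        insert_chunk(full_text[i : i + 1000], pdf_path.name)
```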
I'm struggling to understand the value of n8n over simple, well-proven utilities like pdftotext, pdfimages, xlsx2csv, csvtool, tesseract-ocr, goocr, rclone, syncthing, etc., glued together with simple scripts. I was hoping to find some AI that did all this with minimal building, but n8n still needs you to do the same heavy work, just with other tools.
You raise a fair point! You're definitely right that n8n is really just a way to piece a lot of tools together that you could do yourself with libraries and simple scripting. But for a lot of use cases and for a lot of people, n8n just makes that easier. Especially for those who don't know how to do that kind of scripting to piece things together themselves. I will say though that I quite enjoy piecing tools together with Python code myself!
Hey Cole, having an issue with my Supabase vector database. I created the DB in Supabase with the basic script they provide in the docs, with a different name from "documents", and I adapted everything else so I can call the new vector store. The retrieval only works sometimes; weirdly, when I change the model on the vector store tool to a different one, it starts working again! But then it stops working again. Super weird, any thoughts on why this is happening? Keep up the good content!
Thank you and wow that is super weird! I'm curious if restarting the conversation would also help the retrieval just like resetting the vector store. Because one thing I see happen a lot is when the AI thinks it doesn't know the answer to something, it gets stuck in the mindset within the same conversation that it doesn't know, even if you ask a different question. Happens a lot with the weaker models especially. Though I guess some clarification would help too! When you say it doesn't work what does that mean exactly?
@@ColeMedin thanks for the reply mate! So basically I got an error with the embedding node; it throws "Cannot read properties of undefined (reading 'replace')" and there is no more info on the error at all, just some generic rubbish. If I go to the logs in the "embeddings" part, there is no input for the LLM, so maybe the model is not able to query the vector store? But once I change the model, it works again 3-4 times, then eventually breaks again. I have changed the formatting, am using split code to markdown, and am using 4o instead of 4o-mini now and it's been working... but how can I put something in production with an error like this lol, and 4o is not cost effective so I would like to keep using 4o-mini hehe. I have opened a bug report on GitHub, but no response still; hopefully it will be resolved soon. Link - github.com/n8n-io/n8n/issues/11173 Cheers!
@@supergpt_tv That's really weird... I hope you get a reply with the bug report! I haven't seen this before so I'm not totally sure what it could be either.
@@tecnopadre Great question! The main difference is you can take this n8n agent and extend it/use it with other tools/frontends in a way you couldn't do with OpenAI. The extra customization is the main plus!
Hi Cole, love your content! I have an Alienware m18 R2 with an Intel i9-14900HX, NVIDIA RTX 4090 (24GB), 64GB RAM, and 8TB storage, but I struggle to run LLaMA 70B models. If you ever find the time, could you create a video for users like me on optimizing setups (8-bit quantization, mixed precision, etc.) to run large models efficiently? Your help would be greatly appreciated. Many Thanks!
Thank you very much! Yes, even for an awesome computer like yours you'll have to do 4/8-bit quantization to run 70b models efficiently. Running local LLMs and optimizing those setups (with quantization, fine-tuning, etc.) is something I will be putting a LOT more content out around in the near future! Actually working on building my next PC next week to run some more models!
@@ColeMedin Hey Cole! Thanks so much for your kind reply! More content in the near future would be awesome! But would it be safe to say that with the suggested fine-tuning it would be possible to get it running efficiently? If you don't mind, what type of performance could I expect when doing so?
@@theuniversityofthemind6347 Of course! And fine-tuning won't make the model run faster, but it makes any given LLM work better for your use case. So I meant more that you can use fine-tuning to make a smaller model (like 11b or 32b) work just as well for your use case as a 70b model! Not guaranteed of course, but with good data you certainly can.
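In the meantime, on the quantization side, here's a minimal sketch of 4-bit loading with Hugging Face transformers and bitsandbytes (the model ID is just an example and is gated on Hugging Face; a 70b model at 4-bit still wants roughly 35-40GB of memory, so expect layers to spill off a 24GB card):

```python
# Minimal sketch: load a model in 4-bit with transformers + bitsandbytes.
# device_map="auto" splits layers across GPU VRAM and CPU RAM as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-70B-Instruct"  # example; gated on HF

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spill layers to CPU RAM if VRAM isn't enough
)
```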
Great question!! I assume you meant Pinecone instead of Postgres since we are talking about vector DBs. Supabase with PGVector is the simplest and will perform just as well as Pinecone and Qdrant until you get to millions of records. Once your vector DB is absolutely massive, that's when it makes sense to go with a dedicated vector datastore like Pinecone or Qdrant. I'd choose Pinecone if you want something set up fast and Qdrant if you want to self-host since it is open source!
MS Word files come out as all gibberish when extracted. Only the .doc extension works a little, but it still has a lot of gibberish words. Can you check this with a proper document file please?
This setup just works with Google Docs files right now! For MS Word you might need a custom extract step to parse the format for those files specifically, something like the sketch below.
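For anyone hitting the same thing, here's a minimal sketch of such a custom extract step using the python-docx library; it handles .docx only, and legacy binary .doc files need converting first (e.g. with LibreOffice):

```python
# Minimal sketch: extract plain text from a .docx file with python-docx.
# Works for .docx only; legacy .doc needs conversion (e.g. via LibreOffice).
from docx import Document  # pip install python-docx

doc = Document("meeting_notes.docx")  # hypothetical file
text = "\n".join(paragraph.text for paragraph in doc.paragraphs)
print(text)  # this text can then go into the vector store like any document
```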