
How to chat with your PDFs using local Large Language Models [Ollama RAG] 

The How-To Guy
3.6K subscribers · 63K views

In this tutorial, we'll explore how to create a local RAG (Retrieval Augmented Generation) pipeline that processes and allows you to chat with your PDF file(s) using Ollama and LangChain!
✅ We'll start by loading a PDF file using the "UnstructuredPDFLoader"
✅ Then, we'll split the loaded PDF data into chunks using the "RecursiveCharacterTextSplitter"
✅ Create embeddings of the chunks using "OllamaEmbeddings"
✅ We'll then use the "from_documents" method of "Chroma" to create a new vector database, passing in the chunks and Ollama embeddings
✅ Finally, we'll answer questions about the PDF document using the "chain.invoke" method, providing a question as input
The model will retrieve relevant context from the vector database, generate an answer based on the context and question, and return the parsed output.
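The steps above can be sketched end to end. The following is a dependency-free toy illustration of the chunk → embed → retrieve flow only; the actual tutorial uses "UnstructuredPDFLoader", "RecursiveCharacterTextSplitter", "OllamaEmbeddings", and "Chroma", whereas the naive chunker and bag-of-words "embedding" below are simplified stand-ins, and all names and values are illustrative:

```python
# Toy, dependency-free sketch of the retrieve step in a RAG pipeline.
# The chunker stands in for RecursiveCharacterTextSplitter, the bag-of-words
# counter stands in for OllamaEmbeddings, and the sorted lookup stands in
# for a Chroma similarity search.
from collections import Counter
import math

def chunk(text, size=100, overlap=20):
    """Naive fixed-size character chunker with overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    """Bag-of-words 'embedding': token -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("Ollama runs large language models locally. "
       "Chroma stores vector embeddings on disk. "
       "LangChain chains loaders, splitters, and retrievers together.")
chunks = chunk(doc, size=60, overlap=10)
context = retrieve("Where are vector embeddings stored?", chunks, k=1)
# In the real pipeline, the retrieved context plus the question is then
# passed to the local LLM, and the answer is parsed and returned.
```

The point of the sketch is the control flow: every question is embedded the same way as the chunks, and "retrieval" is just a nearest-neighbor lookup in embedding space.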
TIMESTAMPS:
============
0:00 - Introduction
0:07 - Why you need to use local RAG
0:52 - Local PDF RAG pipeline flowchart
5:49 - Ingesting PDF file for RAG pipeline
8:46 - Creating vector embeddings from PDF and store in ChromaDB
14:07 - Chatting with PDF using Ollama RAG
20:03 - Summary of the RAG project
22:33 - Conclusion and outro
LINKS:
=====
🔗 GitHub repo: github.com/tonykipkemboi/olla...
Follow me on socials:
𝕏 → / tonykipkemboi
LinkedIn → / tonykipkemboi
#ollama #langchain #vectordatabase #pdf #nlp #machinelearning #ai #llm #RAG

Category: Science

Published: Jun 29, 2024

Comments: 311
@levi4328 · 2 months ago
I'm a medical researcher and, surprisingly, my life is all about PDFs I don't have any time to read, let alone learn the basics of code. And I think there are a lot of people in the same boat as me. Unfortunately, it's very hard to actually find an AI tool that's even barely reliable. Most of YouTube is swamped with sponsors for AI magnates trying to sell their rebranded, redundant, worthless AI-thingy for a monthly subscription, or an unjustifiably costly API that follows the same premise. The fact that you, the only one that came closest to what I actually need - and a very legitimate need - is a channel with
@tonykipkemboi · 2 months ago
Thank you so much for sharing the pain points you're experiencing and the solution you're seeking. I'd like to be more helpful to you and many more like you as well. I have an idea of creating a UI using Streamlit for the code in this tutorial, with a step-by-step explanation of how to get it running on your system. You will essentially clone the repository, install Ollama and pull any models you like, install the dependencies, then run Streamlit. You'll then be able to upload PDFs in the Streamlit app and chat with them in a chatbot-like interface. Let me know if this would be helpful. Thanks again for your feedback.
@ilyassemssaad9012 · 2 months ago
Hey, hit me up and I'll give you my RAG that supports multiple PDFs, and you can choose the LLM you desire to use.
@Aberger789 · 2 months ago
I'm in the space as well, and am trying to find the best way to parse PDFs. I've set up Grobid on Docker and tried that out. My work laptop is a bit garbage, and being in the world's largest bureaucracy, procuring hardware is a pain. Anyway, great video.
@kumarmanchoju1129 · 2 months ago
Use Nvidia RTX Chat for PDF summarizing and querying. Purchase a cheap RTX card with a minimum of 8GB VRAM.
@InnocentiusLacrimosa · 2 months ago
@tonykipkemboi I think most people are in pain now with just this part: "upload PDFs to service X". This is what they want/have to avoid. Anyhow, nice video you made here.
@davidtindell950 · 2 days ago
Thank you. I have done several similar projects and I learn something new about 'local RAG' with each one!
@deldridg · 1 month ago
Thank you for this excellent intro. You are a natural teacher of complex knowledge and this has certainly fast-tracked my understanding. I'm sure you will go far and now you have a new subscriber in Australia. Cheers and thank you - David
@tonykipkemboi · 1 month ago
Glad to hear you found the content useful and thank you 🙏 😊
@claussa · 2 months ago
Welcome to my special list of channels I subscribe to. Looking forward to you making me smarter 😊
@tonykipkemboi · 2 months ago
Thank you for that honor! I'm glad to be on your list and will do my best to deliver more awesome content! 🙏
@thiagobutignonclaramunt410 · 1 month ago
You are an awesome teacher; thank you so much for explaining this in a clean and objective way :)
@tonykipkemboi · 1 month ago
🙏
@ISK_VAGR · 2 months ago
Congrats man. Really useful content. Well explained and effective.
@tonykipkemboi · 2 months ago
Thank you, @ISK_VAGR! 🙌
@gptOdyssey · 2 months ago
Clear instruction, excellent tutorial. Thank you Tony!
@tonykipkemboi · 2 months ago
Thank you for the feedback and glad you liked it! 😊
@Oiseaux_rebelle · 1 month ago
You're welcome Ezekiel!
@n0madc0re · 2 months ago
This was super clear, extremely informative, and spot on with the exact answers I was looking for. Thank you so much.
@tonykipkemboi · 2 months ago
Glad you found it useful and thank you for the feedback!
@HR31.1.1 · 2 months ago
Dope video man! Keep them coming
@tonykipkemboi · 2 months ago
Appreciate it!!
@johnlunsford5868 · 2 months ago
Top-tier information here. Thank you!
@tonykipkemboi · 2 months ago
🙏
@Reddington27 · 2 months ago
That's a pretty clean explanation. Looking forward to more videos.
@tonykipkemboi · 2 months ago
Thank you! Glad you like the delivery. I got some more cooking 🧑‍🍳
@chrisogonas · 2 months ago
Simple and well illustrated, Arap Kemboi 👍🏾👍🏾👍🏾
@tonykipkemboi · 2 months ago
Asante sana bro! 🙏
@aloveofsurf · 2 months ago
This is a fun and potent project. This provides access to a powerful space. Peace be on you.
@tonykipkemboi · 2 months ago
Thank you and glad you like it!
@Mind6 · 2 months ago
Very helpful! Great video! 👍
@tonykipkemboi · 2 months ago
🙏❤️
@DynamicMolecules · 1 month ago
Thanks for this amazing tutorial on building a local LLM. I applied it to my research paper PDFs, and the results are impressive.
@tonykipkemboi · 1 month ago
Awesome 🤩 Love to hear that! Did you experiment without using the MultiQueryRetriever in the tutorial to see the difference?
@DynamicMolecules · 28 days ago
@tonykipkemboi That's an interesting question. I tried it and found that MultiQueryRetriever works well in general, when the LLM needs to connect indirect information from the document, but fails to provide relevant information for direct information present in the document. But this observation could differ case to case.
@DaveJ6515 · 2 months ago
Very good! Easy to understand, easy to try, expandable ....
@tonykipkemboi · 2 months ago
Awesome! Great to hear.
@DaveJ6515 · 2 months ago
@tonykipkemboi You deserve it. Too many LLM YouTubers are more concerned with showing a lot of things than with making them easy to understand and reproduce. Keep up the great work!
@SimpleInformationINC · 1 month ago
Nice job, thanks Tony!
@tonykipkemboi · 1 month ago
🙏
@teddyperera8531 · 1 month ago
This is a great tutorial. Thank you
@tonykipkemboi · 1 month ago
🙏
@VairalKE · 2 months ago
Good to see fellow Kenyans on AI. Perhaps the Ollama WebUI approach would be easier for beginners as one can attach a document, even several documents to the prompt and chat.
@tonykipkemboi · 2 months ago
🙏 Yes, actually working on a Streamlit UI for this
@grizzle2015 · 2 months ago
Thanks man, this is extremely helpful!
@tonykipkemboi · 2 months ago
🙏🫡
@notoriousmoy · 2 months ago
Great job
@tonykipkemboi · 2 months ago
Thank you! 🙏
@Marduk477 · 2 months ago
Really useful content and well explained. It would be interesting to see a video with different types of files, not only PDFs - for example Markdown, PDF, and CSV all at once.
@tonykipkemboi · 2 months ago
Thank you! I have this in my content pipeline.
@Marques2025 · 1 month ago
Useful tip: use proper WiFi, not a mobile hotspot, while pulling the model from Ollama. I hit an error with that; hope it helps someone 😊
@nagireddygajjela5430 · 15 days ago
Thank you for sharing good content
@tonykipkemboi · 15 days ago
🙏
@essiebx · 1 month ago
Thanks for this, Tony.
@tonykipkemboi · 1 month ago
🙏
@vineethnj8744 · 1 month ago
Good one, Good luck🤞
@tonykipkemboi · 29 days ago
Thanks ✌️
@ThinAirElon · 1 month ago
Super!
@tonykipkemboi · 1 month ago
🙏
@franciscoj.moyaortiz7025 · 1 month ago
Awesome content! New sub.
@tonykipkemboi · 1 month ago
Thank you! 🙏
@iceiceisaac · 2 months ago
so cool!
@tonykipkemboi · 2 months ago
Thank you 🙏
@ninadbaruah1304 · 2 months ago
Good video 👍👍👍
@tonykipkemboi · 2 months ago
@Nyx-bm5be · 8 days ago
Wonderful tutorial, man! Let me ask you: what other kinds of prompts can we use? Also, is it normal for the RAG to answer questions about things not in the PDF that was loaded? For example, I tested with the prompt "what is a dog" and got an answer back. Is it because of the RAG and Ollama? Thanks a bunch.
@attaboyabhi · 22 days ago
nicely done
@tonykipkemboi · 20 days ago
Thank you 😊
@rockefeller7853 · 2 months ago
Thanks for the share. Quite enlightening. I will definitely build upon that. Here is the problem I have: let's say I have two documents and I want to chat with both at the same time (for instance, to extract conflicting points between the two). What would you advise here?
@tonykipkemboi · 2 months ago
Thank you! That's an interesting use case for sure. My instinct, before looking up some solutions, is to maybe create 2 separate collections for the files, then retrieve them separately and chat with them for comparison. I'm sure my suggestion might not be efficient at all. I will do some digging and share any info I find.
@Joy_jester · 2 months ago
Can you make one video of RAG using Agents? Great video btw. Thanks
@tonykipkemboi · 2 months ago
Sure thing. I actually have this in my list of upcoming videos. Agentic RAG is pretty cool right now and will play with it and share a video tutorial. Thanks again for your feedback.
@metaphyzxx · 2 months ago
I was planning on doing this as a project. If you beat me to it, I can compare notes
@kainew · 5 days ago
Your video is excellent; you gained a subscriber! I'm looking to move all of my more than 500 project documentation files into a GPT to help resolve support issues and answer questions from auxiliary teams. I can see this being exactly what I needed. Do you know someone who is trying to approach project documentation with LLMs templates? Thank you, big hug from Brazil!
@tonykipkemboi · 5 days ago
So glad you found it helpful, and thank you for subscribing as well! 💜 Can you expand more on the "documentation with LLMs templates" part?
@angadbandal3844 · 1 month ago
Very detailed explanation, thanks. Can you please make the same project give responses in multiple languages and with voice output?
@tonykipkemboi · 1 month ago
Thank you. Yes, that would be cool. I can see the challenge being finding an open-source model that is good at multiple languages; the ones I used are not great at all. For voice, it'd probably be easy to use an open-source TTS, or be more granular and use 11labs for better quality, in spite of it not being local.
@rmperine · 1 month ago
Great delivery of material. How about a video on fine-tuning llama3 using your own curated dataset? There are some out there, but your teaching style is very good.
@tonykipkemboi · 1 month ago
Thank you and that's a great suggestion! I'll add that to my list.
@georgerobbins5560 · 2 months ago
Nice
@tonykipkemboi · 2 months ago
Thank you!
@garthcase1829 · 2 months ago
Great job. Does the file you chat with have to be a PDF or can it be a CSV or other structured file type?
@tonykipkemboi · 2 months ago
🙏 thank you. I'm actually working on a video for RAG over CSV. The demo in this tutorial will not work for CSV or structured data; we need a better loader for structured data.
@scrollsofvipin · 2 months ago
What GPU do you use? I have Ollama running on an Intel i5 with integrated graphics, so I'm unable to use any 3B+ models. TinyLlama and TinyDolphin work, but the accuracy is way off.
@tonykipkemboi · 2 months ago
I have an Apple M2 with 16GB of memory. I noticed that larger models slow down my system and sometimes force a shutdown of everything. One way around it is deleting other models you're not using.
@stanTrX · 2 months ago
Thanks. Can you please explain one by one and slowly, especially the RAG part?
@tonykipkemboi · 2 months ago
Thanks for asking. Which part of the RAG pipeline?
@xrlearn · 2 months ago
Thanks for sharing this - very helpful. Also, what are you using for screen recording and editing this video? I see that it records the section where your mouse cursor is! Nice video work as well. Only suggestion is to increase the gain on your audio.
@tonykipkemboi · 2 months ago
I'm glad you find it very helpful. I'm using Screen Studio (screen.studio) for recording; it's awesome! Thank you so much for the feedback as well. I actually reduced it during editing thinking it was too loud haha. I will make sure to readjust next time.
@xrlearn · 2 months ago
@@tonykipkemboi Btw, can you see those 5 questions that it generated before summarizing the document?
@tonykipkemboi · 2 months ago
@@xrlearn, I'm sure I can. I will try printing them out and share them here with you tomorrow.
@tonykipkemboi · 2 months ago
Hi @xrlearn - Found a way to print the 5 questions using `logging`. Here's the code you can use to print out the 5 questions:

```python
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

unique_docs = retriever.get_relevant_documents(query=question)
len(unique_docs)
```

Here are more detailed docs from LangChain that will help: python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever/
@wah866sky7 · 28 days ago
Thanks a lot! If we have a mix of multiple PDF, Word, or Excel files, how can we change the RAG to support retrieving from them?
@tonykipkemboi · 27 days ago
Glad you found it helpful. For different file types, you would consider loading/parsing and chunking strategies that fit those data types. I'm working on the next video, in which I will go over CSV & Excel RAG.
@guanjwcn · 2 months ago
Thanks. Btw, how did you make your YouTube profile photo? It looks very nice.
@tonykipkemboi · 2 months ago
Thank you! 😊 I used some AI avatar generator website that I forgot but I will find it and let you know.
@guanjwcn · 2 months ago
Thank you
@enochfoss8993 · 1 month ago
Great video! Thanks for sharing. I ran into an issue with a Chroma dependency on SQLite3 (i.e. RuntimeError: Your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0). The suggested solutions are not working. Is it possible to use another DB in place of Chroma?
@tonykipkemboi · 1 month ago
Thank you! Yes, you can swap it with any other open-source vector database. You might also try using a more recent version of Python, which should come with a newer version of SQLite. Do you know what version you are using now? You can also try installing the binary version in the notebook like so: `!pip install pysqlite3-binary`
@aaaguado · 1 month ago
Hello friend, thank you very much for your content. I have a question: how can I make it listen to my server within Google Colab so I don't have to use Jupyter, since my resources are a bit limited?
@Srb0002 · 1 month ago
Good explanation. Could you please make a video on PDFs that have images and tables in them - how would we extract, store, and run RAG on images, tables, and text using open-source models?
@tonykipkemboi · 1 month ago
This is a good topic to explore. I might just create another video diving deeper into pdf types and how to extract and use multimodal elements.
@farexBaby-ur8ns · 2 months ago
Good one. OK, you touched on security - you have here something that doesn't let things flow out to the internet. I saw a bunch of videos about tapping data from DBs using SQL agents, but none said anything specific about security. So the question: does using SQL agents violate data security?
@tonykipkemboi · 2 months ago
You bring up a critical point and question. Yes, I believe most agentic workflows currently, especially tutorials, lack proper security and access moderation. This is a growing and evolving portion of agentic frameworks + observability, IMO. I like to think of it as people needing special access to databases at work and someone managing roles and the scope of access. So agents will need some form of that management as well.
@carolinefrasca · 2 months ago
🤩🤩
@AnkitSingh-xc8em · 19 days ago
Appreciate your work. Wanted to know: can I use it for confidential PDFs? Is there any chance of a data leak?
@tonykipkemboi · 19 days ago
Thank you for the kind words. Yes, if you use Ollama models like we did in the video, then your content will stay private and not be sent to any online service. To be sure, I'd recommend turning off your WiFi or any connection once you've loaded all the dependencies and imports. You can then run the cells to load your PDF into a vector db and chat with it. After you're done, you can delete the collection where you saved the vectors of your PDF before turning your connection back on. This is an extra measure to give you peace of mind.
@thealwayssmileguy9060 · 2 months ago
Would love it if you can make the Streamlit app! I am still struggling to make a Streamlit app based on open-source LLMs.
@tonykipkemboi · 2 months ago
Thank you! Yes, I'm working on a Streamlit RAG app. I have released a video on Ollama + Streamlit UI that you can start with in the meantime.
@thealwayssmileguy9060 · 2 months ago
@@tonykipkemboi thanks bro! I will defo watch👌
@Stellasogks · 29 days ago
Are the libraries you used (LangChain, ChromaDB, ...) open source? And can we use any Ollama model?
@tonykipkemboi · 29 days ago
yes and yes
@TheShreyas10 · 10 days ago
Quite interesting, and thanks for sharing it. Can you let me know if this would run on a Core i7 with 32GB of RAM (CPU only)? Considering you are using the Mistral model.
@tonykipkemboi · 10 days ago
Thank you. Yes that should be sufficient to run the program.
@kiranshashiny · 10 days ago
Nice video, and very informative. My question: I had downloaded LLMs like gemma, llama2, llama3, and so on on my macOS machine, but due to a technical issue I deleted them (e.g. $ ollama rm llama2). Now I want them again, and noticed that if I run "$ ollama run llama3", this downloads the entire 4.7GB from the internet all over again. Is it possible to keep them downloaded somewhere, run "$ ollama run" when I want one, and delete it later when not needed? Thanks in advance; I would appreciate a response.
@tonykipkemboi · 10 days ago
Thank you. What you did earlier is the standard way of downloading, serving, and deleting the Ollama models. You can also download more quantized options for each, with less memory. I usually add and then delete whenever I don't need it or when I need to download another model.
@madhudson1 · 29 days ago
Your initial ingestion doesn't load just the first page; it ingests the entire document. Your data variable consists of a list with a single Document object that contains the content of the entire PDF.
@tonykipkemboi · 29 days ago
That is correct. I did not change the code after testing it previously with loading individual pages. You can load by page and add metadata that way.
@madhudson1 · 29 days ago
@tonykipkemboi But a cool tutorial for summarisation using a multi-query retriever. I didn't know this was a thing in LangChain.
@tonykipkemboi · 29 days ago
@@madhudson1 thank you. Yes, it's a neat function
@Ollerismo · 1 month ago
Great walkthrough; the audio could be increased a little bit.
@tonykipkemboi · 1 month ago
Thank you! 😊 I noticed that I didn't adjust my gain after I had posted. Thanks for your feedback.
@deldridg · 1 month ago
Is it possible to upload multiple PDF documents using the langchain doc loaders and then converse across them? Excellent tut and thanks - David
@tonykipkemboi · 1 month ago
That can definitely be possible. Are you thinking of probably two pdfs that each carry different content?
@deldridg · 1 month ago
@tonykipkemboi Thank you for taking the time to reply - much appreciated. I was just wondering whether this approach allows the ingestion of multiple documents which could be contrasted or used in conjunction with each other. Cheers mate - David
@qzwwzt · 1 month ago
Congrats on your video! In your example you use just one PDF; I need to work with thousands of documents, and the main issue is the time it takes to ingest them. Can you give me some advice?
@tonykipkemboi · 1 month ago
Did you mean that it takes time to upload the documents to the vector store and query over them? If yes, I do agree that latency is an issue, especially since we're adding another layer of retrieval using the MultiQueryRetriever. It would also depend on your system as well, if you're using Ollama.
@AfrivisionMediake · 1 month ago
Hey, thanks for this. Question: does it have limitations on the number of documents one can upload to chat with? Like, can I upload thousands of documents to use?
@tonykipkemboi · 1 month ago
I haven't tested it with many documents but will do.
@AfrivisionMediake · 1 month ago
Will appreciate a lot. Much love from Kenya btw😃
@tonykipkemboi · 1 month ago
@@AfrivisionMediake 🫡
@SiddharthMishra-pg1os · 25 days ago
Hello! Nice tutorial. I was stuck on the first part, unfortunately, as I get the error "Unable to get page count. Is poppler installed and in PATH?". Do you have any idea how to solve this? I have already installed poppler using brew.
@tonykipkemboi · 22 days ago
Thank you. Have you tried using chatgpt to troubleshoot?
@rayzorr · 1 month ago
Good stuff. Shame you didn't run the notebook. Would like to see how it works.
@tonykipkemboi · 1 month ago
Thank you. I tried recording and running the notebook but it killed my video recording since they were competing for system resources with Ollama. I ran the notebook as you can see the outputs in it already and just walked through the code. I'll try running it on the next video for more interactivity.
@theDaddyBouldering · 16 days ago
Thanks for the tutorial! How can I make the model give answers in a different language?
@tonykipkemboi · 13 days ago
It would largely depend on the capabilities of the given model to translate from English to the target language. You can try by adding the target language in the prompt. Tell it to return the results in X language.
@spotnuru83 · 2 months ago
Firstly, thank you for sharing this entire tutorial - really great. I tried to implement it and got all the issues resolved, but it looks like I am not getting any output after I ask a question. I see OllamaEmbeddings at 100% five times, and then nothing happens after that; the program just quits without giving any answer. Would you be able to help me figure out how to get it working?
@tonykipkemboi · 2 months ago
Thank you for your question. Did you use the same models as in the tutorial, or did you use another one? Are you able to share your code?
@spotnuru83 · 2 months ago
@tonykipkemboi I copied your code exactly.
@spotnuru83 · 2 months ago
The reason was I did not use a Jupyter notebook - I was running in VSCode, and I had to save the value returned by the chain's invoke method; when I printed it, it started working. This is amazing. Thank you so much, really appreciate it.
@DataScienceandAI-doanngoccuong · 2 months ago
Can this model query tabular data or image data?
@tonykipkemboi · 2 months ago
I assume you're talking about Llama2? Or are you referring to the Nomic text embedding model? If it's Llama2, it's possible to use it to interact with tabular data by passing the data to it (RAG or just pasting data to the prompt) but cannot vouch for its accuracy though. Most LLMs are not great at advanced math but they're getting better for sure.
@yashvarshney8651 · 2 months ago
Could you drop a tutorial on building RAG chatbots with Ollama and LangChain, with custom data and guard-railing?
@tonykipkemboi · 2 months ago
That sounds interesting and something I'm looking into as well. For guard-railing, what are your thoughts on the frameworks for this portion? Have you tried any?
@yashvarshney8651 · 2 months ago
@tonykipkemboi realpython.com/build-llm-rag-chatbot-with-langchain/ - I've read this article, and the only guard-railing mechanism they seem to apply is an additional prompt with every inference.
@fredrickdenga7552 · 1 month ago
From the Kenyan homeland
@tonykipkemboi · 1 month ago
Kabisa bro 😎
@fredrickdenga7552 · 1 month ago
@tonykipkemboi I'm vouching for you, bro. Anything new about AI, LLMs, etc. that comes out, post it here ASAP; we're fully behind you.
@ayushmishra5861 · 2 months ago
I've been given a story, the Trojan War, which is a 6-page PDF (or I can even use the story as text). Also, 5 pre-decided questions are given to ask based on the story. I want to evaluate different models' answers, but I am failing to evaluate even one. Kindly help; please guide thoroughly.
@ayushmishra5861 · 2 months ago
Can you please reply? Would really appreciate that.
@tonykipkemboi · 2 months ago
This sounds interesting! I believe if you're doing this locally, you can follow the tutorial to create embeddings of the PDF and store them in a vector db, then use the 5 questions to generate output from the models. You can switch the model type in between each response, and you'll probably have to save each response separately so you can compare them afterwards.
@ayushmishra5861 · 2 months ago
@tonykipkemboi What amount of storage will the model take? I don't have the greatest hardware.
@tonykipkemboi · 2 months ago
Yes, there are smaller quantized models on Ollama you can use, but most of them require a sizeable amount of RAM. Check out these instructions from Ollama on the size you need for each model. You can also do one at a time, then delete the model after use to create space for the next one you pull. I hope that helps. github.com/ollama/ollama?tab=readme-ov-file#model-library
@ammardarkazanli5633 · 1 month ago
Thanks again for the tutorial. I am running the same question against a 500-page PDF multiple times and I am getting a different answer every time. What could be going wrong here? I simply have a for loop looping through the exact same question using the same vector db, yet I get different answers.
@tonykipkemboi · 1 month ago
Thanks. Are the answers hallucinated in all of them, or is just the wording different each time?
@ammardarkazanli5633 · 1 month ago
@@tonykipkemboi They are not completely off but they are a bit different. The pdf I am using is about medical terminology. I asked it simply to tell me the components of the cardiovascular system. There is a simple paragraph in it that lists them but yet, one time it talks about the kidneys.. others talks about the heart anatomy... so it is not completely hallucinating but it is not able to nail down a consistent answer
@tonykipkemboi · 1 month ago
@ammardarkazanli5633 One way I can think of to solve this is by using the "seed" parameter for the model. You will need to create a modelfile with the Ollama model you're using as the LLM so it generates the same output for the prompt. Here are the docs on how to create that: github.com/ollama/ollama/blob/main/docs/modelfile.md. You can also watch my other video on creating an Ollama UI with Streamlit to see how I implemented the modelfile; I didn't add seed there, but it's easy to add.
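For reference, a minimal Modelfile along those lines might look like this (the base model name and parameter values are placeholders; see the Ollama modelfile docs linked above):

```
FROM mistral
PARAMETER seed 42
PARAMETER temperature 0
```

You would then build it with something like `ollama create mistral-fixed -f Modelfile` and point your pipeline at the `mistral-fixed` model name; a fixed seed plus zero temperature makes generation repeatable for the same prompt.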
@ammardarkazanli5633 · 1 month ago
I will give this a try…
@aizaz101 · 21 days ago
How can we get output without rephrasing? I mean, I want to know exactly what's written in the PDF, as-is. For example, if I ask what is written in article 3.2.2, could the output be quoted word for word?
@tonykipkemboi · 20 days ago
Ah yes, good idea. I think for this, you'll have to add citations. I'm early into playing with this as I am working on the Streamlit UI for RAG. Always good to have cited sources.
@pw4645 · 2 months ago
Hi - and if there were 6 or 10 PDFs, how would you load them into the RAG? Thanks
@tonykipkemboi · 2 months ago
Good question! I would iterate through them while loading them and also index the metadata so it's easy to reference which pdf provided the context for the answer. There's actually several ways of doing this but that would be my simple first try.
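A dependency-free sketch of that iterate-and-tag idea follows. The loader below is a stub; in the tutorial's stack each file would go through "UnstructuredPDFLoader", and the resulting records would be handed to Chroma with per-chunk metadata. File names and helper names here are illustrative:

```python
# Sketch: ingest several PDFs and tag every chunk with its source file,
# so answers can be traced back to the PDF that provided the context.
from pathlib import Path

def load_pdf_text(path):
    """Stub loader: the real pipeline would use UnstructuredPDFLoader(path).load()."""
    return f"text extracted from {path}"

def ingest(pdf_paths, chunk_size=500):
    """Chunk every PDF and attach source/chunk metadata to each piece."""
    records = []
    for path in pdf_paths:
        text = load_pdf_text(path)
        for i in range(0, len(text), chunk_size):
            records.append({
                "text": text[i:i + chunk_size],
                "metadata": {"source": Path(path).name, "chunk": i // chunk_size},
            })
    # These records map onto Chroma's texts + metadatas arguments.
    return records

records = ingest(["report_a.pdf", "report_b.pdf"])
```

With the `source` field in place, a retrieved chunk can always be cited back to the file it came from.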
@user-tl1ms1bq6n · 2 months ago
Please provide the notebook if possible. Great video.
@tonykipkemboi · 2 months ago
Thank you! Checkout the repo link in the description for all the code. Here's the link github.com/tonykipkemboi/olla...
@hectorelmagotv8427 · 2 months ago
@tonykipkemboi Hey, the link is not working; can you provide it again, please?
@hectorelmagotv8427 · 2 months ago
No problem, didn't see the description. Thanks!
@tonykipkemboi · 2 months ago
@@hectorelmagotv8427 , thanks. Just to confirm, did it work?
@ariouathanane · 2 months ago
Thank you very much for your videos. Please, what if we have several PDFs?
@tonykipkemboi · 1 month ago
Yes, so you can iteratively load the pdfs, chunk them by page or something else, then index them in a vector database. You would then ask your query like always and it would find the context throughout all the documents to give you an answer.
@unflexian · 2 months ago
Oh, I thought you were saying you'd embedded an LLM into a PDF document, like those draggable 3D diagrams.
@user-tl4de6pz7u · 2 months ago
I encountered several errors when trying to execute the following line in the code: data = loader.load() Despite installing multiple modules, such as pdfminer, I'm unable to resolve an error stating "No module named 'unstructured_inference'." Has anyone else experienced similar issues with this code? Any assistance would be greatly appreciated. Thank you!
@tonykipkemboi · 2 months ago
Interesting that's asking for that since that's for layout parsing and we didn't use it. Try installing it like so; "!pip install unstructured-inference"
@venuai-sh4fv · 1 month ago
Should I make Chroma DB connection to make this work?
@tonykipkemboi · 1 month ago
We do use Chroma in the tutorial.
@vineethnj8744 · 1 month ago
Can we do this with llama3 , which will be more good?
@tonykipkemboi · 1 month ago
Yes you can use llama3.
@erickcedeno7823 · 2 months ago
Nice video. When I try to execute the following commands: !ollama pull nomic-embed-text and !ollama list, I receive the following error: /bin/bash: line 1: ollama: command not found
@tonykipkemboi · 2 months ago
This error means that Ollama is not installed on your system or not found in your system's PATH. Do you have Ollama already installed?
@erickcedeno7823 · 2 months ago
@@tonykipkemboi Hello, I've installed ollama in my local system but i don't know why i'm getting an error in google colab
@nitinkhanna9754 · 21 days ago
chromadb works with sqlite3. facing a lot of issues using chroma. can we use any other db or just pickle the entire vector db
@tonykipkemboi · 20 days ago
You can definitely replace chroma with any other db like Weaviate or Qdrant or Milvus and so on.
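One easy local swap (not what the tutorial ships) is FAISS, which sidesteps the sqlite3 dependency entirely; this sketch assumes `chunks` from the tutorial's splitter and `pip install faiss-cpu`, and the function name is illustrative:

```python
# Sketch only: FAISS in place of Chroma, same chunks and embeddings as the
# tutorial -- only the vector store changes.
def build_faiss_retriever(chunks):
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import FAISS

    vector_db = FAISS.from_documents(
        documents=chunks,
        embedding=OllamaEmbeddings(model="nomic-embed-text"),
    )
    return vector_db.as_retriever()
```

The retriever it returns plugs into the rest of the chain unchanged.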
@nitinkhanna9754 · 20 days ago
Thanx man ! It worked 👌
@tonykipkemboi · 20 days ago
@@nitinkhanna9754 awesome!
@kiranshashiny · 12 days ago
@@nitinkhanna9754 What other DB did you use to make it work, as suggested by @tony.
@saiprasannach2488 · 1 month ago
What is the python version you used for running this poc
@tonykipkemboi · 1 month ago
Python 3.9
@brianclark4639 · 2 months ago
I tried the first command %pip install -q unstructured langchain and it's taking a super long time to install. Is this normal?
@tonykipkemboi · 2 months ago
It shouldn't take more than a couple of seconds but also depending on your system and package manager, it might take a while. Did it resolve?
@ammardarkazanli5633 · 1 month ago
Can you think of a reason why pip install unstructured[all-docs] is failing on my two macs. I get the error that "Failed to build onnx ERROR: Could not build wheels for onnx, which is required to install pyproject.toml-based projects".. I have tried almost every suggestion on the internet. I am attempting to run on python 3.12.1 and 3.12.3 ... Thanks
@tonykipkemboi · 1 month ago
I had the same issue at some point. Switching to Python 3.9 resolved the error for me. Create a virtual environment with 3.9 and try running it there.
@ammardarkazanli5633 · 1 month ago
@@tonykipkemboi Just to confirm, everything worked well with 3.9.19.. Thanks for the suggestion. The video was very helpful to get a handle with all the commotion around the different models.
@tonykipkemboi · 1 month ago
@@ammardarkazanli5633 glad to hear it worked!
@user-eh2zd2ih8v · 1 month ago
I did some first experiments with local AI, using Ollama and AnythingLLM to talk to the model about a pdf file... and so far, the results are just completely unusable. The AI is just hallucinating on me constantly, making up sentences in the pdf that are not there, failing simple tasks like "quote the first line on page 2 without changing it", not to mention more complex tasks like "list all tools mentioned on page 3". Maybe I'm doing something wrong, but I feel very discouraged from using AI at all for this kind of usecase.
@tonykipkemboi · 1 month ago
Sorry to hear the troubles but this is very common. Have you tried setting the temperature of the model to 0? That way there's no room for it to be creative.
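A sketch of that tweak with the tutorial's model; the helper name is made up, and the only substantive change is the `temperature=0` argument:

```python
# Illustrative: temperature 0 makes the model deterministic, so it is less
# likely to invent text that is not in the retrieved context.
def make_strict_llm(model="mistral"):
    from langchain_community.chat_models import ChatOllama
    return ChatOllama(model=model, temperature=0)
```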
@user-eh2zd2ih8v · 1 month ago
@@tonykipkemboi Interesting, I'll look into that thanks!
@tonykipkemboi · 1 month ago
@@user-eh2zd2ih8v let me know what comes of it.
@aalamansari8643 · 16 days ago
is it possible, using this, to extract data from the PDF and convert it to proper JSON format?
@tonykipkemboi · 13 days ago
Yes, it is possible. You would need to add another function to do that, but it's very doable. I'd start by checking the LangChain docs on JSON extraction and using Pydantic.
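One possible shape for that extra function, pairing a Pydantic model with LangChain's `PydanticOutputParser`; the schema, field names, and function name here are all made up for illustration:

```python
# Sketch only: ask the LLM to fill a Pydantic schema, then validate its reply
# into structured data (call .model_dump_json() on the result for JSON).
def build_json_extractor(model="mistral"):
    from langchain_community.chat_models import ChatOllama
    from langchain_core.output_parsers import PydanticOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from pydantic import BaseModel, Field

    class PdfFields(BaseModel):  # hypothetical schema
        title: str = Field(description="document title")
        summary: str = Field(description="one-sentence summary")

    parser = PydanticOutputParser(pydantic_object=PdfFields)
    prompt = ChatPromptTemplate.from_template(
        "Extract the fields from this text.\n{format_instructions}\n\n{text}"
    ).partial(format_instructions=parser.get_format_instructions())
    return prompt | ChatOllama(model=model, temperature=0) | parser

# extractor = build_json_extractor()
# fields = extractor.invoke({"text": some_pdf_text})  # `some_pdf_text` is yours
```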
@aalamansari8643 · 13 days ago
@@tonykipkemboi got it!
@ayushmishra5861 · 2 months ago
Retrieving answers from the vector database takes a good minute on my MacBook Air. How do I scale this model? Can you add a Pinecone layer to it?
@tonykipkemboi · 2 months ago
So this was a demonstration of running with everything local and nothing online other than when downloading the packages. You can hook up any vector store you like for example Pinecone as you've mentioned. Just beware that since the local models will still be in use, it will still be slow if your system is slow already. Might consider using paid services if you're looking for a lower latency solution.
@ayushmishra5861 · 2 months ago
@@tonykipkemboi So Tony, what I am trying to build is a website where people can drop their PDFs and do Q&A. In my learning and implementation I found that generating embeddings for my 10-page PDF no longer takes a lot of time; it used to before I switched to the embedding model you used, so the embedding part is sorted. I tried implementing the code with both Chroma and FAISS, and the results are almost equal. Even for a small PDF, it takes a minute to answer. I understand it takes computational resources from my local machine, which happens to be a MacBook Air M1. Do you think a machine with a better GPU (let's assume yours) would produce the retrieved results in under 10 seconds? Nobody wants to wait a minute or more on a website for an answer, and I'm also worried that with hundreds of thousands of users I'd need to buy a GPU farm for this to work, lol. Note: I have never made a scalable project before. Please guide. Also, share how long it takes on your PC/laptop for the answer to come back from the vector db, so I can tell whether my system is weak or libraries like Chroma and FAISS just aren't meant for scalability.
@ayushmishra5861 · 1 month ago
can anyone answer this please?
@tonykipkemboi · 1 month ago
@@ayushmishra5861 so my system is just like yours with 16GB RAM. It takes about a minute or less to get an answer back for a few pdf pages embedded. For longer ones, it even takes longer. One portion that slows the process is the "multiqueryretriever" which I added and talked about in the video. It generates 5 more questions and those have to get the context from the vector db as well which slows down the time to output significantly. Try without the multiqueryretriever and see if that speeds up your process.
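A sketch of the same chain minus `MultiQueryRetriever`, so each question costs one retrieval round-trip instead of several; it assumes the tutorial's `vector_db` and `llm` objects are passed in, and the function name is illustrative:

```python
# Sketch only: single-query RAG chain -- retrieve once, answer once.
def build_fast_chain(vector_db, llm):
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough

    template = """Answer the question based only on the following context:
{context}
Question: {question}"""

    return (
        {"context": vector_db.as_retriever(), "question": RunnablePassthrough()}
        | ChatPromptTemplate.from_template(template)
        | llm
        | StrOutputParser()
    )

# answer = build_fast_chain(vector_db, llm).invoke("What is this doc about?")
```

Comparing this against the multi-query version is a quick way to measure how much latency the extra queries add.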
@suryapraveenadivi851 · 2 months ago
ERROR:unstructured:Following dependencies are missing: pikepdf, pypdf. Please install them using `pip install pikepdf pypdf`. WARNING:unstructured:PDF text extraction failed, skip text extraction... please help
@tonykipkemboi · 2 months ago
Have you tried installing what it's asking for `pip install pikepdf pypdf`?
@suryapraveenadivi851 · 2 months ago
@@tonykipkemboi Thank you so much!! for your reply this got resolved..
@tonykipkemboi · 2 months ago
@@suryapraveenadivi851 glad it worked! Happy coding.
@suryapraveenadivi851 · 2 months ago
PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH? please help with this........
@tonykipkemboi · 2 months ago
Are you doing a different modification of the code in the tutorial or using OCR? I would checkout the install steps on their repo here (github.com/Belval/pdf2image) and probably use ChatGPT for debugging as well.
@levsigal6151 · 2 months ago
@@tonykipkemboi I've got the same error and I am using PDF file. Please advise.
@sebinanto3733 · 2 months ago
hey, if we are using Google Colab instead of Jupyter, how will we be able to incorporate Ollama with Google Colab?
@tonykipkemboi · 2 months ago
I haven't tried this myself but here are some resources for you that might be helpful; 1. medium.com/@neohob/run-ollama-locally-using-google-colabs-free-gpu-49543e0def31 2. stackoverflow.com/questions/77697302/how-to-run-ollama-in-google-colab
@Justme-dk7vm · 20 days ago
i installed ollama and verified it on PowerShell on my Windows laptop. When I ran "!ollama pull nomic-embed-text" it shows "/bin/bash: line 1: ollama: command not found". PLEASE HELP ME, ONLY YOUR VIDEO ON THE WHOLE OF YOUTUBE IS SAVING MY LIFE, PLEASE REPLY AS SOON AS POSSIBLE
@tonykipkemboi · 20 days ago
So it seems to be an issue with Ollama installation on Windows. I haven't tried installing Ollama on Windows but might be a good time to add a tutorial on that, maybe. Have you tried watching other tutorials or docs on how to set up Ollama on Windows?
@Justme-dk7vm · 20 days ago
@@tonykipkemboi okay that’s kind of you. The problem is not with installation I guess, im successfuly running on powershell and command prompt. The message is appearing on colab notebook.
@tonykipkemboi · 20 days ago
@@Justme-dk7vm ah I see. So you're using it in Colab instead of "Jupyter Lab" locally? I would suggest starting with Jupyter Lab. You just need to install it using "pip install jupyterlab". I haven't run it on Colab but am sure it's possible.
@Justme-dk7vm · 20 days ago
@@tonykipkemboi Okay, thank you so much. I was just scrolling through your videos and they amazed me, Sir ❤️ I would love to connect with you on LinkedIn, could you please provide the link.
@Justme-dk7vm · 19 days ago
@@tonykipkemboi hey I tried on Jupyter lab today as you said, I'm not getting that error like previous. But when I entered a query, its taking so much time to load. How to resolve this?
@makethebestgame4868 · 1 month ago
I got this error when running your code on Colab: "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. imageio 2.31.6 requires pillow<10.1.0,>=8.3.2, but you have pillow 10.3.0 which is incompatible." Could you help me to check?
@tonykipkemboi · 1 month ago
The error message indicates a conflict between the versions of `imageio` and `Pillow` packages. Here's how you can resolve this issue: 1. **Uninstall the current version of Pillow:** ```bash !pip uninstall pillow -y ``` 2. **Install the compatible version of Pillow required by imageio:** ```bash !pip install pillow==10.0.0 ``` 3. **Reinstall imageio to ensure all dependencies are correctly aligned:** ```bash !pip install imageio --upgrade ``` Here’s how you can run these commands in a Colab cell: ```python !pip uninstall pillow -y !pip install pillow==10.0.0 !pip install imageio --upgrade ``` This sequence will uninstall the conflicting version of `Pillow`, install a compatible version, and ensure `imageio` is up to date. This should resolve the dependency conflict you are encountering. Let me know if it works.
@makethebestgame4868 · 1 month ago
@@tonykipkemboi tks a lot
@ayushmishra5861 · 2 months ago
can I do this on google colab?
@tonykipkemboi · 2 months ago
This is local using Ollama so not possible following this specific tutorial. you can however use other public models that have API endpoints that you can call from Colab. I also want to mention that I have not explored trying to access the local models through Ollama using Colab.
@ruidinis75 · 2 months ago
We do not need an API key for this?
@tonykipkemboi · 2 months ago
Nope, don't need one.
@anuarxz_ · 3 days ago
Hi bro, good video!! But in my console I only see this: OllamaEmbeddings: 100%, and then it stops automatically
@tonykipkemboi · 3 days ago
Thank you! Does it show anything on the app?
@anz918 · 5 days ago
I have a 1000-page PDF; will it be able to go through that?
@tonykipkemboi · 5 days ago
Good question. I haven't tried it but my naive guess is it can handle it.
@gilleslejeune6823 · 16 days ago
Thanks, I don't see where you can tell it to handle languages other than English?
@tonykipkemboi · 13 days ago
It would largely depend on the capabilities of the given model to translate from English to the target language. You can try by adding the target language in the prompt. Tell it to return the results in X language.
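An illustrative version of that prompt tweak, applied to the tutorial's RAG template; the choice of French is just an example:

```python
# Illustrative prompt tweak: name the target language in the RAG template and
# let the model translate its answer (quality depends on the model).
TARGET_LANGUAGE = "French"  # assumption for the example

template = f"""Answer the question based only on the following context,
and write your answer in {TARGET_LANGUAGE}:
{{context}}
Question: {{question}}"""
```

The doubled braces survive the f-string, so `{context}` and `{question}` are still available as LangChain template variables.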
@N0rt · 2 months ago
whenever I try pip install -q unstructured["all-docs"] (using win11) I keep getting a subprocess error: Getting requirements to build wheel did not run successfully. │ exit code: 1 any help appreciated!!
@N0rt · 2 months ago
AssertionError: Could not find cmake executable!
@tonykipkemboi · 2 months ago
Which Python version are you using? It seems there's an issue with latest Python versions and seems to work best with Python
@N0rt · 2 months ago
@@tonykipkemboi I'm on Python 3.12
@tonykipkemboi · 2 months ago
@@N0rt, I see. The unstructured package has had a lot of errors recently. You could try installing CMake on your Windows machine using the instructions below. Let me know if it works. 1. Go to cmake.org/download 2. Select Windows (Win32 Installer) 3. Run the installer 4. When prompted, select Add CMake to the system PATH for all users 5. Run the software installation 6. Double-click the downloaded executable file 7. Follow the instructions on the screen 8. Reboot your machine
@N0rt · 2 months ago
@@tonykipkemboi thank you so much ill give it a try!
@Hoxle-87 · 2 months ago
Where is the RAG part?
@tonykipkemboi · 2 months ago
The whole thing actually is RAG.
@Hoxle-87 · 2 months ago
@@tonykipkemboi thanks. So adding the pdfs augments the LLM, got it.
@muhammedyaseenkm9292 · 2 months ago
How can I download unstructured[all-docs]? I cannot install it.
@tonykipkemboi · 2 months ago
Did you install it like this '!pip install --q "unstructured[all-docs]"'
@user-cx6rg6mr7d · 13 days ago
can you run this on free version of google colab? thank you!
@tonykipkemboi · 13 days ago
Yes, you can. The Ollama folks added an example notebook in their repo here github.com/ollama/ollama/blob/main/examples/jupyter-notebook/ollama.ipynb
@user-cx6rg6mr7d · 13 days ago
@@tonykipkemboi thank you!
@ANTONKIRIHETTIGE · 1 month ago
Hi, thank you for the video. Please help me resolve this error:
"ValidationError: 1 validation error for LLMChain llm Can't instantiate abstract class BaseLanguageModel with abstract methods agenerate_prompt, apredict, apredict_messages, generate_prompt, invoke, predict, predict_messages (type=type_error)"
It occurs when I run the following code:
"retriever = MultiQueryRetriever.from_llm(vector_db.as_retriever(), llm, prompt=QUERY_PROMPT)
# RAG prompt
template = """Answer the question based only on the following context:
{context}
Question: {question}"""
prompt = ChatPromptTemplate.from_template(template)"
@tonykipkemboi · 1 month ago
Hi, did you setup your local Ollama model like this: ``` # LLM from Ollama local_model = "mistral" llm = ChatOllama(model=local_model) ```
@ANTONKIRIHETTIGE · 1 month ago
@@tonykipkemboi thank you for the reply. Yes I did. I just copied the same codes you did.
@ANTONKIRIHETTIGE · 1 month ago
@@tonykipkemboi I even tried different models (llama2, phi3) but the error is the same.
@hibakabeer1685 · 1 month ago
I am not able to install -q unstructured langchain and "unstructured[all-docs]". It is taking a long time and didn't install. Can you please help?
@tonykipkemboi · 1 month ago
You can try installing them directly on your system using the terminal or close the notebook and restart.
@hibakabeer1685 · 1 month ago
Thank you so much
@hibakabeer1685 · 1 month ago
Can I ask one more question? Everything went well until the last chain.invoke: when I ask a question it takes a very long time to run, and then an error pops up: "__init__() takes 1 positional argument but 2 were given". Can you help me with that as well?
@tonykipkemboi · 1 month ago
@@hibakabeer1685 so the latency is potentially introduced because we used the MultiQueryRetriever function which generates 5 more queries similar to your original question then sends them to the vector db to get context on them too. For the error, can you show me the code portion that is failing and also add the full error message?
@hibakabeer1685 · 1 month ago
Sure