🐱 GitHub repository: github.com/alejandro-ao/chat-with-websites 🔥 Join the LangChain Master Program (early access): link.alejandro-ao.com/langchain-mastery 💬 Ask your questions in our Discord Server (but please leave a comment here too for engagement): link.alejandro-ao.com/981ypA ❤ Buy me a coffee (thanks): link.alejandro-ao.com/YR8Fkw
When fine-tuning a model: it seems that it adjusts the model weights based on the input and expected output? Does chatting with the model during training affect its weights, or can only the trainer affect them? I expect this happens in memory? Based on positive and negative feedback in chat, couldn't we be talking to the model, teaching it, and adjusting its weights on the fly? Also (sorry): when using RAG-type systems, are the documents tokenized using the tokenizer from the model? (I know that locks the database down a bit.) Could we then consider the local RAG store as the working-memory system and the LLM as the long-term memory system? Should there be a bridge between the database and the trainer so it could periodically update the long-term memory and release the local RAG data, i.e. essentially training a LoRA to be applied (or merged)? The LLM would then accumulate a lot of LoRAs from each training interval, or not, if the strategy is a full merge.
can i just say you make this whole thing bearable because your voice delivery is on point. i won't go anywhere else, i'm gonna learn everything here. what a blessing
hey Alejandro, thanks for your video! This project is my first side project! I really appreciate your amazing work! If you are looking for an idea for the next tutorial, text-to-mindmap might be a good one!
hey there! i'm glad you enjoyed the project. that sounds like a fun idea, I will probably be doing something like that in the future. Maybe something using knowledge graphs?
Thank you so much for your amazing videos. This one in particular is outstanding as far as RAG is concerned. You are a real master of gen AI, and I find your videos tremendously helpful. Keep making them! Cheers from Saudi Arabia 😍😍😍💯
heyy, I have a doubt. You said that after we do similarity-based ranking with a vector DB, we get a few chunks (context) to answer our query, and then pass these contexts with the user query along with chat_history to an LLM. But if we pass chat_history, wouldn't that exceed the max token size of the LLM if the conversation went on too long?
hey there. great question. yes, totally. if the conversation is too long, then you can exceed the context window of your LLM. however, keep in mind that modern LLMs, such as GPT-4, Claude and especially Gemini 1.5 have gigantic context windows, so this might not be too much of a concern. also, consider that sending the entire conversation history is only one method for implementing memory in these systems. you can also send a summary of the conversation + the last 10 messages, for example. or produce a NER-based memory. i don't think there is an industry standard yet for implementing memory, though. so feel free to try out several methods.
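to make the "summary + last 10 messages" idea concrete, here's a tiny pure-python sketch (the names and the 4-chars-per-token ratio are just assumptions for illustration, not a real tokenizer):

```python
# Illustrative sketch: keep chat history under a token budget by sending a
# running summary plus the last N messages. The 4-chars-per-token ratio is a
# rough heuristic, not an exact tokenizer.

def build_history_payload(summary, messages, keep_last=10, token_budget=3000):
    """Return the history to send: summary + last `keep_last` messages,
    dropping the oldest kept messages if the rough token estimate is too high."""
    kept = messages[-keep_last:]

    def estimate_tokens(texts):
        return sum(len(t) for t in texts) // 4  # crude heuristic

    while kept and estimate_tokens([summary] + kept) > token_budget:
        kept = kept[1:]  # drop the oldest kept message first
    return [summary] + kept

history = build_history_payload(
    "Summary: user is building a RAG app over a blog.",
    [f"message {i}" for i in range(50)],
    keep_last=10,
)
```

in a real app you would regenerate the summary with an LLM call every few turns; this just shows the trimming logic.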
Thanks for the video, it's great stuff. I wonder if you could do these videos with the Gemini LLM; the model and embeddings are free as far as I can tell. I somehow made it work by watching your video, thanks man.
hey there, that's a great idea. good that you made that work. i was going to use them for a video a few weeks ago, but then realized that i needed to set up a vpn before recording. these models were not available in the eu when i checked a few weeks ago (great)... so i got lazy and went for openai. i will set up this vpn thing (or move out of europe) :P
Great video! Subscribed and liked. Just one part I could not understand: the stuff_documents_chain takes input and context. How is the context being passed from retriever_chain to stuff_documents_chain? Does LangChain just define that create_retrieval_chain can pass context from the first argument to the second?
hey there! welcome to the club! yeah, that's exactly what is happening. we are using a prebuilt chain that calls `.invoke({"context": [...]})` by itself, without us having to pass the variable. if you look at the prebuilt chain's source code, you will see that it calls the invoke method inside the runnable. i will make a future video on creating chains ourselves so that this is easier to understand!
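conceptually, here's a simplified pure-python sketch of what that prebuilt chain does internally (this is NOT LangChain's actual source, and all the names here are made up just to show the mechanism):

```python
# Simplified illustration of how a retrieval chain injects "context":
# the retriever runs first, its output is assigned to the "context" key,
# and the enriched dict is then passed to the documents chain.

def fake_retriever(inputs):
    # pretend we did a vector similarity search on inputs["input"]
    return ["doc chunk 1 about LangChain", "doc chunk 2 about retrieval"]

def fake_stuff_documents_chain(inputs):
    # receives both the user input AND the retrieved context
    context = "\n".join(inputs["context"])
    return f"Answer to '{inputs['input']}' using:\n{context}"

def create_retrieval_chain_sketch(retriever, documents_chain):
    def invoke(inputs):
        docs = retriever(inputs)                # step 1: retrieve
        enriched = {**inputs, "context": docs}  # step 2: inject "context"
        return documents_chain(enriched)        # step 3: generate the answer
    return invoke

chain = create_retrieval_chain_sketch(fake_retriever, fake_stuff_documents_chain)
result = chain({"input": "what is retrieval?"})
```

so the "magic" is just the wrapper chain building the dict for you before invoking the second chain.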
Hey there. This is not necessarily deployable to HuggingFace, as HF is a place to host models. Since we are not creating our own models here, that would not work. We are building an app, however. And you can deploy it to the web! If you want to see how to do that for free, you can watch this video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-74c3KaAXPvk.html
What a great tutorial!!! Really enjoyed it!! New subscriber here!! Only one question... just for testing purposes, I used the chat to ask if it could answer topics outside the context of the websites I was chatting with, and it did. It even wrote code for me. Is there a way to restrict the app to only answer about the content of the website we are chatting with and not other questions? Thanks for the amazing video
hey there, welcome aboard! yeah, that's one of the main complications of creating RAG applications. i have seen several ways of dealing with this. you might want to try some of these: 1. the main thing to do to control this issue is to add a restrictive instruction to your initial prompt, something like: "Don't justify your answers. Don't give information not mentioned in the CONTEXT INFORMATION" 2. to go even further, you can fine-tune your model on some of these edge cases. 3. lastly, a more complex approach is to add a "policing" AI that reads the main AI's answers and decides whether to accept them or not. this is, of course, more expensive and complex.
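for point 1, here's a small sketch of what that restrictive system prompt could look like (the exact wording is an assumption, tune it for your own app):

```python
# Hedged sketch of a restrictive RAG system prompt: the model is told to
# answer only from the retrieved context and to refuse anything else.
# The wording below is illustrative, not a guaranteed jailbreak-proof prompt.

RESTRICTIVE_SYSTEM_PROMPT = (
    "Answer ONLY from the CONTEXT INFORMATION below. "
    "Don't justify your answers. "
    "Don't give information not mentioned in the CONTEXT INFORMATION. "
    "If the answer is not in the context, reply exactly: "
    "'I cannot answer that from the provided context.'\n\n"
    "CONTEXT INFORMATION:\n{context}"
)

def build_system_prompt(context_chunks):
    # join the retrieved chunks with a separator and fill the template
    return RESTRICTIVE_SYSTEM_PROMPT.format(context="\n---\n".join(context_chunks))

prompt = build_system_prompt(["Paris is the capital of France."])
```

you would pass the resulting string as the system message of your chain; no prompt is bulletproof on its own, which is why options 2 and 3 above exist.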
It looks interesting. 👌 But one question: can I give it any website link and ask for the best keywords used on that website (like using it for digital marketing concepts)?
i suppose you can! although i would think this requires a different approach than using an LLM. you might need other NLP algorithms for this: maybe a pipeline that strips your corpus of text, removes filler and useless words, and keeps the main words. it could then generate a bag of words with your website's keywords. check it out, you can do all of this with python ;)
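here's a minimal pure-python sketch of that pipeline (the stopword list is deliberately tiny; a real project would use a fuller list or a library like scikit-learn or spaCy):

```python
# Minimal keyword-extraction sketch: lowercase the text, drop filler
# (stop) words, and count what remains to get candidate keywords.
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "with", "on"}

def top_keywords(text, n=5):
    words = re.findall(r"[a-z']+", text.lower())
    filtered = [w for w in words if w not in STOPWORDS]
    return [word for word, _ in Counter(filtered).most_common(n)]

keywords = top_keywords(
    "Digital marketing tips: marketing for the web, web analytics, and marketing automation."
)
```

for a whole website you would first scrape the pages (e.g. with BeautifulSoup) and feed the concatenated text into something like this.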
This is so cool, thank you so much. Can this be applied to a database with many views or tables, so you can ask questions and it's intelligent enough to perform joins to get the answer? It would be interesting to see if it's possible, or to create a video. Thank you so much
Hi, I am not able to load data from the web on Windows. Every time it shows errors like `import pwd` and `from langchain_community.document_loaders.pebblo import PebbloSafeLoader`, and after addressing those it is still not working. What should I do? Please assist me. Thank you, you have done a very good job.
Hey Alejandro. I tried to deploy the chat-with-PDF app to Streamlit, but it's not working. Whenever I upload a PDF and start processing, it starts to download model weights. I am using Hugging Face APIs instead of OpenAI. Does Streamlit have any sort of limitation? If yes, is there a way I can perform this task without downloading weights, just like with the OpenAI APIs?
@@alejandro_ao thanks! Excited for it. I think its a very common use case where ppl wanna have multimodal RAG on several doc types and return the source doc
I did it as shown in the video, but in the end the OpenAI session timed out (due to the free tier) after I put in the URL 😭. Please give me some pointers on how to solve this issue and make it a runnable chatbot
hello there. that's very strange. are you sure you are not accidentally hitting the limit of the context window? you can use a model with a larger context window maybe. try `gpt-4-0125-preview`, it has a 128k token context window. if the issue persists, try opening an issue in our discord forum: link.alejandro-ao.com/discord it's free :)
Not the GPT-4 model itself! The model only has access to its training data. The ChatGPT Plus app uses a feature like the one here to enrich what the model does :) This video shows how that mechanism works, using the latest version of LangChain
@@alejandro_ao Yeah, of course it doesn't use it per se, but rather sends a query, launches a function that fetches the content of the page, then reads/interprets the content and "clicks" to change page and so on until it gets results. The intent of the video went over my head :) Keep it up! Love the content, there's not enough of it on RU-vid!
Hi Alejandro, great clip. I'm encountering this particular error, especially after I replace one website with another. Is there a way to reset the session_state after loading a new website to save memory? BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens. However, your messages resulted in 15766 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
Hey there. That's odd. About the first question: you can deal with this by creating a button that flushes all the variables stored in Streamlit's session state, something like `if st.button("flush"): st.session_state.clear()` (session state is dict-like, so `clear()` is the way to empty it). If you are having this problem without a super long conversation history, you may want to use a shorter chunk size. The `RecursiveCharacterTextSplitter` takes a `chunk_size` parameter, which you can set to the maximum number of characters you want per chunk. You can get more info about this text splitter here, it's actually pretty cool: python.langchain.com/docs/modules/data_connection/document_transformers/recursive_text_splitter
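to make the chunk_size idea concrete, here's a deliberately naive character-based splitter (the real `RecursiveCharacterTextSplitter` is smarter: it tries paragraphs, then sentences, then words before hard-cutting, and supports overlap):

```python
# Naive chunking sketch: cut text into pieces of at most `chunk_size`
# characters, preferring to break at whitespace. Illustration only; not
# LangChain's actual implementation.

def naive_chunk(text, chunk_size=1000):
    chunks = []
    while len(text) > chunk_size:
        cut = text.rfind(" ", 0, chunk_size)  # prefer a whitespace boundary
        if cut <= 0:
            cut = chunk_size                  # no space found: hard cut
        chunks.append(text[:cut])
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks

chunks = naive_chunk("word " * 500, chunk_size=100)
```

smaller chunks mean less text per retrieved document, which keeps the final prompt well under the model's context limit.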
Thank you, Alejandro, for your clear and insightful tutorials. They are much appreciated. I rebuilt your full solution and it's working like a charm. However, when I ask it a question outside the context of the webpage (like "What's the capital of France?", for example), it still answers. How can I ground it to answer only from the context? I tried adding the following to the system prompt of get_conversational_rag_chain: ("system", "Answer the user question based on the below context. If the context does not contain the answer, do not make up an answer and respond with \"I am sorry I cannot answer this out-of-context question!\": Context:{context}") However, it still answers irrelevant questions. What do you think?
Hi, thanks a lot for sharing. I also want to ask about the error message `APIConnectionError: Connection error` while running `Chroma.from_documents(chunks, embeddings)`. I am using AzureOpenAIEmbeddings. How do I solve this error? Thanks for helping.
Awesome video. What extensions do u use in VSCode? Seems to be very helpful. Also, can you please show how to input many data sources as an input. i.e many pages of a website
You are a rare gem. I really appreciate your knowledge sharing. Please release a video that uses natural language to SQL and that we can connect to WhatsApp. To make things more exciting, we could load images from WhatsApp into our database.
I tried your code and it works, but when I input a question I receive a message: TypeError: serializable.__init__() takes 1 positional argument but 2 were given
hey mate, can you log in to our discord server and post your question in the forum? it's free and that way it's easier to help you out. remember to include your code and the full error message so that we can help you! here is the link: link.alejandro-ao.com/discord
Amazing content as always. I have a request: Could you create a tutorial on creating a vector database with PDF files and using LangChain to query on it?
Hi! The chat is showing the scraped text and the message from the prompt template. How do I solve this? (We only want to show the queries and the responses.)
you are right! currently, it is working because it's getting the `user_query` variable from the wider scope. but it should be using the function param instead or it will stop working as soon as you modularize the app. thanks! i will push the corrected code in a moment :)
Hi Alejandro, thanks for your videos; they helped me take my first steps with LLM models. Would it be possible for you to show a GPU version of your last LangChain videos, i.e. how to run a chatbot over your own PDF on a GPU? Thanks a lot.
hey Mathias, thanks mate, it means a lot! sure thing, i'm very glad that you share these video ideas that can be super useful to the community! i'll be working on a video about it!
hey there. this method only loads the web page you pass to it. you would have to code something a bit more sophisticated to crawl the entire website! beautifulsoup, browserless and puppeteer are your friends 😎
@@alejandro_ao Hi. It would be appreciated if you could add one last part showing how you would use BeautifulSoup to extend this app from a web-page chat to a whole-website chat. Thanks.
that's miro.com/ but i have since switched to excalidraw.com/ (example here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-kBXYFaZ0EN0.html ) both have a free tier
Are you referring to LangChain's document loader documentation? Many document loaders are available; this one is the web page loader. You pass it websites like blogs or articles to retrieve data, then query it using the OpenAI API. Read the LangChain documentation, and also check out the LlamaIndex data framework.
hey there, yeah this is correct! you would have to scrape the whole website first to chat with the whole website. you can use beautifulsoup to do that too, or you can use a no-code tool like octoparse (it's great). sorry about the title, the keyword rating for "chat with website" was like 50x better than "chat with web page" :S
That's what I'm very curious about. I made a PDF web scraper for mining documents to train a model, but I could only get it to pull the ones on the actual page the URL directed to. Wondering if there's a technique for automatically searching through all the pages associated with the original URL. I'm sure there is; I'm just a programming noob, so I'm learning as best I can. ✌️
hey there! there is a way indeed, but it is a bit more complex. you would have to either: 1. have access to the website database. this way is simpler because you would just have to apply a RAG algorithm to a database. 2. scrape the website. this is more complex, as it requires using something like python's beautifulsoup to scrape the contents of the entire website. but beware because some websites don't allow bots (sometimes they can even try to get you in trouble). a no-code tool for scraping that is very good is octoparse, but know that this is on the edge of what is allowed and they have had several lawsuits in the past for making scraping so easy.
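for option 2, here's a stdlib-only sketch of the first step of a crawler: pulling the same-domain links out of a page's HTML so you can follow them. (a real crawler would also fetch pages over the network, respect robots.txt, and de-duplicate visited URLs; the example URL below is made up.)

```python
# Extract same-domain links from HTML using only the standard library.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)  # resolve relative links
        # keep only links on the same domain as the page we started from
        if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
            self.links.append(absolute)

html = '<a href="/blog/post-1">Post</a> <a href="https://other.com/x">Other</a>'
parser = LinkExtractor("https://example.com/")
parser.feed(html)
```

you would then fetch each collected link, extract its text, and repeat until you have the whole site; BeautifulSoup makes the parsing part nicer than the raw stdlib parser.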
Question: if the retrieval chain is already finding the most relevant document chunks based on conversation history and the user's input and passing them through {context}, why do we need to integrate the retrieval chain using "create_retrieval_chain(retriever_chain, stuff_documents_chain)"?
you mean all at the same time? you can use streamlit pages to merge some of the other apps we've built in this playlist! Here's their docs on how to do that: docs.streamlit.io/get-started/tutorials/create-a-multipage-app i'll be updating the chat pdf video soon with the latest langchain as well
Great tutorial! I'm a new subscriber and found it really helpful. I'm interested in using the Pinecone vector database for my projects. Could you please provide some guidance on how to get started with it? Any tips or resources would be greatly appreciated. Thank you!
Excellent video. But just for suggestion can you make video on how do we deploy the same code using some microservices like fastapi? As most of your videos are using streamlit ( I actually learned a lot about streamlit 😅) but in case of simple app deployment on even localhost with fastapi or flask will be very helpful.
You can get access to the API by signing up for billing on the OpenAI API platform (not the same as getting ChatGPT Plus). OR you can use Groq for free ;) they have a bunch of open source models and they are free to use. just swap a couple of lines in this code. here are the docs with langchain: python.langchain.com/v0.2/docs/integrations/chat/groq/
hey there. this only goes through the webpage that you load in the URL. if you want to scrape the entire website you would have to use another kind of technique, such as scraping with beautifulsoup, browserless or even puppeteer
@@alejandro_ao I looked at the langchain docs and I was wondering if you could do a video about the differences between all those URL loaders like WebBaseLoader, UnstructuredURLLoader, RecursiveURLLoader and how do we know when to use which
hey, in case you haven't seen it. here's the video on how to put this to production in a simple, free way: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-74c3KaAXPvk.htmlsi=xfL4RZuDTb3H4rgr
could you make a version where you don't delete stuff all the time? it wasn't always clear what was supposed to be deleted and what needed to be kept, so i often missed when you did it, which made it harder to follow what you were doing.
Hi, great video! But after I change the website URL, it keeps answering based on the first URL. I closed the browser and opened it again, inserted a different URL, and asked for the title of the article, and it still gave the title from the first URL. Even after changing the URL in the text input, the program keeps answering based on the first one.
Friend, could you teach me how to receive a data stream from OpenAI with langchain through document analysis? I swear I've put a lot of effort into figuring it out, but I'm struggling a bit with the language and the new documentation 😢
hey my friend, sure thing. but i'm not sure i can help more without more info. why don't you bring that up in our discord server? maybe we can help you out there: link.alejandro-ao.com/discord
I am having trouble with the Chroma library. When I import it, my editor only recognises the chroma class instead of Chroma. When I use chroma, it says it doesn't find from_documents. But when I try Chroma, it says it doesn't exist. Can someone point me in the right direction?
hey there, yeah, i have seen this issue before. try verifying that chromadb is correctly installed in your virtual environment by running `pip freeze | grep chromadb` in your terminal. if it is installed, just import Chroma even though your editor recommends `chroma` for some reason. this worked for me. let me know how it goes. also, feel free to join the discord server for more news about the channel and the community! 👉 link.alejandro-ao.com/discord