The best tool for this is ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-bcK7LldB3dk.html I like some of the transitions, but sometimes they're a bit too much and seem random. Since we use these persistent elements that transition across pages to indicate some kind of relationship between the previous and the next states, some of your transitions confuse me because I can't immediately see what the relationship is. For example, at 1:23 the selectable tiles (which weren't selected) transition into two switches... does that mean anything? Are they related in some way? I see this as random and a bad use of the design language. However, at 3:14 I like the transition from the switches to the ticks on a paper; that makes sense to me. Epic presentation though
A couple of things I observed: 1. It's not free, because integration with OpenAI is required. 2. It's too slow: for a two-page PDF it takes somewhere around 10-20 seconds to respond, and I'm on a 48-GPU machine.
Is there any way I can test it for free? I used a PDF with only one page and it says "You exceeded your current quota, please check your plan and billing details"
Liam, is there an option to make the assistant always use the data that has been uploaded to the knowledge base? It doesn't read the KB files every time, and it uses links that don't even exist.
@ahmedmiftah8308 What's the point in merging the PDFs if the chunking is going to break them up anyway? Each section should be able to stand as its own PDF, which makes sense anyway.
What if we have multiple PDFs and we want to fetch the answer from the right one? For example: I have 20 PDFs, and when I ask a question it should fetch the (obviously correct) answer from whichever PDF contains it and show it as the output.
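For readers with the same question: you don't need one index per PDF. Chunks from all the PDFs can live in a single index, each tagged with the file it came from, so the best-matching answer is traced back to its source. Below is a toy stdlib-only sketch of that idea (not the tutorial's actual code; the scoring function, file names, and corpus are all illustrative assumptions -- real LangChain loaders attach a `source` metadata field that plays the same role).

```python
# Toy sketch: chunks from many PDFs go into ONE index, each tagged with the
# PDF it came from, so an answer can always be traced to its source file.
# All names here are hypothetical.

def score(query, text):
    """Crude relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def best_chunk(query, corpus):
    """Return the (text, source) pair most similar to the query."""
    return max(corpus, key=lambda pair: score(query, pair[0]))

corpus = [
    ("Transformers were introduced by Vaswani et al.", "attention.pdf"),
    ("The invoice total for March is 400 euros.", "invoice_03.pdf"),
    ("Employees accrue 25 vacation days per year.", "contract.pdf"),
]

text, source = best_chunk("who introduced transformers", corpus)
print(source)  # -> attention.pdf
```

A real embedding model replaces the word-overlap score, but the retrieval shape is the same: one pooled index, per-chunk source metadata.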
Great tutorial! I have hundreds of research papers in PDF format. Can I use this approach to build a vector DB and then chat with ChatGPT? Is there a limit to the size of the DB? Any pitfalls to avoid? Thanks!
Hi Liam, great video. I do have a question about the following code: I notice that we don't have to explicitly turn the query into an embedding before it performs a search against the vector DB. Is that because similarity_search internally calls the OpenAI embedding model to embed the query?
query = "Who created transformers?"
docs = db.similarity_search(query)
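That is indeed the idea: the store keeps a reference to the embedding function you handed it at construction time and applies it to the query inside `similarity_search`, so the caller only ever passes a string. Here is a toy stdlib-only illustration of that pattern (the class and the letter-frequency "embedding" are hypothetical stand-ins, not LangChain's API).

```python
# Toy illustration of why you don't embed the query yourself: the store
# remembers the embedding function and applies it inside similarity_search().

def toy_embed(text):
    """Stand-in embedding: letter-frequency vector over a-z."""
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

class ToyVectorStore:
    def __init__(self, texts, embed_fn):
        self.embed_fn = embed_fn
        self.data = [(embed_fn(t), t) for t in texts]

    def similarity_search(self, query, k=1):
        # The query is embedded HERE, internally -- the caller passes a string.
        q = self.embed_fn(query)
        dist = lambda v: sum((a - b) ** 2 for a, b in zip(v, q))
        return [t for _, t in sorted(((dist(v), t) for v, t in self.data))[:k]]

db = ToyVectorStore(["transformers paper", "cooking recipes"], toy_embed)
print(db.similarity_search("transformer"))  # -> ['transformers paper']
```

In LangChain the same division of labor applies: the embeddings object passed to `FAISS.from_documents` is also used to embed each incoming query.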
Hey Liam, how many PDFs can I use this on? I have 1000+ instructional documents for an information system I use, and I've been trying to create a chatbot with this database embedded for quick question answering. Would I have to combine all the PDFs? Can I put them all through vectorization? What are your thoughts?
Not necessarily, but if you cram it full of thousands of chunks I'd assume recall just gets slower and slower and uses more resources on your system. Best to set up different indexes for different information, or use namespaces (a Pinecone feature).
NameError                       Traceback (most recent call last)
in ()
      1 # Get embedding model
----> 2 embeddings = OpenAIEmbeddings()
      3
      4 # Create vector database
      5 db = FAISS.from_documents(chunks, embeddings)

NameError: name 'OpenAIEmbeddings' is not defined
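For anyone hitting this: a NameError means Python never saw an import that defines the name, usually because the earlier pip-install/import cell wasn't run (or failed). A small demonstration of the error, with the likely fix in the trailing comment (the import path assumes the classic `langchain` package layout used at the time of the tutorial; newer releases may have moved it):

```python
# Toy demonstration (not the tutorial's notebook) of what this NameError
# means: the name was never imported in this Python session.

try:
    OpenAIEmbeddings()  # no import ran, so the name does not exist yet
except NameError as err:
    msg = str(err)

print(msg)  # -> name 'OpenAIEmbeddings' is not defined

# Likely fix: re-run the install/import cell first, i.e. (older langchain):
# from langchain.embeddings import OpenAIEmbeddings
```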
I've written a prompt for GPT-4 that I use with chatGPT in Macromancy formatting to transform it into a legal assistant, and the results have been stellar. Is it possible to encode this prompt into the system you describe so that the bot operates with it in mind?
I've been watching a lot of your videos and they are very helpful but man you gotta stop banging your arms on the table lol - Might I suggest a mic that hangs from the ceiling? Thanks for the content regardless!
Hi Liam, I am getting a FileNotFoundError when running the textract.process command. I have the PDF file in the same project folder as my .ipynb file. I am using Visual Studio Code.
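A common cause of this in VS Code: a relative path like "attention.pdf" is resolved against the process's working directory, which for VS Code notebooks is often the workspace root rather than the notebook's own folder. A small stdlib sketch to diagnose and fix it (the file name and helper are hypothetical, not part of the tutorial):

```python
# Diagnose FileNotFoundError: relative paths are resolved against the
# *working directory*, which in VS Code notebooks may not be the
# notebook's folder. "attention.pdf" is a hypothetical file name.
import os

def resolve(path, base=None):
    """Return an absolute path, anchored at `base` (or the CWD) if relative."""
    if os.path.isabs(path):
        return path
    return os.path.join(base or os.getcwd(), path)

print(os.getcwd())               # where Python is actually running
print(resolve("attention.pdf"))  # where textract would look for the file
# Fix: pass an absolute path, e.g. resolve("attention.pdf", base="/path/to/project")
```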
WARNING:langchain.embeddings.openai:Retrying langchain.embeddings.openai.embed_with_retry.._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details.
This pops up while I'm creating the vector DB.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. yfinance 0.2.18 requires beautifulsoup4>=4.11.1, but you have beautifulsoup4 4.8.2 which is incompatible.
Thank you, it worked perfectly despite generating an error on the pip install: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. yfinance 0.2.18 requires beautifulsoup4>=4.11.1, but you have beautifulsoup4 4.8.2 which is incompatible.
Hi Liam, I am getting an 'Authentication Error' when running section 2 of the code ("Embed text and store embeddings"). I have not changed anything yet, just running it as is. Any suggestions?
This is not a 5-minute job for almost any PDF, because many contain table data or multiple columns. If you want it done in 5 minutes, it's going to have to be text-only. So your video title is deceptive, and this will not work well.
This is great. Is it possible to retrieve images from the PDF? I have a PDF with many graphics that help understand the content. Do you have any ideas as to how I can provide images as part of the conversation?
Thanks, but it would be interesting to do this without the OpenAI API, since it's paid and analyzing PDFs with it would be very expensive for large projects. It could be another Hugging Face model; I'm trying to do something in this direction. If you have any ideas, let me know!
Can I ask whether you paid for the OpenAI key or did it with the free trial? Because I'm encountering this error: RateLimitError: You exceeded your current quota, please check your plan and billing details.
There is a little error: in the embedding section, OpenAIEmbeddings is not defined. If I'm not wrong, just add the line from langchain.embeddings import OpenAIEmbeddings. (I wonder, out of 115,760 views, how many have really done the tutorial XD)
Thank you very much for this great video!!! One question: in the "Create chat bot with chat memory (OPTIONAL)" part, I received the following message: "DeprecationWarning: on_submit is deprecated. Instead, set the .continuous_update attribute to False and observe the value changing with: mywidget.observe(callback, 'value'). input_box.on_submit(on_submit)" Why? Would you be able to fix it?
👏👏 Hey Liam, your five-minute tutorial is fantastic! Kudos and thanks for putting the effort to produce it. Your app is exactly what any knowledge worker is craving for: We all have gigabytes of pdf files in some folder named "READ", "TO READ" or "__TO READ" (so it stays on top of the root :), but never get to it (probably distracted by all these tutorials to become more productive we love to watch). A bot that can read that stuff for us, so we can continue to wing it is a true godsend. :D
Nice tutorial! I learned a lot from it. I used the learnings and added my own spin (using a data sync tool to pull from custom knowledge base and integrating with Pinecone instead of feeding input from a static doc), wondering what you think: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-qyUmVW88L_A.html Anyway, thanks for making an awesome informative video!
NameError                       Traceback (most recent call last)
in ()
      1 # Get embedding model
----> 2 embeddings = OpenAIEmbeddings()
      3
      4 # Create vector database
      5 db = FAISS.from_documents(chunks, embeddings)

NameError: name 'OpenAIEmbeddings' is not defined
Solution plzzz!!!
I am creating a chatbot to help employees, I have a 220 page contract pdf that I need my chatbot to be able to answer questions about accurately. The issue is, fine-tuning with the data doesn’t produce accurate outputs. Would this be a good way to achieve this?
Great video! I was wondering: why is it a private chatbot when you're using an OpenAI key and sending the information to GPT-3.5? How can you secure sensitive data with your method? Thank you for sharing your knowledge.
Out of complete ignorance: is LangChain the best method currently available to improve the performance of our LLM chatbots? If not, what is, or what other methods are out there that I may be missing? Thanks for answering.
What is the point of doing this? You could just put the data in MySQL and write PHP code to search it. This is overcomplicating the task just so you can say: "hey guys, look what I made with AI". Completely useless.
These are lazy videos... show an example. Do you want people to buy? Show solutions, not features. Sales rules. Show how this solves issues. I just bought ActivePieces, but this video doesn't show me why I should add TaskMagic.
Hi, sorry, there is an issue in Colab with the first script:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. pydrive2 1.6.3 requires six>=1.13.0, but you have six 1.12.0 which is incompatible. yfinance 0.2.36 requires beautifulsoup4>=4.11.1, but you have beautifulsoup4 4.8.2 which is incompatible.
By the way, do you plan to make an adaptation for Mistral AI?
Thanks for the super video. I have a question: in the overview you show that GPT-3.5 is used, i.e. that the query is ultimately processed by 3.5. But in the code I can't find any reference to it. Where is my mistake?
A problem I found with this approach: the bot will reply to irrelevant questions, e.g. "who is Spider-Man?". Even after providing a prompt that clearly instructs the system not to respond if the current context doesn't hold the knowledge, the LLM replies anyway. How do you handle this scenario?
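One common workaround is to gate on the retrieval step rather than trusting the prompt alone: if no retrieved chunk is relevant enough, short-circuit with a refusal and never call the LLM. A toy stdlib sketch of the idea (the scoring function, the threshold, and all names are illustrative assumptions; with a real vector store you would threshold the similarity score instead):

```python
# Toy sketch of score-gated answering: check retrieval relevance first and
# refuse before the LLM ever sees an off-topic question.

def relevance(query, chunk):
    """Crude relevance score: shared lowercase words (stand-in for cosine)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def answer(query, chunks, min_score=1):
    best = max(chunks, key=lambda c: relevance(query, c))
    if relevance(query, best) < min_score:
        return "I don't know based on the provided documents."
    return f"Context: {best}"  # in the real app, this context goes to the LLM

chunks = ["Transformers were introduced in the Attention paper."]
print(answer("who is spiderman", chunks))
# -> I don't know based on the provided documents.
```

LangChain's FAISS wrapper exposes `similarity_search_with_score`, which returns distances you can threshold the same way, though the right cutoff is something you'd tune for your data.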
Thanks a lot man, I'd been trying to get this to work in other ways for days. This was so easy; great tutorial. How would you transfer something like this to a user-friendly UX/UI?
Hi, THANK YOU for sharing your knowledge. Could you please let me know how many PDFs we can load using this technique? And does the LLM remember which PDFs it has been given, or do we have to re-process the PDFs before running each query?
I have no idea what "chunks" are or how to count them. I also didn't understand where and how I have to change the code so the program finds one or more of my PDFs. Is it the path or just the file name? What if I have more than one PDF? Should I convert the PDFs into a chunk file, or just split them and keep them as PDFs? How to insert multiple PDFs into the code wasn't shown. Sorry, but if you do a tutorial and assume people already know how to do it, then you don't need to do it at all.
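For readers with the same confusion: "chunks" are just overlapping slices of the text extracted from the PDF, cut small enough to embed; you don't convert anything into a "chunk file", and you keep your PDFs as PDFs. A toy character-based splitter shows the idea (in the tutorial a LangChain text splitter does this; the sizes and names here are illustrative assumptions):

```python
# What "chunks" are: after the PDF text is extracted, it is cut into
# overlapping pieces small enough to embed. A toy character-based splitter:

def split_text(text, chunk_size=20, overlap=5):
    """Slide a window of `chunk_size` chars, stepping forward by
    `chunk_size - overlap` so consecutive chunks share some text."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghijklmnopqrstuvwxyz" * 2, chunk_size=20, overlap=5)
print(len(chunks))  # -> 4
print(chunks[0])    # -> abcdefghijklmnopqrst
```

For multiple PDFs you don't merge the files: you extract text from each one in a loop, split each into chunks, and add all the chunks to the same vector index.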