
AI Meets SQL: Developing AI Database Agent in 10 Minutes 

Denis Magda
1.8K subscribers
7K views

Published: Oct 3, 2024

Comments: 44
@vamsiraghu3258 · 7 months ago
Excellent explanation and demo. Thank you!
@DevMastersDb · 7 months ago
Thank you for the feedback! Glad you found it useful.
@slowjocrow6451 · 6 months ago
Crazy stuff, thanks for the video
@applepeel1662 · 7 months ago
Really cool! Thanks a lot
@DevMastersDb · 7 months ago
Glad you liked it! Anything else you'd like to learn about? It should be related to databases (the focus of my channel).
@supimon9146 · 8 months ago
Many thanks, and keep going, my friend!
@DevMastersDb · 8 months ago
Glad you liked it, my friend!
@stevenhkdb · 6 months ago
Crazy useful and straight to the point!!!
@DevMastersDb · 6 months ago
Yep, this stuff is crazy. And that’s just the beginning. It’s gonna be much wilder soon )
@ivonne8412 · 7 months ago
Nice, well-explained content. Subscribed!
@DevMastersDb · 7 months ago
Welcome aboard! Glad you found the tutorial useful 👍
@anuragangara2619 · 4 months ago
Thanks for this video! Got some really promising results in just a few hours! Quick question -- I don't need the context to include the full tables ["products", "users", etc.]. For my use case, I only need context for a single user (i.e., products for user 1005, purchases for user 1005, etc.). If I provide the full tables in include_tables, it very quickly hits the token limit. Is there a way to dynamically reduce the amount of context when initializing the LangChain database agent?
@DevMastersDb · 4 months ago
Excellent! Glad you managed to get it working on your end 👍 Try instructing the LLM to retrieve that context dynamically. For instance, you can tell it to execute the query "select name, price from products where id = {id}", passing the id as a parameter. Then the LLM can run this request against the database and pull user-specific context. Also, LangChain supports various tools/actions that let the LLM pull info from other sources or perform various actions: python.langchain.com/v0.1/docs/modules/tools/
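A minimal sketch of this suggestion, using the connection string and table names from the notebook snippet quoted later in the thread; the prompt wording and user id 1005 are illustrative, not the notebook's actual code:

```python
# Sketch, not the notebook's exact code: restrict the agent to a few tables
# (only their metadata is sent to the LLM) and instruct it to pull
# user-specific rows with a parameterized query instead of whole tables.
from langchain_community.agent_toolkits import create_sql_agent
from langchain_community.utilities import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri(
    "postgresql+psycopg2://sql_agent:password@localhost:5432/postgres",
    include_tables=["products", "users", "purchases"],
)

llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = create_sql_agent(llm=llm, db=db, verbose=True)  # verbose shows the generated SQL

user_id = 1005  # hypothetical user id from the comment above
agent_prompt = (
    f"Answer only for the user with id = {user_id}. "
    f"When product details are needed, run: "
    f"select name, price from products where id = {user_id}. "
    "Question: what products has this user purchased?"
)
result = agent.invoke({"input": agent_prompt})
print(result["output"])
```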
@anuragangara2619 · 4 months ago
@@DevMastersDb That makes sense. The issue is (unless I'm misunderstanding) that passing entire tables to the LLM, regardless of whether the LLM knows it should filter down to a subset of the data, seems to take a lot of tokens. I.e., we're providing a lot of context and then asking the LLM to disregard most of it (as opposed to providing the narrow context in the first place). As a result, if I add more than two tables, I get the error: {'error': {'message': "This model's maximum context length is 4097 tokens, however you requested 10494 tokens. Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}} I'm extremely new to this stuff (just a day or two), so I could totally be missing something! One thing I'm going to try next is to create a view with the data after I've applied some joins and filters, and then pass the view in include_tables instead, so I'm providing just the minimum context the model would need. Not sure if that'll work, or if it's even the right way of thinking about it 🤔
@DevMastersDb · 4 months ago
@@anuragangara2619 How many tables do you have? First things first: yes, ask the agent to look only into the necessary tables by listing them in "include_tables". The LLM will pull only the metadata of those tables, not the actual data, so it should fit in the context window. Then, in your system message for the LLM, you also say to use this query to pull the actual data for a selected user. If that data set doesn't fit into the context, try filtering by user_id/product_id and some time frame. Anyway, take it slowly and learn LangChain's capabilities by playing with a few tutorials. You'll definitely get it figured out!
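If the table metadata alone still overflows the context, the view idea from the previous comment can be tried like this. A hedged sketch: the view name and its definition are made up, and LangChain's view_support flag is assumed to be needed so the view is reflected alongside tables:

```python
# Sketch: expose only a narrow, pre-joined/pre-filtered view to the agent.
# "user_purchases_view" and its definition are hypothetical examples.
from langchain_community.utilities import SQLDatabase

# Created once outside the agent, e.g.:
#   CREATE VIEW user_purchases_view AS
#   SELECT u.id AS user_id, p.name, p.price, pu.purchased_at
#   FROM users u
#   JOIN purchases pu ON pu.user_id = u.id
#   JOIN products p ON p.id = pu.product_id;

db = SQLDatabase.from_uri(
    "postgresql+psycopg2://sql_agent:password@localhost:5432/postgres",
    include_tables=["user_purchases_view"],  # only this view's metadata reaches the LLM
    view_support=True,  # assumed necessary so views are reflected, not just tables
)
```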
@anuragangara2619 · 4 months ago
@@DevMastersDb Hmm, odd -- only 5 or 6 tables; it's a personal project, so it's just me. I'll look into that then! Thanks so much for the great video, really enjoyed this!
@DevMastersDb · 4 months ago
@@anuragangara2619 Make sure the LLM doesn't pull actual data; enable the "verbose" mode to see what happens. Also, start with 1-2 tables and then add more, and see when it breaks. That should help you discover the root cause. Glad you liked the video, thanks for the feedback!
@kollisreekanth · 6 months ago
Really wonderful video. Thanks for sharing it with everyone. I just have a question: can we use this with NoSQL databases like MongoDB/DynamoDB?
@DevMastersDb · 6 months ago
Glad you found the video useful! As for MongoDB and other NoSQL databases, I don't see that LangChain supports agents for them. But some folks have found a way to create custom agents using LangChain's foundational capabilities: omershahzad.medium.com/agent-for-mongodb-langchain-ccf69913a11a
@kollisreekanth · 6 months ago
@@DevMastersDb Thank you so much for the quick reply. Appreciate it 🙏🏼
@anagai · 6 months ago
What model is this using? Can we do this with Azure OpenAI?
@DevMastersDb · 6 months ago
I used OpenAI GPT-4. Absolutely, you can use other models, including Azure OpenAI.
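For the Azure OpenAI option, a minimal sketch of swapping the chat model while keeping the rest of the agent unchanged; the endpoint, deployment name, and API version below are placeholders:

```python
# Sketch: same agent, different chat model. All values here are placeholders.
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # your Azure OpenAI resource
    azure_deployment="gpt-4",                               # your deployment name
    api_version="2024-02-01",
    temperature=0,
)
# ...then pass this llm to create_sql_agent(...) exactly as with ChatOpenAI.
```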
@SAFEEMMOHAMMED-g4x · 8 months ago
Thank you so much. Just one question about prepare_agent_prompt: I'm trying to create 5 different agent prompts, for example "query with MySQL", "query with Postgres", "query with Aurora". Should I create 5 different agent prompts, or will it be handled via {input_text}?
@DevMastersDb · 8 months ago
Thanks! Glad you liked the video 👍 The {input_text} is the user prompt (your actual question) that you ask later in the notebook. What you can do is add {database_dialect} as another parameter to the agent prompt's text and pass Postgres, MySQL, or another database name. If the behavior differs a lot between databases, then just create a unique agent prompt for every database.
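A short sketch of that suggestion; prepare_agent_prompt here is a hypothetical stand-in for the notebook's helper, extended with a database_dialect parameter:

```python
# Hypothetical version of the notebook's prompt helper with a dialect parameter.
def prepare_agent_prompt(input_text: str, database_dialect: str) -> str:
    return (
        f"You are working with a {database_dialect} database. "
        f"Generate SQL that is valid for {database_dialect}. "
        f"Question: {input_text}"
    )

# One helper, many databases:
postgres_prompt = prepare_agent_prompt("Which product sells best?", "PostgreSQL")
mysql_prompt = prepare_agent_prompt("Which product sells best?", "MySQL")
```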
@huiraym · 8 months ago
Wonderful video!!! What if the table or column names, or even the data, are not English-like? E.g., a product table is named PD and the columns are id, prodcode, typecode, or some combination of columns for a particular piece of info. The data itself can sometimes be some ridiculous abbreviation. How would you educate it? Via the prompt? Thanks so much 😂
@DevMastersDb · 8 months ago
Thanks, glad you liked it! Non-English data is easy, since the OpenAI GPT LLMs speak and understand many languages. For instance, you can run the Jupyter notebook from the video and ask questions in a different language and/or insert non-English data; the LLM should handle it easily. As for cumbersome abbreviations, I have no idea for now :) but I'll sort this out while preparing for the following live stream next week. Join if you wish, I'll make sure to answer the question: ru-vid.com-UvpSBHJFdU?si=oirCEwPCnHRJUBYQ
@SAFEEMMOHAMMED-g4x · 8 months ago
Second question: I want to switch between different APIs based on the chat prompt. In this context, how will agents work? Can you please create a video or share some documentation?
@DevMastersDb · 8 months ago
You mean you want to use another LLM, one that is different from OpenAI? If yes, LangChain supports many LLM providers.
@SAFEEMMOHAMMED-g4x · 8 months ago
@@DevMastersDb By API I mean our software application's API routes. I want to call an API depending on the prompt, for example /teacher and /student: if the prompt is "show student details", the student route should execute. So shall I go with the agent concept? Please guide. Thank you so much for your prompt response.
@DevMastersDb · 8 months ago
I see. In fact, the model (at least OpenAI GPT-4) can make a call to your API depending on a user prompt. You need to create an OpenAPI spec that defines your /teacher and /student endpoints. Then the GPT model will execute one of the endpoints depending on the prompt. Check my latest video on this topic; it should be a perfect starting point for you: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Ysh9dwia8FM.htmlsi=fr9bWEOFrmSaAsO1
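As an alternative to a hosted GPT with an OpenAPI spec, the same routing can be sketched directly with OpenAI function calling: describe each route as a tool, let the model pick one, then call your own API. The base URL, route names, and parameters below are illustrative assumptions:

```python
# Sketch: let the model choose between /teacher and /student based on the prompt.
# The routes, base URL, and parameters are illustrative assumptions.
import json
import requests
from openai import OpenAI

client = OpenAI()
tools = [
    {"type": "function", "function": {
        "name": "get_student", "description": "Fetch student details",
        "parameters": {"type": "object",
                       "properties": {"student_id": {"type": "string"}},
                       "required": ["student_id"]}}},
    {"type": "function", "function": {
        "name": "get_teacher", "description": "Fetch teacher details",
        "parameters": {"type": "object",
                       "properties": {"teacher_id": {"type": "string"}},
                       "required": ["teacher_id"]}}},
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Show student details for id 42"}],
    tools=tools,
)

# The model returns which "tool" (route) to call and with which arguments.
call = response.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
if call.function.name == "get_student":
    data = requests.get("https://api.example.com/student", params=args).json()
else:
    data = requests.get("https://api.example.com/teacher", params=args).json()
print(data)
```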
@GeriFitrah · 7 months ago
Will it work with chat history?
@DevMastersDb · 7 months ago
Yes, you need to tweak the current implementation as follows: 1. Store the history in some variable like "chat_history". 2. Pass this variable to the agent prompt that is generated by the "prepare_agent_prompt" method. 3. You can append the chat history to the end of the "agent_prompt" variable as follows: "... also, consider the following chat history: {chat_history}"
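A minimal sketch of those three steps; prepare_agent_prompt is again a hypothetical stand-in for the notebook's helper:

```python
# Sketch: keep a running history and append it to the agent prompt.
chat_history: list[str] = []

def prepare_agent_prompt(input_text: str) -> str:
    # Hypothetical stand-in for the notebook's helper.
    prompt = f"Answer the following question using the database: {input_text}"
    if chat_history:
        prompt += " Also, consider the following chat history: " + " | ".join(chat_history)
    return prompt

def ask(agent, question: str) -> str:
    answer = agent.invoke({"input": prepare_agent_prompt(question)})["output"]
    chat_history.append(f"user: {question}")
    chat_history.append(f"assistant: {answer}")
    return answer
```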
@slowjocrow6451 · 6 months ago
What is RAG?
@DevMastersDb · 6 months ago
RAG stands for retrieval-augmented generation. It's a technique for enhancing the behavior of an LLM by providing it with more context. Usually, you get that context from your own database that stores your own data. For instance, let's say you ask ChatGPT to recommend a few places to stay in NYC between April 1-7, 2023. ChatGPT doesn't know those details; it was trained on some generic data from the past and didn't have access to the private data of Expedia or Booking.com. But Expedia/Booking's own AI assistant can easily address this task by using the RAG approach. You ask their assistant to recommend the places, it queries data from the database and feeds it as context to an LLM (which can be GPT), and then the LLM responds to you like a human would.
@slowjocrow6451 · 6 months ago
@@DevMastersDb Great explanation, thanks. So is your LangChain example RAG? Because it's providing extra metadata etc. to your query? I've looked at a few examples of LangChain and it seems to match my idea of what RAG is, but LangChain doesn't call itself RAG, so maybe I'm missing something. Trying to figure out what all these new buzzwords mean hah
@DevMastersDb · 6 months ago
@@slowjocrow6451 Yep, LangChain doesn't have any RAG-specific APIs, and it doesn't need them. But when you create those chains (with LangChain) and some parts of the chain retrieve additional information from a database or another resource and feed this information as extra context to an LLM - then you're effectively creating a RAG-based solution with LangChain. Hope it makes sense. Also, I found this LangChain cookbook useful, take a look: python.langchain.com/docs/expression_language/cookbook/retrieval
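To make the RAG definition above concrete, a tiny sketch of the pattern (retrieve rows from your own database, then feed them to the LLM as context); the query and the recommendation question are illustrative:

```python
# Sketch of the RAG pattern: retrieve private data, pass it as context to the LLM.
from langchain_community.utilities import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("postgresql+psycopg2://sql_agent:password@localhost:5432/postgres")
llm = ChatOpenAI(model="gpt-4", temperature=0)

# 1. Retrieval: pull the relevant rows (here, a hard-coded illustrative query).
context = db.run("SELECT name, price FROM products ORDER BY price DESC LIMIT 5")

# 2. Generation: answer using only the retrieved context.
answer = llm.invoke(
    f"Using only this data: {context}\n"
    "Which product would you recommend to a budget-conscious shopper?"
)
print(answer.content)
```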
@Mostafa_Sharaf_4_9 · 7 months ago
What if I want my agent to add data to the database? For example, I want to make an agent for a clinic that books a new appointment for a patient and stores their information in the database.
@DevMastersDb · 7 months ago
Yes, the agent can modify data by translating your requests into INSERT, UPDATE, DELETE, and other DML/DDL statements. In the video, the agent connects to the database using a read-only user role. The agent can still generate a statement that updates the database, but the database won't let it execute that statement. Overall, you can create a database role/user for the agent with broader permissions and use it at your own risk.
@Mostafa_Sharaf_4_9 · 7 months ago
@@DevMastersDb Can you please tell me how to make a database user with broader permissions?
@DevMastersDb · 7 months ago
@@Mostafa_Sharaf_4_9 If you follow the Jupyter notebook shared in the description of the video, you'll find the code that opens a connection to the database: SQLDatabase.from_uri("postgresql+psycopg2://sql_agent:password@localhost:5432/postgres", include_tables=["products", "users", "purchases", "product_inventory"]). Now replace the "sql_agent" user with "postgres" in that snippet, and the agent will connect to the database as the "postgres" user, which can do anything.
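A hedged sketch of what a role with broader permissions could look like on the Postgres side; the role name, password, and grants are illustrative, and, as noted above, giving an LLM-driven agent write access is at your own risk:

```python
# Sketch: create a read-write role for the agent (run as a superuser), then connect with it.
# Role name, password, and grants are illustrative.
import psycopg2

admin = psycopg2.connect("postgresql://postgres:password@localhost:5432/postgres")
admin.autocommit = True
with admin.cursor() as cur:
    cur.execute("CREATE ROLE sql_agent_rw LOGIN PASSWORD 'password'")
    cur.execute("GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO sql_agent_rw")
admin.close()

# Then point the agent at the new role instead of the read-only one:
# SQLDatabase.from_uri("postgresql+psycopg2://sql_agent_rw:password@localhost:5432/postgres", ...)
```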
@Mostafa_Sharaf_4_9 · 7 months ago
Thank you! Last question, please: how do I make an agent with a custom prompt and memory? @@DevMastersDb