
Using LangChain Output Parsers to get what you want out of LLMs 

Sam Witteveen
62K subscribers · 38K views

OutParsers Colab: drp.li/bzNQ8
In this video I go through what output parsers are and how to use them in LangChain to improve the results you get out of your models.
My Links:
Twitter - / sam_witteveen
Linkedin - / samwitteveen
Github:
github.com/samwit/langchain-t...
github.com/samwit/llm-tutorials
00:00 Intro
04:56 StructuredOutputParser
12:26 CommaSeparatedListOutputParser
14:13 PydanticOutputParser
19:00 OutputFixingParser
21:26 RetryOutputParser
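
As a quick reference for the first chapter, here is a minimal sketch of the StructuredOutputParser pattern, assuming the classic langchain Python API and an OpenAI key in the environment; the Colab's exact code may differ.

```python
# Minimal sketch of the StructuredOutputParser pattern (classic langchain
# API assumed; the Colab's exact code may differ).
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

# Describe the fields you want back; the parser turns this into format
# instructions that get injected into the prompt.
schemas = [
    ResponseSchema(name="answer", description="answer to the user's question"),
    ResponseSchema(name="source", description="source used for the answer"),
]
parser = StructuredOutputParser.from_response_schemas(schemas)

prompt = PromptTemplate(
    template="Answer the user's question.\n{format_instructions}\n{question}",
    input_variables=["question"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

llm = OpenAI(temperature=0)
raw = llm(prompt.format(question="What is the capital of France?"))
print(parser.parse(raw))  # {'answer': ..., 'source': ...}
```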

Science

Published: 7 Jul 2024

Comments: 91
@magicofjafo · 1 year ago
Dude, I was having exactly these output parser problems just last night. This is exactly what I needed. Thanks.
@pankymathur · 1 year ago
Thanks a lot Sam, I really like the way you went deep into explaining all the different types of parsers with examples. This is definitely one of the most top-notch videos you've released, keep it up 😊
@toddnedd2138 · 1 year ago
Thank you. There are pros and cons with LangChain. It is a powerful framework, but sometimes it is (imho) a little too verbose when it comes to prompt templates. This adds up if you make a lot of requests to the model and costs unnecessary tokens (in production-ready applications). Therefore I use my own hand-written prompts; the crucial thing is finishing the prompt with the final instruction. Here is an example:
---
The type-2 result should be provided in the following JSON data structure:
{
  "name": "name of the task",
  "number": number of the task as integer, // integer value
  "tool": "name of the tool to solve the task", // omitted
  "isfinal": true if this is the last task // boolean value
}
Respond only with the output in the exact format specified, with no explanation or conversation.
---
So far, this has always worked reliably (over 500 calls). I found this in a lot of papers, so the credit for this goes to some other intelligent guys. From my experience, the names of the fields and also the order of the fields can make a difference.
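A minimal sketch of this hand-rolled approach: finish the prompt with an exact JSON spec, then validate the reply in plain Python. The openai v1 client, model name, and task prompt are illustrative assumptions, not the commenter's code.

```python
# Sketch of the approach above: end the prompt with an exact JSON spec,
# then validate in plain Python. Client/model/prompt are assumptions.
import json
from openai import OpenAI

PROMPT_SUFFIX = (
    "The result should be provided in the following JSON data structure:\n"
    '{"name": "...", "number": 0, "tool": "...", "isfinal": false}\n'
    "Respond only with the output in the exact format specified, "
    "with no explanation or conversation."
)

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Plan the next task.\n" + PROMPT_SUFFIX}],
)
task = json.loads(resp.choices[0].message.content)  # raises on malformed JSON
assert {"name", "number", "tool", "isfinal"} <= task.keys()
```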
@MichaelNagyTeodoro · 1 year ago
I did the same thing; it's better to do the parsing outside of LangChain methods.
@thatryanp · 1 year ago
From what I see of LangChain examples, it seems someone with development experience would be better served by some basic utility functions than by taking on LangChain's design assumptions.
@toddnedd2138 · 1 year ago
@@thatryanp & @MichaelNagyTeodoro There could be some disadvantages to writing your own solutions when it comes to updating to a newer underlying model. Maybe not critical today, but one day it might be a topic. My guess is the LangChain community will be fast to provide updates.
@rashidquamar · 1 month ago
Thanks Sam, I was struggling with output parsers and you helped right on time.
@kuhajeyangunaratnam8652 · 2 months ago
Thanks a lot mate. This is as invaluable as it gets: a code walkthrough with all the explanations. Not to mention the code itself is well documented.
@jasonlosser8141 · 1 year ago
Great video. I'm using core Python to parse right now, but I'll incorporate output parsers in my next rebuild.
@jdallain · 1 year ago
Thank you very much! This is super helpful and something I’ve struggled with
@MariuszWoloszyn · 1 year ago
I used to do both input and output in YAML. It's more human-readable than JSON, hence it works better with LLMs; no missing colons or stuff like that.
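A minimal sketch of this YAML approach, assuming PyYAML; the response string and fields are illustrative.

```python
# Sketch of the YAML idea: ask the model to answer in YAML and parse with
# PyYAML. The response below is a stand-in; fields are illustrative.
import yaml

llm_output = """\
name: summarize report
number: 2
isfinal: true
"""

data = yaml.safe_load(llm_output)  # no braces, quotes, or commas to lose
print(data["name"], data["isfinal"])  # summarize report True
```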
@jolieriskin4446 · 1 year ago
Another variation I've been using is a separate JSON repair method. I usually use a similar technique of showing the example JSON, and I immediately call my validation routine afterwards. If there is an error, send the JSON error and the line number it's on in a separate call, and try up to 3x to repair the output. The nice thing is you can use far fewer tokens on the repair call, and potentially call a more specific or faster model that is tailored to just fixing JSON (rather than wasting an expensive call to GPT-4, etc.).
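A minimal sketch of this repair-call pattern, assuming the openai v1 Python client; the model choice and prompt wording are illustrative.

```python
# Sketch of the separate "JSON repair" pass described above: validate first,
# and on failure send only the bad JSON plus the parse error to a cheaper
# model, up to three attempts. Client and model choice are assumptions.
import json
from openai import OpenAI

client = OpenAI()

def repair_json(text: str, attempts: int = 3) -> dict:
    for _ in range(attempts):
        try:
            return json.loads(text)
        except json.JSONDecodeError as err:
            resp = client.chat.completions.create(
                model="gpt-3.5-turbo",  # cheaper/faster model just for the fix-up
                messages=[{
                    "role": "user",
                    "content": (
                        f"Fix this invalid JSON. Error: {err.msg} on line {err.lineno}.\n"
                        f"{text}\nRespond with only the corrected JSON."
                    ),
                }],
            )
            text = resp.choices[0].message.content
    raise ValueError("could not repair JSON after retries")
```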
@ugk4321 · 1 year ago
Super content, explained well. Thank you.
@RobotechII · 1 year ago
Wonderful content! I'm sending it to my team
@____2080_____ · 1 year ago
Awesome and thank you for teaching
@bingolio · 9 months ago
Sam you are my unsung Hero of AI. THANKS!
@TomanswerAi · 1 year ago
Great explanation thank you!
@tubingphd · 1 year ago
Thank you Sam
@galkim1 · 11 months ago
This is great, thanks
@vikrantkhedkar6451 · 23 days ago
This is really important stuff.
@oz4549 · 1 year ago
I have an agent which goes through a list of tasks. I want the output structure to be different depending on the question asked: maybe in one instance I just want to return JSON, but in another instance I want to return Markdown. I tried to do this with prompts but it is not consistent. Is it possible to do this?
@anindyab · 1 year ago
Thanks for this, Sam. Your videos on LangChain have been incredibly informative and helpful. Here's a request: can you please do a video on creating LangChain agents with open-source/local LLMs? The agents seem to require a specific kind of output from the LLMs, and I think that could be a nice follow-up to this video. In my brief experience, open-source LLMs are not easy to work with when it comes to creating agents. Your take on this will be very helpful.
@samwitteveenai · 1 year ago
The big challenge is most of the released Open Source models can't return the right format. I have a new OpenAI one coming and will try to convert that to open source to show people.
@anindyab · 1 year ago
@@samwitteveenai This is great news. Thank you!
@pranavmarla · 1 year ago
I've been playing around with LangChain for a couple of days and this is really helpful! Output parsers would be great when dealing with tools that need to interpret the response. I hope this gets integrated into SimpleSequentialChains too, because currently SimpleSequentialChains only accept prompt templates which have a single input.
@redthunder6183 · 11 months ago
How does the output parser actually parse the output it gets back, though? Is it just regular code, or is it something more? As an example, what if the model forgets the second " to end a string?
@alihosseini592 · 9 months ago
As you also mentioned in the video, the CommaSeparatedListOutputParser does not really work well (for example, there was a dot at the end of the LLM's response). Is there any other way to get the model to output only a list?
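One pragmatic workaround for the trailing-dot problem, sketched under the assumption of the classic langchain API; the cleanup rule is illustrative, not the video's solution.

```python
# Wrap the stock parser and strip stray punctuation yourself. The parser
# class is langchain's; the cleanup rule is an illustrative assumption.
from langchain.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()
raw = "red, green, blue."  # stand-in for a model reply with a stray period
items = [item.strip(" .") for item in parser.parse(raw)]
print(items)  # ['red', 'green', 'blue']
```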
@giraymordor · 1 year ago
Hello Sam, I have a question: I aim to send a sizable text to the OpenAI API and subsequently ask it to return a few select sections from the text I've dispatched. The text I intend to send is approximately 15k tokens, but the token limit for gpt-3.5-turbo is merely 4k. How might I circumvent this limitation and send this text to OpenAI using the API? This is not for the purpose of summarization, as there are ample examples of that on YouTube. My goal is to send a substantial amount of text to OpenAI within the same context, and for the model to retain what I previously sent. Following this, I would like it to return a few parts of the original text, preserving the integrity of the context throughout these operations. Thank you in advance for your guidance!
@vijaybudhewar7014 · 1 year ago
That is something new, I did not know this... as always, you did your job at its best.
@RobvanHaaren · 3 months ago
Sam, I love your videos, I'm a huge fan. The only feedback I have to make your channel better is to fix your typos, both in your template strings (not a big deal, since the LLM will understand regardless) and in your video titles (e.g. "Ouput" at 4:57), as they may affect your credibility. All the best and keep up the good work!
@samwitteveenai · 3 months ago
Thanks, and sorry about that. I have tended to record these on the fly and put them out. I have someone editing now who will hopefully catch them as well.
@tomaszzielonka9808 · 11 months ago
How does giving a specific role (in this example, a master branding consultant) improve (or impact, in general) the outcome of a prompt? LLMs make predictions based on sequences of words, and I'm trying to reconcile role-playing with the model's output.
@Darkhellwings · 1 year ago
Thanks for the explanations. What I still miss from this tutorial (and some others of yours) is how to customize LangChain's API to go beyond what is provided at the moment. For instance, a simple question raised immediately after watching this would be: how do you implement a custom output parser for a custom format that is not JSON or lists? Is it possible to make something for tables? Thanks anyway, that was still great!
@samwitteveenai · 1 year ago
Almost all customization is done at the prompt level. If you are doing something for a table, you would want to first think through what the LLM would return as a string: a CSV? How would it represent a table, etc.? Then work on the prompt that gets that, and lastly think about an output parser. You raise an interesting issue; maybe I should make a video walking through how I test the prompts first and get that worked out. If you have a good use case, please let me know. One issue I have is that I can't show any projects I work on for clients, etc.
@ElNinjaZeros · 10 months ago
When I try to apply this parsing with models called via LangChain, sometimes it works and sometimes it doesn't. Same with LangChain's Pydantic parser.
@Aroma_of_a_Roamer · 11 months ago
Love your content Sam. I was wondering, have you ever gotten classification/data extraction working with an open-source LLM such as Llama 2? Would love to see a video on this if you have. Thanks, keep up the great work.
@samwitteveenai · 11 months ago
I have been working on this with mixed results; hopefully I can show something with Llama 2.
@Aroma_of_a_Roamer · 11 months ago
@@samwitteveenai You are an absolute champion. I think almost all app development is done exclusively with ChatGPT, since it is a) superior to open-source LLMs and b) app and library developers such as LangChain have geared their development towards it, using their own prompt templates. Each LLM has its own way and nuance as to how to format the prompt in order to make it work correctly.
@imaneb4073 · 2 months ago
Hello, thank you so much for such valuable and creative content that helps us a lot. Please, I have a question: I am using the Pydantic output parser on structured PDF documents to generate a dataset (where I will select only specific fields). I used OpenAI as the LLM, but the problem I face is that I am working with a folder of 100 PDFs, so the code suddenly gets interrupted due to OpenAI's daily request rate limit. How do I handle this, is there a trick? Or another alternative?
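Not addressed in the video, but a common way to handle this kind of rate-limit interruption is to back off and retry; a minimal sketch assuming the openai v1 Python client, with arbitrary wait times.

```python
# Sketch only: retry on RateLimitError with exponential backoff while
# looping over a folder of documents. Waits and model are assumptions.
import time
from openai import OpenAI, RateLimitError

client = OpenAI()

def call_with_backoff(messages, retries: int = 5):
    delay = 2.0
    for _ in range(retries):
        try:
            return client.chat.completions.create(
                model="gpt-3.5-turbo", messages=messages
            )
        except RateLimitError:
            time.sleep(delay)  # wait, then retry with a longer pause
            delay *= 2
    raise RuntimeError("still rate-limited after retries")
```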
@elikyals · 9 months ago
Can output parsers be used with the CSV agent?
@askcoachmarty · 1 year ago
Great vids, Sam! So, is this awesome Pydantic output parser available for Node? I'm finding shaky info in the JS docs. I'm currently using the StructuredOutputParser, but I'm creating some agents that I want to output Markdown. Is it best in JavaScript to just post-process and convert to Markdown? Any pointers or thoughts would be greatly appreciated!
@samwitteveenai · 1 year ago
Pydantic is a Python thing so maybe not in the JS version, but my guess is they will have something like this soon. Technically you could just make it yourself as it is all just setting a prompt. I have a new vid coming out in an hour which shows another way to do the same thing.
@askcoachmarty · 1 year ago
@@samwitteveenai cool. I'll look for that video!
@RedCloudServices · 1 year ago
There is no LangChain plugin in the ChatGPT plugin store. Did they remove it?
@easyaistudio · 1 year ago
The problem with trying to do the formatting in the same prompt that does the reasoning is that it impacts the result.
@samwitteveenai · 1 year ago
You can get the model to give reasoning first as part of the output. Ideally you want the reasoning instructions earlier than the output instructions.
@PaulBenthamcom · 1 year ago
With regard to the Pydantic output parser, when it gets badly formatted output, do you get that as your prompt result, or does the parser feed the error back to itself to correct it until it has well-formatted output to return to the user?
@samwitteveenai · 1 year ago
It will give an error and you can set that to trigger an auto retry etc.
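The auto retry Sam mentions can be wired up with LangChain's OutputFixingParser; a minimal sketch using the classic langchain class names, with an illustrative Pydantic model.

```python
# Sketch of the auto retry mentioned above: wrap a parser in an
# OutputFixingParser so a parse failure triggers one extra LLM call to fix
# the formatting. Class names follow langchain; the Joke model is illustrative.
from langchain.output_parsers import PydanticOutputParser, OutputFixingParser
from langchain.chat_models import ChatOpenAI
from pydantic import BaseModel  # classic langchain pins pydantic v1

class Joke(BaseModel):
    setup: str
    punchline: str

base = PydanticOutputParser(pydantic_object=Joke)
fixer = OutputFixingParser.from_llm(parser=base, llm=ChatOpenAI(temperature=0))

# A malformed response raises in base.parse, but fixer.parse asks the
# LLM to repair the text and then re-parses it.
joke = fixer.parse('{"setup": "Why did the chicken cross the road?"')
```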
@cloudprofessor · 3 months ago
How can we use output parsers with RetrievalQA?
@bleo4485 · 1 year ago
Hi Sam, thanks for the video. You should set up a Patreon or something. Your videos have helped a lot. Thanks and keep up the good work!
@samwitteveenai · 1 year ago
Thanks for the kind words.
@BrianRhea · 1 year ago
Thanks Sam! Would using an Output Parser in combination with Kor make sense? Is that worth a video on its own?
@samwitteveenai · 1 year ago
At the moment all of this is changing with the OpenAI functions (if you haven't seen them, I have a few vids about this). Currently LangChain also seems to be rethinking this, so I will revisit some of these. One issue is going to be whether we end up with two very different ecosystems, i.e. OpenAI vs everything else. I am testing some of the new things in some commercial projects, so let's see how they go and then I will make some new vids.
@sethhavens1574 · 1 year ago
I've noticed that using GPT-3.5 Turbo recently there are quite often issues with the model being overloaded. Using LangChain (at least I assume that is where this comes from), the chain will usually retry the LLM query. Is there a way to control the number of retries and the interval between them? And thanks for the awesome content, super useful stuff! 👍
@samwitteveenai · 1 year ago
I think there is a PR submitted to control the number of retries, but I don't think it has landed yet.
@sethhavens1574 · 1 year ago
@@samwitteveenai Cool, thanks for the feedback dude.
@blackpixels9841 · 1 year ago
Thanks Sam! Is it just me, or do you also feel that the API is slower to return JSON 'code' than plaintext? I'm getting upwards of 30 seconds per API call to parse a PDF table into 250 tokens of JSON.
@samwitteveenai · 1 year ago
Interesting. I haven't noticed that. It shouldn't be any slower.
@MrOldz67 · 1 year ago
Hey Sam, thanks again for all these useful videos. I was wondering, would it be possible to use the same output parser to get a JSON file that we could later use as a dataset to train our own language model? If yes, would it be possible to bypass OpenAI in this process and maybe use another LLM, from a privacy perspective? Thanks a lot.
@samwitteveenai · 1 year ago
Yes, absolutely you can use it to make datasets. Lots of people are doing this. It will work with other LLMs, but most of the open-source ones won't have good outputs, so they often fail, etc.
@MrOldz67 · 1 year ago
@@samwitteveenai Thanks for the answer, I will try to find a way to do that. But in the meantime, if you would like to make a video I'll be really interested :) Thanks in advance.
@ohmkaark · 11 months ago
Thanks a lot for the great explanation!!
@user-wr4yl7tx3w · 1 year ago
By chance, does LangChain have an implementation of Tree of Thoughts?
@samwitteveenai · 1 year ago
Not yet but I have been playing around with it. I want to make sure it works for things not just in the paper before I make a video.
@xiam19 · 1 year ago
Can you do a video on ReWOO (Reasoning WithOut Observation)?
@samwitteveenai · 1 year ago
Yeah it looks pretty cool. I will take a proper look.
@MichaelScharf · 1 year ago
Is this not eating up a lot of tokens, especially in the Pydantic case?
@samwitteveenai · 1 year ago
Yes, it does eat up some more tokens, but the Pydantic model really allows you to use the outputs in an API etc. much more easily. Regarding price, it all depends on how much you value the interaction. I see some customers who are happy to pay a dollar-plus for each conversation, which is a lot of tokens. Usually that is still a lot cheaper than having a real human involved, etc.
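To illustrate the point about APIs, a minimal sketch of how a PydanticOutputParser result plugs into an API layer; the model fields are illustrative assumptions.

```python
# Sketch: PydanticOutputParser returns a validated, typed object, so
# handing results to an API layer is trivial. Fields are illustrative;
# .dict() is the pydantic v1-style serializer used by classic langchain.
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price_usd: float

parser = PydanticOutputParser(pydantic_object=Product)
product = parser.parse('{"name": "Widget", "price_usd": 9.99}')
payload = product.dict()  # ready to return from a JSON API endpoint
```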
@picklenickil · 1 year ago
This is what you call... more than a party trick.
@NoidoDev · 1 year ago
I don't get it, maybe I missed something or don't know some important element. Why is the language model supposed to do the parsing as some form of formatting? Why isn't this just done in code with the response from the model?
@samwitteveenai · 1 year ago
Getting the model to output it like that is much easier than trying to write regex expressions for every possible case the model might output.
@shivamkumar-qp1jm · 1 year ago
Can we extract code from the response?
@samwitteveenai · 1 year ago
Yes, take a look at the PAL chain; it does this kind of thing.
@ivanlee7450 · 10 months ago
Is it possible to use another LLM for the output parser?
@samwitteveenai · 10 months ago
Yeah, certainly; then it becomes a bit like an RCI chain, which I made a video about.
@ivanlee7450 · 10 months ago
How about a Hugging Face model?
@mytechnotalent · 1 year ago
New to LangChain, Sam, and I appreciate this video. Really looking for how to do this properly with open-source Hugging Face models rather than the paid OpenAI API.
@orhandag540 · 3 months ago
But what if we want to do that with an open-source LLM (Hugging Face)?
@samwitteveenai · 3 months ago
You can certainly do the same with something like a Mistral fine-tune, etc.
@orhandag540 · 3 months ago
@@samwitteveenai But somehow the prompt template of Mistral is not compatible with LangChain models; I was trying to build exactly this with Mistral.
@gitmaxd · 10 months ago
I disagree! This is one of the sexier parts! It's the hocus pocus of "Prompt Engineering". Great video!
@ashvathnarayananns6320 · 8 months ago
Can you post these videos using open-source LLMs rather than the OpenAI APIs? Thank you.
@samwitteveenai · 8 months ago
I have posted quite a few videos that use open-source models. One challenge is that up until recently the OSS models weren't good enough for a lot of the tasks.
@ashvathnarayananns6320 · 8 months ago
@@samwitteveenai Okay, and thanks a lot for your reply!
@clray123 · 1 year ago
Honestly, the more I watch about LangChain, the less value I see in using it vs. just coding your own interactions with the model. It seems to be doing trivial things at a very high level of text processing while obscuring what it does, and you still have to learn the API and be limited by it.
@MadhavanSureshRobos · 1 year ago
Practically speaking, isn't Guidance so much easier and better to use? For practical purposes these don't seem to add much value.
@samwitteveenai · 1 year ago
I am planning to do a video on Guidance and Guardrails as well.
@MadhavanSureshRobos · 1 year ago
That'll be wonderful!
@hqcart1 · 1 year ago
I've managed to get GPT-3.5 to return JSON for 100k prompts, and it always returned JSON. It took me a few hours to get the right prompt though!
@jawadmansoor6064 · 1 year ago
What parser or other method do you use in chains? For example:

memory = ConversationBufferMemory(memory_key="chat_history")
tools = load_tools(["google-search", "llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    handle_parsing_errors=_handle_error,
    memory=memory,
    verbose=True,
)

I am getting output parsing errors:

Thought: Could not parse LLM output: `Do I need to use a calculator or google search for this conversation? Yes, it's about Leo DiCaprio girlfriend current age raised 0.43 power.` Action: google_search
Observation: Could not parse LLM output: `Do I need to use a ca
Thought: Could not parse LLM output: `Could not parse LLM output: `` Do you want me to look up more information about Leo DiCaprio girlfriend's current age raised 0.43 power?`` Action: google_search``
Observation: Could not parse LLM output: `Could not parse LLM o
Thought: Could not parse LLM output: `` Is there anything else you would like me to do for today?``
AI: Thank you!
> Finished chain.
'Thank you!'