Hi, great tutorial, thanks! Is there a way for the final output of the chain to be streamed, as one does with other non-agent or non-graph-based models in LangChain? I mean, when using LangServe and a simple LLM, one is able to stream information as ChatGPT does, with words coming up one by one. Is there a way to do this and still use the graphs?
You can also stream the final output if you skip the "RunnableLambda" wrapper that I used and write a real streaming interface with LangServe or another serving framework. But with agents, most of the time is spent on intermediate steps. As we mention in the video, you can stream those intermediate steps and come up with ways to represent them in the UI to reduce perceived latency.
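To make the idea concrete, here is a minimal, framework-agnostic sketch of that pattern: an agent run yields its intermediate steps as events first, then streams the final answer as token-like chunks. The event shapes ("step"/"token") and the fake agent below are illustrative assumptions, not a real LangChain or LangGraph API.

```python
# Sketch: stream an agent's intermediate steps plus its final answer
# as token-like chunks, to reduce perceived latency in the UI.
# The "step"/"token" event shapes and this toy agent are assumptions
# for illustration, not an actual LangChain/LangGraph interface.
from typing import Iterator, Tuple


def run_agent_streaming(question: str) -> Iterator[Tuple[str, str]]:
    # Intermediate steps (tool calls, graph nodes) dominate agent
    # latency, so surface them to the UI as they happen.
    yield ("step", "searching_docs")
    yield ("step", "summarizing_results")
    # Then stream the final answer word by word, ChatGPT-style.
    for word in f"Answer to: {question}".split():
        yield ("token", word)


events = list(run_agent_streaming("what is Modal?"))
steps = [payload for kind, payload in events if kind == "step"]
tokens = [payload for kind, payload in events if kind == "token"]
```

A serving layer (LangServe or anything else) would forward each event to the client as it is yielded, rendering "step" events as progress indicators and "token" events as the incrementally growing answer.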
Well done. I'm always looking for reasons to avoid AWS. Will give this a try with our AI stack. (I almost missed the intro to what Modal was, so I went looking for it as a Python library.)
It's like Dwight from The Office and Richard Hendricks from Silicon Valley became AI engineers XD No hate, I mean it with all the love. Love 'em both! Great video, thanks guys!