This is great work! It seems like this approach limits us to one or the other, and from some testing the routing appears to be greedy. How would we have the option to use either static or dynamic routing?
We can include both static and dynamic routes in the same route layer - it will decide not to use a route if none are similar enough (i.e. it returns Route(None)). We do want to add route-specific thresholds to improve the flexibility here, and down the line some auto-optimization seems reasonable too.
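A minimal sketch of the idea above - one layer holding both a static and a dynamic route, returning no route when nothing scores high enough. This is not the semantic-router API: `ToyRoute`/`ToyLayer` are hypothetical names, and bag-of-words cosine similarity stands in for real embeddings.

```python
from collections import Counter
from math import sqrt

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (stand-in for an embedding model)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

class ToyRoute:
    def __init__(self, name, utterances, function_schema=None):
        self.name = name
        self.utterances = utterances
        self.function_schema = function_schema  # present -> dynamic route

class ToyLayer:
    def __init__(self, routes, threshold=0.3):
        self.routes = routes
        self.threshold = threshold

    def __call__(self, query):
        # Score every route by its best-matching utterance.
        scored = [(max(similarity(query, u) for u in r.utterances), r) for r in self.routes]
        score, best = max(scored, key=lambda t: t[0])
        # Below the threshold the layer chooses no route at all.
        return best.name if score >= self.threshold else None

chitchat = ToyRoute("chitchat", ["how are you", "what's up"])
get_time = ToyRoute("get_time", ["what time is it in new york"],
                    function_schema={"name": "get_time"})
layer = ToyLayer([chitchat, get_time])
print(layer("what time is it in london"))       # -> get_time
print(layer("explain quantum entanglement"))    # -> None (no route similar enough)
```

A per-route threshold, as mentioned above, would just replace the single `threshold` with one value per `ToyRoute`.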
@jamesbriggs True that we can mix route types in the same route layer, but what I mean is that having a static get-time method that modifies the query is no longer possible once you add the function schema - you end up with some sort of TypeError or mapping error. Basically, I want to be able to ask "What time is it in Littleton Colorado" and "What time should I eat lunch", both of which would hit a time-like route but require different logic. And the route layer appears to be greedy in the sense that if I make a route layer with routes ["get_time_static", "get_time_dynamic"], everything seems to route to get_time_static because the utterances are so similar and it was placed first. I hope my concern makes sense.
@SaulRamirez-x6e Okay, yes, I understand now. It should actually work if you set up the routes with enough utterances to separate them both. What I would recommend is to test with a ton of example questions, and where you see the wrong route being chosen, add the query to the correct Route.utterances list - doing this with enough examples should result in the correct routes being identified between them. It has been a while since the logic for route selection was written, so I may need to revisit it, but the choice should be based on which route scores highest and should be independent of the route order.
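The tuning loop described above can be sketched in a few lines. This is a hypothetical stand-in, not the semantic-router internals: word-overlap cosine similarity replaces real embeddings, and `choose` is a pure argmax, so route order never matters - only scores do.

```python
from collections import Counter
from math import sqrt

def score(query: str, utterances: list) -> float:
    """Best cosine similarity (over word counts) against any utterance."""
    def sim(a, b):
        ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
        dot = sum(ca[w] * cb[w] for w in ca)
        norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
        return dot / norm if norm else 0.0
    return max(sim(query, u) for u in utterances)

routes = {
    "get_time_static": ["what time is it"],
    "get_time_dynamic": ["what time is it in new york"],
}

def choose(query):
    # Pure argmax over scores - independent of the order routes were added.
    return max(routes, key=lambda name: score(query, routes[name]))

query = "what time is it in littleton colorado"
before = choose(query)  # misroutes to get_time_static with so few utterances
# Fix: add the misrouted query to the intended route's utterance list.
routes["get_time_dynamic"].append(query)
after = choose(query)
print(before, after)  # get_time_static get_time_dynamic
```

With only one utterance per route the Littleton query overlaps the short static utterance more strongly; one added example flips the argmax, which is the same effect as growing `Route.utterances` in the real library.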
yeah, I've yet to build a full agent using this, but I think we could build a fundamentally different type of agent with this idea - I'm sure there would be both pros and cons but I would love to see it done :)
Thanks for the video, I really liked the concept. I wanted to ask: does dynamic routing need just the OpenAI encoding model for finding the dynamic routes? And does dynamic routing support triggering multiple routes if the query contains info about multiple routes?
I'm also looking for simultaneous function calling possibilities. I realize this gets more and more indeterminate with each layer of abstraction though, so I'm kinda half-expecting that kind of thing not to exist yet.
Hi James, what would be the main difference between a dynamic router and doing RAG? Let's say I have multiple functions and I don't want to pass all of them in my prompt. I could write a few example queries for each function and use a retriever to get the function that I need. What would be the advantage of the dynamic router vs. the simple retriever? Thanks for your great content!
Hey James, I'm more than sure that OpenAI is doing the same semantic search when choosing a function call. Did you try 10 different functions for dynamic routes, and is there also an option for bypassing the function?
This is a more efficient and deterministic way of doing what OpenAI does with function calling. Rather than providing a description of when to use each tool (which is fed into the LLM call, costing tokens), we define a set of queries that should trigger a route (which does not get fed into the LLM, saving tokens and time). Because we don't feed descriptions into the LLM for decision making, we can also scale the number of tools being used to hundreds, thousands, or more tools (i.e. routes).
Good optimization. I'm currently working on a research problem and will be done within this month; after that, my plan is to contribute to your repo. Also, why not use a small LLM (phi) to generate code which you can then run in a code interpreter? I suggest a small LLM because it generates quickly.
that'd be awesome, it would be great to have you contribute! We can integrate OSS LLMs now too, see github.com/aurelio-labs/semantic-router/blob/main/docs/05-local-execution.ipynb
it is essentially an extreme classification problem, and I have seen that specific use case applied to thousands of classes - so at least that many, as far as I can tell. I will test this out soon though.
less token input/output, because we don't need a full agent description with many different tool descriptions etc. - instead we use a semantic route to choose the tool and pass it directly to the LLM, which generates the response. The second reason for the faster speed is that not all tools require dynamic (i.e. generated) routes; some can be static, trigger a function to retrieve data (for example), and return directly to the final LLM call - with that you're skipping one of the LLM calls.
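The saved LLM call on the static path can be sketched as below. Everything here is illustrative: `call_llm` is a counting stub standing in for a real LLM API, and the route names are just the ones from this thread.

```python
llm_calls = 0

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call; counts invocations."""
    global llm_calls
    llm_calls += 1
    return f"<llm answer to: {prompt}>"

def get_time() -> str:
    return "12:00"  # pretend data retrieval

def handle(query: str, route: str) -> str:
    if route == "get_time_static":
        # Static route: run the function directly, no extra LLM call,
        # then pass the result straight into the final answer call.
        data = get_time()
        return call_llm(f"{query} (context: {data})")
    # Dynamic route: one LLM call generates the function arguments,
    # a second produces the final answer - one more call than static.
    args = call_llm(f"extract arguments for get_time from: {query}")
    data = get_time()  # a real implementation would use `args` here
    return call_llm(f"{query} (context: {data})")

handle("what time is it", "get_time_static")
print(llm_calls)  # 1: the static path needed a single LLM call
handle("what time is it in new york", "get_time_dynamic")
print(llm_calls)  # 3 total: the dynamic path needed 2
```

The route choice itself costs no LLM tokens in either case - only the dynamic path pays for the extra argument-generation call.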
The LLM only creates the function-call output; it does not have to generate a "Thought". That step is done by the semantic routing, which is less versatile but much faster and more stable (in the scenarios the routes account for).