MOST Important AGENTIC Application - Speech to Text to AI Agents (TTS, STT, LLM Router)

IndyDevDan

Subscribers: 17K

Views: 9K

Published: 29 Aug 2024

Comments: 63

@marcossixto5324 4 months ago

2:24 I enjoyed the proof of concept in the beginning, because it helps me determine if it's worth tuning in to the video

@vincentjean6756 4 months ago

The GOAT of AI Coding has given us another gift. Thank you Dan!

@soulessshoe 4 months ago

Awesome. I've found that for voice LLMs, having something like Vapi, where you can interrupt or it waits for your pauses, makes it feel so much better.

@drumlivetonya 4 months ago

I like where you're going with this. This is exactly what I've been looking for!

@AGI-Bingo 4 months ago

I made something similar right after OpenInterpreter launched. Some things to add: TotalRecall, 1Thread, GMoE, AI Telepathy (between agents), Adaptive SkillLibrary, branched flows, self-reflection, benchmarks, user/AI feedback, self-improvement, async anticipation of the likely next message for faster realtime responses, multilingual support, web/API exposure with user permissions, a mic threshold for continuous flowing conversation plus interrupts. Async tasks with updates/notifications/progress reports. Semantic routing instead of keywords. Background online research for URLs not in training. Differentiating between talking "about" Ada and "to" Ada (the right time to respond based on context). And a couple more... I've got a long list/bingo board haha. Lemme know if you wanna collab ❤ All the best!

@aurora.radial 4 months ago

Holy shit man, the tower is quite high, huh? Thanks for sharing it, really interesting stuff. It makes sense of the name of your channel 😄

@AGI-Bingo 4 months ago

@@aurora.radial Thanks man! I'm working on both an intro to the channel - showcasing that bingo board, and also about the AAA+ framework (Advanced Atomic Agents), which is akin to AgentOS but already atomic and composable, with some of the advanced features I listed. Lemme know if there's one you prefer first :) All the best!

@blahblahdrugs 4 months ago

Do these applications use CrewAI or AutoGen or something else? I want to start building mine.

@jak-3D 4 months ago

I am working on personalized agentic assistance as well and would love to collaborate @AGI-Bingo

@blahblahdrugs 4 months ago

@@jak-3D If you want to collaborate I'll need your discord.

@kubasmide223 4 months ago

This is the best AI YT channel. Quality content Dan

@loryo80 4 months ago

This is a project that will let me make a living from my AI passion. I already have a lot of ideas and use cases that will make life easier. Thank you already for this video series.

@goforit5 4 months ago

Excited for series about Ada

@free_thinker4958 4 months ago

When Indy speaks, we gotta listen carefully 💯👏!

@AGI-Bingo 4 months ago

Boom! Let's make it happen!

@maskedvillainai 4 months ago

The second I heard you speaking in this video I knew it was legit. You've made a name for yourself that reflects stability, authenticity and authority. The first time I saw you was the Supabase tutorial on context injection, and I've never learned ML the same way since. Cheers

@johnbarros1 4 months ago

I love this project, it's very fascinating and inspiring. Thank you for sharing this with us!

@jdallain 4 months ago

Seems like a great use case for LangGraph, where you can get finer control of your agents and their direction

@CHNLTV 4 months ago

Looking forward to riding along... I'm going to try to use free STT & TTS libraries (FasterWhisper & OpenVoice). I love the concept and look forward to the innovations here.

@oldmangrizzz 4 months ago

This is spot on, exactly what I have been trying to build myself for the last eight months while doing a piss-poor job! Strong work to you and your people. Great start on your POC; I look forward to snatching some of it up off ya lol

@MarkoTManninen 4 months ago

Deepgram with Claude tools (function calling) will get you a near-realtime voice-to-command-to-voice flow. But then you'd better let your assistant do some real productive work to pay off the bill.

@SkyEther 4 months ago

Another masterpiece brotha! You are my go to AGI guy now :) I feel like I can build my AGI with your guidance! Would love to have a community of all of us together making this vision a possibility!

@reality-drift122 4 months ago

tis a beautiful prototype

@kevinrstruck 4 months ago

This is amazing. I am looking forward to this one. Thank you for sharing.

@krishnapraveen777 4 months ago

Awesome stuff. Really cool

@JohnLewis-old 4 months ago

Very exciting. Thanks for sharing. What LLM are you using in the background?

@aurora.radial 4 months ago

yoh man, This is extremely awesome! Thanks a lot for sharing and explaining it! I'll definitely keep an eye on the channel, and see where it goes. I'm also thinking about creating my own assistant, so it will help a lot. I'll try to share, if I get into anything helpful as well.

@josephtilly258 3 months ago

really cool

@josephtilly258 3 months ago

Do you have a Discord or something for your community?

@ronaldokun 4 months ago

Your code theme is very cool. I don't know if it would strain the eyes if using it exclusively but I'm curious. Amazing work by the way. Keep up the great work!

@YossiDahan- 4 months ago

Reaching god mode

@christopheboucher127 4 months ago

So great!! Looking forward to using it! Do you plan to implement self-improvement later, like the learn function of Open Interpreter O1? And do you plan to make some sort of wrappers (like MemGPT does) to use it with local LLMs? Thanks for all your videos, thanks for sharing your skills and knowledge

@mikew2883 4 months ago

Good stuff! 👍

@TimeLordRaps 4 months ago

I realistically think it would be best to focus on the dynamics of routing. Creating a base routing agent flow probably is a good first step.

@TimeLordRaps 4 months ago

However, compared with videos focused on progressive writing, like your agent OS video where you introduce high-level ideas in progressively greater detail, I feel like this video was too focused on a working prototype. I still appreciate it given your pride in it, but as a new viewer I believe you excel at providing explanatory depth, unlike most other YouTube creators on technical topics. For example, from my perspective this video could've been a dive through the motivation, multiple flows of multi-agent systems, other options (IDK), assumptions (those you've taken and those that can be taken), and the solution (LLM Router), finalized with a call to action for people to build effectively with this solution. Starting the video with a question may be a powerful way to provide the motivation.

@TimeLordRaps 4 months ago

I think dynamic agent generation pre-routing is the next paradigm btw. Great content keep it up.

@TimeLordRaps 4 months ago

I already suggested the feature to langchain/graph.

@AGI-Bingo 4 months ago

I agree, and also bake benchmarking and feedback into it, so us humans and even AIs soon will be able to revise/make new routing flows, and make sure it improves on every iteration

@TimeLordRaps 4 months ago

@@AGI-Bingo I think we don't get self-improvement without explicitly having benchmarks to compare effectiveness on, which then progressively get beaten at lower token constraints, in all steps of learning. Feedback is the training loop already; what I mean is that at each point in the training new feedback is added, so we're probably 1 or 2 new feedback mechanisms away from enhancing training. One of them is RHO-1: they study so much more about token dynamics in training than anything else I've seen, using a smaller model to decide which high-impact tokens to train a larger model on, drastically reducing the quantity of tokens needed in pretraining. This, I would argue, is an AI-AI feedback system. Normally, though, pretraining lacks effective feedback beyond the loss. Fine-tuning is human-AI feedback, and partially human-human feedback: in the first, we determine what they train on, and that changes based on the AI's performance; in the second, we observe what other humans do to decide new paths to expand the human-AI feedback. I have an interesting theory for a feedback mechanism to simulate thinking steps at the cost of inference compute.

@s_streichsbier 4 months ago

I'm just glad to know that I'm not the only one that starts prototyping with multiple copies of a main.py file :)

@brianmi40 4 months ago

YES, somewhere between Open Interpreter, Limitless/Rewind, and Rabbit with your choice of local or online LLM lies OUR FUTURE... The first to LAND THIS could be an overnight UNICORN...

@employaiptyltd 4 months ago

Crazy 🎉 good. Well done, Dan

@EntertainmentZone-jw6bq 3 months ago

You should make it always on, so that it activates when it recognizes a possible prompt or command, if that makes sense. As far as I am aware, Deepgram is the fastest STT and TTS. Maybe give it access to the content on your main display to have more context. Just some ideas.

@indydevdan 3 months ago

This is where we're heading. Just like you mentioned: always on, more commands, faster TTS and STT, lower costs.

@CYI3ERPUNK 4 months ago

subbed

@YorkyPoo_UAV 4 months ago

I think you need a beta tester that suffers with all things code and also somehow finds all the bugs. I might know the perfect person.

@seanzoso 4 months ago

Great insights. Question: is the "from modules import llm" code available anywhere?

@MichaelWoodrum 4 months ago

I've been building a multi-agent system that works with FastAPI and WebSockets for streaming for a few months. It's modular and capable of running anywhere. How could the system shown here work without using a computer for interaction? Could you adapt this to web access?

@ryanscott642 3 months ago

what do you think about using something like octopus llm for the agentic routing?

@dcmumby 4 months ago

AssemblyAI has a super fast service

@aurora.radial 4 months ago

Good to know, thanks for sharing

@bernardo4290 4 months ago

This is fucking cool

@enton9422 4 months ago

I would like the AI to have a personality, and a one-prompt installation, sir

@toromanow 1 month ago

What package to install for line 13: from modules import llm?

@lokeshart3340 4 months ago

Hello sir. I have also made an advanced personal AI assistant with hand gestures and many advanced features. I also need something like this; can you help me pls? Can we pls collaborate? 😅❤

@AGI-Bingo 4 months ago

Agentic Developers.. Assemble

@lokeshart3340 4 months ago

@@AGI-Bingo lets gooo