Voice AI vs OpenAI Realtime API | SaaS Killer?

Подписаться 15 тыс.

Просмотров 5 тыс.

50% 1

In this video, I dive deep into the impact of OpenAI's new Realtime API on existing voice AI platforms like VAPI, Bland, and others. Many have asked whether this new release will replace these platforms, but I believe it will actually enhance them. I'll walk you through how the Realtime API improves latency, emotion detection, and speech-to-speech interactions, solving many issues we faced with voice orchestration layers.
Using an AI voice orchestration demo, I'll explain the pros and cons of both systems and why platforms like VAPI will thrive by integrating the Realtime API. Whether you're a developer or just interested in the future of voice AI, this breakdown will help you understand how these advancements will shape the SaaS landscape.
My resource hub:
hub.integratic...
Work with us 👋🏼
integraticus.com/
My Links 🔗
👉🏻 Subscribe: / @jannismoore
👉🏻 Instagram: / jannismoore
👉🏻 LinkedIn: / jannismoore
👉🏻 More ways to reach me: integraticus.c...
#VoiceAI #RealtimeAPI #SaaSKiller #OpenAI #VAPI

Опубликовано:

11 окт 2024

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 42

@tc7841 22 часа назад

Two other major limitations: 1. extreme costs. About a dollar per minute 2. Rate limitis. For most people, whose organisation is within tier 1, they only have a few minutes of talking before you exceed the rate limit...

@jannismoore 11 часов назад

That's all temporary, though :)

@entranodigital 4 дня назад

Amazing like always. Perfect Deep Dive!

@TrejonEdmonds 5 дней назад

Great overview of the huge boost that Vapi and other providers are getting with these updates. Thanks for the insightful share and your perspective on this opportunity.

@jannismoore 4 дня назад

Thanks, Trejon! Appreciate the feedback.

@jeelanshahtlyr6076 3 дня назад

Jannis is the ONLY way to go when it comes to Voice Callers and Automation

@slayga 2 дня назад

Amazing explanation Jannis! I have way too many projects on my plate right now but watching this is making me want to add vapi integrations too haha...

@aiamfree 4 дня назад

Sam Altman told everyone, do not build companies around the features of GPT or you will be made obsolete, literally said it… You want to use the power of GPT to actually offer something novel

@gbaked 4 дня назад

True that. Altman advised companies to innovate beyond basic integrations and focus on building unique, defensible products with long-term value rather than simply relying on the immediate functionality of existing models.

@dazdazfzf День назад

Looks like facebook era. Also A way to discourage competitors imo. Do average pro know de how to deploy any gpt app ?

@therealJonathanDC 3 дня назад

Great explanation! Do you provide the Miro board links to these diagrams as Zooming in on the images your Hub gets blurry and cannot read.

@BrockMesarich 5 дней назад

Banger after banger Jannis!

@jannismoore 5 дней назад

Thanks, Brock! Looking forward for your first custom-coded Realtime API app :)

@mussen1876 4 дня назад

Thanks for sharing your vast knowledge Jannis.

@SaminYasar_ 4 дня назад

My man never misses how are you pumping so hard man

@jannismoore 4 дня назад

I’m surprised myself

@juanschubert6164 4 дня назад

Your video is very cool and informative. The only thing I cannot confirm is the thing with empathy. I’m working since over nine years in psychology and we are about to give our best to implicate this empathy and we are 100% sure that AI has the ability to trigger and create emotions. ❤

@jannismoore 11 часов назад

Yep, empathy is still wacky, but it's getting a lot better. The fact that it can actually tell Jokes without simply talking monotone is already quite impressive. I believe it's not too long anymore until we have our first empathetic conversations. :) Very exciting times!

@autoreai Час назад

Thanks for the video. It makes a lot of sense.

@gabrielgiraldo206 33 минуты назад

Excellent explanation! Really informative. I would like to know if I can deploy my own services using the schema shown at 1:27. I want to deploy my own solution, but I would appreciate it if there is any material available, preferably the solution I want to deploy should work good in Spanish. I really appreciate any help or guidance you can provide.

@aiplaygrounds 4 дня назад

Alll over it. Great video

@HugoPodw 4 дня назад

Great video! (Cooking up something similar and this helped a lot with research)

@jannismoore 4 дня назад

Thanks, Hugo! Keep up your great content too!

@MehulRoxbusiness 5 дней назад

Great video!

@tijendersingh5363 3 дня назад

Is it cheaper on Vapi compared to open ai.. why Vapi is better then directly using it from OpenAI

@jannismoore 2 дня назад

It's not really about the price, but the utility - that's precisely what I explain in this video

@jamesballantyne9214 3 дня назад

Hume AI has had voice to voice demo available for a while yet it actually seems to fail on some of the issues you’re proposing it will fix, eg returns inappropriate tone, interrupts you more often assuming you’ve finished speaking, sped not noticeably different. Agree on its potential but yet to see results.

@jannismoore 2 дня назад

Hume isn't a native speech to speech from what I've seen so far. It still goes the same orchestration route, just with emotional recognition. I like their approach, but it feels still very clunky

@TashiDorjeLinas 4 дня назад

It’s $15+ hour for realtime API output. Too expensive to use in a product except for wealthy clients. Nice video, thank you.

@jannismoore 4 дня назад

You'll see those prices drop faster than the WiFi signal when you step one foot outside your house

@entranodigital 4 дня назад

@@jannismoore 🤣🤣

@MrDonald911 4 дня назад

AI engineer here, please don't pretend to know what's the architecture of the Realtime API. Even though OpenAI claims it's a speech to speech model, they could be doing what Vapi and the rest are doing, but just in a very optimised way, with AI models that are smaller in size and fine tuned for this specific task. In general all models provided by OpenAI are very opaque and nobody knows how they really work except the engineers who made them. Also, you could have also talked about data concerns. Many companies will not want their data to transit through OpenAI's servers so alternatives to the Realtime API who let you use your own LLM (like RetellAI) have an edge ;)

@jannismoore 4 дня назад

I doubt that. Yes, it’s true that end-to-end models mostly perform worse than cascade models/frameworks like the current orchestration layers, but if you’ve followed the current developments, there are many concepts being introduced that definitely have potential to achieve an actual audio end-to-end. But yes, OpenAI could potentially also use a cascade model, but I believe doing empathy recognition and some others of their introduced features is a lot harder. There are cool concepts like this one, which may have potential, but in the end we can only guess as of now what’s behind their setup: arxiv.org/abs/2405.17809 Given their claim that their model handles empathy and accents incredibly well, I’m still convinced that they use something else than regular text processing, even when using a cascade model. What are your guesses?

@aminbusiness3139 День назад

The gap between Open-AI’s voice model versus open-source + existing providers is just too massive … Open-AI voice outperforms other models in every measurable metric and the API is accessible with only a few lines of code If an enterprise isn’t comfortable using it for whatever security concerns they have , I’m afraid they’ll just have miss out then lol 🤷‍♀️ That’s what Startups are for anyways .

@greendsnow 4 дня назад

I don't like Vapi and realtime api pricing is abusive. Can you show us how to do: Twilio input > Webhook > Deepgram STT > Memory + AI > Deepgram TTS > Twilio output

@cscrowley1 4 дня назад

Why would you build with yesterday's stack? I agree about pricing, but Google and others will be snapping at their heels (as evidenced by NotebookLM). I doubt they can keep up their current pricing for long. My guess is that it is more of a provisioning thing. They don't want (are not ready for) mass adoption just yet. I still plan on building with it, but realize it will not be so attractive till the price drops. If you really want to to use yesterday's stack, have you checked out Groq? Their super fast inference makes up for a lot.

@marcc0183 4 дня назад

but if you want it in production you need to have a dedicated server that uses websocket clients to connect with the deepgram websocket etc... that's when vapi comes in

@jannismoore 4 дня назад

Vocode might be the right solution for you then. I don't recommend starting your own structure though, except if you have the resources to throw a dedicated team on it.

@jannismoore 4 дня назад

Many people underestimate specifically that part A LOT

@jannismoore 4 дня назад

Prices will definitely drop, and I don't think it'll take long either