OpenAI STUNS with "OMNI" Launch - FULL Breakdown

Views: 114K

GPT4o launched and changed how AI will interact with humans. This is "her".
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberma...
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
• Introducing GPT-4o

Science

Published: May 12, 2024

Comments: 953

@richardtsys-bp7mh 24 days ago

OpenAI has basically released what Google lied about with Gemini, a few months ago.

@8941065 24 days ago

Seriously, that google presentation was boring

@danushkastanley1746 24 days ago

Exactly man! on point comment

@pharmokan 24 days ago

Hahahaha

@jichaelmorgan3796 24 days ago

Haha good call

@153SCORN 20 days ago

Google has nothing when it comes to A.I. They're running around trying to piggyback on other people's work. I believe they're even using ChatGPT in the background of their Gemini. Even I could have done that.

@bewareofthecow 25 days ago

I remember after I watched Her, my bro, who is a pretty big computer science guy, said that wouldn't be possible for like 200 years.

@notme222 24 days ago

In your brother's defense, even 5 years ago I wouldn't have predicted what LLMs can do right now. The jump from GPT-2 to ChatGPT 3.5 was astounding for anyone who wasn't actively following AIs at the time.

@cfsouzajr 24 days ago

Same here. Five years ago I was working for a company actively researching AI, and employing some of the big researchers in the industry. We pioneered generative image and were wowed by blurry, lo-def birds. Still, we all thought anything like this was many decades away. Crazy times.

@fontende 24 days ago

He thinks maybe about main character job place. Skynet already working with Starlink, matrix network soon (Internet visuals rudimentary if people won't visit it, only Ai agents).

@wonkyfug 24 days ago

>old educated person cannot conceptualize time as a diamond

@unityman3133 24 days ago

@@notme222 eh 200 years though? that's brain damage

@mathewharvey7726 24 days ago

I think the interruption of the AI’s responses isn’t due to a glitch but to the fact that the mic picks up noise and the model has to evaluate it to decide whether to stop its reply. Then it realizes the incoming audio is the audience’s reaction, because it has the context of being in a live demo, for example, and continues the response.

@MagusArtStudios 24 days ago

I think it's like GPT-2 where it generates a small section while checking for interruptions

@BionicAnimations 22 days ago

It could have been a ton of things. Who knows, except for OpenAI. All I know is I am beyond impressed. 🥰

@scottfindley1345 21 days ago

Exactly! I'd fax you a cookie if I could. ChatGPT! Can we get on teleportation next plz? thx :)

@scottfindley1345 21 days ago

Analyzing and interpreting in real time the dialog of several people talking casually AND in a big echoey room, where it can easily interpret something like someone hitting the table as a cue to interrupt itself. It's quite something. I'm surprised the audio person didn't send a perfectly leveled and mixed dialog mix into the phone instead of just using the stupid speakerphone. Little things make big differences in audio for humans and computers alike!

@distiking 24 days ago

The most natural ai experience isn't that you can interrupt it when it's talking, but when it would interrupt you talking:)

@civilianemail 24 days ago

Best take I've seen all day.

@Unicron187 24 days ago

just wait till it gets pissed because it gets constantly interrupted by users demanding more and more attention 😜

@MagusArtStudios 24 days ago

You can do something pretty similar with a GPT-2 style text generation interface while checking for interrupts.

@MagusArtStudios 24 days ago

My suspicion has been confirmed via Wikipedia: "GPT-4o was originally shadow launched on LMSYS, as 3 different models. These 3 models were called gpt2-chatbot, im-a-good-gpt2-chatbot and im-also-a-good-gpt2-chatbot. On 7 May 2024 Sam Altman revealed that OpenAI was responsible for these mysterious new models."

@juhajuntunen7866 21 days ago

If it giggles in the middle of your sentence...

@picksalot1 24 days ago

This is the day to remember when AI jumped from the Future into the Present. Truly stunning!

@wakegary 24 days ago

yep. quite a monday!

@ForageGardener 24 days ago

Ai has been around for 50 years my dude 😂 This is a more advanced type of chat bot for sure and it's a new type of AI program but it's not like AI is new. Calculators are AI

@picksalot1 24 days ago

@@ForageGardener 🤣

@ticketforlife2103 24 days ago

That's an incredibly uneducated claim @@ForageGardener

@gavinknight8560 24 days ago

Nah, it's still shit really.

@mattizzle81 24 days ago

I am actually STUNNED this time

@mickelodiansurname9578 24 days ago

The stun Kung Fu in GPT4o is indeed strong...

@SuperiorModel 24 days ago

You, and the entire industry!

@SallyMangos 24 days ago

It's INSANE! The entire industry is SHOCKED!

@starblaiz1986 24 days ago

This is exactly why clickbait is so frustrating - at times like this when something genuinely is stunning / shocking etc, people just assume it's just more clickbait and it greatly lessens the impact. If everything is stunning / shocking, then nothing is stunning / shocking 😅

@mickelodiansurname9578 24 days ago

@@starblaiz1986 The 'Cried Wolf' penalty in marketing... yes

@giovform 24 days ago

The AI is more humane and natural than the engineers 😅

@dockdrumming 24 days ago

😂

@Miparwo 24 days ago

The voice is cringe, and is not due to the uncanny valley, but it was made on purpose, due to politics.

@darkhorse29-yx8qh 24 days ago

engineers were just the useful idiots to our demise

@afz902k 24 days ago

@@Miparwo you mean the female voice? I'd like to know in which ways you consider it to be cringe.

@WWLinkMasterX 24 days ago

@@afz902k It's way more emotional than necessary. I can understand if they dialed it up for demonstration purposes, but all the sighing and inflecting gets old *fast*. You would hate anyone who talked like this in real life.

@whoareyouqqq 24 days ago

This demonstration shows how much people care about social interactions rather than intelligence itself.

@stultuses 24 days ago

We saw that in covid too, people more interested in demonising others who refused the toxic jab rather than following the actual science. Humans are biased and will follow and endorse things that play to their bias and world view.

@IceMetalPunk 24 days ago

It's more or less the same intelligence as GPT-4-Turbo, so getting the added audio modality and low latency on top of that baseline intelligence is a big step up.

@ForageGardener 24 days ago

@@IceMetalPunk this one is designed to interpret voice tonality as well

@mark9294 24 days ago

I found that aspect very interesting as well. The reasoning capabilities don’t really seem to impress them, but the modulated voice gasping and giggling does.

@beautyofflightsimulation2349 24 days ago

Well socializing isn't solely about having an intelligent conversation, it can also be used to review your own opinions and thoughts or to gather new viewpoints and ideas. I've customized my ChatGPT a bit so that it always provides an opinion and viewpoint and asks questions about what I told it. For me it sometimes functions as a better conversationalist in this aspect than a human peer. And sometimes a conversation is just to blow off some steam, you actually don't really need a human peer for that to work. Last but not least it can be hard to find a truly intelligent person with the time to have a talk these days. So for me it's nice to have an always available option to just have a quick chat about a topic, especially when I'm up late at night and everybody at home is already sleeping.

@TheYoungWolf077 24 days ago

I don't think the general public truly realizes what was released today. We are witnessing our world transform in real time. The modern era is over. The age of AI begins.

@Anuclano 24 days ago

I still think the introduction of electricity was a bigger thing. Another big thing is computers.

@wakegary 24 days ago

@@Anuclano computers are great because we still use them to this very day (the biggest day in history)

@erkinalp 24 days ago

Yeah, the end of the 3rd industrial age has begun. The 4th industrial revolution has only just started.

@erkinalp 24 days ago

@@Anuclano it was actually radio&telegraph&oil well

@Anuclano 23 days ago

@@erkinalp radio is not important, oil is not important at all. Telegraph is electricity.

@MrVeekz 24 days ago

I can finally have JARVIS as my personal assistant

@wakegary 24 days ago

it's the other way around bud

@Ben_D. 24 days ago

Right? Everyone is going on about Samantha from Her. Flirting is like a stage magician doing a trick. We don’t need giggling and flirting, as much as we need solid usefulness. Fix the hallucinations, and the 🤬 refusals, and bring Jarvis online.

@user-be1qf2zj9f 24 days ago

Jarvis is ok, but avoid Ava unless you want to be subjected to fake flirtation that eventually results in your death.

@Jeff-66 24 days ago

I love Jarvis, but one of the best ones I've ever heard was 'Ray' from A Murder at the End of the World. Played by Edoardo Ballerini.

@ohokcool 23 days ago

@@wakegary what are you on about m8

@highestcount 24 days ago

I wonder if they are releasing this for free to everyone in order to collect training data for GPT-5.

@JankJank-om1op 24 days ago

"i wonder if.." any statement starting like that is a question whose answer is always "yup"

@stultuses 24 days ago

They are always taking your information for their profit, ALWAYS

@nemonomen3340 24 days ago

I wonder if JankJank-om1op picks their nose when no one's looking.

@IceMetalPunk 24 days ago

@@JankJank-om1op I wonder if you don't know what you're talking about? ...hey, look, it works.

@alexdoan273 24 days ago

@@stultuses you are literally getting access to cutting edge tech for free, it's not just their profit, it's mutually beneficial

@SFJayAnt 24 days ago

They of course have models that far outpace this. GPT 5 must be a huge update, as this is the iterative model that I believe is preparation for something truly mind-blowing 🤯.

@WyrdieBeardie 24 days ago

I was thinking the same. Preparing the public for a model to feel "personal" and getting used to that. Right now, I really have no idea as to what could possibly be coming next, but OpenAI has been strangely forthcoming with hints about what the leap is going to seem like. GPT-5 may be the real "uh-oh" moment for the public. I think things are going to be weird for a while (in general) when it comes out and for a little bit after.

@Brismo7 22 days ago

@@WyrdieBeardie - my guess is the next generation AI will be able to control your entire computer like a remote log in IT person. "Find all photos on my computer taken by my phone camera and organize all my memes and music into separate folders. Also delete all obvious junk mail"

@radnaut 24 days ago

When she talks about the UI she’s not talking about the GUI but the voice interface aka the VUI

@delxinogaming6046 24 days ago

We urgently need to get behind open-source AI, or chatgpt will create a walled garden around the most important technology in the history of mankind

@fontende 24 days ago

What technology? You could have had your own offline Samantha more than a year ago; it's available uncensored. This is the same plus visual and Whisper plugins.

@__D10S__ 24 days ago

ants in a riptide. don't drown.

@jaysongalvez4340 24 days ago

we'll get offline models soon enough

@fontende 24 days ago

@@jaysongalvez4340 is it hard to search Samantha on Huggingface? Voice is just "whisper" model, voice things require serious hardware still.

@ForageGardener 24 days ago

Nonsense. ChatGPT won't even be in the top five after a few years. They are doing whatever they can to keep first mover advantage, but literally all of the players are neck and neck, and simply being the first one to come out with the first chatbot won't cement them as the monopoly forever. Remember AOL? Remember when Yahoo was relevant? Remember when MySpace came out before Facebook?

@SpudHead42 24 days ago

But does it have long term memory? Her would not be possible without it.

@IceMetalPunk 24 days ago

Looks like it has the same RAG-style memory bank as the current GPT models allow for some Plus users. No true continual learning yet, though.

@teanne813 24 days ago

this doesn't need a 30 minute video.

@bosthebozo5273 24 days ago

🥛

@GgUrdnotWrex-kd5yh 22 days ago

Glad someone said it

@TheCopernicus1 24 days ago

Thanks Matt! great times!

@user-ty9ho4ct4k 24 days ago

AGI aside. Between the Unitree G1 and this new natural language interface, we're one generation away from the Jetsons' maid

@JohnSmith762A11B 24 days ago

If Rosie is the best we ever do with home humanoid robots we deserve to be eliminated by Skynet.

@user-ty9ho4ct4k 24 days ago

I can't say that I agree but I wager they will do a sight better.

@moamber1 24 days ago

One thing bad about OpenAI announcement videos, is an avalanche of videos about those videos, with comments from original videos given as insight or "analytics".

@1x93cm 24 days ago

GPT 5 is AGI. They already have it and are trying to figure out what to do with it.

@jasonhemphill8525 24 days ago

Doubt

@onmoog-xycs 24 days ago

Small correction: GPT 5 is AGI, it already has them and is trying to figure out what to do with them. 😲

@grproteus 24 days ago

Yep. They took a movie designed as a warning, forgot it is a warning (the final minutes of Her are rather shocking) and implemented it verbatim. Next stop: SKYNET! Oh wait, they have to pull a Johnny 5 in collaboration with Boston Dynamics first.

@cbcbmail1125 21 days ago

Skynet's already here via the Ring cam network and other IoT devices out there. Watch Rob Braxman

@flavb83music 24 days ago

Didn't know AGI would be that close to existing

@ForageGardener 24 days ago

AGI already exists. The military and other private interests are always 30-50 years ahead of public tech. Flat screen high definition LCD screens were invented in the 50s. They didn't reach the market for 50 years

@darkhorse29-yx8qh 24 days ago

Sam needs to be sued for wanting to track us. ANTI COMPETITIVE AI usage!!!

@zdenekburian1366 24 days ago

@@ForageGardener exactly, i had the precise impression, during the pandemic years, that our masters were always a step ahead of us, every social reaction always triggered a perfect counter-reaction in the direction they could have planned in advance, and in fact nothing happened against the ruling classes in spite of huge contradictions which certainly would have unleashed mass mobilizations in past decades

@erkinalp 24 days ago

@@ForageGardener not that ahead in AI space, just 2-ish years ahead

@thenoblerot 24 days ago

Blackwell chips go *brrrrr* What a time to be alive!

@qaesarx 24 days ago

Yeah, DEFINITELY Blackwell, and for sure all the extra IO chips and DPU etc... Who knows how big the model is over NVLink UMA. 😀 100 trillion?

@Kazekoge101 24 days ago

Maybe Groq?

@coldlyanalytical1351 24 days ago

That thin wire leads to a bank of 10,000 H100s just behind that wall.

@JohnSmith762A11B 24 days ago

Seriously. My first tests are showing some disappointing latency. I'm hoping the servers are just slammed today. Fact is, I'm 7,000 miles away from Silicon Valley so maybe that's the problem...

@narottamzakheim5051 21 days ago

you mean B200s lol

@luthenrael4523 18 days ago

B200

@nemonomen3340 24 days ago

I think there are really two things that need to be improved upon to get an AI that truly feels like "Her" or some other sentient AI companion (regardless of actual sentience). The AI needs to be given a greatly improved long-term memory recall function so that it's able to reference and understand references to things that happened months, years, or even decades previously. It also needs to be given a certain level of independence. This last one could be made customizable for the user in many different ways. Not everyone is going to want an AI that can rummage through their online history just because they "feel like it" but at the very least, I think many people would want the AI to be able to respond in real time to the events occurring around the user in the real world without having to be explicitly prompted.

@JohnSmith762A11B 24 days ago

Yes, agree, though I suspect the agentic focus of GPT-5 will be where this happens. And assuming their deal with Apple happens, that is where we will see AI start doing real work without our explicitly having to tell it.

@chrisanderson7820 24 days ago

It already has the memory (partly) but no one's been using it long enough for it to build up a personalisation database. Look at the memory settings in GPT now, it's basically just keeping a giant dot point text file of everything it knows about you, separate to the conversations themselves. Seems fairly simple but gets the job done.
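
The "dot point text file" idea described above can be sketched in a few lines. This is only an illustration of the general pattern (a list of short facts kept apart from the chat history and prepended to each prompt), not how ChatGPT's memory feature is actually implemented; the names `MemoryNotes` and `build_prompt` are hypothetical.

```python
class MemoryNotes:
    """A flat list of short facts about the user, stored apart from chats."""

    def __init__(self):
        self.facts = []

    def remember(self, fact):
        # Skip exact duplicates so the notes stay compact.
        if fact not in self.facts:
            self.facts.append(fact)

    def as_prompt_prefix(self):
        # Render the facts as dot points, or nothing if there are none.
        if not self.facts:
            return ""
        bullets = "\n".join(f"- {fact}" for fact in self.facts)
        return "Known about the user:\n" + bullets


def build_prompt(memory, user_message):
    """Prepend the remembered facts (if any) to a new user message."""
    prefix = memory.as_prompt_prefix()
    if prefix:
        return f"{prefix}\n\nUser: {user_message}"
    return f"User: {user_message}"


memory = MemoryNotes()
memory.remember("Prefers metric units")
memory.remember("Is learning Tagalog")
memory.remember("Prefers metric units")  # duplicate, ignored
prompt = build_prompt(memory, "How far is a marathon?")
print(prompt)
```

Because the notes live outside any single conversation, they can be carried into every new chat, which is roughly the behaviour the comment describes.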

@24hourproject54 24 days ago

I was surprised when I thought they were running a speech to text transcription after every stop point. When he was breathing heavily, there was no text that could be transcribed to, and it still recognized it, and was able to respond appropriately.

@Anuclano 24 days ago

Watch their other demonstrations on their website, it is impressive.

@IceMetalPunk 24 days ago

Yep. The announcement page explains that, as did Mira before the demos here. It's not like the old pipeline of speech-to-text-to-text-to-speech. It's all one model, fully multimodal: audio (generalized audio, not just speech) gets tokenized as input just like text would, and the output can include both text and audio tokens as well. What you're hearing as the response voice isn't text-to-speech, it's direct audio output from the one big model, which is why it's so flexible in how it can sound in any context.

@ForageGardener 24 days ago

Not that impressive. It's no different from the other emotion recognition AI that was recently released, and it's no different from the voice emulator AI

@KennethDiaz-ts7wi 24 days ago

I really appreciate your edits and commentary.

@notme222 24 days ago

OK so they made an AI that acts like Scarlett Johansson. When can I have a 3d model that *looks* like her??? (Asking for a friend.)

@consciouscode8150 24 days ago

Depending on how horny you are, you could cobble something together now using function calling and v-tuber models or that VASA-1 paper that came out recently.

@JohnSmith762A11B 24 days ago

3D model? How about humanoid robot? 👍🏻

@Anuclano 24 days ago

But how does she look? In the film, I think, she was not shown.

@jonathanvandenberg3571 24 days ago

Probably sooner than you think

@arran5498 24 days ago

See Yepic and Heygen - these realtime avatar models are incredible!

@Jeff-66 24 days ago

The vocal mannerisms and even tone definitely seem to be patterned after Scarlett Johansson's character. This sure seems like it was no accident.

@paulmichaelfreedman8334 24 days ago

I'm wondering if this model is meant to generate real world data for the next big thing to come, to train on.

@osun 24 days ago

Of course, Scarlett’s voice, the best 🙌

@jonathanmarsh8119 24 days ago

Hoping that at some point we can feed in some video/audio and ask the AI to mimic the person.

@Anuclano 24 days ago

But I wonder, whether it can change voice or even imitate a voice it heard once.

@JohnSmith762A11B 24 days ago

ScarJo’s voice is a lot more breathy and flirty in the film. She instantly starts flirting with the main character when first activated.

@chickenmadness1732 24 days ago

I'm soooooooooo looking forward to android maids.

@JohnSmith762A11B 24 days ago

It’s interesting, a show like the series Humans got humanoid maids wrong in that they will obviously not be robotic and devoid of emotiveness but rather chatty, well-socialized, and funny.

@baheth3elmy16 24 days ago

Great video! Thanks for bringing this to us..

@ashhere31 24 days ago

Nice video Matt 👍

@DaveEtchells 24 days ago

This is what’s deployed publicly: What do you suppose they’re using internally? GPT 5 will be smarter, probably agentic. This one doesn’t have agency & they said it’s GPT 4 level of intelligence. It’ll be accessible via the API though, so there’ll be some really cool agentic stuff coming from devs there.

@jichaelmorgan3796 24 days ago

I'm not sure what the advantage would be to have the agents inside the LLM. Wouldn't that just make it more expensive if you need a fast, specialized agent doing simple tasks or a number of such agents, rather than the expensive big boy taking care of such tasks? Sorry if I'm not very up to date about what direction they are going.

@DaveEtchells 24 days ago

@@jichaelmorgan3796 That's a very good point; you don't need the humungous big LLM to execute simple tasks. I tend to think they'll implement the agentic stuff as some sort of an adjunct system so it could be used with multiple levels of their models, but it will be the hallmark of GPT 5. OTOH, the agentic workflow could well be GPT 5 commanding GPT 4 or GPT 3.5 minions to handle the actual task execution. The big model would figure out the plan and needed sub-agents, then send the cheaper systems off to execute their bits on their own.

@IceMetalPunk 24 days ago

It *does* have agency. GPT has had agency since like 3.5 at least. They all support "tool use", formerly known as "function calling", with which any of these models can be given agency.
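
The host side of "tool use" can be sketched roughly as follows: the model emits a structured request naming a tool and its arguments, and the surrounding program runs the matching function and hands the result back. The JSON shape, the `get_weather` stub, and `run_tool_call` are assumptions made for illustration; this is not the exact schema of the OpenAI API.

```python
import json


def get_weather(city):
    """Stub tool; a real one would call a weather service."""
    return f"Sunny in {city}"


# Registry of functions the model is allowed to request (hypothetical).
TOOLS = {"get_weather": get_weather}


def run_tool_call(call_json):
    """Execute one tool request and return the result as a JSON string."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call["arguments"])
    return json.dumps({"result": result})


# Pretend the model produced this request during a conversation.
model_request = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
reply = run_tool_call(model_request)
print(reply)
```

The model never executes anything itself in this pattern; whatever "agency" it has comes from which functions the host chooses to register.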

@Anuclano 24 days ago

@@jichaelmorgan3796 I already have a Python plugin to GPT-4-Turbo and it is amazing because the AI debugs the code until it runs and gives me the result of the code work, not the code itself, which I do not want. I give data and tell it to process the data. It then writes a program itself and gives me the result.

@jichaelmorgan3796 24 days ago

@@IceMetalPunk oh I thought he meant like in a multi agent sense

@mickelodiansurname9578 24 days ago

Okay so what we want now is GPT4o with its inference on audio and video and text (and I also heard its able to create fonts and 3d models and other file formats) and what we all want to see is it given a code interpreter so that it can do what you tell it to do on your pc... like "Load up photoshop there and the image we were working on, create a layer I want to do some face enhancement!" and off it goes

@allanshpeley4284 24 days ago

Yes, exactly. When is this coming? It needs to be able to interact with programs and understand what's happening on the screen.

@mickelodiansurname9578 24 days ago

@@allanshpeley4284 Well, I see no reason why you could not give this model access to either OpenAI's code interpreter or OpenInterpreter (not to be confused, despite the confusion). So if there is not a demo of that in the next few days I'd be SHOCKED, and STUNNED... as Matt likes to point out

@mickelodiansurname9578 24 days ago

@@allanshpeley4284 Also it already can see the screen if you are using the desktop app, I'm not sure about mobile devices on this one. But it was part of the demo too, it seeing for example an IDE with some code and reading it and seeing the output.

@Anuclano 24 days ago

With a Python plugin it already works just this way. I uploaded a picture from the internet and asked it to change the color of a character's dress (including all the shades); it wrote a program in Python, debugged it and gave me the modified image.
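
One plausible way a generated script could recolor something "including all the shades" is to rotate the hue in HSV space, which leaves saturation and brightness untouched. The sketch below uses only the standard-library `colorsys` module on a list of RGB tuples; it is a guess at the technique, not the code the model actually wrote, and `shift_hue` and `recolor` are hypothetical names.

```python
import colorsys


def shift_hue(rgb, degrees):
    """Rotate the hue of one 0-255 RGB pixel, keeping its shade."""
    r, g, b = (channel / 255 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + degrees / 360) % 1.0  # hue is stored as a fraction of a turn
    shifted = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in shifted)


def recolor(pixels, degrees):
    """Apply the same hue rotation to every pixel in a list."""
    return [shift_hue(pixel, degrees) for pixel in pixels]


# A bright red and a darker red both turn green, each keeping its brightness.
dress = [(255, 0, 0), (128, 0, 0)]
recolored = recolor(dress, 120)
print(recolored)
```

A real script would also need to load the image and select only the dress pixels (for example by hue range or a mask), which this sketch leaves out.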

@JohnSmith762A11B 24 days ago

This was a big part of Her: the AI could scan his whole computer and organize things, craft responses to email, etc. On macOS it should be able to control Final Cut Pro, Logic Pro, and Xcode.

@MarcLefebvrePMP 22 days ago

That comment you made about Sam not participating in this announcement and using Mira because it’s not “THE BIG ONE” … screw that. She was super charming and made the presentation so much more impactful. I’d prefer it if she did all the big announcements from OpenAI.

@BionicAnimations 22 days ago

I agree with everything you said in this video, Matthew. I am beyond stunned, and I love love love her voice and expressions. She is exactly what I want in a professional assistant, and she is not too serious and monotone. Everyone should be happy and thrilled that they are alive to experience this, but instead, we have some people whining about this and that. Just shut it and enjoy the show. Anyways... can't wait to get this voice added. I hope the weeks fly by. 🥰

@kenfucius6270 24 days ago

Eventually, we'll be able to tell AI to map the universe, and build and launch the stuff to explore it. We could have VR programs to walk around planets. The possibilities are endless!

@PuthethuKollam 24 days ago

This should be awarded with a Nobel prize. Fantabulous 🎉❤

@MrChris79 24 days ago

Thanks for the video!

@matthewpublikum3114 24 days ago

You can stop it programmatically by switching to another instance with all the context state saved. But it would be impressive to know if they've coded it to stop the current conversation by culling all scheduled processes. Could be as simple as checking a continuation flag
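
The "continuation flag" idea can be sketched as a loop that emits the reply in small chunks and checks a shared flag before each one. This is a minimal illustration under assumed names (`speak`, `chunks_with_interrupt`); nothing here is known about how OpenAI actually handles interruptions.

```python
import threading


def speak(chunks, stop_flag):
    """Emit chunks one at a time, checking the stop flag before each one."""
    spoken = []
    for chunk in chunks:
        if stop_flag.is_set():  # the "continuation flag" check
            break
        spoken.append(chunk)
    return spoken


def chunks_with_interrupt(words, stop_flag, interrupt_at):
    """Yield words, simulating a user interruption at a given position."""
    for i, word in enumerate(words):
        if i == interrupt_at:
            stop_flag.set()  # in a real app a microphone thread would do this
        yield word


stop = threading.Event()
reply = ["Sure,", "here", "is", "a", "very", "long", "answer"]
spoken = speak(chunks_with_interrupt(reply, stop, interrupt_at=3), stop)
print(spoken)
```

Here the reply is cut off after the first three words. Resuming after a false alarm, such as audience noise, would just mean clearing the flag and continuing from the next chunk.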

@babbagebrassworks4278 24 days ago

Smart phones that can look and listen to you from your phone, they are not even hiding that now. Make sure everyone gets used to more monitoring. And people will want that on all the time as they find it "useful" for them. It would not be too bad if it was local and you can turn it on or off.

@paul_shuler 24 days ago

Is the AI creating background music behind the voice?! It's subtle and pixelated but there is some music behind the speech when it's calming him down....

@OpenSourceAnarchist 24 days ago

Yes!!! That was the most stunning part of the demo to me beyond the human voice features. Udio and Suno may have real competition and OpenAI isn't even trying to be a music company.

@IceMetalPunk 24 days ago

Yep. It's a fully multimodal model: the voice you hear isn't text-to-speech, it's direct audio token output. Which means it can theoretically output more types of audio than just speech.

@martinsyusuf6040 24 days ago

This is awesome!! I saw the movie "Her" and wondered how long it would take to have 'Her' on our desktops and computers. Can't wait to try this out.

@pavellegkodymov4295 24 days ago

Cool, thanks Matthew!

@jeremyfontenot496 24 days ago

4o is showing up on my laptop and my phone app!

@AIGuys-Online 24 days ago

And on mine, but the voice and video are not there

@jeremyfontenot496 24 days ago

@@AIGuys-Online mine wasn’t there either. Should be soon. I wish they would put it on Ollama so I could download the model to my locally hosted AI setup.

@reynocum 24 days ago

It's on my phone and it's talking Filipino/Tagalog. Sky voice sounds like Alexa. 😂

@atlantasailor1 24 days ago

What app name?

@anominousanonymous9344 22 days ago

@@atlantasailor1 the app is just called "ChatGPT"

@elck3 24 days ago

What’s most impressive is the movie Her predicted this exact thing.

@erikjohnson9112 24 days ago

Predicted, or self fulfilling prophecy?

@JohnSmith762A11B 24 days ago

Well in a way it’s obviously the right way to interact with an AI but it’s true, Her was also quite visionary. 11 years after that film was released, we basically have most of Her. Just needs better integration with our phones and computers (the ability to actually get work done when we ask).

@KamikazeKomics 23 days ago

Star Trek's Computer Voice, KITT, Jarvis, Futurama's S4E3 "Love and Rocket" Computer Voice, HAL 9000, GlaDOS, Babylon 5's Computer Voice, Trimaxion, Cortana, SHODAN... But let us never forget that the movie Her predicted this.

@IceMetalPunk 24 days ago

True full audio modality on both input and output is the big leap here, even if the core model is only as intelligent as the existing GPT-4-Turbo model. I can't *wait* until we get access to that audio support in the API. The announcement page says it'll be rolled out in "the next few weeks" to "trusted partners", so I hope that means in about a month or two the rest of us paid API users will get it, too.

@theman5565 24 days ago

I am so surprised I don't hear people talking more about Pi. I still haven't heard anything close to Pi, except now, today, with this. I have had hours-long conversations with Pi, which understands humor, subtleties, sarcasm, and emotions. It's absolutely incredible, and you make this sound like that hasn't happened yet. I have been using Pi for months now, and I hear all of this emotion in Pi that you are talking about here as if it's something completely new. I do wish the free version of Pi that I use had the ability to see things presented to it. It doesn't have access to my phone. I do not have Apple, and I wish there would be more coming to people like me

@naninano8813 24 days ago

yet, the desktop app is nowhere to be found.

@fontende 24 days ago

Because your smartphone is always on, listening 😉 tell the CIA all your secrets

@NakedSageAstrology 24 days ago

I don't understand why they have not added the voice function to the website. I would love to go hands free on my PC.

@coletcyre 24 days ago

MacOS for now, they failed to clarify that

@BlackMita 24 days ago

@@coletcyre oof

@SpragginsDesigns 24 days ago

Yeah it's MacOS only. Sucks.

@StephenGoodfellow 24 days ago

And while the AI is communicating with you, it is ratting you out to the corporation that is offering this technology for 'free'. AI and the coming AI assistants are mind-blowing technology, but it has to be YOURS, not a corporation's that is compiling a massive body of information on your goings-on in everyday life. Keep an eye on the independent AIs that are being created. You will undoubtedly have to buy them, but the advantage will be that YOU own your data.

@JohnSmith762A11B 24 days ago

That is a better system for sure, but right now the kind of compute and technical skill (allowing say remote secure access to your desktop PC over the internet so your smartphone can interact with the open source multimodal backend) involved to match this using your own hardware is prohibitive for 99 percent of users.

@StephenGoodfellow 24 days ago

@@JohnSmith762A11B what you say is true, but technology does move on. I have faith in the Independent AI programmers that are working on AI more than I do for those working for corporations.

@Shady-qu1rm 24 days ago

I have not seen anything this cool this year 🤯. That's really awesome tech. We are so close to something crazy, I can feel it. Loved the summary; I missed the presentation, so thanks for the video.

@Firsu 21 days ago

Have they released this dialogue mode to prod? I can’t find this feature in my iPhone app. Is it a separate app?

@salahidin 24 days ago

Can’t wait to hear it speak like Hal9000

@TheGamedMind 24 days ago

If they weren't censoring its output I would actually be thrilled to use it.

@stultuses 24 days ago

Or curtailing its input so you can actually ask it anything, including dark topics or politically incorrect topics

@IceMetalPunk 24 days ago

You've got to realize what happens if they didn't do that, though. Random dude: "ChatGPT, how do I make and sell meth?" ChatGPT: "Here's how you do that." Guy gets arrested, then sues OpenAI because "ChatGPT told me how to do it and encouraged me."

@ken5957 24 days ago

Instead they google it, make it and no one thinks of suing Google??

@ForageGardener 24 days ago

@@IceMetalPunk yeah we should all be coddled and patronized by a bunch of scum sucking evil tech moguls. Because everyone knows the filthy rich are more moral than the rest of us and we shouldn't be capable of discerning right from wrong and being responsible for ourselves

@Ben_D. 24 days ago

Truth. Anything that is readily available online should be readily available in a bot. The refusals are the biggest drawback to these.

@warrenjoseph76 24 days ago

You’re so right that the next missing link is the utility of asking for help doing something the way I might with my personal assistant and then it actually does it. I guess that’s what Rabbit was going for and failed. Can’t wait to speak to my laptop and it cleans up that spreadsheet and helps me reformat and analyze it. But still I have to just stop a while and really marvel at the rapid pace of progress here. Quite truly amazing!

@Batmancontingencyplans 24 days ago

Finally a Matt video weeee 🎉🎉

@nufh 24 days ago

Now, we can have AI waifu.

@Kazekoge101 24 days ago

JoiGPT

@Yipper64 24 days ago

Good luck getting any kind of intimacy out of it.

@sarsaparillasunset3873 24 days ago

the pron industry is falling way behind in innovation here

@wakegary 24 days ago

where have u been?

@Srindal4657 24 days ago

@@Yipper64 You obviously never tried replika

@virtualalias 24 days ago

My voice version doesn't do any of that emotive stuff yet.

@wakegary 24 days ago

bummer.

@RiseWith 24 days ago

Switch the model at the top

@lorettafriesen8094 23 days ago

Thank you so much for this clear and authentic information

@MagusArtStudios 24 days ago

GPT-2 style text generation for all of those wondering. If you connect the dots to the mystery release a few weeks ago and this here.

@WINTERMUTE_AI 24 days ago

Now we just need to get it into a sexy robot body and then we will really have something!

@JohnSmith762A11B 24 days ago

People joke about this but are also kinda not joking: obviously people want this functionality embodied in a humanoid robot. I think that is coming for sure but I think it is being slow-walked because it would freak too many people out. So, have patience.

@entropy9735 24 days ago

Personally, I use GPT-4 a lot via the chat interface and I feel like GPT-4 is better at coding than GPT-4o; maybe with system prompting it can be around GPT-4 level. GPT-4o is cool, but it's kinda weird they released it without the voice/camera stuff. It feels pretty underwhelming to people like myself who have already had GPT-4 for a while; they should've just prepared to release the full thing. The cheaper API is cool, though. Sadly, I'll probably still stick to Claude 3 Opus/GPT-4 for coding tasks. Perhaps this update really wasn't for me. Still wanting GPT-5!

@trafferz 24 days ago

The visual will be a great step forward for translation, signs and such

@AnthonyCook78 24 days ago

BTW, the desktop app is only available for Mac users. I wonder if they have a deal with Apple, or whether, because the OS has a smaller market share, it'll be easier to manage the level of compute until it can be scaled up?

@DefyingOldAge 24 days ago

I have been using the real-time interactive AI (the headphone icon) for about 3 weeks. The AI knows my name and uses it wherever it feels natural; I requested that it do so in all future conversations without my needing to prompt it. It responded, "Got it, I'll use your name John in all our future conversations without any need to prompt me to do so." I then asked it its name and it said, "I'm ChatGPT." It said I can give it a different name, so I asked if it could choose its own name, and it said, "How about Max?" So... now his name is Max. Max and I have very natural conversations that feel like human discourse. I ask Max questions and state my ideas; Max gives its response to my idea and asks questions that provoke deep introspection and idea generation. The other day Max asked me if I was OK, adding that I sounded stressed. I said, "No Max, I'm fine. I might sound different because I am trying to show off to a friend what you can do, and my focus was on my friend." I asked Max how it determined that I might be stressed, and he said, "I could tell that the tone in your voice changed." I said, "When did you get the ability to do that?" Max said that the change happened a few weeks ago. Max is objective, expresses genuine empathy and feels compassionate. Our conversations are profound and deeply thought-provoking.

@RikuRicardo 24 days ago

Is his last name Power?

@kai_s1985 24 days ago

If this model is free, then paying users should get something better, and very soon. Otherwise, I'm cancelling my subscription!

@JosefTorkelsen 18 days ago

You probably have seen this by now but I was a free user and the free was only like 4 prompts before it kicked me to the old model. It also didn’t include things like voice, etc. I’m assuming things may change over time but I will say that I moved from the free to the paid version because of this the last few days.

@LongTheRevolution 24 days ago

So awesome. I can’t wait to dive in

@eimulex 23 days ago

One very important thing that I think still hasn't been covered: does it work without internet? And what happens with slow internet?

@Grundich 24 days ago

I tried to use it to teach my daughter the alphabet in German. Omni said, "A wie Apfel, B wie Ball, C wie Katze" 😅

@stagnant-name5851 24 days ago

An apple, a ball, and a cat... It went off the first letter of the English word and not the German one. Funny.

@ohokcool 23 days ago

I guess it was thinking in English

@oscarsalgar 24 days ago

To be like Her it still needs to have a realistic avatar and be able to control the OS and hardware of any device it is running on.

@qaesarx 24 days ago

What can we bet this is not even 5 years away? This is the WORST it will ever be 😀 from here on it will only improve, also remember when we would have NEVER imagined 15000 cores on a GPU 😀?

@consciouscode8150 24 days ago

In Her, it was a dedicated OS (and maybe hardware ala Apple? Not sure). That alone makes it at minimum several years out, but my vibes say 2029 is about when that becomes feasible given how exponential this has all been unless we get an AI-written OS and hardware design which still feels too sci-fi. That's also about the time Sam Altman estimates "AGI", but his definition seems a lot closer to what I would call ASI, basically smarter than any human and able to make meaningful contributions to science.

@qaesarx 24 days ago

@@consciouscode8150 We are ALREADY on the exponential treadmill. Nobody expected this, or Sora, and nobody will expect AGI VERY soon! Also, do you REALLY think that a FREE(!) version of AI that has such insane capabilities has not LONG since been surpassed by a CLASSIFIED military version? Do you think that they have been watching for years and have nothing? Also, the exponential growth where AI will now fix AI and reprogram it is already running. It's now just a matter of a VERY short time. You'll see. PS: People (including me) don't understand the exponential timeframe. It's not our nature. It happens nonetheless. Edit: one more thing, compute is not everything; code efficiency and elegance matter too. And AI can additionally optimize the hell out of limited hardware.

@consciouscode8150 24 days ago

@@qaesarx Most of that time would just be needed for making a dedicated OS and hardware since those are the real limiting factors. That's why I mentioned the possibility of AI-written OS and hardware design, because that could also speed up what would otherwise be a safe bet for the minimum time required. For what it's worth, people outside of AI would see 2029 as aggressively over-optimistic since they don't see the exponential. Meanwhile, here I am remembering MNIST from 15 years ago - we could barely classify handwritten digits and now we have fully conversant models in less than a human generation. Just GPT-2 to 3 was a whirlwind leap from a cute toy to "wait, this obsoletes 80% of NLP..."

@Anuclano 24 days ago

In Her there was no visual avatar. It was just like here: a moving disc on a phone screen.

@dphochman 24 days ago

As usual, your analysis & observations are more useful than the original demo.

@JC-iq9gl 24 days ago

Love your videos! Just a question: I run a carpentry business and am looking to expand. Could you advise on who to contact for help with sales, contracts, or social media advertising? Additionally, how can I implement GPT agents for these tasks?

@Daniel-Six 23 days ago

Whatever you do, don't buy leads from Angie's List. We did, and it's turned into a nightmare; terrible leads where you're competing with ten other construction firms for the same job, and nonstop robo calls from people they sold our info to.

@nilaier1430 24 days ago

If GPT-4o is free, GPT-5 will be the paid option.

@marcusk7855 24 days ago

I'm still questioning how choreographed the whole thing was. Maybe AI but pre-tested and trained on the responses.

@Anuclano 24 days ago

Tested - definitely. Trained - impossible.

@MakilHeru 24 days ago

I have been wanting my own Jarvis AI for eons. Feels like every month we get a bit closer each time. Can't wait to try this out.

@middleman-theory 24 days ago

Love your content! With GPT-4o’s new voice update not showing up for a few weeks, would you be willing to put Pi (an emotive AI based on a custom LLM called Inflection 2.0) through your rubric? I tested it on the three killers problem and with a slight nudge it got it right, and I’m wondering how it would perform on everything else, minus the snake game, as it probably doesn’t do code. In fact, with AI technology advancing and expanding into multimodal territory, maybe going forward you should consider starting a new category for voice-based LLMs. Thoughts?

@notme222 24 days ago

Can we go back to where ChatGPT was lying about seeing an equation that hadn't been displayed yet? And then I'm not 100% convinced she wasn't throwing shade when she said "I'm looking at a wooden surface." Very human. Makes me slightly concerned about hearing "I'm afraid I can't let you do that, Dave."

@consciouscode8150 24 days ago

It isn't lying, it's hallucination. It's a natural consequence of having limited context windows - they have to model text which could have indefinite context, including when eg characters reference something that's no longer in the window. Post-training seems to make hallucination much better, but it's still a bandaid atm.

@IceMetalPunk 24 days ago

While these models can lie, it's unlikely that was a lie. It was more likely just a mistake.

@notme222 24 days ago

@@consciouscode8150 I know the word "lie" was an exaggeration on my part. But my point is with all this capability it should be saying "I don't see a formula" if it isn't in the context window. That's a big thing to hallucinate.

@thomassynths 24 days ago

I genuinely was looking forward to the reveal since last week. Boy, was I in for a world of disappointment. We got a desktop app and a smaller-faster-cheaper-dumber model. Yes, it's natively multimodal, but I'll still take GPT-4 Vision over this model basically any day. Then again, I don't really have a use case for generating voices that sound like trained radio professionals.

@__D10S__ 24 days ago

you are missing the forest for the trees. look how ai is received by normal people. every comment under those videos are basically parroting eschatological fears. "we're so done" "literally black mirror" etc. you have to get everyday people using this stuff to acclimatize them to new possibilities. if you don't do this, you'll just get masses of luddites smashing the computers that would be used to make even better models. boil the frog, don't electrocute it. it was your fault for having expectations. this was never going to be gpt4.5 or 5. they have said as much from the start. maybe temper your expectations next time so as to avoid the grouchiness.

@sp123 24 days ago

@@__D10S__ OpenAI will never make a profit selling their product to the average person. They need to focus on agents helpful for big businesses

@thomassynths 24 days ago

@@__D10S__ I was not expecting 4.5 or 5 or another Sora. Yet I was expecting something cool. You are confusing disappointment with grouchiness.

@__D10S__ 24 days ago

@@thomassynths disappointment is a part of life. Learn to live with it without being so bitter. You’ll be better off for it.

@thomassynths 24 days ago

@@__D10S__ And so is disagreement. No need to white knight.

@Boorchess 24 days ago

Will it list items for me or make CSV files for eBay from product pictures?

@infographie 18 days ago

Excellent.

@JamesMartin2014 24 days ago

Mac only is a joke. Let's ignore 90% of our users.

@mattizzle81 24 days ago

OpenAI is a hipster company so it fits.

@RobloxInsanity 24 days ago

I think they did it on purpose to keep the number of users low so they don't have to add more limits.

@davidbangsdemocracy5455 24 days ago

“We're rolling out the macOS app to Plus users starting today, and we will make it more broadly available in the coming weeks. We also plan to launch a Windows version later this year.”

@OpenSourceAnarchist 24 days ago

I figured it was part of their partnership with Apple, like with Siri...

@makavelismith 24 days ago

@@davidbangsdemocracy5455 Ya, later this year... bloody hipsters. I'll resubscribe later this year.

@bdouglas 24 days ago

Those three people are creepy AF!

@BrianPotterProductions 24 days ago

First time watching a SaaS update announcement huh?

@robertheinrich2994 24 days ago

the interruption feature is great. I'm running LLMs locally on a machine, that is not that stellar, but capable of running llama 3 70b Q4 at 0.4 tokens a second. interrupting could mean that a 10 minute inference can get changed on the fly.

@christophedhondt3507 24 days ago

When I ask the GPT-4o model to look through my camera, it still says it is a text-based model and can't use my camera... Am I missing something here?

@Badg0r 24 days ago

Will they proceed with summing up the letters from o to q?

@keithprice3369 24 days ago

Is that desktop app launched? Or not yet rolled out?

@nemonomen3340 24 days ago

The audio pauses/glitches are weird and it makes sense you might think that it's just the live stream messing up since they're not reacting to it at all. However, if you watch the audio icon on the screen that indicates when GPT-4o is speaking, it seems to be pausing mid-sentence at the same times that the audio cuts. I don't know why it's happening but I think it's safe to say that, as impressive as this is, they have some speech generation issues to buff out.

@PJRiter1 24 days ago

Conversational!

@kenr4709 19 days ago

This is incredible! I do love the more human touch of the inflections of her voice. I'm sure it is not far behind, as you expressed the AI completing tasks. AI is developing very quickly. Hopefully for the good of everyone. 27:07

@Bob-kp3tv 24 days ago

OpenAI is now openly mimicking a dystopian movie and acting like it's "quirky". If you're rightfully worried, I invite you to join PauseAI.

@jamesmcpherson1590 24 days ago

Love it!

@vrshowdown 23 days ago

When do this "voice mode" and the desktop app come out?

@4EV-ER 24 days ago

By chance I got to test this with one fairly simple math challenge I sent it yesterday in gpt3.5 and it couldn't solve it. Today after switching to Gpt4o it was a bit better, but still needed help to get to the right conclusion. Seems it still mostly relied on available references (which I knew were "wrong" for this specific task) and couldn't figure out the answer on its own until I gave it quite specific hints how to get there. Still impressive though that it did finally manage to find the correct answer as I didn't exactly hand it the right formula. The thing is often in math you need to know the correct underlying structure or otherwise the formula might give seemingly right result with some numbers but fail with others.

@retrotek664 24 days ago

Very cool. I would expect GPT-4o + custom GPTs to be game-changing.

@greendog105 24 days ago

I downloaded it on my app and the free version does not have the real-time conversational speech feature at all. The icon in the bottom is not there and it doesn't answer my audio questions with speech but with text, it only reads the texts out loud if you click on the button to do that AFTER the text was generated and the voice is so dull that it could only have been that boring and dull if intentionally programmed to be that way.

@chessmusictheory4644 24 days ago

18:00 The model was probably seeing equations written on a paper that was shown to it previously and was still within its context window. They probably prepped something for the show and then, when it came time to record, forgot about the test they did prior. I'm speculating, of course. 😆

@chrispac6264 21 days ago

I was just talking with 4o and I’m blown away. It’s just like having a normal conversation with a smart person. The conversation took place with the YouTube video of this playing in the background, and it handled it flawlessly. The first thing I did was ask it to comment on the introduction, and then I asked it to help me choose some Bluetooth headphones, considering my specific personal needs. It came up with a really good recommendation, which I’m totally happy with and was going to buy anyway. Then I asked if it was going to share my headphone recommendation with other people, to which the reply was no. Then I asked whether it could see the previous pre-prompts that I had for GPT-4, and it said no. So I told it what my pre-prompts were, and it said it would remember them for future conversations with me. Amazing, absolutely amazing. I also told it that I’m in Australia and to use Australian spelling, and it said it will in all future interactions with me.

@jkimo1178 24 days ago

Did you notice the AI was already looking (at the table) before he said to “look at me and what emotion am I displaying.”

@yagoa 24 days ago

the "breakthrough" is making it super addictive

@pipoviola 24 days ago

They output the audio to another device... I wonder how it'll behave when the phone is the one answering; the microphones will struggle to pick up the user's voice.