It would be so damn cool if OpenAI made a two-wheeled self-balancing robot with an extendable stick; you hook the phone up to the end of the stick, and once plugged in, the phone/ChatGPT can pilot it.
Great! Now pair it with the ability to help people BUY stuff... yada yada yada, do some AI magic, allow the two users to connect with each other without revealing private info, and you have the ultimate AI market exchange. Re-apply the same logic to other things, like offering or finding a job.
How is it done? They use a tablet with split screen. On one side there is the ChatGPT app, on the other a paint app. How can you activate ChatGPT to see the picture? Or was the picture uploaded beforehand? Thx, Markus
I'm genuinely starting to get the impression this whole thing is fake. If it was as real as the demos show, you would just release it. If the demos are real, it's CLEARLY READY. SOMETHING DOESN'T ADD UP.
They probably have some kind of problem there. Even ChatGPT is buggy all the time; you have to refresh the page several times. They probably miscalculated the compute they'd need and can't withstand releasing GPT-4o to free users.
As an eager user awaiting the new voice and vision features of ChatGPT 4o, if access isn't granted soon, I will be compelled to transition to using Gemini AI.
So cringe how you’re clearly using Scarlett Johansson’s voice after she explicitly said you’re not allowed. And don’t say it’s not supposed to sound like her.
I'm not entitled like the rest of the commenters, but can we please get some info on the new voice model? I'm sure everyone is working hard to bring it to us ❤
Thank you for the video. I am currently living in Osaka, Japan, and I am very interested in instant translation with AI models. However, what I understand by "instant translation" is not: "I say a sentence; the model translates it after a few seconds and I hear it. I say another sentence; the model translates it after a few seconds and I hear it..." What I understand by instant translation is: "You are talking in Japanese and, while you are still talking (with a delay of a few seconds), I hear your speech in Spanish, no matter how long the speech is. Maybe the Japanese speech is 10 minutes long, and I can begin hearing it in Spanish after 5 seconds, and it will end 5 seconds after the Japanese does." Basically, it is like having an interpreter by your side who doesn't have to wait until the end of the speech to begin translating. That way, the conversation becomes more fluid. I know this is not an easy task, as there are SOV and SVO languages. However, I think the SeamlessM4T model is able to take this into account as well. Do you think it is possible to implement such a thing with this model?
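The simultaneous interpretation this commenter describes can be pictured as a streaming pipeline: translate each incoming chunk as soon as it arrives, instead of waiting for the whole speech to end. Here is a toy sketch of that idea; it is NOT SeamlessM4T's real API, and `simultaneous_translate`, `toy_translate`, and the tiny word dictionary are all hypothetical stand-ins for an actual speech-translation model.

```python
from typing import Callable, Iterator

def simultaneous_translate(
    chunks: Iterator[str],
    translate_chunk: Callable[[str], str],
) -> Iterator[str]:
    """Emit a translation for each chunk as soon as it arrives,
    rather than buffering the entire speech first."""
    for chunk in chunks:
        yield translate_chunk(chunk)

# Toy word-level "model" standing in for a real system like SeamlessM4T.
toy_dictionary = {"konnichiwa": "hola", "arigatou": "gracias"}

def toy_translate(chunk: str) -> str:
    return " ".join(toy_dictionary.get(w, w) for w in chunk.split())

# Each chunk is translated without waiting for the speech to finish.
speech = iter(["konnichiwa", "arigatou"])
outputs = list(simultaneous_translate(speech, toy_translate))
print(outputs)  # → ['hola', 'gracias']
```

In practice a real simultaneous system is harder than this sketch: for SOV-to-SVO language pairs the model must buffer enough context to reorder words, which is exactly the latency/quality trade-off the streaming variants of SeamlessM4T are designed to manage.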