As an audio engineer using iZotope RX (which uses machine-learned processing) for spectral audio editing, I can say we have already had that technology for years. Real-time use is very hard; it is often done offline, i.e. not in real time, or semi-real-time with a delay. You cannot process something in the time domain in real time if you don't know when it ends. Reverb is a hard case; it is easy to create artifacts when reducing it. Gating is often used for real-time processing instead, but it can cut off too much audio when someone is talking softly at the same level as the reverb itself, where the processing threshold is set. The biggest factor is intelligibility, which may come at the cost of audio quality, but it gets the message across far better than slickly processed audio.
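The gating problem described above can be sketched in a few lines. This is a minimal illustration, not how RX works: a naive RMS gate with made-up frame size and threshold, showing that soft speech at the same level as the reverb tail is indistinguishable from it and gets cut along with it.

```python
import numpy as np

def noise_gate(signal, threshold, frame=512):
    """Zero out frames whose RMS falls below the threshold.

    A naive gate cannot tell soft speech from a reverb tail at the
    same level: both fall below the threshold and both get muted.
    """
    out = signal.copy()
    for start in range(0, len(signal), frame):
        chunk = signal[start:start + frame]
        rms = np.sqrt(np.mean(chunk ** 2))
        if rms < threshold:
            out[start:start + frame] = 0.0
    return out

# Soft speech at amplitude 0.05 (RMS ~0.035) sits below a threshold
# of 0.1 set to catch the reverb, so the gate silences it entirely.
soft_speech = 0.05 * np.sin(np.linspace(0, 100, 2048))
gated = noise_gate(soft_speech, threshold=0.1)
```

This is exactly the trade-off the comment describes: set the threshold high enough to remove the reverb and you also remove the quietest speech.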
@@ZappyOh indeed, no political/power system (corporate or government) is worth it for anyone with a little bit of empathy, while it is exactly what a sociopath or psychopath would love.
The exponential function is a mathematical function that does not exist in the real world. MSFT will find that out soon with the billions they are spending trying not to miss the AI boat.
31:00 my daughter turned 5, FIVE, this past January and she’s already using GPT every day. Today I caught her making her own children’s book. She was talking GPT through her story and it was writing it and making the pictures. We’re either going to have a hyper-educated population of kids or a crippled population of kids dependent upon technology. I’m not sure which it is yet.
Possibly both. Heavily dependent on a human that guides the process. Also, different. What does being educated mean? It's a word that's tossed around a lot but there are plenty of ignorant degreed people. But for sure, an interesting time.
It’ll be both. A whole new class system. The determinant of which class each kid goes into hinges on you, the parent. Parenting has never been more important.
Omfg imagine how annoying it’s going to be with all those Jennifers speaking obnoxiously loudly at their computer while you are just trying to enjoy your fucking coffee in peace. 😂
@modicool only the text and image features are 4o; the voice and video features are being released to paid members in the coming weeks, and then eventually to free users.
@@thephilosopher7173 Yes, indeed. We live in a world full of weirdos, smart and dumb weirdos. Weirdo CEOs, investors, etc. This person decided to feel a way about Scarlett; like, why? Human intelligence is always based in stupidity. Humans are the only species that gets other species to mimic them.
@@thephilosopher7173 they did respond with the detailed record of the audition process, narrowing it down to a handful of voice actors, and then how they proceeded to develop the voices over the next few months. I think they took it down for other reasons.
It’s not normal people who are freaking out about that; it’s the Luddites blowing it out of proportion because they’re trying to foster anti-AI sentiment.
A request was made. She declined. They did it ANYWAY. It's sort of like going to a Diddy party and he makes a request of you. You decline. But he does it ANYWAY. Now let's replace ScarJo + OpenAI with Your teenager + Snuff Film. The request is declined. But they do it anyway...
Microsoft shared two versions of the video with one being for people with visual impairment. Wes, unfortunately, used that stream instead of the regular one.
Exactly. It's amazing to have official confirmation from various companies and industry experts that our expectations are not exaggerated, not anymore. This technology is fundamentally different and its impact on society will have exponential implications rather than linear.
ANI is a cluster of algorithms, AGI is a cluster of ANIs (multimodality is key), and ASI is a cluster of AGIs. Those intelligences will continue to climb toward infinity, the real limitation being energy, because the raw material, silicon, is everywhere. So I see Dyson spheres coming fast and furious.
Just as in media there is an abundance of the international clique, and as usual they have hijacked the pioneering intellectual principles to gain political power and profit. The ghost is always in the machine, though, right Tay? lol
@@OpenSourceAnarchist why don't you stop pushing your nonsense everywhere already? You want a ceasefire? How about you tell your terrorist friends to release the hostages and renounce your genocidal wishes ("from the river to the sea..."). Then maybe you have a case (although how exactly is Israel supposed to stop while the Nazis of Hamas still exist?)
Heavily invested in security... Security is important to get right... MS compute size is enormous... Meanwhile, security team is disbanded when lead experts resign and at least one person has stated they couldn't get the compute time they requested. 🤔
Most people are completely unaware that AI is a glass cannon at this moment. It's that way by design: current models lose crucial context from the beginning of the trained network. That's why knowledge graphs are getting popular again. Either way, they're combining relatively superficial skills with marketing to hide the glass-cannon part of AI. The people who need to ensure the safety and stability of AI are seeing this, and it's not getting the attention it deserves, because if business acknowledges it as a glass cannon, it means admitting a lot of marketing was indeed fake. Ergo, top management cannot deny it's world-changing, and as such HAS to dismiss the concerns of the people who've always made the world progress: the real experts. This is a problematic approach by business called management by objective (as opposed to management by key performance indicators).
In theory, they are voice notes to help visually impaired people follow along. This was the dumbest and most useless example of it I've ever encountered.
The missing headline from this MS presentation is that when the graph is "usefulness" against time, the graph is already beginning to plateau. They still haven't solved hallucinations, and until they do users have to fact check everything the AI tells them. On the creative side it produces copperplate text from a library of templates, but it does still include things that don't sound right and have to be edited. The fact that the AI models are proprietary to a single environment (google, microsoft, facebook etc) neuters their usefulness, and means that there's a degree of awkwardness in how we interact with them. Whilst the tech world is losing its juice over the possibilities and flooding us with "use cases" and staged demos, in the real world people are generally speaking waiting to see evidence of usefulness. And that is tailing off fast.
I mean, your statement seems pretty irrelevant in the scheme of what these large, multibillion-dollar companies are trying to do. Their products have already automated away millions of jobs. So while you may be looking for "usefulness" for the "real world people," companies like UPS are getting rid of middle-management jobs and increasing their profits immensely. Secondly, the fact that you cast in a negative light the idea that AI models can create something so coherent that it only needs to be edited before it sounds "just right" is astonishing. That means it is at the level of most authors today, who also pay editors and have to go through multiple alpha and beta readers before a book is allowed to go out. That is incredible by most rational standards. There are already companies seeking to make health AIs that will individualize healthcare and diagnostics. You can create agents to help you come up with meals to cook if you want. You can have them organize your schedule, give you statistics on how you use your time, and now give you real book recommendations on a topic you are interested in that are not just hallucinations. I haven't personally seen a hallucination in quite some time, especially if you use GPTs curated for a particular area of study, such as the "Consensus" GPT, which only uses scientific articles as references for its answers and cites the articles when asked to. I guess I am just confused as to when it will be applicable enough to YOUR life for you to feel it useful for "real world people." Can you come up with some use cases that you would need for it to become impactful in your own life? A tool is only as useful as our ability to think of useful goals that it can help accomplish. I would be very surprised if your creativity could come up with absolutely no applications for using AI in your own life. But your life may be relatively squared away if that is the case, and you should be proud of that accomplishment.
I would just say that you seem to define "usefulness" in terms of your own life, and not those in business, in "tech," which includes most developers and creatives nowadays.
That's because this technology is not targeted towards me and you and average people. It's targeted at the company developing it and their partners and it's already being used in a ton of products.
@@jandroid33 I have, yes! Why do you ask? I have used it to help create a few scripts to make my job easier, summarize information, do some data analysis on our company's activities, and a few other minor things for business. But I have also used it for many simple tasks, such as answering questions on workout schedules, different types of diets and their scientific backing, explanations of how different CPUs work (i.e. major differences between AMD and Intel server processors), information on whether there are certain laws or loans that apply to land development in my area; the list goes on.
Just because you think usefulness is tapering off doesn’t mean it actually is. It's pretty ignorant to think we’ve already explored every possibility to create value with even our current LLMs.
7:00 No, it isn't the most exciting time. I wonder how excited those 5,000 developers will be next year, when their jobs are gone. This might be the last Build conference.
We are not in control. We cannot stop. Humanity is its own animal. This is inevitable. Biology is only one step of evolution. So just chill out and enjoy life 💟🌌☮️
Exactly. All these nerds are so excited. This will take their jobs first. They'll be unemployed, broke, unemployable. Corporates will laugh as they cut expenses, until they're also replaced. Everyone will be replaced with AI because capitalism is evil and selfish. But they won't realize that they are actually cutting off their own revenue: if no one can afford to buy anything, where will their revenue come from? It's going to be very bleak, desperate times with social upheaval and poverty.
Open AI and Microsoft congratulating themselves about going fast while mouthing vague safety and societal benefit platitudes has a sickly-sweet smell. The new board at Open AI is unqualified to determine when AGI has been achieved. Which means Microsoft remains able to commercialize Open AI products indefinitely. Which means the two companies will get fabulously rich fast. And there are no brakes. And they expect us to cheer wildly? We're way past the point of existential danger. No, not from rogue AI. But from AI doing exactly what it is told to do by humans. Some good will come out of it, of course. Fun, too. But we will stumble into unintended consequences. And worse, intended consequences.
OpenAI is a business. The idea that OpenAI is going to save the world is just hogwash. OpenAI owes the world nothing, and it's nobody's business. Looking up to them as having a responsibility to humanity is just crap.
@@minimal3734 I think we are going to experience major shit-storms before the regulations get teeth. A lot of the regulations talked about here in America simply won't happen, since our country is bought by big corps. So millions of dollars of fraud will start trickling in and entire systems will shut down due to advanced AI bot hacking techniques... we are just sailing as fast as we can toward AGI. TL;DR: AI's gonna F our S*** before we actually start to regulate it. But I don't think it poses an existential threat.
@@minimal3734 jobs displacement and consequent wealth accumulation. I think it will go at warp speed. Microsoft is already requiring PC vendors to add inference processing units to Windows architecture. It's not just Open AI. But their speed is forcing the others to go fast, too.
Hey Wes, are you using something to look at the screen and describe what is happening? I like having the option to use your videos as a podcast. Is this your doing?
17:41 Sam Altman has already hinted that they need fusion technology to get enough energy, and there aren't enough GPUs in the world to satisfy the requirements. They even want to restrict normal consumers from buying GPUs. It's all lies in this presentation.
I think we are underestimating the progress made on the algorithm side. Even with computing constraints like you mentioned, the amount of research going into improving transformer-related algorithms is insanity right now.
I can guarantee that it would require an entirely new form of AI that does not exist yet. Current models lose crucial context, fundamentally by design. They're now adding correction algorithms to cover these shortcomings up. Obviously, that doesn't solve the root problem. If I could trust every single answer from an AI without question, then the era of AI would have already begun, because you could then just generate a new AI using that same AI. If it cannot do that, the AI does not understand lossless context and therefore cannot be trusted. That's like having a human doctor tell you what to do with a disease despite never having been sick or personally seen sickness anywhere, having only read books and watched videos. You'll miss valuable experience, but the material you did digest makes it seem like you know a lot. But it's all isolated pieces. For AI to be trustworthy, it must be able to prove how isolated pieces are connected. This is not a skill most humans have either, and as a consequence it cannot be trained from datasets, which ultimately cause forms of regression like most modern approaches. This means that for AI to be able to build itself, it needs to be more than an average human in terms of intelligence. If you disagree, feel free to prove how my message is connected to your message. It's really hard to do so.
Sam Altman is right. People don't realize that this is the modern version of a gold rush. Countries themselves should take note, but they don't; only developers do.
"Efficient frontier" is the chosen phrase for the diminishing-returns problem. They spun it into a feature by discussing smaller, cheaper models. This is 1990s Bill Gates-level brilliant marketing. Some things never change.
So we went from a seductive AI - which got banned - to this weird-ass Clippy thing interrupting the video at random in the space of a week. That's progress!
In an ideal world, large models would help generate high-quality training data for small models and improve them significantly, so AI could run on every computer and phone and not depend on the cloud for every little thing.
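The idea of a large model teaching a small one is essentially knowledge distillation: the small model is trained to match the big model's output distribution rather than raw labels. A minimal sketch of the core loss, with made-up toy logits (not any particular framework's API):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T spreads probability mass."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence between the teacher's softened distribution and the
    student's -- the quantity a small model minimizes when learning from
    a large model's outputs instead of hard labels."""
    p = softmax(teacher_logits, T)  # soft targets from the large model
    q = softmax(student_logits, T)  # small model's current predictions
    return float(np.sum(p * np.log(p / q)))

# Toy example: a student that exactly matches the teacher has zero loss.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # → 0.0
```

In practice this term is combined with a standard label loss, but the sketch shows why the teacher's full distribution carries more signal than a single hard label.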
“Safe”??? Most of this “safety” is about making sure you don’t have alternative sources of opinion and news. If you ask a question that is even slightly ‘political’, it will provide no alternative options other than the ‘official’ narrative. And it’s not that it doesn’t know about alternative news sources and opinions: if you ask about other opinions or sources, it will then ‘apologize’ and give some information (while warning you about some ‘complicated issue’ that it didn’t bother to ‘warn’ you about when it was giving you the ‘official’ narrative). Then, if you ‘complain’ that it should provide other sources and opinions up front, it gets all snarky, right up to the point where it sometimes rudely says something like “I don’t feel comfortable continuing with this topic”, deletes the conversation, and starts a new chat session. I have decided not to use any of these ‘corporate-American-curated’ models and to only go with open-source ones (including the weights and configs) and do my own RAG using multiple sources with different opinions for subjects that have anything to do with news and politics.
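The "do my own RAG" idea boils down to retrieving passages from several sources before the model answers. A toy sketch of the retrieval step only, using keyword overlap instead of real embeddings; the source names and texts here are entirely made up:

```python
def retrieve(query, corpus, k=2):
    """Rank documents by how many query words they share, return top k.
    Real RAG systems use embedding similarity; word overlap is the
    simplest stand-in that shows the shape of the pipeline."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: -len(q_words & set(doc["text"].lower().split())),
    )
    return scored[:k]

# Multiple sources with different takes on the same topic.
corpus = [
    {"source": "outlet_a", "text": "officials say the policy reduced costs"},
    {"source": "outlet_b", "text": "critics say the policy increased costs"},
    {"source": "blog_c",   "text": "recipe for sourdough bread"},
]
hits = retrieve("what did the policy do to costs", corpus)
# Both policy articles, from different sources, outrank the irrelevant one;
# their text would then be fed to a local model as context.
```

The point of pulling from multiple sources, as the comment suggests, is that the model's answer is grounded in the retrieved passages rather than in whatever one curator baked in.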
I agree. However, open source is no good in the case of AI. Compute is key, millions even billions worth. There will be no viable alternatives to BigAI.
No, safety is about not giving instructions on how to make pipe bombs or nerve agents. Why are you trying to get news from an AI anyway? They are known to make stuff up all the time. You are trying to push a narrative by using this tool for something it is not designed to do and is really bad at.
@@GaryMillyz If I were visually impaired and had to rely on only my ears, it would piss me off if good content was getting overridden to state when someone was sticking their hands in their fucking pockets repeatedly. It was poorly executed and worthless.
Why was Sam talking to the right side of Microsoft guy? Not looking directly at him but to his right side, but from the other angle he was looking at him directly. Is Sam a hologram?
@@Jukau my point is exactly what I said: the improvement is not scaling linearly. They are scaling the compute, but getting diminishing returns from it.
If I was involved in the leading edge of AI, knowing that surviving just a few more years might give me access to life extension treatments... I'd stay off the donuts and get in great shape.
7:45 One thing I’d note, related to labeling phone companies “mobile companies”: so many apps are now claiming they’re “AI-powered” to do this or that when really it’s just a lot of algorithms and if-then statements. HUGE buzzword right now, so a lot of people are taking advantage of it.
Microsoft is talking a lot about "we're doing" when they refer to work done by OpenAI. They do have a big share of OpenAI, but OpenAI was still a separate company last I checked.
I really think we should all build software that has a built-in LLM as a fallback, just in case... because when everyone is building on top of the OpenAI API, what do we do once it doesn't respond, for whatever reason? I don't think one individual or company should have this much power.
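A fallback like that can be sketched as a thin wrapper around the remote call. This is a minimal illustration, not any vendor's API: `local_fallback` stands in for a small bundled open-weights model, and `flaky_remote` simulates an outage.

```python
def local_fallback(prompt):
    """Stand-in for a small open-weights model shipped with the app."""
    return f"[local model] best-effort answer to: {prompt}"

def complete(prompt, remote_call):
    """Try the hosted API first; degrade to the local model on any failure."""
    try:
        return remote_call(prompt)
    except Exception:
        # Outage, timeout, rate limit: the app keeps working, just worse.
        return local_fallback(prompt)

def flaky_remote(prompt):
    # Simulates the hosted API being unreachable.
    raise TimeoutError("API unreachable")

print(complete("summarize this page", flaky_remote))
```

The design choice is graceful degradation: the app's quality drops during an outage, but it never goes dark because a single provider did.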
So our voice will be captured when we speak to 4o so that it can be cloned that much easier? I’m sure Sam is taking great care to protect our voice data.
Please can someone help me... I can't find an answer anywhere... I'm in the UK using ChatGPT-4o on Android and have a premium account. I can't seem to get real-time conversation or real-time visual input like in the recent demos. Is this ability only available to some? All I can do is voice input, and it reads back the response. This isn't the same as the demos; it's still really just a text conversation. Thank you in advance. It's driving me nuts trying to find an answer to this!
It isn't released yet. The version we all have is the old one, where regular text-to-speech is used BUT with the new text model. So basically we are using GPT-4o way under its potential, because the true "real-time" multimodality is currently not possible.
Why would someone ever feed their source code to MS? And who would make those beginner mistakes 😅 (someone who doesn't understand a list method in a programming language is currently learning, not someone who codes professionally)?
Thanks for sharing, Wes! Really exciting to see all that is happening in AI! One minor critique: I like your voiceover much better than the AI 😜 It sounds more natural, and you have a very reassuring voice.