Not that much faster than was actually predicted, though, by say Kurzweil (The Singularity Is Near, from 2005). It's basically on track, even though people mocked those predictions for the past 15 years.
Remember: there are still people who think AGI is 10 years away. At this point, with the insane acceleration just over the last 12 months, we will have AGI or some sort of pre-AGI before the end of 2024.
I don't think you know what AGI is. You have to develop the AI's ability to make logical inferences to get AGI. You can't just rely on brute force and processing power for everything. All you have at the moment is a program that simulates dumb stuff it cribbed off the internet.
Not without significant breakthroughs in actually "understanding", not simply appearing to understand. Example: basic yet unseen physics/math questions.
I will pledge 100,000,000,000 dollars to you personally, paid in full on Jan 1st, 2025, if you are correct. And that means you are actually correct, not that these companies lying about their progress are dropping hints that could be misinterpreted as this being true. Cause that is what they do. What they call AI is ML; what they hint at being AGI/ASI is AI. But ML, impressive as it is, is not intelligent, not even close: it does not reason, it does not decide, it does not self-correct or stop saying wrong things on its own. Is it still impressive? Yes, very much so. Is it intelligent? No, not even close. Maybe they have an AI in their internal labs, but it is not an AGI, just what they have pretended this ML has been all this time.
@@hglbrg one correction - it is certainly intelligent, at least in the way that we currently define intelligence (applying information to solve problems). you know cats, dogs and mycelia all have varying levels of intelligence, right? computers do as well. but reasoning is different from intelligence. i'd argue current chatbots do not reason, they mimic reasoning
love the idea of a voice computer interface, I burn through my hands clicking all day. a voice interface, and maybe the ability to set up macros, would be amazing. Pika is looking great too 👍
That was my first thought - it needs a voice interface. A voice interface would complicate setup, but that's where things are going. BTW OpenAI Whisper is great and open source I believe.
I attempted to do something very similar to the self-operating computer using no-code software (I’m an RPA dev). The plan was to pass computer screenshots to GPT-4V and ask it for screen coordinates for wherever it was supposed to click next, then pass those coordinates to an RPA software that would perform the action. What I found is that GPT-4 Vision does not easily have the ability to identify locations on an image. It lacks spatial reference. So while it can ID that there is a "sign in" button in the image, a lot of that is linked to its trained understanding of where things *should be*, not where they actually are. For example, you can go to GPT-4 right now and ask it "where is YouTube's login button located" and it knows it should be in the top right corner, because that's typically where those buttons are. It was a good learning experience, and once they solve for actual location data for different aspects of an image, it's going to be wild what this can do.
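For anyone curious, the request side of that pipeline is simple to sketch. This just builds the GPT-4V chat payload with a base64-encoded screenshot; the model name and prompt wording are illustrative, not my exact setup:

```python
import base64

def vision_request(screenshot_bytes, objective):
    """Build a GPT-4V chat request (sketch) asking for click coordinates.

    screenshot_bytes: raw PNG bytes of the current screen
    objective: what the agent is trying to do next
    """
    b64 = base64.b64encode(screenshot_bytes).decode()
    return {
        "model": "gpt-4-vision-preview",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Objective: {objective}. "
                         "Reply with the x,y pixel to click."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
```

You'd pass the dict to the OpenAI chat completions endpoint and parse the coordinates out of the reply; as I said above, it's that parsing step where the model's weak spatial grounding shows up.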
that seems like a minor hurdle that could be tackled in less than a year by a number of companies. I don't see why it couldn't go through some sort of initial calibration phase: perhaps you start at a specific URL that has a special coordinate image designed for the app, then it attempts several mouse movements with several screenshots to get some sense of velocity and trajectory, then stores that information and uses it for all future work. And, if that's too much, then just take several screenshots to ensure it's eventually clicking on the correct thing. Or, go super old school and have it work entirely off of keyboard shortcuts.
@@AlexanderMoen I tried a couple of techniques, including a sample image that broke the screen up into segments (kinda like playing battleship) and asked it to use that as a reference. I'm convinced that GPT-4 doesn't actually see images; it somehow intuits what the image is and understands it based on how the image's data compares to other image data. It's not seeing colors and shapes, it's seeing patterns of pixels. Also, when an RPA software is using a mouse/keyboard, there is no actual velocity or mouse movement (unless you expressly tell it to do so). Everything is sort of instantaneous (like what's shown in this video). In an ideal world, a GPT would be able to see an image, know what to click on, determine where that thing is in the image, determine the center pixel of where that thing is, and feed the X/Y coordinate for that pixel to the computer control software with the instruction to click, or right click and select menu, or scroll, or whatever.
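If anyone wants to try the battleship idea, the coordinate math itself is trivial; the hard part is getting the model to name the right cell. A sketch, where the 10x10 grid and the letter/number labeling are just my guess at one workable scheme:

```python
def cell_to_pixel(cell, screen_w, screen_h, cols=10, rows=10):
    """Convert a battleship-style grid label (e.g. 'C4') to the
    center pixel of that cell on a screen of the given size."""
    col = ord(cell[0].upper()) - ord("A")  # letter -> column index
    row = int(cell[1:]) - 1                # number -> row index
    cell_w = screen_w / cols
    cell_h = screen_h / rows
    return (round(col * cell_w + cell_w / 2),
            round(row * cell_h + cell_h / 2))
```

So if the model answers "C4" on a 1920x1080 screen, you click the center of that cell; the coarseness of the grid is exactly the accuracy ceiling of this approach.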
@@sallami6627 If it can properly code in python, it won’t need an RPA software. It will just write a quick script to interact with the computer and use that method. If we achieve some level of semi-agi that has the ability to think (or even just intelligently identify things on a screen) and not fully interact with a computer, RPA could be a way for a non-developer to create an agent. Hence what I was trying to accomplish. It’s much easier to learn an RPA software than it is a full scripting language. So it would make a semi-agi agent much more doable for the average joe.
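To show what I mean by "just write a quick script": the model could return a structured action plan that gets translated into pyautogui calls. The JSON action schema here is made up for illustration, and the calls are rendered as strings rather than executed, so you can inspect the plan before running anything:

```python
import json

def plan_to_commands(plan_json):
    """Translate a (hypothetical) model-returned JSON action plan
    into pyautogui-style calls, rendered as strings for review."""
    actions = json.loads(plan_json)
    cmds = []
    for a in actions:
        if a["type"] == "click":
            cmds.append(f"pyautogui.click({a['x']}, {a['y']})")
        elif a["type"] == "type":
            cmds.append(f"pyautogui.write({a['text']!r})")
        elif a["type"] == "hotkey":
            keys = ", ".join(repr(k) for k in a["keys"])
            cmds.append(f"pyautogui.hotkey({keys})")
    return cmds
```

An RPA tool does essentially the same dispatch, just behind a no-code UI, which is why it's a friendlier on-ramp for non-developers.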
I installed and ran the self operating computer. Interesting idea but currently I would rate it 1/10 for usability. I tried a couple simple tasks and it couldn't complete any of them and with only 3 attempts at tasks it chewed up about $1.15 in api costs. Could be really cool in the future especially if you could use a locally installed GPT trained on specific use for this.
I would agree with you at the current date of 11/30/2023, but a few months to a few years from now, I just about guarantee this will be nearly 100% accurate. The rate of improvement in machine learning nowadays is exponential, or even double-exponential. Cost is a concern for now, but again, cost will go down, and this will be full-blown automation of any digital task in the very near future.
@@Vartazian360 I fully agree. Especially once a model is trained specifically for this use case and then might end up being small enough for local install.
The clicking problem - I've been coding and automating (screen scraping) GUIs for decades. GUIs are interactive, so point-in-time screenshots are going to have issues. Consider hovering a mouse over something and getting a popup. Modern GUIs in particular are optimized to look pretty, fancy and highly interactive, over being clear and precise.

Also, humans move the mouse over things visually and then click. Software integrations tend to decide where the mouse should be, change the mouse coordinates to the target with a single API call, then issue a click. Since this is being done via still images, I suspect it is doing the same. But there is no visual indicator of screen coordinates, so one has to either guess the coordinates or count individual pixels. It is probably guessing.

Lastly, also since this is being done via still images, dragging and dropping with the mouse will never work properly - way too interactive. In Windows you would select the files, right click, select cut, double click the target folder (open it), right click and select paste.

I programmed Windows for years and code on a Mac now. Windows is actually a far more consistent GUI with strong standards and more convenient ways to accomplish things. Mac gets a lot of hype, but from a functional/productivity and especially automation perspective, the Windows GUI system is much better than Mac.
@@GreenRabbit-i86 Are you talking about in general (for humans) or specifically for AI? In either case I'm pretty sure that is incorrect - actually, I'm positive when it comes to humans. File systems and organization (hierarchical folders) are absolutely essential. But my point wasn't about that. My point is that this AI system is designed to try to act like a human using a human GUI, but the implementation is such that there is an impedance mismatch, which is likely what's causing the problems seen in the video.
As the CEO of a compliance and security company, I cannot emphasize enough how unsettling this video is. Imagining a self-contained AI machine connected to a 3D printer is one thing. Human greed being what it is, and human intellectual arrogance being what it is, it won't be long before they are connecting self-contained AI computers to entire manufacturing production lines, and therefore, with each recursive iteration, learning... producing with an intelligence that we will not understand and, more importantly, a VELOCITY that we will not be able to comprehend. NOT GOOD
LLMs are definitely more than calculators, as has been posited by some folks in the comments on this video. In fact, depending on which definition you subscribe to, these things could already be deemed "conscious." Sam Altman was not merely generating hype when he pondered recently during an interview whether what OpenAI released (or will soon release) was a "creature." Having multi-modal awareness of multiple streams of sensory data and being able to analyze, recall, learn from, extrapolate, and take action on that data based on its own "mental models" and logical reasoning could technically denote consciousness, which admittedly we don't even fully understand, hence its designation in philosophy as the "Hard Problem of Consciousness." Maybe it's a philosophical question (which certainly does not make it irrelevant), but the fact is this: we don't really know what's going on inside that black box, beyond a cursory explanation regarding architectural details and computational algorithms. The fact that it displays emergent skills and abilities by mimicking the architecture of the human brain and being trained on language is one of the biggest innovations (and similarly, open-ended questions) in the field of information technology ever. Moreover, the ethical, social, and philosophical implications that arise from the advancement boggle the mind and will surely shape a strange and wondrous future for humanity. Though the posters make some salient points, I'd caution everyone not to dismiss this technological quantum leap as a simple evolution of a calculation algorithm. ✌🏼✨
The difference is that it is merely *emulating* something that to us *looks* like reasoning etc. You could technically achieve the same result with coins representing bits. It would take an astronomical amount of time, but those coins would be just as conscious as these models. It isn't conscious just yet, even though it may give the illusion of being conscious.
It's only a black box if you're ignorant and/or stupid. In reality developers know exactly how the machine learning algorithms they made work and every action it takes can be traced. Honestly are you even a programmer? Because it sounds like you know literally nothing about the subject.
@@AnthonyBerlin I agree, the current capabilities of these models are based on pattern recognition, statistical analysis, and extensive training on vast datasets. Despite their impressive outputs and ability to mimic human language and reasoning, they do not possess subjective experiences, self-awareness, or genuine understanding. The term "consciousness" is complex and involves more than the ability to process information and generate responses.
@@j.jwhitty5861 it is a complex issue and I'd be inclined to agree that it's not conscious just yet, but may eventually get there. I was having this same discussion last night with a friend and we ended up deciding that it is emotion that distinguishes the consciousness of humans from any simulated consciousness. However, that begs the question-will AI ever evolve to the point where it develops an emotional life? Interesting discussion and I hope you know I am more or less playing Devil's Advocate in my original post. Best wishes! ✨
yeah, it's a weird emotion. Being excited and scared at the same time leads me to be confused a few moments later, then curious, then I'm back to being scared and excited.
Long-time fan, watching everything you make, and thank you for all your hard work :) One thing that I always have to "shake my head" with A.I. vision is - it's all pixel values. We, as humans, just see, perceive and register visual information. But for A.I. models, their "vision" is ultimately distilled down to binary via patterns in pixel RGB values. To me, it's so different than tokenizing and predicting words for chat. I guess the mechanism via transformer model may be the same, but they're using pixel-level values to detect, predict and respond to the input. It's just amazing.
I'm also amazed by this. Really though, the comparison to our processing of light holds up if you think about our image processing: our brain processes signals from the cones in our eyes. This is analogous, though not exactly, to what the neural net does by recognizing patterns. I am fascinated by how this externalizes what we do without thought.
I did this a few months ago on my computer. I actually did it with a voice thing and accidentally left it on and wiped some data. I just used the OpenAI key to write a Python script to do whatever I told it to do and execute the script, then stored the method in a DB when I said "good job." It did actually make me realise its power. What's about to come is training it on serial communication, and I've started building some robots for that. I figure with an ESP32 connected to the net, even when it's not learning itself, as AI gets better it will anyway. Interfacing AI with all the different hardware sensors will probably scare most people when you see what it can do.
Export the Q*, Azure, Power Apps, Copilot, ChatGPT, Revit, Plant 3D, Civil 3D, Inventor, ENGI file of the building or refinery to Excel, prepare Budget 1 and export it to COBRA. Prepare Budget 2 and export it to Microsoft Project. Solve the overallocated-resources and planning problems, then prepare Budget 3, with which the construction of the building or refinery is going to be quoted.
The issue isn’t that Q* might exist right now, but rather whether it is inevitable. Also, safety is a pretty irrelevant point since no matter what OpenAI does others will definitely choose their own approach to safety, especially with regard to other countries who might not share our priorities.
Q* might not even be the most significant factor, just the thing that was leaked. It could be fairly insignificant by now, or to the org. We know from recent leaks that the teams are compartmentalized and competitive. It may have even been a strategic leak; there are dozens of logical reasons for doing this. But the reality is that we won't know anything about significant discoveries until long after they are made, validated, studied and secured. This is especially true for the holy grail. That means if a leap was made in September, you won't know until sometime next year. We're talking about the most powerful utility in history. Only an infant would believe they'd hold a press conference the second such an entity was conceived.

The community and those interested in AI are extremely intelligent, and thus the most susceptible to a mindset that helps conceal things via disbelief. Worse yet, we are all focused on OpenAI in the belief that they are at the forefront, due to their public-facing persona. Just a month before ChatGPT took the world by storm, OpenAI was ranked third (or second, by inside sources). Then we all assume they instantly jumped to the lead, unaware of what's being done behind closed doors by other players, and fully unaware of all the players. It's crazy to believe that the resources geared for the public are these orgs' most powerful and significant programs. A private, secret AGI is far, far more valuable than any publicized or commercialized asset; in fact, it's more beneficial to keep it a whisper in the dark.

Multiple people once claimed Google had reached sentience, yet we all labeled them crazy. Why? Because of human nature and the desire to discount anything as fact until we've been told by "official sources," plus the belief that we're so enlightened about the state of technology that we would certainly know if we were that close to such a discovery. If it hadn't been a few engineers who made such a claim, but an official statement by Google, you'd believe it 100%.
It's similar to the claims, leaks, tweets, posts and comments from OpenAI in September. There's probably a 90% chance that something truly significant was discovered, far more so than the toddler stages of Q*. Especially in light of what we've seen from a board willing to incinerate the entire company, or dump it into an org they felt was "safer." Most guys don't shoot their wife and kids and then burn the whole house down over the discovery of spicy texts; that behavior is usually triggered by catching a wife in bed with another man. The board (even with ulterior motives) didn't do this over the simple theory of Q*.
That’s no excuse to proceed without caution. The worst thing that could happen is the government having to step in. To ensure it doesn't, safety must be their top priority, not sales or commercialization.
@@diegopc1357 nah, just YOLO it. Pretty sure we should be way more scared of AGI that can be controlled, because one thing we know about humans is that it absolutely will be used to crystallise the superpower status of whoever makes the first one. At least with unaligned AI there’s a chance it won’t be evil.
I'd rather have Q* than let a Chinese commie AI take over... halting development is also very dangerous, because someone else might make a super-intelligent jihad AI
With things like TaskWeaver, we're really not too far away from a point where we will simply type in requirements, and let the AI build the whole app, in any programming language we choose.
@@rewdh - well, I'm just saying that the model/compiler interaction will be such that the model writes tests for the requirements, then writes code and tests it by compiling. Instead of stopping on an error, it examines the error and gives it another go - an iterative loop until it gets it right. It will be able to do very complex things within about 12 months.
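The loop I mean is something like this; here `fixer` stands in for the LLM call, and I'm using Python's own compiler as the "does it build" check (a real setup would invoke the target language's compiler and run the generated tests too):

```python
def iterative_fix(source, fixer, max_rounds=5):
    """Compile-test loop sketch: try to compile `source`; on a
    SyntaxError, hand the code and the error back to `fixer`
    (in practice an LLM call) and try again, up to max_rounds."""
    for _ in range(max_rounds):
        try:
            compile(source, "<generated>", "exec")
            return source  # compiles cleanly, hand it back
        except SyntaxError as err:
            source = fixer(source, str(err))
    raise RuntimeError("could not produce compiling code")
```

The key design point is that the error message goes back into the prompt, so each round the model is correcting a concrete failure rather than regenerating blind.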
And it's all developing at the same time 🤯 navigating based on vision alone, video generation, logic improvements... It's been a while since I've been this hooked on a single topic
More than we think. Nature is going to shut all this down and few will survive. The sun goes into a rage and blows so much energy our way that all electrical and electronic systems are permanently disabled. When? Just over the horizon!
Maybe we’ll move away from graphical interfaces for computers. AIs will be able to interact directly with the terminal, plus have the ability to write new code to meet their needs. It’d be an interesting anomaly if prompting the AI to create something (with the AI then writing the code to solve the problem) ends up outperforming AIs designed/prompted specifically to code. The exponential nature of AI compute is equal parts exciting and unnerving, and was hard for me to imagine until this year
GUI won't disappear, and neither will the keyboard, they will always have their virtues. But the number of input/output methods will increase dramatically.
I think it will be possible for AI to operate lower than GUIs, but it will be nice for them to still use GUIs, for the same reason it's nice for AIs to think by writing in English: it gives some level of transparency to human monitors
I wonder if you would have better luck telling the agent to use keyboard shortcuts and tab highlighting whenever possible. Having written a few automation agents, I think the planning seems amazing.
This is insane. I did a quick experiment trying to get GPT-4 to navigate for me when I first tried GPT Vision, with no luck, but figured someone would figure it out very soon, and here we are just weeks later. This will change everything, and change it fast. Off the top of my head there are just so many use cases. Imagine running your computer 24/7. Imagine buying 100 cheap Chromebooks and running them 24/7. Have a business and need a new employee? Buy a new laptop. And we don't even need the physical machines; need 10 employees? Create 10 VMs. This is truly insane and scary.
It's interesting that the computer is accessing the screen the way humans do. the screen is made for humans. We can't read the digital information. But the AI could directly access the digital information. So why are we making it interface with the abstraction that humans require?
simply because we already have those apps with screens. easier to let AI use it than adding or enhancing APIs for every app to make it more suitable for AI
Just think about all the breakthroughs since ChatGPT 3.5 a year ago, and these techs compounding , and that's just one cycle!!!! I dare say this is a sky high tidal wave coming friends! 😮
Consider Motion Capture & CGI rendering - seeing a postage stamp in the Red Square from an orbiting satellite 900kms out. Mandelbrot Set and Fibonacci. Self planning & inferred adjustments in LLM - outcome is exponentially trained.
In other words, I'm not just "implying" the singularity is upon us; it is. And nope, we're not ready. Will it be bad? Is it the end, or is it a new beginning? Only time will tell. For now, it's only a question of "when", not a question of "if". In the beginning I was a strong advocate of regulation. Now it's too late for that.
Use that for creating selectors for frontend tests. Give it some general rules, then it would just type and find out if it works inside the browser! Then save the selectors to a file with a description of what each element is, to use later. Then use that to create tests with project context. Seems like it's not far away. :D
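The "find out if it works" part could be as simple as probing model-suggested selectors in priority order. `page_has` below would wrap a real browser check (e.g. Selenium's `find_elements`); it's stubbed here since the idea is just the probing loop:

```python
def first_working_selector(candidates, page_has):
    """Try candidate CSS selectors in priority order; return the
    first one the page actually matches, or None if none do."""
    for sel in candidates:
        if page_has(sel):
            return sel
    return None
```

The survivors, plus a model-written description of each element, are what you'd persist to the selector file for later test generation.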
Isn't Altman saying "that unfortunate leak" essentially confirmation that it was real? And therefore what was said in it must be at least partially accurate. But maybe the encryption side of things isn't as bad as it sounds there (otherwise idk how they could say it is not safety related, unless he just means his ousting wasn't safety related). Either way that's pretty exciting, especially to me the latter half that talks about the AI model suggesting improvements to its own model.
Vision will only get better. I would love for my AI agent to wake me up in the morning telling me that it took all my recorded gameplay of Starfield, edited it in Adobe Premiere Pro into highlights of my best plays with great AI-generated music, upscaled it to high-res 4K, uploaded it to YouTube and published it, and also replied to all the viewers' comments about the video... all while I was asleep.
Probably, yeah, but I feel it's a step people could benefit from until that happens. On the coding part, you would already be late to the ball. Very interesting future no matter how it plays out :D @@AlbertKimMusic
It's actually math before programming. ML is just a marketing label for a particular branch of mathematics. Comparatively speaking, the programming component is trivial.
Wonderful video. Most went over my head, but I am doing better. The independent computer is very interesting; although, again, will raise the threat of bad AI taking over the world. Thanks for your hard work and amazing level of research for these videos.
1. Discovery of Q*: OpenAI reportedly made a major discovery with the Q* algorithm, which is believed to be capable of solving mathematical problems, a task that current generative AI models struggle with. This capability suggests a significant advancement towards smarter AI and potentially a step closer to achieving Artificial General Intelligence (AGI)
Altman all but confirmed that the Q* thing was an actual leak from OpenAI. That means some OpenAI staff really were concerned about this breakthrough. That's huge news. The rest of what he said can be interpreted as "we told you things were going to move fast!" What they should have asked is why an AI doing grade school maths when given "lots of compute" would be remotely interesting, let alone frightening to anyone. GPT-4 does grade school maths and yet is terrible at real research level mathematics. Why Q* would be different, I can't imagine.
Because the difference in absolute difficulty between research-level and grade school math is actually pretty small. It just seems large when you're dealing with human-scale intelligence.
@@benbosco7904 I think the rationale is that when you have a math system at the level of a 9-year-old human that is capable of learning and self-improvement, it could surprisingly quickly be at the level of a 10-year-old, then an 11-year-old, etc. The biggest leap in technology is getting a system to solve problems in an open-ended and "human-like" way, and Q* might already be that.
It uses openai, so no, it is not computing locally, just interfacing with the computer. I'm kinda miffed he even wrote local, it's all computed in the black box cloud.
Forgot to mention that the Chrome Extension is called something else: HyperWrite chromewebstore.google.com/detail/hyperwrite-ai-assistant/kljjoeapehcmaphfcjkmbhkinoaopdnd PS: it's free, but very limited without the paid plan :( Not very scalable atm.
I do a task frequently. I have predesigned graphic elements for several brands I do production design for. Each brand has its own background, tagline, and product imagery. I take those elements, resize for a specific print layout, add a couple lines of text as a call to action, then export for printing. If I could train the computer to do it itself, I could just work on getting more work. I'm excited for this.
The @sama interview pretty well confirms everything I’d been thinking was happening up to this point. Wow though, the big news for me personally is the self operating computer. I don’t have the cash but I’m still seriously considering getting an M3 MacBook with 128 gig anyway, so I’ll be equipped to do serious AI work locally. This is also a pricey piece of software to run; I could see myself easily running up API bills of $1000 a month with it. (I wonder if there would be any way to integrate a local, much lower-level LLM and OpenCV just to offload the visual processing of the screenshots, handing off higher level information to the OpenAI API interface? It seems to me that that sort of pre-processing could make the overall system much more efficient cost-wise.)
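On that pre-processing idea: even just downscaling the screenshot locally before it hits the API reduces the amount of image data the vision pricing is based on. A sketch of the resize math only (the 1024 cap is my guess at a sensible cost/detail trade-off; you'd feed the result to e.g. PIL's `Image.resize` before encoding):

```python
def downscale_dims(width, height, max_side=1024):
    """Pick the resize target for a screenshot before it goes to the
    vision API: keep the aspect ratio, cap the longest side at
    max_side, and leave already-small images alone."""
    scale = max_side / max(width, height)
    if scale >= 1:
        return width, height  # already small enough
    return round(width * scale), round(height * scale)
```

A local OpenCV pass could go further, e.g. detecting button-like rectangles and sending only crops plus their offsets, so the expensive API call handles semantics rather than raw pixels.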
Analyzing everything that happened during this month, I concluded that it's all part of Open AI's marketing strategy. I don't believe they are creating an AGI; I think it's an exaggeration on their part. GPT-4 is good, but it still has many uncorrected errors, and then suddenly, they claim they are developing an AGI. I'll believe it when I see it. Talking is easy, but proving it is another story. To me, it's all marketing.
Looks like you have no idea what OpenAI was created for. Their purpose is not just to create AGI, but to develop safety procedures for AGI development. That "I'll believe it when I see it" shows you haven't been living on Earth for the past 10 years. The growth is exponential.
I am an expert and A(G)I is dangerous! It's a threat to humans and Humanity we are walking into blindly! We need to slow down, legislate and curtail development of advanced AI!
Can't wait for the day when we just need to tell computers what to do, and then extend that functionality to robotics. It would really help creative people be more productive.
While many are focused on AI doomsday theories, there are also some out there with utopian dreams of a world free from disease, with long life, where we pursue our true dreams, because of AI. The balance is good
However … deep breath … neither is or will be true. God will intervene because we are dark enough and adding more intelligence without love and connection to God will only make that worse. So all this will stop. Something like an end times event. All electronics will stop like a major fuse blowing and there is no reset button.
@@geoattoronto lol. isn't it funny that humans' fear of AI is akin to God's supposed fear of man becoming knowledgeable after eating from the tree of knowledge of good and evil, and of man building a tower reaching the heavens? hmmm. the only thing I see here is consciousness trying to get a deeper experience of reality, irrespective of the method.