Fascinating conversation with Oriol about StarCraft, DeepMind's AlphaStar, sequence modeling, reinforcement learning, the Turing test, AGI, and the process of developing new ideas. Here's the high-level outline:
0:00 - Introduction
0:55 - StarCraft and Gaming
13:31 - AlphaStar
1:07:00 - Turing test
1:12:32 - Current limits of deep learning
1:25:35 - Process of developing new ideas
1:30:45 - Artificial general intelligence
1:43:33 - Next steps
This interview, like all the others I have seen that Lex conducted, was a treasure. Lex listens and rephrases with great warmth and understanding. Oriol explained technical issues in a very understandable way. Great! Two comments (a bit on the darker side, sorry):

1) Isn't the Turing Test a test for deception, i.e., can an AGI fool a person into believing that it is human? An AGI that could deceive humans could vote, spread false rumors, easily sell you things you don't want, swindle you, and eventually even imitate known humans. For example, your son who drops by to borrow money could be an AGI (and we thought spam email and robocalls were annoying!). Or your girlfriend might not really be your girlfriend; your wife or husband could be an impostor, as could the visitor from the IRS, or the police. Any effort by a Turing Vice Squad to identify AGI-"humans" could be defeated by superior intelligence, hacking of files, or personal infiltration. Deception gives animals a survival advantage, so we can assume that bad actors, and perhaps even AGIs, would use AGIs for deception.

2) Does anyone feel a twinge of sadness for the human side of AGI game-playing advances? Of course we cheer the advance in knowledge and the success in solving a difficult problem. But on the other side of the game is the entire human race, humbled by a machine. For the moment, we claim authorship of the AGI, but later AGIs will be created by AGIs, and our defeats may expand to include most, if not all, aspects of human accomplishment, including philosophy, art, research, and invention. Thank you. William L. Ramseyer
I think the micro advantage (and bot-like behavior, for that matter) of AlphaStar was understated in this video. One of the agents in the show match was blink-microing 3 groups of stalkers simultaneously on 3 different parts of the map. No one would mistake that for a human player; it's completely impossible for a human to replicate. A very simple solution to the APM problem would have been, instead of limiting its actions within a 30-second window or whatever, to only allow it up to 1 action per 200 ms, for example.

Even the agents we see on the ladder now generally play like idiots, but their raw macro/micro carries them to GM. For example, I recently saw a replay where the agent scouted one base location for a third, and it wasn't there; it was in a different base location. But the agent never scouted the other location and couldn't figure out what was going on for the longest time. In another game, the agent only built roaches and never built anti-air to deal with enemy void rays. If it had just made hydras instead of roaches it would have been a joke, but it was too dumb (bot-like) to change its behavior in real time.

AlphaStar is incredibly interesting and entertaining to watch, but it doesn't quite seem like a success from a "can we make an AI that makes interesting strategic and tactical decisions" standpoint. I don't think an AI will ever truly reach that potential until it can do metacognition on some level. That is to say, the agent should be capable of recognizing in real time when one of its decisions isn't working and trying something else. Right now, if the player finds a way to exploit the agent, it will make the same mistake over and over, exactly as you'd expect of a bot. Of course, this is the opposite of a trivial problem to solve, as it runs counter to the entire way AlphaStar is designed. AlphaStar doesn't really "learn" so much as "evolve" over many generations, so learning in real time is almost out of the question.
Anyway, AlphaStar is very cool, but not quite "human level" in all regards. It's exploitable in ways that a human is not, even if it can win thanks to its inhuman mechanics.
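The per-action cooldown idea from the comment above can be sketched in a few lines. Everything here (the class name, the timings, the agent loop shape) is hypothetical, illustrating the rate-limiting scheme rather than anything AlphaStar actually implements:

```python
class ActionRateLimiter:
    """Permit at most one action per fixed cooldown window (e.g. 200 ms)."""

    def __init__(self, cooldown_s=0.2):
        self.cooldown_s = cooldown_s
        self.last_action_t = float("-inf")  # no action taken yet

    def try_act(self, now_s):
        """Return True (and consume the slot) if acting is allowed at now_s."""
        if now_s - self.last_action_t >= self.cooldown_s:
            self.last_action_t = now_s
            return True
        return False


limiter = ActionRateLimiter(cooldown_s=0.2)
# Simulated decision ticks (seconds): only ticks >= 200 ms apart get through,
# capping the agent at 300 APM no matter how fast it "thinks".
allowed = [limiter.try_act(t) for t in [0.0, 0.1, 0.2, 0.35, 0.45]]
print(allowed)  # [True, False, True, False, True]
```

Unlike a windowed budget, a hard per-action cap like this rules out the short bursts of hundreds of effective APM that made the blink micro look inhuman.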
I think he is a bit too focused on the winning-against-a-human part, which is not really what the research was about. StarCraft is very different from chess, Go, and the Atari games they had worked on before. The MuZero learning technique they developed turned out to be pretty bad in this sort of environment, more limited than expected. What was exciting is that they developed a new learning technique, one where the AI can play a game in real time (not turn-based) and, most importantly, without having to know the opponent's action policy and without planning over every single little action, instead modeling the action transitions (together with stochastic processes) inside the AI without the game simulator actually performing them (and a bunch of other stuff that I'm skipping). So the advancement isn't the AI clicking a lot, and the bad thing isn't the AI clicking too much. Rather, it's the fact that the AI doesn't do worker rushes, doesn't place random buildings all over the map, and actually has some small sense of planning and plays somewhat sensibly. Something he thought was completely impossible around 2016, as he tells it. Playing StarCraft like a human isn't really much of a research topic; playing StarCraft sensibly at all is a real advancement, though. This is a good stepping stone, but indeed there is still much to develop in this field.
I’ve used the actions-per-minute numbers from AlphaStar, TLO (Protoss), and MaNa to explain the spread of a distribution to my students during my 1-month career as a lecturer. Damn, that's a hard job, props to you Lex!
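For anyone curious what that lesson might look like, here is a minimal sketch with made-up APM samples (the real TLO/MaNa/AlphaStar numbers are not in this comment). The point is that two players can have comparable means but very different spread:

```python
from statistics import mean, stdev

# Hypothetical APM samples, NOT the real match data.
apm = {
    "bursty agent": [150, 180, 220, 900, 160],  # occasional huge spikes
    "steady pro": [350, 370, 390, 360, 380],    # high but consistent
}

for player, samples in apm.items():
    print(f"{player}: mean={mean(samples):.0f} stdev={stdev(samples):.0f}")
# bursty agent: mean=322 stdev=324
# steady pro: mean=370 stdev=16
```

The standard deviation is what exposes the bursts: the "steady pro" actually has the higher mean, yet the bursty profile has a spread twenty times larger.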
Glad to hear they are still working on AlphaStar. I am sure they will be able to create an agent that can play any race and beat any top professional's main race. I am looking forward to that.
Oriol Vinyals got interrupted at 50:37 but never came back to it... What he was getting at, I believe, was that AlphaStar won a game playing scissors against rock, which was an unexpected technique (by utilizing a certain spell often and pushing hard against the enemy at specific points). Basically, it showed a way scissors can beat rock in StarCraft 2.
In the ChatGPT era this opinion is gold: 1:10:09 "I think GPT-2 being the most recent one, which is very impressive, but... to understand, to fully solve... passing or fooling a human to think that (...) there's a human on the other side... I think we're quite far!", answering a question about the Turing Test.
Wow, this will be interesting. I think I played AlphaStar. I have played 13.5k ranked matches since 2010. I play the game every day. Thank you for having this talk, Lex!! StarCraft 2 is the greatest strategy video game ever made. It's what helps keep me sharp. I am in the Diamond league, so I'm a mid-to-high-tier player.
Lex, I liked your questions digging into the '80s symbolic-AI integration. I want to hear more about systems using knowledge graphs and symbolic advancements in production/research. Great talk; I think you both are on the same page as far as approaches/attitudes.
I really believe we will have a general artificial intelligence in the next 7-9 years. MAYBE not human-level intelligence, but definitely a more general AI.
@@chi6168 Protein folding is one of the most important, yet unanswered, questions in biology. It's the same as if an AI found a new material for better batteries with double the capacity.
@@DavenH I am optimistic Hanabi will be solved in 2-3 years. In 2017 they said StarCraft would take 5 years to solve, but it took only 1.5 years.
Damn, I just watched this for the first time even though I like to play SC2, and DeepMind playing StarCraft got me really interested in AI stuff... Crazy that I never noticed this episode before.
You can already have superhuman intelligence in specific domains. It appears we already have that in several areas, like Go, medicine, etc... But the fact is, nobody knows if it's even possible to create an artificial general intelligence (aka strong AI). Therefore it's impossible to predict how long it might take.
@@6GaliX You are right, but if you look at the rate of progress in recent years, you can try to imagine what will be possible within a decade. I mean, look at AlphaGo. In a 2015 survey most scientists thought it would take at least a decade; it took only 1 year. Then look at AlphaStar. A few months before the match, a well-known scientist said it would take 5 years.
@@willasn9080 I am not denying that one. The problem when it comes to strong AI is that we simply don't know if it's possible to create something with consciousness. It's a different beast we currently know almost nothing about. I would even go as far as to say it might be impossible to create something artificial with any kind of consciousness. Most people just assume it will happen one day, since we have seen the concept in movies for decades. It's the same with the question of whether there is other life in the universe. The fact is, we don't know. As far as we can tell, it's super quiet out there for now. The chance that we are in fact alone is much higher than most people assume; they have been shown alien life on other planets in movies for decades, too. I hope you get where I am going with this.
@@6GaliX One question: why do you think an intelligent system has to have some form of consciousness? For example, look at AlphaStar. It has no consciousness at all, but at the same time I would call it an intelligent machine. In my opinion, intelligence and consciousness do NOT have to go hand in hand. They are two different things. But you are right that we don't know how to get to AGI.
It seems what is needed to solve general AI is a "meta model" that trains on how models succeed or fail based on how they connect (or not) with other models. Humans have ways to analyze, modify, and improve our more specific models of the world. For example, identifying a cat in a tree as a squirrel should be preventable by checking the cat result against another model of the world that includes an understanding of which other animals can climb, another model that knows about the relative sizes of animals, etc. It's the connections between models that are the key.

I disagree with Lex that resources like Wikipedia are the way to get there. As rich a resource as it is, Wikipedia really is a product of understanding many models of the world. To read and understand Wikipedia, one needs an understanding of many different models and how they connect. Attempting to train neural nets on this type of database, with so many built-in assumptions, could be a red herring. I think a better approach is to keep building and refining specific models (gaming models, physics models, biological models, etc.) and then work on meta models that bring these together, with the aim of what Oriol describes as learning to learn.

I also wanted to join the many other voices in congratulating Lex on sharing such an amazing set of interviews. Incredible work with far-reaching effects.
They seem to suggest AlphaStar is a stronger agent than AlphaGo, but I don't get that; they're just different domains, right? Say the StarCraft game speed were slowed down by a factor of 1000: then MCTS+CNN as in AlphaGo would work, simply because we would have the computational resources to run MCTS when the game is that slow, i.e., several seconds between each 'turn'.
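The time-budget half of this argument is easy to make concrete with a back-of-envelope calculation; the throughput figure below is purely an assumption for illustration, not a measured number:

```python
# How many MCTS simulations fit in one decision window?
sims_per_second = 10_000  # assumed simulator + network rollout throughput

realtime_window_s = 0.05                     # acting ~20 times per second
slowed_window_s = realtime_window_s * 1000   # game slowed down 1000x

print(int(sims_per_second * realtime_window_s))  # 500
print(int(sims_per_second * slowed_window_s))    # 500000
```

Even with a 1000x larger search budget, StarCraft's enormous action space and hidden information would presumably still be obstacles for vanilla MCTS, which may be part of why AlphaStar avoided explicit planning over individual actions.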
It seems to me that AI lacks generalization. DeepMind is probably aware of it, as they built reinforcement learning on the dopamine reward system, but what makes humans learn so fast and set their own goals is actually the nucleus accumbens-prefrontal cortex relationship in creating new neural connections (strengthening weak connections based on prefrontal cortex projection). Flexibility (across all races), robustness, speed, and efficiency of learning are what's needed from that. Something that allows the agent to distill observations out of their time relation and immediate reward; not immediate, but based on "reflection". A constant process of distilling the game into simpler and simpler elements, until the whole game can be played in a matter of milliseconds, and those set the rewards. Then the full action spectrum optimizes each element in a high-density, immediate-reward cycle, but the immediate reward is tied to a higher goal. In the end you have a set of hierarchical goals that you can optimize sequentially or in parallel.

Another thing I was wondering: can hidden layers connect in 3D structures, or is it strictly sequential (one layer connecting to the next)? I think if we want to imitate the brain, that could provide new possibilities.
I have the same sentiments as you. Deep neural networks are hard to generalize, especially if your inputs are not of the same distribution as your intended environment. So far, we have overcome this problem partially by augmenting images in the imaging domain. But, unfortunately, this is difficult in reinforcement learning.
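As a concrete example of the augmentation trick mentioned above, here is a minimal label-preserving image augmentation in plain Python; the function name and parameters are illustrative, not from any particular library:

```python
import random


def augment(image, p_flip=0.5, noise=0.05, rng=None):
    """Random horizontal flip plus small pixel jitter, clipped to [0, 1]."""
    rng = rng or random.Random()
    if rng.random() < p_flip:
        image = [row[::-1] for row in image]  # mirror left-right
    return [
        [min(1.0, max(0.0, px + rng.uniform(-noise, noise))) for px in row]
        for row in image
    ]


img = [[0.1, 0.9], [0.5, 0.5]]  # stand-in for a tiny training image
out = augment(img, rng=random.Random(0))
print(len(out), len(out[0]))  # 2 2
```

The flip works for classification because a mirrored cat is still a cat. In RL the analogue is harder: perturbing an observation can change which action is optimal, which is exactly the difficulty the comment points to.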
@Amanuel Temesgen The US military just greenlit 'Skyborg', an AI flight system that they hope may one day be taught to run missions autonomously, possibly even targeting and firing on targets. They watched the movie Stealth and thought it was a great idea. And there are multiple instances of swarm tech being developed. Pair that with the funds being dumped into AI by the military, and it's pretty easy to see all the pieces slowly coming into place for a situation like Skynet. It won't be started by an intelligent program, but by a lazy programmer who wants to take a nap while the drones they are responsible for do their job... and bam, they wake up to being blown up by Skynet.
@Amanuel Temesgen That's the great debate, and why I'm FOR progress with AI, not against. That being said, at some point we will need treaties on how AI can engage in combat, among other things. Without any regulations, AI will run amok at some point in a way that WILL prompt regulations. Realistically, conventional laws cannot keep up with the development of AI. The argument for developing the tech because "if we don't, others will" is valid. Likewise, if we develop it first, then we can also develop the first set of countermeasures. That all being said, your argument doesn't debunk my tongue-in-cheek comment that 'Skynet is coming'; if anything it further proves my point. Hopefully Skynet is benevolent... nothing says we can't develop an AI that spreads good and is beneficial.
Just started this. Fascinated to see how they set up AlphaStar. In theory, it's easy to make an AI that can beat a person in StarCraft (much easier than in chess or Go) by virtue of having unlimited APM (actions per minute) and conceivably, though not necessarily, full map vision. So I wonder what constraints were put in place, if any, to demonstrate that it is the "intelligence" doing the winning as opposed to brute-force computing power...
It looks like he addresses this exact concern starting at 41:30. Edit: and his assessment is the same: they limited the AI's APM to keep it human-like (and competitive), knowing full well that removing this limitation gives it god-like abilities to multitask and micromanage units.
If you watched the show matches with TLO and MaNa, they talked about the constraints of a human: APM and camera focus. Look at AlphaStar's blink stalkers vs. MaNa's immortals, which typically do very well against stalkers. In this case, AlphaStar was able to control multiple groups of stalkers with inhuman micro and maximize the utility of blink, so the stalkers could beat MaNa's superior unit composition. Instead of having the intelligence to get into the mind of the opponent (theory of mind, strategies in poker and RPS), the AlphaStar league taught that particular AI that blink stalkers with beyond-human control had a better chance to win. When they implemented the camera constraint, MaNa was able to win, but that was against a newly trained network.
@@bitcrusherNOOP Very cool insight, thanks. I was a hardcore Protoss player, so that blink stalker vs. immortal example is very interesting. I've actually started watching the show matches on mute while listening to this interview, but I haven't arrived at that game yet, I don't think :D
Let's be more honest. A true general AI that can pass the Turing test would know not to pass the Turing test. So what does the Turing test actually prove? Furthermore, things that would pass today's Turing test, may not pass tomorrow's. Imagine how easily you could fool an eighteenth century philosopher with today's technology.
From here you can see 3 guesses at which paper he is talking about: ai.stackexchange.com/questions/12095/name-of-paper-for-encoding-representing-xy-coordinates-in-deep-learning
Because it has made a risk analysis over the entire probability space. The detriment of 2 extra probes is statistically less than having your workers attacked and ending up below the optimal count. Most pros have now adopted AlphaStar's worker count.
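That risk-analysis intuition can be written as a toy expected-value comparison. Every number below is made up purely to illustrate the tradeoff, not taken from the game:

```python
# Toy expected-value model for over-saturating workers.
p_harass = 0.3           # assumed chance your mineral line gets hit
loss_if_buffered = 50.0  # assumed income loss when spare probes absorb the hit
loss_if_exact = 200.0    # assumed loss when you dip below saturation
cost_of_spares = 25.0    # assumed steady inefficiency of 2 extra probes

ev_extra = cost_of_spares + p_harass * loss_if_buffered  # 25 + 15 = 40
ev_exact = p_harass * loss_if_exact                      # 60
print(ev_extra < ev_exact)  # True: over-saturating is the lower-risk choice
```

With these assumed numbers, paying a small constant inefficiency beats occasionally dropping below saturation, which matches the comment's point about trading a little economy for robustness.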
Lex, can you vary your voice tone a little bit? The information in this podcast is GOLD, but I fall asleep so easily with so much monotone. :/
Install AlphaMatrix input interface in your neck then the driver and download the latest engine. You will be converted to an android and never fall asleep again!
Can someone teach this guy speaking skills, specifically how not to sound like he is at a funeral? The content is interesting but expect the organ to start playing the requiem in the background at any moment.