What about granularizing the feedback to an extent and training simpler components of an AI system in a way we can supervise, then assembling them into an AI that starts off with certain abilities already intact, such that it is able to make better use of occasional feedback? So instead of a single score out of 10, we have a number of categories that reliably divide the tasks the AI performed into chunks it can more easily understand and correct, and instead of an AI that arrives as a blank slate, we have an AI that can already parse complex feedback from humans and already has basic knowledge of what cleaning is. The first thing that springs to mind, looking at how humans appear to learn, is that we start out with a set of tools that, while not very effective, allow us to self-assess to an extent and to acquire certain abilities much more easily. For instance, we have pre-built structures designed to parse language or something akin to it, and as a result we are much more efficient at learning language than a blank network, at the cost of slightly decreased flexibility.
On the topic of learning from sparse datasets: has there been any progress towards AI systems that can use deductive reasoning as part of the learning process? The ability of humans to learn abstract concepts and combine them to draw new conclusions, or to make predictions (i.e. testable hypotheses) that can guide further exploration, is very important for our ability to learn quickly from sparse input, without excessive use of trial and error. It would be interesting to know if there have been any attempts to emulate this type of learning with AI.
Looking at the cleaning example: what if you, first of all, give more than a single score, as someone else suggested? And on top of that, you have multiple machines that are set to do the same task. While they will take different approaches, at the end of the day they all share what they did and their reward, and attempt to collectively figure out what was right and what was wrong. This way there's way more data to work from, and you can use a hive-mind thing to increase learning speed.
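The "hive-mind" idea above could be sketched as a shared experience pool: several agents attempt the task independently, report their results, and every agent learns from the pooled data. This is only a toy illustration; the agent IDs, action names, and scoring are made up for the example.

```python
# Toy sketch of the shared-experience idea: multiple agents try the same
# cleaning task, pool their (agent, action, reward) records, and pick the
# action with the best average reward across all attempts.

class SharedExperiencePool:
    def __init__(self):
        self.experiences = []  # pooled (agent_id, action, reward) records

    def report(self, agent_id, action, reward):
        self.experiences.append((agent_id, action, reward))

    def best_action(self):
        # Collectively figure out which action earned the most reward on average.
        totals, counts = {}, {}
        for _, action, reward in self.experiences:
            totals[action] = totals.get(action, 0.0) + reward
            counts[action] = counts.get(action, 0) + 1
        return max(totals, key=lambda a: totals[a] / counts[a])

pool = SharedExperiencePool()
# Three agents try different cleaning strategies and share their rewards.
pool.report(0, "vacuum", 8.0)
pool.report(1, "dust", 5.0)
pool.report(2, "vacuum", 9.0)
print(pool.best_action())  # "vacuum" scores highest on average
```

The key point is that each agent's data multiplies: three machines generate three times the experience per day, which is exactly the "more data to work from" benefit described.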
I initially read "The Orthogonality Thesis" as "The Ornithology Thesis". Thankfully the human brain is pretty good at figuring out whether something makes sense in the current context, so mine made me take a closer look at the middle word.
I found a solution to the problem in the video. Since I'm not an expert on AI, my solution is obviously flawed, so if you could point out the flaw it would make me happy. Here's my solution for the cleaning robot: instead of making it clean in the real world, make the AI clean a virtual world using a virtual reward. You could then implement an accurate reward system in this virtual world and run the simulation really fast. Of course it would need computing power, and the simulated world would not be perfectly similar to ours, but I think it could be a really good start.
Probably works for a limited AI (able to perform its function in new environments and learn to a point, but not reason or plan), assuming your simulated world is close enough to the real world. You will probably end up with edge-case bugs, where the reward model derived from the simulated world concludes some things are cleaned properly when they are not: for example, if your simulation doesn't model squeegees bouncing slightly off the rubber seal around a window pane, because modelling the elastic collisions on small bits of rubber was too much compute power and programming effort for a very minor physics effect. It's more problematic for an AGI (a general intelligence, capable of reason, unbounded learning, and thinking ahead), which is potentially capable of working out that the training sim is just that, and then slacking off once removed from it. You know, like humans sometimes do: work properly through the job interview and when they know the boss is watching, then sit in the toilets and smoke when they think they can get away with it.
Thanks for the video! Just wanted to ask: is it possible for AI to intend to manipulate or deceive humans? I thought AI only tries to do what you program it to do. Keep up the great work!
Let me throw out one common-sense but probably wrong suggestion: why not train a robot like a person, using a maximum amount of supervision at the beginning and then reducing the amount of supervision over time?
Except it's better than a box of chocolates, because that box of chocolates always has that one gross walnut or pistachio or coconut flavor that nobody wants.
Terrible camera focus! Or were you vibrating faster than 60 Hz? If so, it may be the caffeine more than the camera. I recommend: 1. Print a resolution chart (search for "resolution chart pdf"). 2. Mount it where your face would be for a video. 3. Preset the camera for sharp focus, using a full-resolution monitor rather than the low-res camera viewscreen. 4. Replace the resolution chart with your face. Perhaps an auto-focus AI could help...
Just give a thumbs up to the videos you like and a thumbs down for the videos you dislike, based on your own personal criteria. His content should then continuously improve, based on your reward system.
I know this is just a hobby, and you've probably got lots of other important and busy work going on, but it would be really nice if you could make your videos longer and maybe a little more in-depth, so as to explain the topics more thoroughly. The topics and problems you mention are fascinating, but it always feels like a short introduction instead of a solid explanation of the topic. Perhaps that's what you're actually going for and I'm asking for too much without understanding your rationale, but I just love how you're able to unpack these concepts in such an easy-to-understand way, and wish you would develop them a little further.
Idea for Why Not Just: "Why Not Just: Put it in a Box." (i.e. keep it in a sandboxed system without network access where humans monitor what it tries to do)
This is not relevant to the video, but I didn't know where else to contact you. There's this game, an idle/incremental game, in which you are an AI which has been given the task of 'making paperclips'. www.decisionproblem.com/paperclips/ And so you do. You know what happens next, the same as with the stamp collector, but I highly suggest playing it a bit for a chuckle. I found the concept after I watched your videos super neat.
We have barely even figured out a way to cater to our own human issues surrounding reward hacking and morals, I really hope the computers can solve the problems before I'm too old to benefit. I'm rooting for you AI, and the dudes helping make it!
I think we need to split the problem. We need a main AI that gets a simple goal (clean the room) and doesn't care about killing humanity in the process. It should be possible to train this in simulations, since we can model physics pretty well already and we can define "clean" precisely. Then we add a second AI that can do nothing except reject commands that would lead to permanent changes to the environment other than what the first AI is told to do. For example, a dead baby is permanent, but a used ingredient can be replaced by something of equal value. The second AI would by default reject everything, so we need to teach it step by step what is allowed. This should be safe, since the AI always asks for permission. It may lead to "am I allowed to save this human from a burning house?", but that isn't a problem: it would learn quickly that it is allowed to help, and doing nothing is perfectly fine. I mean, that's what plants do, and we still don't discard them.
> and doing nothing is perfectly fine
Except the general public won't think so, since it will think of the robot as intelligent and plants as stupid. A stupid plant can make that mistake, but an intelligent robot obviously made a choice. The wrong choice!
Why does the operational-environment metric need to be the same one as in the learning environment? Why not supervise cleaning 100% of the time during learning, then do daily checks during testing, then daily checks once operational. Expensive initially, but the 'product' can be cloned and sent out to operational environments en masse. Montezuma's Revenge training with some supervisor (need not be human) in the training phase. Reminds me of training my children to put their own clothes on in the morning. No success so far.
I think there is a fundamental problem for the AI when playing Montezuma's Revenge: it does not know that jumping into a fire is bad. It does not even get a penalty to learn it. This is just world knowledge external to the system, and very important. Because it is important, and the AI has no chance to find it out, the game is "unfair" in a technically very strong sense. That alone could make it almost impossible to solve.
I guess one way the reward issue could be solved is to have another AI refine the reward definition. It could have its own reward function: increased reward for human satisfaction, and decreased reward for the other AI.
I'm not sure if this is one of the issues discussed in the paper, but a major problem in AI safety that worries me is not so much what problems an autonomous AGI could cause, but rather what problems could arise when a malicious human party (be it a government, terrorist organization, or lone wolf) seeks to harness an AGI to achieve their ends, for instance by cleverly finding and exploiting vulnerabilities in networks. Evidently we would need "defensive AIs" everywhere to maintain proper security, creating a whole slew of other problems (how do we keep *these* AIs in check? Certainly they need high level access to sensitive information, as well as a good amount of executive power, in order to do their jobs properly!)
Paying humans to test your self-driving car is too expensive. So you do what Tesla does: make them pay for the privilege. Sell a car that self-drives as long as you're holding the wheel. Record and report every single correction your thousands of customers make to the self-driving model. Amazing training information, AND you can blame the customer for crashes, as they were supposed to be watching.
Another concise and excellently reasoned presentation. While the paper and the video present a solid case for developing these types of controls, the economic assumptions presented don't appear to match the real world when looking at the development baseline responses. Coming from the future (2019), the most successful AI for driving is being developed with humans behind the wheel, under a system where Tesla is paid by its customers to have the customers train the car. The same is true for industrial robots, which have used human trainers since the introduction of intelligent machine tools. The economics at present appear to be driven by the task and the potential impacts of accidents. If the task already required a human and the risk of failure is bad enough, there is a cheap human already present to do the training, as with driving. If the task is done by humans and the risk of failure is low, the efficiency of the AI is not important, as with a Roomba. Given that the presented logic for this AI safety concern is sound, it would be nice to have an example where the economics of the AI implementation more explicitly produced the issue.
One thing is clear: we need a complete world simulation for AGI testing. You can philosophise however much you want but if you miss just one thing the world ends (for us)
3:15 It doesn't need to know if it did good. Just kill it and replace it with its brother. If bro is good, make bro breed with a robo-girl. Darwin for the win!
I'd want a cleaning robot that has me as a reward function. I could just sit and watch it clean, and watch youtube videos while it does all the work. Though, yep, if it were completely autonomous it'd be more valuable, and for industrial purposes it would not help much to have the cleaner do less work. Maybe as an accompanying cleaner, with the robot doing the heavy lifting and vacuuming. The example works well as an example, I'm not saying it doesn't. I feel a need to clarify that; this is the internet, after all...
But that would be rather lazy, wouldn't it? Why not get off your butt and clean your own home if you want it clean, instead of making someone else clean your mess?
Is there such a concept as starting robots off with what's already acceptable to humans? Given the example of the game, instead of blindly finding out that death and score have no correlation, start with a human player so it makes the connection and avoids the fire pits.
Yes, and this is often done when training AI for more complex games. Initial training is done by having the AI watch gameplay videos that have been annotated with keypresses and joystick/mouse inputs, and having it predict the player's next input from the current screenshot (depending on the AI project, this may be given as a standalone example, or with the context of preceding frames and actions). Then, once the AI is reasonably good at selecting what a human would do in a given in-game situation, you turn it over to a different reward system where it actually plays the game and is rewarded based on performance. This saves a lot of pointless flailing around and produces a more human-like playstyle, but might not find the most optimal playstyle. Mind you, often the most optimal playstyle found by an AI gets classified as undesirable behavior, despite looking a lot like what the best human players do; watch some speedrun videos to see what I mean.
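The two-stage approach described above (imitate human play first, then switch to reward-based learning) can be sketched in miniature. This is a toy illustration, not any particular project's pipeline; the game states, actions, and rewards are invented placeholders.

```python
# Minimal sketch of imitation pretraining followed by reward-based finetuning.
# Stage 1: learn, per state, the action humans took most often in demos.
# Stage 2: overwrite an action when actual play finds a better-rewarded one.

from collections import Counter, defaultdict

def pretrain_from_demos(demos):
    """Stage 1 (behavioral cloning): pick the most common human action per state."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

def rl_finetune(policy, episodes):
    """Stage 2: keep the imitated policy, but replace an action whenever
    play reveals a higher-rewarded alternative for that state."""
    best_reward = {}
    for state, action, reward in episodes:
        if reward > best_reward.get(state, float("-inf")):
            best_reward[state] = reward
            policy[state] = action
    return policy

demos = [("near_fire", "jump"), ("near_fire", "jump"), ("ladder", "climb")]
policy = pretrain_from_demos(demos)
policy = rl_finetune(policy, [("near_fire", "walk_around", 10.0)])
print(policy["near_fire"])  # walk_around
```

Note how the finetuning stage can drift away from human play toward whatever scores best, which is exactly the trade-off mentioned: less flailing and a human-like start, but the final playstyle may diverge from the demonstrations.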
I'm completely missing the point, but I guess another reward function for Montezuma's Revenge that might work better would be "if my score isn't updating, am I seeing a screen that I haven't seen before?", and then it theoretically knows it's exploring properly. Or it might just find some glitch that messes up the screen and displays random pixels, but that would be funny, so I'd allow it.
Actually, would something similar to that work for Breakout too? Probably not, but since the score can't go down (I think?), a score it hasn't seen before would be a higher score (I think; I'm not using my thinking brain right now). Or a screen with no bricks, because they were all hit by the ball, would be very different from the screen at the start with all the bricks there. It's not really relevant to the video topic, it just makes me wonder: is it more effective to play video games by trying to make something different happen, rather than going for the highest score?
I think OpenAI did that at some point, to make a more robust game-playing AI; I think they called it something like curiosity-based learning. The funny thing is, precisely what you mentioned happened: in one game there was a TV screen playing all sorts of videos, and the AI got addicted to watching TV. If I'm not mistaken, Two Minute Papers did a video on it, called something like "This AI is addicted to watching TV".
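The novelty reward proposed in this thread can be sketched very simply: fall back to a small "new screen" bonus whenever the score isn't moving. This is a toy version, not OpenAI's actual curiosity method (which uses prediction error rather than exact screen matching); the screen names are placeholders.

```python
# Toy novelty reward: use the score change when there is one; otherwise pay
# a small bonus the first time a screen is seen, and nothing for old screens.

def novelty_reward(screen, score_delta, seen_screens):
    if score_delta != 0:
        return score_delta        # normal score-based reward dominates
    if screen not in seen_screens:
        seen_screens.add(screen)  # first visit: small curiosity bonus
        return 1.0
    return 0.0                    # old screen, no score change: nothing

seen = set()
print(novelty_reward("room_1", 0, seen))  # 1.0 (new screen)
print(novelty_reward("room_1", 0, seen))  # 0.0 (already seen)
print(novelty_reward("room_1", 5, seen))  # 5 (score change wins)
```

It also makes the "addicted to TV" failure mode concrete: a glitched or ever-changing screen never repeats, so under this scheme it pays out the novelty bonus forever.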
DeepMind's developers probably thought of that, but what about decreasing the score by 1 every second (AI score = score - time)? The AI should pretty quickly figure out that getting points as quickly as possible is best for it, and I can't figure out any downsides of that...
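The "score minus time" idea above amounts to a simple reward wrapper, and writing it out also surfaces one possible downside. This is a hypothetical sketch; the wrapper and its parameters are not from any real framework.

```python
# Sketch of the score-minus-time shaping: each step subtracts a fixed
# penalty from the game's raw reward, pushing the agent to score quickly.

class TimePenaltyWrapper:
    def __init__(self, penalty_per_step=1.0):
        self.penalty = penalty_per_step

    def shape(self, raw_reward):
        # Possible downside: in a sparse game the agent sees roughly -1
        # every step, so ending the episode quickly (e.g. by dying) can
        # look as attractive as exploring for rarely seen points.
        return raw_reward - self.penalty

w = TimePenaltyWrapper()
rewards = [w.shape(r) for r in [0, 0, 10]]
print(rewards)  # [-1.0, -1.0, 9.0]
```

In a game as sparse as Montezuma's Revenge, almost every step returns the bare penalty, so the shaping gives little guidance about *how* to reach the points faster.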
Yeah, this one was going to be one video of like 12 minutes, but the end part was giving me all kinds of trouble so I thought I'd split it into two and publish the first part, just to keep updates regular.
Robert Miles Just some feedback: since your videos are becoming shorter, I thought it could be intentional, but I understand that long videos take longer to make and that means less regularity. Other than that your videos are great, keep it up!
David Brosnahan, an AI will pretty much always inherently be programmed with a reward system, and it's pretty typical to assign point values to such digital rewards. The AI isn't gonna just decide it likes something more than points if that's how it's programmed to behave. However, as discussed in previous videos, there are always loopholes, shortcuts, and cheats that must be considered, because if a robot finds a behavior that maxes out its points but isn't what we actually want it to do, it'll opt for the behavior that gives max points.
What am I saying, "always". I have no idea what we'll come up with in the future. But, with present AI technology, it's hard to imagine an AI without a reward system, and it's hard to imagine a reward system that doesn't boil down to a "points" system. But, I'm not very imaginative.