9 Examples of Specification Gaming

Robert Miles AI Safety

Подписаться 156 тыс.

Просмотров 308 тыс.

50% 1

Видео Поделиться Скачать Добавить в

Опубликовано:

27 сен 2024

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 1,3 тыс.

@marccram6584 4 года назад

There was an experiment where crows were rewarded with a peanut for picking up trash. For each piece of trash the crow deposited in a special bin, the crow received one peanut. This worked great for a while until the crows ran out of trash and then the crows decided to hang around trash cans and assault humans who were trying to throw trash away. The crows would harass the people until they dropped their trash and then go get a peanut. Essentially the crows were taught to mug humans.

@fergochan 4 года назад

This is probably the best story here because it shows how universal this behaviour is. It's not just computers or AIs that do this.

@blahblahblahblah2837 4 года назад

It's entirely, intrinsicly biological behaviour. Simple action = reward. Humans would be no different, except that we have some concept of morality/longterm consequences. Although, a human probably _would_ do exactly this if there were very little consequence and no perceived better alternative way to obtain food. I guess it's not much different to a really persistent begger.

@gordontaylor2815 Год назад

@@blahblahblahblah2837 IIRC, the movie "The Terminal" has a sequence where the main character (played by Tom Hanks) does something similar to the experiment mentioned in the OP in order to get food for himself. (If you watch the movie, you would understand that the character is doing this for the reasons you describe in your own comment.)

@Mythologiga Год назад

Similarly, in India under british rule, there was a snake problem. To address it, the british started to offer rewards for any killed snake. At first, it worked, and people were genuinely capturing snakes. But after a short while, some people started to breed snakes instead in order to get more rewards. The program had to be cut after this was discovered. The new snakes breeders, now stuck with worthless animals, released them in the wild, erasing any gains made by the program.

@Hangman11 Год назад

@@Mythologiga Ergo Humans are Crows

@matesafranka6110 4 года назад

My algorithm teacher used to say, "The best thing about computers is that they do exactly what you tell them to. The worst thing about computers is that they do exactly what you tell them to."

@rentristandelacruz 4 года назад

There is a program known as Polyworld. The idea is to evolve artificial creatures via natural selection and evolution. One creature evolved a behavior of producing an offspring then eating it. The programmer initial forgot to add a cost when producing offspring so the cannibal creature essential has an unbounded source of food (it's own offspring).

@RobertMilesAI 4 года назад

It's on the list! Number 23, Indolent Cannibals tinyurl.com/specification-gaming

@lordkekz4 4 года назад

I've always wanted to make a game like that. I gotta check that out.

@BarbarianGod 4 года назад

I think that was in a Rick & Morty episode

@tim40gabby25 4 года назад

I played laserzone with a mate where we both hid squashed in a tiny barrel and alternated shooting each other. We got the first and second highest scores, picked up the cash prize, split 50-50 and departed like Kings, having earned approximately £5 for 90 minutes work, congratulating ourselves with a £2.00 celebratory beer, each, thus requiring us to restart the process as we had just enough for 2 tickets. By the end of the day it was like shooting very drunk fish in a barrel, yet we concluded we students were one up against the Universe.

@MouseGoat 4 года назад

@@BarbarianGod the ep where Beth finds out she has left her childhood freind to rot in a magical fantasy world? XD

@NoahTopper 4 года назад

That program that deleted the text file terrifies me deeply.

@GdotWdot 4 года назад

I read a story in a computer magazine long ago, about a hacking contest that didn't go too well. The goal was to store the given payload on the server. That's it. The rules were so vague one crafty participant just copied the payload at the end of an URL pointing to the target server, so that it would be saved in the logs.

@JamesPetts 4 года назад

@@GdotWdot I remember once when I first started my masters' degree, there was a treasure hunt in the MCR, where new students at the college competed for a prize by accumulating points assigned to various items (a small number of points for something like a conker, some more points for some college crockery, etc.) brought to an introductory party at which the prize would be awarded. The item with the highest number of points was the glasses belonging to the (notoriously difficult) person who maintained the college's IT systems. The idea was for people to find ingenious ways of trying to steal them (temporarily). I decided that the best way of scoring these points was simply to invite the person in question to the party, on the basis that, if he came, he would inevitably come wearing his glasses. Unfortunately, he did not accept the invitation, but the idea was a sound one.

@ukaszlampart5316 4 года назад

I think it is a big mess-up on creators part to even allow generated program to perform side-effects (they probably did not use subset of a language, but allowed for generating arbitrary code). In principle you could develop true AI this way, given enough time and computational power (I do not say it is practical, because probably all world computers at once, would not be powerful enough to get any meaningful results, space of possible programs being too large to narrow to the few which we would call "smart")

@user-qw1rx1dq6n 4 года назад

Direct quote from the AI before it pulled that: I’m gonna do what’s called a pro gamer move

@MainGoldDragon 4 года назад

The AI has run the numbers and the best solution it has found for saving the planet is deleting half the humans.

@valshaped 4 года назад

So an A.I. is an extremely skilled, unsupervised toddler being paid in candy to do a task

@LuaanTi 4 года назад

Yup, external rewards work about as well for AI as they do for humans; you find the easiest way to earn the reward, and don't bother doing anything without a reward :P

@renookami4651 4 года назад

Yes, and the ones supervising the task are people who are bad at stating exactly what they want, aka humans, so the A.I is doing exactly what they're been asked for as textbook definition goes...Yet the humans complain. xD

@carlosvega4795 4 года назад

This is what most pople refuse to understand... Except for the "skilled" part. They have no skills, they only do what they are designed to do, exactly as a tiger is for hunting. They're not skilled, it's just natural for them, since we are not hunters their ability seems "skilled" for only a skilled learned and expert human would have that same ability to blend, prowl and efficiently take down a prey sometimes without it even noticing. So goes with AI, they're not smart, they're not skilled, they're just naturals to the job, your human feedback is just a way to tell them "make this part bigger... no, wait, smaller, too small, a bit bigger... Good!, now do that exact thing but at the other side of the room since you were in the wrong place the entire time". Good thing AIs don't get annoyed :P

@gavinjenkins899 4 года назад

@@carlosvega4795 lolwut, tigers are extremely skilled hunters, AI is smart literally by definition, AIs could certainly become annoyed, what on earth are you on about

@monstertrucks9357 4 года назад

@@gavinjenkins899 He's right. AIs can never become annoyed. They are 0% sentient, 0% alive, and always will be. They will never feel annoyance, not even in the slightest degree -- it is not existentially possible.

@bubinasuit 4 года назад

I literally did a science fair project where the result of “can a genetic algorithm learn how to arrange solar panels efficiently” was “this genetic algorithm learned to exploit my raytracer”

@planktonfun1 4 года назад

hmm nice

@mistymysticsailboat 4 года назад

could you take a picture of it?

@Vaaaaadim 4 года назад

Hey! But perhaps this can still be of some use. If it cheats the system, then you've found something to bug in your system(which you may not have found otherwise), and if it doesn't then you might have an actually useful result.

@advorak8529 4 года назад

Good! You learned way more than you bargained for!

@robertvralph 4 года назад

@@advorak8529 hahaha... best comment.

@plcflame 4 года назад

I wish there were movies like that. AI isn't evil, it's just extremely good in doing what you asked for

@Khaos768 4 года назад

There are movies like that. All movies where an AI is trying to enslave and control humanity are basically that AI's efforts to follow the order that it was given to protect humans. Step 1: AI finds out that the greatest cause of harm against humans is other humans. Step 2: AI enslaves humanity to control humans and prevent them from hurting each other. There are even games like that, e.g. Cortana in Halo 5.

@mimszanadunstedt441 4 года назад

Human: 'Make me into superman' AI: Roger Roger. *Kills you and makes a Superman sculpture out of your corpse in a pose from a comic* If you wanna see something like task misinterpretation there was a brainwash episode in mlp where the resulting behavior was as such. Season 6 Episode 21. And if thats too light for you, then the Franken Fran novel is a 'monkey's paw' parody and horror manga series (with some amazing and gruesome imagery).

@djjoeray 4 года назад

Reimagining the 'deal with the devil' plot as an AI problem....I like it

@hugofontes5708 4 года назад

@@djjoeray AI is just a genie we can't have provide all our wishes... Yet

@hermask815 4 года назад

A movie franchise with the „what you wish for” theme are the ones called wishmaster . A djinn misinterprets wishes on purpose. #1 & #2 are ok b-movies .

@ZardoDhieldor 4 года назад

The hacker heart inside me just loves how AI creatively circumvents the restrictions/goals put in front of it. The boat example just makes me smile everytime!

@NoNameAtAll2 4 года назад

Is your avatar from To The Moon?

@ZardoDhieldor 4 года назад

@@NoNameAtAll2 Yup. My user name is inspired by To The Moon, too. It's Latin for "moon bunny".

@mimszanadunstedt441 4 года назад

idk if its circumventing so much as thinking it did a good job

@yondaime500 4 года назад

What is interesting is that the AI doesn't even know it is circumventing the rules, because technically, it isn't. It can't tell a bug from a feature. As far as it knows, it is only doing what it was told to do.

@ZardoDhieldor 4 года назад

@@yondaime500 I would go so far and say that even the term "artificial intelligence" is terribly misleading. AI is just as stupid as computers always were: Simply doing what it's told.

@ChrisD__ 4 года назад

Robert: "Give it a small reward for every frame the pancake isn't on the floor" Me: *already laughing hysterically*

@seriouscat2231 4 года назад

It could logically just freeze in place if there weren't any other conditions.

@badrequest5596 4 года назад

some time ago i was playing around with machine learning in unreal engine and i gave it control of a plank and put a ball on it. to make it simpler i just had the ball roll in on axis, so left or right, and the objective was not to drop the ball. it learned the trick really really fast. the best way to not drop the ball, was not to move the plank at all

@badrequest5596 4 года назад

after that i had the same AI learn how to drive a template car in the game engine and not hit walls. so what's the best way to not hit a wall? that's right! dont move at all! . i felt like an idiot for not seeing that one coming

@MidnightSt 3 года назад

i maintain that this one was embarrassing. give it a reward for the sequence "side A touching the inside of the pane, in the air, side B touching inside of the pane" AI would most likely still hack this one too in some hillarious way, but at least it wouldn't be an embarrassingly stupid mistake done by the researcher.

@cappuchino_creations Год назад

I kinda expected the whole plan to be just shaking in spacsm, because that would make the pancake jump all the time and have infinite airtime for each frame not touching the pan

@nowheremap 4 года назад

AI has already surpassed humans on malicious compliance.

@benwilliams5457 4 года назад

AI have not yet surpassed humans - humans are usually empathic enough to be aware of their malice and often attempt to conceal. The specification errors presented here are easily discoverable. The obvious solutions are to invoke an iterative process of refining the specifications until the path of least resistence/greatest reward is the one the programmer desires. In consequence.. 1.) the AI is now motivated to select modes of compliance which are less easily discoverable 2.) the AI effectively trains the programmer to specify the solution exactly. It seems that this might not be so bad; If the AI can find a cheat that gives results indistinguishable from the 'correct' result then it doesn't matter how. If the AI can iteratively train the programmer then the solution is eventually perfectly described, obviating the need for the AI solution. Of course, both of these outcomes probably require that the whole universe of all possible knowledge is catalogued, but then this is the root problem of, and solution to all hypothetical AI safety issues.

@ZenoDovahkiin 4 года назад

No, actually. There is nothing malicious about it. AI is rather accidentally glitch hunting.

@benwilliams5457 4 года назад

@@ZenoDovahkiin Malice implies agency, but the whole point of an artificial general intelligence - at least from the non-scientific cultural viewpoint - is that it has, or appears to have agency. Thus, characterising its actions as accidental is not appropriate. However, malice also implies a judgement of good vs bad so I agree that the neutral "glitch hunting" is more suitable than "malicious compliance". This leads me to wonder about whether a sort of moral code could be added to the AI reward function; e.g. a reduction in the value of the reward for any action for any harm that that is caused, or not avoided compared to an alternative action, or a null action. It would be rather like Asimov's Three Laws of Robotics. There is an obvious problem with defining "harm" meaningfully, but there are some simple initial approximations: for example: estimate the total number of healthy humans (within the scope of the AIs senses or field of action) and reduce the reward if the number is lower after an action vs. inaction. The "glitch hunting" nature of AI will doubtless look for short-cuts - attracting more humans with a delicious smell to offset any losses - but the code can be refined as the AI finds loopholes. This approach reduces the problem of hard-programming in every possible contrarian action of an AI to hard-coding every aspect of a moral code, such as we all have built in. Of course, this would fail utterly since no-one can describe precisely what their moral precepts are without listing by example or falling back on poorly defined aphorisms like "Do no harm", despite actively using that morality to police every aspect of our lives. If some one did make headway in an absolute definition of morality, any AI would tear it to shreds because human "morality" is an unevenly-applied, self-serving mess of contradiction that are made up of learned responses, laziness and enlightened self interest. It won't do much for AI safety but it might teach something about human-intelligence-safety.

@Scotch20 4 года назад

The AI is just doing what you told it do to

@anandsuralkar2947 3 года назад

True

@chrisjones5046 4 года назад

I teach about this in one of my lectures, it's a interesting sub-set of Goodhart's Law "When a measure becomes a target, it ceases to be a good measure". It turns out humans have been dealing with this one for a while. It sort of makes the AI more human.

@freefrag1910 2 года назад

truly a widely applicable thing. the same happens for "industry standard benchmarks". Once a benchmark program is accepted by the community you can see how the manufacturers fine tune every bit of the hardware to get the highest results, often sacrificing real life results.

@Redmanticore Год назад

@@freefrag1910 like wolkswagen. "Volkswagen had intentionally programmed turbocharged direct injection (TDI) diesel engines to activate their emissions controls only during laboratory emissions testing" "during the bank's (European investment bank, EIB) annual press conference on 14 January 2016, the bank president, Werner Hoyer, admitted that the €400,000,000 loan might have been used in the creation of an emissions defeat device.[348] "

@mrosskne 4 месяца назад

@@freefrag1910 Then why not make the benchmark "real life results"?

@youtubeuniversity3638 Месяц назад

Then how do we actually make good targets, if creating them out of measures is no go?

@mrosskne Месяц назад

Don't create targets.

@Randgalf Год назад

The funny thing is that this made me realize I often reasoned (and executed accordingly) like an AI when I was a kid. I had a knack for zoning in solely on the stated objective of any task with no concern for any underlying purposes, much to the chagrin of any present peers or teachers which I always found confounding.

@prolamer7 Год назад

interesting insight!

@piemaster6512 4 года назад

At 5:02 I absolutely lost it. I would feel personally attacked if my program did that to me. Fantastic!

@massimookissed1023 4 года назад

I'm surprised it didn't also stick its middle end-effector up at the programmer.

@Jacob-pu4zj 4 года назад

It was absolutely hilarious. They left it far too many degrees of freedom.

@TonyHammitt 4 года назад

Really needs the "Thug Life" glasses at that point, or a microphone to drop...

@freefrag1910 2 года назад

@@Jacob-pu4zj but that is the beauty of finding sometimes brilliant solutions

@karapuzo1 4 года назад

From the list: "CycleGAN algorithm for converting aerial photographs into street maps and back steganographically encoded output information in the intermediary image without it being humanly detectable." That's great, I am not even mad. Second place goes to "Genetic algorithm for image classification evolves timing attack to infer image labels based on hard drive storage location"

@renakunisaki 4 года назад

I like the one that figured out that lesions are likely to be cancerous if there's a ruler next to them.

@mrosskne 4 месяца назад

Can you explain what the first one means?

@karapuzo1 4 месяца назад

@@mrosskne search on Google "This clever AI hid data from its creators to cheat at its appointed task" (with the quotes) there is an article on this, also the paper arXiv:1712.02950

@sszone-yt6vb 3 месяца назад

Basically the GAN learned to sneak the answer into an image which was supposed to be (and looked like)a heavily transformed version of the image. By answer I mean the basically the original image! Here the goal was to create street map from areal photographs. I think CycleGAN basically worked by converting street maps into areal photos and another part of the network did the areal photo to street map. You can think of it as trying to create networks for both the forward and backward problem. But first part of the network managed to hide what the final answer should look like in the intermediate image! So it managed to basically cram in two photos worth of info into just one and the second part of the network basically read off the answer and outputted the final answer. It wasn't really converting the areal photos to the street maps. It was simply reading off the street map info which was hidden in the high frequency details of the intermediate photo generated by the first piece of the network. To a human eye it looks indistinguishable from noise. You can search online there are news articles on this.

@TrimutiusToo 4 года назад

Code bullet reference... He actually made that agent training thingy public if anybody want to try, though it doesn't have that physics bug anymore.

@knight_lautrec_of_carim 4 года назад

it's more of a direct mention than a reference

@TrimutiusToo 4 года назад

@@knight_lautrec_of_carim reference usually means any kind of mention.

@TheAechBomb 4 года назад

I think it still has the physics jank, he just limited the joints so it couldn't push the legs into its body

@phiefer3 4 года назад

@@TheAechBomb iirc he never "fixed" anything to prevent that, he just made it so that the agent died if any part of it besides its legs touched the floor

@underrated1524 2 года назад

@@phiefer3 I mean, that is a valid solution IMO. He wanted his creations to walk on their legs, so he made that a requirement for survival. ("I've found that to be a pretty good motivator." ~CB) I mean, what else was he going to do, tinker with Box2D itself? He has enough trouble using the darn thing, let alone debugging its bugs XD

@CraftyF0X 4 года назад

They sure as hell good in "thinking out of the box". They percieve no bounds for solution but the rules.

@inyobill 4 года назад

"We're going to get rid of all of the software engineers, the users will be able to just tell the computer what they want, and the computer will produce the correct software." (I was hearing this would be reality in 1990 by 2000, the end date keeps getting pushed back. "Oh, you have customers that are able to specify their non-trivial product in sufficient detail and rigor that the machine will produce the desired product. Hunh." The hardest part isn't the design and code, by a competent team. The hardest part is specifying the system behavior to produce the desired end.

@ABaumstumpf 4 года назад

There are very good code-generators for a lot of simple tasks, there are also generators for really complex things, even the design. But it always needs quite a lot of human work to be even close to useful.

@inyobill 4 года назад

@@ABaumstumpf Interesting observation. I was in the world of hard real-time, where predictability was a critical system characteristic, so my view-point is probably a bit skewed, even beyond the egotistical ("How could a machine possibly replace someone as brilliant as I?"). User Beta testing was/is not an option.

@ABaumstumpf 4 года назад

@@inyobill I only have to deal with very little hard-realtime systems, but most is not hard (but not as lenient as soft either). The only thing we have that is generated a the api-functions for database-calls, and some base-classes that only hold data. But in general code generation has come a long way, google and amazon use it extensively, alongside a lot of meta-programming, but it has to be very well structured and needs a lot of specifications.

@abdulmasaiev9024 4 года назад

"customers that are able to specify their non-trivial product in sufficient detail and rigor that the machine will produce the desired product." - well, see, it's for machines so it needs to be unambiguous. As to not reinvent the wheel let's use one of the unambiguous ways to tell computers what they're supposed to do that are already there like a programming language like C++ and wait a minute this software engineering replacement is literally just software engineering

@musaran2 4 года назад

Users don't even KNOW what they want.

@chrisofnottingham 4 года назад

The thing is, setting the wrong targets is what happens all the time even without AI.

@WarrenGarabrandt 4 года назад

Humans are amazingly great at setting the wrong target, and then maximizing for a metric instead of actually improving anything.

@TlalocTemporal 4 года назад

@@WarrenGarabrandt -- Sounds like every political and economic system past the commons.

@robertvralph 4 года назад

@@WarrenGarabrandt This literally explains so much of the failings of bureaucracy

@IPlayWithFire135 4 года назад

@@WarrenGarabrandt Climate change wants to know your location

@GeneralSorrow 2 года назад

Game achievements.

@chandir7752 4 года назад

There's something about these AI's when they act they way they do that has me rolling on the floor. Like they look so stupid but are actually extremely good at doing what they are asked to do. That pancake throwing technique lmfao imagine if they tried this in real life with an actual robot arm and suddenly the thing tries to set up a new pancake throwing record... I can't hahaha

@martiddy 4 года назад

AI be like: "look at me master!, I'm doing a good job" *proceeds to yeet pancake*

@FlesHBoX 4 года назад

I mean, the problem really is US and not the ai. We are clearly not giving them proper instructions because humans rely on a lot of implied and inferred meaning. Even the most specific of human spoken languages are nowhere near as precise and specific as even a rudimentary programming language. It's such a fascinating thing. I imagine that the process of creating AI has taught us a lot about how humans think.

@plcflame 4 года назад

@@FlesHBoX Strange enough to think that maybe, before we create an superintelligent and powerful AI, we need to create another language and adapt our brains to this.

@squirlmy 4 года назад

this must be your first video with Robert Miles? Maybe you are new to AI altogether? I'm just an enthusiast, not an academic, but the crazy things AIs do is the very first thing I learned about the subject, years ago. It's really strange to me to read someone mention this here as if it were at all unusual or unexpected. Yes, all computers, not just AIs, do exactly what you tell them quite literally.

@kaitlyn__L 4 года назад

@@squirlmy I learned about the "grow tall and fall over" thing over a decade ago from BBC4 and a few of the video game playing ones, but these other examples were new to me .

@Waffles_Syrup 4 года назад

Reminds me of tom7's nintendo learning program that learned that the best way to not lose points in tetris was to just pause the game

@renakunisaki 4 года назад

I love the ones that learn to break the game.

@khatharrmalkavian3306 4 года назад

Not playing Tetris is the same solution I found to Tetris.

@AdmiralJota 3 года назад

Joshua?

@mrosskne 4 месяца назад

@@AdmiralJota what?

@AdmiralJota 4 месяца назад

@@mrosskneReference to an old movie. ("WarGames")

@buzz092 4 года назад

Chortled at the "YEET"

@Bruva_Ayamhyt 4 года назад

**chortles heartily**

@Brindlebrother 4 года назад

_chortling continues_

@RobertMilesAI 4 года назад

@europeansovietunion7372 4 года назад

My favorite is the use of camera perspective to trick the human that was supposed to train the AI. Instead, the AI basically managed to train the human to validate what the AI wanted.

@LetalisLatrodectus 4 года назад

And the scary bit is that the AI isn't even malicious or anything. It's just doing what it really thinks we want to see.

@JaneDoe-dg1gv 4 года назад

A classic example of how it is best to assume ignorance before malice.

@PinataOblongata 4 года назад

@@JaneDoe-dg1gv Otherwise known as "Hanlon's Razor". For a given value of "best" you would think we could come up with a better algorithm for calculating likelihood of either behavioural driver, i.e., if it's a voter, weight ignorance, if it's a politician, heavily weight malice/self-interest, and if it's Trump, assume maximum levels of both ignorance AND malice ;)

@MouseGoat 4 года назад

my main fear finally brought to reality, and this is a dump AI if we ever build a strong one... well we f****d.

@Zer0Spinn 4 года назад

@@MouseGoat I was gonna say the same. Imagine this shit with an AI that controls the power grid, military systems, or god knows what. We have a lot of work to do if we really want to put this things to work...

@TheInsideVlogs 4 года назад

shameless plug: we built gym environments to study specification gaming where you can play the noodle game of this video as a human and see if you hack games as well as the AI. just google "quantilizers github" and you'll find it.

@Kratokian 4 года назад

You know, a good chunk on the list, I can't tell if the AI actually did anything wrong either. There's definitely something to be said for expectations as well as everything else. "Bicycle AI learns to circle around goal" Isn't that good if it's improving stability? Oh, I guess it never went 'towards' the goal at all Boat Race robot, people definitely value playing some games like this, so if it looks bad isn't it just because the material was too simple? Stenography, that's legitimately just a useful system that doesn't take any extra data or overhead to keep track of maps, when used in an existent system, good job robot!

@CoryMck 4 года назад

_"in real life you can't just be very tall for free"_ Guys lying on their tinder bios: *Says who‽*

@stardustreverie6880 4 года назад

liked for the interrobang usage

@fredericapanon207 4 года назад

So what is the Unicode for the interrobang? 'Cause I want to able to use it too!

@RoberttheWise 4 года назад

Normal person: (sees AI making dumb mistakes) "Haha, dumb computer. What is everyone so afraid of? This thing could never take over or destroy the world." Safe AI enthusiast: (sees the same thing) "Oh no, this thing will take over the world and destroy it if we actually let it do serious thing. We should be really careful with it."

@kaitlyn__L 4 года назад

exactly!

@kaitlyn__L 4 года назад

especially because these kinds of outcomes are already potentially affecting people's lives where software to help determine criminal sentences are involved?

@kaitlyn__L 4 года назад

which is some dystopic sci-fi shit btw, but at least that would have the courtroom painted black with neon lights, in the real world they look exactly the same but are shifting their operation

@RoberttheWise 4 года назад

@@kaitlyn__L I feel like sci-fi writers did us all a big disservice with their depiction of rogue AI. Gave us as a society a major blind spot for technology going rogue while still being very dumb.

@feha92 4 года назад

Correction: it is not everything that collides with him that turns into gold. It specifically states _everything he touches._ So he can still breath, but any air that touches his fingers (there is a reason it is called a midas finger, or green thumb) will continuously turn to gold (no idea if it welds with the prior gold that just fell, or if it becomes atomic gold dust oozing out of his fingers). And he can still eat, the implements used to do so will however become gilded. Similarly, if he scratches himself, he turns into gold. edit: also, the ground most certainly is not a single object. *Maybe* I can agree to it if you refer to a single casted slab of concrete used as ground, but even regular stone is filled with weird stuff I dont know about cause I am not a geologist, and earth has clumps of dirt aplenty.

@tricky778 Год назад

Then can't he carry gold claws with which he can hold things?

@ChayComas 4 года назад

"The robot just throws the pancake as high as it can" LMFAO

@bartsola8349 20 дней назад

I think the bit about Midas is brilliant because it subtly relates to the rest of the video, like how people have been finding loopholes in the narrative of the Midas story and how AI is finding loopholes in the "narrative" of the games and its objective

@georgew.9663 4 года назад

3:32 an accurate representation of my journey through life thus far

@columbus8myhw 4 года назад

Joke: Genie: "You get one wish" Midas: "I wish anything or anyone I touch turns to gold" Genie: "Ha, fool, you do not realize -- wait, what?" Midas: "I know what I said"

@Ghi102 4 года назад

I might be dumb, but I don't get the joke?

@martiddy 4 года назад

Poor Midas can't even touch his own pp.

@АлександрБагмутов 4 года назад

How nice of you to warn us first. "Joke:"

@wood-eye 4 года назад

@@Ghi102 Genie in a bottle. Rubbing counts as touching.

@drdca8263 4 года назад

Ghi102 the joke is that in this variant, Midas *did* want to turn the people he touched to gold, and the genie was surprised. It goes against expectations, and also might lead to the reader missing the “or person” in the first line, and then realizing it when the genie does. Produces a kind of double take?

@rancidmarshmallow4468 4 года назад

I'm not an expert, but I believe in the midas scenario 'touch' is interpreted as 'midas is thinking about the fact that he is touching this thing, so it turns to gold' or at least, I've read versions where he is initially very happy and touches various objects, and his clothes don't turn to gold until he realized he is 'touching' them, and his daughter does not turn to gold until he realizes he is touching her. since midas never thinks about that fact that he is touching the ground or air, it never turns to gold.

@mhorzic 4 года назад

Missed you man, love your research stories.

@Alorand 4 года назад

So "monkey's paw" is the very nature of how AI behaves? Why does this cause me to feel profound anxiety?

@michaelspence2508 4 года назад

Because you're paying attention.

@GuRuGeorge03 4 года назад

we act like AI as well. You just don't realize it. But here is one example to start with: Sex was invented for reproduction. We humans invented condoms to exploit sex for pleasure instead of reproduction. You see this pattern in literally everything we do. It just isn't as obvious as with the AI that you see here, because you're so used to thinking that whatever you are doing is "more intelligent"

@patrikcath1025 4 года назад

"Stop humans from killing each other!" "Lol okay, killing them myself"

@flymypg 4 года назад

I was an undergrad at UCSD during the early back-prop days with Bart Kosko and Robert Hecht-Nielsen, and developed an abiding fascination with following ML developments, though only as a hobby, never professionally. Robert Miles is, to me, one of the best at identifying and explaining both the fundamentals and some of the "curious corners" of current advancements. However, one YT video every 4 months is not nearly enough. I can haz moar? Puhleez? Nice haircut! Truly a good one in this era of online tonsorial self-mutilation.

@morkovija 4 года назад

I rofled at the original audio reconstruction!x)

@blahsomethingclever 3 года назад

The introduction of imagining Midas’ gift was cute. Several errors though: IF the curse was anything touching his flesh was to turn to gold, assuming only those direct atoms or electron clouds of a certain limit, Midas would instantly be surrounded by a reactive chemical cloud of torn apart molecules, probably smelling like ozone, reddish black in color. Ultra fine black gold dust constantly comes off him, coating him instantly in thick layers of dust. Midas just asphyxiates. IF some level of human story telling is allowed, the curse applies to his body and mouth only. Only 'objects' turn to gold. Easy fix then: hook Midas up to a feeding tube or IV as needed, build articulated clothing he can wear, problem solved. Moreover, if the entire planets some are replaced with gold at original density, it would violently shrink almost a thousand kilometers, and become incredibly hot due to gravitational contraction. Volcanoes of liquid gold would erupt into a landscape with mountains no higher than 200 meters. And things would still look black, due to the dust!

@puskajussi37 4 года назад

Just for fun, how about situations where a wish/AI would go horribly unwrong? For example: Someone makes a system that has instructions to maximize world "badness" or some such. Then the system reasons 1. "Badness" = ("badness" in the end ) - (how good thigs have been) 2. The bleakest (or "baddest") world state is the heat death of universe and that eventuality cannot be avoided. Thus it creates a prosperous, long lived utopia so the eventual tragedy of all that being lost is the greatest.

@Hexanitrobenzene 4 года назад

Interesting perspective on this problem.

@tricky778 4 года назад

I have never experienced such brilliance of thought from anyone! Or at least not in a form I could perceive, I hope you are a real and kind person. I need to believe that real and kind people are this brilliant.

@peterzerfass4609 4 года назад

Flipping the cube over so that top and bottom planes match. That one is pure genius. Well played AI, well played. Thing is, in some applications you actually want the AI to come up with these 'outside the box thinking' kinda solutions (e.g. in pharmaceutical resarch)

@tiagotiagot 4 года назад

Reminds me of that time when they wanted to end a snake infestation and so they offered a reward for every dead snake; so people started farming snakes; and when the government found that out they canceled the reward program, and so all the farmed snakes were released, making the snake infestation even worse than what it was before.

@halyoalex8942 3 года назад

The Cobra Effect :D

@tjnanimation6490 4 года назад

7:21 OH MY GOD! I KNOW THAT GAME!!! (I know code bullet too, good channel) That quote was buried SO DEEP in the back of my mind, I haven't heard it in like 15 years!!!

@MushookieMan 4 года назад

To solve the specification gaming problem, we only need to create AGI that can interpret what we meant, instead of what we said.

@plcflame 4 года назад

Or we can create another language and another way to think, more specific, before creating AGI

@hansisbrucker813 4 года назад

@@plcflame Like Lojban?

@davidwuhrer6704 4 года назад

Ah, the old DWIM problem.

@askarkalykov 4 года назад

Special +1 for specifically mentioning a bug depicted in CodeBullet laser kill show video :D

@fletchermorgan8351 4 года назад

I played Strategy Challenges of the World in 1995! Thank You Miles!

@RobertMilesAI 4 года назад

Found 'em!

@AyrtonTwigg 4 года назад

I could watch these examples 24/7

@LateralTwitlerLT 4 года назад

"oxygen molecules will turn to gold atoms, [...] and I guess the ground is just one solid object" kay then

@s6th795 4 года назад

7:30 me removing my tasks from the Scrum board

@PMA65537 3 года назад

Yossarian moved the bomb line.

@thomasjoyce7910 4 года назад

Thank you for the low odds, high payoff joke. That sound-byte from Nine Men's Morris was memorable for me too.

@TrimutiusToo 4 года назад

Code Bullet actually made that engine with learning agent running away from laser kinda public... Though mentioned bug with physics engine was fixed of course...

@Unprotected1232 Год назад

This actually could help expose bugs and exploits in video games. Has potential for QA when the technology matures enough for AAA development.

@danieljensen2626 4 года назад

I'm glad you featured code bullet because that was exactly the example I was thinking of in my head the whole time. Every time he's done evolution training like that the algorithm inevitably finds a bug in the physics engine to ecology l exploit.

@Martcapt 4 года назад

God, this is just comedy gold. All of these should be rearrenged with even more of a stand up comedy feel. An AI algorith comes into my cooking class. He tries and fails to flip a pancake. I just tell him: for crying out loud, please just try and avoid dropping it on the floor as long as possible. He then proceeds to fling it at the ceeling. It got stuck there. Then he turns to me with a look of glee in his eyes: I am good AI. Task successful. Edit: Fml, he got really angry when I tried to scrape it off and tried to kill us all.

@FunkGodPutin 4 года назад

Computerphile introduced me to this glorious specimen of an AO researcher. I just discovered this channel and I am elated as a result.

@jorice1592 4 года назад

The Midasocalypse! That would be METAL. Literally 😂

@davidwuhrer6704 4 года назад

That's heavy, doc.

@jackshae7 4 года назад

Literally it takes a lot for me as a person to want to like and subscribe in general, especially when watching a new channel, but I am two minutes in and I wanted to like, subscribe and even comment. I love the humor and excited to watch the rest of your videos!

@sylvainprigent6234 4 года назад

I really think that the pancake throwing thing is amazing. I just found your channel after having seen quite a few of your computerfile videos on the AI safety subject. And I find it a very interesting topic although I still only know what you teach in these few videos. Yet this rejoins phylosophical (and mathematical) concepts about how to define things. How do we define properly what we mean, all the subtle implicit and ambiguous part of the language and what do words mean I'll sure be watching more

@jonaswolterstorff3460 Год назад

These videos are just so incredibly valuable in light of the Yudkowskian viral podcast and people flailing about trying to understand these issues better (yep, I am the flailing one)...

@MarkAhlquist 4 года назад

I think the creative solutions to problems that AI comes up with will someday be the most useful thing about AI

@perjespersen4746 4 года назад

Midas' transformation wasn't a punishment but more like a gift from Dionysus for treating Silenus well. And he didn't starve but had his gift reversed by Dionysus by washing his hands in the riverPactolus.

@franklyanogre00000 4 года назад

If that were the case, Midas would have created a film of gold across his surface at most half a dozen atoms thick or so... then suffocated in a few minutes with a crinkling, tinkling sound. His family would have been just as safe as his food.

@lend9754 4 года назад

That "original audio" joke just won you a new sub

@alhazan 4 года назад

What I'm getting from this is that we should use machine learning to discover loopholes in real physics.

@underrated1524 2 года назад

This has actually happened at least once. Google "an evolved circuit, intrinsic in silicon, entwined with physics". The issue is that "magic" solutions like this tend to be surprisingly useless, because they often depend on contingencies in the training environment, like temperature or ambient radio signals, that can't be relied on in practice.

@Captain.Mystic Год назад

One of the original versions of the midas story was the fact that the person who gave his wish was as an actual reward from the gods, and the god immediately recognized it as a bad idea and warned him against it, and promptly un-""cursed"" him when midas found out he couldnt eat food and asked to change it back. Think of it as that one guy you gave your favorite controller to(you know the one) and he says he prefers the one not covered in cheeto dust when he realizes that because its the best, youre the one that uses it the most.

@MuPrimeMath 4 года назад

The reward modeling problem is really interesting!

@Violent2aShadow 4 года назад

"I knew everyone would die. I just wasn't sure what would kill us first." Hasn't that always been the case?

@pierrecurie Год назад

1) About a year after this was posted, Kursegsagt (sp?) posted a video about what happens if the earth turns to gold. The atmosphere compressing/boiling thing is nearly identical (they also consider 2 other scenarios). 2) Humans are just as vulnerable to specification gaming (some of the comments call it malicious compliance). There's an old story about some British colony having a rat problem. They tried putting a bounty on rats by giving money to people who brought in rat tails. A few months later, they were paying out large amounts on rat bounties, yet there was no meaningful reduction in wild rats. After investigating, they found that people were farming rats for their tails.

@justrecentlyi5444 4 года назад

Well I've just been randomly recommended this video, first time seeing the channel and something really interesting!

@JaredVeale 4 года назад

As a child I was told I could only have one cookie from a batch that was made. I took that to mean I could have half of every cookie, and then just have a whole one at the end

@Connorses Год назад

Reminds me of when the class was asked to write instructions for making a peanut butter jelly sandwich, and then our teacher followed our instructions very literally. "take 2 pieces of bread from the bag" (Teacher rips 2 pieces of bread off of a slice) "put peanut butter on one piece" (haphazard placement of peanut butter via fingers) "and jelly on the other piece" (more use of fingers to placejelly) "And now put the 2 pieces together so the peanut butter and jelly on the pieces of bread is touching." And we have a wad of bread with peanut butter and jelly inside, technically a sandwich but not what we wanted at all. She did this with as many instructions as she had time for. It was hilarious.

@Asdayasman 4 года назад

Koji did ok at making a pancake flipping AI in Unity. From memory, the environment is a couple of joints in an arm, a frying pan, and a floppy disc, the agent has some life total that counts down, and it fails if the life total reaches zero, or the pancake hits the floor. An amusing side effect of this was that first the agents learnt to not drop the pancake, but learning to flip it took ages because there were no breadcrumbs leading to that, and when they did figure it out, they waited until they have basically zero HP left, because "maybe it's going to stop counting down this time", then performed the fastest and most efficient pancake flip I've ever seen in my life. Here's the video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-DDhv7biBd4o.html It's at 2:10. The comment he makes is: "too good..."

@peteranderson037 4 года назад

Should we ever be so foolish as to create a general artificial intelligence (which, considering the forward thinking abilities of most humans is a 100% guarantee) based in what we have seen in the small scale a general AI will invariably be a consummate and insufferable smart-ass.

@TlalocTemporal 4 года назад

Counterpoint: A general AI will understand that being an insufferable twit will be a bad choice, and ask clarification questions.

@AnthonyBecker9 4 года назад

@@TlalocTemporal Only if this general intelligence is actually subject to consequences, which it might not be in an intelligence explosion

@Badspot 4 года назад

I made a primitive artificial life simulation for my 8th grade science project and one of the first things it did was find an off-by-one error I had made where nutrients at the edge of the play field could be eaten without them being removed. So they just zipped to the left hand side of the screen for free infinite food and reproduced until the program crashed.

@tricky778 4 года назад

Sounds like the housing market in the UK

@casey6556 4 года назад

Deleting the trusted output and returning nothing is basically a real world equivalent of Silicon Valley’s (the TV show) “I told an AI to fix bugs in our code so it just deleted all the bugs by deleting all the code”

@olivergilpin 4 года назад

Hey Rob, we met at Vidcon and talked about media polarisation - how’s it going? :)

@a.baciste1733 4 года назад

The unexpected, way too specific analysis about midas added when editing was truly hilarious. Completely my kind of humour here 👌(and strangely interesting)

@0ptera 4 года назад

Odd you didn't pick the Elite Dangerous example. AI exploiting a glitch to get superweapons and start hunting down players is hilarious.

@squirlmy 4 года назад

Rob may not be a gamer. I have no idea what "Elite Dangerous" was, I had to google it. Why do you think everyone is familiar with internet games?

@the11382 4 года назад

Where can I find more about it?

@Hexanitrobenzene 4 года назад

I did a quick back of the envelope calculation and it shows that gravitational acceleration - I assume that's what you mean by "gravity" - does not depend on density linearly. Here is why: gravitational force is F=GMm/R^2 = mg, where G is gravitational constant, g is gravitational acceleration, M is the mass of Earth, m is the mass of atracted body and R is Earth's radius. I suppose we assume that the mass of Earth is unchanged during our "densification". Mass is proportional to density and volume: M=ρV, volume of the sphere is V=4/3*pi*R^3. If we denote constants here as k, we get M=ρk*R^3, solving for R we get R=(M/ρk)^(1/3), using this in our first equation, we get g=GM/R^2=GM * (M/ρk)^(-2/3)=GM * (ρk/M)^(2/3)=G*M^(1/3)*K*ρ^(2/3), where K is just a mathematical constant. If we denote g_2 as acceleration after densification and g_1 as before, we get g_2/g_1=(ρ_2/ρ_1)^(2/3) ≈ (19.3/5.5)^(2/3) ≈ 2.3 times.

@nikosplugachev6610 4 года назад

I've got a little list haha

@guspaz 4 года назад

As some day it may happen that a victim must be found... Be sure to set your simulation parameters such that you are not on the list.

@DrellPoker 2 года назад

Dude it would be so awesome to have an A.I. that exclusively finds crazy bugs in games.

@DustinRodriguez1_0 4 года назад

My absolute favorite example of this sort of "specification gaming" was an article I read ages ago about a system that was designed to automatically build circuits. I don't remember the goal, but it was using genetic algorithms to evolve a circuit for some purpose. At the end of it, it produced a circuit that looked strange... and had at least one component that was literally not connected to anything. But, when they removed that disconnected component, the circuit stopped working. Completely baffled the creators. I don't know if they ever did investigate it more deeply and figure out how the disconnected component was involved, their theory at the time was that it was some kind of induction/RF interference voodoo.

@Hexanitrobenzene 4 года назад

EDIT: Actually, it is present in a list given in a description: "A genetic algorithm designed a circuit with a disconnected logic gate that was necessary for it to function (exploiting peculiarities of the hardware)" citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.50.9691&rep=rep1&type=pdf I read the paper, it is very interesting and not too technical, if you have some basic familiarity with electronics. By the way, there were also four other "cells" (components of FPGA), which, by looking at the connection diagram, were only passing signal through them unaffected, but altering the presumably unconnected elements of these cells degraded the circuit performance. Interesting...

@cleitonoliveira932 4 года назад

Yay, we'll all going to die from AI systems!

@mytech6779 4 года назад

finding loopholes /shortcuts is an inherent piece of intelligence. Cannot be separated.

@DeviRuto 4 года назад

That Midas thing is exactly the plot of a comic called The Midas Flesh.

@TheFinagle 4 года назад

"Computers do what you say not what you mean" is a concept every budding programmer gets very familiar with

@derektafoya1152 4 года назад

That beginning of the video derailment followed through to its conclusion bit was great

@nitramreniar 4 года назад

The second clip of the one with the bricks is absolute gold

@Yipper64 Год назад

8:14 I would have said every frame the pancake remains on the pan. Yes it wont flip the pancake when you start to try to train it to do that, but with time it may.

@EternalDensity 4 года назад

Fascinating! Especially the example which learned to delete its test data.

@mokopa 4 года назад

I was so fascinated by [the contents of] this video that, at the end, I snapped out of a trance. That rarely happens. Good job!

@lordkekz4 4 года назад

Finally another video! Keep up the great work!

@waltlawson2709 4 года назад

Of all things I expected to see, I did not anticipate a Strategy Challenges of the World reference. I haven't organically thought of that game for decades.

@SuperNintomdo 4 года назад

Hey! I just found your channel and I watched a BUNCH of your videos. I love all of it! I had a random thought, that I am sure is wrong, but at this moment I can't determine how I am wrong and wanted to share. It seems like most your examples have the AI have one terminal goal. Why not have an AI with multiple terminal goals? Aka Goal 1: Collect Stamps, Goal 2: Don't cause the death of a human, Goal 3: Don't change your terminal goals. Wouldn't a hyper intelligent AI, since it has no feelings, just go "Okay the best option for goal 1 is to destroy all humans, but then I CAN'T achieve goal 2, so what's my next best option?" I understand there are loopholes such as the AI saying "I didn't cause the death of all humans, I only produced so much CO2 that humans couldn't survive, the warming effects killed them" so if you do see this and try to answer my question I hope you can try to take human context into my assumptions and not just write off the question as not specific enough. This is already pretty long and I figured you wouldn't read a whole thesis lol.

@BhupinderSingh-xv6dk 4 года назад

Dude I had no idea you had your own RU-vid channel So glad man 👍

@thrallion 4 года назад

It's always a wonderful day when a new Rob Miles video comes out

@danielrhouck 4 года назад

Some-but of course not all-of these seem like they could be fixed just by simulating fro longer. If you have a tall thing that moves a lot when it falls, that same tall thing will end up moving relatively little if you triple the simulation time (if you train it with the longer simulation time it can grow higher, but the original solution isn’t good). Similarly if you extend the pancake simulation you’ll find that something that keeps the pancake in the pan gets a score that grows with O(t), where t is the simulation time, instead of a constant score.