Artificial Curiosity

44K views

Curiosity is something that all humans exhibit in some way throughout their lives. Recently, a team at Berkeley published a paper on curiosity-driven learning and demonstrated how an intrinsic curiosity signal helped their AI agent learn to play the popular game Super Mario Bros. very efficiently by encouraging Mario to explore his options. I'll explain how it works in this video using code, animations, math, and the spoken word. This technology can be used to help make our systems more intelligent, and thus our applications more capable of helping other people. Enjoy!
Code for this video:
github.com/llS...
Please Subscribe! And like. And comment. That's what keeps me going.
Want more education? Connect with me here:
Twitter: / sirajraval
Facebook: / sirajology
instagram: / sirajraval
This video is a part of the Move 37 course at School of AI:
www.theschool.ai
More learning resources:
pathak22.githu...
pathak22.githu...
alumni.berkele...
www.technology...
• Reinforcement Learning...
Join us in the Wizards Slack channel:
wizards.herokua...
And please support me on Patreon:
www.patreon.co...
Signup for my newsletter for exciting updates in the field of AI:
goo.gl/FZzJ5w
#ArtificialCuriosity #SirajRaval #AI
Hiring? Need a Job? See our job board!:
www.theschool.ai/jobs/
Need help on a project? See our consulting group:
www.theschool.ai/consulting-group/
Hit the Join button above to sign up to become a member of my channel for access to exclusive content! Join my AI community: chatgptschool.io/ Sign up for my AI Sports betting Bot, WagerGPT! (500 spots available):
www.wagergpt.co

Published: 7 Sep 2024

Comments: 106

@Rahul-co3ie 5 years ago

My curiosity makes me watch your videos ❣️❣️

@Ricocotamus 5 years ago

Hey, I found a "mistake" (more a lack of precision) about sparse reward: it's not the fact that the reward is delayed compared to the actions taken. That definition is imprecise, and being precise here does not make things more confusing. Sparse reward is the fact that there's not a lot of reward above or below zero over time. A good example is chess: you either win, lose, or tie. That's sparse because out of 40-50 actions you only get one signal that is non-zero. You can have delayed rewards without being sparse (though it's then no longer a Markov Decision Process, but no environment used in deep RL is an MDP). Great video and well explained, as it's not an easy concept :)

@hck1bloodday 5 years ago

But in a complex scenario (think of OpenAI in the Dota 2 game) a sparse reward can be defined as a delayed one, since in the meantime you take a lot of actions, and when the AI achieves its reward it will be very difficult to know which of the actions led to the reward.

@Ricocotamus 5 years ago

As I said, a delayed reward can be dense. Sparse reward could be delayed but it's not a prerequisite :)

@toastrecon 5 years ago

@@Ricocotamus With chess, would some of the immediate reward be taking one of your opponent's pieces, maybe ranked by some value? It might not yield the sophisticated strategies needed to win against an opponent who is attempting more than a war of attrition.

@Ricocotamus 5 years ago

David Clawson, the problem with this is it's gonna lead to an algorithm that focuses on taking pieces instead of winning, which is not the same.
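
The distinction drawn in this thread, sparse versus delayed reward, can be illustrated with a small sketch. The reward sequences, the 3-step delay, and the helper name `reward_density` are all made up for illustration; they do not come from the video or the paper.

```python
# Hypothetical reward sequences for one 40-step episode (illustrative values only).

def reward_density(rewards):
    """Return the fraction of time steps that carry a non-zero reward."""
    return sum(1 for r in rewards if r != 0) / len(rewards)

# Sparse: a 40-move chess game where only the final result gives a signal.
sparse_rewards = [0] * 39 + [1]

# Dense but delayed: every move earns a reward, but each one arrives 3 steps late
# (assumed delay), so only the first 3 steps carry no signal.
immediate = [1] * 40
delay = 3
dense_delayed_rewards = [0] * delay + immediate[:-delay]

print(reward_density(sparse_rewards))         # 0.025
print(reward_density(dense_delayed_rewards))  # 0.925
```

Under these assumptions the delayed sequence is still dense (most steps carry a signal), while the chess-style sequence is sparse regardless of when its single reward arrives.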

@alexmanuele6983 5 years ago

Neat!!! A professor at my university is teaching us about his approach to reinforcement learning: A clever, computationally inexpensive genetic algorithm called Tangled Program Graphs. I look forward to discussing his thoughts on this curiosity approach with him. Thanks Siraj!

@chastibarna6931 2 years ago

Hi Alex, virtual classmate. Did you manage to debate it with your professor? (That was 3 years ago now.) But I'm curious, and I really hope that these days you're enjoying a good job or your studies. Greetings from Barcelona, Spain 😎

@chastibarna6931 2 years ago

Mostly it's that, even watching the video, I'm from another country and I don't understand what he says...

@ConnoisseurOfExistence 5 years ago

Let me repeat here my comment to the AI curiosity videos on the 2 minute papers channel: I see this curious AI as an actual AGI. Just give it a robot body, which can move around, have some sensors to perceive its immediate environment and its body condition (battery charge, damage), maybe some pre-programmed knowledge to recognize power sources, some 'hands' to manipulate the environment, and some long term goal - for example to survive longer, and let it explore around. It will learn new skills and it will apply them to ensure its survival. And it will develop itself with no limit. Of course, such a robot won't be safe for us, as sooner or later it may destroy us, if it decides that we're a challenge to its survival.

@lionelt.9124 5 years ago

Test your hypothesis. unity3d.com/machine-learning All hype aside, ML is essentially a function approximator as far as I can tell. Albeit a really interesting and exciting one though. Nothing can develop without limit. ML and DL still run on hardware and physics. I don't know about you, but every piece of hardware I've seen has had limitations and physics ...pffft... it's full of 'em too! Have you seen the speed of light? Me neither, but it's there!

@BitcoinmeetupsOrg123 5 years ago

World's greatest educator, world's greatest YouTuber and a mind sharper than Feynman's.

@CodingSteve 5 years ago

I see... Intrinsic rewards equal artificial dopamine then

@siliconmessiah1745 5 years ago

Something makes me realise that AI specialists will make excellent teachers who could contribute greatly in making the world a better place.

@Techieadi 5 years ago

Oh captain, my captain! Thank you for bringing these algos to us.

@colstoun4762 5 years ago

This one melted my brain. Great video.

@tim_house 5 years ago

How curious

@valentinevandi6575 5 years ago

I should collect my tuition and just give it to Siraj. Thanks for making this content so easy to understand.

@dennismusingila3012 5 years ago

This guy...This guy is so cool

@jingtying 5 years ago

I like your last words "explore not exploit" :-D Thanks for your videos

@jeff-xy7qp 5 years ago

I miss the days when people would complain about how fast you talked.

@movement2contact 5 years ago

Y THO

@perprit 5 years ago

Hey siraj, thanks for the great video again! As a small request, could you please add the references to the paper next time you upload other videos?

@aamir122a 5 years ago

Well presented. You got me curious.

@sean..L 5 years ago

Logically deconstructing curiosity and goal-reward motivation like this could potentially lead to a philosophical discussion of hedonism and human nature.

@antonlinden5216 2 years ago

Good video, please post the paper next time.

@christophersimms9128 5 years ago

I've been working on curiosity algorithms for several months now. It's promising, but what this video covered is almost a year old now.

@Kevin________ 5 years ago

Any up-to-date papers you would recommend that provide substantial contributions in this area?

@tp5691 5 years ago

Are there any focus algorithms so the AI can figure out what in the environment it is affecting and is affected by, and then cull the extraneous pixel data or what have you?

@SaveAsss 5 years ago

This is a very interesting topic. In the near future it will be worth coming back to.

@pauldirac4718 5 years ago

This is quality YouTube content. Why isn't this video trending :-^\?

@BelalHossain-zr9yg 5 years ago

Wow!!! Curiosity of AI

@erickknackstedt3131 5 years ago

I look forward to the day I understand. You are a treasure!

@mmohammadian5198 5 years ago

Hello SIRAJ, I'm from Iran. I like your content, please make more tutorials and more videos. GOOD LUCK

@benfreed6471 5 years ago

Best explanation I've seen of this topic so far

@abdulbasithashraf5480 5 years ago

Super intriguing to me. There's so much to learn.

@MrJorgeceja123 5 years ago

I'm curious to know how adding attention to the "intrinsic curiosity module" would affect the module itself.

@omarhammami96oh 1 year ago

This is really insightful!! Thank you so much, sir!

@shalabhsingh5007 5 years ago

Excellent explanation in such a short time frame

@leonhardeuler9839 5 years ago

Good to know you are a Splinter Cell fan, Siraj

@imkharn 5 years ago

Lol, curiosity is all it takes to succeed in life: 1) Fuck all rewards and pleasure, I just want to experience new things in life 2) Oh wait... apparently I get to experience a lot less if I don't have a job. 3) OK, finding a job, but only because I am curious about life.

@mesopable 5 years ago

Siraj is slowly turning into Freddie Mercury

@bkworld6565d 5 years ago

Thanks bro, you are making awesome content. Could you start a separate channel for explaining white papers and research papers, which would reduce the time we spend studying the math? Then we could work more intensively hand in hand.

@toastrecon 5 years ago

Awesome video, and a fascinating topic. Thanks!

@failogy 5 years ago

Artificial Intelligence + Artificial Curiosity + Artificial Greed + Artificial Survival + Artificial Fear + ... Put all these things together, and it will become REAL.

@gabrielleshull9106 5 years ago

Examinator Don’t forget artificial consciousness. Today’s unconscious a.is are on the level of jellyfish and insects. They merely respond directly to their inputs in the way they evolve to.

@ziad_jkhan 5 years ago

It's all about self-preservation and robots don't really need greed or fear for that and artificial empathy is all they need to be equipped with to make them safe.

@ziad_jkhan 5 years ago

Oh, to be more accurate, empathy can be understood as nothing but cognitive 'mirroring' in neurological terms. Humans are known to be equipped with 'mirror' neurons, and they are what cause us to instinctively experience what others experience through what our senses allow us to perceive around us.

@Papada00 5 years ago

Need artificial bravery

@katkosmos 4 years ago

@@gabrielleshull9106 Though, there is no way to know whether consciousness isn't just an advanced input-output system and thoughts are just one of those reactions. Though that gets into philosophy

@Jone952 5 years ago

Siraj has mad meme game

@skyacaniadev2229 5 years ago

Nice video, thank you!

@zool0941 5 years ago

hey siraj did you know bacteria have a tumble and run mode depending on the rotation direction of their flagellum? so you can be random - i.e. curious? tumble to turn, run to move. and hence the basis of curiosity?

@tonycatman 5 years ago

I just did a search. I couldn't find a Siraj Video on semi supervised learning.

@tylerangert9121 5 years ago

Holy shit!! Do you think there’d be any benefit to using a GAN / other generative model to predict the representation of the next state? I heard you emphasize “real state” a lot. Especially after reading the GQN paper, I can’t help but imagine that using those sorts of generative predictive models would be useful with the intrinsic curiosity module to really quickly train models that learn how to play in complex 3D environments.

@manashejmadi 5 years ago

Hey Siraj could you make an image colorizer - converting black and white images to color ones

@shubhamdeveloper753 5 years ago

Great explanation, thanks....

@josesaldivar655 5 years ago

Thanks Siraj. Can you please share with us a couple of factors that make you successful on the web? Thanks again

@daviddemeij 5 years ago

This is exactly the type of reward system that we should be careful about when we create more advanced AI in the future, curiosity can take you to very surprising places and it might not always be beneficial to us humans.

@EctoMorpheus 5 years ago

Fuck off.

@electronresonator8882 5 years ago

0:10 I'm sorry Siraj, but there's a proverb, "curiosity killed the cat", which means animals also exhibit curiosity. And about AI: I don't think the AI can actually write its own code as part of its curiosity without someone changing the code from an existing learning method, which means the AI does not exhibit curiosity, the human programmer does.

@gabrielleshull9106 5 years ago

Electron Resonator Humans cannot rewrite the code that dictates how our neurons grow, form connections, and change; yet it is self evident that we are curious. I think this a.i isn’t curious to nearly the same level as humans though because it has no conscious desire to learn, instead it has a simplistic reward mechanism for learning.

@lkd982 5 years ago

@@gabrielleshull9106 Yes, but I guess humans are on the path of being able to rewrite their own neuronal code, judging by genetic engineering research etc. But the appropriate conceptual context for this direction of thought (which is, as Siraj says, "learning from how our mind works"), however, must involve a respect for the actual history and prehistory of human culture and thought. Then we can see it saw existence, meaning etc in terms of the three: God, the World and the Human. The "human" only made sense in the context of the other two, in the conceptual development of mankind. It is not a matter of whether you believe in god or not. It is about honestly looking at "how the human mind works" - Curiosity comes before "work"!

@AP1919 5 years ago

What resource or online course is good for learning AI?

@f23anone82 5 years ago

Did I understand correctly: curiosity is the difference between the predicted future state and the actual future state. The bigger the difference, the more curious the agent?
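
The idea in the question above, curiosity as prediction error, can be sketched in a few lines. In the paper linked further down in the comments (arxiv.org/abs/1705.05363), the intrinsic reward is a scaled squared error between the predicted and actual feature encodings of the next state. This is a minimal sketch, not the authors' implementation: the function name, the hand-written feature vectors, and the default `eta` are assumptions for illustration, and in the actual method the features are learned rather than written by hand.

```python
# Minimal sketch of a prediction-error curiosity reward.
# Feature vectors are hand-written placeholders (assumption); a real agent would
# obtain them from a learned encoder and a learned forward model.

def intrinsic_reward(predicted_features, actual_features, eta=1.0):
    """Return (eta / 2) * squared L2 distance between prediction and reality."""
    squared_error = sum(
        (p - a) ** 2 for p, a in zip(predicted_features, actual_features)
    )
    return 0.5 * eta * squared_error

# A transition the forward model predicts perfectly yields no curiosity reward.
familiar = intrinsic_reward([0.5, 0.25], [0.5, 0.25])

# A surprising transition (large prediction error) yields a larger reward.
novel = intrinsic_reward([0.5, 0.25], [1.5, -0.75])

print(familiar)  # 0.0
print(novel)     # 1.0
```

So yes, under this formulation a larger gap between the predicted and the actual next state gives a larger reward, which pushes the agent toward transitions it cannot yet predict.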

@bAbApersianwale 5 years ago

Thank you 😀

@ianborukho 5 years ago

Dude you are so fkin clear! Even at 2x. Hell yea

@mohamedkhaled-qc7kb 5 years ago

I want to figure out how to learn math for machine learning. Do I have to learn all the linear algebra and calculus content on Khan Academy?

@TraceMyers26 5 years ago

Isn't your description describing progress in the level as reward?

@Donaldo 5 years ago

The model is not given level progress as such. That would be an extrinsic reward

@Personnenenparle 5 years ago

Does this approach consider the lack of prediction as a reward? Like... we get obsessed by stuff we don't understand... we argue until we are sure we are right or wrong... Also, memorisation should be rewarded. For example, to explore the world, adding data is good, verifying how things work is good, being wrong is good, comparing things is good. Anyway, I'm thinking a lot about AI... right now I feel like many people want to let them do the hard work and just watch them evolve, but I think we lack structure. We need to design systems better with math, physics and knowledge, and let AI do the things we absolutely can't do ourselves. We can't just scale AI indefinitely... we have limited resources... we are way smarter... let's use our brains

@ziad_jkhan 5 years ago

The only thing I'm worried about is the use of AI to exploit humanity. There are thousands of filthy rich sickos out there willing to invest fortunes on that. They'd take any risk to achieve their goal. They have no empathy or the monetary system has desensitized them so their AI will act like them. Strong AI without empathy, or cognitive mirroring rather, can have potentially worse destructive abilities than the nuclear bomb. The monetary system is to blame for that because it drives people to compete ruthlessly to survive.

@ziad_jkhan 5 years ago

Well, there's a solution against this kind of eventuality, and it's about getting rid of the root causes of those issues themselves first, i.e. money and trade. I can't go into the details here, but anyone curious enough to know more will find the necessary material on *tromsite . com*

@oraz. 5 years ago

Pretty damn awesome

@sheacrow3943 5 years ago

Question, could this process be used in concert with a Monte Carlo simulation to predict a spectrum of possible states and choose the best one? Creating something like machine imagination.

@phils744 5 years ago

Hi, to choose the best one according to what outcome....

@sheacrow3943 5 years ago

Phil S based on the forward dynamics model. My thinking is that more complex environments like the real world have too many features to effectively learn from, resulting in a noisy signal. We want general AI, so defining what affects our agent may not be feasible to accomplish that. Providing a method for the agent to simulate many different states and have that agent learn which is possibly best, based on the highest reward from the curiosity module, might provide a more general agent. One that could optimize in any environment and learn faster.

@EctoMorpheus 5 years ago

Sure, if you don't know it yet, look up the paper called "World Models" by David Ha and Jürgen Schmidhuber. Absolutely amazing stuff. They also use a state prediction model, but it outputs a probability distribution over the possible states, which can then be sampled. This "imagination" allows them to train an agent that hasn't ever seen the actual environment; they just sample a random state from the latent space and use the predictive model to pretty much simulate the environment.

@sarathkumarm8249 5 years ago

Hey, Siraj.. I am going to ask a completely irrelevant question in the comment section . . . . . . . . HOW do you edit your videos? ...:(

@Frankthegravelrider 5 years ago

Good video Siraj!

@BenchmarkRadio 5 years ago

whyyyy is your new content not reaching my feed! subscribed - and the bell is ringing :( I'm like 2 months behind! damn you Whitehouse!!

@JulianHarris 5 years ago

Really great. I watched this after trying to read the paper it's based on from May 2017 [1]. The question that's still not immediately clear to me is how the system decides what is relevant for the feature set (e.g. the snake thing) and what isn't (e.g. clouds). [1] arxiv.org/abs/1705.05363

@user-or7ji5hv8y 5 years ago

How does it identify what is relevant?

@Papada00 5 years ago

You dumb? He explained it in the video. Anything that can be interacted with is relevant

@Gvozd111 5 years ago

The backdrop tho

@evgiz0r 5 years ago

How would you know the cloud is irrelevant when compressing the data? That is an assumption about the game that the user just cannot make. One of the clouds in a later stage might interact with the env, but it would be invisible to us... Maybe the AI needs some "resolution" of what might be relevant and by how much, but that is something it needs to learn by itself. The whole game->neural network (or a combination of whatever) seems to be still a mystery and requires heavy involvement of the network owner.