Would be really cool to see how the AI (once it's more intelligent, of course) manages to survive in different environments, for example in deserts or cold regions where resources are scarcer. Really looking forward to seeing how this project develops
I tried doing so, but penalizing him for being alive resulted in Chiklz trying to die as soon as possible instead of trying to learn :D One way would be to continue training from the point where he already knows how to survive and only then penalize his existence, but that's for another video :D
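(For anyone curious what that backfiring looks like in practice, here is a minimal sketch of a per-step living penalty in an ML-Agents agent. The class and method names are made up for illustration; only AddReward/EndEpisode are real ML-Agents calls.)

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

// Hypothetical agent illustrating the problem: a constant per-step penalty
// makes dying the cheapest way to stop accumulating negative reward,
// unless staying alive is rewarded enough to offset it.
public class SurvivalAgentSketch : Agent
{
    public override void OnActionReceived(ActionBuffers actions)
    {
        // Small "cost of living" every step.
        AddReward(-0.001f);

        if (IsDead())
        {
            // If death carries no extra penalty, the return of a quick death
            // beats the return of a long, costly life.
            EndEpisode();
        }
    }

    private bool IsDead() { return false; } // placeholder for the game's death check
}
```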
It could be a little house instead of the bed, and it protects him from the spider, but he gets hungry and lazy (less energy) so he has to come out eventually. Great channel, why don't you have more views? Get this man more views
Great work! First of all, I want to commend you for utilizing the standard player stats to produce a complex state machine! Now, a question: I see you didn't use imitation to speed things up, but when you re-introduced the workbench, did you use curriculum learning or simply introduce it as a set of game rules built into the player?
Thank you, and very good question! I am almost certain that using imitation and curriculum learning would have made the training faster; however, I did not use any of that for this project (mainly because I was too lazy to implement them xD). Before re-introducing the workbench, the agent had a few "blank" actions (if it decided to perform them, nothing happened). Once I re-introduced it, I added back the mechanics for those actions related to the workbench, and after a few iterations the agent realised they were no longer "blank".
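(Roughly, the "blank action" trick could look like the sketch below. The flag and method names are invented for illustration and not taken from the actual project; the point is that the action space never changes, only what the action does.)

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

// Illustrative agent with a fixed discrete action space. The workbench action
// exists from the start but is a no-op until the mechanic is switched back on,
// so the network's output shape never has to change between training runs.
public class ChiklzAgentSketch : Agent
{
    // Toggled when the workbench mechanic is re-introduced (hypothetical flag).
    public bool workbenchEnabled = false;

    public override void OnActionReceived(ActionBuffers actions)
    {
        int choice = actions.DiscreteActions[0];

        switch (choice)
        {
            case 0: Move(); break;
            case 1: ChopWood(); break;
            case 2:
                // "Blank" action: does nothing until the workbench is back,
                // so the agent only learns its value once the rules change.
                if (workbenchEnabled) UseWorkbench();
                break;
        }
    }

    private void Move() { }         // placeholders for the real game logic
    private void ChopWood() { }
    private void UseWorkbench() { }
}
```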
@@Zuzelo That's brilliant! At the inference stage, it's actually a great workaround alternative to behaviour swapping or a hypernetwork for complex behaviour. Ever since I started learning ML, I've been amazed at how some simple solutions get overlooked, or sometimes even forgotten and "reinvented" in other domains. Also, as brilliant as the Unity guys are, I was amazed to watch the joint presentation at Unite Copenhagen where two guys stood on the same stage, presented AI Planner and ML-Agents, and then walked away in different directions.

That is to say, the imitation system is a great proxy for phenotype, but rule-based behaviour can have a great impact as a proxy for genetics, or simply cultural artifacts. If I observe my cat: ever since she started going to the litter box, she knew she had to paw at something. Unfortunately, most of the time it's not sand but the wall of the box, but it goes to show that instincts are a great example of rule-based behaviour where the agent has no idea why they're doing what they're doing, and yet the emergent behaviour is still complex.

I'm looking at your code for v1, and what I like about it is that there is virtually no arbitrariness in the reward system, because staying alive can come from everything before it. It really makes sense if you think about it, since most of evolution was not about rewards but about avoidance of pain/death. The emergent behaviours then happened because adversaries continually changed; there's very little evolution in areas with no natural enemies.

This also allows for a lot of backwards engineering of complexity. For example, chopping wood can become a much more complex and interactive element simply by gradually adding time and destructible objects, while never having to switch to a new environment. Likewise, and that's something I'll be working on, depletion of health or the ability to defend can gradually be exposed to more complex factors, such as psychological states, without having to go too deep into them. For example, if an agent is high on the neurotic scale, they can have more impulsive reactions to sounds and stress impact on health, so they'll start monitoring that even though they don't know why they're more neurotic (pretty much like humans do).
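(Just to sketch what that last idea could look like in code, all of this hypothetical and not from the project: a fixed trait value scales how strongly a sound event raises the agent's stress, and only the stress is exposed as an observation, so the policy has to cope with a reaction whose cause it never sees.)

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

// Hypothetical sketch: a fixed "neuroticism" trait acts as rule-based behaviour.
// The agent never observes the trait directly, only the stress it produces,
// mirroring the idea of instincts the agent doesn't understand.
public class NeuroticAgentSketch : Agent
{
    [Range(0f, 1f)] public float neuroticism = 0.7f; // set per agent, not learned
    private float stress;

    // Called by the game whenever a loud sound happens nearby (hypothetical hook).
    public void OnLoudSound(float loudness)
    {
        // High-neuroticism agents react more impulsively to the same sound.
        stress = Mathf.Clamp01(stress + loudness * neuroticism);
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // The policy only sees the stress level, never the trait behind it.
        sensor.AddObservation(stress);
    }

    private void FixedUpdate()
    {
        stress = Mathf.Max(0f, stress - 0.01f); // stress slowly decays over time
        // Stress impacting health could be added here, e.g. a faster energy drain.
    }
}
```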
@@ispajic3 You are right, most of the time the simplest solutions are right in front of us but get overthought :D Also, I think the idea of an AI with actual psychological states sounds hella fun. (I might even borrow it for some projects in the future xD)
@@Zuzelo Oh sure, go for it. I'd be glad to help back after this video. In the simplest form, you may want to look at www.crystalknows.com/personalities/blog/the-big-five-personality-traits-in-personality-neuroscience/ The way you might approach some of these is: neurotic
No :( But besides the ML-Agents documentation: github.com/Unity-Technologies/ml-agents/blob/release_8_docs/docs/Installation.md there are two very good videos on how to set everything up and use it, from Code Monkey: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-zPFU30tbyKs.html and from Sebastian Schuchmann: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-_9aPZH6pyA8.html. Check them out!
Would be curious to know whether adding some randomness to the first environment would have caused him to learn to live on the edge less. If he can run a perfect loop over and over and just about stay alive, there's no need for him to improve, right? But if there's a chance that on any given loop he may lose just a little too much, maybe he'd make sure he was more prepared?
@@Zuzelo Just to clarify, I mean randomness in the penalties or in the rate at which campfire health etc. decreases, not randomness in the AI's decision making. But if that's also what you meant, then got you! Very much deferring to your expertise here.
@@nathanvanderriet209 Ah I see. Well, having a random environment would mean it is much more difficult (for the AI) to understand how the environment works. It might end up with an agent that is unable to learn at all.
@@Zuzelo Ah alright. Even if we were only talking about 10% deviations, I suppose just the fact that there is an element of randomness makes it a lot harder to understand.
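(For what it's worth, this kind of mild randomness is usually called domain randomization, and it is often done by re-rolling a few environment parameters at the start of each episode. A rough sketch of the ±10% idea, with all field names and values invented for illustration:)

```csharp
using UnityEngine;
using Unity.MLAgents;

// Hypothetical sketch of mild domain randomization: decay rates are re-rolled
// within ±10% of their base values at the start of every episode, so a single
// memorized loop is no longer guaranteed to keep the agent alive.
public class RandomizedSurvivalAgentSketch : Agent
{
    public float baseHungerDecay = 0.1f;    // per-second drains (illustrative values)
    public float baseCampfireDecay = 0.05f;

    private float hungerDecay;
    private float campfireDecay;

    public override void OnEpisodeBegin()
    {
        hungerDecay = baseHungerDecay * Random.Range(0.9f, 1.1f);
        campfireDecay = baseCampfireDecay * Random.Range(0.9f, 1.1f);
        // With the rates varying slightly every episode, the policy has to keep
        // a safety margin instead of living exactly on the edge.
    }
}
```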
Imo the AI doesn't craft items instantly because if it did, it would run out of resources. P.S. In the end, it survives, so it wins even on the first attempt. For the AI, materials and tools are useless; the mission is to survive, and it does that very well.