In this video I discuss the problem of sparse reward enviroments and how OpenAI managed to solve it through the use of curiosity. The method is called Reinforcement Learning with Prediction-Based Rewards and it incentivizes visiting unfamiliar states by measuring how hard it is to predict the output of a fixed random neural network on visited states.
Support me on Patreon: www.patreon.com/user?u=25285137
Keep in touch: / sebastianschuc7
More on the topic: openai.com/blog/reinforcement...
26 июл 2024