Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning (SciRob 23)

A central question in robotics is how to design a control system for an agile, mobile robot. This paper studies this question systematically, focusing on a challenging setting: autonomous drone racing. We show that a neural network controller trained with reinforcement learning (RL) outperforms optimal control (OC) methods in this setting. We then investigate which fundamental factors have contributed to the success of RL or have limited OC. Our study indicates that the fundamental advantage of RL over OC is not that it optimizes its objective better but that it optimizes a better objective. OC decomposes the problem into planning and control with an explicit intermediate representation, such as a trajectory, that serves as an interface. This decomposition limits the range of behaviors that can be expressed by the controller, leading to inferior control performance when facing unmodeled effects. In contrast, RL can directly optimize a task-level objective and can leverage domain randomization to cope with model uncertainty, allowing the discovery of more robust control responses. Our findings allow us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 g and a peak velocity of 108 km/h. Our policy achieves superhuman control within minutes of training on a standard workstation. This work presents a milestone in agile robotics and sheds light on the role of RL and OC in robot control.
Reference:
Y. Song, A. Romero, M. Müller, V. Koltun, D. Scaramuzza,
"Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning",
Science Robotics, September 13, 2023
PDF: www.science.or...
For more info about our research on:
Agile Drone Flight: rpg.ifi.uzh.ch/...
Drone Racing: rpg.ifi.uzh.ch/...
Machine Learning: rpg.ifi.uzh.ch/...
Affiliations:
Y. Song, A. Romero, and D. Scaramuzza are with the Robotics and Perception Group, Dep. of Informatics, University of Zurich, and Dep. of Neuroinformatics, University of Zurich and ETH Zurich, Switzerland
rpg.ifi.uzh.ch/
M. Müller and V. Koltun are with Intel Labs
vladlen.info/
Music Credits: scottholmesmusic.com under Free Creative Commons License
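
The abstract's contrast between tracking a planned trajectory and optimizing a task-level objective can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names (`tracking_cost`, `progress_reward`, `randomized_params`), the 2D waypoints, and the mass and drag values are hypothetical and are not taken from the paper or its code. The reward shown is only a simple "progress toward the next gate" signal, not the authors' actual reward.

```python
import math
import random


def tracking_cost(positions, reference):
    """OC-style objective (illustrative): squared deviation from a planned trajectory."""
    return sum(
        sum((p - r) ** 2 for p, r in zip(pos, ref))
        for pos, ref in zip(positions, reference)
    )


def progress_reward(positions, gate):
    """Task-level objective (illustrative): total reduction in distance to the next gate."""
    dists = [math.dist(p, gate) for p in positions]
    return sum(prev - cur for prev, cur in zip(dists, dists[1:]))


def randomized_params(rng, nominal_mass=1.0, nominal_drag=0.1, spread=0.2):
    """Domain randomization (illustrative): perturb model parameters for each episode."""
    return {
        "mass": nominal_mass * (1.0 + rng.uniform(-spread, spread)),
        "drag": nominal_drag * (1.0 + rng.uniform(-spread, spread)),
    }


# Hypothetical 2D data: a planned straight line and a flown path that swings wide.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
flown = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)]
gate = (2.0, 0.0)

print("tracking cost of flown path:", tracking_cost(flown, reference))
print("progress reward of flown path:", progress_reward(flown, gate))
print("progress reward of planned path:", progress_reward(reference, gate))
print("sampled episode parameters:", randomized_params(random.Random(0)))
```

In this toy setup both paths reach the gate and earn essentially the same progress reward, while only the tracking cost penalizes the flown path for leaving the planned line. That is the sense in which a trajectory interface can restrict the behaviors a controller is allowed to express, under the simplifying assumptions stated above.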

Published: 28 Sep 2024

Comments: 23

@ArkoN7 1 year ago

Well done guys 👌👍 It's been a long way, happy that I could also be a part of it before my retirement (as a Drone Racing Pilot) 😉 best Regards Kay

@sepdriessen1058 1 year ago

While I am not an expert on either, it seems to be a matter of definitions. Yes, if you impose a trajectory as the target for optimal control, you are solving an inverse dynamics problem, which means multi-order differentiation. This significantly limits the scope of feasible solutions at the input level, yet it leaves you with a more tangible optimisation problem (finding a global optimum within the remaining scope of solutions). However, it is surely possible to also define an optimal control problem at the task level. This enlarges the scope of solutions, and likely exposes many local optima that are better than the global optimum of the trajectory-based optimisation problem described above. However, you now have an optimisation problem that is no longer tangible at a global level. This requires you to settle for (quasi-)local optimisation approaches, of which reinforcement learning is a fine example.

@thecontrolenggeek786 5 months ago

Wow... I was already sure that control theory could push human limits as far as one can imagine, but this literally leaves me speechless.

@limbo3545 1 year ago

Next step: add weather conditions. Damn this was incredible!

@suchetansaravanan-gh3og 1 year ago

INSANE

@aaxa101 8 months ago

I think these human racers are really good. The robot just gets 0.5s faster (10%)

@rogerbye4047 1 year ago

Music is too loud. You'd be better off without music, but at least turn it down to background levels.

@fanshi2271 1 year ago

One day, I dream I can achieve that. Damn! 😆

@SarkarAniruddha 1 year ago

And by that time, the drones will be 10x faster 😅

@islamdib9663 11 months ago

Each of us has abilities. You must try, and I hope you will get there one day.

@SpongeBob-xh8ir 2 months ago

Can't go outdoors 😂

@mukil_saravanan 1 year ago

Incredible! Please take me there 🥹🥹🥹

@bourr4766 1 year ago

Interesting, I wonder if it's possible to integrate OC techniques with DQN, to maintain the efficiency but make it amenable to formal verification.

@LukeVader77 1 year ago

Wow! This is really cool stuff! What splendid automation 😮 Thank you, algorithm, for taking me here 😊

@rocketmike9847 1 year ago

You can see in the young kid's eyes that he just realized he's out of a job :(

@peble_8807 1 year ago

Not really. Drone racing is a hobby, and an AI being able to fly an FPV drone changes nothing for top FPV pilots, as that AI is never going to compete against them in a race.