This video covers the hyperparameters you can set for Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), two commonly used state-of-the-art reinforcement learning algorithms.
Please let me know if you have any questions. I'll try my best to help you out.
PPO Paper:
arxiv.org/abs/1707.06347
SAC Paper:
arxiv.org/abs/1801.01290
Find me on:
Discord: / discord
Twitter: / bot_academy
Instagram: / therealbotacademy
Patreon: / botacademy
Download my Unity AI example:
github.com/Bot-Academy/BallJump
Credits:
15:54 - End
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Music: Ansia Orchestra - Hack The Planet
Link: • Ansia Orchestra - Hack...
Music provided by: MFY - No Copyright
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Contact: smarter.code.yt@gmail.com
Chapters:
00:00 Intro
00:17 Config: Release 1 vs. Release 3
01:00 Parameter: learning_rate
02:02 Parameter: learning_rate_schedule
02:36 Parameter: batch_size
03:55 Parameter: buffer_size
05:00 Parameter: beta
06:48 Parameter: epsilon
08:05 Parameter: lambda
09:20 Parameter: num_epoch
10:31 Parameter: buffer_init_steps
11:36 Parameter: init_entcoef
12:16 Parameter: save_replay_buffer
14:18 Parameter: steps_per_update
15:19 Parameter: tau
15:44 Outro
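To see the parameters from the chapter list side by side, here is a minimal Python sketch. It is not the config file shown in the video. The key names are copied from the chapter titles, and every value is an illustrative placeholder, not a recommendation. If the config is for Unity ML-Agents (an assumption), note that the toolkit spells the lambda key `lambd`. The helpers `linear_learning_rate` and `check_ppo` are hypothetical names made up for this example. The linear schedule is a plain decay to zero and may differ in detail from what a given toolkit does.

```python
# Illustrative values only; key names follow the chapter titles.
PPO_HYPERPARAMETERS = {
    "learning_rate": 3.0e-4,            # step size of each gradient update
    "learning_rate_schedule": "linear", # how learning_rate changes over training
    "batch_size": 1024,                 # experiences per gradient update
    "buffer_size": 10240,               # experiences collected before updating
    "beta": 5.0e-3,                     # strength of the entropy bonus
    "epsilon": 0.2,                     # PPO clipping range for the policy ratio
    "lambda": 0.95,                     # GAE lambda (advantage estimation)
    "num_epoch": 3,                     # passes over the buffer per update
}

SAC_HYPERPARAMETERS = {
    "learning_rate": 3.0e-4,
    "learning_rate_schedule": "constant",
    "batch_size": 128,
    "buffer_size": 50000,               # replay buffer capacity
    "buffer_init_steps": 0,             # experiences gathered before training starts
    "init_entcoef": 1.0,                # initial entropy coefficient
    "save_replay_buffer": False,        # keep the replay buffer when saving
    "steps_per_update": 1,              # environment steps per policy update
    "tau": 0.005,                       # soft-update rate of the target network
}


def linear_learning_rate(initial_rate, step, max_steps):
    """Decay linearly from initial_rate to 0 over max_steps (assumed schedule)."""
    remaining = max(0.0, 1.0 - step / max_steps)
    return initial_rate * remaining


def check_ppo(config):
    """Return a list of rule-of-thumb warnings for a PPO config (hypothetical helper)."""
    problems = []
    if config["batch_size"] > config["buffer_size"]:
        problems.append("batch_size is larger than buffer_size")
    elif config["buffer_size"] % config["batch_size"] != 0:
        problems.append("buffer_size is not a multiple of batch_size")
    if not 0.0 < config["epsilon"] < 1.0:
        problems.append("epsilon should lie between 0 and 1")
    if not 0.0 <= config["lambda"] <= 1.0:
        problems.append("lambda should lie between 0 and 1")
    return problems


if __name__ == "__main__":
    print(check_ppo(PPO_HYPERPARAMETERS))
    print(linear_learning_rate(PPO_HYPERPARAMETERS["learning_rate"], 250_000, 500_000))
```

The check that `buffer_size` is a multiple of `batch_size` is a common rule of thumb for PPO, since the buffer is split into batches for each update. Treat it as a sanity check, not a hard requirement.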
Jul 26, 2024