Your content is head and shoulders above the rest on this topic. Kudos! Reward +1000! The music unnerves me as I try to concentrate on the ideas, though... reward -0.1
Running the max() function over the complete priority buffer hinders performance by a substantial amount. I would store the max priority in a variable, compare it with each newly added error, and update the variable when a larger one arrives. That variable can then be used when assigning priorities to new experiences.
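For what it's worth, here is a minimal sketch of that idea: a cached running max that is updated in O(1) on insert instead of rescanning the buffer with max() in O(n). The class and method names are made up for illustration, not from the video's code.

```python
class PriorityBuffer:
    """Toy priority buffer that caches the running max priority (sketch)."""

    def __init__(self):
        self.priorities = []
        self.max_priority = 1.0  # assumed default priority for a fresh buffer

    def add(self, priority):
        self.priorities.append(priority)
        # O(1) update of the cached max, instead of max(self.priorities) in O(n)
        if priority > self.max_priority:
            self.max_priority = priority

    def add_new_experience(self):
        # new experiences get the current max priority so they are sampled soon
        self.add(self.max_priority)
```

One caveat with this shortcut: when the experience holding the max is overwritten or its priority is lowered, the cached value becomes stale (it only ever grows), which is usually an acceptable trade-off in practice.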
What about normalizing priorities to the range 0 to 1? That way we could just set max_priority to 1, and I think it would positively affect performance while keeping it stable. What do you think?
@@julioresende1521 But wouldn't the time complexity of using a segment tree to compute the max from index 0 to N - 1 (the length of the array) be the same as running the max() function over the array?
Hi :) You can save the weights of the neural networks at the end of the training process. In my case, I use the tensorflow.keras library, where the models have a method called save_weights (and a matching load_weights to restore them later).
Hmm, I'm getting an error: AttributeError: 'DoubleDQNAgent' object has no attribute 'sess'. Not sure why, as a lot of the code looks the same as the previous version???