This video covers my attempt to build a version of AlphaZero for the game Pente. AlphaZero combines Monte Carlo tree search with neural networks to estimate which moves are best for the current player. Here, a convolutional neural network estimates a value for every possible move.
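For anyone curious how the tree search and the network fit together, here is a minimal sketch of the PUCT rule that AlphaZero-style search typically uses to pick which move to explore next. It is not the code from the video: the function name, the argument names, and the c_puct value of 1.5 are all assumptions for illustration, and the "+ 1" under the square root is a common variant chosen so an unvisited position falls back on the network's priors.

```python
import math

def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the move with the highest PUCT score.

    priors       -- network's prior probability for each move (hypothetical input)
    visit_counts -- how many times the search has tried each move
    value_sums   -- summed search values backed up through each move
    c_puct       -- exploration constant (assumed value, tune per game)
    """
    total_visits = sum(visit_counts)
    best_move, best_score = None, -math.inf
    for move, prior in enumerate(priors):
        n = visit_counts[move]
        # Mean value so far; unvisited moves start at 0.
        q = value_sums[move] / n if n > 0 else 0.0
        # Exploration bonus: large for high-prior, rarely visited moves.
        u = c_puct * prior * math.sqrt(total_visits + 1) / (1 + n)
        score = q + u
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

With no visits yet, the choice follows the network's priors; as visits pile up, the averaged search values take over.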
#some2 #3blue1brown
Primary Sources:
arXiv:1902.10565 - Accelerating Self-Play Learning in Go
Music:
Credit to Ludwig and Schlatt's Musical Emporium
(Everything below is me trying to beat the algorithm)
This implementation doesn't use an attention mechanism. Transformers could be used here, as in Leela Chess Zero, but we don't do that. Also, DeepMind's original AlphaZero beat Stockfish, a chess engine far stronger than even the best human players like Magnus Carlsen. There's also a subtle reference to the Hans Niemann controversy hidden in the video, if you can spot it.
Sep 19, 2024