The Elo Rating System for Chess and Beyond 

singingbanana
227K subscribers
1.4M views

Published: 12 Sep 2024

Comments: 1.4K
@thelastcube. 4 years ago
"Your rating is a measure of your ability relative to the population" That is an important factor to remember
@tlsgrz6194 4 years ago
IQ works the same way and there are even so called adaptive IQ tests that present you with items of varying difficulty depending on how you answered previous items.
@B1gLupu 4 years ago
@Elephant Philosophy how could there not be when new players bring over their 1000 points that trickle down to better players over time?
@olegdoroshenko4836 4 years ago
@@B1gLupu that only works if the average new player is much weaker than 1000. But inflation or deflation does indeed exist. If the average new player is stronger than 1000, that will push regular players to lower ratings.
@B1gLupu 4 years ago
@@olegdoroshenko4836 it doesn't matter if the player is weaker or stronger than 1000 MMR, it's still 1000 new points in the system. They will take points from some other new player and then lose them to someone better.
@TheZenytram 4 years ago
It is pretty obvious that this happened. People try to say otherwise so that it seems the GMs of today could be way better than those of the past, to give a sense of progress to a game that is dying.
@ih8mcfly 4 years ago
i plugged my rating against carlsens and it returned "LOL"
@23deepakiyer76 4 years ago
lmao underrated
@Metalhammer1993 4 years ago
I might get him one day, right?^^ He is only 2000 or so ahead of me. (JK, I'm unrated. I'm not even good enough to try to somehow get an estimated Elo rating. Single digits when kindergarten children are in the 400s is embarrassing^^)
@wompastompa3692 3 years ago
Just use the Wooden Shield, bro.
@slimeboy5626 3 years ago
@@wompastompa3692 PepeLaugh they don’t know
@radicalbradical3164 3 years ago
@@Metalhammer1993 yea you should try to get good at chess because humor is definitely not your thing.
@tobiaschaparro2372 3 years ago
The fact that Elo isn't an acronym is the biggest plot twist in the history of anime.
@cadeezra315 3 years ago
It is an acronym tho. Stands for Electric Light Orchestra ;)
@etzelvonk 3 years ago
To be precise it is Élő and not Elo which is a Hungarian word with the meaning of "living" in a sense opposite of dead.
@priscillaperry6463 3 years ago
Woooah lol I was assuming it was an acronym too
@Briscket 3 years ago
@@cadeezra315 Sun is shining in the sky.
@oldaccount1254 3 years ago
@@Briscket there ain’t no clouds in sight
@codyjackson7579 3 years ago
Thank you for explaining through mathematics that I'm trash at chess.
@sprengmeister2758 2 years ago
Whats your elo? Lmao
@SpiceWeazel 2 years ago
@DontGetBackranked Wow that's garbage dude
@kyndrix15yearsago96 2 years ago
@@SpiceWeazel 😭
@aaronzhang4165 2 years ago
@DontGetBackranked bro isnt even the original commenter
@siwalekunda 1 year ago
I know right, like bruh we get it, I will never improve at chess 😒
@sb_dunk 3 years ago
I love how James looks incredibly normal throughout the whole video, but he picked a frame where he looks like he's biting air for the thumbnail
@combatwombat8581 1 year ago
I found the exact frame at 3:04
@fredharte8677 5 years ago
The 32 that is mentioned in the video is the K-factor, which is in fact a variable. K-factors are used to exercise some control over ratings drift. A high K-factor is applicable to young and fast-improving players, who will quickly suck points from the rating pool, leading to deflation unless some mechanism exists to counter it. On the other hand, long-established and highly rated players are normally associated with a low K-factor, which has the effect of dampening the vicissitudes of tournament play. Such players are likely to have arrived at a plateau in their development, and their rating should contain a greater historical element. Controlling the average and distribution of a rating pool is the major task of all rating officers. The aim is to make ratings consistent over time so that a good club player rated 1800 today has the same ability as somebody rated 1800 fifty years ago - not a trivial task.
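The role of the K-factor described above can be sketched in a few lines of Python. This is only an illustration, not any federation's actual implementation: the function names `expected_score` and `update_rating` are hypothetical, and the K values 32 and 10 are just sample inputs.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def update_rating(rating: float, opponent: float, score: float, k: float = 32) -> float:
    """Return the new rating after one game; k controls how far a single result moves it."""
    return rating + k * (score - expected_score(rating, opponent))


# Same result, different K: a high K reacts quickly, a low K damps the swing.
fast = update_rating(1500, 1500, score=1, k=32)
slow = update_rating(1500, 1500, score=1, k=10)
print(fast, slow)
```

A win between equally rated players is worth half of K, so the same game moves the high-K player roughly three times as far as the low-K player here.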
@Pr0t4t0 1 year ago
Perhaps one could create an exponential decay function based on the number of games played by a chess player, or on the player's age, to determine K values.
@prdoyle 1 year ago
damping*
@joesowden9602 1 year ago
@@Pr0t4t0 I think exponential decay would be a bit too harsh a decline; you don't want it to decline indefinitely. I think a linear decrease until you hit a baseline is used a lot in online games, but in chess Wikipedia says it's either 10, 20 or 40.
@CrimsonCypher 1 year ago
That explains something I saw in a Hikaru video: if he won, he would gain 3 or 4 Elo points, but a loss would cost 16. It's an insane point difference, and it seems unfair to have to win 4-5 games just to make up for 1 loss.
@joesowden9602 1 year ago
@@CrimsonCypher It is fair because he's so highly rated that he is meant to be better than all of his opponents. Therefore, the higher you are relative to your opponents, the more you need to win to maintain your Elo.
@PatoPat58 3 years ago
I am 62 years old and I am a chess master and a mathematician. A great achievement of the Elo system has been to give value to every game. Previously, when a player had no more goals to achieve in a tournament, their game also got worse and there were "false" results. But the Elo system has also been shown to have a dangerous flaw. What happens when a player thinks he has reached or even surpassed his best possible elo? If his goal is to have a good score, more than to fight in tournaments, then he will be tempted ... to stop playing. And that's what happens for a lot of players!
@xaius4348 3 years ago
I used to play a game where they had a system that gave you more or fewer points for a win based on your ranking tier, so players in the top tiers had to maintain roughly an 80% win rate to stay at the same rank, despite having to play similarly ranked players. I managed to get a good win streak as a pretty new player, realized it was incredibly unlikely that I could maintain that rank, and quit the game despite really enjoying it. I want to say I was in the top 0.07% of players with my final rank.
@donnadie5882 3 years ago
And you realized why i left chess
@MafuHardy 3 years ago
@@donnadie5882 to me it sounds like someone who stopped enjoying the game.
@xwtek3505 3 years ago
That's true, but note that many newer ELO ratings do penalize you if you stop playing for a long time by giving you a bigger standard deviation. Essentially, if you stop playing for long enough, your rank becomes essentially provisional again.
@Steveross2851 3 years ago
@@xwtek3505 as far as I know no national or international chess federation does that, since most federation-rated games are from tournaments. But many chess website playing platforms (having their own rating systems and being unsanctioned by chess federations) do that, since most of their rated games are individual games, not tournament games. As such they have a much greater incentive to encourage players to remain active, lest not enough players be on the site at any given time.
@zxxkcxxz 4 years ago
I need a separate Elo rating for sober playing vs high af or an equation to factor it all in
@sameash3153 3 years ago
Separate ratings for normal openings vs bongcloud games
@Tzizenorec 3 years ago
ELO does factor it in... by figuring that you have a certain % chance of being high each game (just like you have a % chance of just having a bad day, as explained in the video).
@dannyhuffman3587 3 years ago
@@sameash3153 yes
@hj2479 3 years ago
@@Tzizenorec Yes, but also no. Elo will average your rating between your high play and your normal play, depending on how often you are high, and so accounts for how much of the time you play high. However, if you consciously chose which games to play in an altered state, you could potentially be a higher-rated player destroying tournaments for "mid-rated players", which slowly raises your score. You would then wreck your score by turning into a brain-dead zombie for, say, 25% of the time in other, lower-stakes matches, because the Elo stakes are still more or less the same. Easy fix: make a troll account for stupid fun and keep a professional account for serious play.
@user-qv6fs8if7o 3 years ago
so true🥲 It takes such a long time to recover from a night of frustrated playing.
@Thepokeshasseur 4 years ago
English is not my native language and I really wanted to say that the way you express yourself as well as your accent are perfect for people like me ! Thank you for being that smooth in your speaking and clear in your explanations.
@timanlam7504 5 years ago
4:53 That 32 is a very important parameter called the K-factor, which sets how sensitive the rating is to each game. How the K-factor is calculated could be a topic of its own. It might depend on Elo tier, win/loss streaks, days since the last game, etc.
@kubakopcil9992 4 years ago
And it's often not 32; e.g. in the Czech Republic it's 40 for kids, 20 for adults and 15 for masters, or something like that. It should be set so that the rating reflects a change in the player's strength as fast as possible, but it cannot be too high, since then the rating will have low accuracy.
@artfuldodger5698 4 years ago
^ does that mean masters don't improve
@kubakopcil9992 4 years ago
@@artfuldodger5698 not really, it only means that they improve more slowly than beginners. It's mostly because when you're weak, you become much stronger if you learn some tactical idea, while when you're a master, you improve mostly by improving your positional play (since you already know almost all tactical ideas), which takes way longer.
@pigeonapology9816 3 years ago
What about the 400? Any reasoning behind choosing 400 specifically for a 10% increase in chances of winning?
@jedinxf7 3 years ago
@@pigeonapology9816 that's not a 10% increase, it's a tenfold increase. a 1000% factor.
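To make the tenfold reading concrete, here is the standard Elo expected-score formula written out. The symbols are assumptions chosen for this note: $R_A$, $R_B$ are the two ratings and $E_A$, $E_B$ the expected scores.

```latex
\[
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}, \qquad
E_B = 1 - E_A, \qquad
\frac{E_A}{E_B} = 10^{(R_A - R_B)/400}.
\]
```

With a gap of exactly 400 the ratio is 10, so $E_A = 10/11 \approx 0.91$. The 400 only sets the scale: it is the rating gap that corresponds to 10:1 odds.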
@antoniogarest7516 5 years ago
Now I can calculate why I am Iron IV
@lukijuxxl 5 years ago
ironically, if you are, then you probably can't.
@ryanfoley8897 5 years ago
You just need to improve.
@mayankraj2294 3 years ago
Wot? Wdym?
@nidzeksmocni659 3 years ago
@xd lmao hahhahahhahahhahahhaha
@gamespotlive3673 3 years ago
Skybloooooock!
@cookiecan10 5 years ago
I've played multiple online games that use this system, I've always wanted to know how it works. Thanks for making this video!
@DaiLoDong 5 years ago
Dota 2 is a pretty popular one that does this
@dashua1735 5 years ago
@@DaiLoDong There's Dota 2, League of Legends, Overwatch, StarCraft, literally every competitive game uses it.
@gabrielandy9272 4 years ago
The games don't use exactly this system. For Dota 2, Gabe Newell confirmed it uses a modified Glicko-2 system, but in the end the systems are likely to be very similar or the same, maybe just accounting for some game-specific things.
@BlueGrovyle 3 years ago
@@dashua1735 plenty of competitive games use Elo, but it is factually incorrect to say that all of them do. Elo is a rating algorithm, but not all games with rating algorithms use Elo.
@dashua1735 3 years ago
@@BlueGrovyle True. Maybe I can excuse what I said as a hyperbolic statement.
@RetroLPGames 3 years ago
Your 'random number drawing competition' analogy to explaining the normal distribution and probability of overlap was simply brilliant. That would've helped me a lot in my earlier statistics classes!
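The overlap idea praised above can be checked numerically. The sketch below assumes each player's performance is drawn from a normal distribution centred on their rating with a standard deviation of 200; that value and the function names are assumptions made for illustration. It compares the resulting win probability with the logistic curve of the usual Elo formula.

```python
import math


def win_prob_normal(gap: float, sd: float = 200.0) -> float:
    """P(A's draw beats B's) when each performance is Normal(rating, sd)."""
    # The difference of two independent normals has standard deviation sd * sqrt(2).
    z = gap / (sd * math.sqrt(2))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


def win_prob_logistic(gap: float) -> float:
    """The logistic curve used in the usual Elo formula."""
    return 1.0 / (1.0 + 10 ** (-gap / 400))


for gap in (0, 100, 200, 400):
    print(gap, round(win_prob_normal(gap), 3), round(win_prob_logistic(gap), 3))
```

Under these assumptions the two curves stay within a couple of percentage points of each other over this range, which is why either can serve as the "overlap" model.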
@RealCadde 5 years ago
There is one unintended side effect of this though in online scrabble for instance. High rated players ONLY want to play other high rated players because scrabble is partly about luck. Even a low rated player can win against a high rated one given the right letters at just the right moment. Whereas low rated players want to play those with a much higher rating than themselves. Maybe not a big problem in tournaments but in day to day play, new players will have a hard time finding anyone else to play as lower than 1000 rated players will quit playing leaving only higher than 1000 players left playing the game and they don't want to play a 1000 rated player due to the risks of randomness. Chess isn't random so it works just fine there.
@matthewbertrand4139 5 years ago
That's less a problem with Scrabble, and more a problem with online lobbies. There isn't anything inherently wrong with an element of randomness to the game, since it is for the most part skill. All you have to do to get accurate Elo ratings is enforce matchups between players of similar ratings. Since online lobbies allow you to cherry pick opponents, that creates this issue. At an officially sanctioned Scrabble event, this won't present a challenge. Using a Smart Match feature, like the one in the Scrabble app, would also help. (I'm not advocating for the Scrabble app, by the way... it's riddled with ads, I wouldn't get it as much as I love Scrabble)
@DeadlyCrab7 5 years ago
Then the rating difference that is predicted to result in 10 times more wins for one of the players should be higher than 400 for Scrabble; the parameter should be adapted to each game.
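A small sketch of what adapting that parameter would mean. The `scale` argument and the value 800 are purely illustrative assumptions, not figures from any Scrabble rating body.

```python
def expected_score(gap: float, scale: float = 400.0) -> float:
    """Expected score for the higher-rated side; `scale` is the gap that means 10:1 odds."""
    return 1.0 / (1.0 + 10 ** (-gap / scale))


chess_like = expected_score(400)           # 10:1 odds at a 400-point gap
luckier_game = expected_score(400, 800.0)  # same gap, flatter curve (800 is illustrative)
print(round(chess_like, 3), round(luckier_game, 3))
```

A larger scale makes the same rating gap predict a less lopsided result, which is one way to encode that an upset is more likely in a game with luck.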
@xnopyt647 5 years ago
Chess is actually pseudo-random. Most players rated below 2.3k Elo have limited calculation speed and range. So if two players calculate the same line up to a point, then there's a chance that beyond that line, there's a game-changing tactic missed by both players. One wins and the other loses just because of this random tactic.
@PhilBagels 5 years ago
It seems to me that for Scrabble, and any other game that has some degree of luck in it, the maximum score change should simply be made lower than 32. If a no-luck game like chess uses 32, and we'll assume that a pure-luck game like Craps would use 0 (in other words, no skill-rating system for a game with no skill), then Scrabble might use only 24, say. A game with more luck than Scrabble, but still some skill, might use 16. etc.
@rilesmcgiles1145 5 years ago
@@xnopyt647 Chess is not pseudo-random, players are pseudo-random. For example, Connect 4 has been proven to be solvable: the player going first, given optimal play, will always win despite anything player 2 does. There is literally a 100% chance that player 1 will win if player 1 plays optimally. By definition this means Connect 4 is not random. However, because players do not always play optimally, there is a good chance for player 2 to win in Connect 4. This does not mean Connect 4 is pseudo-random, just that players are when they don't play optimally. A simpler example is Tic Tac Toe. You and I know how to play optimally (or could easily learn in 5 minutes), so we would always draw. However, if two 4-year-olds play, neither knowing the optimal strategy, and one wins and the other loses because they missed a tactic, that doesn't suddenly turn Tic Tac Toe into a pseudo-random game just because the two players are unskilled at it. Chess is the same way: there is perfect information on both sides, the same preset conditions, a limited number of moves available to each player decision, and a built-in limit so games can't go on forever. Because of this, in theory, chess is mathematically solvable (just not anytime soon, as there is vastly more complexity to chess than Connect 4). It is possible and likely that solving chess just means a player never losing, with the game always ending in a draw if both players play optimally (like Tic Tac Toe).
@yokokuramaful 5 years ago
Wish there were discussion of the weaknesses of the Elo system, such as that it's really only good for 1v1 games and that it creates an incentive at the top end for players to get to a high ranking and then stop playing so as not to risk losing.
@IC-23 3 years ago
What game are you playing that doesn't incentivize you to keep rising? You get to a high rank stop playing and then everyone else of equal skill passes you.
@victory7302 2 years ago
@@IC-23 it's not an uncommon thing. The reasoning is simple. Losing your ranking is worse than not moving forward. In any game where ranks do not depreciate in value (chess, Rocket League, PUBG etc.), it's much more convenient to stop playing. The idea that your opponents will pass you makes no sense in the context of Elo. If you are of equal skill, then their Elo will also stagnate regardless of your participation (by virtue of them having opponents they have an equal chance of winning and losing to).
@driesjansen3273 1 year ago
@@victory7302 That's why in rocket league every season (so every 3 months on average) you have to do placement games to keep your rank
@vibovitold 1 year ago
and yet top players keep on playing, don't they? : ) if you stop playing, you'll get regarded as inactive, and you won't appear on the rating lists anymore (this happens after a year, if i'm not mistaken)
@MaxIronsThird 1 year ago
Why would you want to stop playing for fear of losing Elo? The only way to win tournaments and prizes is by winning. Also, even if you're in your 60s and well off, you don't lose your earned title, just points.
@LeoWattenberg 5 years ago
Thanks for the explanation! I often see Elo lumped together with Glicko and Glicko2 (the rating systems that Prof. Glickman made), and apparently Glicko is more telling of your actual skill because instead of giving you a number, it gives you a number and a confidence interval. However, how that actually gets calculated, I have no idea, the explanations of that also just throw some formulas at you. Can you make a video for that as well?
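For the "number plus a confidence interval" part, here is a rough sketch based on Glickman's published description of the original Glicko system, in which each player carries a rating and a rating deviation (RD), and the RD grows while the player is inactive. The function names and the value of `c` are illustrative assumptions, and the actual rating update after a game is deliberately left out.

```python
import math

MAX_RD = 350.0  # RD assigned to an unrated player in Glickman's description of Glicko


def inflate_rd(rd: float, periods_inactive: int, c: float) -> float:
    """Grow the rating deviation with inactivity, capped at MAX_RD."""
    return min(math.sqrt(rd ** 2 + c ** 2 * periods_inactive), MAX_RD)


def interval(rating: float, rd: float) -> tuple[float, float]:
    """Approximate 95% interval: rating plus or minus two rating deviations."""
    return (rating - 2 * rd, rating + 2 * rd)


c = math.sqrt(1200)  # illustrative: takes an RD of 50 back to 350 after 100 idle periods
print(interval(1850, 50))
print(interval(1850, inflate_rd(50, 30, c)))
```

The rating itself does not move while a player is away; only the uncertainty around it widens, so the next few results are allowed to shift the rating more.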
@Eyusdorus 5 years ago
I'd be interested in this as well!
@ShaneLillie 5 years ago
Yes, please! I would love to see a video going over Glicko2!
@thetacklezone8946 5 years ago
I agree that singingbanana should make a video on it. In the world of Blood Bowl (a wonderfully silly game that is like a cross between chess, American Football, and Lord of the Rings) we recently moved away from Elo to Glicko. There were a few reasons for this that made it work better compared to chess. For example there are lots of different teams all with their own styles and some people are better at one than another. So it is hard to get a global player score, but rather better to get a player-team score. I wrote an article explaining why we used this method which can be found here: thetacklezone.net/glicko-new-naf-tournament-rankings-explained/
@shapeflicker1999 5 years ago
Leo Wattenberg like Trueskill from Halo!
@bobsquaredme 5 years ago
Lichess.org also uses a Glicko2 rating system, so it does have some relevance here.
@kittybeans8192 5 years ago
Skill: The Currency of the FUTURE!
@Yggdracyril 5 years ago
Now for today's sponsor: Wouldn't it be awesome to get this currency in an easy way from your home? This is where Skillshare comes in to help you.
@AaronHollander314 5 years ago
Reputation will be the currency of the future.
@markconrad9619 5 years ago
meritocracy
@bumpty9830 5 years ago
No coincidence that the number of people with financial access to education is steadily falling.
@AaronHollander314 5 years ago
@@bumpty9830... that's absurd. The Internet has given EVERYONE access to an education that only the wealthy or royalty had not too long ago. I just watched an MIT lecture and didn't pay a dime.
@MasterDeanarius 5 years ago
So is there a formula that explains why I get matched with triggered players on a losing streak whenever I'm 1 game away from ranking up?
@shadiester 5 years ago
Yep, it's called Murphy's Law ;)
@aditg847 4 years ago
Do they then proceed to go ultra instinct and release all their pent up skills on you?
@CraftyF0X 4 years ago
@Albert Whisker I believe there is an even simpler explanation. The system was devised to handle "personal ability", so using it in any sort of team game is a bad move. Matchmaking systems have plenty of other parameters to take into consideration when they try to build a game (like the geographical distance between players), and they need to make compromises based on player-base availability, all of that for 10 players and optimised for low wait time. This means Elo ends up being a low priority: the system clumps together players across a relatively wide Elo range and attempts to balance the teams using either average or mean Elo. The consequence is that good players tend to be "counterweighted" with bad allies, even against average opponents. What such a system fails to acknowledge is the general working principle of these games (mostly MOBAs, but it applies to OW too): winning a game requires a team effort (everyone has a designated role which must be done properly, at least to a certain extent), but losing it requires nothing more than one player with a troll or defeatist mindset. For example, it doesn't matter if you are talented and have a lot of experience and a high win rate in a certain role if you get a disagreeable (and low-Elo) ally who denies you this position and/or dramatically lowers the whole team's chance of winning by not playing a required role (most frequently support).
As a somewhat exaggerated example:
Team 1: 700, 1000, 1000, 1000, 2000
Team 2: 1140, 1140, 1140, 1140, 1140
In this case, as you can see, it doesn't matter if you are the best on your team with 2k Elo and better than anyone on the opposing team: your team is still expected to lose in 4 positions out of 5, and you also have the "ticking time bomb" worst player on your team to throw the game in case you can't close it out quickly enough. But these kinds of configurations are perfectly fine matches to make in the "eyes of the matchmaking system" because of the aforementioned other constraints. The system simply can't wait to get 10 people with exactly the same Elo, so it uses averages (or maybe means), and as soon as it does that you can get unwanted differences in player quality and a resulting unfair game, which feels like a waste of time.
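The two example teams can be checked directly. The snippet below only summarises the numbers from the comment; the median and spread are extra statistics added here for contrast and are not something any particular matchmaker is known to use.

```python
from statistics import mean, median

team_1 = [700, 1000, 1000, 1000, 2000]
team_2 = [1140, 1140, 1140, 1140, 1140]

for name, team in (("Team 1", team_1), ("Team 2", team_2)):
    print(name, "mean:", mean(team), "median:", median(team), "spread:", max(team) - min(team))
```

Both teams have the same mean, so a matchmaker that balances on the mean alone treats them as equal even though four of the five individual match-ups favour Team 2.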
@LeventK 3 years ago
@@shadiester haha lol
@jedinxf7 3 years ago
@@aditg847 normally players like that do the opposite and play like tilted trash lol
@TheZotmeister 5 years ago
I forget where I originally learned this, but I always found it far more intuitive than thinking of Elo calculation as a function of a function as presented here: a game is always worth the same number of rating points (winner taking from loser), but in order to compensate the weaker player, the higher-rated opponent gives them a percentage of the difference in their ratings in advance of the game. To use the same example numbers as at 5:23 in the video, the difference in rating is 107. The weaker player gets 4% of this, rounded up, taken from the stronger, so five points change hands; then the winner of the game gets 16 points (this is half the K-factor) from the loser. This means A will have gained a total of 21 from B if they won, or will have given a net of 11 to B if they lost. In the event of a draw, the initial five-point shift is the only one that happens. These results match what one gets from using the full formulae. This "quick-and-dirty" calculation breaks down when the difference in player ratings exceeds 400, obviously, but within that range it matches pretty much exactly; I'm fairly certain it's never off by more than a point. I used this version of the formula to maintain league rankings for several popular board games in my university gaming club. I hope this little blurb helps someone understand this whole thing better, just as it did for me.
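The shortcut described above is easy to put next to the full formula. In this sketch the function names are hypothetical and the ratings 1500 and 1607 are made-up values chosen only to give the 107-point gap from the comment.

```python
import math


def full_elo_change(rating: float, opponent: float, score: float, k: float = 32) -> int:
    """Rating change from the usual formula, rounded to whole points."""
    expected = 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))
    return round(k * (score - expected))


def shortcut_change(rating: float, opponent: float, score: float, k: float = 32) -> int:
    """Shortcut: the weaker player is handed 4% of the gap (rounded up), then k/2 is at stake."""
    handicap = math.ceil(0.04 * abs(opponent - rating))
    if rating > opponent:
        handicap = -handicap  # the stronger player pays the handicap
    game_value = k / 2 * (2 * score - 1)  # +k/2 for a win, 0 for a draw, -k/2 for a loss
    return int(handicap + game_value)


# Hypothetical ratings 107 points apart, seen from the weaker player's side.
for score in (1, 0.5, 0):
    print(score, shortcut_change(1500, 1607, score), full_elo_change(1500, 1607, score))
```

For this gap the two columns agree for a win, a draw and a loss; the sketch does not test how closely they track each other at other gaps.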
@Fexghadi 5 years ago
Does the 4% have an explanation, or is it just a number that works well in the case of Chess Elo?
@TheZotmeister 5 years ago
@@Fexghadi Both! Elo is defined upon a gap of 400 rating points representing a ten-fold increase in skill - that's what the 400 in the Elo formula is doing. It just so happens that 4% of that number is 16, half the K-factor and therefore the "game value" of the shortcut version I presented. That's where the "4%" is hiding in the original Elo formula. The starting rating value and the K-factor are both arbitrary amounts that together define the sense of scale of the rating number and nothing else; I actually made it a point to start everyone at zero in my ratings and let them go negative, and I used a game value of 100 (K-factor of 200), so it can be clearly seen how many games on average one was "up" or "down" against average-rated opponents; it just made the rating number something more intuitive to interpret. That percentage, on the other hand, is key: it defines how dynamic the ratings are and therefore the implied skill ratio between opponents of different ratings. Ideally, the percentage would scale down as the gap between players increases, so that the amount of points given from the stronger player to the weaker only _approaches_ the game value, never actually reaching or exceeding it: this is effectively what Elo does that this shortcut version doesn't.
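One further way to see where a percentage of about this size comes from, offered as a sketch rather than as the commenter's own derivation: linearise the expected-score curve around a rating gap of zero. Here $D$ is assumed to mean a player's own rating minus the opponent's, $S$ the actual score and $K$ the K-factor.

```latex
\[
E(D) = \frac{1}{1 + 10^{-D/400}}, \qquad
E(0) = \tfrac{1}{2}, \qquad
E'(0) = \frac{\ln 10}{1600} \approx 0.00144,
\]
\[
K\,\bigl(S - E(D)\bigr) \;\approx\; K\left(S - \tfrac{1}{2}\right) - \frac{K \ln 10}{1600}\,D
\;\approx\; 32\left(S - \tfrac{1}{2}\right) - 0.046\,D \qquad (K = 32).
\]
```

The first term is the fixed game value (plus 16, 0 or minus 16) and the second moves roughly 4.6% of the gap from the stronger player to the weaker one, close to the 4% in the shortcut. Because the true curve flattens as the gap grows, the linear version overshoots for large gaps, which fits the remark that the percentage should ideally shrink as the gap widens.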
@jedinxf7 3 years ago
hmm. this does seem to work as a quick and dirty approximation unless players are poorly matched, but I'm not sure it adds much conceptual clarity to people trying to understand the logistic function at play or why it is designed the way it is, or acts the way it does. it would be a useful shortcut for predicting outcomes, though, if extreme precision isn't necessary. basically, a practical engineer's take on a math question :)
@pio-dorco 3 years ago
@@TheZotmeister but why, when I win against a player who has 100+ more points than me, do I gain only 9 or 10 points? I'm 1650 for now.
@amazuri3069 2 years ago
@@pio-dorco The K-Factor in the game you are playing is lower than 32, probably.
@CannarWilm 5 years ago
The Elo system was used for Magic: the Gathering tournaments for a while. While it was active I noticed a "geographical clumping" effect where certain areas hoarded points. This meant that players of the same skill level in different parts of the country had different Elo ratings. I wonder if the same thing happens in chess?
@patricioiglesias5346 3 years ago
In online chess you play against people from all over the world, and if you play in FIDE tournaments you need to play against people from different countries to get the NM, IM and GM norms, so that's one way of controlling Elo by country.
@niklashannemann8482 2 years ago
The more you play against different regions, the less this differs. There are regional leagues, in chess too, which result in you playing against your own region, so your Elo will reflect your skill relative to that region. If you stay there, yes, there will be differences between 1800-rated players in different regions.
@cosasverdes 1 year ago
Western Russia and India (and the US?) are probably two of those clumps with a higher concentration of better players, 'deflating' one's Elo, which has most likely created awesome situations where someone travels to a faraway tournament with mild expectations and ends up wiping the floor with that weak region and its inflated Elo.
@WhirlwindHeatAndFlash 2 years ago
1:41 I like how he immediately starts to explain away bad results with external factors. What a gamer.
@brettchr777 5 years ago
WELCOME BACK !!! Thought you'd gone dark on YouTube. As an avid chess player, I loved your excellent description of the Elo system. Hope you'll keep posting. Love your work.
@amanharwara 5 years ago
Finally, now I know how the system behind me being stuck in Silver hell in CSGO works
@harshvirgrewal2403 5 years ago
ShadyThGod yessssss
@richardcheney6964 5 years ago
Not quite; James didn't explain how it works with teams. Why we're permanently silver remains a mystery.
@Bjorneization 5 years ago
@@richardcheney6964 see, it doesnt work with teams does it? Aye lmao
@Architector_4 5 years ago
Besides, there isn't an explicit explanation of how CS:GO's rating system works - it might be based on Elo, and it might be that there's so little of that base it practically doesn't matter.
@spaceseal2268 5 years ago
i'm pretty sure csgo is not based on elo
@pola50001 4 years ago
Bless you, honestly. I had a team of designers who decided it was a good idea to copy-paste Elo from a website without understanding it, which caused a lot of problems for us. Thanks to this video I can fix the problem and make a better system.
@singingbanana 4 years ago
This is awesome.
@danielorr7124 3 years ago
I think the biggest takeaway from this is that everybody has a "range" of playing strengths. You don't play at your ELO rating every day, sometimes you are better and sometimes worse. Unless you are overrated or underrated in which case the ELO rating will soon adjust itself to the proper level.
@rafaelfranceschetti3428 5 years ago
Great video, and the comment at the end indicating that your rating is relative to a generation is quite interesting. Garry Kasparov, a multiple-time World Chess Champion, reached a peak rating of 2851. Bobby Fischer, regarded as one of the greatest chess geniuses, reached a peak rating of 2785, and he was born 20 years before Kasparov. Paul Morphy, one of the greatest prodigies of the game (especially since he started playing late, around 18 years of age), peaked at 2690, and he was born almost 100 years before Fischer. It's quite interesting to see how the ratings of some of the legends of the game of chess vary so much!
@Katniss218 2 years ago
And Stockfish 14.1 now is at like 3800 - 1000 (!) points more.
@vibovitold 1 year ago
Paul Morphy couldn't have "peaked at 2690", since there was no rating system back in his day. it could be some retrospectively done estimation maybe, but it makes little practical sense, because it's very error-prone (how do you estimate the ratings of Morphy's opponents in order to come up with his? etc.)
@vibovitold 1 year ago
@@Katniss218 the rating of Stockfish can't be compared with human ratings, because Stockfish doesn't play rated games against humans. Stockfish plays rated games against other chess engines. It's a separate pool of players. Of course those chess engines are way stronger than humans, but comparing their ratings against human ones as "1000 points more" or whatever is completely pointless. These ratings would only normalize if both pools of players were merged and they played enough games against each other to gauge the relative performance. But that's not the case.
@samarthbhat2317 5 years ago
A small correction, I guess: the K factor is no longer 32 or 16; it's 40, 20 or 10 depending on your age and current rating. If your rating is below 2400 and you are under the age cutoff, it's 40, but if you are older it's 20. And if your rating is above 2400, it's 10 :)
@anselmschueler 5 years ago
I misread the title as "The Elo rating system for Cheese and Beyond", which was funny.
@stevenvanhulle7242 5 years ago
I am reading this two months later, and I can attest that it still is.
@andreaspeters6791 4 years ago
i had to read this 3 times till i saw the cheese
@darylallen2485 4 years ago
How else would we know the good cheese from the bad?
@NStripleseven 4 years ago
Who wants to rate some cheese!
@kushgroover54 4 years ago
Competitive cheese
@sebastianjost 4 years ago
That system is basically the same principle that's used for Q-learning (a particular algorithm to train AIs). You get rewards for certain events (here win, loss, draw) and update a score based on that reward. It's just that in Q-learning it's not players getting the scores but pairs of states and possible actions, i.e. a chess board and a particular move. And the formulas are slightly different.
@Houshalter 8 months ago
You might be overcomplicating the analogy. The model is just a logistic function like used in logistic regression and neural networks. The update rule is equivalent to gradient descent. The k factor is just a learning rate.
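The correspondence described in the comment above can be checked numerically. What follows is a minimal sketch, assuming the usual chess convention (base 10, scale 400) and treating the game result as the label of a one-parameter logistic model; the function names and the learning-rate conversion are illustrative choices, not something taken from the video.

```python
import math

SCALE = 400.0  # assumed chess convention: 400 points = 10x odds


def expected_score(r_a: float, r_b: float) -> float:
    """Logistic (base-10) expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / SCALE))


def log_loss_gradient(r_a: float, r_b: float, score_a: float) -> float:
    """Derivative of the log loss with respect to A's rating.

    Loss = -[s*ln(E) + (1-s)*ln(1-E)], which differentiates to
    -(ln(10)/SCALE) * (s - E).
    """
    e = expected_score(r_a, r_b)
    return -(math.log(10) / SCALE) * (score_a - e)


def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> float:
    """Classic Elo update for player A."""
    return r_a + k * (score_a - expected_score(r_a, r_b))


def gradient_step(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> float:
    """One gradient-descent step with the learning rate chosen to match K."""
    learning_rate = k * SCALE / math.log(10)
    return r_a - learning_rate * log_loss_gradient(r_a, r_b, score_a)


if __name__ == "__main__":
    # Both lines print the same new rating for a 1500 player beating a 1700 player.
    print(elo_update(1500, 1700, 1.0))
    print(gradient_step(1500, 1700, 1.0))
```

Under these assumptions K is just the learning rate multiplied by ln(10)/400, which is one way to read the "k factor is a learning rate" remark.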
@1996Pinocchio 5 years ago
Just yesterday I was thinking about how much I loved this channel. Thank you for this explanation. I feel like there are many such concepts that the general public has no feeling for, I just don't know which...
@tucatnev123 5 years ago
This is pure awesome! All I would like to say is that his name is Élő. It means "living" in Hungarian (he was born in Austria-Hungary into a Hungarian family). This is a language with 14 vowels, so the pronunciation could be tricky, but not that difficult: the ő is really close to the French "-eu". In English it is between the vowel in 'dress' and the schwa ə, the vowel in 'turn'. (+sorry for your break-up) Oh yeah, and I am nearly sure the number 32 comes from the number of pieces on a chess board.
@JasonLayton 5 years ago
I’ve spent the last 2 years developing an ELO rating system for the sport of wrestling; it adjusts your rating as your weight changes (size matters in wrestling). I used a lot of interesting math to make it work. It’s called WAR Zone (for Weight Adjustive Rating). This video was a good explanation of ELO ratings 👍
@recklessroges 4 years ago
Are you going to do a Glicko 2 version?
@JasonLayton 4 years ago
Reckless Roges I plan on it, there is a lot to do first though.
@joseluizpereirafilho7222 3 years ago
Just want to say that I love the way you explain math-related topics. I am graduating to be a math teacher and I'm really inspired by you
@colemanhoyt5437 5 years ago
I've been monitoring my USCF Elo rating for years and still learned a lot from this video! Very cool!
@Eric4372 5 years ago
I'm curious how the constants 400 and 32 were decided? Great video btw! :D
@pola50001 4 years ago
I think the 400 is related to the average number of games played in one tournament or event, or in an online game season
@mr.schloopka1124 4 years ago
FIDE now uses 40 for kids under 2300 Elo, 20 for adults under 2300 (I don't remember exactly, but it is something like that) and 10 for everyone above 2300. It is because children get better really fast, and at higher ratings it is much harder to get better
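The rule of thumb in the comment above can be written as a small lookup. This is only a sketch of the rule as the commenter recalls it: the 2300 threshold, the junior/adult split, and the names `k_factor` and `is_junior` are assumptions for illustration and have not been checked against official regulations.

```python
def k_factor(rating: float, is_junior: bool, threshold: float = 2300.0) -> int:
    """K-factor following the commenter's recollection (not an official rule).

    Ratings at or above the threshold get the smallest K; below it,
    juniors get a larger K than adults so their ratings can move faster.
    """
    if rating >= threshold:
        return 10
    return 40 if is_junior else 20


if __name__ == "__main__":
    # K is also the largest possible rating change from a single game.
    for rating, is_junior in [(1500, True), (1500, False), (2500, False)]:
        print(rating, is_junior, k_factor(rating, is_junior))
```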
@davidwillis7991 4 years ago
They are pretty much arbitrary, and different versions of Elo ratings use different numbers. If you wanted ratings to be more spread out you would use a bigger number than 400, and if you wanted ratings to move faster you would use a bigger number than 32. If you have a bigger number of matches to draw data from you would be able to reduce 32 down a lot lower.
@uberneanderthal 3 years ago
since it was originally developed for chess, i suspect it's symbolic of the squares on a chess board. there are 64 squares, so if you win a game you've taken your opponent's share of the board, hence 32.
@philsoady4351 3 years ago
A fundamentally good video tarnished by an apparent failure to recognize draws in most of the explanation. Especially the statement about the probability that you win or lose, with no mention of the probability of a draw, or of whether you should gain or lose points when you draw against a sufficiently weaker or stronger opponent.
@StopFear 5 years ago
That guy has a very happy face. It's like he is not smiling but it looks like he is.
@MamToCos 18 days ago
Very nicely explained with the Gaussian bell curve, I appreciate your effort. Thanks.
@user-nj4pd8ye1i 5 years ago
I admire your explanations, thanks! Have a good day!
@SuperSight 2 years ago
Helpful explanation. 3 October 2021 2:31pm NZST
@fazekaszs 5 years ago
One of the formulas resembles a Boltzmann distribution where the rating is the energy :)) this is so cool
@Danicker 5 years ago
Fantastic video! As a recreational chess player, I've always been curious about the mathematics behind the Elo rating system and this video was very helpful
@singingbanana 5 years ago
Thanks. I've been pleasantly surprised at the positive reaction, it looks like a few people were curious about the mathematics of Elo.
@koningsp 5 years ago
Interesting, some video games use a hidden elo on top of the actual elo. The hidden elo somehow decides if you're ready to face stronger or weaker opponents. I'd be interested to hear how that works!
@dougsundseth2303 5 years ago
Nice explanation. Clean and easy to understand. OT: Your video was about 1 stop underexposed (likely because of your camera being set to auto-exposure and your use of a white background). Recommendation: Switch to a gray background, use an exposure compensation of +1 - +1.5, or set a manual exposure rather than using an automatic exposure mode.
@x3ICEx 5 years ago
It is named after its creator Árpád Élő, a *Hungarian*-American physics professor.
@SteveGuidi 5 years ago
I used to play competitively many years ago and recall that some chess federations added a performance multiplier to rating updates for tournaments. Tournament groups typically contain players within a 200-rating point range, with the margin being tighter for the strong players in the open section. Being the strongest player and winner of the tournament group may not yield any rating points if everyone is significantly lower rated than you are, so the performance multiplier aims to correct this.
@joan_gonzalez 5 years ago
Great video, as always! Could you make one explaining the Glicko system too? That'd be awesome!
@mattgsm 5 years ago
Congratulations on 200k subscribers!
@Elendrial 5 years ago
So why was an increase of 400 chosen as 10x more likely to win? Likewise with starting on 1000? Are they both like the 32 in that it's just arbitrary?
@JivanPal 5 years ago
Yup, an arbitrary choice.
@AbiGail-ok7fc 5 years ago
Somewhat arbitrary. They're arbitrary in the sense that other numbers could have been picked as well. OTOH, you want to have numbers which are easy to understand. Having people start with 1, using .1 as "10x more likely to win" and .01 as the maximum change would have worked in a technical sense. Practically, it would never be popular. The same goes if you had picked 7283014 as the number to start with, 89123 as 10x more likely to win, and 5934 as the maximum rating to gain/lose per match. It would have worked, but it would never be popular.
@confucheese 5 years ago
It’s basically just a decent number to serve as the 10 fold range. If it’s too high then you start ending up with massive Elo ratings, too low and you have a very small rating range which isn’t specific enough.
@memoryleaked 5 years ago
From what I remember when studying ELO, a K-Value--the max number of points a rating can go up or down after a match--of 8 to 32 would be used based on the tournament's size (both player count and geographic area served). A K-Value of 32 would be an international match, and an 8 would be a local match. There are other circumstances that would lower the K-Value, such as when both players were highly rated. Use a value too low, and your rating moves too slowly to where it should be. Use a value too high, and your rating overshoots the rating it "should" be moving to.
@davidp.7620 5 years ago
If you chose different numbers for the 400-10x relationship, the only thing that would change is the scale. You would have the same system with "different labels". As for the 32, different users of the system have different choices. You want it to be high enough so you do not get stuck in the same rating forever. But you also don't want it too high or else players will have huge fluctuations depending on their last few games
@josenoelteh69 5 years ago
Thanks heaps James! Finally, a clear explanation of the ELO rating. Cheers.
@oicirbaf239 3 years ago
"So if we forget chess for the moment, imagine each player brings with them a box of numbers."
@karangupta4978 3 years ago
Spoken like a true mathematician
@pegy6384 5 years ago
When I saw the title I thought you were going to tell me whether "Don't Bring Me Down" was better than "Mr. Blue Sky," and I wondered what either had to do with chess. I enjoyed this one, even though my chess ranking is more like a dash than a bell curve.
@bip901 5 years ago
Who would win?
-Numberphile
-A singing banana
@brakosjacob8019 5 years ago
A Parker Square.
@PhilBagels 5 years ago
What are their Elo ratings?
@nofanfelani6924 5 years ago
@@PhilBagels infinity
@PhilBagels 5 years ago
@@nofanfelani6924 But which infinity?
@00bean00 5 years ago
60 Symbols, unless it's math(s) only
@exoplanet11 4 years ago
Cool video...thanks. As a player on chess.com, I wondered exactly how that worked. Basically, I only have a 10% chance against someone with a rating 400 higher than mine. If the difference is 400, that player's odds of winning are a factor of 10 higher. This is similar to the magnitude system used in astronomy. If a star is 5 magnitudes brighter then it is 100 times brighter.
@AdeonWriter 4 years ago
Couldn't this system become flooded with "inflation points" of new people coming in, losing all their points, and then quitting?
@nobat00 4 years ago
I think new players tend to be matched with either other new players or low ranked ones, thus giving those new points to people also more likely to quit anyway.
@zxb995511 4 years ago
Rating inflation has been a thing for years in chess. But it's not really a problem because the numbers assigned to players will still (relative to each other) represent a relatively accurate measure of their playing strength. The problem is that this system makes it impossible to compare ratings of players over time. An example of this is that in the 1970s and 1980s the two best players in the world had PEAK ratings of around 2700 ELO, while now most players in the Top 100 have a rating over 2700 and most of them are not even close to being as good at chess as the two top masters of decades past.
@adimyokich 4 years ago
@@zxb995511 Seriously? The top 100 players we have right now are very much better than the top 5 players we had 30-40 years ago, same as in any other sport or competition: we have new ways to study, the internet, supercomputers to analyze openings. Current players are just better than the older players, as they had limited resources.
@inner_x3407 4 years ago
@@adimyokich well, it is the case now. If there will be no major breaks in chess for, like, 200 years, this could be a problem
@mytiliss682 4 years ago
It can also be deflated by retiring GMs, though they are not as numerous as low-ranked players and the difference in rating doesn't compensate for it. Of course, current top players can perform better than past ones, but not by so much. Like in the Olympics, there should be room for improvement, but it's not very large. It could be interesting to see how Elo per capita changed through the years at the professional level.
@giwrgosgewrgiou4066 5 years ago
P(A wins)=1-P(B wins)-P(draw), since you have put in there the probability of the draw (3:20)
@singingbanana 5 years ago
A draw is considered a half win and a half loss in this system. Including draws from the beginning can be done and you get a slightly different formula. There's a link in the description.
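As a small illustration of the reply above, a draw can be fed into the update as a score of 0.5. The sketch below assumes the same K for both players and the standard 400-point scale; the names `RESULT_SCORE` and `rate_game` are made up for this example.

```python
RESULT_SCORE = {"win": 1.0, "draw": 0.5, "loss": 0.0}  # result from player A's side


def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B (a draw counts as half a point)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def rate_game(r_a: float, r_b: float, result_a: str, k: float = 32.0):
    """Return the new ratings of A and B after one game."""
    delta = k * (RESULT_SCORE[result_a] - expected_score(r_a, r_b))
    # Whatever A gains, B loses, and vice versa.
    return r_a + delta, r_b - delta


if __name__ == "__main__":
    # A 1400 player drawing a 1000 player scores below expectation,
    # so the higher-rated player loses points and the lower-rated one gains.
    print(rate_game(1400, 1000, "draw"))
    # Equal players drawing leaves both ratings unchanged.
    print(rate_game(1200, 1200, "draw"))
```

This also touches the question raised elsewhere in the thread: under this convention a draw against a much weaker opponent does cost points.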
@giwrgosgewrgiou4066 5 years ago
@@singingbanana Thank you! Keep up!
@thekleopinetwork1033 5 years ago
dealing with elo every day, let me just tell you one thing: don't use elo for anything that isn't both 1v1 and deterministic (as in not containing RNG). It does not work in any other case. Not in online moba games, not in 2v2 team games, not in monopoly or whatever. (use trueskill or something bayesian instead). It's the difference between "design once and forget" and a bloody mess of patched statistical garbage. thanks!
@MrWizardjr9 5 years ago
i mean if you play enough games then the times you have crappy teammates and the times the enemy has crappy teammates will even out
@thekleopinetwork1033 5 years ago
@@MrWizardjr9 if it would just work that way :/ in fact elo doesn't support team mates, so mostly game devs use some tricks to make it work somehow, and that way ruin all statistic logic behind it.
@thekleopinetwork1033 5 years ago
@@realeconomy5348 at its base, it's maybe possible to estimate team contribution with it, but as soon as you have random matchmaking, the system will fail anyways (statistically). There are many systems designed to work with teams, elo really isn't a good solution for team games. There are adaptations of elo, such as the one League Of Legends uses, that work with a lot of resets and huge maintenance costs. It's not worth it. If you do know your maths, it doesn't take much to see the advantage of other systems (like bayesian systems).
@thekleopinetwork1033 5 years ago
@@realeconomy5348 i worked with those things, yes it kinda works, you are right there. but it's not practical. I saw many people struggling with it, it sounds good but doesn't work. Not due to the system itself, but mostly due to developers having to put artificial restrictions everywhere to get the system running, which usually breaks the system somewhere eventually. You will always require some workaround and devs aren't good in making them work.
@thekleopinetwork1033 5 years ago
@@realeconomy5348 works fine for darts, it's somewhat deterministic, there's no strategy involved, no real teamplay, it's just the combined individual skill to do throws well (so taking the mean or sum of people's ratings to get a team estimate works fine). Everyone has exactly the same contribution to the team and usually the teams are fixed. However take a soccer game where the keeper has a bad day. it can completely screw a team of 11 people, sometimes certain strategies, playstyles, positioning outweigh each other, that's when elo becomes an issue. The more chaotic a system the worse elo performs at it, you wouldn't rate weather using elo because it makes no sense. Games, especially MOBAs and such, are just as unpredictable, still people think elo can predict them properly.
@emoluv54865 3 years ago
Amazing!, I got a sudden urge to learn how the Rating systems work, then James showed up on the search results. Best search result I've got in years.
@hannesc.9823 4 years ago
Your shirt just suits you unbelievably fine!!
@twelfthdoc 3 years ago
My undergraduate thesis paper was on Elo ratings and their application to competitive online games. I scored a First on the paper! I had suffered all through my final year from ill health, and the paper pulled my entire degree classification up to a 2:1 =)
@Pawntoe4 5 years ago
As a chess player I've always been confused by the 400. Thanks for educating me on it! I would be interested to know what you think of k-values and a bit more about the year-on-year variance adjustment methods that are employed in chess also.
@MrBoubource 5 years ago
I'd be happy to hear some about these elo derived systems, greatest video of the year
@TheBlueboyRuhan 5 years ago
Elo is so last century. Can we shine more light on its successor, Glicko 2?
@JohnDoe-po3ku 5 years ago
omg such a savage stop making my panties wet
@Finkelfunk 5 years ago
It's just like Elo except someone sat down and created a formula to determine the highest and lowest points you have in your box you pull numbers out of.
@DeinBestrFreund 5 years ago
Glicko? True Skill? Or maybe Bayesian elo? No, let's look at the true successor of them all, whole history rating.
@Indygoflow1 5 years ago
Glicko is periodical. Elo can be updated instantly.
@Storiaron 5 years ago
@@Indygoflow1 guess you havent heard of the Storiaron adjusted glicko2 elo rating system, have you ?
@gbraadnl 5 years ago
This is very useful. Wish I had this a few years back when working on an Elo rating-based competitive website
@steeevealbright 5 years ago
Please make ten more videos about chess.
@scottekoontz 9 months ago
Draws are more common at higher ratings. This is digging in the math weeds, but there are formulas that also give draw percentages (pretty close when compared to 100,000+ completed games), and as a result the win% changes. The math makes use of ERFC so this is not something most people would be calculating often, although it can be done in a program like Excel with relative ease.
Ratings, win/loss/draw probabilities
+400 Elo
1200 vs 800: 0.899 0.061 0.040
2000 vs 1600: 0.881 0.042 0.077
2800 vs 2400: 0.856 0.017 0.127
More exaggerated for similar-rated players:
1000 vs 1000: 0.441 0.441 0.118
2000 vs 2000: 0.346 0.346 0.309
2700 vs 2700: 0.215 0.215 0.569
@23deepakiyer76 4 years ago
"eduardo i need the algorithm"
@enriqueflores5283 3 years ago
Very well explained. I wasn't aware that the measurement was relative to the population.
@randomaccount1310 5 years ago
So ELO depends on the transitivity of winning. And it also doesn't differentiate the quality of draw vs loss (i.e. winning and losing one game is the same as drawing two). It would be interesting to look at alternative rating systems that differ in those regards.
@thecuriousgorilla6005 5 years ago
Glicko systems have a rating deviation I believe, based on one's consistency.
@RobloxKid123 1 year ago
4:03 "You win half the games and you lose half the games" Draws - Do we look like a joke to you?
@Voltanaut 5 years ago
Great video, James. You are such a dude.
@styleisaweapon 1 year ago
In CS we simply say that a recurrence of the form z => z + k*(x-z) will tend towards the average of x over time. In statistics they call it the exponential running average or whatever. In early AI the technique was called "temporal differences."
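The recurrence mentioned in the comment above, written in its usual form z ← z + k·(x − z), can be sketched in a few lines. This is an illustration only; the function name `running_estimate` and the sample values are assumptions, and the link to Elo is by analogy (an Elo rating moves by K times the gap between the observed and expected score).

```python
def running_estimate(samples, k: float, start: float = 0.0) -> float:
    """Exponential running average: nudge z towards each new sample by a fraction k."""
    z = start
    for x in samples:
        z = z + k * (x - z)
    return z


if __name__ == "__main__":
    # With a constant input the estimate converges to that input.
    print(running_estimate([10.0] * 200, k=0.1))
    # A larger k reacts faster to recent samples but is noisier.
    scores = [1.0, 0.0, 1.0, 1.0, 0.5, 1.0, 0.0, 1.0]
    print(running_estimate(scores, k=0.1, start=0.5))
    print(running_estimate(scores, k=0.5, start=0.5))
```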
@chorthithian 5 years ago
Yeah, exactly. elo rating inflation (score measures your ability relative only to the population) is not great. Prevents us from comparing players that lived during different times. Is there any method to adjust the scores of older players such as Capablanca or Fischer so they are reflected in the current population's scores? Any necessary assumptions to these methods?
@connorp3030 5 years ago
Mau Mou would love to know this too
@doctorbobstone 5 years ago
I wonder if you could take a frozen version of an AI and use it as a standard measure of ratings going forward. I mean, obviously you could do it, but would it tell us something useful? I'm not sure. Maybe over time people would learn what the AI was susceptible to and scores would increase because of training relevant to beating the standard AI, but not relevant to beating humans. Another thought experiment: If you had a time machine and used it to match up grandmasters from 100 years ago with those of today, the GMs of today have the benefit of being able to study an additional 100 years of chess history. If I'm attempting to measure chess ability, am I interested in the real ability to win games as they were, or in the theoretical ability of those historical GMs to absorb the same history and then see how they performed? Unless we discover some absolute measure of chess ability, I think correlating scores across time may end up being an estimate at best and highly dependent on the assumptions you make in your models.
@andymcl92 5 years ago
@@doctorbobstone I wondered about a frozen AI too. Maybe the way to do it would be to have a frozen AI as the standard. To avoid players learning the methods to beat that one, you train up other AIs to play against it until they reach an equal standard. Then you match real players against the new AI. This way, the AI they're facing should be at the standard level but wouldn't be the SAME AI that they'd seen before. Disclaimer: I'm not a chess player. I know the rules but haven't played much. I just enjoy maths. :)
@aednil 5 years ago
as long as we have their games archived it should be possible to compare their ability. you could analyze their games with the computer and then look at the number of blunders and perfect moves they played. out of the number of perfect moves played, you could eliminate trivial perfect moves (like forced moves) and check out of how many possible moves someone has found the perfect move. if you consistently find the perfect move out of 30 possible moves and someone else only finds the perfect move consistently when they have only 10 possible moves, you'd be clearly better than that person.
@Piotrszyba.1 5 years ago
Are you fricken serious? XD we do not live in the vacuum, there's always a context everything relates to and that relates to everything. It's called a reality and reality is not a disadvantage :D it's rather an advantage.
@prithwishguha309 1 year ago
This video was really good James, make another about those sophisticated methods
@davidl5921 5 years ago
Won't there be rating inflation constantly because new players are joining?
@FluffyFractalshard 5 years ago
that's why he said that its one flaw is its dependency on the population. however, a game like chess, I believe, has a pretty constant number of players because not only are new ones joining but old ones are also quitting.
@jogadorjnc 5 years ago
No, because new players join with the average rating, in this case, 1000.
@bunpeishiratori5849 5 years ago
I believe that in the US, deflation is an issue because young players learn the game, develop skill, improve, and "take points" away from established players, only to quit the game themselves when they reach college age. So adult players spend their lives getting points taken away from them, which leads to the deflation. There are measures taken to offset this, however.
@vibovitold 1 year ago
@@FluffyFractalshard yes, but the ones that are quitting have absorbed some points, but those points will never return to the system anymore (they won't lose any rating points once they stopped playing - these points get "frozen" forever). whereas new players come in, most of them will lose some rating points (because they suck at chess, no time to study etc., and that's why they quit), so then they quit, but the rating points they lost have been absorbed by an active pool of players and will slowly bubble up the "food chain".
@barsaf9989 1 year ago
Very good explanation. Thanks. I've been curious about elo rating for years.
@emilymcplugger 4 years ago
The important thing is not to “Turn to Stone”, avoid “Confusion” and “hold on tight” to as many pieces as possible.
@BenHoyt 4 years ago
Great tips, these will work all over the world
@psibarpsi 2 years ago
Great video! I loved the fact that the video explained everything from scratch.
@zippymax1 5 years ago
But...but...what's _YOUR_ Elo rating?!
@Immediately_ 5 years ago
1,400 but it's probably inflated these days because I haven't played in a while.
@bunpeishiratori5849 5 years ago
Mine is about 1800 (USCF).
@rubetornabene8543 5 years ago
Mine is >2500
@heyandy889 5 years ago
it's about three fifty
@xander1052 5 years ago
Mine is about 1500 since I do play mostly players above or around my skill level One of my friends though is definitely above 2k these days, been trying to convince him to go to a chess tournament one of these days for a laugh, he was our school's Connect 4 champion interestingly enough too.
@gamespotlive3673 1 year ago
Legitimately the BEST explanation of ELO that I've ever seen. WOW.
@ButteryCowEgg 5 years ago
This has always been my favorite rating system. It's such a shame Ultimate didn't use this.
@winobouwens 5 years ago
I've been playing a game named 'Warzone' for a few years and it uses Bayesian ELO. It's basically ELO, but with a few adjustments and it tries to approach a Bayesian way of thinking about ELO. When you beat someone with a rating of 1000, but later they turn out to have an actual rating of 2000, you get compensated for it. It goes the other way round too, of course; beat a highly rated opponent who starts playing worse in the long run and you'll get deducted some points. This system is not necessarily good for long-time, continuous events such as ladders in chess (or Warzone), but seems to do a great job in round-robin style tournaments with about 8-15 players. I think the code for it is open source somewhere, definitely worth checking out for chess freaks :)
@LeventK 4 years ago
And again YouTube recommendations brought us together.
@quinnlaya331 4 years ago
No I searched for this. Hello recommendation people tho! Shout out to my CSGO and LoL bros!
@pawncube2050 3 years ago
....
@appleslover 3 years ago
Who doesn't know james? Gosh
@darth_sidius5311 3 years ago
Dude wtf you are here too.
@FirasSawaf 5 years ago
Thanks for the video, very interesting indeed. Not sure if +400 point equates to 10 times the chance of winning for the higher rated player. It might be slightly higher than that, at least in practice that is. I'll have to dig up the article from which I'd worked this out but I seem to recall something like this: the lower rated player's chance of winning a game against someone 100 points higher rated is roughly 1/3. When the difference is 200 it's 1/4, when 300 it's 1/7, and when 400 it's 1/12, i.e. rather than 1/10. It then approximately halves for each additional 100 point difference, so when 500 it's 1/24, when 600 it's 1/48, and when 700 it's 1/96.
@k-intefleush4934 5 years ago
400 points difference implying the higher rated player is 10 times more likely to win is the definition of the Elo system (at least how it's used in chess).
lower rated player: E = 1 / (1 + 10^(400/400)) = 1/11
higher rated player: E' = 1 / (1 + 10^(-400/400)) = 1/1.1 = 10/11
You can change that /400 factor to whatever you want to scale it according to what you need (maybe that's where your confusion comes from?). Of course, this is just a model, so I'd be interested in seeing statistics about real games and see to what extent they match the Elo expectations.
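The two values worked out in the comment above can be reproduced with a short function. This is a minimal sketch of the standard expected-score formula; the `scale` parameter name is an assumption used here to show the "/400 factor" being changed, and the expected score counts a draw as half a point rather than being a pure win probability.

```python
def expected_score(r_player: float, r_opponent: float, scale: float = 400.0) -> float:
    """Expected score of a player against an opponent under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_opponent - r_player) / scale))


if __name__ == "__main__":
    # 400 points apart: 1/11 for the lower-rated player, 10/11 for the higher-rated one.
    print(expected_score(1200, 1600), expected_score(1600, 1200))
    # Halving both the ratings and the scale gives the same expectation,
    # so the 400 only fixes the units in which ratings are expressed.
    print(expected_score(600, 800, scale=200.0))
    # Expected score of the lower-rated player for gaps of 100 to 700 points.
    for gap in range(100, 800, 100):
        print(gap, round(expected_score(1500, 1500 + gap), 3))
```

Comparing the last loop's output with observed results from real games would be one way to test how well the model matches practice, which is the open question raised in the thread.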
@FirasSawaf 5 years ago
@@k-intefleush4934 Sure, though in practice there are other factors which confound this picture. It would be much easier to share this if only I could find that article. From what I remember, that article took other factors into account, like elo rating "inflation" phenomena. This is a well understood shortcoming of how elo is currently calculated/updated but there seems to be no real desire to switch over to a more accurate/stable version. I'll post a link to that article when I eventually find it, it's pretty academic but there are some useful tables illustrating these aspects, so should be interesting I would have thought.
@k-intefleush4934 5 years ago
@@FirasSawaf Yeah, the Elo system as presented in this video does not care about inflation and deflation, and probably a lot of other phenomena are left out. Would love to read the article you mentioned if you can find it again. The Wikipedia articles are very interesting too by the way.
@trdi 5 years ago
2 points. First one is that initial ELO is calculated differently than it was suggested here. My initial ELO was the average of all opponents I played in my first tournament, because I had exactly 4 wins and 4 losses. I'm talking about the actual international ELO, not national ratings, where the video explanation might be true for some countries. Second point is that ELO has a very big weakness in that it can't be compared between eras easily, but for a different reason than it was said in the video. There is a very nasty effect of ELO inflation, meaning that average ELO will be higher and higher and ELO records are going to be broken all the time because of that. Carlsen does not hold the ELO record if you count the inflation effect. It's easy to see why the inflation happens. Typically players get better with age and their rating gets higher. However from some age onward the ability starts deteriorating. So when the new kids come in, they play against a player with ELO 2600 whose real ELO should be 2300. Younger players gain their rating faster than older players are losing their ELO (consequence of math and there is also an additional factor K, which makes the gains faster for players with fewer games).
@markstanbrook5578 5 years ago
The extra weighting you describe for newer players means the system isn’t a strict ELO system. Without it the issue of deteriorating skill is a non-issue because all games have a zero-sum effect on the total ELO pool and it is absolutely required for the integrity of the system that the player loses their ELO in the mathematically appropriate manner. Of course there are all manner of add-ons to the system like the k-factor you describe and the fast-track to a nominal ELO you also talk about in the first paragraph. They give a certain convenience to players but cause inflation. In fact I think it might be arguable that in a strict system inflation might take place - in the pool of active players - as underperformers quit playing while leaving their lost ELO in the pool. If the rate at which people did that sufficiently exceeded the rate at which successful players retired, taking the extra out of the active pool... and technology might well assist that. Online chess for example.
@Fatalgh0stt14 1 year ago
I immediately saw the thumbnail and I was like, isn’t this the guy from Numberphile? Anyway, lovely video, subbing to you on here as well, love the breakdowns so much ❤
@SeabasstianTV 3 years ago
3:34 time to go write this on my dorm room window *trent reznor and atticus ross soundtrack intensifies*
@grapetoad6595 1 year ago
I am so glad for this video. I was so confused about what it meant, and hit the same wall of explanationless-ness.
@LeventK 3 years ago
Queen's Gambit: Exists RU-vid: Alright let's recommend this video so people will learn ELO was a guy.
@ABomm 5 years ago
I really appreciate the write-up in the description!
@ganifraterdogan1062 5 years ago
I wonder what the highest Elo difference is in a chess game that resulted in an underdog victory. Any statistics pros?
@MrMctastics 5 years ago
I'd say an average player has 20 times the chance of beating me, because I don't play the game. I was messing around with the best chess player in the school and he tried to beat me in only a couple of turns, but he used the only trick I know in chess, whose name I don't remember, so I managed to counter it and win the game. 1/20000 chance
@geitekop507 5 years ago
There was a huge upset last year, I believe, where a grandmaster lost a whole lot of Elo points after losing to a 1200 because he wasn't paying attention.
@jasperw.7664 5 years ago
In online chess, GM Eric Hansen (~2500?) once lost to a 100-rated player
@edwardshowden5511 5 years ago
@@jasperw.7664 Nope. It was a 'special' account
@gasdive 5 years ago
@@geitekop507 he just explained that you can't lose more than 32 points.
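The 32-point cap can be checked with a small Python sketch. It assumes K = 32 (taken from the comment above) and the usual logistic expected-score formula; the function name and the 2500 versus 1200 ratings are only illustrative.

```python
def rating_change(r_player, r_opponent, score, k=32.0):
    """Rating change for one game; score is 1.0 win, 0.5 draw, 0.0 loss.

    Assumes the logistic expected-score formula with a 400-point scale.
    """
    expected_score = 1.0 / (1.0 + 10 ** ((r_opponent - r_player) / 400.0))
    return k * (score - expected_score)


if __name__ == "__main__":
    # A 2500-rated player loses to a 1200-rated player.
    print(rating_change(2500.0, 1200.0, 0.0))
```

Because the expected score is always strictly between 0 and 1, the change for a single game stays strictly between -K and +K, however lopsided the pairing.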
@InternetLAMA 5 years ago
Great explanation! Follow-up question: does this mean that, since you basically take points from the loser when you win, it's a zero-sum game, and that the total player base (the total Elo in the system) affects how high an Elo rating you can get?
@singingbanana 5 years ago
Correct
@NoamMohr 5 years ago
I heard that Elo was Hungarian, and his name was pronounced Ay-lo.
@oraszuletik 3 years ago
The original Hungarian spelling is Élő. It means "living". The pronunciation is hard for me to explain :-)
@psionl0 2 years ago
I learned that the probability of winning was based on a normal distribution with a mean of 0 and a standard deviation of 283.165. That is, P(A wins) = N(R_A - R_B; 0, 283.165), where N is the cumulative distribution function. Both formulas give the same value (24%) when player B is rated 200 points higher than player A. However, the normal distribution formula tapers off more sharply for larger differences in the Elo ratings.
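The comparison can be sketched in Python. This assumes the other formula is the usual logistic one, 1 / (1 + 10^((R_B - R_A) / 400)), and reads N as the normal cumulative distribution function with the standard deviation quoted above; the function names are illustrative.

```python
import math


def logistic_expected(r_a, r_b):
    """Win expectancy of A under the logistic formula (assumed form)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def normal_expected(r_a, r_b, sigma=283.165):
    """Win expectancy of A under a normal CDF with mean 0 and the quoted sigma."""
    z = (r_a - r_b) / sigma
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for diff in (200, 400, 800):
        low, high = 1500.0, 1500.0 + diff
        print(diff, logistic_expected(low, high), normal_expected(low, high))
```

At a 200-point gap both functions give roughly 0.24. At larger gaps the normal version falls below the logistic one, which matches the "tapers off more sharply" remark.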
@daon23 3 years ago
Me: *Calculates chances of winning* "Ah! It's more than 100%, nice, this is gonna be easy" *Blunders queen on move 5*
@user-id9ct1ow9h 2 months ago
And now this 1960s math is the best way to determine which AIs are better than others. Thank you for the formulas explanation, James, it's perfect as always. I was very glad to find out you have the video on this topic
@keckscraftinghd4385 4 years ago
Uff my english ist Bad i thought this was an elon Musk cheese rating
@Osama-Bon-Jovi-01 3 years ago
lol
@recklessroges 5 years ago
Nice to have you back.
@LiamE69 5 years ago
The Elo rating is a great example of Mr Blue Sky thinking.
@andymcl92 5 years ago
Mr Deep Blue Sky Thinking?
@LancenHarms 5 years ago
Took me WAY too long to get this.
@xander1052 5 years ago
It took me too long to work out.
@DaTux91 1 year ago
I missed this video when you first published it, but it's so interesting! Did you ever do a follow-up?