Elo rating system

Principle, calculations and mathematical construction

The Elo rating system is a system for evaluating the relative level of players.
This system ranks players solely on the basis of their results. It is in fact a mathematical model that evaluates the strength, or level, of each player relative to the other ranked players.
In particular, the difference in Elo ranking between two opponents directly gives each one's probability of winning, which makes this model a valuable tool for bettors, bookmakers, etc.
Armed with this estimated probability of victory for a player or a team, we can calmly study whether a given bet is a value bet, and size our bets using the Kelly criterion.
On this page, we explore the mathematical construction of the Elo ranking system.

On ranking systems

A ranking system is a set of rules that allows opponents to be classified in a given discipline.
Even if the question may seem simple, many ways of proceeding can be imagined, each having its qualities and its faults.
In general, a points system is used: in each match the winner earns points and the loser loses points (or gains fewer). The player with the most points is ranked first, then comes the second, and so on. The design question then becomes how many points to award after each match: should the top-ranked player earn as many points for (easily) beating the last-ranked player as for beating the second-ranked player? And should a world championship be worth as many points as a national, local or more anecdotal event?
For example, in tennis in particular, the number of points awarded depends on the type of tournament and on the player's past results (those of the previous year). This has produced the following paradox:

In the final of a tournament (Beijing, 2013), N. Djokovic, world number 1, beats R. Nadal, world number 2.
Following this victory, N. Djokovic becomes world number 2, giving way to R. Nadal, who overtakes him in the ranking...
This paradox is easily explained by the ATP points system: the winner, N. Djokovic, had already won the tournament the previous year, so his number of points does not change, while R. Nadal was injured the previous year on the dates of this same tournament and could not take part: his number of points therefore increases because this year he reached the final.
In fact, on closer inspection, the swap between the first two places was already settled before the final, whatever its outcome.
In a way, the ranking does not directly reflect the results of the matches. Can we then use it to predict the outcome of future matches?

A ranking system is therefore a set of rules which orders players according to their results against one another. This order in turn gives an idea of the “strength” of each player. These results are the sole element on which the Elo ranking system is built.

Elo ranking

In the Elo ranking system, named after its inventor Arpad Elo, a physics professor who studied at the University of Chicago and taught at Marquette University, each player's ranking depends on his results against other players.
A particularity of this ranking system is that it can directly reflect the probability of a player defeating another player.
This ranking system is therefore actually a mathematical model for evaluating the level of each player, a level relative to the other ranked players.

The Elo model was initially designed for chess. It is slowly being adopted in other disciplines, such as football with the FIFA ranking: the FIFA women's world ranking in particular is an Elo-type ranking.
In tennis too, Elo rankings of players are starting to appear.

Mathematical construction of the Elo ranking system

The Elo system aims to measure the relative strength of two adversaries A and B by estimating the probability that one has of winning over the other.
We denote by P(A/B) the probability that A wins against B. We therefore directly have the first property P(B/A) = 1 − P(A/B).

Hypothesis for constructing the ranking system

We want the chances of winning or losing of one player against another, or rather the probability of winning/losing, to depend only on the difference between the rankings: in other words, we want that

P(A/B) = F ( Elo(A) − Elo(B) )

where F is a function, a priori arbitrary.
Constructing such a classification system thus ultimately and simply amounts to defining this function F.

Choice of the function F

We have F: ℝ → (0, 1). We quite naturally require F to be a continuous and increasing function: the greater the difference in ranking, the greater the probability of winning.
Finally, the value F(0) = 0.5 is also logically necessary: with equal ranking, the same probability of winning for each opponent.

Balance of power, or relative strength of adversaries

We define the balance of power between opponents A and B, or relative strength of the two players, by

R(A/B) = P(A/B) / P(B/A) = P(A/B) / (1 − P(A/B))

For example, if P(A/B) = 60% then

R(A/B) = 0.6/0.4 = 1.5

In other words, if the probability that A wins is 60% then A is one and a half times more likely to win than B.
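The balance of power is simply the odds associated with a probability. A minimal Python sketch of the definition above (the function name `balance_of_power` is chosen here for illustration and is not part of the original text):

```python
def balance_of_power(p_win: float) -> float:
    """Return R = p / (1 - p) for a winning probability 0 < p < 1."""
    if not 0.0 < p_win < 1.0:
        raise ValueError("p_win must lie strictly between 0 and 1")
    return p_win / (1.0 - p_win)


# The 60% example from the text: A is one and a half times more likely to win.
print(round(balance_of_power(0.6), 6))  # 1.5
```

With equal chances (p = 0.5) the function returns 1: neither player is favoured.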

Hypothesis on the balance of power

A ranking system must make it possible, among other things, to compare the strength of two adversaries even when they have never met. We must therefore be able to evaluate it by transitivity, that is to say evaluate the strength of A relative to C knowing that of A relative to B and that of B relative to C.
We assume for the Elo ranking that this balance of power is multiplicative, i.e.

R(A/C) = R(A/B) × R(B/C)
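To illustrate the multiplicative hypothesis, the sketch below deduces P(A/C) from P(A/B) and P(B/C) for players A and C who have never met. The helper names (`odds`, `prob_from_odds`, `chain_probability`) and the sample probabilities are illustrative assumptions, not part of the original text.

```python
def odds(p: float) -> float:
    """Balance of power R = p / (1 - p)."""
    return p / (1.0 - p)


def prob_from_odds(r: float) -> float:
    """Inverse relation: p = R / (1 + R)."""
    return r / (1.0 + r)


def chain_probability(p_ab: float, p_bc: float) -> float:
    """P(A/C) under the hypothesis R(A/C) = R(A/B) x R(B/C)."""
    return prob_from_odds(odds(p_ab) * odds(p_bc))


# If A beats B 60% of the time and B beats C 60% of the time,
# then R(A/C) = 1.5 x 1.5 = 2.25 and P(A/C) = 2.25 / 3.25.
print(round(chain_probability(0.6, 0.6), 4))  # 0.6923
```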

Choice of function F

According to the previous expressions, we now quite naturally introduce the function

G: x ↦ ln( x / (1 − x) )

where ln denotes the natural logarithm, and which allows us to express the balance of power according to

ln(R(A/B)) = ln( P(A/B) / (1 − P(A/B)) ) = G(P(A/B))

We now return to our desired function F, which we had introduced into the relation

P(A/B) = F ( Elo(A) − Elo(B) )

where now

ln(R(A/B)) = G ( F ( Elo(A) − Elo(B) )) = H( Elo(A) − Elo(B) )

by setting H = G ∘ F, the composition of the two functions.

The hypothesis that the balance of power is multiplicative,

R(A/C) = R(A/B) × R(B/C)

is then transformed into an additive relation via the logarithm:

ln(R(A/C)) = ln(R(A/B)) + ln(R(B/C))

To simplify, let's introduce the distances d₁ = Elo(A) - Elo(B) and d₂ = Elo(B) - Elo(C). We then have d₁ + d₂ = Elo(A) - Elo(C) and then the previous relation is rewritten:

H(d₁ + d₂) = H(d₁) + H(d₂)

which shows that the function H is an additive function.
As F is assumed continuous, and G is continuous on its domain, the function H is continuous and additive. It therefore satisfies Cauchy's functional equation, whose only continuous solutions are linear: there exists a real constant α such that, for all real d,

H(d) = αd
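The step from "continuous and additive" to "linear" can be sketched as follows. This is the standard argument, given here as an outline rather than a full proof; p denotes an integer and q a positive integer.

```latex
\begin{align*}
H(0) &= H(0+0) = 2H(0) \;\Rightarrow\; H(0) = 0, \\
H(nd) &= n\,H(d) \quad \text{for every integer } n \ge 1 \text{ (by induction on } n\text{)}, \\
H(-d) &= -H(d) \quad \text{since } H(d) + H(-d) = H(0) = 0, \\
H\!\left(\tfrac{p}{q}\right) &= \tfrac{p}{q}\,H(1) \quad \text{because } q\,H\!\left(\tfrac{p}{q}\right) = H(p) = p\,H(1), \\
H(d) &= \alpha d \quad \text{for all real } d, \text{ with } \alpha = H(1),
  \text{ by continuity of } H \text{ and density of } \mathbb{Q} \text{ in } \mathbb{R}.
\end{align*}
```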

We can now come back to the function F:

H(d) = (G ∘ F)(d) = αd

hence

F(d) = G⁻¹(αd)

It now remains to determine the explicit analytical expression.

Analytical expression

The previous expression of the function F involves the inverse function G⁻¹ of G, whose expression we now need to determine:

G(x) = y ⇔ ln( x / (1 − x) ) = y ⇔ x / (1 − x) = e^y ⇔ x = (1 − x) e^y ⇔ x (1 + e^y) = e^y ⇔ x = e^y / (1 + e^y) = G⁻¹(y)

and we therefore find the analytical expression of the function F sought, and on which the calculations of the Elo ranking system are based:

F(d) = G⁻¹(αd) = e^(αd) / (1 + e^(αd))

or, after multiplying numerator and denominator by e^(−αd),

F(d) = 1 / (1 + e^(−αd))

We therefore quite naturally end up with a logistic function: more precisely, F is the cumulative distribution function of a logistic distribution (the standard one when α = 1).
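As a numerical sanity check of the derivation, the sketch below verifies that the logistic F inverts G up to the factor α, and that H = G ∘ F is additive. The value of `alpha` and the sample distances are arbitrary assumptions chosen for the test.

```python
import math


def G(x: float) -> float:
    """Log-odds: G(x) = ln(x / (1 - x)) for 0 < x < 1."""
    return math.log(x / (1.0 - x))


def F(d: float, alpha: float = 1.0) -> float:
    """Logistic function F(d) = 1 / (1 + e^(-alpha * d))."""
    return 1.0 / (1.0 + math.exp(-alpha * d))


alpha = 0.01  # arbitrary positive scale, for illustration only

# H(d) = G(F(d)) should equal alpha * d.
for d in (-300.0, 0.0, 120.0):
    assert math.isclose(G(F(d, alpha)), alpha * d, abs_tol=1e-9)

# Additivity: H(d1 + d2) = H(d1) + H(d2).
d1, d2 = 150.0, -40.0
assert math.isclose(
    G(F(d1 + d2, alpha)), G(F(d1, alpha)) + G(F(d2, alpha)), abs_tol=1e-9
)

print(F(0.0, alpha))  # 0.5, as required by F(0) = 0.5
```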

Return to the probability of victory

We recall the main use of this function: it gives the probability that A wins against his opponent B when the difference between their rankings is

d = Elo(A) − Elo(B)

namely

P(A/B) = F(d) = 1 / (1 + e^(−αd))

Two elements remain to be clarified: the constant α, and the calculation that updates the Elo rankings at the end of a match, after A's victory or defeat.

Choice of parameter α

The coefficient α is a scaling parameter of all the players' rankings. In chess, the choice of value

α = ln(10)/400

leads to rankings ranging roughly from 0 to 3000.
With this value, we have

exp(−αd) = exp(−d ln(10)/400) = exp(ln(10^(−d/400))) = 10^(−d/400)

and we therefore have the expression of the probability as a function of the ranking difference d = Elo(A) − Elo(B):

P(A/B) = 1 / (1 + 10^(−d/400))
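A minimal Python sketch of this probability on the chess scale. The function name `win_probability` and the sample rankings are illustrative assumptions.

```python
def win_probability(elo_a: float, elo_b: float) -> float:
    """P(A/B) = 1 / (1 + 10^(-d/400)) with d = Elo(A) - Elo(B)."""
    d = elo_a - elo_b
    return 1.0 / (1.0 + 10.0 ** (-d / 400.0))


print(win_probability(1500, 1500))            # 0.5
print(round(win_probability(1700, 1500), 3))  # 0.76
print(round(win_probability(1900, 1500), 3))  # 0.909
```

On this scale a 400-point advantage corresponds to a balance of power of 10 to 1, that is, a winning probability of 10/11.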

Updating Elo rankings after victory or defeat

The previous probability function is used to update rankings after a match.
When a player A with ranking Elo(A) plays an opponent B with ranking Elo(B), his new ranking Elo'(A) is calculated by the formula

Elo'(A) = Elo(A) + K(W − P(A/B))

where

W is the match result: W = 1 if A wins, W = 0 if he loses, and W = 0.5 in case of a draw.
K is a ranking stability parameter. The smaller the value of K, the more stable the rankings: they vary little after a single match, and many matches are needed to change a ranking significantly. Small values are generally used (in chess in particular) for strong players who already have a high ranking (professional players, international chess masters, etc.).
A larger value of K implies, on the contrary, that each result has a more significant influence on the ranking.
As an example, for the Elo ranking in chess the choice is:
- K = 40 until the player's 30th game
- K = 20 for an Elo ranking below 2,400
- K = 10 for an Elo ranking above 2,400.
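The update formula and the K schedule above can be sketched in Python as follows. The function names are illustrative, and two details are assumptions of this sketch rather than statements from the text: `games_played` counts games completed before the current one, and a ranking of exactly 2,400 is treated as the upper tier.

```python
def k_factor(rating: float, games_played: int) -> int:
    """K schedule from the list above (boundary handling is an assumption)."""
    if games_played < 30:
        return 40
    return 20 if rating < 2400 else 10


def win_probability(elo_a: float, elo_b: float) -> float:
    """P(A/B) = 1 / (1 + 10^(-d/400)) with d = Elo(A) - Elo(B)."""
    return 1.0 / (1.0 + 10.0 ** (-(elo_a - elo_b) / 400.0))


def updated_rating(elo_a: float, elo_b: float, w: float, k: float) -> float:
    """Elo'(A) = Elo(A) + K (W - P(A/B)), with W = 1, 0.5 or 0."""
    return elo_a + k * (w - win_probability(elo_a, elo_b))


k = k_factor(1600, games_played=120)  # established player below 2400: K = 20
print(round(updated_rating(1600, 1400, 1.0, k), 1))  # 1604.8 after a win
print(round(updated_rating(1600, 1400, 0.0, k), 1))  # 1584.8 after a loss
```

The asymmetry in the example reflects the formula: the favourite gains little from an expected win and loses more from an unexpected defeat.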

Elo ranking calculator

The following calculator automatically computes the new Elo rankings of two opponents from their rankings at the time of the match, the result of the match, and the parameter K.
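A minimal Python sketch of such a calculator, assuming the chess scale (base 10, 400 points) and allowing a separate K for each player. The function name `elo_calculator` and its default values are illustrative choices, not taken from the original page.

```python
def elo_calculator(
    elo_a: float,
    elo_b: float,
    result_a: float,
    k_a: float = 20.0,
    k_b: float = 20.0,
) -> tuple[float, float]:
    """Return the new rankings (Elo'(A), Elo'(B)) after one match.

    result_a is W for player A: 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    p_a = 1.0 / (1.0 + 10.0 ** (-(elo_a - elo_b) / 400.0))
    p_b = 1.0 - p_a
    new_a = elo_a + k_a * (result_a - p_a)
    new_b = elo_b + k_b * ((1.0 - result_a) - p_b)
    return new_a, new_b


# Upset: the lower-ranked player A (1500) beats B (1700).
new_a, new_b = elo_calculator(1500, 1700, result_a=1.0)
print(round(new_a, 1), round(new_b, 1))  # 1515.2 1684.8
```

When both players share the same K, the points gained by one are exactly the points lost by the other; with different K values the exchange is no longer symmetric.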