Elo, Glicko, and Glicko-2 Explained
In the world of competitive games and sports, rating systems play a crucial role in measuring player or team skill levels, predicting match outcomes, and facilitating fair matchmaking. Among the most influential are the Elo rating system and its successors, Glicko and Glicko-2. These systems, rooted in statistical models, help quantify relative strengths in zero-sum games like chess, esports, and even traditional sports. In this blog post, we’ll dive deep into how each system works, explore their mathematical foundations with equations, and examine real-world applications, including their specific use in Dota 2 for player MMR and team rankings. Whether you’re a data enthusiast, a gamer, or a sports analyst, understanding these can shed light on why your favorite team ranks where it does.
The Elo rating system, named after its creator Arpad Elo, a Hungarian-American physicist and chess master, was developed in the mid-20th century as an improvement over earlier methods like the Harkness system.
At its heart, Elo predicts the expected outcome of a match based on rating differences and updates ratings accordingly.
Expected Score Calculation: For two players A and B with ratings $ R_A $ and $ R_B $, the expected score $ E_A $ for player A (the probability of a win plus half the probability of a draw) is given by the logistic function:
\[E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}\]Similarly, $ E_B = 1 - E_A $. The factor 400 is chosen so that a 400-point advantage gives the higher-rated player an expected score of about 0.91.
Rating Update: After the game, player A's new rating $ R_A' $ is:
\[R_A' = R_A + K \cdot (S_A - E_A)\]Here, $ S_A $ is the actual score (1 for win, 0.5 for draw, 0 for loss), and $ K $ is the “K-factor,” which controls the magnitude of changes. Higher K values (e.g., 40 for new players) allow rapid adjustments, while lower ones (e.g., 10 for experts) stabilize ratings.
The USCF uses a dynamic K: $ K = \frac{800}{N_e + m} $, where $ N_e $ is the effective number of games rated, and $ m $ is games in the current tournament.
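The machinery above fits in a few lines. A minimal sketch in Python (function names are mine):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score for player A (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> float:
    """Player A's new rating after scoring `score_a` against B: R' = R + K(S - E)."""
    return r_a + k * (score_a - expected_score(r_a, r_b))

def uscf_k(n_effective: float, games_this_event: int) -> float:
    """USCF-style dynamic K-factor: K = 800 / (N_e + m)."""
    return 800.0 / (n_effective + games_this_event)
```

With these defaults, `elo_update(1500, 1500, 1.0)` returns 1516.0, and a 400-point favorite's expected score comes out to roughly 0.91, matching the calibration described above.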
Originally designed for chess, Elo has expanded far beyond. In esports, it’s used for matchmaking in games like League of Legends (pre-Season 2), Overwatch (with seasonal adjustments), and Classic Tetris.
Developed by statistician Mark Glickman in 1995, the Glicko system addresses Elo’s limitations by incorporating “Ratings Deviation” (RD), a measure of rating reliability that increases with inactivity or inconsistent play.
Unlike Elo, which treats all ratings as equally reliable, Glicko uses RD to dampen updates when uncertainty is high (e.g., after long breaks). Glicko-2 further refines this with volatility, assuming strengths follow an auto-regressive normal process.
RD Update (Pre-Games): RD increases over time:
\[RD = \min\left(\sqrt{RD_0^2 + c^2 t}, 350\right)\]where $ t $ is the number of rating periods elapsed, and $ c \approx 34.6 $ (tuned so that a typical player's RD climbs from 50 back to the ceiling of 350 after roughly 100 periods of inactivity).
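This pre-period inflation is a one-liner. A sketch using Glickman's illustrative constant $ c \approx 34.6 $:

```python
import math

def inflate_rd(rd0: float, t: float, c: float = 34.6, rd_max: float = 350.0) -> float:
    """Pre-period RD increase: uncertainty grows with inactivity, capped at rd_max."""
    return min(math.sqrt(rd0 ** 2 + c ** 2 * t), rd_max)
```

Starting from RD 50, about 100 idle rating periods bring a player back up to (essentially) the 350 ceiling, i.e., the system treats them as nearly unrated again.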
Rating Update (Post-Games): For $ m $ games, new rating $ r $:
\[r = r_0 + \frac{q}{\frac{1}{RD^2} + \frac{1}{d^2}} \sum_{i=1}^{m} g(RD_i)(s_i - E(s|r_0, r_i, RD_i))\]with $ q = \frac{\ln(10)}{400} $, $ g(RD_i) = \frac{1}{\sqrt{1 + \frac{3q^2 RD_i^2}{\pi^2}}} $, and expected score $ E(s|r_0, r_i, RD_i) = \frac{1}{1 + 10^{-g(RD_i)(r_0 - r_i)/400}} $, an Elo-style logistic dampened by $ g $. The quantity $ d^2 = \left[q^2 \sum_{i=1}^{m} g(RD_i)^2 E(1 - E)\right]^{-1} $ measures how much information the period's games carry, and the new deviation is $ RD' = \sqrt{\left(\frac{1}{RD^2} + \frac{1}{d^2}\right)^{-1}} $.
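Putting those pieces together, a one-period Glicko update can be sketched as follows (variable names are mine; scores use the same 1/0.5/0 convention as Elo):

```python
import math

Q = math.log(10) / 400.0  # the constant q = ln(10)/400

def g(rd: float) -> float:
    """Damping factor: discounts results involving uncertain (high-RD) opponents."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)

def expected(r: float, r_j: float, rd_j: float) -> float:
    """Elo-style expected score, dampened by the opponent's RD."""
    return 1.0 / (1.0 + 10 ** (-g(rd_j) * (r - r_j) / 400.0))

def glicko_update(r: float, rd: float, results):
    """One rating-period update. results = [(r_j, rd_j, score_j), ...]."""
    d2_inv = Q ** 2 * sum(
        g(rd_j) ** 2 * expected(r, r_j, rd_j) * (1.0 - expected(r, r_j, rd_j))
        for r_j, rd_j, _ in results
    )
    delta = sum(g(rd_j) * (s_j - expected(r, r_j, rd_j)) for r_j, rd_j, s_j in results)
    new_rd = math.sqrt(1.0 / (1.0 / rd ** 2 + d2_inv))  # RD' = (1/RD² + 1/d²)^(-1/2)
    new_r = r + Q * new_rd ** 2 * delta                 # q/(1/RD² + 1/d²) = q · RD'²
    return new_r, new_rd
```

On the worked example from Glickman's Glicko paper (a 1500-rated player with RD 200 facing opponents rated 1400, 1550, and 1700 with RDs 30, 100, and 300, beating only the first), this yields roughly $ r' \approx 1464 $ and $ RD' \approx 151 $.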
Glicko-2 builds on this with volatility. Key steps include computing variance $ v $ and delta $ \Delta $:
\[v = \left[\sum_{j=1}^{m} g(\phi_j)^2 E(\mu, \mu_j, \phi_j)\{1 - E(\mu, \mu_j, \phi_j)\}\right]^{-1}\] \[\Delta = v \sum_{j=1}^{m} g(\phi_j)\{s_j - E(\mu, \mu_j, \phi_j)\}\]where $ \phi $ is RD (scaled), $ \mu $ is rating, and functions $ g $ and $ E $ are analogous.
Glicko shines in online environments with irregular play. It’s used on chess platforms like Lichess and Chess.com, where RD stabilizes ratings for infrequent players.
Elo is simple and effective for consistent competitors but ignores uncertainty. Glicko adds RD for better handling of inactivity, while Glicko-2’s volatility makes it ideal for volatile performances. In practice, Elo suits stable environments like professional chess, whereas Glicko variants excel in dynamic online games.
While the core principles of Elo, Glicko, and Glicko-2 provide a universal framework for skill assessment, their adaptations in specific games like Dota 2 highlight how these systems evolve to meet the demands of massive multiplayer online battle arena (MOBA) environments. Dota 2, developed by Valve, has undergone significant changes to its matchmaking rating (MMR) system over the years, balancing competitive integrity, player retention, and computational efficiency. This section explores how Dota 2 has implemented (and iterated on) these systems for individual players and professional teams, drawing from historical transitions up to the current landscape in September 2025. We’ll focus on MMR evolution for players and third-party team rankings, as Valve does not maintain an official team MMR.
Dota 2’s matchmaking system, which pairs players for fair games, has always been inspired by Elo’s zero-sum transfer of points based on expected vs. actual outcomes. Early iterations (pre-2013) used a basic Elo variant, where MMR was a hidden integer value starting around 1500, adjusted by a fixed amount per win/loss (typically ±25 MMR). This mirrored the Elo update equation:
\[R' = R + K (S - E)\]with K often fixed at 32–40, leading to issues like MMR deflation (ratings drifting downward over time as points leak out of the player pool) and poor handling of new or inactive players, whose true skill a fixed-K system tracks slowly.
By 2023, with Patch 7.33, Valve overhauled the system to adopt Glicko, a more sophisticated Bayesian approach that incorporates rating deviation (RD) to account for uncertainty. Under Glicko, the expected score against opponent $ j $ becomes:
\[E = \frac{1}{1 + 10^{-g(RD_j)(r - r_j)/400}}\]where $ q = \ln(10)/400 $ and $ g(RD_j) = 1/\sqrt{1 + 3q^2 RD_j^2/\pi^2} $; the $ g(\phi) $ term dampens predictions for uncertain ratings.
As of September 2025, Dota 2’s player MMR remains Glicko-based, integrated into a seasonal ranking structure introduced in 2024.
| Tier | MMR Range (Season 6, Aug 2025) |
|---|---|
| Herald | 1–769 |
| Guardian | 770–1539 |
| Crusader | 1540–2309 |
| Archon | 2310–3079 |
| Legend | 3080–3849 |
| Ancient | 3850–4619 |
| Divine | 4620–5389 |
| Immortal | 5390+ |
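For illustration, the bracket table can be folded into a simple lookup. The thresholds below are taken from the table above; the function name and the `"Uncalibrated"` fallback for out-of-range values are my own:

```python
# Tier floors from the Season 6 table (lower bound of each bracket), highest first.
TIERS = [
    (5390, "Immortal"), (4620, "Divine"), (3850, "Ancient"), (3080, "Legend"),
    (2310, "Archon"), (1540, "Crusader"), (770, "Guardian"), (1, "Herald"),
]

def tier_for(mmr: int) -> str:
    """Map a seasonal MMR value to its rank tier name."""
    for floor, name in TIERS:
        if mmr >= floor:
            return name
    return "Uncalibrated"  # hypothetical fallback; not an official tier
```

For example, `tier_for(3500)` lands in Legend, while anything at or above 5390 is Immortal.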
Wins grant 20–30+ MMR (factoring behavior score and party size), while losses deduct similarly, modulated by RD-style "confidence" in the rating: high-confidence ratings see smaller swings, while low-confidence ratings (new accounts, post-recalibration) move faster.
The evolution has improved matchmaking quality: Pre-Glicko, MMR inflation/deflation skewed queues; now, RD ensures balanced games even for returning players. Community feedback in 2025 praises the system’s fairness, though debates persist on party MMR penalties.
For professional teams, Valve relies on community tools like datDota for Glicko-2 ratings, as official MMR is player-centric.
Glicko-2's volatility (σ) is key here: it models how a team's strength fluctuates with patches (e.g., 7.37 in early 2025 buffed carries, spiking underdog wins). Each rating period, a team's rating is updated as
\[\mu' = \mu + \phi'^2 \sum_{j=1}^{m} g(\phi_j)\{s_j - E(\mu, \mu_j, \phi_j)\}\]with $ \phi' $ obtained from the accompanying σ and φ recalibration, applied over a rolling 6–12 month window for relevance.
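That recalibration step can be sketched as follows, taking the period's variance $ v $, score surplus $ \Delta $, and the new volatility $ \sigma' $ (which Glicko-2 finds via an iterative root-solve, omitted here) as already computed; names are mine:

```python
import math

def recalibrate(mu: float, phi: float, sigma_prime: float, v: float, delta: float):
    """Final Glicko-2 step: inflate φ by the new volatility σ', shrink it by the
    period's information v, then move μ by the damped score surplus."""
    phi_star = math.sqrt(phi ** 2 + sigma_prime ** 2)        # uncertainty growth
    phi_new = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)  # posterior deviation
    mu_new = mu + phi_new ** 2 * (delta / v)  # Δ/v equals Σ g(φ_j)(s_j − E)
    return mu_new, phi_new
```

Converting $ \mu' $ and $ \phi' $ back to the familiar scale (multiply by 173.7178 and add 1500 to the rating) recovers Glicko-style numbers.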
These ratings inform bracket predictions, seeding, and betting, outperforming Elo in volatile metas. datDota’s variants (Glicko-1, Elo-32/64) allow comparisons, but Glicko-2 is the gold standard for pros.
Dota 2’s adoption of Glicko for players and Glicko-2 for teams exemplifies iterative refinement: Elo provided the base, but uncertainty modeling via RD and σ ensures resilience against the game’s 120+ heroes, patches, and team dynamics. As of 2025, no major overhauls are announced, but seasonal tweaks continue to refine confidence intervals.