skelo
The skelo package is an implementation of the Elo and Glicko2 rating systems with a scikit-learn compatible interface.
The skelo package is a simple implementation suitable for small-scale rating systems that fit into memory on a single machine. It’s intended to provide a convenient API for creating Elo/Glicko ratings in a data science & analytics workflow for small games on the scale of thousands of players and millions of matches, primarily as a means of feature transformation in other sklearn pipelines or benchmarking classifier accuracy.
Motivation
What problem does this package solve?
Although many open-source rating system implementations are available, it’s hard to find one that satisfies several criteria:
A simple and clean API that’s convenient for a data-driven model development loop, for which use case the scikit-learn estimator interface is the de facto standard
Explicit management of intervals of validity for ratings, such that as matches occur a timeseries of ratings is evolved for each player (i.e. type-2 data management as opposed to type-1 fire-and-forget ratings)
This package addresses this gap by providing rating system implementations with:
a simple interface for in-memory data management (i.e. storing the ratings as they evolve)
time-aware ratings retrieval (i.e. resolving a player to their respective rating at an arbitrary point in time)
scikit-learn classifier methods to interact with the predictions in a typical data science workflow
Installation
Install via the PyPI package skelo using pip:
pip3 install skelo
License
This project is released under the MIT license. Please see the LICENSE header in the source code for more details.
Quickstart
As a quickstart, we can load and fit an EloEstimator (classifier) on some sample tennis data:
import numpy as np
import pandas as pd
from skelo.model.elo import EloEstimator
df = pd.read_csv("https://raw.githubusercontent.com/JeffSackmann/tennis_atp/master/atp_matches_1979.csv")
labels = len(df) * [1] # the data are ordered as winner/loser
model = EloEstimator(
    key1_field="winner_name",
    key2_field="loser_name",
    timestamp_field="tourney_date",
    initial_time=19781231,
).fit(df, labels)
The ratings data are available as a pandas DataFrame if we wish to do any further analysis on them.
Once fit, we can transform a DataFrame or ndarray of player/player match data into the respective ratings for each player immediately prior to the match:
>>> model.transform(df, output_type='rating')
r1 r2
0 1598.787906 1530.008777
1 1548.633423 1585.653196
2 1548.633423 1598.787906
3 1445.555739 1489.089241
4 1439.595891 1502.254666
... ... ...
3954 1872.284295 1714.108269
3955 1872.284295 1698.007094
3956 1837.623245 1714.108269
3957 1837.623245 1698.007094
3958 1698.007094 1714.108269
[3959 rows x 2 columns]
Alternatively, we could transform a DataFrame into the forecast probabilities of victory for the player “winner_name”:
>>> model.transform(df, output_type='prob')
0 0.597708
1 0.446925
2 0.428319
3 0.437676
4 0.410792
...
3954 0.713110
3955 0.731691
3956 0.670624
3957 0.690764
3958 0.476845
Length: 3959, dtype: float64
These probabilities are also available using the predict_proba or predict classifier methods, as shown below. What distinguishes transform from predict_proba is that predict_proba and predict return predictions that only use past data (i.e. you cannot cheat by leaking future data into the forecast), while transform(X, strict_past_data=False) may be used to compute ratings that “peek” into the future and could return ratings updated using match outcomes pushed (slightly) back in time to the match start timestamp. This is a specific convenience utility for non-forecasting use cases in which the match start time is a more convenient timestamp with which to index and manipulate data.
>>> model.predict_proba(df)
pr1 pr2
0 0.597708 0.402292
1 0.446925 0.553075
2 0.428319 0.571681
3 0.437676 0.562324
4 0.410792 0.589208
... ... ...
3954 0.713110 0.286890
3955 0.731691 0.268309
3956 0.670624 0.329376
3957 0.690764 0.309236
3958 0.476845 0.523155
[3959 rows x 2 columns]
>>> model.predict(df)
0 1.0
1 0.0
2 0.0
3 0.0
4 0.0
...
3954 1.0
3955 1.0
3956 1.0
3957 1.0
3958 0.0
Name: pr1, Length: 3959, dtype: float64
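The difference between strict-past and “peeking” lookups can be illustrated with a small standalone sketch. It does not use skelo: the `history` list and the `rating_at` helper are hypothetical names, and we assume each stored rating becomes valid at the timestamp of the match that produced it.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rating history for one player: (valid_from, rating), sorted by time.
history = [(0, 1500.0), (10, 1510.0), (20, 1495.0)]
times = [t for t, _ in history]


def rating_at(timestamp, strict_past_data=True):
    """Resolve the rating in effect at `timestamp`.

    With strict_past_data=True, only ratings that became valid strictly
    before `timestamp` are considered. Otherwise, a rating that became
    valid exactly at `timestamp` (one that already includes that match's
    outcome) may be returned.
    """
    if strict_past_data:
        idx = bisect_left(times, timestamp) - 1
    else:
        idx = bisect_right(times, timestamp) - 1
    if idx < 0:
        raise ValueError("no rating available before the requested time")
    return history[idx][1]


print(rating_at(10))                          # 1500.0 (pre-match rating)
print(rating_at(10, strict_past_data=False))  # 1510.0 (includes the match at t=10)
```

In this sketch the only difference between the two modes is whether a rating stamped with the query time itself is visible, which is the kind of leakage a forecasting workflow needs to avoid.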
API Reference
Rating Estimators
- class skelo.model.elo.EloEstimator(key1_field=None, key2_field=None, timestamp_field=None, default_k=20, k_fn=None, initial_value=1500, initial_time=0, **kwargs)
A scikit-learn Classifier implementing the Elo rating system.
- __init__(key1_field=None, key2_field=None, timestamp_field=None, default_k=20, k_fn=None, initial_value=1500, initial_time=0, **kwargs)
Construct a classifier object, without fitting it.
- Parameters
key1_field (string) – column name of the player1 key, if fit on a pandas DataFrame
key2_field (string) – column name of the player2 key, if fit on a pandas DataFrame
timestamp_field (string) – column name of the timestamp field, if fit on a pandas DataFrame
- class skelo.model.glicko2.Glicko2Estimator(key1_field=None, key2_field=None, timestamp_field=None, initial_value=(1500.0, 350.0, 0.06), initial_time=0, **kwargs)
A scikit-learn Classifier for creating ratings according to the Glicko2 rating system.
- RATING_MODEL_CLS
alias of Glicko2Model
- __init__(key1_field=None, key2_field=None, timestamp_field=None, initial_value=(1500.0, 350.0, 0.06), initial_time=0, **kwargs)
Construct a classifier object, without fitting it.
- Parameters
key1_field (string) – column name of the player1 key, if fit on a pandas DataFrame
key2_field (string) – column name of the player2 key, if fit on a pandas DataFrame
timestamp_field (string) – column name of the timestamp field, if fit on a pandas DataFrame
Rating Models
- class skelo.model.elo.EloModel(default_k=20, k_fn=None, initial_value=1500, initial_time=0, **kwargs)
Dictionary-based implementation of the Elo rating system.
This class creates a dictionary of Elo ratings for each player inserted into the rating system, such that each match update will append new ratings for the respective match players, calculated according to the Elo update formula.
This model may be used directly, but is primarily intended as a utility class for an EloEstimator.
- __init__(default_k=20, k_fn=None, initial_value=1500, initial_time=0, **kwargs)
Construct an Elo RatingsModel.
- Parameters
default_k (int) – default value of k to use in the Elo update formula if no k_fn is provided
k_fn (callable) – univariate function of a rating that returns a value of k for updates
initial_value (int) – initial default rating value to assign to a new player in the system
initial_time (int or orderable) – the earliest “time” value for matches between players.
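To make the dictionary-based design concrete, here is a minimal standalone sketch of the standard Elo update with append-only per-player histories. It is not the EloModel source: the `ratings` store, `record_match`, and `elo_update` are hypothetical names, and a single fixed k is applied to both players for simplicity (no `k_fn`).

```python
from collections import defaultdict

INITIAL_VALUE = 1500.0  # mirrors the documented default initial_value
DEFAULT_K = 20          # mirrors the documented default_k


def expected_score(r1, r2):
    """Standard Elo expected score of a player rated r1 against one rated r2."""
    return 1.0 / (1.0 + 10 ** ((r2 - r1) / 400.0))


def elo_update(r1, r2, label, k=DEFAULT_K):
    """Return the updated (r1, r2); label is 1 if player 1 won, else 0."""
    delta = k * (label - expected_score(r1, r2))
    return r1 + delta, r2 - delta


# Hypothetical store: player -> list of (valid_from, rating) entries.
ratings = defaultdict(lambda: [(0, INITIAL_VALUE)])


def record_match(p1, p2, label, timestamp):
    """Append new ratings for both players instead of overwriting the old ones."""
    r1 = ratings[p1][-1][1]
    r2 = ratings[p2][-1][1]
    new_r1, new_r2 = elo_update(r1, r2, label)
    ratings[p1].append((timestamp, new_r1))
    ratings[p2].append((timestamp, new_r2))


record_match("alice", "bob", 1, timestamp=1)
print(ratings["alice"])  # [(0, 1500.0), (1, 1510.0)]
print(ratings["bob"])    # [(0, 1500.0), (1, 1490.0)]
```

Keeping every (time, rating) pair rather than only the latest value is what makes it possible to look up a player's rating at an arbitrary point in time later on.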
- class skelo.model.glicko2.Glicko2Model(initial_value=(1500.0, 350.0, 0.06), initial_time=0, **kwargs)
Dictionary-based implementation of the Glicko2 rating system.
This class creates a dictionary of Glicko2 ratings for each player inserted into the rating system, such that each match update will append new ratings for the respective match players, calculated according to the Glicko2 update formula.
This model may be used directly, but is primarily intended as a utility class for a Glicko2Estimator.
- __init__(initial_value=(1500.0, 350.0, 0.06), initial_time=0, **kwargs)
Construct a Glicko2 RatingsModel.
- Parameters
initial_value (float, float, float) – initial default rating, rating deviation, and volatility assigned to a new player
initial_time (int or orderable) – the earliest “time” value for matches between players.
- evolve_rating(r1, r2, label)
Update a Glicko rating based on the outcome of a match.
This is based on the example in the glicko2 package’s unit tests, available here
- static compute_prob(r1, r2)
Return the probability of a player with rating r1 beating a player with rating r2.
For more background, please see the Glicko Paper
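As a rough illustration, the expected-outcome formula from Glickman's description of the Glicko system can be written as below. This is a sketch of that published formula, not a copy of `compute_prob`: whether skelo uses exactly this expression is an assumption, and the function here takes ratings and deviations as separate arguments (the name `glicko_win_probability` is hypothetical).

```python
import math

Q = math.log(10) / 400.0  # scaling constant from the Glicko system


def g(rd):
    """Attenuation factor: shrinks rating differences when deviation is large."""
    return 1.0 / math.sqrt(1.0 + 3.0 * (Q * rd) ** 2 / math.pi ** 2)


def glicko_win_probability(r1, rd1, r2, rd2):
    """Expected score of player 1 against player 2, accounting for both deviations."""
    combined_rd = math.sqrt(rd1 ** 2 + rd2 ** 2)
    return 1.0 / (1.0 + 10 ** (-g(combined_rd) * (r1 - r2) / 400.0))


# Equal ratings give an even match regardless of the deviations.
print(glicko_win_probability(1500.0, 350.0, 1500.0, 350.0))  # 0.5

# The same 200-point gap is less decisive when the ratings are uncertain.
confident = glicko_win_probability(1700.0, 30.0, 1500.0, 30.0)
uncertain = glicko_win_probability(1700.0, 350.0, 1500.0, 350.0)
print(confident > uncertain > 0.5)  # True
```

The attenuation term is the main difference from plain Elo: two players with large rating deviations are forecast closer to even odds than their nominal rating gap alone would suggest.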
Utilities
- skelo.utils.elo_data.sigmoid_k_fn_builder(r1, r2, k1, k2)
Build a function that returns an Elo k value that is univariate in the rating value. This function builder creates a sigmoid-shaped function that decreases approximately linearly from value k1 to k2 between rating values r1 and r2 (r2 > r1).
The function shape is illustrated in the schematic below:
          r1
k1 _______
          \
           \
            \________ k2
            r2
Such a function shape provides a decreasing value of k in accordance with the recommendation by Arpad Elo that better (i.e. higher ranked) players have a lower value of k and therefore less volatility in their ratings updates after a match.
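One way such a k function could be built is with a logistic curve centred between r1 and r2. The sketch below is not the library's implementation: `sketch_sigmoid_k_fn` and its `steepness` parameter are hypothetical, chosen only to show the shape described above.

```python
import math


def sketch_sigmoid_k_fn(r1, r2, k1, k2, steepness=8.0):
    """Build k(rating) that falls from about k1 (below r1) to about k2 (above r2).

    `steepness` is a hypothetical knob controlling how sharply the curve
    flattens outside the interval [r1, r2].
    """
    if r2 <= r1:
        raise ValueError("r2 must be greater than r1")
    midpoint = (r1 + r2) / 2.0
    scale = steepness / (r2 - r1)

    def k_fn(rating):
        # Logistic weight: close to 0 well below r1, close to 1 well above r2.
        weight = 1.0 / (1.0 + math.exp(-scale * (rating - midpoint)))
        return k1 + (k2 - k1) * weight

    return k_fn


k_fn = sketch_sigmoid_k_fn(r1=1500, r2=2400, k1=32, k2=16)
print(k_fn(1950))  # 24.0, halfway between k1 and k2 at the midpoint rating
```

A callable of this shape takes a single rating and returns a k value, which matches the documented contract of the `k_fn` argument (a univariate function of a rating).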
- skelo.utils.elo_data.generate_ratings(num_players, num_timesteps, mu=1500, sigma=1, seed=1)
Create an array of player skill ratings that follow a random walk at each timestamp. The 2-dimensional array returned will have shape (num_players, num_timesteps), where each row represents the timeseries of skill for that respective player.
- Parameters
num_players (int) – the number of players to simulate
num_timesteps (int) – the number of timesteps to simulate
mu (float) – the
- Returns
A (num_players, num_timesteps) array of ratings
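A per-player random walk of the kind described above can be sketched with the standard library alone. This is not the library's code: it returns nested lists rather than an array, `random_walk_ratings` is a hypothetical name, and it assumes that mu is the starting rating and sigma the standard deviation of each step, which the parameter documentation above does not confirm.

```python
import random


def random_walk_ratings(num_players, num_timesteps, mu=1500.0, sigma=1.0, seed=1):
    """Return num_players rows, each a random-walk series of num_timesteps ratings.

    Assumption: every player starts at `mu` and moves by an independent
    Gaussian step with standard deviation `sigma` at each timestep.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    ratings = []
    for _ in range(num_players):
        series = []
        value = mu
        for _ in range(num_timesteps):
            series.append(value)
            value += rng.gauss(0.0, sigma)
        ratings.append(series)
    return ratings


walks = random_walk_ratings(num_players=3, num_timesteps=5)
print(len(walks), len(walks[0]))  # 3 5
```

Each row is one player's skill over time, so simulated match outcomes can be drawn from the difference between two rows at a given timestep.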
Examples
More usage examples, including using sklearn cross-validation routines to tune Elo hyperparameters, are available in the project repository’s README.