Introduction
I enjoy playing on Chess.com and Lichess under the moniker iEnjoySomeChess. (Challenge me!) I’m an amateur chess player, but I’ve been following high-level chess tournaments since 2016, when Magnus Carlsen defended his world championship title against Sergey Karjakin in New York City. I began playing on Chess.com and Lichess in 2016 and 2017, respectively, so I have years of data, which I will analyze using tidyverse, lubridate, and chessR in RStudio.
Download my .Rmd file here.
chessR: Extracting Data
I’m using the chessR package to extract data from Chess.com and Lichess. The chessR functions query each site’s API, which is rate limited; the limit is unknown for Chess.com, but the limit is 15 games per second for Lichess. Then, I write the data to CSV files, so that I can quickly load the data in the future.
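library(tidyverse)  # provides write_csv() via readr
library(chessR)     # provides get_game_data() and get_raw_lichess()

# Chess.com: download every game for the user, then cache it as a CSV
my_chesscom_data_raw <- get_game_data(usernames = 'iEnjoySomeChess')
write_csv(my_chesscom_data_raw,
          file = 'my_chesscom_data_raw.csv',
          na = 'NA')

# Lichess: same idea, using the Lichess-specific function
my_lichess_data_raw <- get_raw_lichess(player_names = 'iEnjoySomeChess')
write_csv(my_lichess_data_raw,
          file = 'my_lichess_data_raw.csv',
          na = 'NA')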
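Since the API is rate limited, it can help to hit it only when no cached CSV exists yet. The sketch below is illustrative rather than the code from the .Rmd file: load_or_fetch() and fake_fetch() are hypothetical helpers, and fake_fetch() stands in for a real API call so the example runs offline.

```r
library(readr)

# Hypothetical helper: return the cached CSV if it exists; otherwise call
# `fetch` (any zero-argument function), cache its result, and return it.
load_or_fetch <- function(path, fetch) {
  if (file.exists(path)) {
    return(read_csv(path, show_col_types = FALSE))
  }
  data <- fetch()
  write_csv(data, file = path, na = "NA")
  data
}

# Toy stand-in for an API call; it counts how often it is invoked.
calls <- 0
fake_fetch <- function() {
  calls <<- calls + 1
  data.frame(id = 1:3, result = c("1-0", "0-1", "1/2-1/2"))
}

cache_path <- tempfile(fileext = ".csv")
first  <- load_or_fetch(cache_path, fake_fetch)  # calls fake_fetch()
second <- load_or_fetch(cache_path, fake_fetch)  # reads the cached CSV
```

With real data, the fetch argument could be something like function() get_game_data(usernames = 'iEnjoySomeChess'), so the download only happens the first time.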
Cleaning the Data
Chess.com
The data extracted from Chess.com has 36 variables, and each row has information about a single game. I keep, rename, and edit the core variables, and I create other useful variables:
- site: In this case, Chess.com.
- event: The time category determined by time_control. For example, bullet (fast), blitz (fast but slower than bullet), rapid (slow and slower than blitz).
- time_control: The time allotted to each player, with the number of minutes and the increment (time in seconds added to a player’s clock after each move). For example, 10+0.
- white: The player who played with the white pieces. For example, iEnjoySomeChess.
- white_elo: The rating of the player who played with the white pieces.
- black: The player who played with the black pieces. For example, iEnjoySomeChess.
- black_elo: The rating of the player who played with the black pieces.
- rating_difference: The absolute value of the rating difference between white_elo and black_elo.
- termination: The ending of the game. For example, time, checkmate, resignation, among other values.
- n_moves: The number of moves in the game, where one move by each player counts together as a single move.
- result: The result with format “white-black”, where a loss is zero points, a draw is half of a point, and a win is one point. For example, 1-0.
- opening: The opening, if the opening was recognized as a book opening on Chess.com. For example, Pawn-Opening-Accelerated-London-System.
- ECO: The code (a shorthand) for the opening, if the opening was recognized as a book opening on Chess.com. For example, A00.
- opponent: The opponent of the user whose games you requested.
- opponent_rating: The rating of the opponent of the user whose games you requested.
- user: The user whose games you requested. In this case, iEnjoySomeChess.
- user_color: The color of the pieces played by the user whose games you requested.
- user_result: The result of the game from the perspective of user. For example, win, loss, or draw.
- user_rating: The rating of the user whose games you requested.
- time_UTC: The date and time (UTC) at the beginning of the game. For example, 2022-08-23 21:22:52.
- end_time_UTC: The date and time (UTC) at the end of the game. For example, 2022-08-23 21:36:19.
- time_CDT: The date and time (CDT) at the beginning of the game. For example, 2022-08-23 16:22:52.
- end_time_CDT: The date and time (CDT) at the end of the game. For example, 2022-08-23 16:36:19.
- previous_time_CDT: The date and time (CDT) at the beginning of the previous game played by the user. For example, 2022-08-23 16:06:06.
- time_diff: The difference in time between previous_time_CDT and time_CDT.
- hour_interval: The hour (CDT) during which the game was played. For example, 16:00 if the game was played at 2022-08-23 16:06:06.
- link: The link to the game on Chess.com.
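To make the user-centric variables concrete, here is a minimal sketch of how they could be derived from the white/black columns. It is one possible approach and not necessarily the code in the .Rmd file; the three toy rows reuse values from the output shown in this section, and the name me is a placeholder introduced for illustration.

```r
library(dplyr)

# Toy rows mirroring the first three games in the glimpse() output.
games <- tibble(
  white     = c("Piercepoo17", "iEnjoySomeChess", "abdfghjk"),
  black     = c("iEnjoySomeChess", "Piercepoo17", "iEnjoySomeChess"),
  white_elo = c(1818, 1804, 699),
  black_elo = c(1804, 1818, 1804),
  result    = c("1/2-1/2", "0-1", "0-1")
)
me <- "iEnjoySomeChess"  # placeholder for the requested user

games_user <- games %>%
  mutate(
    user              = me,
    user_color        = if_else(white == me, "white", "black"),
    opponent          = if_else(white == me, black, white),
    user_rating       = if_else(white == me, white_elo, black_elo),
    opponent_rating   = if_else(white == me, black_elo, white_elo),
    rating_difference = abs(white_elo - black_elo),
    # The user wins when "white won" and "user is white" are both TRUE
    # or both FALSE; draws are handled first.
    user_result = case_when(
      result == "1/2-1/2" ~ "draw",
      (result == "1-0") == (user_color == "white") ~ "win",
      TRUE ~ "loss"
    )
  )
```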
my_chesscom_data %>%
glimpse()
## Rows: 7,423
## Columns: 28
## $ site <chr> "Chess.com", "Chess.com", "Chess.com", "Chess.com", …
## $ event <fct> rapid, rapid, rapid, rapid, rapid, rapid, blitz, rap…
## $ time_control <chr> "10+0", "10+0", "10+0", "10+0", "10+0", "10+0", "3+2…
## $ date <date> 2022-08-23, 2022-08-23, 2022-03-04, 2022-01-14, 202…
## $ white <chr> "Piercepoo17", "iEnjoySomeChess", "abdfghjk", "iEnjo…
## $ white_elo <dbl> 1818, 1804, 699, 1804, 1742, 1778, 1794, 1798, 1757,…
## $ black <chr> "iEnjoySomeChess", "Piercepoo17", "iEnjoySomeChess",…
## $ black_elo <dbl> 1804, 1818, 1804, 1693, 1795, 1784, 1642, 1771, 1774…
## $ rating_difference <dbl> 14, 14, 1105, 111, 53, 6, 152, 27, 17, 2, 37, 30, 1,…
## $ termination <chr> "repetition", "checkmate", "checkmate", "abandonment…
## $ n_moves <dbl> 24, 31, 17, 56, 6, 69, 21, 29, 15, 34, 32, 38, 52, 4…
## $ result <chr> "1/2-1/2", "0-1", "0-1", "1-0", "0-1", "0-1", "1-0",…
## $ opening <chr> "Geet-Opening-1...d5-2.d3-Nf6", "Gambit-Declined-Que…
## $ ECO <chr> "A00", "D31", "D00", "D13", "D02", "B21", "D85", "B5…
## $ opponent <chr> "Piercepoo17", "Piercepoo17", "abdfghjk", "Timurka55…
## $ opponent_rating <dbl> 1818, 1818, 699, 1693, 1742, 1778, 1642, 1798, 1774,…
## $ user <chr> "iEnjoySomeChess", "iEnjoySomeChess", "iEnjoySomeChe…
## $ user_color <chr> "black", "white", "black", "white", "black", "black"…
## $ user_result <chr> "draw", "loss", "win", "win", "win", "win", "win", "…
## $ user_rating <dbl> 1804, 1804, 1804, 1804, 1795, 1784, 1794, 1771, 1757…
## $ time_UTC <dttm> 2022-08-23 21:22:52, 2022-08-23 21:06:06, 2022-03-0…
## $ end_time_UTC <dttm> 2022-08-23 21:36:19, 2022-08-23 21:22:41, 2022-03-0…
## $ time_CDT <dttm> 2022-08-23 16:22:52, 2022-08-23 16:06:06, 2022-03-0…
## $ end_time_CDT <dttm> 2022-08-23 16:36:19, 2022-08-23 16:22:41, 2022-03-0…
## $ previous_time_CDT <dttm> 2022-08-23 16:06:06, 2022-03-04 22:58:12, 2022-01-1…
## $ time_diff <Period> 16M 46S, 5m 18d 17H 7M 54S, 1m 18d 1H 9M 59S, 1d …
## $ hour_interval <chr> "16:00", "16:00", "22:00", "21:00", "16:00", "14:00"…
## $ link <chr> "https://www.chess.com/game/live/55028774859", "http…
Lichess
The data extracted from Lichess has 23 variables, and each row has information about a single game. Again, I keep, rename, and edit the core variables, and I create other useful variables, with significant overlap with Chess.com; however, Lichess lacks n_moves, end_time_UTC, and end_time_CDT, while having some additional information:
- white_rating_change: The rating change of the player with the white pieces following the termination of the game.
- black_rating_change: The rating change of the player with the black pieces following the termination of the game.
- user_rating_change: The rating change of the user whose games you requested following the termination of the game.
my_lichess_data %>%
glimpse()
## Rows: 8,009
## Columns: 28
## $ site <chr> "Lichess", "Lichess", "Lichess", "Lichess", "Liche…
## $ event <fct> rapid, rapid, rapid, rapid, rapid, rapid, bullet, …
## $ time_control <chr> "10+0", "10+0", "10+0", "10+0", "10+0", "10+0", "2…
## $ date <date> 2022-10-12, 2022-10-11, 2022-10-11, 2022-10-11, 2…
## $ white <chr> "iEnjoySomeChess", "Ibrahim780", "CleverRocky20", …
## $ white_elo <dbl> 2175, 2137, 2245, 2172, 2077, 2183, 1893, 1887, 18…
## $ black <chr> "adamnani", "iEnjoySomeChess", "iEnjoySomeChess", …
## $ black_elo <dbl> 2137, 2177, 2190, 2219, 2193, 2028, 1938, 1905, 18…
## $ rating_difference <dbl> 38, 40, 55, 47, 116, 155, 45, 18, 6, 93, 22, 3, 6,…
## $ termination <chr> "Normal", "Normal", "Time Forfeit", "Normal", "Nor…
## $ result <chr> "0-1", "1/2-1/2", "1-0", "1-0", "1-0", "1-0", "1-0…
## $ white_rating_change <dbl> -15, 1, 5, 18, 8, 10, 7, 6, 6, -4, -6, 6, -6, 15, …
## $ black_rating_change <dbl> 6, -2, -13, -6, -21, -3, -6, -6, -6, 5, 6, -6, 6, …
## $ opening <chr> "Nimzo-Indian Defense: Classical Variation", "Sici…
## $ ECO <chr> "E32", "B90", "B23", "D37", "D03", "E32", "D20", "…
## $ opponent <chr> "adamnani", "Ibrahim780", "CleverRocky20", "Divisi…
## $ opponent_rating <dbl> 2137, 2137, 2245, 2219, 2077, 2028, 1938, 1905, 18…
## $ user <chr> "iEnjoySomeChess", "iEnjoySomeChess", "iEnjoySomeC…
## $ user_color <chr> "white", "black", "black", "white", "black", "whit…
## $ user_result <chr> "loss", "draw", "loss", "win", "loss", "win", "win…
## $ user_rating <dbl> 2175, 2177, 2190, 2172, 2193, 2183, 1893, 1887, 18…
## $ user_rating_change <dbl> -15, -2, -13, 18, -21, 10, 7, 6, -6, -4, 6, 6, 6, …
## $ time_UTC <dttm> 2022-10-12 14:33:07, 2022-10-11 18:22:08, 2022-10…
## $ time_CDT <dttm> 2022-10-12 09:33:07, 2022-10-11 13:22:08, 2022-10…
## $ previous_time_CDT <dttm> 2022-10-11 13:22:08, 2022-10-11 09:34:08, 2022-10…
## $ time_diff <Period> 20H 10M 59S, 3H 48M 0S, 1H 37M 24S, 12H 14M 46S…
## $ hour_interval <chr> "09:00", "13:00", "09:00", "07:00", "19:00", "19:0…
## $ link <chr> "https://lichess.org/bQmZJs4r", "https://lichess.o…
Analysis
While I want to compare data between Chess.com and Lichess, I first want to perform separate analysis for Chess.com and Lichess. I will analyze five main questions:
- How do my games end when I lose, draw, and win?
- How often do I play with the white pieces, and how often do I play with the black pieces?
- On average, how much rating do I gain/lose according to color of pieces? According to month?
- What is my longest break from playing on Chess.com? Lichess?
- What times of day am I most active?
After I perform separate analysis for Chess.com and Lichess, I will combine the data sets, and I will perform a combined analysis for Chess.com and Lichess.
Chess.com
Game Termination
Q: “How do I win games?”
First, let’s check how many games I’ve won in bullet, blitz, and rapid. (Note: I have only played bullet, blitz, and rapid games on Chess.com, but I’ve played more kinds of games on Lichess, including classical and chess960. For the most part, I limit my attention to bullet, blitz, and rapid.)
Event | Won Games | Total Games | Win Percent |
---|---|---|---|
bullet | 1861 | 3682 | 0.5054318 |
blitz | 1746 | 3569 | 0.4892127 |
rapid | 92 | 172 | 0.5348837 |
I win a significant portion (74.4%) of bullet games by time, which makes sense given that bullet is faster than blitz and rapid, where time is less of a factor. It’s interesting that abandonment becomes more important in blitz and rapid (with values of 1.83% and 2.17%, respectively), as opponents may start a game, but leave due to the time commitment. Lastly, whereas my blitz wins are spread fairly evenly across termination types, I win a significant portion (71.7%) of rapid games by resignation.
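The percentages above are shares of won games within each event, broken down by termination. A minimal sketch of that calculation follows, using made-up toy rows rather than my real games; the column names games and share are introduced here for illustration.

```r
library(dplyr)

# Toy data: a handful of games, not real results.
games <- tibble(
  event       = c("bullet", "bullet", "bullet", "bullet",
                  "rapid", "rapid", "rapid"),
  user_result = c("win", "win", "win", "loss", "win", "win", "draw"),
  termination = c("time", "time", "checkmate", "time",
                  "resignation", "resignation", "agreed draw")
)

win_breakdown <- games %>%
  filter(user_result == "win") %>%                 # keep only won games
  count(event, termination, name = "games") %>%    # wins per event and ending
  group_by(event) %>%
  mutate(share = games / sum(games)) %>%           # share within each event
  ungroup()
```

Swapping "win" for "loss" or "draw" in the filter gives the corresponding breakdowns for lost and drawn games.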
Q: “How do I lose games?”
First, let’s see how many games I’ve lost across bullet, blitz, and rapid.
Event | Lost Games | Total Games | Loss Percent |
---|---|---|---|
bullet | 1731 | 3682 | 0.4701249 |
blitz | 1624 | 3569 | 0.4550294 |
rapid | 57 | 172 | 0.3313953 |
I lose a significant portion (61.9%) of bullet games by time, which, again, makes sense given that bullet is faster than blitz and rapid, where time is less of a factor. Further, whereas my blitz wins were spread more evenly across termination types, I lose a large portion (56.2%) of blitz games by time.
Q: “How do I draw games?”
First, let’s see how many games I’ve drawn across bullet, blitz, and rapid.
Event | Drawn Games | Total Games | Draw Percent |
---|---|---|---|
bullet | 90 | 3682 | 0.0244432 |
blitz | 199 | 3569 | 0.0557579 |
rapid | 23 | 172 | 0.1337209 |
I draw a large portion (57.8%) of my bullet games by timeout vs. insufficient material, which indicates that in my drawn bullet games there are usually too few pieces left on the board to deliver checkmate. I rarely draw games by the 50-move rule. Further, my blitz draws are spread fairly evenly across termination types, whereas I draw a large portion (47.8%) of my rapid games by agreed draw.
Color of Pieces
Q: “Do I play with the white or black pieces more?”
I would expect to play with the white and black pieces with equal probability, that is, 50% each, which is approximately true, though for rapid I tend to play with the black pieces (52.9%) more than the white pieces (47.1%).
Rating Gain/Loss
As stated in Cleaning the Data, the data for Chess.com does not have information about rating change, but we can create a user_rating_change column, just like Lichess, since we have a complete record of user games, which includes user_rating. (Note: Chess.com and Lichess provide separate ratings for different values of event; thus, we have to sort by event, and subsequently time_CDT, before finding the difference between successive values of user_rating.)
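The sort-then-difference step can be sketched as follows. This is a minimal illustration on toy ratings, assuming user_rating is the rating recorded for each game in chronological order; the first game in each event has no predecessor, so its change is NA.

```r
library(dplyr)

# Toy data: interleaved blitz and rapid games with made-up ratings.
games <- tibble(
  event = c("blitz", "rapid", "blitz", "rapid", "blitz"),
  time_CDT = as.POSIXct(
    c("2022-01-01 10:00:00", "2022-01-01 11:00:00", "2022-01-02 10:00:00",
      "2022-01-03 09:00:00", "2022-01-03 10:00:00"),
    tz = "America/Chicago"
  ),
  user_rating = c(1700, 1790, 1708, 1804, 1699)
)

rating_changes <- games %>%
  arrange(event, time_CDT) %>%   # order games within each rating pool
  group_by(event) %>%            # never difference across events
  mutate(user_rating_change = user_rating - lag(user_rating)) %>%
  ungroup()
```

Grouping by event before calling lag() matters: without it, the first rapid game would be compared against the last blitz game, which mixes two unrelated rating pools.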
Q: “On average, how much rating do I gain/lose with the black and white pieces?”
I would expect to gain more rating with the white pieces, rather than the black pieces, since white plays the first opening move, and thus has the first chance to create an advantage, so it’s surprising that I gain more rating (it’s even surprising that I don’t lose rating) with the black pieces. Further, I tend to gain more rating from rapid than I gain from bullet or blitz. I have not played as many rapid games as bullet and blitz games, so perhaps my rating is catching up to my skill level.
Q: “On average, how much rating do I gain/lose in a given month?”
I tend to lose rating around February, whereas other months are mostly consistent, or I gain rating. I don’t have an explanation for the rating loss in February.
Number of Moves
The extracted data from Chess.com has information about the number of moves (n_moves) for each game. In addition to the five main questions stated under Analysis, I include the following question:
Q: “How long are my games in number of moves?”
My games are typically between 30 and 50 moves, with some notable outliers. The outlier game with 142 moves is a game I played with a friend at UW-Madison, where I walked my king, being pursued by a queen, from h8 to b1 to h8, and I went on to win the game!
Longest Break
Q: “What’s the longest break I’ve taken from playing on Chess.com?”
Previous Game Time (CDT) | Next Game Time (CDT) | Time Difference |
---|---|---|
2016-04-13 18:35:49 | 2016-10-26 16:58:46 | 6m 12d 22H 22M 57S |
My longest break from Chess.com was 6 months and 12 days, from April 13, 2016 to October 26, 2016. Considering I started playing on Chess.com in January 2016, that break was probably a natural waning of interest, since I wasn’t as enamored with chess then as I am now.
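For reference, a break like this can be found by converting start times to Central time, pairing each game with the previous one, and keeping the largest gap. The sketch below is one way to do it, not necessarily the code in the .Rmd file; the first two timestamps are the ones from the table above converted to UTC, the third is a made-up game, and gap_secs is a helper column introduced for illustration.

```r
library(dplyr)
library(lubridate)

games <- tibble(
  time_UTC = ymd_hms(
    c("2016-04-13 23:35:49",   # 18:35:49 CDT, from the table above
      "2016-10-26 21:58:46",   # 16:58:46 CDT, from the table above
      "2016-10-27 01:10:00"),  # made-up later game
    tz = "UTC"
  )
)

breaks <- games %>%
  mutate(time_CDT = with_tz(time_UTC, "America/Chicago")) %>%
  arrange(time_CDT) %>%
  mutate(
    previous_time_CDT = lag(time_CDT),
    gap_secs = as.numeric(difftime(time_CDT, previous_time_CDT, units = "secs"))
  )

longest_break <- breaks %>%
  filter(!is.na(gap_secs)) %>%      # the first game has no previous game
  slice_max(gap_secs, n = 1) %>%
  # Express the gap as a calendar-style period (months, days, hours, ...).
  mutate(time_diff = as.period(interval(previous_time_CDT, time_CDT)))
```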
Time of Day
Q: “What times of day am I most active?”
I have not played any games at 23:00 or 00:00, and I play a lot of games in the evening, between 16:00 and 19:00. I tend to play rapid games either in the morning (08:00 to 09:00) or the evening (16:00 to 19:00).
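The activity counts behind this kind of summary come from bucketing each game into its starting hour. A minimal sketch, using a few toy timestamps and assuming time_CDT is already stored in Central time (the games column name is introduced here for illustration):

```r
library(dplyr)

games <- tibble(
  event = c("rapid", "rapid", "blitz", "blitz", "blitz"),
  time_CDT = as.POSIXct(
    c("2022-08-23 16:22:52", "2022-08-23 16:06:06", "2022-03-04 22:58:12",
      "2022-01-14 21:10:00", "2022-01-15 16:45:00"),
    tz = "America/Chicago"
  )
)

games_by_hour <- games %>%
  # "%H:00" keeps the hour and zeroes the minutes, e.g. 16:06:06 -> "16:00"
  mutate(hour_interval = format(time_CDT, "%H:00")) %>%
  count(event, hour_interval, name = "games")
```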
Lichess
Game Termination
The exported Lichess data only distinguishes between Normal and Time Forfeit for the variable termination, so we don’t have the same breakdown as Chess.com. As such, I exclude classical (longer time control), since Time Forfeit is not a strong factor for classical. (Note: Again, I did not play classical games on Chess.com.)
Q: “How do I win games?”
First, let’s see how many games I’ve won across bullet, blitz, and rapid.
Event | Won Games | Total Games | Win Percent |
---|---|---|---|
bullet | 1632 | 3452 | 0.4727694 |
blitz | 1932 | 4118 | 0.4691598 |
rapid | 78 | 158 | 0.4936709 |
Again, I win a large portion (52.5%) of bullet games by Time Forfeit, which makes sense given that bullet is faster than blitz and rapid, where time is less of a factor, as evidenced by the decreasing percentage of Time Forfeit from bullet to blitz, and from blitz to rapid.
Q: “How do I lose games?”
First, let’s see how many games I’ve lost across bullet, blitz, and rapid.
Event | Lost Games | Total Games | Loss Percent |
---|---|---|---|
bullet | 1651 | 3452 | 0.4782735 |
blitz | 1917 | 4118 | 0.4655172 |
rapid | 69 | 158 | 0.4367089 |
Whereas I won fewer and fewer games by Time Forfeit from bullet to blitz to rapid, I lose a significant portion (62.3%) of rapid games by Time Forfeit. As I play rapid games with a time_control of 10+0 (no increment, that is, no time added after each move), I need to work on my time management.
Q: “How do I draw games?”
First, let’s see how many games I’ve drawn across bullet, blitz, and rapid.
Event | Drawn Games | Total Games | Draw Percent |
---|---|---|---|
bullet | 169 | 3452 | 0.0489571 |
blitz | 269 | 4118 | 0.0653230 |
rapid | 11 | 158 | 0.0696203 |
The breakdown of drawn games on Lichess isn’t too useful, since it’s rare to draw games by Time Forfeit, unless too few pieces remain on the board to deliver checkmate, which appears to occur mostly for bullet games on Lichess. We saw the same trend on Chess.com.
Color of Pieces
Q: “Do I play with the white or black pieces more?”
Again, I would expect to play with the white and black pieces with equal probability, that is, 50% each, which is approximately true, though for rapid I tend to play with the white pieces (54.4%) more than the black pieces (45.6%); more surprising, I have only played 8 classical games, of which 6 (75.0%) were with the black pieces and 2 (25.0%) were with the white pieces.
Rating Gain/Loss
Q: “On average, how much rating do I gain/lose with the black and white pieces?”
I tend to lose rating with the black pieces on Lichess, whereas I tend to gain rating with the white pieces on Lichess. Compared to Chess.com, this is more consistent with my expectations.
Q: “On average, how much rating do I gain/lose in a given month?”
I tend to lose rating during May, June, and July, whereas I tend to gain rating in other months. What’s up with summer?
Longest Break
Previous Game Time (CDT) | Next Game Time (CDT) | Time Difference |
---|---|---|
2022-03-07 16:33:05 | 2022-07-14 08:10:09 | 4m 6d 15H 37M 4S |
My longest break from Lichess was 4 months and 6 days, from March 7, 2022 to July 14, 2022. Really, this year? I went to Alaska in June; otherwise, I don’t have an explanation for the break. Zugzwang with my courses (quantum mechanics, thermal physics, and laboratory), perhaps?
Time of Day
Q: “What times of day am I most active?”
I tend to play rapid games on Lichess much earlier in the day (07:00, 08:00) than I do bullet or blitz games. Otherwise, the times are somewhat uniform during midday, and they drop off around nighttime hours.
Comparison
Now, I will combine the data and perform a combined analysis for Chess.com and Lichess. I drop the columns unique to each data set; then I bind the data sets, so there is one large data set. (Note: The percentages will now be over all games, from both Chess.com and Lichess.)
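Dropping the site-specific columns and binding the rest can be sketched like this. The two toy tables below are tiny stand-ins for the cleaned data sets, and shared_columns is a name introduced for illustration; the actual .Rmd file may select columns differently.

```r
library(dplyr)

# Toy stand-ins: each has one column the other lacks.
chesscom <- tibble(site = "Chess.com", event = c("rapid", "blitz"),
                   user_rating = c(1804, 1794), n_moves = c(24, 21))
lichess <- tibble(site = "Lichess", event = c("rapid", "bullet"),
                  user_rating = c(2175, 1893), user_rating_change = c(-15, 7))

# Keep only the columns present in both data sets...
shared_columns <- intersect(names(chesscom), names(lichess))

# ...then stack the rows into one data set.
all_games <- bind_rows(
  select(chesscom, all_of(shared_columns)),
  select(lichess, all_of(shared_columns))
)
```

Because site is one of the shared columns, every later summary can still be split by site with group_by(site).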
Start Date/Total Games
Q: “When did I start playing on either site?”
Site | Earliest Game Time (CDT) |
---|---|
Chess.com | 2016-01-03 01:35:30 |
Lichess | 2017-07-19 12:24:50 |
Q: “How many games have I played on each site? On which site have I played more games?”
Site | Games Played |
---|---|
Chess.com | 7423 |
Lichess | 8009 |
I started playing on Chess.com, but I now play primarily on Lichess. My total number of games played on Lichess has surpassed my total number of games played on Chess.com.
Time of Day
Q: “What times of day am I most active by site?”
Rating
Q: “What is my highest rating achieved on either site by event?”
Site | Event | Highest Rating |
---|---|---|
Chess.com | bullet | 1677 |
Lichess | bullet | 2010 |
Chess.com | blitz | 1794 |
Lichess | blitz | 2129 |
Chess.com | rapid | 1804 |
Lichess | rapid | 2194 |
Lichess has higher ratings than Chess.com, and on average my rating is around 300-400 points lower on Chess.com.
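The size of the gap can be checked directly from the table above. This small sketch re-enters the six peak ratings by hand and subtracts them per event; note that it compares peak ratings, which is only a rough proxy for the average gap.

```r
library(dplyr)

# Peak ratings copied from the table above.
highest <- tibble(
  site  = rep(c("Chess.com", "Lichess"), times = 3),
  event = rep(c("bullet", "blitz", "rapid"), each = 2),
  highest_rating = c(1677, 2010, 1794, 2129, 1804, 2194)
)

gaps <- highest %>%
  group_by(event) %>%
  summarise(
    gap = highest_rating[site == "Lichess"] -
          highest_rating[site == "Chess.com"],
    .groups = "drop"
  )
```

The resulting gaps are 333 points for bullet, 335 for blitz, and 390 for rapid.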
Conclusion
As always, there’s a lot of data, and there’s a lot more to do with the data. For example, I could generate further insights by filtering the data according to year, which could be useful, since my chess skills have improved over time, and thus my above analysis might not reflect the most recent trends in my chess games. Nonetheless, the above analysis provided useful insights, and I plan to update and revisit my chess data in the future.