Team Chess Analytics: Chess ‘Moneyball’

“I’m not worried about Daryl Morey. He’s one of those idiots who believe in analytics”

Charles Barkley

Maximising score

Team chess is fun: you play alongside your friends and team-mates and compete together. It is also one of the most competitive forms of chess. The Olympiad is played every two years between countries in a 4v4 format, and the German Schachbundesliga, played in an 8v8 format, is one of the strongest leagues in the world. Online, the Pro Chess League is the most prominent club competition. According to many top players, the strategy is quite straightforward: be solid with black and try to win with white.

Analytics has changed many sports. In basketball, for example, a 2-point shot (around 48% chance of success) brings on average 2×0.48 = 0.96 points, while a 3-point shot (around 35% chance of success) brings 3×0.35 = 1.05 points on average. This led to a radical change in the way the game is played at the highest level.

Almost nothing exists in terms of analytics for team chess: can we do anything? Can maths and numerical ‘Monte-Carlo’ simulation give some insight into ways to play or set up teams to maximise results?

The setup for team chess

Team chess is usually played over multiple rounds, either as a Swiss system or as a round robin. Games over N boards are played simultaneously, and the individual scores (0 for a loss, 0.5 for a draw and 1 for a win) are added up to a sum S. The team then scores T points: if S > N/2 the team gets T = 1 point, if S = N/2 it gets T = 0.5 points, and if S < N/2 it gets T = 0 points.
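To make the scoring rule concrete, here is a minimal Python sketch of the mapping from S to T. The function name `team_score` is my own choice for illustration; the loop simply tabulates T for every possible S in a 4-board match.

```python
def team_score(s, n):
    """Team score T for a sum of board scores s over n boards."""
    if 2 * s > n:      # S > N/2: match won
        return 1.0
    if 2 * s == n:     # S = N/2: match drawn
        return 0.5
    return 0.0         # S < N/2: match lost


if __name__ == "__main__":
    n_boards = 4
    # S can take any half-point value between 0 and N.
    for half_points in range(2 * n_boards + 1):
        s = half_points / 2
        print(f"S = {s:.1f} -> T = {team_score(s, n_boards)}")
```

The table this prints is a step function: winning 2.5-1.5 is worth exactly as much to the team as winning 4-0.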

This is the key point: the team score T is a non-linear transform of the sum of scores S. This was my starting point for thinking about the whole problem, which has not been properly investigated before. Once more: S is the sum of the individual scores, but the team score T is not!

Let’s illustrate the way the team score is calculated with a simple plot for, say, a 4-player team:

We will aim to maximise the expected value of T, written EV[T], so that the team gets the most points. To do this we first need to model the individual games of the players.

Modelling the individual results: Elo is not enough 

The basis for modelling the expected score between two chess players is the Elo model. If you can’t remember the formula by heart, here is my favourite way of remembering and understanding it. Give each player a winning strength Q_i = 10^(r_i/400), where r_i is the player’s rating, and the expected score (between zero and one) is your ‘proportion’ of the total strength:

EV_i = Q_i / (Q_i + Q_j) 

We can multiply the numerator and denominator by 10^(-r_i/400) and we obtain:

EV_i = 1 / (1 + 10^((r_j – r_i)/400))

This formula depends only on the difference of the ratings! There can be considerable discussion about how good Elo is as a model of scores, but this is beside the point here: it is a good enough model of expected scores for the current discussion.
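The two forms of the Elo formula can be checked against each other in a few lines of Python. This is only a sketch; the function names are mine.

```python
def expected_score_from_strengths(rating_i, rating_j):
    """Expected score of player i as a share of the total 'strength'."""
    q_i = 10 ** (rating_i / 400)
    q_j = 10 ** (rating_j / 400)
    return q_i / (q_i + q_j)


def expected_score(rating_i, rating_j):
    """Same quantity, written in terms of the rating difference only."""
    return 1 / (1 + 10 ** ((rating_j - rating_i) / 400))


if __name__ == "__main__":
    # Both forms agree, and shifting both ratings by the same amount changes nothing.
    print(expected_score_from_strengths(2600, 2500))
    print(expected_score(2600, 2500))
    print(expected_score(2100, 2000))
```

All three lines print the same value, which is the ‘only the difference matters’ property in action.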

Now as you may have noted, a chess game can end in a draw. The draw gives 0.5 points, and the win one point. So the expected value is:

EV = 1×Pw + 0.5×Pd + 0×Pl = Pw + Pd/2

Pw and Pd are the probabilities of a win and a draw respectively. But here is the catch: the Elo model does _not_ give us the proportion of draws, and we have to assume it externally. In particular, it might be very different for different time controls, or depend on the level of the players, while Elo only cares about the rating difference and not the absolute level!

But more importantly, the draw percentage is something that is to some extent within the control of the players. At least from the level of an advanced amateur onward, a player can decide to play ‘risky’ chess and go for a win, or be more conservative and more likely to draw. For a fixed expected value EV, if we vary the draw probability Pd we get this type of chart for the three possible outcomes:

The draw rate, i.e. the amount of risk taken by the players, will be one of our critical variables when discussing the team modelling, because the other variable, the EV, is fixed by the players’ Elo difference.
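Given an expected value and an assumed draw rate, the three outcome probabilities follow directly from EV = Pw + Pd/2. A small sketch (the function name and the error handling are my own choices):

```python
def outcome_probabilities(ev, draw_rate):
    """Return (Pw, Pd, Pl) for a fixed expected score and draw rate."""
    p_win = ev - draw_rate / 2          # from EV = Pw + Pd/2
    p_loss = 1 - ev - draw_rate / 2     # the three probabilities sum to 1
    if not 0 <= draw_rate <= 1 or p_win < 0 or p_loss < 0:
        raise ValueError("this draw rate is not compatible with this EV")
    return p_win, draw_rate, p_loss


if __name__ == "__main__":
    # Same expected score, increasingly 'safe' play.
    for draw_rate in (0.0, 0.2, 0.4, 0.6):
        print(draw_rate, outcome_probabilities(0.4, draw_rate))
```

Raising the draw rate removes wins and losses in equal amounts, so the expected score stays put while the spread of results shrinks.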

The draw rate

Given that the draw rate is one of three key variables that we are discussing here, it is worth paying extra attention to its meaning. 

The draw rate can vary from 75% of results being drawn for some players above 2750 in classical chess to much less than 1% for bullet between amateurs. In our discussion, we consider players skilled enough to understand that there is a ‘risky’ way to play and a ‘safer’ one, be it in the risks taken in the opening or in the middle game. Of course, it takes two to tango, and you have to deal with the way your opponent wants the game to go, so it is not completely up to you, but it is still in your control to a large extent.

Not all draw rate values are possible: a large imbalance in ratings cannot coexist with a large draw rate. Since EV = Pw + Pd/2 and neither Pw nor Pl can be negative, the draw rate can be at most 2×min(EV, 1 – EV).
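This constraint is easy to tabulate. The sketch below derives the largest admissible draw rate from Pw ≥ 0 and Pl ≥ 0 together with EV = Pw + Pd/2; the function names are mine.

```python
def elo_expected(elo_diff):
    """Expected score for a player who is elo_diff points above the opponent."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def max_draw_rate(elo_diff):
    """Largest Pd that keeps both Pw and Pl non-negative."""
    ev = elo_expected(elo_diff)
    # Pw = EV - Pd/2 >= 0 and Pl = 1 - EV - Pd/2 >= 0
    return 2 * min(ev, 1 - ev)


if __name__ == "__main__":
    for diff in (0, 100, 200, 400, 800):
        print(f"Elo gap {diff:4d}: draw rate at most {max_draw_rate(diff):.3f}")
```

At the bound, the underdog never wins: every game is either a draw or a loss.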

Team modelling

We can compute the expected value EV[T] algebraically and also run so-called ‘Monte-Carlo’ simulations where the outcome win/draw/loss is randomly generated: the computer easily generates hundreds of thousands of games in a few seconds. I’ve computed EV[T] algebraically for 2, 3 and 4 players; beyond that we will rely on Monte-Carlo simulation only. We have two parameters: each player’s expected value (driven by the Elo difference) and the draw rate. First, let’s look at the expected team score as a function of the Elo difference for different values of the draw rate, assuming the same Elo difference on every board.
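For readers who want to experiment, here is a minimal sketch of such a Monte-Carlo simulation using only the Python standard library. It is not the code behind the charts in this article. It assumes the same draw rate on every board, and the function names, the seed and the number of simulated matches are all my own choices.

```python
import random


def elo_expected(elo_diff):
    """Expected score for a player who is elo_diff points above the opponent."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def simulate_team_ev(elo_diffs, draw_rate, n_matches=100_000, seed=0):
    """Estimate EV[T] by simulating n_matches team matches."""
    rng = random.Random(seed)
    thresholds = []
    for diff in elo_diffs:
        ev = elo_expected(diff)
        p_win = ev - draw_rate / 2
        p_loss = 1 - ev - draw_rate / 2
        if p_win < 0 or p_loss < 0:
            raise ValueError("draw rate too high for this Elo difference")
        # u < p_win: win, p_win <= u < p_win + draw_rate: draw, otherwise loss
        thresholds.append((p_win, p_win + draw_rate))

    n_boards = len(elo_diffs)
    total = 0.0
    for _ in range(n_matches):
        twice_s = 0  # 2*S, kept as an integer to avoid float comparisons
        for win_cut, draw_cut in thresholds:
            u = rng.random()
            if u < win_cut:
                twice_s += 2
            elif u < draw_cut:
                twice_s += 1
        if twice_s > n_boards:       # S > N/2
            total += 1.0
        elif twice_s == n_boards:    # S = N/2
            total += 0.5
    return total / n_matches


if __name__ == "__main__":
    for diff in (-100, -50, 0, 50, 100):
        print(diff, round(simulate_team_ev([diff] * 4, draw_rate=0.3), 3))
```

With 100,000 simulated matches the statistical noise is a few thousandths of a point, which is small but not negligible next to the effects discussed in this article, so it is worth using more matches or an exact calculation when comparing close cases.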

Of course, when the teams are balanced the expected score is 0.5, and as the Elo difference grows the score goes to 0 or 1. Interestingly, for a given Elo difference, different draw rates give different expected values. We can make the draw rate our independent variable:

This is an important result: the draw rate influences the team result. If a team is outgraded, it should seek to minimise the draw rate and play more aggressively, because a full point is very valuable to the team. On the other hand, if a team outgrades its opponent, it should seek safe options and let the few wins tip the result. Unfortunately, this is contrary to the intuition of many players: when playing a higher-rated opponent, they would rather secure a draw than risk a loss in pursuit of the full point.
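The direction of this effect can be checked without any randomness by building the exact distribution of the score sum board by board. The sketch below assumes a single draw rate shared by all boards. The function names are mine, and the 100-point gap and the draw rates of 0 and 0.5 are arbitrary illustration values.

```python
def elo_expected(elo_diff):
    """Expected score for a player who is elo_diff points above the opponent."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def exact_team_ev(elo_diffs, draw_rate):
    """Exact EV[T] for independent boards sharing one draw rate."""
    dist = [1.0]  # dist[k] = probability that 2*S equals k so far
    for diff in elo_diffs:
        ev = elo_expected(diff)
        p_win = ev - draw_rate / 2
        p_loss = 1 - ev - draw_rate / 2
        if p_win < 0 or p_loss < 0:
            raise ValueError("draw rate too high for this Elo difference")
        new_dist = [0.0] * (len(dist) + 2)
        for k, p in enumerate(dist):
            new_dist[k] += p * p_loss         # loss adds 0
            new_dist[k + 1] += p * draw_rate  # draw adds half a point
            new_dist[k + 2] += p * p_win      # win adds a full point
        dist = new_dist
    n_boards = len(elo_diffs)
    p_match_win = sum(p for k, p in enumerate(dist) if k > n_boards)
    return p_match_win + 0.5 * dist[n_boards]


if __name__ == "__main__":
    for label, diff in (("underdog", -100), ("favourite", 100)):
        risky = exact_team_ev([diff] * 4, draw_rate=0.0)
        safe = exact_team_ev([diff] * 4, draw_rate=0.5)
        print(f"{label}: risky {risky:.3f}, safe {safe:.3f}")
```

Under these assumptions the underdog’s expected team score is higher with the risky setting, and the favourite’s is higher with the safe one.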

The overall magnitude of the effect is fairly small, about 0.05 points between the two extremes of a super safe and a super risky strategy, so we can expect about an extra half point over a 9-round tournament (9×0.05 = 0.45), which can still be enough to make a difference in the final rankings.

Chess is a game of black and white

It is quite surprising that the Elo model has historically never been modified to account for the first mover’s advantage of the white pieces. White scores an expected 0.55 which translates to an Elo advantage of +35. If you played a player of your rating, a much better prediction of the result would be to add +35 to whoever is playing with the white pieces. As this applies to both teams symmetrically there is no overall net effect of colours. 
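The conversion between an expected score and an Elo gap is just the Elo formula inverted. As a quick sketch (the function names and the `white_bonus` parameter are mine, not part of any rating system):

```python
import math


def elo_gap_from_expected_score(ev):
    """Rating difference that would produce the expected score ev."""
    return 400 * math.log10(ev / (1 - ev))


def expected_score_as_white(rating_white, rating_black, white_bonus=35.0):
    """Elo expected score for White, with a hypothetical colour bonus."""
    diff = rating_white + white_bonus - rating_black
    return 1 / (1 + 10 ** (-diff / 400))


if __name__ == "__main__":
    print(elo_gap_from_expected_score(0.55))    # roughly 35
    print(expected_score_as_white(2500, 2500))  # roughly 0.55
```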

Influence of player count

This is a fairly simple one to understand: the more players you have, the more likely that the results will be ‘as expected’. Let’s compare 4 and 8-player teams: 

For a given draw rate and the same Elo difference on every board, the stronger side in an 8-player match scores almost 0.05 points more per match than in a 4-player match. In practice, a larger player count often also increases the Elo difference on the lower boards if one team lacks ‘depth’, widening the gap between the teams further.
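The same exact calculation can compare team sizes directly. This is again only a sketch under my own assumptions: an identical Elo difference and an identical draw rate on every board, with values picked for illustration.

```python
def elo_expected(elo_diff):
    """Expected score for a player who is elo_diff points above the opponent."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def exact_team_ev(elo_diffs, draw_rate):
    """Exact EV[T] for independent boards sharing one draw rate."""
    dist = [1.0]  # dist[k] = probability that 2*S equals k so far
    for diff in elo_diffs:
        ev = elo_expected(diff)
        p_win = ev - draw_rate / 2
        p_loss = 1 - ev - draw_rate / 2
        if p_win < 0 or p_loss < 0:
            raise ValueError("draw rate too high for this Elo difference")
        new_dist = [0.0] * (len(dist) + 2)
        for k, p in enumerate(dist):
            new_dist[k] += p * p_loss
            new_dist[k + 1] += p * draw_rate
            new_dist[k + 2] += p * p_win
        dist = new_dist
    n_boards = len(elo_diffs)
    return sum(p for k, p in enumerate(dist) if k > n_boards) + 0.5 * dist[n_boards]


if __name__ == "__main__":
    for diff in (25, 50, 100):
        ev4 = exact_team_ev([diff] * 4, draw_rate=0.3)
        ev8 = exact_team_ev([diff] * 8, draw_rate=0.3)
        print(f"Elo gap {diff}: 4 boards {ev4:.3f}, 8 boards {ev8:.3f}")
```

The size of the gap between the two formats depends on the Elo difference and draw rate chosen, so the numbers printed here need not match the chart exactly.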

Board order shenanigans

Some competitions require the boards to be in rating order, and some give the freedom to choose (like the Olympiad, provided that the board order is fixed at the start of the competition and does not change between rounds). Some competitions allow some board order changes provided that the gap in ratings does not exceed some threshold. All of the analysis below, which assumes that the board order can be changed, must be adjusted to the rules of the league that is relevant for you.

Let’s assume that a team rated [2600, 2550, 2500, 2450] (remember that only Elo differences matter: you can subtract 500 Elo from everything and all is still identical) plays against a team with the same ratings. As you would guess, the expected score is 0.5. Now, what if we ‘invert’ by playing the weakest player on board one? This leaves a big deficit on board one and an advantage on the three remaining boards.

Surprisingly, the inverted lineup scores around 0.507, depending on the draw rate! The higher the draw rate, the better. Of course, a ‘full inversion’ (reversing the entire board order) scores 0.5 again, since the deficits and advantages mirror each other.

Why does the inverted board order work? Let’s build some intuition by looking at a two-player team that is outgraded on every board. One team is [2000, 1500] and the other is [2500, 2000]. With a regular board order, you are saddled with a 500-point deficit on each board. If you decide to invert, you have a huge deficit on one board, 1500 vs 2500, which is a ‘sure’ defeat, but an even match, 2000 vs 2000, on the other. Assuming no draws are possible, this gives you 1 individual point half the time, and therefore a drawn match (T = 0.5) half the time. The EV of the match is then about 0.5×0.5 = 0.25, whereas in the regular case it is only about 0.05. We can plot this two-player toy study as a function of the Elo gap and see that the inverted case does not go below a worst case of 0.25.
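Both the toy example and the 4-board inversion can be reproduced with an exact calculation. The sketch below pairs the two lineups board by board and assumes one draw rate shared by all boards. The function names and the draw rates tried are my own choices.

```python
def elo_expected(elo_diff):
    """Expected score for a player who is elo_diff points above the opponent."""
    return 1 / (1 + 10 ** (-elo_diff / 400))


def lineup_ev(own, opponents, draw_rate):
    """Exact EV[T] when own[i] plays opponents[i] on board i."""
    dist = [1.0]  # dist[k] = probability that 2*S equals k so far
    for r_own, r_opp in zip(own, opponents):
        ev = elo_expected(r_own - r_opp)
        p_win = ev - draw_rate / 2
        p_loss = 1 - ev - draw_rate / 2
        if p_win < 0 or p_loss < 0:
            raise ValueError("draw rate too high for this pairing")
        new_dist = [0.0] * (len(dist) + 2)
        for k, p in enumerate(dist):
            new_dist[k] += p * p_loss
            new_dist[k + 1] += p * draw_rate
            new_dist[k + 2] += p * p_win
        dist = new_dist
    n_boards = len(own)
    return sum(p for k, p in enumerate(dist) if k > n_boards) + 0.5 * dist[n_boards]


if __name__ == "__main__":
    # Two-board toy example, no draws.
    print("toy regular :", round(lineup_ev([2000, 1500], [2500, 2000], 0.0), 3))
    print("toy inverted:", round(lineup_ev([1500, 2000], [2500, 2000], 0.0), 3))

    # Four boards, equal teams, weakest player moved to board one.
    opponents = [2600, 2550, 2500, 2450]
    inverted = [2450, 2600, 2550, 2500]
    for draw_rate in (0.0, 0.25, 0.5):
        print(draw_rate, round(lineup_ev(inverted, opponents, draw_rate), 4))
```

In this sketch the inverted lineup of the equal-strength example stays slightly above 0.5 and improves as the draw rate rises, in line with the figure quoted above.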

If we look at the case of a 4-player team that is outgraded by about 100 points on every board, the inverted lineup is significantly better. If you are forced to play with a regular board order, keep in mind the large effect of the draw rate: when you are the underdog, play forceful chess and play ‘for all three results’.

Of course, one must not forget that what matters is true ‘strength’ inversion. If you place a very underrated player ahead of higher-rated but weaker players, this isn’t board inversion: you are simply ordering by actual strength.

The Match of the Century: USSR vs the World 1970

Fischer vs Petrosian

In the 1970 “Match of the Century” between the USSR and the Rest of the World, Bobby Fischer was above Bent Larsen in Elo rating but had been inactive, while Larsen had enjoyed great tournament successes. Larry Evans writes:

“Larsen threatened to withdraw unless he was allowed to play at the first board. To everyone’s astonishment, Fischer gave way. “Larsen’s got a point,” he said. By the time the first round started, he had second thoughts. […] Minutes before the match Fischer was asked how he felt about giving up the first board. “It was a big mistake,” he said. “I shouldn’t have agreed to it.”

Nevertheless, on board one Larsen held a good score of 2.5/4 against Spassky and Stein, and Fischer scored 3/4 on board two against Petrosian. Only weaknesses deep in the field prevented the World team from winning. We will never know what the results would have been had Larsen and Fischer swapped boards, but I would argue it is not a bad tally of points for this involuntary inversion.

Inverted board order at the Olympiad

At the 2014 Olympiad in Norway, India won the bronze medal – the country’s first in any Olympiad.

Indian news service IANS wrote:

“In 2014, the Indian team fielded the strongest players at the lower boards and a mix of high- and lower-rated but solid players at the top two boards. The idea was to secure a win at the lower boards while the lower-rated team members try to hold/draw their games at the top boards against stronger opponents.”

The Indian team, without Anand or Harikrishna, managed great results with the following board order:

1 Negi 2645 – 2 Sethuraman 2590 – 3 Sasikiran 2669 – 4 Adhiban 2619.

In 2022 Israel played with an inverted board 1 and finished 16th, outperforming their starting rank of 22nd.

At the same Olympiad Serbia also played with an inverted board 2 and finished 20th, outperforming their starting rank of 23rd. Note the high scoring tally of Markus.

Don’t forget the basics

The main criticism levelled at analytics is that it is ‘taking over’ and that data gives a remote picture that does not reflect the reality on the ground. The message here is that any analytics must complement and enrich your intuitive understanding of the game, and not replace it.

One of the fundamental aspects of team chess is awareness of the current situation on the boards of your teammates. Of course, if the match situation is such that your team is winning, there is no need to take any risk and you can safely steer the game to a draw. Conversely, if your team is losing, you must start to take risks and try to win the game at all costs. Now, if your game is very complex and tense and you have no time to look at the boards of your teammates, you have to trust that your teammates will keep an eye on it. Depending on the competition, the captain may or may not be allowed to help by intervening in a draw offer: always carefully read the rules and regulations and make sure players and the captain are aware of them.

The psychological cost of playing optimally

Going back to basketball: we have mentioned the great success of analytics with 3-point shots, but sometimes players will simply refuse to play the optimal strategy. This is the curious case of the basketball ‘underhand’ free throw, also called the ‘granny shot’. The ‘standard’ way of shooting free throws relies on one hand, whereas the underhand throw uses two hands. It is a lot more stable, and the ball’s spin and trajectory make it a lot easier and more forgiving. One of the best free throw shooters ever in the NBA, Rick Barry, used the underhand throw in a deadly manner and had a success rate of about 90% rather than the usual 75%. Wilt Chamberlain, one of the greatest NBA players, was poor at free throws until he switched to underhand. He scored the all-time record of 100 points in an NBA game on March 2nd 1962, including 28 out of 32 free throws shot underhand. For comparison, the most Kobe Bryant scored is 81 and Michael Jordan 69.

However, Chamberlain later abandoned the technique:

“I felt silly, like a sissy, shooting underhanded. I know I was wrong.

I know some of the best foul shooters in history shot that way…I just couldn’t do it.”

Things are even worse, as a legitimate strategy is to target a player with a low free throw success rate and foul him constantly. Yet players just won’t switch techniques because it doesn’t look manly enough.

What is the chess equivalent of refusing to shoot underhand? 

  • Playing safely when lower rated, because of the psychological cost of a loss. Most players would rather have two draws than a win and a loss. For individual results this might not be a problem, but for team results the win might be critical. Loss aversion is never the friend of the chess player.
  • Closely related, many players will happily settle for a draw in a clearly winning position against a much higher-rated opponent.
  • If we want to ‘invert’ the board order, the higher-rated player may feel that he should be on board one as the better or more important player in the team. The football world offers many examples of a key player with a big ego who is left on the substitutes’ bench and, when called upon to enter during the match, flat-out refuses.

I get it. You’re a chess team captain, but you can’t move your board one player to board two because he will throw a tantrum, and you can’t stop your lower-rated players from taking draws against higher-rated opponents in much better positions. But just as in sports analytics, people will eventually catch on to optimal play (or not … like the underhand throw).

Takeaways & summary of recommendations

  • Analytics can help teams. By tuning the element of ‘risk’, teams can increase or decrease their expected score.
  • When you are the underdog, you should seek a risky strategy.
  • Strategic board order inversions can increase the expected value.
  • Psychological roadblocks like loss aversion can make it difficult to take risks. 
