Diplomacy Scoring Systems

I am going to write about scoring systems. If you want to skip ahead to the end and see a good system you can use, feel free to do so. Recall that the purpose of a scoring system is to provide a fair numerical evaluation of a drawn position in a Diplomacy game, so that performances in a tournament can be quantified.

We will start very slowly by stating some properties that Diplomacy scoring systems should have.

1. Solo beats Draw beats Elimination/Loss:
A soloist should score more than someone who draws, who should in turn score more than someone who is eliminated or loses.

Comment: This sounds like an obvious property to want. Yet I feel compelled to state this property, and the others, explicitly, because not all scoring systems have it. Carnage is a notorious example. (A historical note: We (by this I mean Bob Holt) fixed this problem with Carnage back in 2009. Then people reverted to the older, worse version.)

2. Monotonicity:
Compare the scores from two games (games are assumed drawn unless otherwise specified), where the only difference is that Player A takes a centre off Player B in Game 2. Then Player A should score more in Game 2 than in Game 1, and Player B should score less.

Comment: Again, this sounds obvious, but does not always happen. An example of a system which doesn’t have this property is Tribute.

3. Path independence
A player’s score should only depend on the final centre count of the game.

Comment: In the past, systems (e.g. Origins) have been used which factor the centre count at each year into the final scores. These produce perverse outcomes, where smaller powers can score more than larger powers. Such systems should be avoided on principle and due to their known poor performance.

Comment 2: For a further comment about survival points for eliminated or losing players, keep reading to point #5.

4. Zero Sum.
The total number of points given out in each game should be a constant.

Comment: There are people who believe that this should be a requirement. It sounds nice at first glance, but (spoiler alert) we will end up ditching this idea, in favour of ensuring that requirements 5, 6 and 7 hold.

5. Eliminated or losing players score zero.
As it says in the title.

Comment: This is something that I would argue for in a major tournament, because to be eliminated or lose is to have achieved nothing. In a minor tournament, I am amenable to giving a token sum of survival points as seen in Detour or Cricket after hearing Melissa Call argue persuasively in its favour. An alternative approach that is irrelevant for the question of how to score a single game but relevant for a tournament is to give every player a fixed extra number of points for playing in a round.

6. Convexity/No unnatural Draw Whittling.
If Player A allows Player B to capture every supply centre of Player C, then Player A’s score should not increase.

Comment: If you reach a position which is naturally drawn, and the large players are incentivised to manoeuvre artificially in order to ensure the smaller powers can be safely eliminated, then the game drags on longer than it should, and this feels particularly nasty towards those who are eliminated late. This is essentially the standard argument against draw size based systems, although draw based scoring has other undesirable implications that I won’t get into here.

7. Reward the draw
The score of a 1SC power in a draw must be significantly higher than the score for an elimination/loss.

Comment: I believe that there is a significant difference in achievement in making it into the draw, and that this should be rewarded by the scoring system. This is a property that is seriously lacking in some systems (Squares is a notable offender here).

8. No ties.
It is impossible to avoid ties altogether, but the ideal is for a system to have as few ties at the top end of the field as possible.

Comment: If you have a simple scoring system, then you have to deal with tiebreakers. I have determined that humans are bad at coming up with tiebreaker rules. For an example, look at the farcical boundaries rule which determined the winner of the 2019 Cricket World Cup. Thus, I prefer to avoid them. Though I will mention one tiebreaker rule I read once in some tournament rules and rather fancied: “Ties will be broken by strength of opponents faced. Strength of opponents is determined by their performance in this tournament.” How exactly the strength of opponents was determined was not explicated.

9. The soloist does not win the tournament by default
A solo should be beatable by a combination of non-solo results.

Comment: This is more about tournament structure than scoring a single game so is orthogonal to most of this article, but I mention it for completeness. I believe that to do well in a tournament, you should have to do well in more than one game, rather than getting lucky in a single game (the difference between a solo and a dominant board top is often the luck of the draw in how well the defenders play).

OK, enough about desirable (or undesirable) properties, and on to actually talking about some systems.

The simplest (reasonable) scoring system is 1 point per supply centre. It’s actually pretty good. Where it falls short is on properties 7 and 8 above. Still, this system is a good litmus test for any scoring system designer: I think a designer has to justify that their system is better than just counting up the dots.

A more nuanced system has three parameters a, b and c: each player gets a points per supply centre, b points for being in the draw, and a pool of c points is shared equally between the board-toppers. The values of a, b and c can be adjusted as desired. This is the system in use for the virtual World Diplomacy Classic running later this month. These systems can be very good.
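A minimal sketch of such a three-parameter system follows. The values a=1, b=3, c=6 are placeholders picked purely for illustration (they are not the World Diplomacy Classic's values), the function name is made up, and "being in the draw" is assumed to mean finishing with at least one centre.

```python
from fractions import Fraction


def abc_scores(centres, a=1, b=3, c=6):
    """Score a drawn game: a per centre, b for surviving, c shared by the toppers.

    The default a, b, c are hypothetical placeholders, not published values.
    """
    top = max(centres)
    toppers = sum(1 for n in centres if n == top)
    scores = []
    for n in centres:
        if n == 0:
            # Eliminated players are assumed to score nothing.
            scores.append(Fraction(0))
            continue
        score = Fraction(a * n + b)
        if n == top:
            # The topping bonus is split equally when the lead is shared.
            score += Fraction(c, toppers)
        scores.append(score)
    return scores


# Hypothetical final positions (34 centres in total on each board).
outright_top = abc_scores([12, 10, 6, 5, 1, 0, 0])
shared_top = abc_scores([10, 10, 8, 6, 0, 0, 0])
```

With these placeholder values the outright topper on 12 centres gets 12 + 3 + 6 = 21, while two players sharing the top on 10 centres get 10 + 3 + 3 = 16 each.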

To reduce the chance of ties, we can now make some changes. Let f(x) be an increasing convex function with f(0)=0. Suppose player i finishes with a_i supply centres. Then we give Player i the score

\displaystyle \frac{f(a_i)}{\sum_{j=1}^7 f(a_j)}.

The first system mentioned above is the same as this one with f(x)=x. It is customary to multiply scores by 100 in real-life systems of this type, but as that makes no mathematical difference we ignore it.
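The displayed formula can be sketched in a few lines of Python. The function name `proportional_scores` and the sample centre counts are assumptions made for this illustration only.

```python
def proportional_scores(centres, f):
    """Return each player's share f(a_i) / sum_j f(a_j) for a drawn game."""
    weights = [f(n) for n in centres]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical final position: seven players, 34 centres in total.
centres = [12, 10, 6, 5, 1, 0, 0]

# With f(x) = x this is one point per supply centre, rescaled to sum to 1.
linear = proportional_scores(centres, lambda x: x)
print([round(100 * s, 1) for s in linear])
```

Passing a different `f` changes the shape of the scores without touching the rest of the calculation, which is the point of writing the system this way.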

The consequence of making the function convex is that, at a given supply centre count, the more evenly split your opponents are, the higher your score. This ensures that property 6 holds.
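To see the effect on a concrete pair of boards, here is a small sketch comparing a convex f with the linear one. The helper `share` and both positions are invented for the illustration; f(x)=x^2 is used only as a convenient convex function.

```python
def share(own, others, f):
    """Score of a player on `own` centres against opponents on `others`."""
    return f(own) / (f(own) + sum(f(n) for n in others))


def square(x):
    return x * x


def linear(x):
    return x


# Player A holds 10 centres in both hypothetical games.
even = share(10, [12, 12, 0, 0, 0, 0], square)      # 100 / 388
lopsided = share(10, [24, 0, 0, 0, 0, 0], square)   # 100 / 676

# With f(x) = x the split of the other 24 centres makes no difference.
flat_even = share(10, [12, 12, 0, 0, 0, 0], linear)
flat_lopsided = share(10, [24, 0, 0, 0, 0, 0], linear)
```

Under the convex f, Player A scores less once one opponent has swallowed the other, so there is no reward for letting that happen.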

One example with a history of use in the hobby is the function f(x)=x^2 (Squares). This has two serious problems. One is that it fails property 7 (reward the draw): the score acquired by a 1 supply centre power is so small as to be essentially meaningless in any final tournament standings. The second problem is that it is too convex (to quote Nicolas Sahuget). This results in a large range of scores which a dominant board-topper can achieve, a range that feels disproportionately large compared to the other scores in the system. But this problem is easily fixed by changing f.
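Both complaints are easy to put numbers on. The sketch below uses invented final positions and a made-up helper name; it is an illustration of f(x)=x^2 as described here, not a reference implementation of any tournament's Squares rules.

```python
def squares_scores(centres):
    """Proportional scores with f(x) = x^2."""
    weights = [n * n for n in centres]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical drawn position in which one power survives on a single centre.
scores = squares_scores([12, 10, 6, 5, 1, 0, 0])
one_centre = scores[4]   # 1 / 306, roughly 0.003 of the board

# The same 12-centre board top against two different hypothetical fields.
crowded = squares_scores([12, 11, 11, 0, 0, 0, 0])[0]   # 144 / 386
spread = squares_scores([12, 4, 4, 4, 4, 3, 3])[0]      # 144 / 226
```

The survivor on one centre is worth about a third of a percent of the board, and the 12-centre topper's score swings from about 0.37 to about 0.64 depending only on how the other 22 centres are distributed.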

We can fix the problem of failing property 7 by removing the requirement that f(0)=0, and allowing f(0) to take any positive value. Now if player i finishes with a_i supply centres, then they score as in the previous displayed formula when a_i>0, and score 0 otherwise.

Note that it is important to still include the eliminated players in the denominator, otherwise the convexity condition will be broken. This system breaks the zero-sum condition (in a fairly mild way). I strongly believe this is a small price to pay for its other desirable factors.
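The role of the denominator can be checked with exact arithmetic. In this sketch f(x)=x+3 is a hypothetical stand-in with f(0)>0, chosen only because it keeps the fractions simple; the `include_eliminated` switch and the two positions are likewise invented for the illustration.

```python
from fractions import Fraction


def f(x):
    # Hypothetical stand-in with f(0) = 3 > 0, chosen to keep the arithmetic exact.
    return x + 3


def scores(centres, include_eliminated=True):
    """Survivors score f(a_i) / denominator; eliminated players score 0."""
    if include_eliminated:
        denom = sum(f(n) for n in centres)
    else:
        denom = sum(f(n) for n in centres if n > 0)
    return [Fraction(f(n), denom) if n > 0 else Fraction(0) for n in centres]


before = [10, 12, 12, 0, 0, 0, 0]   # Player C is still alive on 12 centres
after = [10, 24, 0, 0, 0, 0, 0]     # Player B has captured all of C's centres

# Eliminated players kept in the denominator: Player A gains nothing.
kept_before, kept_after = scores(before)[0], scores(after)[0]

# Eliminated players dropped from the denominator: Player A profits.
dropped_before = scores(before, include_eliminated=False)[0]
dropped_after = scores(after, include_eliminated=False)[0]

# The points handed out no longer add up to a constant.
total_before, total_after = sum(scores(before)), sum(scores(after))
```

With the eliminated players kept in, Player A scores 13/55 on both boards; with them dropped, the score rises from 13/43 to 13/40 when C is wiped out. The totals handed out are 43/55 and 40/55, which is the mild break from zero sum.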

I hereby propose that we take f(x)=\sqrt{x^2+6x+10}. This clearly takes some motivation from the Cricket scoring system, which, amongst the desirable properties raised above, fails only the no-ties condition (property 8).
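Completing the square under the root shows how close this f stays to a straight line. The identity and bound below are elementary algebra, worked out here for illustration:

\displaystyle f(x)=\sqrt{x^2+6x+10}=\sqrt{(x+3)^2+1}, \qquad x+3 < f(x) < x+3+\frac{1}{2(x+3)} \le x+3+\frac{1}{6} \quad (x\ge 0).

So f(x) exceeds x+3 by less than 1/6. Since (x+3)^2+1 lies strictly between two consecutive squares, f(x) is irrational for every whole number x of centres, which is plausibly what makes exact ties rare.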

I have left out the question of how many points a soloist should get since it is independent of the discussion of how to score the drawn positions. I will give an explicit sample number below in my final description that agrees with my philosophy that a soloist should be able to be beaten in a (fictionally 3-round) tournament, since a tournament is meant to require playing well in multiple games, not just one.

Aficionados of lead-based systems can always throw on extra bonus points for placement in a draw, though we start to lose some elegance with that approach.

So finally, we reach:

Unnamed Scoring System

Let f(x)=\sqrt{x^2+6x+10}.

For games that end in a solo, the soloist scores 0.55 points. All other players score 0.

For games that end in a draw, let a_i be the number of supply centres of player i. Then if a_i=0, player i scores 0 points, while if a_i>0, player i scores \displaystyle \frac{f(a_i)}{\sum_{j=1}^7 f(a_j)} points.
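For anyone who wants to try the system on their own results, here is a minimal Python sketch. Two things are assumptions of the sketch rather than part of the description above: a solo is detected as a player holding 18 or more centres (the standard-map victory condition), and the function and constant names are invented.

```python
import math

SOLO_SCORE = 0.55
SOLO_CENTRES = 18  # assumption: standard-map solo threshold


def f(x):
    return math.sqrt(x * x + 6 * x + 10)


def unnamed_scores(centres):
    """Score one seven-player game from its final centre counts."""
    if len(centres) != 7:
        raise ValueError("expected centre counts for exactly seven players")
    if max(centres) >= SOLO_CENTRES:
        # Solo: the soloist takes 0.55 and everyone else scores 0.
        return [SOLO_SCORE if n >= SOLO_CENTRES else 0.0 for n in centres]
    # Draw: eliminated players stay in the denominator but score 0.
    total = sum(f(n) for n in centres)
    return [f(n) / total if n > 0 else 0.0 for n in centres]


# Hypothetical results, one drawn and one soloed.
draw = unnamed_scores([12, 10, 6, 5, 1, 0, 0])
solo = unnamed_scores([18, 9, 5, 2, 0, 0, 0])
```

On the hypothetical drawn board the 12-centre topper scores roughly 0.27 and the survivor on one centre roughly 0.07, so making the draw is worth something visible in the standings.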

PostScripts

1. The Unnamed system is essentially 3 points for being in a draw, plus one point per supply centre, tweaked ever so slightly so that it is unlikely to create ties.

2. A comment by SK suggests that the word Carnage should be in the scoring system name.

3. Comments are not showing up, but are readable by me. I do not know why.