Chess with Different Armies. Betza's classic variant where white and black play with different sets of pieces. (Recognized!)

H. G. Muller wrote on Wed, May 8, 2019 11:01 AM UTC:

End-games part 3: Super-pieces versus a pair

This is a very murky problem. I have generated the relevant 5-men EGTs, but they seem very hard to interpret. Take for example Queen vs Bishop + Knight. This has 98.56% of all positions won when the Queen has the move (including 40.42% immediate King capture). The weak side is lost in 49.08% of the positions where it has the move; 28.44% of such positions are instant wins by King capture (so really illegal positions, which one could choose not to count). And 17.88% are wins by other means, which in this case has to mean winning the Queen and subsequently mating with Bishop + Knight (obtained from generating the reverse EGT), or (rarely) a checkmate with the Queen still on the board. Almost all of these (98%) capture the Queen (or mate) on the first move, and none take more than 5 moves. These should not really be counted as Q vs B+N; they are tactically non-quiet positions in the process of converting to a simpler end-game. The remaining 4.61% of the positions with the weak side to move must be draws.

This looks as much like a general win as one could hope for. Nevertheless it is well known that B + N can make a 'fortress' that even resists the onslaught of an Amazon (Ka1, Bb2, Nd4). The resulting fortress draws are hidden in those 4.61% (which amounts to 8.5% after disregarding the illegal and non-quiet positions). So in most cases the end-game in a quiet position (where chess engines evaluate) would be a win for the Queen, so it seems reasonable not to excessively discount it. (The factor 2 applied to all pawnless advantages would already do justice to the difficulty of winning this, as the 'raw' advantage is equivalent to a single minor, which after discounting translates to 1.5 Pawns, which is only marginally above the threshold for winning advantages.)
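The renormalization to quiet positions can be retraced with a few lines of arithmetic. This is only a sketch that plugs in the rounded percentages quoted above (the variable names are mine), so the results agree with the quoted 4.61% and 8.5% only to within about 0.1 percentage point.

```python
# Rounded percentages quoted in the text for Q vs B+N, weak side (B+N) to move.
lost = 49.08          # positions lost for the weak side
king_capture = 28.44  # instant King capture (really illegal positions)
other_wins = 17.88    # weak side wins the Queen or mates (non-quiet positions)

# Whatever is left over must be drawn.
draws = 100.0 - lost - king_capture - other_wins

# Legal, quiet positions: drop King captures and the shallow tactical wins.
quiet = 100.0 - king_capture - other_wins

draw_share_quiet = 100.0 * draws / quiet
loss_share_quiet = 100.0 * lost / quiet

print(f"draws among all positions:    {draws:.2f}%")
print(f"draws among quiet positions:  {draw_share_quiet:.1f}%")
print(f"losses among quiet positions: {loss_share_quiet:.1f}%")
```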

This makes it impossible for the engine to avoid the fortress, however. The problem with fortresses that are not recognized by the evaluation is that the engine continues to count itself rich for the almost indefinite duration the defender can maintain the fortress (until the 50-move rule puts an end to it, but that will be seen only after 100 ply, way beyond the horizon when you first enter the fortress). The alternative is to always heavily discount end-games that contain a fortress draw. That would be wrong in the majority of cases, but the won cases will eventually convert to another end-game (KQ-KB or KQ-KN), or checkmate outright. And once this gets within the horizon the score will be corrected. Basically this puts the 'burden of proof' that an end-game with a fortress draw is a win on the winning side, even when a win is the most likely case, because that case is easier to prove. E.g. 26.75% of all positions (=49% of the quiet ones) convert in 5 moves or less, and the search can presumably find that. This still leaves more cases where it is in error than just ignoring the fortress, though. In addition to such a 'passive' fortress there can also be draws due to perpetual checking. But these usually lead to repetitions quickly, so that the search has no difficulty recognizing those without any special discounting.

It is kind of hard to devise a satisfactory algorithm here without actually probing the EGT, or putting in dedicated code to recognize the fortress. The latter doesn't seem feasible for CwDA, where in most end-games we really have no idea at all whether there is a fortress or not, let alone how it looks. When embedding a single exotic piece in, say, a FIDE context, it does seem feasible to generate the Q vs 2 minors EGTs (6 of those, for all combinations of B, N and the exo-piece) in advance. Even an uncompressed 5-men EGT only takes 160MB, so with today's memory sizes a number of those can easily be kept in memory (possibly shared between several instances of the engine).

Fortunately in many cases of super-piece versus a pair of light pieces the discounting is not really important, because the 'raw' advantage is already pretty small to begin with. E.g. with Q vs 2R the difference is only 0.5 Pawn in favor of the Rooks, and for Q vs R+B it is only 1.25 Pawns in favor of the Queen. And the general factor 2 penalty for pawnlessness would already reduce that to 0.25 and 0.625, respectively. So the engine would always shy away from these end-games in favor of an advantage of a healthy Pawn, even when they are not listed as drawish. The drawishness discounting is only important for end-games that have a large raw advantage, possibly only super-piece vs pairs of the weakest minors B, N, WA, WD and Fibnif.
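As a minimal sketch of that arithmetic: the function name and the dictionary below are made up for illustration, and the value of 3 Pawns for "a single minor" is an assumption inferred from the 1.5 Pawns mentioned earlier.

```python
PAWNLESS_FACTOR = 0.5  # the "factor 2" penalty, written as a multiplier


def discounted(raw_advantage, factor=PAWNLESS_FACTOR):
    """Scale a raw material advantage (in Pawns) for a pawnless end-game."""
    return raw_advantage * factor


# Raw advantages in Pawns, as quoted in the text.
raw = {
    "Q vs 2R (in favor of the Rooks)": 0.5,
    "Q vs R+B (in favor of the Queen)": 1.25,
    "one minor up (assumed 3 Pawns)": 3.0,
}

for name, value in raw.items():
    print(f"{name}: raw {value} -> discounted {discounted(value)}")
```

All three discounted values stay well below, or only marginally above, the advantage of a healthy Pawn plus the usual winning threshold, which is why extra drawishness discounting hardly matters for them.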

I will publish a table here when I have figured out how to best present the calculated statistics.

[Edit]

I made a useful addition to my EGT generator: when it is done generating the normal statistics for a 2-vs-1 end-game, it declares all drawn positions in the successor 2-vs-0 and 1-vs-0 end-games a win, and then continues generating from there, effectively calculating whether King-baring can be forced (and in how many moves). This is a great help in investigating end-games like KQ.KBN, by generating the 'reverse' end-game KBN.KQ with King-baring victory. That makes it possible to recognize draws achieved by trading B or N for Q, which otherwise would show up as draws, indistinguishable from any fortress draws with all material, but now are reported as wins. This leads to the conclusion that almost all draws in KQ.KBN are due to shallow tactics that lose the Q against one or both minors: of the legal positions with the weak side to move only 0.14% are fortress draws. The known fortress is apparently very difficult to reach. This is in sharp contrast to Q vs two WD, which has 46.93% wins (38.12% converting within 3 moves), 28.5% forced losses of Q or K (the large majority in 1 move), leaving 24.57% for fortress draws. Indeed the WD pair has a huge capacity for setting up fortresses: a mutually protecting pair can confine the enemy King on boards of any size, trapping it behind the file or rank they are on. You either gain one of the WD by checking/forking before they connect, or it will be a dead draw. Such end-games deserve heavy discounting, as the search (using check extension and capture search) will easily find the won or lost cases. Queen vs two WA has rather similar statistics, although I don't have a clue as to how the fortress looks there.
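The idea of re-labeling successor draws can be illustrated on a toy game graph. This is not the actual generator: the solver below is a plain fixed-point retrograde pass, and the four positions and their names are invented purely to show how a draw by trading becomes a win under the baring rule while a fortress draw stays a draw.

```python
def solve(moves, terminal):
    """Label each position 'win', 'loss' or 'draw' for the side to move.

    moves:    dict mapping a position to the positions reachable in one move
              (the side to move alternates with every move).
    terminal: dict of already-known results, from the mover's point of view.
    """
    result = dict(terminal)
    changed = True
    while changed:
        changed = False
        for pos, successors in moves.items():
            if pos in result:
                continue
            outcomes = [result.get(s) for s in successors]
            if "loss" in outcomes:
                # Some move reaches a position lost for the opponent.
                result[pos] = "win"
                changed = True
            elif outcomes and all(o == "win" for o in outcomes):
                # Every move reaches a position won for the opponent.
                result[pos] = "loss"
                changed = True
    for pos in moves:
        result.setdefault(pos, "draw")  # never resolved: drawn
    return result


# Hypothetical toy graph; the position names are illustrative only.
moves = {
    "pair_can_trade": ["bare_after_trade", "queen_a"],
    "queen_a": ["pair_can_trade"],
    "pair_fortress": ["queen_b"],
    "queen_b": ["pair_fortress"],
}
# Successor end-game position with the bared side to move, normally a draw.
normal_terminal = {"bare_after_trade": "draw"}
# Baring rule: drawn successor positions count as lost for the bared side.
baring_terminal = {p: "loss" if r == "draw" else r
                   for p, r in normal_terminal.items()}

normal = solve(moves, normal_terminal)
baring = solve(moves, baring_terminal)

trade_draws = sorted(p for p in moves
                     if normal[p] == "draw" and baring[p] != "draw")
fortress_draws = sorted(p for p in moves if baring[p] == "draw")
print("draws by trading:", trade_draws)
print("fortress draws:  ", fortress_draws)
```

Positions that are drawn under normal rules but decided under the baring rule are the ones where material gets traded; whatever is still drawn afterwards keeps all material on the board.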

[Edit 2]

OK, I finally compiled a table, by combining info from the super-piece vs pair end-games themselves, the reverse end-games, and the reverse end-games under the baring rule. I extracted the info from the positions with the pair on move. This shouldn't really paint a different picture from when the super-piece was on move, except that in the latter case the large majority of positions (>80%) captures a hanging piece on the first move, altering the material balance from the intended one, so that the interesting results are much diluted there. Of course when such a capture does not happen, the other player gets to move, with the statistics presented here.

I only considered end-games where the advantage based on piece values would be large enough to reasonably suspect it could be a win even in the absence of Pawns.

The table lists 6 numbers, all percentages:
1) win by shallow tactics (conversion in first 3 moves)
2) win by deep tactics (conversion in move 4-6)
3) lengthy wins
4) fortress draws
5) forced loss of super-piece (or checkmate)
6) immediate loss through King capture

    Q                 C                      A                      Colonel
NN  26-4-13-11-21-25  21-10-25-.1-19-25       9- 5- 2-40-19-25      16-9-14-15-20-25
BN  21-7-21-.1-23-28  18-10-21- 1-22-28       6- 2-15(~8)-27-22-28  10-7- 8-23-23-28
BB  15-7-18- 1-24-35  14- 5-20(~6)-.1-26-35   3-.2- 0-38-24-35       7-3- 4-25-26-35
XX  31-4- 5-15-19-26  26- 9-14- 5-20-26      10- 8- 3-32-20-26      16-9- 9-20-21-26
FX  21-5- 5-19-21-29  18- 6- 4-20-22-29       7- 4- 2-36-22-29
FF  22-7- 1-14-23-34  29- 5- 1-15-26-34       6- 2- 1- 2-54-34
WW  27-5- 2-18-20-28  23-11- 3-14-21-28      12- 9- 2-28-21-28      15-9- 4-11-32-28
II  30-5- 4-16-20-26  23- 9- 5-17-20-26       9- 8- 5-32-20-26      16-7- 3-28-21-26
YY  18-4- 1-13-27-37  11- 8- 3-15-26-37       6- 3- 1-19-34-37       8-5- 1- 7-42-37
KK                    15- 4- 2-30-20-28                              6-5- 2-13-45-28

The relevant statistics for classifying the end-game are numbers 3 and 4: the lengthy (i.e. non-tactical) wins versus the fortress draws. (Note '.1' means 0.1!) The other cases resolve fast enough to simpler end-games for the engine to base the score on static evaluations outside this end-game. A smart evaluation strategy for these end-games could be to initially classify them as a (pawnless) win, but for those that are mainly fortress draws increase the discount factor to a drawish value when the 50-move counter goes up, reflecting the observation that when you cannot make a winning exit from the end-game in the first 3 moves, your chances for a win will be pretty bleak. When looking ahead from end-games with a single Pawn in jeopardy (e.g. Q+P vs F+2X) they should be treated as drawish, as after sacrificing X or F for P the remaining F and/or X will typically be tactically safe (or they would have been picked off before).
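A minimal sketch of such a counter-dependent discount follows. Everything in it is an assumption made for illustration: the function name, the 6-ply grace period (3 moves), the linear ramp over 10 ply, and the drawish floor of 0.1 are not values from any engine.

```python
def drawish_scale(fifty_move_ply, mainly_fortress,
                  start=0.5, floor=0.1, grace_ply=6, ramp_ply=10):
    """Multiplier for the raw advantage in a pawnless super-piece vs pair ending.

    fifty_move_ply:  plies since the last capture (the 50-move counter).
    mainly_fortress: True if the non-tactical results are mostly fortress draws.
    start:           normal pawnless discount (the factor 2, as a multiplier).
    floor:           assumed drawish multiplier once no winning exit was found.
    """
    if not mainly_fortress or fifty_move_ply <= grace_ply:
        return start
    # Slide linearly from `start` to `floor` once the grace period is over.
    t = min(1.0, (fifty_move_ply - grace_ply) / ramp_ply)
    return start * (1.0 - t) + floor * t


for ply in (0, 6, 11, 16, 40):
    print(ply, drawish_scale(ply, mainly_fortress=True))
```

End-games that are not flagged as mainly fortress draws keep the plain pawnless discount, so only the flagged ones drift toward a draw score as the counter goes up.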

The Archbishop vs two Fads sticks out because in 54% of the cases the Fads can force capture of the Archbishop. (More typically the chances to force super-piece capture are only 20-25%.) One should not conclude from this that the game is mostly won for the Fads, though. The Archbishop is only rarely captured without compensation, and even trading it for a single Fad leaves no mating potential, and thus causes an instant draw. Only 7.46% are genuine losses (Archbishop lost without compensation, or an immediate checkmate). The Fads do dominate the game, however. Where in the other end-games gaining the super-piece in almost all cases happens on the first or second move, here that happens in only 10% of the cases, and takes on average 25 moves otherwise (worst case even 57 moves). The Fads will just methodically tighten the mating net around the enemy King, keeping their own King safe from perpetual check, and at some point the mate can only be averted by sacrificing the Archbishop.

In two cases (A vs B+N, C vs B-pair) a large fraction of the lengthy wins was cursed, and the table mentions the number of cursed wins in parentheses. We see the Archbishop doesn't perform very well; the only case where it has a good number of wins is against B+N (which is the weakest defending combination). A Queen beats the FIDE minors; even the pair of Knights, which still puts up a fight, manages to reach a fortress in less than half the cases, after disregarding all initial tactics. The Queen doesn't manage to beat any pair from the other armies, though. The Chancellor does better: it also beats two WA, and thoroughly crushes the pair of Knights, but has some difficulty with the B-pair because the wins take too long.
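To compare lengthy wins against fortress draws, the table entries can be parsed mechanically. The sketch below uses field names and helper functions of my own choosing; it drops the cursed-win annotation in parentheses and reports the fortress share among the non-tactical results (numbers 3 and 4 of each entry).

```python
import re

FIELDS = ("shallow_win", "deep_win", "lengthy_win",
          "fortress_draw", "forced_loss", "king_capture")


def parse_entry(entry):
    """Parse e.g. '26-4-13-11-21-25' or '6- 2-15(~8)-27-22-28' into a dict."""
    cleaned = re.sub(r"\(.*?\)", "", entry)      # drop '(~8)' style annotations
    values = [float(v) for v in cleaned.replace(" ", "").split("-")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected 6 numbers, got {len(values)}: {entry!r}")
    return dict(zip(FIELDS, values))


def fortress_share(entry):
    """Fraction of non-tactical results (lengthy wins + fortress draws) drawn."""
    stats = parse_entry(entry)
    total = stats["lengthy_win"] + stats["fortress_draw"]
    return stats["fortress_draw"] / total if total else None


# Entries copied from the table: Q vs NN and A vs NN.
for name, entry in (("Q vs NN", "26-4-13-11-21-25"),
                    ("A vs NN", "9- 5- 2-40-19-25")):
    print(f"{name}: fortress share {fortress_share(entry):.2f}")
```

For Q vs NN this gives 11 out of 24, a bit under half, while for A vs NN it is 40 out of 42.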

[Edit 15-4-2019]

The Colonel is also weak, and only has some success against a pair of Knights. But because it is quite poor at delivering perpetual check, it actually runs a large risk of losing against pairs of majors, where sacrificing it for one leaves a lost 3-men ending, even against the weak ones where the piece values suggest it has an advantage (Woody Rook, Commoner and Dragonfly). The forced conversions against these pairs are indeed mostly losing conversions, and especially for the Commoners most of these are lengthy.