Ratings & Comments

Chess programs move making [Subject Thread]
Aurelian Florea wrote on Mon, Sep 19, 2022 11:00 AM UTC in reply to H. G. Muller from 08:25 AM:

The NN outputs a probability distribution over all the possible moves (illegal moves are set to 0 and the probabilities sum to 1). The MCTS uses this distribution, combined with an exploration coefficient and Dirichlet noise, to form a score, and chooses a move to expand, until a certain number of nodes (6,000 in chess) have been visited. The search descends this way until a leaf node is reached, which is then expanded. This link explains it better than I can:

https://joshvarty.github.io/AlphaZero/
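
For concreteness, a minimal sketch of the selection rule described above (the PUCT formula); the Node layout here is hypothetical, and Dirichlet noise would be mixed into the root priors before selection:

    from dataclasses import dataclass, field
    import math

    @dataclass
    class Node:
        prior: float              # NN probability of the move into this node
        visits: int = 0
        value_sum: float = 0.0    # sum of backed-up evaluations
        children: list = field(default_factory=list)

    def puct_select(node, c_puct=1.5):
        # Choose the child maximizing Q + U: average value plus an
        # exploration term that favors high-prior, rarely-visited moves.
        total = sum(ch.visits for ch in node.children)
        best, best_score = None, -math.inf
        for ch in node.children:
            q = ch.value_sum / ch.visits if ch.visits else 0.0
            u = c_puct * ch.prior * math.sqrt(total) / (1 + ch.visits)
            if q + u > best_score:
                best, best_score = ch, q + u
        return best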


H. G. Muller wrote on Mon, Sep 19, 2022 08:25 AM UTC in reply to Aurelian Florea from 08:17 AM:

I thought AlphaZero used the output of its NN for evaluating leaf nodes. That makes it different from 'normal' MCTS, which would randomly play out games until they satisfy a win or draw condition, and use the statistics of such 'rollouts' as a measure for the winning probability in the leaf.
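
A minimal sketch of such a rollout, assuming a hypothetical rules object with side_to_move, is_over, legal_moves, make and result methods:

    import random

    def rollout(pos, game):
        # Plain-MCTS leaf evaluation: play uniformly random legal moves
        # until the game ends, then report the result (1 = win, 0.5 =
        # draw, 0 = loss) from the viewpoint of the leaf's side to move.
        leaf_side = game.side_to_move(pos)
        while not game.is_over(pos):
            pos = game.make(pos, random.choice(game.legal_moves(pos)))
        return game.result(pos, leaf_side)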


Aurelian Florea wrote on Mon, Sep 19, 2022 08:17 AM UTC in reply to Greg Strong from Sun Sep 18 05:53 PM:

Thanks, Greg. My conundrum comes from the definition of leaf nodes. In the traditional approach you apply the evaluation function there, but in AlphaZero's MCTS, as I understood it, positions are only scored when the endgame conditions apply.


Greg Strong wrote on Sun, Sep 18, 2022 05:53 PM UTC in reply to H. G. Muller from 10:09 AM:

A quick overview for those who are interested ... A traditional Chess program has 3 parts (a minimal sketch follows the list):

Move Generation - Given a position, find all legal moves for the side to move.

Search - Recursively play out moves and counter-moves to find the best sequence (called the PV, or Principal Variation).

Evaluation - At the leaf nodes at the end of each search path, evaluate the position. This function returns a number - the more positive, the better for player 1; the more negative, the better for player 2.
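
A minimal sketch of how the three parts fit together, in negamax form (so the score is always from the side to move's point of view); generate_moves, make and evaluate are hypothetical placeholders:

    INF = 10**9

    def search(pos, depth):
        # Search: recursively play out moves and counter-moves.
        if depth == 0:
            return evaluate(pos)                # Evaluation at the leaf
        best = -INF
        for move in generate_moves(pos):        # Move Generation
            best = max(best, -search(make(pos, move), depth - 1))
        return best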

Chess variants. To program a chess variant, you definitely need to change Move Generation. You probably also need to change the Evaluation function. If nothing else, at least give material values to the new piece types; material is the most significant of all the evaluation terms. But other things may need to be altered -- for example, pawn structure is important, but the usual pawn-structure terms should not apply to Berolina Chess.

The Search is typically extended by something called a Quiescent Search. The Evaluation function cannot reliably detect pins, forks, hanging material, etc., which would all affect the evaluation a lot, so it can only be used on "quiet" ("quiescent") positions. After the search function searches all legal moves to the desired depth, it hands off to the Quiescent Search, which continues searching recursively, but considers only captures (and, in some programs, checking moves too, although this isn't common in chess variant engines). This way, all exchanges in progress are played out and hanging pieces are captured before the position is evaluated.
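
A minimal sketch of such a Quiescent Search, with the usual 'stand pat' rule (the side to move may always decline to capture); generate_captures, make and evaluate are hypothetical placeholders:

    def qsearch(pos, alpha, beta):
        # The static evaluation is a lower bound for the side to move,
        # since it can simply decline all captures ('stand pat').
        stand_pat = evaluate(pos)
        if stand_pat >= beta:
            return beta
        alpha = max(alpha, stand_pat)
        for move in generate_captures(pos):     # captures only
            score = -qsearch(make(pos, move), -beta, -alpha)
            if score >= beta:
                return beta                     # refutation found
            alpha = max(alpha, score)
        return alpha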

So, with that background on Quiescent Search ... Remember how I said the Search function doesn't need to change for different variants? Well, that's not entirely true. For some variants, like Shogi (or other variants with drops) or double-move variants, there are no "quiet" positions, so traditional Quiescent Search doesn't work; other approaches must be taken. ChessV doesn't modify the Search function for different variants at all. That's why it doesn't play Shogi. It does play Marseillais Chess, but I haven't modified the Search; I'm basically just sweeping the issues under the rug... I don't know for certain how Zillions-of-Games works, but I believe it has no Quiescent Search function at all. It modifies the Move Generator to support different games, but there is no variant-specific Search or Evaluation.

ChessV handles the Evaluation by building it from various elements that can be turned on, off, or configured. (Pawn structure, king safety, outposts, colorbinding, open file bonuses, castling rights, etc.) You can find out basically everything ChessV does for every game it plays by looking at the Game Reference: http://www.chessv.org/reference/games/
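
Not ChessV's actual code, but a minimal sketch of the idea of an evaluation assembled from switchable elements (all names are made up for illustration):

    def evaluate(pos, terms):
        # Sum only the evaluation elements enabled for this variant.
        score = material(pos)
        if terms.get('pawn_structure'):
            score += pawn_structure(pos)
        if terms.get('king_safety'):
            score += king_safety(pos)
        if terms.get('open_files'):
            score += open_file_bonus(pos)
        return score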


H. G. Muller wrote on Sun, Sep 18, 2022 10:09 AM UTC in reply to Kevin Pacey from Sat Sep 17 09:29 PM:

Well, it is a bit more subtle than that, because you would not know what line B leads to without searching it first. And after you have done that, it is too late to prune it. So what really happens is that after you have found the single reply to the first move of line B that leaves you stuck with the doubled Pawns, you consider the first move of B refuted, and you prune the remaining replies to it.

But you are right in the sense that positional characteristics can also be sufficient reason to reject a move. This is purely a matter of evaluation, though. Whatever search method you use, it should in the end give you the line to the position with the best evaluation that you can force (the 'principal variation'). If there is no mate in view, the evaluation of positions must be based on heuristics. Piece values are one such heuristic. In general one should base the evaluation only on features that are not easily changed. Otherwise it would be pretty meaningless: you know the game must go on from there (even though it is beyond your capability to take into account what will happen then), and if a feature would in general not persist after the next move, it is pointless to optimize for it. But next to material composition, pawn structure is definitely a persistent feature (with FIDE or Shatranj Pawns!). King safety as well, when the King is a slowly moving piece.

What you will have to consider in evaluation can indeed depend very much on the rules of the variant. Using a neural network for evaluation (as both AlphaZero and NNUE do) and training it on positions from games is a method to avoid having to think about that yourself; you hope the NN is clever enough to recognize the important features and quantify their importance. Whether this leads to different pruning in alpha-beta search is of no importance; this kind of pruning is guaranteed to have no effect on the principal variation that will be found. With or without pruning it would be the same. The pruning is just an optimization to get that same result, which was fully specified by the evaluation of the tree leaves, without wasting time on lines that neither of the players would want to play.

From conventional chess engines it is known that pawns on the next-to-last rank, as well as king safety, can have values comparable to the material value of a minor piece.
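
A minimal sketch of a hand-crafted evaluation built only from such persistent features; the weights and helper functions are made up for illustration:

    def evaluate(pos):
        # Material plus persistent positional terms; + is good for White.
        score = material(pos)
        score -= 15 * doubled_pawns(pos, WHITE)     # pawn structure
        score += 15 * doubled_pawns(pos, BLACK)
        score += king_shelter(pos, WHITE) - king_shelter(pos, BLACK)
        return score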


Kevin Pacey wrote on Sat, Sep 17, 2022 09:29 PM UTC:

Hi H.G.

My memory/understanding may be off, but I thought that if, for a chess engine, line A is at least = and line B results in doubled isolated pawns without compensation, then line B would be pruned out. It's not due to material loss (e.g. a pawn for nothing) but rather due to chess-specific knowledge (that doubled isolated pawns are bad if not compensated for). I could imagine a CV where the opposite might be true, depending on the rules of movement or the topology of the board, i.e. where doubled pawns might be good (e.g. in a game with Berolina pawns).


H. G. Muller wrote on Thu, Sep 15, 2022 11:22 AM UTC in reply to Aurelian Florea from 08:49 AM:

The only moves that are pruned in an alpha-beta search are those whose score could not possibly affect the move choice at the root, because they are in a branch that has already been refuted. E.g. if you find that in a position that is at least equal (i.e. you have already searched a move that does not lose anything), an alternative move has a reply that loses a pawn without compensation, you would consider that alternative move refuted. It would be a waste of time to continue searching other replies to that move in order to determine whether they make you lose even more, because losing a Pawn is already bad enough to dissuade you from playing the alternative move.

This is purely determined from the scores of the moves; you don't need any variant-specific knowledge for it.
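
A minimal alpha-beta sketch showing that the cutoff test uses nothing but the returned scores (generate_moves, make and evaluate are hypothetical placeholders):

    def alphabeta(pos, depth, alpha, beta):
        if depth == 0:
            return evaluate(pos)
        for move in generate_moves(pos):
            score = -alphabeta(make(pos, move), depth - 1, -beta, -alpha)
            if score >= beta:
                # Beta cutoff: this score already refutes the opponent's
                # previous move, so the remaining moves here are pruned.
                return beta
            alpha = max(alpha, score)
        return alpha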


Aurelian Florea wrote on Thu, Sep 15, 2022 08:49 AM UTC in reply to H. G. Muller from 08:19 AM:

Ok, HG, I could have misconstrued something. But from what I understand, I'd have to cut some moves while doing the alpha-beta search, for example after a certain number of plies, and then do the classification learning. Or did you mean something else? Anyway, I was saying that I can use hand-crafted features as input to the neural network.


H. G. Muller wrote on Thu, Sep 15, 2022 08:19 AM UTC in reply to Aurelian Florea from 06:56 AM:

What pruning? Alpha-beta search only prunes after a beta cutoff, and this does not require any knowledge about the variant.


Aurelian Florea wrote on Thu, Sep 15, 2022 06:56 AM UTC in reply to H. G. Muller from Wed Sep 14 08:17 PM:

I meant the pruning!


H. G. Muller wrote on Wed, Sep 14, 2022 08:17 PM UTC in reply to Aurelian Florea from 08:12 AM:

What do you mean? Alpha-beta search is always the same, no matter what variant it is used for. It does not even have to be a chess variant. As long as it is a deterministic 2-player game with perfect information it should work the same.


Aurelian Florea wrote on Wed, Sep 14, 2022 08:12 AM UTC in reply to Samuel Trenholme from Tue Sep 13 03:35 PM:

Actually, when programming a new game it is not easy to do alpha-beta search. I was thinking of adding some positional features (like moving the central pawns) to the input of the neural network. This should speed things up a bit.


Samuel Trenholme wrote on Tue, Sep 13, 2022 03:35 PM UTC:

A NNUE with an alpha-beta search is fine: Stockfish 15 is stronger than any other chess player in the world, either human or computer.

The main thing that is interesting about AlphaZero is that it can play a superhuman game of chess with no human chess knowledge except the game's rules. So, for example, any opening or middlegame strategy it has is not based on human play.


Aurelian Florea wrote on Sat, Sep 10, 2022 08:09 AM UTC in reply to H. G. Muller from Fri Sep 9 01:58 PM:

@HG, Thanks for not discouraging me. That matters a lot, trust me. I'll then slowly train weak bots, stronger every time. Maybe in time hardware will become available. And also, who knows, better software ideas. Good luck!


H. G. Muller wrote on Fri, Sep 9, 2022 01:58 PM UTC in reply to Aurelian Florea from 12:33 PM:

Indeed such questions are more suitable for talkchess.com.

But the fact that the AlphaZero NN also has a 'policy head' next to the evaluation is not the largest difference. IIRC this is only a matter of the final layer, which calculates a preference for each possible move in the same way as the score of the position is calculated. (I.e. each move output is fully connected to the previous layer.)

The main difference is the size. The AlphaZero net is so large that even using a Google TPU or a fast GPU to calculate it slows down the engine by a factor of 100-1000, in terms of positions/sec searched. The NNUE only slows the engine down by a modest factor, even on a CPU, because it is so small.
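
Purely as an illustration of the 'policy head next to the value head' layout, a toy two-headed network (the sizes are arbitrary, and the real AlphaZero trunk is a deep convolutional ResNet, not a single linear layer):

    import torch
    import torch.nn as nn

    class TwoHeadNet(nn.Module):
        def __init__(self, n_features=768, n_moves=4096, hidden=256):
            super().__init__()
            self.trunk = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
            self.value_head = nn.Linear(hidden, 1)          # position score
            self.policy_head = nn.Linear(hidden, n_moves)   # one logit per move

        def forward(self, x):
            h = self.trunk(x)
            return torch.tanh(self.value_head(h)), self.policy_head(h)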


Aurelian Florea wrote on Fri, Sep 9, 2022 12:33 PM UTC in reply to H. G. Muller from 08:07 AM:

@HG,

Yes, in my previous comment I was referring to the difference between the AlphaZero neural net and the NNUE approach. AlphaZero's neural net has 2 parts: a CNN feature-extraction part and a fully connected classification part. In the NNUE, from what I understand, you keep only the second part. I have said that I did not fully understand how the CNN part works; I'm not sure what filters it uses. But I do not think this is a question for here anyway.


H. G. Muller wrote on Fri, Sep 9, 2022 08:07 AM UTC in reply to Aurelian Florea from 06:42 AM:

You will already be 5000 times slower by the fact alone that Google used 5000 servers (if I recall correctly), and you only have a single PC. And each Google server contained 4 TPU boards, each capable of performing 256 (or was it 1024?) multiplications per clock cycle, while a single core in a PC can do only 8. So I think you are way too optimistic.

I am not sure what your latest comment refers to. If it is the difference between AlphaZero and NNUE: the main point there is that the NNUE net uses 5 layers of 32 cells, plus an initial layer that can be calculated incrementally (so that its size matters little), while the AlphaZero net typically has 32 layers of 8 x 8 x 256 cells.
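
A minimal sketch of why that initial layer can be calculated incrementally: a move toggles only a few (piece, square) input features, so only the corresponding weight columns are subtracted from or added to the running accumulator (W and the feature index lists are hypothetical):

    def update_accumulator(acc, removed, added, W):
        # acc: first-layer pre-activations for the current position
        # removed/added: indices of the few features changed by a move
        # W[f]: the first-layer weights attached to feature f
        for f in removed:
            for i in range(len(acc)):
                acc[i] -= W[f][i]
        for f in added:
            for i in range(len(acc)):
                acc[i] += W[f][i]
        return acc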

I don't want to discourage you, but the task you want to tackle is gigantic, and not very suitable as an introductory exercise in chess programming. A faster approach would be to start with something simple (at the level of Fairy-Max or King Slayer, simple alpha-beta searchers with a hand-crafted evaluation) to get some experience with programming search, then replace its evaluation by NNUE, to get some experience programming neural nets. The experience you gain in the simpler preliminary tasks will speed up your progress on the more complex final goal by more than the simple tasks take. Also note that, compared to people who do this for orthodox chess, you will have the disadvantage of not having high-quality games available to train the networks.


Aurelian Florea wrote on Fri, Sep 9, 2022 06:42 AM UTC in reply to H. G. Muller from Thu Sep 8 09:53 PM:

@HG,

The alpha-beta search would have replaced the feature-extraction part of the NN; that is what I understand. That is the part I did not understand (especially what filters the CNN uses). Anyway, even if it takes a huge amount of time, the algorithm will still improve, so I could present a user with monthly updates. But I'm afraid it will take time until a challenging AI is output by the program.

Once again thanks a lot for your help!


Aurelian Florea wrote on Thu, Sep 8, 2022 09:58 PM UTC:

Well, hundreds of times slower is not that bad. 1000 times slower would be 4000 hours, which is 167 days. That is doable. I have one computer for each game (2 in total, for my remakes of Apothecary Chess). But my games are 10x10 and have bent riders and an imitator, which could itself make things much worse. There could be artifices in the beginning, like inserting fake endgame conditions, to train things like "Do not exchange your queen for a knight!".

Anyway, I'm doing this because it is the only thing that motivates me so far. With the new treatment (3 years now, actually), I am able to pay attention to stuff. I really enjoy doing this. And the programming is not that hard, actually, although there are also NN things I do not understand.

@HG and @Greg: Thanks for your advice!


H. G. Muller wrote on Thu, Sep 8, 2022 09:53 PM UTC in reply to Aurelian Florea from 07:17 PM:

It would take a few months if you had a few dozen computers with very powerful GPU boards as graphics cards. For orthodox chess.

But I should point out there is another approach, called NNUE, which uses a far simpler neural net just for evaluation in a conventional alpha-beta search. This is far easier to train; for orthodox chess a few hundred thousand positions from games with known outcome would be enough.
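
A minimal sketch of how such training typically scores itself: map the evaluation to an expected game result with a sigmoid and push it toward the actual outcome (the scaling constant k is a made-up example value):

    import math

    def outcome_loss(score_cp, outcome, k=0.004):
        # score_cp: evaluation in centipawns for the side to move
        # outcome: actual game result, 1 = win, 0.5 = draw, 0 = loss
        expected = 1.0 / (1.0 + math.exp(-k * score_cp))
        return (expected - outcome) ** 2    # averaged over many positions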


Greg Strong wrote on Thu, Sep 8, 2022 09:23 PM UTC:

Pretty sure H. G. is correct. There is a reason I spend no effort on the neural network approach.


Aurelian Florea wrote on Thu, Sep 8, 2022 07:17 PM UTC in reply to H. G. Muller from 07:09 PM:

I'm aware of this fact, but I'm not sure about six years. Isn't it closer to a few months?


H. G. Muller wrote on Thu, Sep 8, 2022 07:09 PM UTC in reply to Aurelian Florea from 05:19 PM:

Also note that Leela Chess Zero, the imitation of AlphaZero that can run on a PC, took about 6 months of computer time, donated by a large collaboration of people with powerful computers, to learn how to play Chess. For AlphaZero this only took 4 hours, but only by virtue of the fact that they used 5000 of their servers equipped with special hardware that calculated hundreds of times faster than an ordinary PC. On a single PC you would be lucky to achieve that in 6 years of non-stop computing.


Aurelian Florea wrote on Thu, Sep 8, 2022 05:19 PM UTC in reply to Greg Strong from 05:06 PM:

Yes, Greg, I'm sure you are correct. Moreover, in this context it matters less. This is a training program; it can take more time to learn the game, but, as HG has said, that is of little consequence.


Greg Strong wrote on Thu, Sep 8, 2022 05:06 PM UTC:

A mistake most starting chess programmers make is trying to optimize these kinds of things. It is not a good use of your effort. Even if you double the computational speed of your program, that is only good for about 70-100 Elo. The important thing is being bug-free, and the optimizations make things more complicated and introduce more bugs.

