Comments/Ratings for a Single Item
H.G.Muller:
| Well, this is basically how I got the empirical values I quoted. Except
| that so far I only did it for opening positions, so the values are all
| opening values. But, like I said, they don't seem to change a lot during
| the game. For the complete list of exchanges that I tried, see
| http://z13.invisionfree.com/Gothic_Chess_Forum/index.php?showtopic=389&st=1
There might be some problems here: the classical piece value system basically determines the balance when the position is positionally equal and both players play 'correctly'. By contrast, you take a statistical approach, and, as you say, only for opening positions. The classical piece value system depends on a deeper analysis; perhaps the statistical approach can serve as a first approximation. So the approach I suggested is to find a set of end-game positions where the outcome is always a draw or always a win, with extreme positions excepted. This might produce different values. In this setup, it is not possible to set values as exactly, since equal material means 'always results in a draw for the positions considered'; perhaps it leads to only integral values, as a fractional pawn value cannot be used to decide the outcome of games. The fractions might be introduced as positional compensations, like in orthodox chess reasoning, for example 'the sacrificed pawn is compensated by positional development'.
'So the approach I suggested is to find a set of end-game positions where the outcome is always a draw or always a win, with extreme positions excepted. This might produce different values.' The problem is that such positions do not exist, except for some very sterile end-games. In theory every position is either a draw or has a well-defined 'Distance To Mate' for black or white, assuming perfect play from both sides. But this is only helpful for end-games with little enough material that tablebases can be constructed (currently up to 6 men on 8x8). And even there it would not be very helpful, as the outcome usually depends more on exact position than on the material present, and thus cannot be translated to piece values. For a given end-game with multiple pieces, there usually are non-trivial wins for black as well as for white. (Look for instance at the King + Man vs King + Man end-game, which should be balanced by symmetry from a material point of view. Yet, a large fraction of the non-tactical initial positions, i.e. where neither side can gain the other's Man within 10 moves, are won for one side or the other.) And for positions with 7 men or more, we don't even know the theoretical game result for any of the positions. This is why Chess is more interesting than Tic Tac Toe. It is impossible to determine if the position is won or lost, and if you have engines play the same position many times, they will sometimes win, sometimes lose, and sometimes draw. So the only way to define a balanced position is that it should be won as often as it is lost, in a statistical sense (i.e. the probability to win or lose it should be equal). Positions that are certain draws simply do not exist, unless you have things like KBK or take an initial setup where a side with extreme material disadvantage has a perpetual.
|| So the approach I suggested is to find a set of end-game positions
|| where the outcome is always a draw or always a win, with extreme
|| positions excepted. This might produce different values.
H.G.Muller:
| The problem is that such positions do not exist, except for some very
| sterile end-games.
There is a difference between finding formally proved positions, and those that have such an outcome by human experience. If in the middle game one exchanges one's queen for a rook and a temporary, very strong initiative, suppose this initiative does not lead to an immediate mate combination - it might be very effective to threaten a mate, which the opponent can only avoid by giving back some extra material - how much more material is needed in order to ensure a draw? A bishop perhaps, if the pawns are otherwise favorable, otherwise at least some more material, a pawn or two. This is a different judgement than a statistical one: a purely statistical judgment can probably be quite easily beaten by a human. It goes into a difficult AI problem: computers are not very good at recognizing patterns and contexts. But the classical piece value system probably builds on some such information.
It seems to me that the example you sketch is exactly what the piece-value system cannot solve and is not intended to solve. You want to have an estimate of how much material it will cost the opponent to solve a certain mobility or King-safety advantage. Those are questions about the corresponding positional evaluation, not about piece values. Mind you, I am not saying that these positional values are not important. They can be worth many Pawns. But a system of piece values cannot be any good unless it is also able to (statistically) predict the outcome of a game from a given position if these positional terms are negligibly small, or exactly cancel for the two sides. This is why I only use symmetric positions, without blocked Pawns (to avoid positions where the differences needed to create the material imbalance could include trapped pieces). First you have to know piece values, to be able to handle quiet positions without extreme positional characteristics. Once you have those, you are in a position to quantitatively express the equivalent material value of positional characteristics like King safety and piece trapping.
H.G.Muller:
| It seems to me that the example you sketch is exactly what the piece-value
| system cannot solve and is not intended to solve. You want to have an
| estimate of how much material it will cost the opponent to solve a certain
| mobility or King-safety advantage. Those are questions about the
| corresponding positional evaluation, not about piece values.
The situation I depicted is the other way around: I see a good queen for rook exchange that looks promising. Can I at least ensure a draw? How much additional material would I need to get to ensure it? Next, I have a look at the likely pawn development. So the traditional system will tell me whether I need 1, 2 or 3 pawns. The statistical system does not say anything - I am not interested in how players on the average would handle this position.
'The statistical system does not say anything - I am not interested in how players on the average would handle this position.'
I suspect you misunderstand what quantity is analyzed. In any case not how players handle the position. But the very question you do ask, 'what are my chances for a draw with 1, 2 or 3 Pawns in compensation', can only be answered in a statistical sense. The answer will never be 'with 1 Pawn I will lose, with 2 Pawns I will draw, and with 3 Pawns, I will win'. It will be something like: 'With one Pawn I will have 5% chance on a win and 10% on a draw, (and thus 85% for a loss) with 2 Pawns this will be 20-30-50, and with 3 Pawns 50-30-20. And I can count a Passer as 1.5, so if my 2 Pawns include a passer it will be 35-30-35).'
This is all you could ever hope for. But to get that answer, you must know that the W-D-L statistics for the quiet situation after the opponent has defused the mate threat with so-and-so-many Pawns are x%-y%-z%. You will never be able to say if the position is won or not, unless your final position is a tablebase hit (and you have the tablebase).
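To make the W-D-L figures above easier to compare, here is a minimal sketch that turns each triple into a single expected score. It assumes the usual chess scoring convention (a draw counts as half a point); the dictionary keys and the function name `expected_score` are illustrative only, and the percentages are the ones quoted in the comment, not measured results.

```python
# W-D-L percentages as quoted in the comment above (illustrative, not measured here).
wdl = {
    "1 Pawn": (5, 10, 85),
    "2 Pawns": (20, 30, 50),
    "3 Pawns": (50, 30, 20),
    "2 Pawns incl. a passer": (35, 30, 35),
}

def expected_score(win, draw, loss):
    """Expected score in percent, assuming a draw counts as half a win."""
    assert win + draw + loss == 100, "W-D-L percentages must sum to 100"
    return win + draw / 2

scores = {name: expected_score(*triple) for name, triple in wdl.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1f}%")
```

On this reading the passer case (35-30-35) comes out at an even 50%, which is what 'balanced' means in the statistical sense used in this thread.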
H.G.Muller:
| I suspect you misunderstand what quantity is analyzed. In any case not how players handle
| the position. But the very question you do ask, 'what are my chances for a draw with 1, 2
| or 3 Pawns in compensation', can only be answered in a statistical sense. The answer will
| never be 'with 1 Pawn I will lose, with 2 Pawns I will draw, and with 3 Pawns, I will
| win'. It will be something like: 'With one Pawn I will have 5% chance on a win and 10% on
| a draw, (and thus 85% for a loss) with 2 Pawns this will be 20-30-50, and with 3 Pawns
| 50-30-20. And I can count a Passer as 1.5, so if my 2 Pawns include a passer it will be
| 35-30-35).'
No, this is the flaw of your method (but try to refine it): Chess is not played against probabilities, as in say poker. There is really thought to be a determined outcome in practical playing, just as the theory says. I can have a look at my opponent and ask 'what are the chances my opponent will not see my faked position' - but that would lead to poor playing. Much better is trying to play in positions that your opponent for some reason is not so good at, but it does not mean that one takes a statistical approach to playing. Playing strength is dependent roughly on how deep one can look - the one that looks the furthest. There are two methods of looking deeper: to compute more positions, or to find a method by which positions need not be computed, because they are unlikely to win. 'Unlikely' here does not refer to a probability of position, but past experience, including analysis. With a good theory in hand, one can try to play into positions where it applies -- this is called a plan. If I have an idea of when the position values apply, I can try to play into such situations, and try to avoid the others. When I looked at your statistical values, I realized I could not use them for playing, because they do not tell me what I want to know. A computer that does not care about such position evaluations may be able to use them. But I think the program will not be very strong against an experienced human. But in the end, it is the method that produces the wins that is the best.
Well, let us take one particular position then: the opening position of FIDE Chess. You maintain that the outcome of any game starting from this position is fixed? A quick peek in the FIDE database of Grand Master games should be sufficient to convince you that you are very, very wrong about this. Games starting from this position are lost, won and drawn in enormous numbers. There is no pre-determined outcome at all. Computers that use the statistical approach, such as Rybka, are incomparably strong. Humans, using the way you describe, simply cannot compete. These are the facts of life.
HGM: It is pure luck, that chess computer program strengths accompanied our masters for a time. Nobody will try to win a sprint against a Porsche, because the difference is even more obvious. So it does not tell anything about how to handle a subject, when different species are competing. As for our both chess programs, which are obviously of different maturity, you can find out, which is currently more successful, but that does not mean, that the winning one is based on 'more correct' ideas. Ideas are rarely implemented in equal quality. Maybe it needs a lot of approaches of different people to have a decision on such a question after a long time period.
H.G.Muller:
| Well, let us take one particular position then: the opening position of
| FIDE Chess. You maintain that the outcome of any game starting from this
| position is fixed?
Since there is only a finite number of positions, there is theoretically an optimal strategy.
| A quick peek in the FIDE database of Grand Master games
| should be sufficient to convince you that you are very, very wrong about
| this. Games starting from this position are lost, won and drawn in
| enormous numbers. There is no pre-determined outcome at all.
They probably haven't searched through all positions to find the optimal strategy.
| Computers that use the statistical approach, such as Rybka, are
| incomparably strong. Humans, using the way you describe, simply cannot
| compete. These are the facts of life.
I suspect that humans are not allowed to search through zillions of positions like computers do. Specifically, if strong human players are allowed to experiment to search out the weaknesses of the computer programs, and to use other computer programs to check against computational mistakes, I think that these computer programs will not be as strong. But as for the question of piece values, humans and computers may prefer different values - it depends on how they should be used.
Reinhard Scharnagl:
| It is pure luck, that chess computer program strengths accompanied our masters for a time.
| Nobody will try to win a sprint against a Porsche, because the difference is even more obvious. So
| it does not tell anything about how to handle a subject, when different species are competing.
The computer programs for orthodox chess succeed in part because the number of positions required for achieving a good lookahead is fairly small relative to the capacity of the most powerful computers. If one designs a chess variant with more material in a way that it is still very strategic to humans, then it will be harder for computers to be good at that variant, as the number of positions that must be searched will be much larger.
Hans, Indeed, your argument fails to take account of this gap between theory and reality. People have not explored a significant fraction of the game tree of Chess from the opening, or even from late middle-game positions. And computers cannot do it either. Nor will they ever, for the lifetime of our Universe. Outcomes that are in principle determined, but by factors that we cannot control or cannot know, are logically equivalent to random quantities. That holds for throwing dice, generating pseudo-random numbers through the Mersenne Twister algorithm, and Chess alike. So Chess players, be they Humans or Computers, will have to base their decisions on GUESSED evaluations of the positions that are in their search tree. And the nature of guessing is that there is a finite PROBABILITY that you can be wrong. Chess is a chaotic system, and an innocuous difference between two apparently completely similar positions, like displacing a Pawn or a King by one square, or even who has the move, can make the difference between win and loss. The art of evaluation of game positions is to make the guesses as educated as possible. By introducing game concepts you can pay attention to ('evaluation terms') one can classify positions, but the number of positions will always be astronomical compared to the number of classes we have: even if we would consider the collection of all positions that have the same material, the same Pawn structure, the same King safety, the same mobility, etc., we are still dealing with zillions of positions. And unless material is very extreme (like KQQK), some of these will be won, some drawn, some lost. The moment this would change (e.g. because we can play from 32-men tablebases) Chess would cease to be an interesting game. But until then, any evaluation of a position is probabilistic. The game will most often be won for the player that goes for the positions that offer the best prospects. 
The player that only moves to positions that are 100% certain wins, will forfeit all games on time... Now piece values are the largest explaining factor of evaluation scores, if you would do a mathematical 'principal component analysis' of the game result as a function of the individual evaluation terms. That is a property of the game, and independent of the nature of the player. Humans and computers have to use the same piece values if they want to play optimally. Only when the search would extend to checkmate in every branch, (possibly via a two-directional search, starting from opening and checkmate, meeting in the middle as tablebase hits) evaluation would no longer be necessary, and piece values would no longer be needed.
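The remark above about the Mersenne Twister (a fully determined sequence that is random for anyone who does not know its internal state) can be illustrated with a small sketch. Python's standard `random` module uses the Mersenne Twister as its core generator; the seed values and the helper name `rolls` are arbitrary choices made for this illustration.

```python
import random

def rolls(seed, n=10):
    """Return n die rolls from a Mersenne Twister generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    return [rng.randint(1, 6) for _ in range(n)]

a = rolls(2008)
b = rolls(2008)   # same seed: the 'outcome' is predetermined and repeats exactly
c = rolls(2009)   # different seed: a different, equally predetermined sequence

print(a)
print(b)
print(c)
print("same seed gives same rolls:", a == b)
```

To someone who knows the seed nothing here is random at all; to someone who does not, the rolls have to be treated statistically.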
Reinhard, Apparently my point is not yet clear. My conclusion on piece values has nothing to do with the fact that Joker80 ends consistently above Smirf in 10x8 Chess tourneys, and that Joker values pieces in one way, and Smirf values them in another. This indeed is not proof for anything. The result is 100% based on the fact that if I let engines play positions with material imbalance, certain combinations of pieces systematically beat other combinations, if the players are equally strong (e.g. because they are identical). This is independent of the engine used, provided it is not completely silly. No matter how you PROGRAMMED Smirf to value the Archbishop, you cannot prevent that the Smirf that plays a complex position where it has A will systematically beat the Smirf that has B+N+P instead. Because A is much stronger than B+N+P, and Smirf's search will discover that, because it can use the A to gobble up Pawns which the B+N will not be able to defend. And even if it could trade the A for B+N, and would think it is an equal trade, it will most of the time prefer to win a Pawn. And by the time it has captured the Pawn, the win of the other Pawn will be within the horizon, and then another. And when the opponent is out of Pawns, its own passers will get the promotion within the horizon. Most of the time you will not be able to fool the search by giving it faulty piece values.
GMs only search 6-8 ply deep, looking through at most a few hundred searches per position. That is feasible for a brute force computer (with pruning and noting that position evaluation still is needed), given the rather low average number of moves per position in orthodox chess. But suppose that the average number of moves per position is made a couple of times larger. If the game is still such that humans can play it strategically, then that might improve things for humans in the competition with computers.
H.G.Muller:
| Chess is a chaotic system, and a innocuous difference between two
| apparently completely similar positions [...] can make the difference
| between win and loss.
This is only true if positions are viewed out of context. Humans overcome this by assigning a plan to the game. Not all positions may be analyzable by such a method. The human analyzing method does not apply to all positions: only some. For effective human playing, one needs to link into the positions to which the theory applies, and avoid the others. If one does not succeed in that, a loss is likely. The subset of positions where such a theory applies may not be chaotic, then.
The main difference between computers and GMs is that the latter search selectively, and have very efficient unconscious heuristics for what to select and what to ignore (prune). Computers have no insight what to prune, and most attempts to make them do so have weakened their play. But now hardware is so fast that they can afford to search everything, and this bypasses the problem. Also for the evaluation, Human pattern-recognition abilities are much superior to computer evaluation functions. But it turns out extra search depth can substitute for evaluation. It has been shown that even an evaluation that is a totally random number unrelated to the position, combined with a fairly deep search, will lead to reasonable play. (This because the best of a large number of randoms is in general larger than the best of a small number. So the search seeks out those nodes that have many moves, and that usually are the positions where the most valuable pieces are still on the board. So it will try to preserve those pieces.) Making the branching ratio of a game larger merely means the search depth gets lower. If this helps the Human or the computer entirely depends on if the fraction of PLAUSIBLE moves, that even a Human cannot avoid considering, increases less than proportional. Otherwise the search depth of the Human might suffer even more than that of the computer. So it is not as simple as you make it appear below.
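The parenthetical claim above, that the best of a large number of randoms is in general larger than the best of a small number, can be checked with a short sketch. It assumes the random evaluations are independent and uniform on [0, 1), in which case the expected maximum of n draws is n/(n+1); the move counts 5, 20 and 40 are arbitrary illustrative values, not figures from any engine.

```python
import random

def expected_max_uniform(n):
    """Exact expected maximum of n independent uniform(0, 1) draws."""
    return n / (n + 1)

def simulated_max(n, trials=20000, seed=1):
    """Monte Carlo estimate of the same quantity, with a fixed seed."""
    rng = random.Random(seed)
    total = sum(max(rng.random() for _ in range(n)) for _ in range(trials))
    return total / trials

# A node with more legal moves gets, on average, a better 'best random score'.
for moves in (5, 20, 40):
    exact = expected_max_uniform(moves)
    estimate = simulated_max(moves)
    print(f"{moves:2d} moves: exact {exact:.3f}, simulated {estimate:.3f}")
```

Under this assumption a search maximising random scores is biased toward nodes with many moves, which is the mobility effect described in the comment.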
'This is only true if positions are viewed out of context. ...' It never happened to you that early in the game you had to step out of check, and because of the choice you made the opponent now promotes with check, being able to stop your passer on the 7th? I think that if you are not willing to consider arguments like 'here I have a Knight against two Pawns (in addition to the Queen, Rook, Bishop and 3 Pawns for each), so it is likely, although not certain, that I will win from there', the number of positions that remains acceptable to you is so small that the opponent (not suffering from such scruples) will quickly drive you into positions where you indeed have 100% certainty.... That you have lost! What is your rating, if I may ask?
This is an interesting and thought-provoking conversation. I've got a couple questions. If the game has a large-enough branching ratio that computers can't adequately search in 'reasonable' [however it's defined] time, then wouldn't the different values that a human and a computer might assign to the same pieces *possibly* contribute to or result from the person's 'intuition'/superior playing ability? Following is a quote from a recent HG Muller post: 'Making the branching ratio of a game larger merely means the search depth gets lower. If this helps the Human or the computer entirely depends on if the fraction of PLAUSIBLE moves, that even a Human cannot avoid considering, increases less than proportional. Otherwise the search depth of the Human might suffer even more than that of the computer.' If each side gets multiple moves per turn, this would increase the branching ratio. But without changing other things about chess, the human would probably be even more at a disadvantage; in Marseilles Chess, for example, with its 2 moves/side/turn, because the computer would calculate that out easily, correct? Consider a larger game, with several kings and several moves per turn. Let each side have as many moves per turn as they have kings. Restrict movable pieces on each side to only those friendly pieces within close proximity of the kings. This begins to shift the advantage to the human, I suspect, especially when there are maybe 6 - 8 kings or more per side, the pieces are reasonably simple, easy to understand and use, work well in groups, and number few in variety, with fairly high numbers of each type of piece. This should shift the game more toward pattern recognition and away from pure brute-force calculations, which would seemingly take rather long. Opinions? 
Btw, I really haven't played FIDE seriously in 4 decades and I never got a rating, but my ratings at CV are 1550ish lifetime and bouncing around in the 1600's for the past 18 months, and I'd really love to get a chance to play against a program in games like the kind described above.
Harm, let me then focus on your method of having one engine play against itself using different armies: I am convinced - so please correct me if need be - that your engine has implemented just that value scheme, you are talking about. Suppose, that true values probably are somehow different from the built in figures. Then your engine will throw away underestimated pieces too cheap and keep overestimated too dear. Thus it will start and avoid a lot of trades in unjustified manner. And the bad payload for this is, that you are not able to detect it, because equal engines are blind to penalize each other for such mistakes. And in the end therefore such tests tend to become a self fulfilling prophecy.
Reinhard:
| I am convinced - so please correct me if need be - that your engine
| has implemented just that value scheme, you are talking about. ...
Well, initially, of course NOT. How could it? I am not clairvoyant. I
started by 'common-sense logic' like 'Q = R + N + 1.5, and B << N
probably means A << Q, and the synergy bonus probably scales proportional
to piece value, so let me take A = B + N + 1'. Which translated to A =
7.5 with my 8x8 values B = N = 3.25 (taken from Kaufman's work).
And with the setting A=7.5, C=9.0, I played the 'Chancellor army'
against the 'Archbishop army', expecting the latter to be crushed,
because of the 300 cP inferiority. (Which corresponds to piece odds, and
should give 85%-90% scores.) But to my surprise, although the two
Chancellors won, it was by less than the Pawn-odds score.
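The arithmetic behind the figures above can be written out as a short sketch. The variable names are illustrative, and reading the 300 cP as "two pieces per army, each 1.5 Pawns apart" is an interpretation based on 'the two Chancellors' in the text, with 1 Pawn taken as 100 cP.

```python
# Initial guesses quoted in the comment (in Pawn units).
B = N = 3.25            # 8x8 minor-piece values, attributed to Kaufman in the text
SYNERGY_BONUS = 1.0     # the '+ 1' in 'A = B + N + 1'
A = B + N + SYNERGY_BONUS   # Archbishop guess
C = 9.0                     # Chancellor setting used for the first match

# Assumption: each army differs only in having two C instead of two A.
PIECES_PER_ARMY = 2
CP_PER_PAWN = 100
deficit_cp = PIECES_PER_ARMY * (C - A) * CP_PER_PAWN

print(f"A = {A}")                              # 7.5
print(f"Archbishop army deficit = {deficit_cp:.0f} cP")  # 300 cP
```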
| Then your engine will throw away underestimated pieces too cheap and
| keep overestimated too dear. Thus it will start and avoid a lot of
| trades in unjustified manner.
This hardly occurs, because this is SELF-PLAY. The opponent has the same
misconception. If I tell the engine A < R, then the engines will NOT throw
away their A for R, because the opponent will not let them, and 'save'
its Rook when it comes under A attack. Trades of unlike material occur
only rarely, unless the material is considered exactly equal (which I
therefore avoid). So putting A=R is dangerous, and would suppress the
measured A value because of bad A vs R trades. But not completely, as it
would not always happen, and a fair fraction of the games would still be
able to cash in on the higher power of A by using it to gain material or
inflict checkmate before the trade was made. So even when you do set A=R,
or A=R+P, the A will score significantly better than 50% in an A vs R (or
AA vs RR) match. And when I discover that, I increase the A value
accordingly, until self-consistency is reached.
So my initial tests of CC vs AA, with the engine set to A=750, C=900
suggested that C-A ~< 50 cP. Then I repeated the match with A=850, and
this eliminated the few bad trades that could not be avoided by the
opponent. So CC beat AA now by an even smaller margin, of less than half
the Pawn-odds score (in fact more like a quarter). So I set A=875. I did
not repeat the test with A=875 yet, but I don't expect this 25cP
different setting to cause a significant change in the result (compared to
the statistical error with the number of games I play), if changing a full
100 cP only benefited the AA army 6%. The extra 25cP will not reverse the
sign of any trade.
So in practice, you are highly insensitive to what values you program into
the engine, and iterating to consistency converges extremely fast. You
should not make it too extreme, though: if you set Q < P, the side with
the Queen will always squander it on a Pawn, as there is no way the
opponent could prevent that, the Queen being so powerful and the Pawns
being abundant, exposed and powerless. Similarly, setting A < N would
probably not work even in an A vs B+N ending (with Pawns), as the A is
sufficiently powerful compared to individual B and N that the latter
cannot escape being captured by a suicidal A. But if you are off
'merely' 2-3 Pawns, the observed scores will already be very close to
what they should be based on the true piece values.
H.G.Muller:
| Computers have no insight what to prune, and most attempts to make
| them do so have weakened their play. But now hardware is so fast that
| they can afford to search everything, and this bypasses the problem.
So it seems one should design chess variants where the average number of moves per position is so large that one has to prune.
| Making the branching ratio of a game larger merely means the search
| depth gets lower. If this helps the Human or the computer entirely
| depends on if the fraction of PLAUSIBLE moves, that even a Human
| cannot avoid considering, increases less than proportional. Otherwise
| the search depth of the Human might suffer even more than that of the
| computer. So it is not as simple as you make it appear below.
I already said that: the variant must be designed so that it is still very strategic to humans. - It is exactly as complicated as I already indicated :-). Therefore, I tend to think that perhaps a 12x8 board might be better, with a Q+N piece, and perhaps an extra R+N piece added.
Why do you think the bigger board and the stronger piece make the game more strategical? The game stage that computers usually have most difficulty coping with is the end-game, where they fail to recognize strategic patterns such as a defended passer tying the King forever, the Pawns on their other wing being doomed once the opponent's King will get there, as it eventually (far behind the horizon) unavoidably will. Or W:Ra1; B:Bb1,Pa2, with a full Rook that will sooner or later fall victim to an attack by the King. End-games with slow pieces (Kings, Knights, Pawns) are usually the most strategic of all.
H.G.Muller:
| It never happened to you that early in the game you had to step out of
| check, and because of the choice you made the opponent now promotes with
| check, being able to stop your passer on the 7th?
Early in the game, most things happen by opening theory. And if one is getting an advantage like a passer, one should be careful to not let down the defense of the king, including computing checks. With those computer programs, a tactic that may work is to let down the defenses of the king enough that the opponent thinks it is worth going after it, and then exploit that in a counterattack.
| I think that if you are not willing to consider arguments like 'here I
| have a Knight against two Pawns (in addition to the Queen, Rook,
| Bishop and 3 Pawns for each), so it is likely, although not certain, that
| I will win from there', the number of positions that remains acceptable
| to you is so small that the opponent (not suffering from such scruples) will
| quickly drive you into positions where you indeed have 100% certainty....
| That you have lost!
As I said, the outcome is decided by the best playing from both sides. So if one starts to play poorly in the face of a material advantage, that is inviting a loss. So a material advantage of one pawn must happen in circumstances where one can keep the initiative; otherwise, it might be better to return that material in the hope of getting the initiative.
| What is your rating, if I may ask?
I have not been active since the 1970s, just playing computers sometimes. About expert, I think.
H.G.Muller:
| Why do you think the bigger board and the stronge piece make the game
| more strategical?
I said: if one increases the average number of moves in each position, then a full search may fail, as there will be too many of them. Then a different strategy is needed for success. If it is doubled, then in a 7-ply search, if the positions are independent, a full search would require 2^7 = 128 times as many positions. If there are 10 times more moves on average, then 10^7 times as many positions need to be searched. Strategic positions are another matter: indeed, in orthodox chess, trying to settle for positions where the advantage depends on long term development is a good choice against computers, which tend to be good in what humans find 'chaotic' positions. The design of a variant must be such that it admits what humans find strategic, and such that it is possible to play towards those positions from the initial position. I am not sure exactly what factors should be there. Just putting in more material may indeed favor the computer. In orthodox chess, one can stall by building a pawn chain, and then use the minor pieces for sacrifices to create a breakthrough. The chess variant must contain some such factors as well.
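The 2^7 and 10^7 figures above follow from the size of a full-width search tree, which grows as (moves per position) to the power of the depth. A minimal sketch, assuming a uniform tree without pruning or transpositions; the baseline of 35 moves per position is a commonly quoted rough figure for orthodox chess and is an assumption here, not a number from this thread.

```python
def leaf_count(branching, depth):
    """Leaf positions in a uniform full-width tree (no pruning, no transpositions)."""
    return branching ** depth

def growth_factor(multiplier, depth):
    """How many times more leaves when the branching factor is multiplied."""
    return multiplier ** depth

DEPTH = 7             # the 7-ply search from the comment
BASE_BRANCHING = 35   # assumed average for orthodox chess (rough, illustrative)

doubled = growth_factor(2, DEPTH)
tenfold = growth_factor(10, DEPTH)
print(f"2x moves per position  -> {doubled} times as many positions")
print(f"10x moves per position -> {tenfold} times as many positions")

# The factor does not depend on the baseline branching factor chosen:
ratio = leaf_count(2 * BASE_BRANCHING, DEPTH) // leaf_count(BASE_BRANCHING, DEPTH)
print(f"check with base {BASE_BRANCHING}: {ratio}")
```

Real engines prune heavily, so these factors are an upper bound on the extra work for a full search, not a prediction of engine behaviour.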
25 comments displayed