carry on being nice in the face of nastiness. And, as the name suggests, Win Stay, Lose Shift carries on exploiting its fellow players, when it is not punished with retaliation. Or, as Karl and I put it, this strategy cannot be subverted by softies. This characteristic turns out to be an important ingredient of its success.
The deeper lesson here is that a strategy that does not appear to make sense when played in a straightforward deterministic way can triumph when the game of life is spiced up with a little realistic randomness. When we surveyed the existing literature, it turned out that others had studied the very same strategy, under various guises. The great Rapoport had dismissed it, calling it “Simpleton” because it seemed so stupid: on encountering a defector, it alternates between cooperation and defection. He reasoned that only a dumb strategy would cooperate with a defector every other time.
But the strategy is, in fact, no simpleton. Our work made it clear that randomness was the key to its success. When confronted with defectors, it would cooperate unpredictably, with a given probability, protecting it from being exploited by opportunists. The same strategy was called “Pavlov” by David and Vivian Kraines of Duke University and Meredith College, North Carolina, who had noted that it could be effective. Moreover, two distinguished American economists, Eric Maskin and Drew Fudenberg, had also shown that such a strategy can achieve a certain level of evolutionary stability for about half of all Prisoner’s Dilemmas. But they had all looked at a deterministic (nonrandom) version of Win Stay, Lose Shift, when it was the probabilistic version that was the winner in our Rosenburg tournaments.
In the great game of evolution, Karl and I found that Win Stay, Lose Shift is the clear winner. It is not the first cooperative strategy to invade defective societies but it can get a foothold once some level of cooperation has been established. Nor does it stay forever. Like Generous Tit for Tat, Win Stay, Lose Shift can also become undermined and, eventually, replaced. There are and always will be more cycles.
Many people still think that the repeated Prisoner’s Dilemma is a story of Tit for Tat, but, by all measures of success, Win Stay, Lose Shift is the better strategy. Win Stay, Lose Shift is even simpler than Generous Tit for Tat: it sticks with its current choice whenever it is doing well and switches otherwise. It does not have to interpret and remember the opponent’s move. All it has to do is monitor its own payoff and make sure that it stays ahead in the game. Thus one would expect that, by requiring fewer cognitive skills, it will be more ubiquitous. And, indeed, Win Stay, Lose Shift was a better fit for Milinski’s stickleback data than Tit for Tat had been.
In the context of the Prisoner’s Dilemma, think of it like this. If you have defected and the other player has cooperated, then your payoff is high. You are very happy, and so you repeat your move, therefore defecting again in the next round. However, if you have cooperated and the other player has defected, then you have been exploited. You become morose and, as a result, you switch to another move. You have cooperated in the past, but now you are going to defect. Our earlier experiments had shown that Tit for Tat is the catalyst for the evolution of cooperation. Now we could see that Win Stay, Lose Shift is the ultimate destination.
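The rule just described can be written down in a few lines. What follows is only a minimal sketch of the deterministic version, not the probabilistic strategy that won our tournaments. The payoff numbers (5, 3, 1, 0) and the `ASPIRATION` threshold are assumptions chosen for illustration, and the names are hypothetical. The point is that the rule looks only at the player's own last move and own last payoff.

```python
# Assumed payoff values (a common textbook convention); the text gives no numbers.
# Keys are (my move, other player's move); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): 3,  # reward for mutual cooperation
    ("C", "D"): 0,  # I cooperated and was exploited
    ("D", "C"): 5,  # I defected against a cooperator
    ("D", "D"): 1,  # punishment for mutual defection
}

# Hypothetical threshold: a payoff above it counts as a "win".
ASPIRATION = 2


def win_stay_lose_shift(my_last_move: str, my_last_payoff: int) -> str:
    """Repeat the last move after a win; switch to the other move after a loss."""
    if my_last_payoff > ASPIRATION:
        return my_last_move
    return "D" if my_last_move == "C" else "C"


# Show the next move after each of the four possible outcomes.
for (mine, theirs), payoff in PAYOFF.items():
    print(f"I played {mine}, other played {theirs}: next move {win_stay_lose_shift(mine, payoff)}")
```

Under these assumed numbers, the happy defector keeps defecting and the exploited cooperator switches to defection, as in the two cases described above.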
Does that mean we had solved the Dilemma? Far from it. Karl and I realized in 1994 that there is yet another facet to this most subtle of simple games. The entire research literature was based on an apparently innocent and straightforward assumption: when two players decide to cooperate or to defect, they do so simultaneously. What I mean by this is that the conventional formulation of the Prisoner’s Dilemma is a bit like that childhood game, Rock Scissors Paper. Both players make their choice at precisely the same time.
Karl and I thought that this restriction was artificial. We could think of examples, such as the vampire bats that donate excess blood to hungry fellow bats and creatures that groom each other and so on, where cooperation does not happen simultaneously and partners have to take turns. So we decided to play a variant of the Prisoner’s Dilemma, called the Alternating Prisoner’s Dilemma, to see if this change had any effect.
When we played the alternating game we were reassured to find as before that there was a tendency to evolve toward cooperation. We also observed the same cycles that saw the rise, and the fall, of cooperative and defective societies as we had seen in the simultaneous game. Once again, cooperation can thrive. But there was an important twist. We were surprised to find that the Win Stay, Lose Shift principle that had trumped all comers in the simultaneous games (eventually) no longer emerged as victor. Instead, it was Generous Tit for Tat that reigned supreme.
Drew Fudenberg, now a colleague at Harvard, pointed out to me years later that one can think of the alternating and the simultaneous games as two different limiting examples of situations found in the real world. In the alternating game it is your turn and then mine. I get all the relevant information about your move before I need to decide what to do, and vice versa. In the simultaneous game, however, neither of us gets any information about what the other will do in the present round. In everyday life, the reality most likely lies somewhere in between. We might always get some information about what the other person is up to (whether he is delivering his part of the deal or not) but that information may not be complete or reliable.
Manfred Milinski has studied how people use these strategies. In experiments with first-year biology students in Bern, Switzerland, cooperation dominated in both the simultaneous and the alternating Prisoner’s Dilemma. He observed that players tended to stick to one strategy, whichever timing of the game they played: 30 percent adopted a Generous Tit for Tat–like strategy and 70 percent adopted Win Stay, Lose Shift. As our simulations had suggested, the latter were more successful in the simultaneous game while Generous Tit for Tat–like players achieved higher payoffs in the alternating game. Both strategies appear to play a role in the ecology of human cooperation.
DILEMMA PAST, DILEMMA FUTURE
Even today, the repeated Prisoner’s Dilemma maintains a tight grip on the curious scientist. We have seen how one mechanism to solve the Dilemma and nurture cooperation is direct reciprocity, where there are repeated encounters between two players, whether people, institutions, companies, or countries. At first winning seemed easy with the Tit for Tat strategy, one that at most ends up sharing the wins equally among players. But by adding some randomness, to depict the effect of mistakes, we found that Tit for Tat is too harsh and unforgiving. The strategy triggers bloody vendettas.
We need a sprinkling of forgiveness to get along, and we found it in the strategies of Win Stay, Lose Shift and Generous Tit for Tat. The latter strategy always reminds me of a piece of advice that Bob May once gave me: “You never lose for being too generous.” I was impressed by that sentiment because he has thought more deeply about winning and losing than anyone else I know, yet being number one means everything to him. As his wife once kidded, “When he comes home to play with the dog, he plays to win.”
Let’s compare the successful strategies of Tit for Tat and Win Stay, Lose Shift. Both cooperate after mutual cooperation in the last round. Thus neither is the first to defect, at least intentionally. Only a mistake, a misunderstanding, or simply having a bad day can cause the first defection. If this occurs and the other person defects and I end up being exploited, then both strategies tell me to defect in the next move. If, on the other hand, I defect and the other person cooperates then I switch to cooperation according to Tit for Tat but continue to defect according to Win Stay, Lose Shift.
One can explain the Tit for Tat reasoning as follows: now I feel regret and I want to make up for the defection last round. But the Win Stay, Lose Shift reasoning seems—regrettably—more “human”: if we get away with exploiting someone in this round then we continue to do it in future rounds. There’s another basic difference between these strategies. If both players defect, then Tit for Tat will also defect and will not attempt to reestablish a good relationship. Win Stay, Lose Shift will cooperate, on the other hand, and try to restore better terms.
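The comparison can be checked mechanically. The sketch below is a simplified, deterministic illustration in the simultaneous game, with hypothetical function names. It assumes the usual payoff ordering, under which a player "wins" exactly when the other player cooperated, so Win Stay, Lose Shift can be expressed in terms of the last outcome. It tabulates each strategy's next move after the four possible outcomes, then replays what happens when two players using the same strategy start from a single mistaken defection.

```python
def tit_for_tat(my_last: str, their_last: str) -> str:
    """Copy whatever the other player did in the last round."""
    return their_last


def win_stay_lose_shift(my_last: str, their_last: str) -> str:
    """Stay after a win (the other cooperated), shift after a loss (the other defected).

    Assumes the standard payoff ordering, so my payoff was high exactly
    when the other player cooperated.
    """
    if their_last == "C":
        return my_last
    return "D" if my_last == "C" else "C"


OUTCOMES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]  # (my move, their move)


def response_table():
    """Map each last-round outcome to (Tit for Tat move, Win Stay, Lose Shift move)."""
    return {o: (tit_for_tat(*o), win_stay_lose_shift(*o)) for o in OUTCOMES}


def replay(strategy, start, rounds):
    """Both players use `strategy`; `start` is the outcome right after one mistake."""
    history = [start]
    a, b = start
    for _ in range(rounds):
        a, b = strategy(a, b), strategy(b, a)  # simultaneous update from the old moves
        history.append((a, b))
    return history


for outcome, (tft, wsls) in response_table().items():
    print(outcome, "Tit for Tat:", tft, "| Win Stay, Lose Shift:", wsls)

# One player has defected by mistake against a cooperator.
print(replay(tit_for_tat, ("D", "C"), 4))
print(replay(win_stay_lose_shift, ("D", "C"), 4))
```

In this sketch the two Tit for Tat players trade retaliations indefinitely, while the two Win Stay, Lose Shift players pass through one round of mutual defection and then return to mutual cooperation.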
Both options make sense, but again Win Stay, Lose Shift seems more realistic if we are in a relationship where there is hope of reestablishing cooperation. Overall, Win Stay, Lose Shift can cope better with mistakes because it actively seeks good outcomes, trying to restore cooperation after mutual defection, though it will try to exploit unconditional