The Axelrod Tournaments

Robert Axelrod

“Effective Choice in the Prisoner’s Dilemma”, “More Effective Choice in the Prisoner’s Dilemma”, The Evolution of Cooperation

I think I can safely say that the Axelrod Prisoner’s Dilemma Tournaments (and their subsequent literature) are seminal works in game theory, and really in social science in general. The 1981 “The Evolution of Cooperation” article alone shows 18,883 citations, so it is clear that this line of work is influential.

For me, learning about the Axelrod Tournament was one of the most interesting lessons from my first game theory class. At the conclusion of that class I kind of knew that I wanted to keep learning about cooperation and competition (I’ve since added a third “c” to my interests, coordination). You could even throw in another C, computer science. People interested in computer science, both practically and theoretically, should find this tournament interesting. There are lots of implications for Artificial Intelligence here too: a program that could “learn” what was going on could have won the tournament.

What was the Axelrod Tournament?

Of course, read the papers for the full story, but I will describe it here.

Robert Axelrod sent out a letter to a bunch of people in the social sciences. In effect it said, “Hey, I’m doing this tournament. I’m going to have people submit programs that play the Prisoner’s Dilemma game against each other. It’s going to work like this:”

The tournament is run as a round robin so that each entry was paired with each other entry. As announced in the rules of the tournament, each entry is also paired with its own twin and with RANDOM, a program that cooperates and defects with equal probability. Each game consisted of exactly 200 moves. The payoff matrix for each move follows the basic PD formula payoffs. (That’s copied and pasted from the book)
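To make the setup concrete, here is a minimal Python sketch of a single 200-move game. The quoted rules don’t spell out the payoff numbers, so I am assuming the usual 5/3/1/0 values (5 for defecting on a cooperator, 3 each for mutual cooperation, 1 each for mutual defection, 0 for the cooperator who gets defected on). The function names are made up for illustration and have nothing to do with the original tournament programs.

```python
import random

# Per-move payoffs as (player A, player B); "C" = cooperate, "D" = defect.
# Assumed values: the usual 5/3/1/0 Prisoner's Dilemma numbers.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def all_c(my_moves, their_moves):
    """Always cooperate."""
    return "C"

def all_d(my_moves, their_moves):
    """Always defect."""
    return "D"

def random_rule(my_moves, their_moves):
    """RANDOM: cooperate or defect with equal probability."""
    return random.choice("CD")

def play_game(rule_a, rule_b, moves=200):
    """Play one game and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(moves):
        # Each rule sees its own history first, then the other player's.
        a = rule_a(hist_a, hist_b)
        b = rule_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play_game(all_c, all_c))  # (600, 600)
print(play_game(all_d, all_c))  # (1000, 0)
```

A full round robin would just call `play_game` for every pair of entries, plus once for each entry against a copy of itself and once against `random_rule`.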

Who won the Axelrod Tournament?

Anatol Rapoport won with the strategy of TIT FOR TAT. TFT starts with cooperate and then does whatever the other person did on the previous move. What’s funny is that TFT was distributed to everyone in the tournament, so it was common knowledge that anyone could use it. Furthermore, it is “probably the most widely known and discussed rule for playing the Prisoner’s Dilemma.” But, apparently everyone but Dr. Rapoport decided they could do better than TFT.
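TFT is short enough to write out in full. Here is a sketch in Python, again assuming the usual 5/3/1/0 payoffs and 200 moves per game; the names are my own and this is not Rapoport’s actual submission.

```python
# Assumed per-move payoffs (player A, player B): the usual 5/3/1/0 values.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_moves, their_moves):
    """Cooperate on the first move, then copy the other player's last move."""
    return their_moves[-1] if their_moves else "C"

def all_d(my_moves, their_moves):
    """Always defect."""
    return "D"

def play_game(rule_a, rule_b, moves=200):
    """Play one game and return the two total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(moves):
        a, b = rule_a(hist_a, hist_b), rule_b(hist_b, hist_a)
        score_a += PAYOFFS[(a, b)][0]
        score_b += PAYOFFS[(a, b)][1]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play_game(tit_for_tat, tit_for_tat))  # (600, 600)
print(play_game(tit_for_tat, all_d))        # (199, 204)
```

Notice that TFT loses its head-to-head game against ALL D by five points. It can never outscore the player across the table; the best it can do in any single game is tie.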

It is interesting that people must have thought that common knowledge of TFT made it a bad choice for a strategy. There must have been some reasoning on their part that went,

“Ok, everyone knows about TFT. If lots of people were going to use TFT, one should just use all D and beat TFT every time. Since the players in this tournament are all smart professors, they all know that TFT is easily defeated if you know that TFT will be there. Plus, we know that RANDOM will be there and TFT does not do well against random. So, TFT is out. Since TFT is out, All D is out.”

So, common knowledge in this case seemed to contribute to the incorrect conclusion that TFT was a bad choice. Maybe. Or they had some other reason for not picking TFT.

Why did TFT win?

It was simple. Longer, more complex programs did not do any better than simpler ones. TFT is pretty much the simplest program after ALL D or ALL C.

It was nice. A nice rule is never the first to defect, or at least is not the first to defect until the last few moves (around the 199th round). Aha! This lends credence to my thought that R1CC (round 1 cooperate-cooperate) is possibly a necessary condition for sustained cooperation, though, as the experimental literature shows, not a sufficient one.

It was forgiving. That means that it keeps the possibility of future cooperation even after the other player defects. In essence it doesn’t apply GRIM TRIGGER logic and go into a total shutdown of all D after the first D from the other player.

Lastly, it was provokable. A rule is provokable if it immediately defects after an “uncalled for” defection from the other. Basically, it won’t allow itself to be bullied the whole game.
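The difference between forgiving and GRIM TRIGGER is easiest to see move by move. In the sketch below, `one_slip` is a hypothetical opponent I made up for the illustration: it cooperates every move except for a single defection on move 10. None of this is code from the tournament.

```python
def tit_for_tat(my_moves, their_moves):
    """Nice, provokable, forgiving: start with C, then copy the last move."""
    return their_moves[-1] if their_moves else "C"

def grim_trigger(my_moves, their_moves):
    """Nice and provokable, but not forgiving: one D and it defects forever."""
    return "D" if "D" in their_moves else "C"

def one_slip(my_moves, their_moves):
    """Hypothetical opponent: cooperates except for one defection on move 10."""
    return "D" if len(my_moves) == 9 else "C"

def moves_played(rule, opponent, moves=200):
    """Return the sequence of moves `rule` makes against `opponent`."""
    mine, theirs = [], []
    for _ in range(moves):
        a, b = rule(mine, theirs), opponent(theirs, mine)
        mine.append(a)
        theirs.append(b)
    return "".join(mine)

tft = moves_played(tit_for_tat, one_slip)
grim = moves_played(grim_trigger, one_slip)
print(tft[:14])   # CCCCCCCCCCDCCC
print(grim[:14])  # CCCCCCCCCCDDDD
print(tft.count("D"), grim.count("D"))  # 1 190
```

Both rules answer the defection right away on move 11 (provokable), but TFT goes back to cooperating on move 12 while GRIM TRIGGER defects for the remaining 190 moves.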

What did we learn from Axelrod’s Tournament?

My first reaction to the Axelrod tournament literature was that he had “solved” the prisoner’s dilemma. I was awestruck by the work. I was going to start looking into why more governmental policies weren’t TFT-oriented. But after a good discussion with my advisor, probably as far back as last spring, I began to alter my perception.

There are some issues with and limitations to the tournament. First of all, the competitors were almost exclusively professors, who are a particularly unrepresentative sample group. We can’t necessarily apply what we learned from the tournament to “the real world”. We have already looked at some “real world” experimental literature to see how non-professors play the iterated PD.

There is also the issue that the game was finite. You knew that it would end at round 200. I assume there were a lot of R199 and/or R200 defections. 200 is a fairly arbitrary number as well. You rarely know that you will meet someone exactly 200 times. You might be able to discern situations where you are likely to never meet that person again, or conversely ones where you will be interacting with that person with great frequency.

Additionally, TFT isn’t exactly a realizable policy goal. At best, it is sort of an attitude or a way that you could approach things, but Congress could not really enact a TFT law.

Moreover, Axelrod doesn’t even think that TFT is the best decision rule for PDs (pg 401 More Effective Choice). What would have been better?

“First, TIT FOR TAT could have been beaten in the second round by any rule which was able to identify and never cooperate with RANDOM, while not mistaking other rules for RANDOM.
Second, had only the entries which actually ranked in the top half been present, then TFT would have come in 4th.
Third, there is no best rule independent of the environment.”

So, TFT is not the “solution” to the prisoner’s dilemma.

Lastly, even though they don’t purport to be, the Axelrod tournaments are not necessarily set up to create a socially desired end. The point of the tournament was not to create social harmony or bring about the best results for all.

Further Thoughts

Here are some thoughts I leave with you the reader:

  • #1) 2011 is the 30th anniversary of the 1981 Axelrod Tournament. If I held another tournament with the exact same rules next week, and you can assume roughly the same type of players, what would you submit?
  • #2) Standard tournament, but say the rules are everybody can submit 100 entries of any type they want?
  • #3) Standard tournament, but there will be exactly 1,000 contestants submitting 1 program each: 500 random Americans, 250 econ undergraduates, 200 non-game-theory professors, 50 game theorists?
  • #4) Standard tournament, but held at a conference of ~100 game theory enthusiasts. You all attend a dinner beforehand with the option of engaging in collusion by making any kind of promise you want. No outside authority will enforce your commitments.
  • #5) Standard tournament, but with a defect-defect “penalty” that both players lose 10 points every time there is a series of 10 straight defection-defection plays in a row. 20 DD’s in a row would be -20 points for both sides etc. 9 DDs followed by at least one C is not penalized.
  • Bonus #6 Can you create a rule by which the program could detect RANDOM (so you can defect with it always) without mistaking other programs for RANDOM?
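Since the penalty in #5 is the one rule change that needs some bookkeeping, here is a rough Python sketch of how I read it: every completed block of 10 consecutive defect-defect moves costs both players 10 points, and any cooperation resets the count. That reading, and the function name, are assumptions for illustration only.

```python
def dd_penalty(moves_a, moves_b, run_length=10, fine=10):
    """Points each player loses under the defect-defect penalty in #5.

    Assumed reading: every completed run of `run_length` consecutive
    D-D moves costs `fine` points; any C from either side resets the run.
    """
    penalty = 0
    streak = 0
    for a, b in zip(moves_a, moves_b):
        if a == "D" and b == "D":
            streak += 1
            if streak % run_length == 0:
                penalty += fine
        else:
            streak = 0
    return penalty

print(dd_penalty("D" * 20, "D" * 20))        # 20
print(dd_penalty("D" * 9 + "C", "D" * 10))   # 0
print(dd_penalty("D" * 200, "D" * 200))      # 200
```

If mutual defection pays 1 point per move, two ALL D programs would earn 200 points over a game and give all 200 back in penalties under this reading.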

Comments welcome!

Posted on September 5, 2011. 5 Comments.

  1. Want to play around with Axelrod’s IPD tournaments? Try the Easy-IPD flash tool.

  2. Endless fun, that Axelrod tournament!

    Note that, actually, playing ALL D in an environment in which everybody else uses TFT would not result in winning the tournament. For simplicity, let’s ignore the fact that everybody gets paired once with RANDOM. Let’s use Axelrod’s parameters of 200 iterations and payoffs of 5, 3, 1, and 0, and assume there are n players in the tournament, one who uses ALL D and the remaining n-1 using TFT. Each TFT player would achieve a total payoff of 3 x 200 x (n-1) + 199 = 600n – 401, while the ALL D player’s total would be (5 + 199) x (n-1) + 200 = 204n – 4, which is lower for all values of n>1. What’s happening here is that two TFTs roll up huge payoffs (600) together, while ALL D “beats” TFT but achieves only a low payoff (204) in doing so.
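The arithmetic above is easy to check for any field size. A small Python sketch, using the same simplifications (one ALL D entry, n-1 TFT entries, each entry also paired with its own twin, RANDOM ignored); the function name and keyword arguments are just illustrative.

```python
def totals(n, moves=200, T=5, R=3, P=1, S=0):
    """Tournament totals (per TFT entry, ALL D entry) in a field of n players."""
    tft_vs_tft = R * moves               # 600: mutual cooperation throughout
    tft_vs_all_d = S + P * (moves - 1)   # 199: exploited once, then mutual defection
    all_d_vs_tft = T + P * (moves - 1)   # 204
    all_d_vs_twin = P * moves            # 200
    # Each TFT meets the other n-2 TFTs and its own twin, plus ALL D once.
    tft_total = tft_vs_tft * (n - 1) + tft_vs_all_d
    # ALL D meets each of the n-1 TFTs, plus its own twin.
    all_d_total = all_d_vs_tft * (n - 1) + all_d_vs_twin
    return tft_total, all_d_total

for n in (2, 10, 50):
    print(n, totals(n))
# 2 (799, 404)
# 10 (5599, 2036)
# 50 (29599, 10196)
```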
