Prisoner’s Dilemma | The Daily Omnivore

Prisoner’s Dilemma

The prisoner’s dilemma (PD) is a paradox about co-operation. It shows why two ‘rational’ individuals might not co-operate, even if it seems in their best interests. It is studied in game theory.

In the classic example, two people are arrested for a crime, and the police are uncertain which of them committed it and which abetted it. If each remains silent, both serve only a short sentence. If one betrays the other, the betrayer goes free and the other is imprisoned for a long time. If each betrays the other, both serve a moderate sentence, longer than if both had stayed silent. No matter what happens, they will never see each other again.

If you are a prisoner in this situation and care only about yourself, the way to get the shortest sentence is to betray the other prisoner. If the other prisoner stays silent, betraying means you do not go to jail at all instead of going to jail for six months. If the other prisoner betrays, betraying means you go to jail for two years instead of ten. In other words, it is always best for you to betray, even though the two of you would be better off if you both stayed silent. Betraying the other prisoner is your ‘dominant strategy’ because it is the best thing for you to do no matter what the other prisoner does.
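The reasoning above can be checked mechanically. The following is a minimal Python sketch, assuming the sentence lengths quoted in the paragraph (six months, two years, ten years, or going free); the names SENTENCE, best_reply and dominant_strategy are made up for illustration.

```python
# Sentence lengths in years, using the figures from the paragraph above.
# Keys are (my_move, other_move); values are MY sentence.
SENTENCE = {
    ("silent", "silent"): 0.5,   # six months
    ("silent", "betray"): 10.0,  # I stay silent, the other betrays me
    ("betray", "silent"): 0.0,   # I go free
    ("betray", "betray"): 2.0,
}
MOVES = ("silent", "betray")


def best_reply(other_move):
    """Return the move that gives me the shortest sentence against other_move."""
    return min(MOVES, key=lambda my_move: SENTENCE[(my_move, other_move)])


def dominant_strategy():
    """Return the move that is the best reply to every opposing move, or None."""
    replies = {best_reply(other) for other in MOVES}
    return replies.pop() if len(replies) == 1 else None


if __name__ == "__main__":
    for other in MOVES:
        print(f"Other plays {other!r}: my best reply is {best_reply(other)!r}")
    print("Dominant strategy:", dominant_strategy())
```

Betraying is the best reply in both cases, so it is dominant, yet two years each for mutual betrayal is worse than the six months each that mutual silence would have brought.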

Game theory was much studied during the Cold War, when the ‘players’ being studied were the United States and the Soviet Union. The prisoner’s dilemma was originally framed by American mathematicians Merrill Flood and Melvin Dresher, working at the RAND think tank in 1950. Canadian mathematician Albert W. Tucker formalized the game with prison sentence rewards and named it the ‘prisoner’s dilemma.’ In his version, if both betray each other, they both serve two years in prison. If one betrays and the other stays silent, the betrayer goes free and the betrayed serves three years. If neither betrays, they both serve one year.
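With the sentences from this version written out as a table, one can look for outcomes where neither prisoner gains by changing their choice alone (what game theorists call a Nash equilibrium, a term the text above does not use). This is a rough Python sketch; the names YEARS, is_nash and nash_equilibria are illustrative.

```python
from itertools import product

MOVES = ("silent", "betray")

# (years for prisoner A, years for prisoner B), using the figures quoted above.
YEARS = {
    ("silent", "silent"): (1, 1),
    ("silent", "betray"): (3, 0),
    ("betray", "silent"): (0, 3),
    ("betray", "betray"): (2, 2),
}


def is_nash(a, b):
    """True if neither prisoner can shorten their own sentence by switching alone."""
    a_years, b_years = YEARS[(a, b)]
    a_cannot_improve = all(YEARS[(alt, b)][0] >= a_years for alt in MOVES)
    b_cannot_improve = all(YEARS[(a, alt)][1] >= b_years for alt in MOVES)
    return a_cannot_improve and b_cannot_improve


def nash_equilibria():
    """List every pair of moves from which no single prisoner wants to deviate."""
    return [(a, b) for a, b in product(MOVES, MOVES) if is_nash(a, b)]


if __name__ == "__main__":
    print(nash_equilibria())
```

Under these numbers the only stable outcome is mutual betrayal, even though mutual silence would cost each prisoner one year instead of two.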

Because betraying a partner offers a greater reward than cooperating with them, all purely ‘rational,’ self-interested prisoners would betray the other. In reality, humans display a systematic bias towards cooperative behavior in this and similar games. A model based on a different kind of rationality, in which people forecast how the game would be played if they formed coalitions and then maximize their forecasts, has been shown to make better predictions of the rate of cooperation in this and similar games, given only the payoffs of the game.

The prisoner’s dilemma game can be used as a model for many real world situations involving cooperative behavior. In casual usage, the label ‘prisoner’s dilemma’ may be applied to situations not strictly matching the formal criteria of the classic or iterative games: for instance, those in which two entities could gain important benefits from cooperating or suffer from the failure to do so, but find it merely difficult or expensive, not necessarily impossible, to coordinate their activities to achieve cooperation.

In environmental studies, the PD is evident in crises such as global climate change. It is argued that all countries will benefit from a stable climate, but any single country is often hesitant to curb carbon emissions. The immediate benefit to an individual country of maintaining current behavior is perceived to be greater than the purported eventual benefit to all countries if behavior were changed.

The prisoner setting may seem contrived, but there are in fact many examples in human interaction as well as interactions in nature that have the same payoff matrix. The prisoner’s dilemma is therefore of interest to the social sciences such as economics, politics, and sociology, as well as to the biological sciences such as ethology and evolutionary biology. Many natural processes have been abstracted into models in which living beings are engaged in endless games of prisoner’s dilemma.

Cooperative behavior of many animals can be understood as an example of the prisoner’s dilemma. Often animals engage in long-term partnerships, which can be more specifically modeled as an iterated prisoner’s dilemma. For example, guppies inspect predators cooperatively in groups, and they are thought to punish non-cooperative inspectors. Vampire bats are social animals that engage in reciprocal food exchange. Applying the payoffs from the prisoner’s dilemma can help explain this behavior.
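To get a feel for how repeated play changes things, here is a small Python sketch of an iterated game. It is only an illustration: the point values and the two strategies (a reciprocator that copies its partner's last move, and a partner that always defects) are assumptions chosen for the example, not data about guppies or bats.

```python
# Points per round for (my_move, other_move); "C" = cooperate, "D" = defect.
# These values are an illustrative assumption, not taken from the article.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def always_defect(own_history, other_history):
    """Never cooperate, whatever the partner did."""
    return "D"


def reciprocator(own_history, other_history):
    """Cooperate first, then copy the partner's previous move (punishing defection)."""
    return other_history[-1] if other_history else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play the repeated game and return the total points of A and B."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("reciprocator vs reciprocator:", play(reciprocator, reciprocator))
    print("reciprocator vs always_defect:", play(reciprocator, always_defect))
    print("always_defect vs always_defect:", play(always_defect, always_defect))
```

With these assumed points, two reciprocators end up far ahead of two defectors over ten rounds, and a defector exploits a reciprocator only once before being punished, which is the kind of logic thought to support long-term partnerships.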

In addiction research and behavioral economics, George Ainslie points out that addiction can be cast as an intertemporal PD problem between the present and future selves of the addict. In this case, defecting means relapsing, and it is easy to see that not defecting both today and in the future is by far the best outcome. The case where one abstains today but relapses in the future is the worst outcome: in some sense the discipline and self-sacrifice involved in abstaining today have been ‘wasted,’ because the future relapse means that the addict is right back where they started and will have to start over (which is quite demoralizing, and makes starting over more difficult).

Relapsing today and tomorrow is a slightly ‘better’ outcome, because while the addict is still addicted, they have not put any effort into trying to stop. The final case, where one engages in the addictive behavior today while abstaining ‘tomorrow,’ is problematic because tomorrow one will face the same PD, and the same obvious benefit will be present then, ultimately leading to a string of defections. Psychologist John Gottman, in his research described in ‘The Science of Trust,’ defines good relationships as those where partners know not to get dynamically stuck in a similar loop.

Advertising is sometimes cited as a real-world example of the prisoner’s dilemma. When cigarette advertising was legal in the United States, competing cigarette manufacturers had to decide how much money to spend on advertising. The effectiveness of Firm A’s advertising was partially determined by the advertising conducted by Firm B. Likewise, the profit Firm B derived from advertising was affected by the advertising conducted by Firm A. If both Firm A and Firm B chose to advertise during a given period, the advertising canceled out, receipts remained constant, and expenses increased due to the cost of advertising. Both firms would have benefited from a reduction in advertising.

However, should Firm B choose not to advertise, Firm A could benefit greatly by advertising. Nevertheless, the optimal amount of advertising by one firm depends on how much advertising the other undertakes. As the best strategy is dependent on what the other firm chooses there is no dominant strategy, which makes it slightly different from a prisoner’s dilemma. The outcome is similar, though, in that both firms would be better off were they to advertise less than in the equilibrium. Sometimes cooperative behaviors do emerge in business situations. For instance, cigarette manufacturers endorsed the making of laws banning cigarette advertising, understanding that this would reduce costs and increase profits across the industry.

Doping in sport has been cited as an example of a prisoner’s dilemma. Two competing athletes have the option to use an illegal and/or dangerous drug to boost their performance. If neither athlete takes the drug, then neither gains an advantage. If only one does, then that athlete gains a significant advantage over their competitor, reduced by the legal and/or medical dangers of having taken the drug. If both athletes take the drug, however, the benefits cancel out and only the dangers remain, putting them both in a worse position than if neither had used doping.

Many real-life dilemmas involve multiple players. Although metaphorical, American ecologist Garrett Hardin’s tragedy of the commons may be viewed as an example of a multi-player generalization of the PD. It describes a problem where many people acting on their own interests can make something they all share worse, even if no one wants to. For example, even if no one wants to pollute water because that makes it unhealthy, it can still end up polluted because so many want to use the water for their own purposes, like washing and throwing away rubbish. Each person thinks that their small bit of pollution is too small to affect the quality of the water, but because there are many people, the total effect ends up making the water too polluted for almost anybody to use for drinking or even washing. This can occur in slums and other overcrowded places like refugee camps.

Each person chooses between personal gain and restraint. The collective result of unanimous (or even frequent) defection is very low payoffs for everyone, representing the destruction of the ‘commons.’ A commons dilemma most people can relate to is washing the dishes in a shared house. By not washing dishes, an individual gains by saving time, but if that behavior is adopted by every resident, the collective cost is no clean plates for anyone.
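The shared-house example can be turned into a toy calculation. The Python sketch below assumes a very simple model that is not in the text: four residents, each wash-up costs its washer one unit of effort, and every wash-up produces two units of benefit split equally among all residents. The function name payoff and all numbers are hypothetical.

```python
def payoff(cooperates, others_cooperating, n=4, cost=1.0, multiplier=2.0):
    """Payoff to one resident in an assumed, simplified shared-chore game.

    cooperates: whether this resident washes up.
    others_cooperating: how many of the other n - 1 residents wash up.
    """
    contributors = others_cooperating + (1 if cooperates else 0)
    share = multiplier * cost * contributors / n  # benefit is split equally
    return share - (cost if cooperates else 0.0)


if __name__ == "__main__":
    for k in range(4):
        print(f"{k} others wash: wash={payoff(True, k)}, shirk={payoff(False, k)}")
```

In this model shirking always pays a resident more than washing, whatever the others do, yet everyone washing (1.0 each) beats everyone shirking (0.0 each), which is the same tension as in the two-player game.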

The commons are not always exploited: American skeptic William Poundstone, in a book about the prisoner’s dilemma, describes a situation in New Zealand where newspaper boxes are left unlocked. It is possible for people to take a paper without paying (defecting), but very few do, feeling that if they do not pay then neither will others, destroying the system. Subsequent research by Elinor Ostrom, winner of the 2009 Nobel Prize in Economics, hypothesized that the tragedy of the commons is oversimplified, with the negative outcome shaped by outside influences. Without complicating pressures, groups communicate and manage the commons among themselves for their mutual benefit, enforcing social norms to preserve the resource and achieve the maximum good for the group, an example of achieving the best-case outcome for the PD.

In international political theory, the prisoner’s dilemma is often used to demonstrate the coherence of strategic realism, which holds that in international relations all states, regardless of their internal policies or professed ideology, will act in their rational self-interest given international anarchy. A classic example is an arms race like the Cold War and similar conflicts. During the Cold War the opposing alliances of NATO and the Warsaw Pact both had the choice to arm or disarm.

From each side’s point of view, disarming whilst their opponent continued to arm would have led to military inferiority and possible annihilation. Conversely, arming whilst their opponent disarmed would have led to superiority. If both sides chose to arm, neither could afford to attack the other, but at the high cost of developing and maintaining a nuclear arsenal. If both sides chose to disarm, war would be avoided and there would be no costs.
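The four outcomes just described can be ranked from each side's point of view and compared against the ordering that defines a prisoner's dilemma: the temptation to defect beats the reward for mutual cooperation, which beats the punishment for mutual defection, which beats being the lone cooperator. The Python sketch below uses ordinal ranks that are an assumption read off the paragraph; only their order matters, and the names are illustrative.

```python
def is_prisoners_dilemma(temptation, reward, punishment, sucker):
    """Check the ordering temptation > reward > punishment > sucker."""
    return temptation > reward > punishment > sucker


# Ordinal ranks (4 = best, 1 = worst) for one side; assumed for illustration.
ARMS_RACE = {
    "temptation": 4,  # arm while the opponent disarms: superiority
    "reward": 3,      # both disarm: war avoided, no costs
    "punishment": 2,  # both arm: no attack, but a costly arsenal
    "sucker": 1,      # disarm while the opponent arms: inferiority
}

if __name__ == "__main__":
    print(is_prisoners_dilemma(**ARMS_RACE))
```

Under this ranking arming is better for a side whether the opponent arms or disarms, so the arms race has the same structure as the prisoners' choice.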

Although the ‘best’ overall outcome is for both sides to disarm, the rational course for both sides is to arm, and this is indeed what happened. Both sides poured enormous resources into military research and armament in a war of attrition for the next thirty years until the Soviet Union could not withstand the economic cost. The same logic could be applied in any similar scenario, be it economic or technological competition between sovereign states.


Posted on February 3, 2017 at 8:20 pm in Games, Politics, Science
