
Cunicula's Game Theory Primer for Computer Scientists Involved in Bitcoin

Abstract: Recently, a couple of incompetent computer scientists released a white paper entitled "Majority is not Enough: Bitcoin Mining is Vulnerable." The paper represents a fundamental misunderstanding of the incentives underlying the bitcoin protocol. I expected bitcoin devs and other smart people to immediately dismiss the paper because it gets the relevant game theory all wrong. This hasn't happened. This paper offers a brief explanation of the incentives supporting honest mining and explains why the incompetent computer scientists are incompetent.

1) Bitcoin mining is and always has been a prisoner's dilemma.

The authors' supposed revelation is that bitcoin is "not incentive compatible". In slightly more familiar terms, this means that bitcoin mining is similar to the prisoner's dilemma. In other words, mining should degenerate into a bad equilibrium of selfish mining. The payoffs from a standard type of prisoner's dilemma game are shown below, where a is the miner's strategy and a' is the strategy of the other miners.

                 a' = good    a' = selfish
  a = good       (1, 1)       (2, 0)
  a = selfish    (0, 2)       (1, 1)

For the table to agree with the one-period deviation payoff of 2 used in section 2, each cell should be read as (payoff to the other miners, payoff to the miner choosing a).
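As a quick check on the one-shot game, the following minimal Python sketch enumerates its pure-strategy Nash equilibria. The payoff dictionary `U` is an assumed reading of the table as the miner's own payoff u(a, a'), chosen so that a lone selfish miner earns 2 as in section 2. The names `U`, `best_responses` and `pure_nash_equilibria` are illustrative and do not come from the primer.

```python
from itertools import product

STRATEGIES = ("good", "selfish")

# Own one-period payoff u(a, a_other). Assumed reading of the payoff table:
# a selfish miner facing good miners earns 2, a good miner facing selfish
# miners earns 0, and both symmetric outcomes pay 1.
U = {
    ("good", "good"): 1,
    ("selfish", "good"): 2,
    ("good", "selfish"): 0,
    ("selfish", "selfish"): 1,
}


def best_responses(a_other):
    """Return the set of strategies that maximize u(a, a_other)."""
    best = max(U[(a, a_other)] for a in STRATEGIES)
    return {a for a in STRATEGIES if U[(a, a_other)] == best}


def pure_nash_equilibria():
    """Return all profiles (a, b) where each strategy is a best response.

    The game is symmetric, so the same payoff table U serves both players.
    """
    return [
        (a, b)
        for a, b in product(STRATEGIES, repeat=2)
        if a in best_responses(b) and b in best_responses(a)
    ]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

Under these assumed payoffs, selfish is the best response to either strategy, so the only profile that survives the check is (selfish, selfish).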

The game has a unique Nash equilibrium, shown in the bottom-right corner of the payoff table. This is the selfish equilibrium, where everyone adopts selfish mining. Here I have made the (good, good) and (selfish, selfish) payoffs symmetric. I introduce price effects later that cause asymmetry.

This is an extremely obvious point. It is not a noteworthy research result. Any pool could announce, "Hey, I'm going to conduct a 51% attack and I will pay a 10% premium on shares to the first 51% of hashing power that joins me." Short-sighted miners would respond by joining the pool and earning 10% higher profits than they normally would. If everyone is very short-sighted, then the 51% attack would succeed. We do not need to think about sophisticated mining strategies to see this.

2) Bitcoin mining is a dynamic game.

In a dynamic setting, the prisoners rationally choose to be good rather than selfish. The so-called prisoner's dilemma becomes a non-issue. The dynamic prisoner's dilemma is modelled recursively in discrete time as shown in the equation below.

$$V(s_{t-1}, k) = \max_{a} \left\{ p(s_{t-1})\, k\, u(a, a') + \beta\, V(s_t(a, a'), k) \right\}$$

$V(s_{t-1}, k)$ is a value function equal to the miner's expected lifetime payoff.

$s_{t-1}(a, a')$ describes the choice of strategies in the previous period. Here this is just $s_{t-1}$ = someone played selfish or $s_{t-1}$ = everyone was good.

$p(s_{t-1})$ is the current bitcoin price as a function of $s_{t-1}$. We can assume that $p(\text{someone played selfish}) = p_{selfish} < p_{good} = p(\text{everyone was good})$. That is, selfish behavior will have a negative effect on future bitcoin prices.

$k$ is the number of units of mining hardware owned by the miner. I am assuming this remains constant over time.

$a$ is the miner's choice of strategy. There are two options here: $a$ = good or $a$ = selfish.

$a'$ is the strategy pursued by other miners. There are two options here: $a'$ = good or $a'$ = selfish.

$u(a, a')$ is the payoff from one period of the game (refer to the two-by-two table above).

$\beta$ is the discount factor representing a lower valuation on payoffs realized in the future. It must be between 0 and 1.

The basic prisoner's dilemma models the payoff in a single period, focusing on $u(a, a')$. This is incorrect. The miner enters the next period with $k$ units of hardware. Surely he will care if this hardware drops in value. This is the essential point. Because he owns specialized hardware, the miner has a stake in the system.

In a dynamic setting, the one-period game is called a "stage game," and the game that continues from any given history is called a "subgame." The basic equilibrium concept for dynamic games is called a subgame perfect equilibrium. Let's look at the good subgame perfect equilibrium. Suppose that all miners adopt the following strategy (called a grim trigger strategy):

1) If anyone has ever played selfish in the past, then play selfish.
2) If no one has ever played selfish in the past, then play good.

Suppose all other miners have been good in the past. Therefore our miner believes they will continue to be good. (Why would he believe otherwise?) Our miner can choose either to cheat or to be good. If he cheats, then he gets one period of good payoffs and a whole lifetime of bad payoffs. Writing $p_{high}$ for $p_{good}$ and $p_{low}$ for $p_{selfish}$, the expected value from cheating is

$$2 p_{high} k + \frac{\beta}{1-\beta} k p_{low}$$

The expected value from being good is

$$k p_{high} + \frac{\beta}{1-\beta} k p_{high} = \frac{k}{1-\beta} p_{high}$$

This tells us that playing good is a subgame perfect equilibrium as long as

$$\frac{k}{1-\beta} p_{high} > 2 p_{high} k + \frac{\beta}{1-\beta} k p_{low}$$

Simplifying (divide both sides by $k$, multiply by $1-\beta$, and collect the $p_{high}$ terms on the left):

$$p_{high} > \frac{\beta}{1 - 2(1-\beta)}\, p_{low} = \frac{\beta}{2\beta - 1}\, p_{low} \quad \text{if } \beta > \frac{1}{2}$$

This tells us two things.

1) The larger the price difference between $p_{high}$ and $p_{low}$, the easier it is to sustain the good equilibrium.
2) The higher the discount factor $\beta$, the easier it is to sustain the good equilibrium.

Conclusion: Selfish mining is one Nash equilibrium. However, provided that other people have been playing nicely, the individually rational behavior for bitcoin miners is to continue to play nicely. It is irrational to behave selfishly.

Caveats: Suppose we used CPU mining instead of ASIC mining. In this case, we could guess that the CPU's value is unrelated to bitcoin and that one could easily sell off the CPU. If so, we would not want to assume that $k$ is constant over time. The miner could go into the next period with no mining hardware at all. Without any mining hardware, the miner would no longer care about future bitcoin prices. Under this condition (where the miner has no sunk investment in the system), there is no way of sustaining the good equilibrium.

Implication: restricting a coin to CPU mining is a very bad idea.

Exercise for the reader: What does this exercise tell us about proof-of-stake?
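The grim-trigger comparison from section 2 can also be checked numerically. The sketch below is a minimal illustration, not part of the original primer: it evaluates the two closed-form lifetime values, cross-checks them against a truncated discounted sum, and tests the simplified condition. The function names and the example numbers (k = 1, p_high = 100, p_low = 50) are hypothetical, and the code assumes p_high > 0, p_low >= 0 and 0 < beta < 1.

```python
def value_good(k, p_high, beta):
    """Lifetime value of staying good forever: k * p_high / (1 - beta)."""
    return k * p_high / (1 - beta)


def value_cheat(k, p_high, p_low, beta):
    """One period at payoff 2 and price p_high, then payoff 1 at p_low forever."""
    return 2 * p_high * k + beta / (1 - beta) * k * p_low


def simulated_value(first_payoff, later_payoff, beta, horizon=2000):
    """Truncated discounted sum, used only to cross-check the closed forms."""
    total = first_payoff
    for t in range(1, horizon):
        total += beta ** t * later_payoff
    return total


def good_is_sustainable(p_high, p_low, beta):
    """Simplified condition: p_high > beta / (2*beta - 1) * p_low, beta > 1/2.

    For beta <= 1/2 (with p_low >= 0) the original inequality cannot hold,
    so the function returns False.
    """
    if beta <= 0.5:
        return False
    return p_high > beta / (2 * beta - 1) * p_low


if __name__ == "__main__":
    k, p_high, p_low = 1, 100.0, 50.0  # hypothetical example values
    for beta in (0.6, 0.9):
        good = value_good(k, p_high, beta)
        cheat = value_cheat(k, p_high, p_low, beta)
        print(beta, good, cheat, good_is_sustainable(p_high, p_low, beta))
```

With these example numbers, a patient miner (beta = 0.9) does better by staying good, while an impatient one (beta = 0.6) does better by cheating, which matches the second of the two observations drawn from the inequality.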
