Finitely Repeated Games: SPNE, Rewards and the Uniqueness Theorem
Repetition and observability create the opportunity for rewards and punishments. In a finitely repeated Prisoner’s Dilemma, the unique subgame‑perfect Nash equilibrium (SPNE) is to defect in every period. As Paul Milgrom observes, “The game is always larger than you think,” reflecting how real‑world interactions often extend beyond the modeled horizon.
Modeling Finitely Repeated Games
A finitely repeated game is a special case of a multistage game. The stage game (G) is played from period 0 to period (T), so the repeated game is denoted (G(T)). Players observe the entire history of past actions, which defines information sets and allows history‑dependent strategies. A strategy is a complete contingent plan that specifies an action for every possible history. Payoffs are usually averaged over the (T+1) periods.
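As a sketch of the averaging convention just described, player (i)'s payoff in (G(T)) can be written as below. The symbols (U_i), (u_i) and (a_t) are notation assumed here for illustration; the text above only states that payoffs are averaged over the (T+1) periods and that players observe the full history.

```latex
\[
U_i \;=\; \frac{1}{T+1} \sum_{t=0}^{T} u_i(a_t),
\qquad
h_t = (a_0, a_1, \dots, a_{t-1}),
\]
% u_i(a_t): player i's stage-game payoff from the action profile a_t played in period t
% h_t:      the history observed at the start of period t (empty when t = 0)
```

Because every payoff is divided by the same constant (T+1), comparing averages and comparing sums rank strategies identically.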
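The one‑shot deviation principle is the standard method for verifying SPNE in multistage games. It requires that, at every history, no player can improve his payoff by changing his action in that single period and then returning to his original strategy. If no such one‑shot deviation is profitable at any history, the strategy profile is an SPNE.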
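The check can be sketched in code for the smallest case, a two‑period game (T = 1). This is a minimal illustration rather than a general solver: it covers pure strategies only, compares total rather than averaged payoffs (which gives the same ranking), and uses the Prisoner's Dilemma numbers listed under Illustrative Payoffs. The function name `passes_one_shot_test` and the way strategies are encoded are assumptions made for this example.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate, defect

# Stage payoffs (player 0, player 1) for the Prisoner's Dilemma.
PD = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}


def passes_one_shot_test(first, second, payoffs=PD):
    """Check a two-period pure-strategy profile against one-shot deviations.

    first  : action profile prescribed in period 0
    second : function mapping the period-0 profile (the history)
             to the action profile prescribed in period 1
    """
    def swap(profile, player, action):
        changed = list(profile)
        changed[player] = action
        return tuple(changed)

    # Period 1: after every history, no unilateral change may pay.
    for history in product(ACTIONS, repeat=2):
        plan = second(history)
        for i, a in product(range(2), ACTIONS):
            if payoffs[swap(plan, i, a)][i] > payoffs[plan][i]:
                return False

    # Period 0: deviate once, then follow the original strategy.
    def total(profile, i):
        return payoffs[profile][i] + payoffs[second(profile)][i]

    for i, a in product(range(2), ACTIONS):
        if total(swap(first, i, a), i) > total(first, i):
            return False
    return True


always_defect = passes_one_shot_test(("D", "D"), lambda h: ("D", "D"))
cooperate_first = passes_one_shot_test(("C", "C"), lambda h: ("D", "D"))
print(always_defect, cooperate_first)  # prints: True False
```

Defecting in both periods survives the test, while cooperating first fails because a player gains 3 instead of 2 in period 0 and loses nothing afterwards.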
The Uniqueness Theorem
Theorem. If the stage game (G) has a unique Nash equilibrium (a^{*}), then the finitely repeated game (G(T)) has a unique SPNE in which players play (a^{*}) after every history.
Proof sketch. In the final period (T) there is no future to influence, so players must play a Nash equilibrium of the stage game. Because that equilibrium is unique, play in period (T) is (a^{*}) after every history. Play in period (T) therefore cannot depend on what happened in period (T-1), so the incentives in period (T-1) are those of the stage game itself and players again choose (a^{*}). Backward induction repeats this argument for periods (T-2), …, 0, so no history‑dependent rewards or punishments can arise. Consequently, the only SPNE plays (a^{*}) in every period.
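The backward‑induction argument can be traced in a short sketch. It is restricted to pure strategies and to the Prisoner's Dilemma numbers listed under Illustrative Payoffs, so it illustrates the proof rather than establishing the theorem in general. The helper names `pure_nash` and `backward_induction` are assumptions made for this example.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate, defect

# Stage payoffs (player 0, player 1) for the Prisoner's Dilemma.
PD = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}


def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game."""
    equilibria = []
    for a0, a1 in product(ACTIONS, repeat=2):
        best0 = all(payoffs[(a0, a1)][0] >= payoffs[(d, a1)][0] for d in ACTIONS)
        best1 = all(payoffs[(a0, a1)][1] >= payoffs[(a0, d)][1] for d in ACTIONS)
        if best0 and best1:
            equilibria.append((a0, a1))
    return equilibria


def backward_induction(payoffs, T):
    """Solve periods T, T-1, ..., 0, assuming one pure equilibrium per step."""
    continuation = (0, 0)  # nothing follows period T
    path = []
    for t in range(T, -1, -1):
        # Later play is already pinned down, so it adds the same
        # constant to every cell of the period-t game.
        reduced = {
            a: (u[0] + continuation[0], u[1] + continuation[1])
            for a, u in payoffs.items()
        }
        equilibria = pure_nash(reduced)
        if len(equilibria) != 1:
            raise ValueError("this sketch assumes a unique pure equilibrium")
        profile = equilibria[0]
        continuation = reduced[profile]
        path.append((t, profile))
    return list(reversed(path))


plan = backward_induction(PD, T=3)
print(plan)
```

Adding a constant to every cell leaves best responses unchanged, which is why each period's reduced game has the same equilibrium as the stage game and the plan is mutual defection in all four periods.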
Games with Multiple Nash Equilibria
When a stage game has multiple Nash equilibria, the uniqueness theorem no longer applies. The Stag Hunt illustrates this case. It has a “good” equilibrium (both players hunt stag, payoff 2,2) and a “bad” equilibrium (both hunt hare, payoff 1,1). Players can construct SPNE in which later play depends on the history: the good equilibrium is the reward for following the prescribed actions in earlier periods, and the bad equilibrium is the punishment for deviating from them.
A deviation today changes the history (h_{t+1}) and can trigger a switch to the worse equilibrium in the final period. The threat of receiving the lower‑payoff equilibrium tomorrow is just large enough to offset any immediate gain, so the deviation is no longer profitable. This works only because the stage game has more than one Nash equilibrium for later play to switch between.
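The deterrence condition can be written as a single inequality for a two‑period version of the argument. The notation (a_i'), (a_{-i}) and (g_i) is assumed here for illustration, and the numbers are the two Stag Hunt equilibrium payoffs quoted above.

```latex
\[
\underbrace{u_i(a_i', a_{-i}) - u_i(a_i, a_{-i})}_{\text{gain today } g_i}
\;\le\;
\underbrace{u_i(\text{stag},\text{stag}) - u_i(\text{hare},\text{hare})}_{\text{loss tomorrow}}
\;=\; 2 - 1 \;=\; 1 .
\]
% a_i : the action prescribed for player i today;  a_i' : the deviation
% a_{-i} : the actions prescribed for the other players
```

Under these payoffs, switching from the good to the bad equilibrium in the final period can deter any deviation whose one‑period gain is at most 1. Averaging payoffs divides both sides by the same constant, so the comparison is unchanged.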
Mechanisms and Explanations
Backward induction in repeated games. In period (T) players must play a Nash equilibrium of the stage game because no future period exists. The equilibrium anticipated in period (T) determines the incentives for period (T-1); the logic propagates backward to period 0, so that the strategies form a Nash equilibrium in every subgame.
Deterrence mechanism. A player’s action (a_{it}) influences both the immediate flow payoff and the subsequent history (h_{t+1}). If future actions are contingent on that history, a player can be deterred from a deviation that is profitable today by the threat of a lower‑payoff equilibrium tomorrow. As the lecture puts it, “The punishment you experience tomorrow is just large enough to offset the gain you experienced today.”
Illustrative Payoffs
- Prisoner’s Dilemma: (2,2) for mutual cooperation, (3,‑1) or (‑1,3) for one‑sided defection, (0,0) for mutual defection.
- Stag Hunt: (2,2) for mutual stag, (1,1) for mutual hare, (1,0) or (0,1) when one player hunts stag and the other hunts hare.
These numerical examples make the reward‑punishment logic concrete and show how the structure of the stage game shapes the set of possible SPNE in its finite repetition.
Takeaways
- If a stage game has a single Nash equilibrium, the finitely repeated game has a unique SPNE that repeats that equilibrium in every period.
- The one‑shot deviation principle checks that, at every history, no player can profit by deviating in a single period and then returning to his original strategy.
- When a stage game has multiple Nash equilibria, such as the Stag Hunt, players can use reward and punishment equilibria to sustain cooperative behavior early on.
- Backward induction forces play in the final period to be a Nash equilibrium of the stage game, and the equilibrium anticipated there determines incentives in all earlier periods.
- A future threat of a lower‑payoff equilibrium can offset a short‑term gain, making deviations unattractive even when immediate payoffs are higher.
Frequently Asked Questions
Why does a unique Nash equilibrium in the stage game guarantee a unique SPNE in the finitely repeated game?
Backward induction forces play in the last period to be the unique Nash equilibrium. With no alternative equilibrium available, future play cannot be conditioned on history, so the only strategy profile that survives subgame perfection repeats that equilibrium in every period.
How does the threat of punishment work in a finitely repeated Stag Hunt?
In a Stag Hunt with two Nash equilibria, players can agree to play the high‑payoff “stag” equilibrium in the final period and threaten to switch to the low‑payoff “hare” equilibrium if anyone deviates earlier; the loss from the future punishment at least offsets the immediate gain, deterring deviation.
Who is MIT OpenCourseWare on YouTube?
MIT OpenCourseWare is a YouTube channel that publishes videos on a range of topics.