Game Theory in the context of DS

Well as mentioned in Adv CS topics to explore , I could not read about Game Theory back in university, due to well, academic restrictions (that somehow only allowed me to pick a single elective for each semester in my sophomore year). This led me to choose between Probability and Random Processes by Prof. SNS and Game Theory by Prof. Ojha. I ended up picking PRP as I wanted to end up somewhere in DS.

Okay, now that we’re done with a bit of backstory, let’s jump into the juicy part.


𓆝 𓆟 𓆞 𓆝 𓆟
fish as a line break

Basics of Game Theory

Before jumping into continuous games and equations, it helps to ground ourselves in a very small example.

Consider a simple two-player game:

The payoff matrix below lists utilities in the form (Sam, Tom).

Sam \ Tom A B
A (1, 1) (1, -1)
B (-1, 1) (0, 0)

Utility

Utility is simply a numerical representation of how good an outcome is for a player. Higher utility means the player prefers that outcome more.

In this table:

You can think of utility as:

Dominant Strategy

A dominant strategy is an action that gives a player higher utility regardless of what the other player does.

Let’s look at Sam’s choices:

So A is a dominant strategy for Sam.

Now look at Tom:

A is also a dominant strategy for Tom.

When a dominant strategy exists, players do not need to reason deeply about the other player — they simply play it.

Nash Equilibrium

A Nash equilibrium is an action profile where no player can improve their utility by unilaterally deviating.

In this game:

Formally, each player is playing a best response to the other’s action.

Not all games have dominant strategies, but many still have Nash equilibria. Nash equilibrium is therefore the more general and important concept.

Cournot Duopoly

The Cournot duopoly is a textbook example of a simultaneous-move game where:

This is especially relevant for commodities and FMCG goods.

In a marketplace context, this can be loosely mapped to two sellers on a niche SERP, listing highly similar products with little to no differentiation. Excess supply reduces prices, compresses margins, and can reduce result diversity — eventually leading to user fatigue.

Model setup

Let:

The inverse demand function is:

\[P(s_1, s_2) = A - B(s_1 + s_2)\]

Price decreases as total quantity in the market increases.

Let:

Total costs:

Utilities (profits)

Utility is profit = revenue − cost.

For firm 1:

\[u_1(s_1, s_2) = [A - B(s_1 + s_2)] s_1 - Cs_1\]

Rewriting:

\[u_1(s_1, s_2) = [A - C - B(s_1 + s_2)] s_1\]

Firm 1 chooses s1 assuming s2 is fixed. The question becomes:
what quantity maximizes firm 1’s profit?

Best response

To find the best response of firm 1, we maximize u1 with respect to s1.

\[\frac{\partial u_1}{\partial s_1} = A - C - 2Bs_1 - Bs_2\]

Setting this to zero:

\[s_1^* = \frac{A - C - Bs_2}{2B}\]

This best-response function tells us that firm 1 produces less when it expects firm 2 to produce more.

By symmetry, firm 2’s best response is:

\[s_2^* = \frac{A - C - Bs_1}{2B}\]

Nash equilibrium (competitive equilibrium)

At Nash equilibrium, both firms play their best responses simultaneously. Solving the system:

\[s_1^* = s_2^* = \frac{A - C}{3B}\]

This point is called the Cournot–Nash equilibrium or competitive equilibrium.

Neither firm can increase its profit by changing output alone, given the other firm’s quantity.

Joint profit maximization vs competition

Now ask a different question:
What if both firms coordinated to maximize total profit?

Total (joint) utility is:

\[u_T = u_1(s_1, s_2) + u_2(s_2, s_1)\]

Maximizing u_T corresponds to joint profit maximization, similar to a cartel or monopoly outcome.

This outcome yields:

However, it is not stable without enforcement. Each firm has an incentive to deviate and produce slightly more, which pushes the system back toward the competitive (Nash) equilibrium.

This is a recurring theme in game theory:

Nash equilibria are stable, but not necessarily Pareto optimal.

Mixed Strategy Games

So far, we’ve mostly discussed games where players pick a pure strategy, i.e. they deterministically choose a single action. Some of these games have dominant strategies, some do not — but even when dominant strategies are absent, a stable outcome may still exist in pure strategies.

However, there are games where no pure-strategy Nash equilibrium exists at all. In such cases, any deterministic choice can be exploited by the opponent. This is where mixed strategies become necessary.

A mixed strategy is a probability distribution over a player’s actions. Instead of always choosing a single action, the player randomizes — not out of indecision, but as a deliberate strategy to remain unpredictable.

Matching Pennies

Matching Pennies is a classic two-player, zero-sum game.

The payoff matrix below lists utilities as (Player A, Player B):

Player A \ Player B Heads Tails
Heads (1, -1) (-1, 1)
Tails (-1, 1) (1, -1)

Best responses and cycling

Let’s look at best responses:

So:

Now for Player B:

So:

This creates a best-response cycle:

No action pair is stable.
There is no pure-strategy Nash equilibrium.

Why randomization is necessary

Suppose Player A always plays Heads.

Player B immediately exploits this by always playing Tails and winning every round.

The same happens for any deterministic strategy. Predictability is punished.

To avoid being exploited, each player must randomize in a way that makes the opponent indifferent between their available actions.

In Matching Pennies, the only such equilibrium is:

This is the mixed-strategy Nash equilibrium.

At this point:

This idea — using randomness as a strategic tool — shows up frequently in:

Bayesian Games

So far, we’ve assumed that all players know everything about the game and each other. In many real systems, this assumption breaks down.

Bayesian games model situations of incomplete information, where players do not know some aspects of the environment — such as the type, intent, or constraints of other players.

Core ingredients of a Bayesian game

A Bayesian game introduces three new elements:

  1. Nature
    A random process that determines unknown parameters of the game.

  2. Types
    Each player may have a private type, known only to themselves.

  3. Beliefs
    Other players hold probabilistic beliefs over these types.

The solution concept is a Bayesian Nash Equilibrium (BNE), where:

Marketplace example: signaling and trust

Consider a generic online marketplace with buyers and sellers.

Some sellers are genuine, others are scammers.

Nature and types

Actions

The payment method acts as a signal.

Payoffs (illustrative)

For a genuine seller:

For a scammer:

The buyer prefers to transact only when the probability of scam is sufficiently low.

Belief updates

The buyer updates beliefs based on the observed payment method:

This belief update is what drives buyer behavior.

Separating equilibrium

In equilibrium:

This is called a separating equilibrium, because:

The key requirement is that:

Why this matters

In isolation, a genuine seller might prefer direct transfer due to lower fees.
But once signaling and belief updates are introduced, insured payment becomes the rational choice.

This explains why:

Game theory predicts that without credible signals, trust collapses, and mutually beneficial trades stop happening — even when most participants are honest.

Auction Theory

Auctions are one of the most important applications of game theory in real systems — especially in ads, marketplaces, and platform economics.

At a high level, auctions solve a price discovery problem.

The seller does not know how much buyers value an item. Each buyer knows their own willingness to pay, but this information is private. An auction is a mechanism designed to extract this private information through strategic interaction.

Values, bids, and utility

Before comparing auction formats, it’s important to separate three concepts:

utility = value − price paid (if you win) = 0 (if you lose)

A well-designed auction aligns incentives so that:

First-Price vs Second-Price Auctions

We restrict attention to single-item sealed-bid auctions.

First-Price Sealed-Bid Auction (FPA)

Strategic behavior: bid shading

If you bid your true value in a first-price auction and win, you pay exactly what the item is worth to you — yielding zero utility.

To achieve positive utility, bidders shade their bids downward:

bid = true value − shading

This creates a trade-off:

As a result:

This is why first-price auctions require sophisticated bidding strategies and strong assumptions about competitors.

Second-Price Sealed-Bid Auction (SPA)

Also known as a Vickrey Auction.

At first glance, this looks odd — why shouldn’t the winner pay what they bid?

The answer lies in incentives.

Why truthful bidding works in a Second-Price Auction

In a second-price auction, bidding your true value is a weakly dominant strategy.

Let:

The utility function is:

\[u_i(b_i, B) = \begin{cases} v_i - B & \text{if } b_i > B \\ 0 & \text{if } b_i < B \end{cases}\]

We compare truthful bidding (b_i = v_i) against deviations.


Case A: Underbidding (b_i < v_i)

  1. If B < b_i
    You win. Utility = v_i − B.
    Same outcome as bidding truthfully.

  2. If B > v_i
    You lose. Utility = 0.
    Same outcome as bidding truthfully.

  3. If b_i < B < v_i

    • Underbidding → you lose → utility 0
    • Truthful bidding → you win → utility v_i − B > 0

Underbidding can strictly reduce utility.


Case B: Overbidding (b_i > v_i)

  1. If B < v_i
    You win. Utility = v_i − B.
    Same outcome as truthful bidding.

  2. If B > b_i
    You lose. Utility = 0.
    Same outcome as truthful bidding.

  3. If v_i < B < b_i

    • Truthful bidding → you lose → utility 0
    • Overbidding → you win and pay B > v_i
      → utility is negative

Overbidding risks losses.


Conclusion

In all possible scenarios:

Therefore, truthful bidding is a weakly dominant strategy in a second-price auction.

Crucially, this strategy:

This robustness is what makes second-price auctions special.

Second-Price Auctions and Price Discovery

Now we connect incentives to price discovery.

In a second-price auction:

Conceptually:

True values (sorted): v1 < v2 < v3 < v4 Winner: v4 Price paid: v3

The price reflects:

Because bids reveal values, the seller learns:

This is why second-price auctions are often described as truth-revealing mechanisms.

Revenue Equivalence Theorem

A common misconception is that second-price auctions generate less revenue because the winner does not pay their own bid.

Under specific assumptions, this is not true.

The Revenue Equivalence Theorem states:

In auctions with risk-neutral bidders, independent private values, and no reserve price, all auction formats that allocate the item to the highest bidder yield the same expected revenue.

Key assumptions:

Interpretation:

This is why mechanism design focuses on incentives, not just revenue.

Generalized Second Price (GSP) Auctions

Many ad auctions are not single-item auctions but slot auctions (e.g. multiple ad positions).

The most common mechanism used in practice is the Generalized Second Price (GSP) auction.

Simplified view:

GSP is inspired by second-price auctions, but:

Despite this, GSP remains popular because:

Modern ad systems often combine:


Last updated: December 23, 2025