Ads Ranking

I saw a YouTube video about Ads Ranking at Meta shortly after I read about ads in general, and what goes on behind serving them, back in August. So here’s a deep dive.

Meta makes a lot of revenue from running ads, and it’s a pretty large problem to solve: both sides of the platform, users and advertisers, are very diverse. This requires a lot of feature engineering and selection to really grasp what to model.


ᓚᘏᗢ ᓚᘏᗢ ᓚᘏᗢ ᓚᘏᗢ ᓚᘏᗢ
cats as a line break

The Ranking Formula: Predicting Total Value

At its core, ad ranking is about predicting the total value an ad impression will generate. This isn’t just about whether a user will click, but also what happens after the click. This is often summarized by a metric called eCPM (effective Cost Per Mille). To calculate this, we first need to predict user actions.

pCTR: Will the user click?

pCTR (predicted Click-Through Rate) is the probability that a user will click on an ad after seeing it. A high pCTR means the ad is likely relevant and engaging to a specific user at that moment.

\[\text{pCTR} = P( ~ \text{Click} ~ | ~ \text{User, Ad, Context} ~ )\]

pCVR: Will the user convert?

pCVR (predicted Conversion Rate) is the probability that a user will take a desired action after clicking the ad. A “conversion” could be anything the advertiser values: adding to a wishlist or cart, making a purchase, or installing an app (which TataClip / TataNeu loves).

\[\text{pCVR} = P(~ \text{Conversion} ~ | ~ \text{Click, User, Ad, Context} ~ )\]

This is crucial because a click alone doesn’t mean the advertiser achieved their goal. High pCVR suggests the ad’s landing page and offering are compelling to the user.

Calculating the Ad Score (eCPM)

The final ranking score combines these predictions with the advertiser’s bid. It balances the platform’s goal of showing relevant content to users with the advertiser’s goal (clicks, purchases, and so on).

A simplified version of the formula looks like this:

\[\text{Ad Score} = ( ~ \text{pCTR} \times \text{Bid}_{\text{click}} ~ ) + ( ~ \text{pCVR} \times \text{Bid}_{\text{conversion}} ~ )\]

This score represents the total expected value. An ad with a slightly lower pCTR but a much higher pCVR and conversion bid might outrank an ad that’s just simple clickbait.
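The simplified formula above can be sketched in a few lines of Python. The probabilities and bids below are made up for illustration; they just show how a high-intent ad can outrank a clickbait one:

```python
def ad_score(p_ctr, p_cvr, bid_click, bid_conv):
    # Expected value of one impression, per the simplified formula above.
    return p_ctr * bid_click + p_cvr * bid_conv

# A clickbait-style ad: many clicks, almost no conversions.
clickbait = ad_score(p_ctr=0.05, p_cvr=0.001, bid_click=0.50, bid_conv=2.00)

# A high-intent ad: fewer clicks, but a strong conversion rate and bid.
high_intent = ad_score(p_ctr=0.03, p_cvr=0.010, bid_click=0.50, bid_conv=5.00)

assert high_intent > clickbait  # the high-intent ad wins the ranking
```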


Modeling: The Two-Tower Architecture

To predict pCTR and pCVR, you need a model that understands the relationship between users and ads. Given the massive scale at Meta, a Two-Tower Model is a very effective architecture for this.

[Figure: A Two-Tower model architecture]

It works by separating the user and the ad into two different “towers” (which are essentially two separate deep neural networks).

  1. User Tower: This network takes all the user-side features (age, interests, past interactions) and computes a fixed-size numerical vector, or embedding, that represents the user.
  2. Ad Tower: This network does the same for ad-side features (category, brand, text, image) and also computes an embedding of the same size.

During serving, the model calculates the embeddings for the user and for many candidate ads. The final score (a proxy for pCTR/pCVR) is just the dot product of the user embedding and the ad embedding. A higher dot product implies a better match.

This is an efficient design as the ad embeddings can be pre-computed and stored. When a user logs in, you only need to compute their user embedding once and then use approximate nearest neighbor search to find the best-matching ad embeddings from millions of candidates.
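The serving path above can be sketched with NumPy. This is a toy exact scan over random embeddings (sizes and dimensions are illustrative); a production system would replace the full matrix product with an approximate nearest neighbor index:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Stand-ins for the two towers' outputs: one user embedding, and
# embeddings for 100k candidate ads, precomputed and stored offline.
user_emb = rng.normal(size=dim)
ad_embs = rng.normal(size=(100_000, dim))

# Score every candidate with a single matrix-vector dot product.
scores = ad_embs @ user_emb

# Take the top-k matches; argpartition finds the k largest in O(n),
# then we order just those k by score.
k = 5
top_k = np.argpartition(scores, -k)[-k:]
top_k = top_k[np.argsort(scores[top_k])[::-1]]
print(top_k, scores[top_k])
```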


~.~”~..~”~..~”~..~”~..~”~._.~”~.
a wave as a line break

The Features

The models rely on a rich set of features from both the user and the ad.

User-side Features

These are split into three main types:

User Description Features: profile attributes such as age, gender, location, and interests.

User Interaction Features: behavioral history such as past clicks, conversions, and engagement with similar ads.

Context Features: the circumstances of the request, such as time of day, device, and placement.

Ad-side Features

Content Features: what the ad contains (category, brand, text, image).

Historical Performance Features: how the ad has performed so far (historical CTR and conversion rate).


Auctions and Bidding

When a user opens their feed, there isn’t just one ad waiting; there’s an auction happening in an ad exchange.

Here’s how it generally works:

  1. Ad Request: The user opens the app, which sends an ad request to the ad exchange.
  2. Candidate Selection: The system retrieves a few hundred relevant ads from a pool of millions.
  3. Bidding & Ranking: For each of these ads, advertisers have already placed bids. The ranking model calculates the Ad Score (eCPM) for each one.
  4. The Auction: The ads are ranked by their Ad Score, and the one with the highest score wins the auction.
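The four steps above collapse into a single scoring pass at serving time. A minimal sketch, where `score_ad` is a hypothetical stand-in for the ranking model and candidate retrieval is assumed to have already happened:

```python
def run_auction(user, candidate_ads, score_ad):
    """Rank candidates by Ad Score (eCPM); the highest score wins.

    `score_ad(user, ad)` stands in for the model that combines
    pCTR/pCVR predictions with the advertiser's bids.
    """
    ranked = sorted(candidate_ads, key=lambda ad: score_ad(user, ad), reverse=True)
    return ranked[0]  # winner of the auction

# Toy usage: ads carry precomputed scores for illustration.
ads = [{"id": "a", "score": 0.027}, {"id": "b", "score": 0.065}]
winner = run_auction(user=None, candidate_ads=ads,
                     score_ad=lambda user, ad: ad["score"])
print(winner["id"])  # b
```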

Generalized Second-Price Auction

The winner does not pay what they bid; instead, a Generalized Second-Price (GSP) auction is used.

Price Calculation in a GSP Auction

In GSP auctions, ads are ranked by Ad Rank, computed as:

\[\text{Ad Rank}_i = \text{Bid}_i \times \text{Quality}_i\]

Then, ads are ordered by Ad Rank (highest first).
The winner doesn’t pay their own bid; instead, they pay just enough to maintain their position above the next advertiser.
So the price per click (PPC) for advertiser \( i \) is:

\[\text{Price}_i = \frac{\text{Ad Rank}_{i+1}}{\text{Quality}_i} + \epsilon\]

where \( \epsilon \) is a small increment (0.01 in the example below).


Example Table

| Advertiser | Bid ($) | Quality (pCTR etc.) | Ad Rank (Bid × Quality) | Price Paid ($) |
| --- | --- | --- | --- | --- |
| A | 2.00 | 0.50 | 1.00 | (0.81 ÷ 0.50) + 0.01 = 1.63 |
| B | 1.80 | 0.45 | 0.81 | (0.48 ÷ 0.45) + 0.01 = 1.08 |
| C | 1.20 | 0.40 | 0.48 | lowest rank, pays reserve price = 0.01 |

Step-by-step for Advertiser A:

  1. Next advertiser (B) has Ad Rank = 0.81
  2. A’s Quality = 0.50
  3. Apply the formula:

\(\text{Price}_A = \frac{0.81}{0.50} + 0.01 = 1.63\)

So A pays $1.63 per click, not their full $2.00 bid.
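The pricing rule is easy to verify in code. A minimal sketch that reproduces the table above (the epsilon and reserve values of 0.01 are taken from the example):

```python
def gsp_prices(bids, qualities, epsilon=0.01, reserve=0.01):
    """Generalized second-price: rank ads by bid * quality; each winner
    pays the next ad's Ad Rank divided by its own quality, plus epsilon.
    The lowest-ranked ad pays the reserve price."""
    ranked = sorted(range(len(bids)),
                    key=lambda i: bids[i] * qualities[i], reverse=True)
    prices = [0.0] * len(bids)
    for pos, i in enumerate(ranked):
        if pos + 1 < len(ranked):
            j = ranked[pos + 1]  # next advertiser down
            prices[i] = bids[j] * qualities[j] / qualities[i] + epsilon
        else:
            prices[i] = reserve
    return prices

# Advertisers A, B, C from the table above.
prices = gsp_prices(bids=[2.00, 1.80, 1.20], qualities=[0.50, 0.45, 0.40])
print([round(p, 2) for p in prices])  # [1.63, 1.08, 0.01]
```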


Cold Start for Ads?

How do you rank ads or personalize for users when they are brand new to the platform?

|  | User Cold Start | Ad Cold Start |
| --- | --- | --- |
| The Problem | A new user has no interaction history, making it difficult to build a user embedding or personalize their feed. | A new ad has no performance data (no historical CTR or conversion rate), making it hard to predict its eCPM. |
| Solutions | Use sign-up information (age, gender, location) as initial features. Serve globally or regionally popular/trending ads. Use a simpler exploration model to quickly learn user preferences. | Use content-based features from the ad tower (text, image, category, price) to infer its likely performance. Give the ad a temporary “exploration budget” to show it to a small, diverse audience and quickly gather performance data. |

Table comparing cold start for new ads with cold start for new users.



Further Reading

Acknowledgements

Codementor talk by a Ranking Engineer at Meta (YouTube link)

Multi Objective Loss Function

Ads ranking can be framed as a multi-objective loss function, as we have to combine engagement and value: balancing user experience (relevance, fatigue, diversity) against value for the advertiser.

\[L = -(\alpha \cdot \text{Revenue Gain} + \beta \cdot \text{User Engagement})\]
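As a toy sketch of that loss (the alpha and beta weights here are made up for illustration; choosing them is the actual hard part):

```python
def multi_objective_loss(revenue_gain, user_engagement, alpha=0.7, beta=0.3):
    # Negate the weighted sum so that minimizing the loss maximizes
    # both revenue and engagement, traded off by alpha and beta.
    return -(alpha * revenue_gain + beta * user_engagement)

loss = multi_objective_loss(revenue_gain=1.2, user_engagement=0.8)
```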

Last updated: October 4, 2025