Ads Ranking
Saw a YouTube video about Ads Ranking at Meta, shortly after reading about ads in general and what goes on behind serving them back in August. So here’s a deep dive.
Meta makes a lot of revenue from running ads, and it’s a pretty large problem to solve, as both sides—users and advertisers—are very diverse on the platform. This requires a lot of feature engineering and selection to really grasp what to model.
- Ads Ranking
- The Ranking Formula: Predicting Total Value
- Modeling: The Two-Tower Architecture
- The Features
- Auctions and Bidding
- Cold Start for Ads?
- Further Reading
ᓚᘏᗢ ᓚᘏᗢ ᓚᘏᗢ ᓚᘏᗢ ᓚᘏᗢ
cats as a line break
The Ranking Formula: Predicting Total Value
At its core, ad ranking is about predicting the total value an ad impression will generate. This isn’t just about whether a user will click, but also what happens after the click. This is often summarized by a metric called eCPM (effective Cost Per Mille). To calculate this, we first need to predict user actions.
pCTR: Will the user click?
pCTR (predicted Click-Through Rate) is the probability that a user will click on an ad after seeing it. A high pCTR means the ad is likely relevant and engaging to a specific user at that moment.
\[\text{pCTR} = P( ~ \text{Click} ~ | ~ \text{User, Ad, Context} ~ )\]
pCVR: Will the user convert?
pCVR (predicted Conversion Rate) is the probability that a user will take a desired action after clicking the ad. A “conversion” could be anything the advertiser values: adding to a wishlist/cart, making a purchase, or installing an app (which TataClip / TataNeu loves).
\[\text{pCVR} = P(~ \text{Conversion} ~ | ~ \text{Click, User, Ad, Context} ~ )\]
This is crucial because a click alone doesn’t mean the advertiser achieved their goal. High pCVR suggests the ad’s landing page and offering are compelling to the user.
Calculating the Ad Score (eCPM)
The final ranking score combines these predictions with the advertiser’s bid. It balances the platform’s goal of showing relevant content to users with the advertiser’s goal (clicks, purchases, and so on).
A simplified version of the formula looks like this:
\[\text{Ad Score} = ( ~ \text{pCTR} \times \text{Bid}_{\text{click}} ~ ) + ( ~ \text{pCVR} \times \text{Bid}_{\text{conversion}} ~ )\]
This score represents the total expected value. An ad with a slightly lower pCTR but a much higher pCVR and conversion bid might outrank an ad that’s just simple clickbait.
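As a toy sketch of this formula (the production scoring is certainly more involved), here’s the Ad Score computation in Python; the bids and probabilities below are made-up numbers:

```python
def ad_score(p_ctr: float, p_cvr: float, bid_click: float, bid_conversion: float) -> float:
    """Expected value of one impression: pCTR * click bid + pCVR * conversion bid."""
    return p_ctr * bid_click + p_cvr * bid_conversion

# A clickbaity ad vs. a lower-CTR but high-intent ad (hypothetical numbers):
clickbait = ad_score(p_ctr=0.10, p_cvr=0.001, bid_click=0.50, bid_conversion=5.0)   # 0.055
high_intent = ad_score(p_ctr=0.04, p_cvr=0.010, bid_click=0.50, bid_conversion=5.0) # 0.070
print(clickbait, high_intent)  # the high-intent ad outranks the clickbait one
```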
Modeling: The Two-Tower Architecture
To predict pCTR and pCVR, you need a model that understands the relationship between users and ads. Given the massive scale at Meta, a Two-Tower Model is a very effective architecture for this.
It works by separating the user and the ad into two different “towers” (which are essentially two separate deep neural networks).
- User Tower: This network takes all the user-side features (age, interests, past interactions) and computes a fixed-size numerical vector, or embedding, that represents the user.
- Ad Tower: This network does the same for ad-side features (category, brand, text, image) and also computes an embedding of the same size.
During serving, the model calculates the embeddings for the user and for many candidate ads. The final score (a proxy for pCTR/pCVR) is just the dot product of the user embedding and the ad embedding. A higher dot product implies a better match.
This is an efficient design as the ad embeddings can be pre-computed and stored. When a user logs in, you only need to compute their user embedding once and then use approximate nearest neighbor search to find the best-matching ad embeddings from millions of candidates.
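Here’s a minimal PyTorch sketch of the two-tower idea, assuming each tower is a small MLP over pre-extracted dense features. The feature dimensions, layer sizes, and L2 normalization are assumptions for illustration, and at real scale the final `topk` would be replaced by approximate nearest neighbor search:

```python
import torch
import torch.nn as nn

class Tower(nn.Module):
    """A small MLP that maps raw features to a fixed-size embedding."""
    def __init__(self, in_dim: int, emb_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, emb_dim),
        )

    def forward(self, x):
        # L2-normalize so the dot product behaves like cosine similarity
        return nn.functional.normalize(self.net(x), dim=-1)

user_tower = Tower(in_dim=40)  # user-side features (demographics, interactions, context)
ad_tower = Tower(in_dim=80)    # ad-side features (category, brand, text/image signals)

users = torch.randn(2, 40)     # a batch of 2 users
ads = torch.randn(1000, 80)    # 1000 candidate ads (embeddings precomputed offline)

u = user_tower(users)          # (2, 64)
a = ad_tower(ads)              # (1000, 64)
scores = u @ a.T               # (2, 1000) dot products; higher = better match
top = scores.topk(k=5, dim=-1) # top-5 ads per user (ANN search at real scale)
print(top.indices.shape)       # torch.Size([2, 5])
```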
~.~”~..~”~..~”~..~”~..~”~._.~”~.
a wave as a line break
The Features
The models rely on a rich set of features from both the user and the ad.
User-side Features
These are split into two main types:
User Description Features
- Demographics: `age`, `country`, `gender`
- Inferred Interests: topics they’ve engaged with
User Interaction Features
- Past ad engagement: `clicks`, `add-to-carts`, `purchases` over a roughly 90-day window (the exact window isn’t fixed)
- Ad IDs that led to a conversion
Context Features
- Ongoing or expected events: `user local time`, `user city` (for hyperlocal demand), `season`, `is_holiday`
Ad-side Features
Content Features
- `title`, `description`
- `ad_category`
- `brand_name`
- `avg_price` of the product(s)
- `image`
Historical Performance Features
- `avg_CTR` across different user segments
- Number of `likes`, `comments`, and `shares` the ad already has
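To make these feature sets concrete, here’s what one assembled record for a single (user, ad, context) candidate could look like. The schema is entirely hypothetical; names and values are illustrative, not Meta’s actual features:

```python
# Hypothetical feature record for one (user, ad, context) candidate.
example = {
    # user description
    "age": 27, "country": "IN", "gender": "M",
    "inferred_interests": ["sneakers", "cricket"],
    # user interaction (90-day window)
    "clicks_90d": 14, "add_to_carts_90d": 3, "purchases_90d": 1,
    "converted_ad_ids": [88123, 90417],
    # context
    "user_local_time": "2025-10-12T19:40:00", "user_city": "Mumbai",
    "season": "autumn", "is_holiday": False,
    # ad content
    "title": "Festive sneaker sale", "ad_category": "footwear",
    "brand_name": "ExampleBrand", "avg_price": 2499.0,
    # ad historical performance
    "avg_ctr_segment": 0.031, "likes": 5200, "comments": 310, "shares": 95,
}
```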
Auctions and Bidding
When a user opens their feed there isn’t just one ad waiting; there’s an auction in an ad exchange.
Here’s how it generally works:
- Ad Request: User opens the app => an ad request is sent to the ad exchange
- Candidate Selection: The system retrieves a few hundred relevant ads from a pool of millions
- Bidding & Ranking: For each of these ads, advertisers have already placed bids. The ranking model calculates the Ad Score (eCPM) for each one.
- The Auction: The ads are ranked by their Ad Score, and the one with the highest score wins the auction.
Generalized Second-Price Auction
The winner does not pay what they bid; instead, a Generalized Second-Price (GSP) auction is used.
- The ad with the highest Ad Score wins the top slot.
- The winner pays the bid of the advertiser ranked just below them, plus a small increment (e.g. $0.01).
Price Calculation in a GSP Auction
In GSP auctions, ads are ranked by Ad Rank, computed as:
\[\text{Ad Rank}_i = \text{Bid}_i \times \text{Quality}_i\]
Then, ads are ordered by Ad Rank (highest first).
The winner doesn’t pay their own bid; instead, they pay just enough to maintain their position above the next advertiser.
So the price per click (PPC) for advertiser $i$ is:
\[\text{Price}_i = \frac{\text{Ad Rank}_{i+1}}{\text{Quality}_i} + \epsilon\]
where
- $\text{Ad Rank}_{i+1}$ is the next advertiser’s score,
- $\text{Quality}_i$ is the ad’s own quality score,
- and $\epsilon$ is a tiny increment (like $0.01).
Example Table
| Advertiser | Bid ($) | Quality (pCTR etc.) | Ad Rank (Bid × Quality) | Price Paid ($) |
|---|---|---|---|---|
| A | 2.00 | 0.50 | 1.00 | (0.81 ÷ 0.50) + 0.01 = 1.63 |
| B | 1.80 | 0.45 | 0.81 | (0.48 ÷ 0.45) + 0.01 = 1.08 |
| C | 1.20 | 0.40 | 0.48 | (lowest rank, pays reserve price) = 0.01 |
Step-by-step for Advertiser A:
- Next advertiser (B) has Ad Rank = 0.81
- A’s Quality = 0.50
- Apply formula:
\(\text{Price}_A = \frac{0.81}{0.50} + 0.01 = 1.63\)
So A pays $1.63 per click, not their full $2.00 bid.
- This system rewards high-quality ads: if an ad’s quality is better (higher pCTR, better engagement), the advertiser can pay less for the same position.
- It also discourages overbidding, since the price you pay depends on your competitors, not your own bid.
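Here’s a small sketch of the GSP pricing step, reproducing the table above; treating the lowest-ranked ad’s price as a fixed $0.01 reserve is an assumption:

```python
def gsp_prices(bids, qualities, eps=0.01, reserve=0.01):
    """Rank ads by bid * quality, then charge each winner just enough to stay
    above the next advertiser: AdRank[i+1] / Quality[i] + eps."""
    ranked = sorted(range(len(bids)), key=lambda i: bids[i] * qualities[i], reverse=True)
    prices = {}
    for pos, i in enumerate(ranked):
        if pos + 1 < len(ranked):
            next_rank = bids[ranked[pos + 1]] * qualities[ranked[pos + 1]]
            prices[i] = next_rank / qualities[i] + eps
        else:
            prices[i] = reserve  # lowest-ranked ad pays the reserve price
    return [prices[i] for i in range(len(bids))]

# Advertisers A, B, C from the table above:
print(gsp_prices([2.00, 1.80, 1.20], [0.50, 0.45, 0.40]))
# ≈ [1.63, 1.08, 0.01] -> A pays $1.63, B ~$1.08, C the reserve
```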
Cold Start for Ads?
How do you rank ads or personalize for users when they are brand new to the platform?
| | User Cold Start | Ad Cold Start |
|---|---|---|
| The Problem | A new user has no interaction history, making it difficult to build a user embedding or personalize their feed. | A new ad has no performance data (no historical CTR or conversion rate), making it hard to predict its eCPM. |
| Solutions | - Use sign-up information (age, gender, location) as initial features. - Serve globally or regionally popular/trending ads. - Use a simpler exploration model to quickly learn user preferences. | - Use content-based features from the ad tower (text, image, category, price) to infer its likely performance. - Give the ad a temporary “exploration budget” to show it to a small, diverse audience and quickly gather performance data. |
Table comparing how cold start differs for ads vs. users
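The “exploration budget” idea in the table can be sketched as a simple epsilon-greedy policy. This is a toy illustration, not Meta’s actual system; the 5% exploration rate, the 1,000-impression “cold” threshold, and the candidate schema are all assumptions:

```python
import random

def pick_ad(candidates, score_fn, epsilon=0.05):
    """Epsilon-greedy selection: mostly exploit the model's top-scored ad,
    but occasionally show a cold ad to gather performance data."""
    cold_ads = [ad for ad in candidates if ad["impressions"] < 1000]  # threshold assumed
    if cold_ads and random.random() < epsilon:
        return random.choice(cold_ads)     # explore: spend a bit of traffic on a cold ad
    return max(candidates, key=score_fn)   # exploit: highest predicted eCPM

candidates = [
    {"id": 1, "impressions": 50_000, "score": 0.9},
    {"id": 2, "impressions": 120, "score": 0.4},  # new ad, little data
]
print(pick_ad(candidates, score_fn=lambda ad: ad["score"]))
```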
Further Reading
Acknowledgements
- Codementor talk by a Ranking Engineer at Meta (YouTube link)
Multi-Objective Loss Function
Ads ranking uses a multi-objective loss function, since it has to combine engagement and value: a balance between user experience (relevance, fatigue, diversity) and value for the advertiser.
\[L = -(\alpha \cdot \text{Revenue Gain} + \beta \cdot \text{User Engagement})\]
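As a rough sketch of how this objective could appear in training code (flipping the sign so we minimize losses instead of maximizing gains); the α/β weights and the choice of component losses are assumptions:

```python
import torch
import torch.nn.functional as F

def multi_objective_loss(pred_value, label_value, pred_ctr, label_click,
                         alpha=0.7, beta=0.3):
    """Weighted sum of a revenue-oriented loss and an engagement-oriented loss.
    Minimizing this corresponds to maximizing alpha*revenue + beta*engagement."""
    revenue_loss = F.mse_loss(pred_value, label_value)               # advertiser value term
    engagement_loss = F.binary_cross_entropy(pred_ctr, label_click)  # user engagement term
    return alpha * revenue_loss + beta * engagement_loss
```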