If you've ever asked "did this campaign actually drive results, or would those conversions have happened anyway?", you need a lift test. It's the only way to establish true causation in marketing measurement, and it's more accessible than most teams think.

What Is a Lift Test?

A lift test (also called an incrementality test) is a controlled experiment that measures the causal impact of a marketing activity. The core idea is simple: split your audience or geographic regions into a test group (exposed to marketing) and a control group (not exposed), then compare outcomes.

The difference in performance between the two groups represents the incremental lift — the additional conversions, revenue, or other outcomes that would not have occurred without the marketing activity.

Unlike attribution models that distribute credit across touchpoints, lift tests answer a fundamentally different question: "What would have happened if we did nothing?"

Two Types of Lift Tests

1. Geo-Based Experiments

Geo experiments divide geographic regions into test and control groups. The test regions receive the marketing activity while the control regions do not. This approach works for any channel — including offline channels like TV, radio, and out-of-home — because you're measuring aggregate outcomes at the regional level.

When to use geo tests:

  • Measuring the impact of TV, radio, or outdoor advertising
  • Testing channels where user-level targeting isn't possible
  • When you need a privacy-compliant approach with no user-level data
  • Validating your MMM model outputs

How it works in practice:

  1. Select matched markets — Use historical data to pair regions with similar sales patterns, demographics, and seasonality. Statistical matching methods like Euclidean distance or dynamic time warping work well here.
  2. Randomly assign — Flip a coin (or use a randomisation algorithm) to assign one region in each pair to test, the other to control.
  3. Run the experiment — Activate the campaign in test regions only. Typical test duration is 4-8 weeks, depending on the purchase cycle.
  4. Analyse with causal methods — Use Bayesian structural time series (CausalImpact) or difference-in-differences to estimate the counterfactual and measure lift.
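The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the region names and sales figures are hypothetical, and a real test would match on multiple features (demographics, seasonality) and may prefer dynamic time warping over plain Euclidean distance.

```python
import math
import random

# Hypothetical weekly sales histories per region (illustrative numbers).
history = {
    "north": [120, 130, 125, 140],
    "south": [118, 128, 127, 138],
    "east":  [220, 210, 230, 225],
    "west":  [215, 212, 228, 229],
}

def distance(a, b):
    """Euclidean distance between two sales time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Step 1: greedily pair each region with its closest unmatched neighbour.
regions = list(history)
pairs = []
while len(regions) >= 2:
    r = regions.pop(0)
    closest = min(regions, key=lambda s: distance(history[r], history[s]))
    regions.remove(closest)
    pairs.append((r, closest))

# Step 2: randomly assign one region in each pair to test, the other to control.
random.seed(42)
assignment = {}
for a, b in pairs:
    test, control = (a, b) if random.random() < 0.5 else (b, a)
    assignment[test] = "test"
    assignment[control] = "control"

print(pairs)
print(assignment)
```

Pairing within matched pairs before randomising keeps the test and control groups balanced on historical behaviour, which reduces variance in the final lift estimate.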

2. Audience-Based Experiments (RCTs)

Audience-based tests (randomised controlled trials) work at the user level. You randomly split users into test and control groups, serve ads only to the test group, and compare conversion rates. This is the standard approach for digital channels like paid social, display, and paid search.

When to use audience tests:

  • Measuring incrementality of digital campaigns (Meta, Google, TikTok)
  • Testing specific creative strategies or audience segments
  • When you need faster results with higher statistical power
  • A/B testing campaign variations for incremental impact

Platform-specific options:

  • Meta Conversion Lift — Meta's built-in tool creates holdout groups at the user level. You define the campaign and Meta handles randomisation.
  • Google Conversion Lift — Similar approach for Google Ads, using ghost ads or intent-to-treat holdouts.
  • Custom holdouts — Build your own by splitting audience lists and suppressing ads to the control group.
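A custom holdout along the lines of the last option is often built with deterministic hash-based bucketing, so each user's assignment stays stable however many times the audience list is re-uploaded. The salt, holdout size, and function name below are illustrative assumptions:

```python
import hashlib

def holdout_group(user_id: str, salt: str = "lift-test-q3",
                  holdout_pct: float = 0.10) -> str:
    """Deterministically assign a user to 'control' (ads suppressed) or 'test'.

    Hashing user_id with a per-test salt gives a stable, uniform-ish
    assignment without storing any state. Salt and holdout size are
    illustrative, not a recommendation.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # maps hash into [0, 1]
    return "control" if bucket < holdout_pct else "test"

users = [f"user_{i}" for i in range(10_000)]
groups = [holdout_group(u) for u in users]
print(groups.count("control") / len(groups))  # roughly the 10% holdout
```

Using a fresh salt for each test avoids reusing the same control users across consecutive experiments.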

Designing a Robust Lift Test

Statistical Power and Sample Size

Before running any test, you need to ensure you have enough statistical power to detect a meaningful effect. This means calculating the minimum sample size (or number of geo regions) required.

Key inputs to a power calculation:

  • Baseline conversion rate — What's the current conversion rate without the campaign?
  • Minimum detectable effect (MDE) — What's the smallest lift you'd consider meaningful? (e.g. 5% relative lift)
  • Significance level (alpha) — Typically 0.05 (5% false positive rate)
  • Power (1 - beta) — Typically 0.8 (80% chance of detecting a true effect)

For geo experiments, the number of available regions is usually the binding constraint. With only 20 DMAs, for example, you can only reliably detect relatively large effects at conventional significance levels. This is where Bayesian methods shine: by incorporating prior information, they can extract more signal from small samples.
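For audience tests, the classical sample-size calculation from those four inputs uses the standard two-proportion z-test approximation. The figures below are illustrative:

```python
import math
from statistics import NormalDist

def sample_size_per_group(baseline_rate, relative_mde,
                          alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-proportion z-test.

    baseline_rate: control conversion rate (e.g. 0.02)
    relative_mde:  smallest relative lift worth detecting (e.g. 0.05 = 5%)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# A 2% baseline rate and a 5% relative MDE need groups in the
# hundreds of thousands of users:
print(sample_size_per_group(0.02, 0.05))
```

Note how quickly the requirement shrinks as the MDE grows: doubling the detectable lift roughly quarters the sample size, which is why small-budget tests should target larger effects.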

Avoiding Common Pitfalls

  • Contamination — Ensure control groups aren't accidentally exposed to the campaign. For geo tests, check for media spillover between adjacent regions.
  • Seasonality — Don't run tests during unusual periods (Black Friday, product launches) unless that's specifically what you're testing.
  • Test duration — Run long enough to capture the full conversion cycle. If your average purchase cycle is 3 weeks, a 2-week test will undercount lift.
  • Novelty effects — The first week of a test often shows inflated results. Consider a burn-in period before measurement begins.
  • Multiple testing — If you're running many tests simultaneously, adjust for multiple comparisons to avoid false positives.

Analysing Lift Test Results

The Bayesian Approach

We strongly recommend Bayesian analysis for lift tests. Unlike frequentist p-values, Bayesian methods give you a full probability distribution over the lift estimate, which directly answers the business question: "What's the probability that this campaign had a positive impact, and how large is it likely to be?"

For geo-based tests, Bayesian structural time series (the method behind Google's CausalImpact) works by:

  1. Building a time series model using the control group's data as predictors
  2. Generating a synthetic counterfactual — what the test group would have done without the campaign
  3. Comparing actual vs. counterfactual to estimate incremental impact with credible intervals

For audience-based tests, a simple Bayesian A/B test using Beta-Binomial models gives you the posterior probability that the test group outperformed the control, along with the distribution of the lift percentage.
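A minimal version of that Beta-Binomial analysis, using uniform Beta(1, 1) priors and illustrative conversion counts:

```python
import random

random.seed(1)

# Observed results (illustrative numbers): conversions / users per group.
test_conv, test_n = 620, 50_000
ctrl_conv, ctrl_n = 540, 50_000

def posterior_draw(conv, n):
    """Draw from the posterior conversion rate: Beta(conv + 1, n - conv + 1)."""
    return random.betavariate(conv + 1, n - conv + 1)

draws = 20_000
samples = [(posterior_draw(test_conv, test_n),
            posterior_draw(ctrl_conv, ctrl_n))
           for _ in range(draws)]

# Probability the test group's true rate exceeds the control's.
prob_positive = sum(t > c for t, c in samples) / draws

# Full distribution of the relative lift.
lifts = sorted((t - c) / c for t, c in samples)
print(f"P(test > control) = {prob_positive:.2%}")
print(f"Median relative lift = {lifts[draws // 2]:.1%}")
```

Because the output is a distribution rather than a single p-value, you can read off whatever summary the business needs: the probability of any positive effect, or the probability the lift exceeds a break-even threshold.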

Interpreting Results

When presenting lift test results, focus on:

  • Point estimate of lift — e.g. "The campaign drove a 15% increase in conversions"
  • Credible interval — e.g. "We're 95% confident the true lift is between 8% and 23%"
  • Probability of positive lift — e.g. "There's a 98% probability the campaign had a positive effect"
  • Incremental cost per acquisition (iCPA) — campaign cost divided by incremental conversions, the true cost of each conversion the campaign actually caused
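Putting the last metric into numbers (all figures illustrative): scale the control group's conversions to the test group's size, subtract to get incremental conversions, then divide spend by that.

```python
# Illustrative lift test results.
spend = 50_000.0
test_conversions, test_users = 1_150, 200_000
control_conversions, control_users = 500, 100_000

# Scale the control rate up to the test group's size to get the baseline
# the test group would have delivered anyway.
expected_baseline = control_conversions * (test_users / control_users)  # 1,000
incremental = test_conversions - expected_baseline                      # 150
icpa = spend / incremental

print(f"iCPA = {icpa:.2f} per incremental conversion")
```

Note the contrast with a naive platform CPA of spend / 1,150 ≈ 43.48: the incremental figure is several times higher, which is exactly the gap lift tests exist to expose.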

Using Lift Tests to Calibrate Other Models

One of the most powerful applications of lift tests is calibrating your Marketing Mix Model. MMM estimates channel-level ROI from observational data, but those estimates can drift over time. Running periodic lift tests on key channels gives you ground-truth data points that you can use as Bayesian priors in your MMM, dramatically improving accuracy.

This creates a virtuous measurement cycle:

  1. MMM identifies which channels appear to have the highest ROI
  2. Lift tests validate (or challenge) those estimates with causal evidence
  3. Calibrated MMM produces more accurate ROI estimates informed by experimental ground truth
  4. Repeat quarterly as conditions change
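One simple way to carry out step 3, sketched under the assumption that your MMM accepts a normal prior on a channel's ROI coefficient (as many Bayesian MMM frameworks do); the figures and the 95%-interval-to-standard-deviation conversion are illustrative:

```python
# Lift test output for one channel (illustrative numbers).
roi_point = 2.4               # estimated return per unit of spend
roi_low, roi_high = 1.6, 3.2  # 95% credible interval from the experiment

# Centre the prior on the experimental estimate; back out a standard
# deviation from the interval width (95% interval ~ 2 * 1.96 * sd under
# a normal approximation).
prior_mean = roi_point
prior_sd = (roi_high - roi_low) / (2 * 1.96)

print(f"MMM prior for channel ROI: Normal({prior_mean:.2f}, {prior_sd:.2f})")
```

A wide experimental interval translates into a weak prior, so a noisy lift test constrains the MMM only gently, while a tight one pins the channel's ROI down firmly.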

Getting Started

You don't need a massive budget or a PhD in statistics to run lift tests. Start with a single channel where you have the most uncertainty about ROI. Design a simple geo or audience holdout, run it for 4-6 weeks, and analyse the results.

The insight you gain from one well-designed lift test is worth more than months of attribution data, because it tells you something attribution never can: what actually works.

Want help designing your first lift test?

We help brands build and run incrementality testing programmes in-house. Book a call to discuss your measurement needs.

Book a Discovery Call