The Staggered Rollout Illusion: Why Your Intuitive Analysis of Phased Programs is Likely Flawed.
- Maria Alice Maia

- Feb 3
- 2 min read
You rolled out a new CRM to the East region in Q1 and the Central region in Q3. You ran a standard Difference-in-Differences (DiD) model to get the “average effect.”
I have bad news: that number is probably meaningless. And it might even be telling you the program is a failure when it’s actually a success.
This is one of the most dangerous, subtle, and widespread “Doing Data Wrong” mistakes in modern analytics. It happens when we take a trusted tool (the two-way fixed effects, or TWFE, DiD model) and apply it to a staggered rollout scenario, where different groups receive a treatment at different times.

The "Staggered Rollout Illusion" in a Sales Department
The TWFE model feels right. It’s easy to run. But as groundbreaking research by Andrew Goodman-Bacon (2021) shows, it is a Frankenstein's monster of comparisons.
A standard TWFE model with staggered adoption doesn't just compare treated units to clean, untreated units. It also makes “forbidden comparisons”: it uses your early-adopter regions (East, treated in Q1) as a “control group” for your later-adopter regions (Central, treated in Q3).
This is a logical catastrophe.
You are using a group whose sales are already being affected by the treatment as the baseline for what “would have happened anyway.”
If the CRM is working and the East region’s sales are still climbing in Q3, that treatment-driven growth gets subtracted out as if it were the normal trend, making the Central region’s gains look smaller than they really are. When the model averages all of these comparisons together, the true positive effects get watered down or, worse, cancelled out.
This can lead to the infamous negative weighting problem: the regression can assign negative weights to some of the underlying two-group, two-period comparisons, so a positive treatment effect can enter the average with a negative sign. Your CRM could be a wild success in every single region, yet the TWFE regression could spit out an average effect of zero, or even a negative number.
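To see this concretely, here is a minimal, noise-free sketch (the region names, adoption dates, and numbers are invented for illustration, not real data): the treatment effect is strictly positive for every treated observation, yet a two-way fixed effects regression returns exactly zero.

```python
import numpy as np

# Toy panel, no noise: 2 regions, 6 quarters. "East" adopts the CRM in
# period 1, "Central" in period 4 (invented numbers for illustration).
# The true effect is positive everywhere and grows by 1 for each period
# a region has been treated; the untreated baseline is flat at 0.
adopt = {"East": 1, "Central": 4}
rows = []  # (unit_id, period, treated, outcome)
for uid, g in enumerate(adopt.values()):
    for t in range(6):
        treated = t >= g
        rows.append((uid, t, float(treated), float(t - g + 1) if treated else 0.0))

data = np.array(rows)
unit, time, D, Y = data.T

# Two-way within transformation: x_it - mean_i - mean_t + grand mean.
# On a balanced panel this reproduces the TWFE (unit + period FE) estimate.
def demean(x):
    grand = x.mean()
    u_mean = np.array([x[unit == u].mean() for u in unit])
    t_mean = np.array([x[time == t].mean() for t in time])
    return x - u_mean - t_mean + grand

Dt, Yt = demean(D), demean(Y)
beta_twfe = (Dt @ Yt) / (Dt @ Dt)

true_att = Y[D == 1].mean()
print(f"true average effect on the treated: {true_att:.2f}")  # 2.57
print(f"TWFE 'average effect':              {beta_twfe:.2f}")  # 0.00
```

Every treated observation has an effect of at least +1, yet the dynamic, growing effects make the early adopter a contaminated “control,” and the estimate collapses to zero.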
As a leader who has managed phased rollouts of new technologies at companies like Stone and Ambev, this is my analytical nightmare: a model that punishes you for success.
The solution is to abandon the simple TWFE model in staggered settings and use modern methods (from researchers like Callaway & Sant’Anna, Sun & Abraham, and Borusyak, Jaravel & Spiess) that are designed for this reality. These new estimators are smarter: they only make clean comparisons, always contrasting a newly treated group with groups that have not yet been treated (or will never be treated).
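Here is a minimal sketch of the clean-comparison idea, in the spirit of Callaway & Sant’Anna’s group-time ATT but without their weighting or inference machinery; the `att` helper, region names, and numbers are invented for illustration.

```python
import numpy as np

# Same toy setting, invented for illustration: East adopts in period 1,
# Central in period 4; the true effect grows by 1 per treated period.
adopt = {"East": 1, "Central": 4}
Y = {"East":    [0, 1, 2, 3, 4, 5],
     "Central": [0, 0, 0, 0, 1, 2]}

def att(cohort, t):
    """ATT(g, t): outcome change from the last pre-period (g - 1) to t for
    the cohort that adopted at g, minus the same change averaged over
    regions still untreated at t (the only clean controls)."""
    g = adopt[cohort]
    controls = [r for r, gr in adopt.items() if gr > t]
    if not controls:
        return None  # nobody is still untreated: ATT(g, t) is not identified
    change_treated = Y[cohort][t] - Y[cohort][g - 1]
    change_control = np.mean([Y[r][t] - Y[r][g - 1] for r in controls])
    return float(change_treated - change_control)

for t in (1, 2, 3):
    print(f"ATT(East, t={t}) = {att('East', t)}")  # 1.0, 2.0, 3.0: the truth
print(att("Central", 4))  # None: by period 4 no clean control group remains
```

Note what the clean estimator refuses to do: once every region is treated, it reports “not identified” rather than quietly using an already-treated region as the baseline, which is exactly the forbidden comparison TWFE makes.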
My mission is to bring these critical updates from the research frontier to the front lines of business. We have to stop using broken tools. This knowledge, which protects us from making terrible decisions based on flawed models, is not mine to keep.
If you’re ready to graduate from the old DiD playbook and learn how to correctly analyze your staggered programs, join my movement. Subscribe to my email list for more no-nonsense insights on state-of-the-art methods.
And if you’re looking at a staggered DiD result right now, book a 20-minute, no-nonsense consultation with me. Let’s make sure you haven’t fallen into the forbidden comparison trap.