New Method: Machine Learning Methods for Causal Inference (General Overview)

  • Writer: Maria Alice Maia
  • Aug 5, 2024
  • 3 min read

Updated: Jul 14

Your marketing ML model is brilliant at predicting who will buy. But does it actually know why?

There’s a dangerous gap between prediction and causation, and many companies are falling into it, armed with the most advanced machine learning tools. They're using a Ferrari to drive straight into a wall.


This is one of the most insidious forms of “Doing Data Wrong”: using powerful predictive models to make causal decisions, like where to allocate your marketing budget. It feels sophisticated, but it often leads to burning cash on an industrial scale.


Let's break it down with a classic Marketing Department example.


The Wrong Way (The Illusion of Impact): A data science team builds a state-of-the-art predictive model that identifies customers highly likely to purchase after seeing a digital ad. The model boasts 90% accuracy! Management sees a high correlation between ad exposure and sales in "Segment A" and doubles down, pouring millions into targeting this group.

The problem? The model is a black box trained to find correlations, not causes. It’s possible "Segment A" were already your best customers, predisposed to buy with or without the ad. The model can't distinguish between the ad causing the sale and the ad simply being present when a sale was going to happen anyway. You're not measuring impact; you're measuring coincidence.
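To see how easily this happens, here is a minimal simulation sketch. Everything in it is hypothetical: the share of loyal customers, the exposure and purchase probabilities, and the function names are invented for illustration. The ad is deliberately built to have zero effect on purchases, yet exposed customers still buy far more often than unexposed ones.

```python
import random

def simulate_customers(n=50_000, seed=0):
    """Hypothetical data: loyalty drives both ad exposure and purchases.

    The ad itself has NO effect on buying in this made-up world.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        loyal = rng.random() < 0.3                           # assumed 30% loyal customers
        saw_ad = rng.random() < (0.8 if loyal else 0.2)      # loyal customers are targeted more
        bought = rng.random() < (0.5 if loyal else 0.05)     # purchase ignores saw_ad entirely
        rows.append((loyal, saw_ad, bought))
    return rows

def purchase_rate(rows):
    return sum(bought for _, _, bought in rows) / len(rows)

def exposure_gap(rows):
    """Purchase rate of exposed customers minus that of unexposed customers."""
    exposed = [r for r in rows if r[1]]
    unexposed = [r for r in rows if not r[1]]
    return purchase_rate(exposed) - purchase_rate(unexposed)

rows = simulate_customers()
naive_gap = exposure_gap(rows)                               # pooled comparison
gap_loyal = exposure_gap([r for r in rows if r[0]])          # loyal customers only
gap_new = exposure_gap([r for r in rows if not r[0]])        # everyone else

print(f"pooled gap: {naive_gap:.3f}")
print(f"gap among loyal customers: {gap_loyal:.3f}")
print(f"gap among other customers: {gap_new:.3f}")
```

The pooled gap is large and positive, while the gap inside each loyalty group sits close to zero, which is the true effect in this simulation. The "impact" in the pooled number is entirely the customer mix.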


The Right Way (Unlocking True ROI with Causal ML): This is where the real work begins. We need to move beyond off-the-shelf prediction and embrace Causal Machine Learning. Groundbreaking research, like the work of Susan Athey and Guido Imbens, gives us the tools to do this.


Instead of one model to predict sales, we use smarter methods like Double Machine Learning (DML), introduced by Victor Chernozhukov and coauthors. In simple terms, we build two separate supervised models:

  1. One to predict the outcome (e.g., the likelihood of a sale, based on customer history, demographics, etc.).

  2. Another to predict the treatment (e.g., the likelihood a customer was exposed to the ad, based on their browsing habits, platform, etc.).


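DML then subtracts each model's prediction from what actually happened. What remains is the part of sales and the part of ad exposure that the customer's profile cannot explain. Relating the leftover sales to the leftover exposure isolates the causal effect of the ad, provided the confounding factors that matter are actually captured in the data. We're no longer guessing. We're measuring.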
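The two-model idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production recipe: the data is synthetic with a single observed confounder (a hypothetical loyalty score) and a known true ad effect of 2.0, and the "learner" is a simple binned average standing in for whatever supervised model you would really use, such as gradient boosting or a random forest. The function names are invented here. Each model is fitted on one half of the data and used to predict on the other half (cross-fitting), so no prediction is made on a row the model was trained on.

```python
import random

def simulate(n=20_000, seed=1):
    """Hypothetical data: loyalty score x drives both ad exposure d and spend y.

    The true ad effect on spend is 2.0 by construction.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()                                     # confounder in [0, 1)
        d = 1.0 if rng.random() < 0.2 + 0.6 * x else 0.0     # exposure depends on x
        y = 2.0 * d + 5.0 * x * x + rng.gauss(0.0, 1.0)      # spend depends on d and x
        data.append((x, d, y))
    return data

def fit_binned_mean(xs, targets, bins=20):
    """Toy nonparametric learner: predict the mean target within each x-bin."""
    sums, counts = [0.0] * bins, [0] * bins
    for x, t in zip(xs, targets):
        b = min(int(x * bins), bins - 1)
        sums[b] += t
        counts[b] += 1
    overall = sum(targets) / len(targets)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * bins), bins - 1)]

def dml_effect(data, folds=2):
    """Partialling-out estimate with cross-fitting."""
    res_d, res_y = [], []
    for k in range(folds):
        train = [row for i, row in enumerate(data) if i % folds != k]
        held_out = [row for i, row in enumerate(data) if i % folds == k]
        xs = [x for x, _, _ in train]
        predict_y = fit_binned_mean(xs, [y for _, _, y in train])   # outcome model
        predict_d = fit_binned_mean(xs, [d for _, d, _ in train])   # treatment model
        for x, d, y in held_out:
            res_d.append(d - predict_d(x))                   # exposure not explained by x
            res_y.append(y - predict_y(x))                   # spend not explained by x
    # Slope of residual spend on residual exposure
    return sum(a * b for a, b in zip(res_d, res_y)) / sum(a * a for a in res_d)

def naive_difference(data):
    """Average spend of exposed minus unexposed customers, ignoring x."""
    treated = [y for _, d, y in data if d == 1.0]
    control = [y for _, d, y in data if d == 0.0]
    return sum(treated) / len(treated) - sum(control) / len(control)

data = simulate()
naive = naive_difference(data)
dml = dml_effect(data)
print(f"naive difference: {naive:.2f}")
print(f"DML estimate:     {dml:.2f}")
print("true effect:      2.00")
```

In this simulation the naive difference overstates the effect, while the residual-on-residual estimate lands near the true value. That only works because the confounder is in the data. If loyalty were unobserved, neither model could remove it.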


With this, the team discovers the ad has a massive ROI on first-time visitors but a negative ROI on loyal, repeat customers. They reallocate the budget away from the loyalists—where it was being wasted—and focus it on acquiring new customers, dramatically increasing the actual, causal return on ad spend.
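The budget decision itself is simple arithmetic once you have a causal lift per segment. A minimal sketch, with per-customer figures invented purely for illustration (they are not results from any real campaign) and a hypothetical helper `causal_roi`:

```python
def causal_roi(incremental_revenue_per_customer, ad_cost_per_customer):
    """ROI on incremental (causal) revenue: (lift - cost) / cost."""
    return (incremental_revenue_per_customer - ad_cost_per_customer) / ad_cost_per_customer

# Hypothetical inputs: estimated causal lift and ad cost, per customer reached.
segments = {
    "first_time_visitors": {"lift": 6.00, "cost": 2.00},
    "loyal_customers": {"lift": 0.50, "cost": 2.00},
}

roi_by_segment = {
    name: causal_roi(s["lift"], s["cost"]) for name, s in segments.items()
}

# Keep funding only the segments where the ad pays for itself.
segments_to_fund = [name for name, roi in roi_by_segment.items() if roi > 0]

for name, roi in roi_by_segment.items():
    print(f"{name}: ROI = {roi:+.2f}")
print("fund:", segments_to_fund)
```

The point of the sketch is the input, not the formula: the lift must be a causal estimate. Plugging in total sales among exposed customers would make the loyal segment look like the best investment.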


My time as an executive at Itaú, Ambev, and Alura, and building NaHora.com from scratch, taught me that value is created by making better decisions, not by building more complex prediction engines. My research at Berkeley, HEC Paris, and FGV is driven by this same pragmatism: translating rigorous methods into tangible results.


This knowledge is not mine to keep. It's for all of us to share, to elevate our practices.

For Managers: Your question shouldn't be "How accurate is the model?" but "How are you isolating the causal impact of our investment?" Demand to see the methodology. If your team is only talking about predictive accuracy for a causal question, you have a problem.

For Data Professionals: The next frontier of your value isn't a higher prediction score. It's delivering unbiased causal estimates. Dive into the work on Causal Forests and Double Machine Learning. Learn how to build models that don't just predict the future but tell you how to change it for the better.


Let's stop celebrating correlations and start creating real, measurable impact.


If you’re ready to move beyond kindergarten data—even when it's disguised in a fancy ML package—and join a community committed to unlocking real business value, subscribe to my email list. You'll get no-nonsense, research-backed insights to fix your broken data practices.


And if this problem feels painfully familiar, let’s talk. Schedule a 20-minute, no-nonsense consultation, and let's discuss your real-world case.

