What is Uplift Modeling and how can it be done with CausalML?

Uplift modeling is a predictive modeling technique that uses machine learning models to estimate the incremental effect of a treatment at the user level. It is frequently used to personalize product offers and to target promotions and advertisements. In this article, we will discuss uplift modeling in the context of causal inference, look at its main modeling approaches, and finally see how a Python package called CausalML can be used to put it into practice. Here are the main points to be discussed in this article.


  1. What is Uplift Modeling?
  2. Types of modeling
  3. Common applications
  4. How does CausalML do Uplifting?
  5. CausalML Features

Let’s start the discussion by understanding uplift modeling.

What is Uplift Modeling?

Uplift modeling is a predictive modeling technique that predicts the incremental impact of a treatment (such as a direct marketing campaign) on a person’s behavior. It uses a randomized scientific control both to measure the effectiveness of an intervention and to build a predictive model of the incremental response to that intervention.

The response can be discrete (e.g., a website visit or a phone call) or continuous (e.g., customer revenue). Uplift modeling is a data mining approach that has been used primarily for upselling, cross-selling, churn, and retention activities in the financial services, telecommunications, and retail direct marketing industries.

A propensity-to-buy model uses machine learning to answer the question “How likely is the consumer to buy in the future?”, which essentially describes a customer’s behavior with respect to a given action. Uplift modeling builds on this by addressing more pressing questions:

  • Did the buyer buy from me because of my ad?
  • Am I wasting money promoting to customers who have already decided to buy?
  • Has my marketing had a negative impact on the likelihood of someone buying?

In other words, a traditional propensity model (like most machine learning algorithms) predicts the outcome (y) given a set of variables (x). Uplift modeling instead seeks to determine the influence of the treatment on the outcome (y), given those variables (x).
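Written as a formula, and assuming a binary outcome and a binary treatment indicator t (a symbol introduced here only for illustration), the quantity of interest can be sketched as the difference between two conditional probabilities:

```latex
\text{uplift}(x) = P(y = 1 \mid x,\ t = 1) - P(y = 1 \mid x,\ t = 0)
```

A propensity model estimates only one of these probabilities. For a continuous outcome such as revenue, the probabilities would be replaced by expected values.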

The term “uplift” refers to the increase in the chance of an outcome with treatment compared with the chance of that outcome without treatment. We cannot observe this difference, the causal effect, directly for any one person, because each person is either treated or not; it must be inferred from an experiment. As shown in the image below, it helps to visualize a 2 x 2 matrix that classifies people into four types: (a) persuadables, (b) sure things, (c) do-not-disturbs, and (d) lost causes.

We target group “a”, the persuadables, to encourage the desired reaction. For everyone else the treatment is either wasted or, in the case of the Do-Not-Disturbs, counterproductive: contacting them risks “waking a sleeping dog”.

Finding the persuadables is the goal of uplift modeling. Of course, uplift modeling can be applied to any treatment and outcome, human or otherwise, such as the influence of fertilizers on crop yields or the effect of sending political campaign emails.

Typical predictive modeling focuses on the outcome, whereas uplift modeling focuses on the efficacy of the treatment. This lets you concentrate your efforts on the cases where the treatment is most likely to make a difference.
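The 2 x 2 matrix described above can be sketched in a few lines of Python. This is only an illustration: the function name, the 0.5 cut-off, and the customer probabilities are all made up, and in practice the two probabilities would come from a fitted model rather than being known in advance.

```python
def classify_segment(p_treated, p_control, cutoff=0.5):
    """Place a customer in the 2 x 2 matrix.

    p_treated: assumed probability of buying if contacted
    p_control: assumed probability of buying if left alone
    cutoff:    hypothetical threshold for "likely to buy"
    """
    buys_if_treated = p_treated >= cutoff
    buys_if_untreated = p_control >= cutoff
    if buys_if_treated and not buys_if_untreated:
        return "persuadable"
    if buys_if_treated and buys_if_untreated:
        return "sure thing"
    if not buys_if_treated and buys_if_untreated:
        return "do-not-disturb"
    return "lost cause"


# Made-up (p_treated, p_control) pairs for four customers.
customers = {
    "ann": (0.80, 0.20),
    "bob": (0.90, 0.85),
    "cat": (0.10, 0.70),
    "dan": (0.05, 0.04),
}

segments = {name: classify_segment(pt, pc) for name, (pt, pc) in customers.items()}
targets = [name for name, seg in segments.items() if seg == "persuadable"]
print(segments)
print("contact:", targets)
```

Only the customer whose behavior changes for the better when contacted ends up in the target list; the other three are left alone.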

Types of modeling

Direct modeling and indirect modeling are the two basic approaches here. The fundamental difference between them is the way the uplift is modeled and estimated.

Direct modeling

In direct modeling, we model “directly” the difference in probabilities between two distinct groups. There are many approaches to this, almost all of which rely on tree-based algorithms that have been slightly modified to accommodate uplift modeling.

Tree models are well suited to this because they naturally model at the group level, iteratively splitting one group into two with each splitting decision. Unlike traditional tree-based models, which are designed to split the data into smaller and smaller homogeneous groups, uplift trees are designed to split our customers into groups whose response to the treatment differs as much as possible (maximizing an uplift measure).

They use various splitting criteria, such as Kullback-Leibler divergence, Euclidean distance, p-value, and chi-square distance. As with traditional tree-based methods, hundreds of trees are typically fitted and combined into an ensemble.
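To make the idea of a divergence-based split concrete, here is a minimal sketch that scores a candidate split by how much it increases the Kullback-Leibler divergence between the treated and control response rates. It is a simplified illustration with made-up data and hypothetical function names; it omits the normalization terms used in published uplift tree criteria and is not the implementation of any particular library.

```python
import math


def kl_binary(p, q, eps=1e-6):
    """KL divergence between two Bernoulli distributions with rates p and q."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def node_divergence(rows):
    """rows: list of (treated, converted) booleans for the customers in a node."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    if not treated or not control:
        return 0.0
    return kl_binary(sum(treated) / len(treated), sum(control) / len(control))


def split_gain(left, right):
    """Weighted divergence of the children minus divergence of the parent."""
    parent = left + right
    n = len(parent)
    children = (len(left) / n) * node_divergence(left) \
        + (len(right) / n) * node_divergence(right)
    return children - node_divergence(parent)


# Left child: treatment helps (80% vs 20%). Right child: treatment hurts (20% vs 80%).
left = [(True, True)] * 8 + [(True, False)] * 2 + [(False, True)] * 2 + [(False, False)] * 8
right = [(True, True)] * 2 + [(True, False)] * 8 + [(False, True)] * 8 + [(False, False)] * 2

gain = split_gain(left, right)
print(f"gain of the informative split: {gain:.4f}")
print(f"gain of an uninformative split: {split_gain(left, left):.4f}")
```

In the parent node the treated and control groups respond at the same rate, so the treatment looks useless overall; the split is valuable precisely because it separates customers who respond in opposite ways.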

Indirect modeling

Indirect uplift modeling techniques (meta-learners) reuse ordinary response models to infer uplift, and they can be built on any base algorithm. Rather than trying to optimize an uplift measure directly, we model the expected value of the response under the different treatments.

For a direct mail (DM) campaign promoting a credit card, for example, we would estimate the likelihood of the customer using the credit card if the DM is sent, and the likelihood of them using it if the DM is not sent. The estimated uplift is the difference between the two estimated probabilities. In practice, this can be a two-model approach (a separate model fitted to each of the control and treatment groups) or a unified model (a single model in which the treatment indicator is included as part of the feature space).
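The two-model approach can be sketched as follows. To keep the example self-contained, the "model" here is a deliberately trivial one that predicts the observed response rate per customer segment; the class name, the segments, and the data are all hypothetical, and any real base learner (logistic regression, gradient boosting, and so on) could take its place.

```python
from collections import defaultdict


class RateModel:
    """Toy response model: predicts the observed response rate of each segment."""

    def fit(self, segments, outcomes):
        totals = defaultdict(lambda: [0, 0])  # segment -> [responses, count]
        for seg, y in zip(segments, outcomes):
            totals[seg][0] += y
            totals[seg][1] += 1
        self.rates = {seg: s / n for seg, (s, n) in totals.items()}
        return self

    def predict(self, segment):
        return self.rates[segment]


# Made-up campaign data: (segment, received_dm, used_card)
rows = [
    ("young", 1, 1), ("young", 1, 1), ("young", 1, 1), ("young", 1, 0),
    ("young", 0, 1), ("young", 0, 0), ("young", 0, 0), ("young", 0, 0),
    ("senior", 1, 1), ("senior", 1, 1), ("senior", 1, 0), ("senior", 1, 0),
    ("senior", 0, 1), ("senior", 0, 1), ("senior", 0, 0), ("senior", 0, 0),
]

# One model per group: treated customers and control customers.
treated = [(seg, y) for seg, dm, y in rows if dm == 1]
control = [(seg, y) for seg, dm, y in rows if dm == 0]
model_t = RateModel().fit(*zip(*treated))
model_c = RateModel().fit(*zip(*control))

# Estimated uplift = P(use card | DM sent) - P(use card | DM not sent)
uplift = {seg: model_t.predict(seg) - model_c.predict(seg) for seg in ("young", "senior")}
print(uplift)
```

In this toy data the direct mail only makes a difference for one segment, which is where the campaign budget would be best spent. A unified model would instead fit a single model on all rows, with the `received_dm` flag as an extra feature, and score each customer twice: once with the flag set to 1 and once with it set to 0.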

Common applications

Below are some possible applications across various industries.

  • Uplift modeling can help to understand how treatments affect certain groups differently, and how large those differences are, rather than just comparing outcomes for the whole treatment group versus the control group.
  • A company wants to retain customers who are about to unsubscribe by contacting them. When reaching out, the company wants to avoid upsetting customers further, so it focuses only on high-risk customers who can still be saved.
  • A business wants to run a cross-sell campaign, but it does not want to cross-sell to everyone because resources are limited and some customers may not need or want the other products.
  • A company has a database of leads, but it generates more leads than it can handle, and many of those leads are a waste of time. At present, agents work through leads in whatever order they choose; uplift modeling could instead prioritize the leads most likely to respond to contact.

How does CausalML do Uplifting?

CausalML is a Python package that provides a suite of uplift modeling and causal inference tools based on recent research and machine learning algorithms. Traditional causal analysis approaches, such as performing t-tests on randomized trials (A/B tests), can estimate the average treatment effect (ATE) of a treatment or intervention.

However, in many applications it is desirable and useful to estimate these effects at a finer scale. CausalML can be used to estimate the conditional average treatment effect (CATE), which is the effect at the individual or segment level. Such estimates enable a wide range of personalization and optimization applications by applying different treatments to different users.
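The difference between the ATE and a segment-level CATE can be illustrated with plain Python on made-up A/B test data. This sketch does not use CausalML; the segment names and revenue figures are invented, and the CATE here is simply a difference in means within each segment, which is the crudest possible estimator.

```python
import math
from statistics import mean, variance


def ate_with_t(treated, control):
    """Difference in means and Welch's t-statistic for an A/B test."""
    diff = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, diff / se


# Made-up revenue per user, split by a hypothetical customer segment.
data = {
    "new":   {"treated": [12, 14, 13, 15], "control": [8, 9, 10, 9]},
    "loyal": {"treated": [20, 21, 19, 20], "control": [20, 19, 21, 20]},
}

all_treated = [v for seg in data.values() for v in seg["treated"]]
all_control = [v for seg in data.values() for v in seg["control"]]

# ATE: one number for the whole population.
ate, t_stat = ate_with_t(all_treated, all_control)

# CATE: one number per segment.
cate = {name: mean(seg["treated"]) - mean(seg["control"]) for name, seg in data.items()}

print(f"ATE = {ate:.2f} (t = {t_stat:.2f})")
print("CATE by segment:", cate)
```

The single ATE figure hides the fact that, in this toy data, the whole effect comes from one segment. That is the kind of heterogeneity CATE estimation is meant to uncover.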

Uplift modeling is a crucial modeling approach made possible by CausalML. It is a causal learning approach for estimating the individual treatment effect from an experiment. Using experimental data, the end user can calculate the incremental impact of a treatment (such as a direct marketing action) on the behavior of an individual.

For example, if a company is deciding between many product lines to sell to its customers, CausalML can be used as a recommendation engine to identify the products that generate the maximum expected uplift for each given user.
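Once per-product uplift estimates are available, the recommendation step itself is a simple argmax. The sketch below assumes the estimates have already been produced by some uplift model; the user IDs, product names, numbers, and the `best_offer` helper are all hypothetical.

```python
# Made-up uplift estimates: user -> {product: estimated uplift}
uplift_by_product = {
    "user_1": {"card": 0.02, "loan": 0.07, "insurance": -0.01},
    "user_2": {"card": 0.05, "loan": 0.01, "insurance": 0.03},
    "user_3": {"card": -0.02, "loan": -0.01, "insurance": -0.04},
}


def best_offer(uplifts, min_uplift=0.0):
    """Return the product with the highest estimated uplift, or None if no
    product is expected to help (every estimate is at or below min_uplift)."""
    product, value = max(uplifts.items(), key=lambda kv: kv[1])
    return product if value > min_uplift else None


recommendations = {user: best_offer(u) for user, u in uplift_by_product.items()}
print(recommendations)
```

Returning `None` for a user whose estimates are all negative reflects the do-not-disturb case: sometimes the best offer is no offer.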

CausalML provides a consistent API for running uplift algorithms, making it as easy as fitting a standard classification or regression model. The included metrics and visualization features, such as uplift curves, can be used to assess model performance. The first version of CausalML includes eight state-of-the-art uplift modeling algorithms (shown in the figure below).
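To give a feel for what an uplift curve measures, here is a minimal sketch of one common definition, written in plain Python. It is not CausalML's own plotting or metrics code, and the scores and outcomes are made up. Customers are ranked by predicted uplift, and at each cut-off we compute the difference between the treated and control response rates among the customers ranked so far, scaled by the number of customers.

```python
def cumulative_uplift(scores, treated, outcomes):
    """Cumulative uplift after targeting the top-k customers, for k = 1..n."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve = []
    n_t = n_c = y_t = y_c = 0
    for rank, i in enumerate(order, start=1):
        if treated[i]:
            n_t += 1
            y_t += outcomes[i]
        else:
            n_c += 1
            y_c += outcomes[i]
        rate_t = y_t / n_t if n_t else 0.0
        rate_c = y_c / n_c if n_c else 0.0
        curve.append((rate_t - rate_c) * rank)
    return curve


# Made-up experiment: model scores, treatment flags, and observed outcomes.
scores   = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.0]
treated  = [1, 0, 1, 0, 1, 0, 1, 0]
outcomes = [1, 0, 1, 0, 0, 0, 0, 1]

curve = cumulative_uplift(scores, treated, outcomes)
for k, value in enumerate(curve, start=1):
    print(f"top {k}: cumulative uplift = {value:.2f}")
```

A useful model produces a curve that rises steeply at first and then flattens or falls, because the customers it ranks highest are the ones who respond to the treatment. A model that ranks at random would track a straight line to the same end point.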

CausalML Features

Targeting optimization, engagement personalization, and causal impact analysis are just a few of the use cases for CausalML.

Targeting optimization

We can use CausalML to target promotions at the people who will bring the most value to the business. For example, in a cross-sell marketing campaign for existing customers, we can offer promotions to the consumers who are more likely to use a new product because they were exposed to the promotion.

Causal impact analysis

We can also use CausalML to examine the causal impact of a specific event using experimental or observational data with rich attributes. For example, we can examine how a cross-selling event with a customer influences that customer’s long-term spending on the platform.


Engagement personalization

CausalML can also be used to personalize engagement. A business can communicate with its customers in different ways, such as offering upsell options or using different messaging channels for its interactions. CausalML can be used to estimate the effect of each combination for each customer and to present customers with the offers best suited to them.

Last words

In this article, we have discussed uplift modeling, a technique that models how an intervention changes user behavior given a set of input variables. We have also discussed its main modeling approaches and some applications where it can be used. Finally, we have looked at a Python package called CausalML, which offers a practical way to implement causal inference and uplift modeling. To learn more about implementing uplift modeling with CausalML, you can refer to its GitHub repository, which lists many implementation examples.
