The goal of any direct mail campaign, or other messaging effort, is to persuade somebody to do something. In the business world, that usually means buying something. In the political world, it usually means voting for someone (or, if you think you know whom they will vote for, actually turning out to vote). Standard predictive modeling assumes a single message is being mailed and answers the question “which people should we mail to?” Uplift modeling goes beyond this and answers the question “which message should we send to which people?”
Long before the advent of the internet, sending messages directly to individuals (i.e., direct mail) held a big share of the advertising market. Direct marketing affords the marketer the ability to invite and monitor direct responses from consumers. This, in turn, allows the marketer to learn by conducting scientific experiments and tests. A message can be tested with a small section of a large list and, if it pays off, the message can be rolled out to the entire list. With predictive modeling, the rollout can be targeted to that portion of the list that is most likely to respond. Direct response also makes it possible to test one message against another and find out which does better in a controlled experiment. None of this was possible with traditional media advertising (television, radio, newspaper, magazine), and it was this direct response feature that allowed direct mail, unglamorous though it was, to garner roughly a third of all advertising dollars spent in pre-internet days. Direct response methodology has been reborn in email campaigns and microtargeted web campaigns.
What is uplift, as distinct from lift? It’s the combination of the A-B test plus the predictive model. Here are the basic steps in uplift modeling in the context of a mail or email messaging campaign:
- Conduct a randomized test of message A versus message B, recording the outcome (e.g. purchase or no-purchase)
- Train a model using the outcome variable (e.g. response yes or no), relevant predictor variables, and a binary predictor variable that indicates which message was sent, A or B
- Apply the model twice to new data with unknown response – once with the message variable set to A and again with it set to B – to get two “probability of purchase” scores for each person (one for message A and one for message B)
- Record the increase (or decrease) in probability of purchase from using message A instead of message B – this is the uplift of message A over message B (it might be negative, in which case it is the uplift of B over A)
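The four steps above can be sketched in code. The sketch below is illustrative, not from the text: it simulates an A-B test on a synthetic customer list and uses a logistic regression with a message indicator (plus an interaction term, so the message effect can vary by person) as the uplift model. All names and the simulated effect sizes are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Step 1: randomized A-B test (0 = message A, 1 = message B)
message = rng.integers(0, 2, n)
age = rng.uniform(18, 80, n)  # a stand-in predictor variable

# Simulated truth (an assumption for the demo): message B works
# better for younger people, message A for older people
logit = -2 + 0.02 * age + message * (1.5 - 0.03 * age)
purchase = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Step 2: train one model with the message indicator as a predictor;
# the interaction column lets the message effect differ by age
X = np.column_stack([age, message, age * message])
model = LogisticRegression().fit(X, purchase)

# Step 3: score new prospects twice – once as if sent A, once as if sent B
new_age = np.array([25.0, 45.0, 70.0])
X_A = np.column_stack([new_age, np.zeros(3), np.zeros(3)])
X_B = np.column_stack([new_age, np.ones(3), new_age])
p_A = model.predict_proba(X_A)[:, 1]
p_B = model.predict_proba(X_B)[:, 1]

# Step 4: uplift of A over B (negative values favor message B)
uplift = p_A - p_B
for a, u in zip(new_age, uplift):
    print(f"age {a:.0f}: uplift of A over B = {u:+.3f}")
```

With the simulated effects used here, the model recovers the pattern that younger prospects respond better to message B and older ones to message A; the sign of each person's uplift score tells you which message to send.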
The adaptation of this process, and other statistical techniques, for political microtargeting has received a lot of attention of late. The process is much the same as above, except in the political world we do not have a confirmed purchase/no-purchase outcome (since the voting booth is secret). We need to find a proxy outcome, which we can do by adding two opinion surveys to the experiment phase – one before a message goes out and the other after.
The steps in the process for a political persuasion effort are as follows:
- Start with a list of customers or voters who you can reach (via mail, email or other direct response method)
- Take a sample (its size should be large enough to detect the difference in proportions you are aiming to find)
- Survey the sample to determine candidate preference
- Split the sample randomly, and send each half a different message – an A-B test
- Survey the sample shortly afterwards to determine candidate preference, and record the extent to which individuals moved for or against your preferred candidate
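The sample-size requirement in the second step can be worked out with the standard two-proportion power formula. The sketch below is a hedged illustration: the 4% versus 5% preference rates, the 5% significance level, and the 80% power are assumptions for the example, not figures from the text.

```python
from scipy.stats import norm

def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size to detect proportions p1 vs p2 (two-sided test)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for significance
    z_power = norm.ppf(power)           # critical value for desired power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_power * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p1 - p2) ** 2) + 1

# e.g., detecting a shift in candidate preference from 4% to 5%
n_per_arm = two_proportion_sample_size(0.04, 0.05)
print(n_per_arm)
```

Note how quickly the required sample grows as the detectable difference shrinks; this is why step 2 warns that the sample must be sized for the effect you hope to find.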
Now we have the outcome variable that can be used to train an uplift model – the extent of movement for or against your candidate. We can couple it with predictor variables, such as:
- Official voting records that show which elections, including primaries, a voter has participated in (but, obviously, not how they voted); this can actually be the list used in the first step above
- Demographic information
- Consumer purchase information
- Donation records
- Detailed demographic and behavioral information about the voter’s neighborhood
Determining Which Message to Send
Now we have data for our uplift model, which we train as above. The final stage is to apply this model, twice, to new data that has received no actual message either way. The first time we assign a value of “A” to the message predictor; the second time we assign a value of “B.” We end up with a guide, for each new voter, as to which message should be sent.
In political marketing, you want to avoid rousing opponents to action. So, often, the comparison is made not between two different messages but between sending a message versus sending no message. In this way you can identify the voters who would swing the wrong way upon receiving your message, and avoid disturbing them.
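As a toy illustration of this decision rule, suppose we already have the two scores for each voter (the numbers below are invented): voters whose probability of supporting your candidate would drop upon receiving the message are left alone, a group sometimes called “sleeping dogs” in the uplift literature.

```python
import numpy as np

# Hypothetical model scores for four voters: probability of supporting
# your candidate if messaged (p_msg) versus if left alone (p_none)
p_msg  = np.array([0.52, 0.40, 0.65, 0.48])
p_none = np.array([0.45, 0.47, 0.64, 0.48])

uplift = p_msg - p_none

send = uplift > 0         # contact only where the message helps
leave_alone = uplift < 0  # the message would swing them the wrong way
print(send)
```

Here the second voter would move against your candidate if contacted, so they are excluded from the mailing, while the fourth voter is unaffected either way.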
Ethical issues are receiving greater attention in data science discussions – did you note the lurking one here? It’s the purchase of data that some people might think of as private (on consumer behavior, donations, etc.) that is then coupled with their political behavior at an individual level. Obviously, all regulations and rules concerning private data should be adhered to. In addition, I think data scientists working on a project like this should deliberate, with decision-making colleagues in their organization, on exactly what they are doing and what the targets (voters) might think. With commercial consumers, at least, there is awareness that a benefit of targeted modeling with private data is a web experience and consumer information that is relevant to them. With political persuasion, there is no such benefit.