In his book Predictive Analytics, Eric Siegel tells the story of marketing efforts at Telenor, a Norwegian telecom, to reduce churn (customers leaving for another carrier). Sophisticated analytics were used to guide the campaigns, but the managers gradually discovered that some campaigns were backfiring: they were inducing customers to leave who otherwise might have stayed. What went wrong? It wasn’t the content of the campaigns. It was simply the fact that some messages were sent to what the marketing industry terms “sleeping dog” customers, rousing them to cancel their service.
A “sleeping dog” customer is one that you are better off leaving alone. It might be a disgruntled customer spurred into action by a marketing message. It might be a customer who saw a better offer recently from another carrier and is now reminded by your messaging to pursue that offer. It might be an inattentive customer who was previously disposed to let an auto-renew deadline slip by, but now will take a look at the competition.
Predicting these sleeping dogs is just one example of a targeting goal in persuasion analytics, though perhaps not the most important one.
To predict the sleeping dogs we use uplift modeling. The goals are to identify the customers in the upper left quadrant and leave them alone, and to identify those in the lower right quadrant and market to them:
|Action/Outcome||No message: no churn||No message: churn|
|Message: churn||Arousing the sleeping dog (Do not disturb)||Lost cause either way|
|Message: no churn||Loyal customer, either way||Focus attention to save (persuadable)|
You are probably thinking “if we have already sent the messaging and learned all this, isn’t the game over?” The answer is that these are results from a test campaign to a small proportion of customers, in which we send the marketing message to some customers and not others and record the outcome. We then use that result in a predictive model to predict outcomes. The sequence is as follows:
1. Perform an A-B test (message or no message) recording the outcome (churn, no churn) for each case.
2. Fit a predictive model with “churn or no-churn” as the outcome, including “message or no message” (1 or 0) as a predictor. You now have a model that incorporates the effect of the message, in addition to the effects of other predictor variables.
3. For a new record to be scored, add a binary synthetic “message” predictor that is set to 1 and run the model (calculate the probability of the outcome being “1”, churn).
4. Switch the value of the message variable to 0 and score it again.
For any new record to be scored, the uplift of the message is the probability from #3 (message sent) minus the probability from #4 (no message). If it is positive (probability of churn is higher with message than without), the customer is predicted to be a sleeping dog who should be left alone. If it is negative (probability of churn is lower with message than without), the customer is predicted to be saveable if a message is sent. If the uplift is close to zero, sending the message will probably do no good, and if the message is costly to send, these customers are best skipped.
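The scoring sequence above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the method used at Telenor: the single predictor `complaint`, the churn counts, and the `tolerance` cutoff are all made up, and a hand-rolled logistic regression stands in for whatever modeling tool you prefer. The sketch also assumes a message-by-predictor interaction term; in a logistic model with only a main effect for the message, the uplift would have the same sign for every customer, so sleeping dogs and persuadables could not be told apart.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def features(complaint, message):
    # Intercept, predictor, message flag, and message x predictor interaction.
    # The interaction (an assumption of this sketch) lets the message effect
    # differ from one customer to the next.
    return [1.0, float(complaint), float(message), float(complaint * message)]

def predict(weights, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(rows, labels, lr=1.0, epochs=3000):
    """Plain batch gradient descent on the logistic log-loss."""
    weights = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

# Made-up A-B test results: (complaint, message) -> churners out of 25 customers.
churners_per_cell = {(0, 0): 10, (0, 1): 5, (1, 0): 10, (1, 1): 15}
rows, labels = [], []
for (complaint, message), churners in churners_per_cell.items():
    for i in range(25):
        rows.append(features(complaint, message))
        labels.append(1 if i < churners else 0)

# Fit the churn model with the message flag as a predictor.
weights = fit_logistic(rows, labels)

def uplift(complaint):
    # Score with message = 1, score again with message = 0, take the difference.
    with_message = predict(weights, features(complaint, 1))
    without_message = predict(weights, features(complaint, 0))
    return with_message - without_message

def classify(u, tolerance=0.02):
    # Outcome is churn, so positive uplift is bad news.
    if u > tolerance:
        return "sleeping dog: do not message"
    if u < -tolerance:
        return "persuadable: send message"
    return "little effect either way"

for complaint in (0, 1):
    u = uplift(complaint)
    print(complaint, round(u, 3), classify(u))
```

In this toy data the model has one parameter per test cell, so the fitted probabilities essentially reproduce the observed churn rates: the uplift comes out near -0.2 for customers without a complaint (persuadable) and near +0.2 for those with one (sleeping dogs).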
The “message or no message” dichotomy is perhaps the most common use of uplift modeling, due to its suitability for churn reduction, where marketing can have a negative impact beyond its out-of-pocket cost. The ability to avoid that negative impact at a highly targeted level is valuable. However, the “message or no message” dichotomy is really just a specific case of “treatment A or B.” Uplift models are also useful in non-sleeping-dog situations where you want to test the impact of message A versus message B at the individual level. In this case, all customers might get a message, and it is just a question of which one. This is an extension of standard A-B testing:
- Standard A-B test: Test campaign A versus campaign B in a limited sample, and roll out the winner to everyone.
- A-B test as part of uplift model: Test campaign A versus campaign B in a limited sample, then use uplift modeling to determine who gets which message.
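The difference between the two approaches can be illustrated with a small sketch. The customer IDs and predicted response probabilities below are hypothetical; in practice they would come from a model fitted to the test sample and scored once with each campaign. Here a "response" is taken to be a desirable outcome, such as a renewal.

```python
# Hypothetical model scores: predicted response probability under each campaign.
scores = {
    "cust_1": {"A": 0.30, "B": 0.10},
    "cust_2": {"A": 0.20, "B": 0.25},
    "cust_3": {"A": 0.15, "B": 0.40},
    "cust_4": {"A": 0.35, "B": 0.30},
}

def expected_responses(assignment):
    # Sum of predicted response probabilities under a given assignment.
    return sum(scores[cust][campaign] for cust, campaign in assignment.items())

# Standard A-B test: find the campaign with the higher average, send it to all.
average = {
    campaign: sum(s[campaign] for s in scores.values()) / len(scores)
    for campaign in ("A", "B")
}
winner = max(average, key=average.get)
rollout = {cust: winner for cust in scores}

# Uplift approach: each customer gets whichever campaign scores higher for them.
targeted = {cust: max(s, key=s.get) for cust, s in scores.items()}

print("winner:", winner)
print("roll out winner:", round(expected_responses(rollout), 2))
print("individual targeting:", round(expected_responses(targeted), 2))
```

With these made-up numbers, campaign B wins on average, yet two of the four customers are predicted to respond better to A, so assigning messages individually yields more expected responses than rolling out the overall winner.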
A rapidly-growing use of uplift is in political campaigns. Elections are typically decided among a relatively small group of voters, and whether you vote is just as important to a campaign as who you vote for. Campaign managers divide potential voters up by two dimensions that are relevant for targeting:
- With us or against us
- Likely to vote or not
These categories may be binary or have middle grounds as well. The “with us or against us” category, in particular, has the all-important “undecideds.” Including the undecideds, they can be tabulated as follows, along with the appropriate marketing strategy:
| ||Likely to Vote||Not Likely to Vote|
|With us||Little value in targeting||“Get out to vote” messaging|
|Against us||Negative value in targeting||Hands off!|
This table reflects a conceptual view of the underlying factors, one that might serve to limit the audience we are attempting to reach. We can distill it into an “individual action” table for the persuasion message, similar to the commercial marketing one described above:
|Action/Outcome||No message: voter moves favorably||No message: voter moves unfavorably|
|Message: voter moves unfavorably||Arousing the sleeping dog (Do not disturb)||Lost cause either way|
|Message: voter moves favorably||Loyal voter, either way||Message will help!|
The uplift results can govern who receives the persuasion messaging. The get-out-the-vote messaging may not benefit as much from uplift modeling, since it is hard to find a testable proxy for the act of voting (the way movement of opinion between two surveys, described in the next section, serves as a proxy outcome for sentiment).
Implementing an Uplift Model for Political Campaigns
The first step is to collect the data for a test. The likelihood of voting is assessed by consulting voting records, which are public by law. They do not show who you vote for, of course, but they do show whether you vote in each election, including primaries. Regular participation in Democratic or Republican primaries can be taken as evidence of partisan leaning. Additional information about partisan preference can be obtained from voter surveys.
But what to use as the outcome in A-B testing of the persuasion messaging? We cannot conduct a mini-election among a small proportion of voters, so we do the next best thing: conduct a survey. Two surveys, to be precise: survey 1 establishes baseline political sentiment for your candidate (pro or con) and survey 2 establishes post-messaging sentiment for your candidate (pro or con). The outcome of interest is a binary variable indicating whether the voter opinion has moved in the direction of your candidate. The process has more steps than the commercial marketing campaign:
1. Conduct a sentiment survey in a sample of voters.
2. Send a message promoting your candidate to half the sample.
3. Conduct a second sentiment survey of the same voters and record whether each voter has moved in the direction of your candidate; this is the outcome of interest.
Note: the no-message control group is important, since some overall shift in opinion may have occurred regardless of the message.
The remaining steps track the commercial marketing case outlined above:
4. Fit a predictive model with “moved or no-move” as the outcome and include “message or no message” as a predictor. You now have a model that incorporates the effect of the message, in addition to the effects of other predictor variables.
5. In a new record to be scored, add a binary synthetic “message” predictor that is set to 1.
6. Score the record, then switch the value of the synthetic message variable to 0 and score it again.
7. For both #5 and #6, calculate the predicted probability of “1” (the voter moves toward your candidate).
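The probability in step 5 minus that in step 6 is the uplift of the message. Because the outcome here is favorable movement rather than churn, a positive uplift now means the message helps, and a negative uplift marks a sleeping dog.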
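As a rough sketch of the data preparation behind these steps, the snippet below builds the binary outcome from the two surveys and compares the message group with the control group. The records are invented, and the coding rule (a voter counts as "moved" only when switching from con to pro) is an assumption; a real survey might use a finer sentiment scale.

```python
# Hypothetical test-sample records: (got_message, survey 1, survey 2).
records = [
    (1, "con", "pro"), (1, "con", "pro"), (1, "con", "con"), (1, "pro", "pro"),
    (0, "con", "pro"), (0, "con", "con"), (0, "con", "con"), (0, "pro", "pro"),
]

def moved(before, after):
    # Assumed coding: 1 only if the voter switched from con to pro.
    return 1 if (before == "con" and after == "pro") else 0

def move_rate(group):
    # Share of voters in the given group (1 = message, 0 = control) who moved.
    outcomes = [moved(b, a) for msg, b, a in records if msg == group]
    return sum(outcomes) / len(outcomes)

message_rate = move_rate(1)
control_rate = move_rate(0)

# Some control voters move too, so only the difference is credited to the message.
average_effect = message_rate - control_rate

print(message_rate, control_rate, average_effect)
```

The `moved` values would then serve as the outcome in the model-fitting step, with the message flag as a predictor. The average effect computed here applies to the sample as a whole, while the uplift model estimates the effect voter by voter.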
The use of uplift modeling for political campaigns emerged straight out of commercial marketing applications. Ken Strasma, instructor for our Persuasion Analytics and Targeting course and President Obama’s Director of Targeting in the 2008 campaign, outlines these fundamental techniques in more detail.
Both commercial and political domains have now moved beyond the simple A-B test followed by the fitting of a predictive model to a discrete campaign. With the advent of Facebook, Google and Twitter, the opportunity to do experiments and target campaigns to users based on their web behavior has grown enormously, in some cases beyond the realm of human review. Brad Parscale, currently the campaign director for President Trump, described in a 60 Minutes interview how in 2016 the campaign used algorithms to create more than ten thousand Facebook ads daily, on the fly, responding to prior results.
To learn more about uplift, check out Mike Thurber’s blog on this topic.