The Real Facebook Controversy

Cambridge Analytica’s wholesale scraping of Facebook user data is big news now, and people are shocked that personal data is being shared and traded on a massive scale on the internet. But the real issue with social media is not harm to individual users whose information was shared, but sophisticated and sometimes subtle mass manipulation of social and political behavior by bad actors, facilitated by deceit, fraud, and amplification of lies that spread easily through societal discourse on the internet.

We abandoned any pretense to privacy long ago when we accepted the free service model of Google, Facebook, Twitter, etc. (though Senators listening to Mark Zuckerberg seemed only dimly aware of Facebook’s ad-based revenue model).

The controversy about Cambridge Analytica that landed Mark Zuckerberg before Congress actually began brewing over a year ago. It was a controversy not about privacy but about how Cambridge Analytica put vast amounts of personal data, mostly from Facebook, into its so-called “psychographic” engine to influence behavior at the individual level (see my March 2017 article “When the Big Lie Meets Big Data”). Cambridge Analytica worked with researchers from Cambridge University who developed a Facebook app that provided a free personality test, then proceeded to scoop up all of each user’s Facebook data plus that of all their friends (thus leveraging the actual users, who numbered fewer than a million, to harvest the data of more than 80 million people). Using this data, Cambridge Analytica then classified each individual’s personality according to the so-called OCEAN scale (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism) and fashioned individually targeted messages to appeal to each person’s personality.

The real danger revealed by the Cambridge Analytica scandal is that the information and social platforms of the internet, on which we increasingly spend our time and through which more and more of our personal and social connections flow, are being corrupted in the service of con men, political demagogues, and thieves. Russia’s troll farm, the Internet Research Agency, employs fake user accounts to post divisive messages, purchase political ads, spread fabricated images, and even organize political rallies. Until recently, the social media giants seemed indifferent to this problem; any serious attempt to stem the creation of fraudulent accounts would have depressed the growth of the user base, which is all-important in Silicon Valley. Yet analytic methods to detect fake accounts are available.

Detecting Fake Social Media Accounts

In 2015, Dr. Jen Golbeck, who teaches Network Analysis at Statistics.com, published an ingenious real-time method for identifying fake social media accounts.
She found that users’ follower counts (Twitter) or friend counts (Facebook) follow a well-known distribution called Benford’s Law. Benford’s Law states that in a conforming data set, the first significant digit of a number is 1 about 30% of the time, roughly 6 times more often than it is 9. Golbeck and others identified a number of accounts that did not follow this pattern and found they were all fake Russian troll accounts. Read more about this method, and how the social media companies reacted, here: https://blogs.scientificamerican.com/observations/the-facebook-controversy-privacy-is-not-the-issue/
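
To make the Benford's Law idea concrete, here is a minimal Python sketch of a first-digit check. It is not Golbeck's published method, only an illustration of the underlying arithmetic: Benford's Law gives the probability of leading digit d as log10(1 + 1/d), which works out to about 30.1% for a 1 and about 4.6% for a 9. The function names, the deviation score, and the sample counts are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's Law: P(d) = log10(1 + 1/d) for leading digit d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading significant digit of a positive integer."""
    return int(str(n)[0])


def first_digit_frequencies(counts):
    """Observed share of each leading digit among the positive counts."""
    digits = [first_digit(c) for c in counts if c > 0]
    if not digits:
        raise ValueError("need at least one positive count")
    tally = Counter(digits)
    return {d: tally.get(d, 0) / len(digits) for d in range(1, 10)}


def benford_deviation(counts):
    """Hypothetical score: summed absolute gap between observed and
    expected first-digit shares (0 means a perfect match)."""
    expected = benford_expected()
    observed = first_digit_frequencies(counts)
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10))


if __name__ == "__main__":
    # Made-up friend counts, for illustration only (not real account data).
    sample_counts = [132, 18, 2045, 167, 310, 1120, 96, 27, 1480, 540]
    for digit, share in benford_expected().items():
        print(f"digit {digit}: expected share {share:.3f}")
    print(f"deviation score: {benford_deviation(sample_counts):.3f}")
```

A larger deviation score means the first digits look less like Benford's distribution. Any cutoff for flagging an account would have to be calibrated on real data, and a small sample of counts can deviate by chance alone.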

This is a summary of a more detailed account published this morning in Scientific American Online.