Healthcare Analytics: Exploration versus Confirmation

Perhaps the most active application of analytics and data mining is healthcare. This week we look at one success story (the use of machine learning to predict diabetic retinopathy), one story of disappointment (the use of genetic testing in a puzzling disease), and a basic dichotomy in statistical analysis. In his famous 1977 book that…

Matching Algorithms

Some applications of machine learning and artificial intelligence are recognizably impressive – predicting future hospital readmission of discharged patients, for example, or diagnosing retinopathy. Others – self-driving cars, for example – seem almost magical. The matching problem, though, is one where your first reaction might be “What’s so hard about that?” For example, to take…

Industry Spotlight: Automotive

The auto industry serves as a perfect exemplar of three key eras of statistics and data science in service of industry. Total Quality Management (TQM): First in Japan, and later in the U.S., the auto industry became an enthusiastic adherent of the Total Quality Management philosophy. Fundamentally, TQM is all about using data to improve…

Feature Engineering and Data Prep – Still Needed?

It is a truism of machine learning and predictive analytics that 80% of an analyst’s time is consumed in cleaning and preparing the needed data. I saw an estimate by a Google engineer that 25% of the time was spent just looking for the right data. A big part of this process is human-driven feature…

A Deep Dive into Deep Learning

On Wednesday, March 27, the 2018 Turing Award in computing was given to Yoshua Bengio, Geoffrey Hinton and Yann LeCun for their work on deep learning. Deep learning by complex neural networks lies behind the applications that are finally bringing artificial intelligence out of the realm of science fiction into reality. Voice recognition allows you…

Industry Spotlight: Package Delivery

Nothing better illustrates the encroachment of data science and analytics on the older “economy of tangible things” than the business of delivering packages. The use of analytics in package delivery is not new. Companies like UPS and FedEx are longtime users of operations research methods like optimization and simulation to route inter-city shipments, site new…

Ethical Practice in Data Mining

Prior to the advent of internet-connected devices, the largest source of big data was public interaction on the internet. Social media users, as well as shoppers and searchers on the internet, make an implicit deal with the big companies that provide these services: users can take advantage of powerful search, shopping and social interaction tools…

Industry Spotlight: Customer Segmentation

Are you “young and rustic?” Or perhaps a “toolbelt traditionalist?” These are nicknames given to customer segments identified by market research firm Claritas, with its statistical clustering tool. Long before the advent of individualized product recommendations, businesses sought to segment customers into distinct groups on the basis of purchase behavior, demographic variables, and geography, to…

“Defiant” Supervision

How did the phrase “defiantly recommend”, as in “I defiantly recommend this product,” come into common usage on the internet? The answer offers a good look inside the workings of supervised learning. Supervision, generally from humans, is instrumental in much of statistical and machine learning. Google’s precise search algorithms are not public, but the general…

Alaskan Generosity

People in Alaska are extraordinarily generous – that’s what a predictive model showed, when applied to a charitable organization’s donor list. A closer examination revealed a flaw – while the original data was for all 50 states, the model’s training data for Alaska included donors, but excluded non-donors. The reason? The data was 99% non-donors,…

Industry Spotlight – The Military

Abraham Wald, a persecuted Jewish mathematician who fled Austria just before World War II, led an analysis of Allied bombers returning from missions. Hitherto, the Air Force had focused on reinforcing areas that showed the most damage on return. Wald convinced them instead to focus on the areas that consistently showed no damage. He reasoned…

Political Analytics and Microtargeting

The statistics of targeting individual voters with specific messages, as opposed to messaging that went to whole groups, began in the U.S. over a decade ago with the Democrats. Political targeting is now an established business, or at least a discipline within the broader realm of political consulting. By 2016, the Republicans had surged well…

The Statistics of Persuasion

The Art of Persuasion is the title of more than one book in the self-help genre, books that have spawned blogs, podcasts, speaking gigs and more. But the science of persuasion is actually of more interest, because it produces useful rules that can be studied and deployed. Marketers and politicians have long been enthusiastic users…

Job Spotlight: Digital Marketer

A digital marketer handles a variety of tasks in online marketing – managing online advertising and search engine optimization (SEO), implementing tracking systems (e.g. to identify how a person came to a retailer), web development, preparing creatives, implementing tests, and, of course, analytics. There are typically three types of employers: Marketing agencies that contract out…

Artificial Lawyers

Can statistical and machine learning methods replace lawyers? A host of entrepreneurs think so, and so do the folks who run […]. Text mining and predictive modeling products are available now to predict case staffing requirements and perform automated document discovery, and natural language algorithms conduct legal research and case review. In 2017, a predictive algorithm…

Entity Resolution and Identifying Bad Guys

Earlier, we described how Jen Golbeck (who teaches Network Analysis at […]) analyzed Facebook connections to identify fake accounts (the account holders’ friends all had the same number of friends, which is highly improbable statistically). Network analysis and studying connections lie at the heart of entity resolution. To a sales and marketing person, entity resolution…

How Google Determines Which Ads You See

A classic machine learning task is to predict something’s class, usually binary – pictures as dogs or cats, insurance claims as fraud or not, etc. Often the goal is not a final classification, but an estimate of the probability of belonging to a class (propensity), so the cases can be ranked. A good example of…

Job Spotlight: Data Scientist

Data science is one of a host of similar terms. Artificial intelligence has been around since the 1960s and data mining for at least a couple of decades. Machine learning came out of the computer science community, and analytics, data analytics, and predictive analytics came out of the statistics and OR communities. Among all of…

Triage and Artificial Intelligence

Predictim is a service that scans potential babysitters’ social media and other online activity and issues them a score that parents can use to select babysitters. Jeff Chester, the executive director of the Center for Digital Democracy, commented: There’s a mad rush to seize the power of AI to make all kinds of decisions without…

Course Spotlight: Deep Learning

Deep learning is essentially “neural networks on steroids” and it lies at the core of the most intriguing and powerful applications of artificial intelligence. Facial recognition (which you encounter daily in Facebook and other social media) harnesses many levels of data science tools, including algorithms that compare images and match those with similar measurements between…