Healthcare Analytics: Exploration versus Confirmation

Perhaps the most active application area for analytics and data mining is healthcare. This week we look at one success story (the use of machine learning to predict diabetic retinopathy), one story of disappointment (the use of genetic testing in a puzzling disease), and a basic dichotomy in statistical analysis. In his famous 1977 book that...

Instructor Spotlight: David Kleinbaum

David Kleinbaum developed several courses for, including Survival Analysis, Epidemiologic Statistics, and Designing Valid Statistical Studies. David retired a little over a year ago from Emory University, where he was a popular and effective teacher with the ability to distill and explain difficult statistical concepts with clarity and concision. David had a flair for...

Course Spotlight: Survival Analysis

Convinced that he, like his father, would die in his 40s, Winston Churchill lived his early life in a frenetic hurry. He had participated in four wars on three continents by his mid-20s, served in multiple ministerial positions by his 30s, and published 12 books by his 40s. Little did he know that more than...

Industry Spotlight: Automotive

The auto industry serves as a perfect exemplar of three key eras of statistics and data science in service of industry, beginning with Total Quality Management (TQM). First in Japan, and later in the U.S., the auto industry became an enthusiastic adherent of the Total Quality Management philosophy. Fundamentally, TQM is all about using data to improve...

Likert scale assessment surveys

Do you work with multiple-choice tests or Likert-scale assessment surveys? Rasch methods help you construct linear measures from these forms of scored observations and analyze the results from such surveys and tests. In the course “Practical Rasch Measurement – Core Topics,” you will learn practical aspects of data setup, analysis, output interpretation, fit analysis, differential...

Historical Spotlight: Jacob Wolfowitz

World War II was a crucible of technological innovation, including advances in statistics. Jacob Wolfowitz, born in 1910, looked at the problem of noisy radio transmissions. Coded radio transmissions were critical elements of military command and control, and they were plagued by the problem of atmospheric or other interference – “noise”. The weaker...

Statistically Significant – But Not True

If you are looking for the Feature Engineering blog post, you can find it here:

In 2015, at an Alzheimer’s conference, Biogen researchers presented dramatic brain scans showing that the antibody aducanumab effectively cleared out plaque in the brain, plaque that was associated with Alzheimer’s disease. Their study involved 166 patients in a randomized,...

Student Spotlight: Barry Eggleston

Barry Eggleston is a health research statistician who has worked on both clinical trials and observational studies, and is currently with RTI in North Carolina. In his early career, his work consisted solely of designing and analyzing clinical trials using typical biostatistics methods ranging from t-tests to survival analysis and mixed models. After moving to RTI...

Industry Spotlight: The IRS is Watching You

The IRS (U.S. Internal Revenue Service) has been using computers to choose tax returns for audit since 1962. Early on, the selection was rule-based, but the IRS turned to statistical modeling in 1969, using the oldest predictive analytics model in the toolbox – discriminant analysis. Discriminant analysis, a linear classification technique, was first proposed by...

Job Spotlight: Sports Statistician

The field of sports statistics is not exactly new; the American Statistical Association’s section on Sports Statistics was formed in 1992. Three of’s instructors have professional experience in sports statistics – Ben Baumer (SQL) served as statistician for the NY Mets, Stephanie Kovalchik (Meta Analysis in R) with Tennis Australia, and Joe Hilbe, who...

Industry Spotlight: Baseball – Opening Day & Statistics in Sports

The U.S. baseball season opens Thursday, March 28, and celebrates the 48th season of analytics in baseball, beginning with the founding of the Sabermetric Society in 1971 (the same year that Satchel Paige entered the Hall of Fame). Analytics has come a long way in sports, and now has its own conference, the MIT Sports...

Darwin’s Legacy in Statistics

Charles Darwin, the most famous grandson of the Enlightenment thinker Erasmus Darwin, published his ground-breaking theory of evolution, “The Origin of Species,” 160 years ago. Another grandson of Erasmus, Francis Galton, became one of the founding fathers of statistics (correlation, the “wisdom of the crowd,” regression and regression to the mean are all Galton’s ideas). Heavily...

Industry Spotlight: CROs

CROs, or contract research organizations, are a $40 billion industry, growing at close to 12% per year. They provide contract services to the pharmaceutical industry, including statistical design and analysis, laboratory services, administration of clinical trials, and monitoring of drugs once they are on the market. Developing a new drug and bringing it to market...

Handling the Noise – Boost It or Ignore It?

In most statistical modeling or machine learning prediction tasks, there will be cases that can be easily predicted based on their predictor values (signal), as well as cases where predictions are unclear (noise). Two statistical learning methods, boosting and ProfWeight, use those difficult cases in exactly opposite ways – boosting up-weights them, and ProfWeight down-weights...

Good to Great

In 1994, Jim Collins and Jerry Porras, former and current Stanford professors, published the best-seller Built to Last that described how “long-term sustained performance can be engineered into the DNA of an enterprise.” It sold over a million copies. Buoyed by that success, Collins and a research team set out to find the characteristics of companies...

Space Shuttle Explosion

In 1986, the U.S. space shuttle Challenger exploded about 73 seconds after launch. A later investigation found that the cause of the disaster was O-ring failure, due to cold temperatures. The temperature at launch was 39 degrees, colder than any prior launch. The cold caused the O-rings to become stiff and brittle, losing the flexibility that...

The Statistics of Persuasion

The Art of Persuasion is the title of more than one book in the self-help genre, books that have spawned blogs, podcasts, speaking gigs and more. But the science of persuasion is actually of more interest, because it produces useful rules that can be studied and deployed. Marketers and politicians have long been enthusiastic users...

Book Review: Thinking Fast and Slow

Daniel Kahneman won a Nobel Prize in Economics for his work in behavioral economics, much of it with his colleague Amos Tversky, who died in 1996. Kahneman’s 2011 classic, Thinking Fast and Slow, is a superbly written non-technical summary of their fascinating research and its often counter-intuitive findings. The best feature of the book is the...