Random Selection for Harvard Admission?

An ethical algorithm… Ethics in algorithms is a popular topic now. Usually the conversation centers on the possible unintentional bias or harm that a statistical or machine learning algorithm could do when it is used to select, score, rate, or rank people. For example – a credit scoring algorithm may include a predictor that is… Continue reading “Random Selection for Harvard Admission?”

GE Regresses to the Mean

Thirty years ago, GE became the brightest star in the firmament of statistical ideas in business when it adopted Six Sigma methods of quality improvement. Those methods had been introduced by Motorola, but Jack Welch’s embrace of the same methods at GE, a diverse manufacturing powerhouse, helped bring stardom to industrial statisticians. Last week, GE’s… Continue reading “GE Regresses to the Mean”

Examples of Bad Forecasting

In a couple of days, the Wall Street Journal will come out with its November survey of economists’ forecasts. It’s a particularly sensitive time, with elections in a few days and President Trump attacking the Federal Reserve for raising interest rates. It’s a good time to recall major forecasting gaffes of the past. In 1987, best-selling… Continue reading “Examples of Bad Forecasting”

Historical Spotlight: Risk Simulation – Since 1946

Simulation – a Venerable History: One of the most consequential and valuable analytical tools in business is simulation, which helps us make decisions in the face of uncertainty, such as these: An airline knows, on average, what proportion of ticketed passengers show up for a flight, but the number for any given flight is uncertain. Continue reading “Historical Spotlight: Risk Simulation – Since 1946”

BOOTSTRAP

I used the term in my message about bagging and several people asked for a review of the bootstrap. Put simply, to bootstrap a dataset is to draw a resample from the data, randomly and with replacement.
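As a minimal sketch of the definition above (not taken from the original post), the Python below draws resamples from a small dataset, randomly and with replacement, and uses them to gauge the variability of the sample mean. The data values, the function names `bootstrap_resample` and `bootstrap_means`, the number of resamples, and the percentile interval at the end are all illustrative assumptions.

```python
import random
import statistics


def bootstrap_resample(data, rng):
    """Draw one resample the same size as `data`, randomly and with replacement."""
    return [rng.choice(data) for _ in data]


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Return the mean of each of `n_resamples` bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.mean(bootstrap_resample(data, rng))
        for _ in range(n_resamples)
    ]


# Made-up observations, for illustration only.
data = [3.1, 4.7, 5.0, 5.2, 6.8, 7.4, 9.9]

means = sorted(bootstrap_means(data))

# Rough 95% percentile interval: cut off 2.5% of resample means in each tail.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]

print("sample mean:", statistics.mean(data))
print("approximate 95% bootstrap interval for the mean:", (lower, upper))
```

Because each draw is made with replacement, a given resample will typically repeat some observations and omit others, which is what lets the spread of the resampled means stand in for sampling variability.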

100 years of variance

It is 100 years since R. A. Fisher introduced the concept of “variance” (in his 1918 paper “The Correlation Between Relatives on the Supposition of Mendelian Inheritance”). There is much that statistics has given us in the century that followed. Randomized clinical trials, and the means to analyze them, moved medicine fully into the modern, science-based era. Continue reading “100 years of variance”

Course Spotlight: Deep Learning

Deep learning is essentially “neural networks on steroids” and it lies at the core of the most intriguing and powerful applications of artificial intelligence. Facial recognition (which you encounter daily in Facebook and other social media) harnesses many levels of data science tools, including algorithms that compare images and match those with similar measurements between… Continue reading “Course Spotlight: Deep Learning”

Course Spotlight: Structural Equation Modelling (SEM)

SEM stands for “structural equation modeling,” and we are fortunate to have Prof. Randall Schumacker teaching this subject at Statistics.com. Randy created the Structural Equation Modeling (SEM) journal in 1994 and the Structural Equation Modeling Special Interest Group (SIG) at the American Educational Research Association (AERA). He has also co-authored several books, including: A Beginner’s… Continue reading “Course Spotlight: Structural Equation Modelling (SEM)”

Benford’s Law Applies to Online Social Networks

Fake social media accounts and Russian meddling in US elections have been in the news lately, with Mark Zuckerberg (Facebook founder) testifying this week before the US Congress. Dr. Jen Golbeck, who teaches Network Analysis at Statistics.com, published an ingenious way to determine whether a Facebook, Twitter or other social media account is fraudulent. Her… Continue reading “Benford’s Law Applies to Online Social Networks”

The Real Facebook Controversy

Cambridge Analytica’s wholesale scraping of Facebook user data is big news now, and people are shocked that personal data is being shared and traded on a massive scale on the internet. But the real issue with social media is not harm to individual users whose information was shared, but sophisticated and sometimes subtle mass manipulation… Continue reading “The Real Facebook Controversy”

Course Spotlight: Two statistical modeling courses

Two important statistical modeling courses are coming up in May:

May 18 – Jun 15: Principal Components and Factor Analysis
May 18 – Jun 15: Modeling Count Data

Factor analysis is used frequently in social science research where you want to examine that which you cannot observe (latent variables) using data that you can… Continue reading “Course Spotlight: Two statistical modeling courses”

Masters Programs versus an Online Certificate in Data Science from Statistics.com

We just attended the analytics conference of INFORMS (The Institute for Operations Research and the Management Sciences) this week in Baltimore, and they held a special meeting for directors of academic analytics programs to better align what universities are producing with what industry is seeking. The number of such programs is still growing rapidly (>200),… Continue reading “Masters Programs versus an Online Certificate in Data Science from Statistics.com”

Course Spotlight: Likert scale assessment surveys

Do you work with multiple choice tests, or Likert scale assessment surveys? Rasch methods help you construct linear measures from these forms of scored observations and analyze the results from such surveys and tests. “Practical Rasch Measurement – Core Topics”: In this course, you will learn practical aspects of data setup, analysis, output interpretation, fit… Continue reading “Course Spotlight: Likert scale assessment surveys”

Course Spotlight: Customer Analytics in R

“The customer is always right” was the motto Selfridge’s department store coined in 1909. “We’ll tell the customer what they want” was Madison Avenue’s mantra starting in the 1950s. Now data scientists like Karolis Urbonas help companies like Amazon (where he works in Europe as Head of Data Science, Amazon Devices) use data to figure… Continue reading “Course Spotlight: Customer Analytics in R”

Course Spotlight: Predictive Analytics

Predicting whether an internet user will click on a link or buy a product, whether an insurance claim is fraudulent, whether a home mortgage will be paid on time (or early), how much a house will sell for, what internet ad you should see next, whether a discharged patient will need to return to the… Continue reading “Course Spotlight: Predictive Analytics”