Good psychics have a knack for getting their audience to reveal, unwittingly, information that can be turned around and used in a prediction. Statisticians and data scientists fall prey to a related phenomenon, leakage, when they allow into their models highly predictive features that would be unavailable at prediction time. In one noted example, a… Continue reading “Controlling Leaks”
Category Archives: Statistical Thinking Series
As an Aspiring Data Scientist, What Do I Really Need to Know About Statistics?
As the popularity of data science has grown, so too has advice on how to get jobs in data science. A common form of advice is a list of sample questions you might be asked at your job interview (see here and here for examples). Often, the list starts out with statistics, but beware: it… Continue reading “As an Aspiring Data Scientist, What Do I Really Need to Know About Statistics?”
July 21: Statistics in Practice
In this week’s brief, a continuation of our “Statistical Thinking” series, we reflect on three “myths” in data science and statistics, and spotlight our ten-course Social Science Statistics certificate program. You can get started with either of these courses: Aug 7 – Sep 4: Survey Design and Sampling Procedures; Oct 2 – 30: Regression Analysis. See you… Continue reading “July 21: Statistics in Practice”