Glossary

Predictive Modeling

Predictive modeling is the process of using a statistical or machine learning model to predict the value of a target variable (e.g. default or no-default) on the basis of a series of predictor variables (e.g. income, house value, outstanding debt, etc.). Many of the techniques used (e.g. regression, logistic regression, discriminant analysis) have been used for nearly a century in statistical research. However, in predictive modeling the emphasis is on predicting values in new data, rather than trying to explain an existing data set. Prediction can work and be quite effective even if the relationships between predictor variables and the target variable are not understood. Hence, traditional metrics that measure how well a model fits the data that it was fit to (e.g. R-squared or goodness-of-fit) are not that important in predictive modeling. What is important is how well the model predicts, and this is typically measured by applying the model to a hold-out sample where the value of the target variable is known.

Test Yourself

Planning on taking an introductory statistics course, but not sure if you need to start at the beginning? Review the course description for each of our introductory statistics courses and estimate which best matches your level, then take the self test for that course. If you get all or almost all the questions correct, move on and take the next test.