Training data are a subset of all the data you have available, and are used to fit one or more models. The fitted models are then applied to one or more other subsets of the same data, held out from fitting, and predicted values of the outcome variable are calculated. The predicted values are compared with the actual values to produce measures of model performance, and those measures are used to compare the models.
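The workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed method: the synthetic data, the 70/30 split, the two candidate models (a mean-only baseline and a least-squares line), the choice of mean squared error as the performance measure, and all function names are assumptions made for the example.

```python
import random


def split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and holdout subsets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_mean(train):
    """Baseline model: always predict the mean outcome seen in training."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Least-squares line y = a + b*x, fitted on the training rows only."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def mse(model, rows):
    """Mean squared error between predicted and actual outcomes."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


# Synthetic (x, y) data, assumed purely for illustration.
rng = random.Random(42)
data = [(x, 2.0 * x + 1.0 + rng.gauss(0, 1)) for x in range(100)]

# Fit on the training subset, then score every model on the holdout subset.
train, holdout = split(data)
models = {"mean": fit_mean(train), "line": fit_line(train)}
scores = {name: mse(model, holdout) for name, model in models.items()}
best = min(scores, key=scores.get)

print(scores)
print("best model on holdout data:", best)
```

The key design point is that `fit_mean` and `fit_line` only ever see `train`, while `mse` is computed on `holdout`, so the comparison reflects performance on data the models were not fitted to.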
The Institute for Statistics Education offers an extensive glossary of statistical terms, available to all for reference and research.