A test set is a portion of a data set used in data mining to assess the likely future performance of a single prediction or classification model that has been selected from among competing models, based on its performance with the validation set. While the validation set provides a useful guide to selecting the best-performing model, that best-performing model may still do better with the validation data than it will with unseen future data. This is particularly the case when many different models are assessed – in such cases there is a greater chance that the “winning” model emerges on top because it just happens (by chance) to fit some particularities of the validation set quite well.
Browse Other Glossary Entries
Planning on taking an introductory statistics course, but not sure if you need to start at the beginning? Review the course description for each of our introductory statistics courses and estimate which best matches your level, then take the self test for that course. If you get all or almost all the questions correct, move on and take the next test.
Find the right course for you
We'd love to answer your questions
Our mentors and academic advisors are standing by to help guide you towards the courses or program that makes the most sense for you and your goals.
300 W Main St STE 301, Charlottesville, VA 22903