
Predictor P-Values in Predictive Modeling

Not So Useful

Predictor p-values in linear models are a guide to the statistical significance of a predictor's coefficient. A p-value estimates the probability that a model with no real relationship between the predictor and the outcome (for example, one fit after randomly shuffling the outcome values) would produce a coefficient at least as extreme as the fitted value. P-values are of limited utility in predictive modeling applications, for two reasons:

  • Software typically reports the p-value for in-sample (training) data, while in most predictive modeling applications you want to assess model performance on holdout data
  • They are often misinterpreted as measuring the importance of the predictor, or the probability that the model fits the data (neither is the case)
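The shuffling idea behind the definition above can be made concrete with a small sketch. It is not how regression software normally computes the p-value it reports (that usually comes from a t-statistic); it is a permutation version for a single-predictor model, with made-up data and hypothetical names (`slope`, `permutation_p_value`, `signal`, `noise`) chosen only for illustration.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def permutation_p_value(x, y, n_shuffles=2000, seed=0):
    """Share of shuffled fits whose |slope| is at least the fitted |slope|."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling y breaks any real link between x and y.
        rng.shuffle(y_shuffled)
        if abs(slope(x, y_shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return (hits + 1) / (n_shuffles + 1)

# Made-up data (an assumption for illustration): `signal` drives y, `noise` does not.
rng = random.Random(42)
signal = [rng.gauss(0, 1) for _ in range(100)]
noise = [rng.gauss(0, 1) for _ in range(100)]
y = [2.0 * s + rng.gauss(0, 1) for s in signal]

p_signal = permutation_p_value(signal, y)
p_noise = permutation_p_value(noise, y)
print(f"signal p-value: {p_signal:.4f}")
print(f"noise  p-value: {p_noise:.4f}")
```

Note that both p-values are computed on the same data used to fit the slope, which is exactly the in-sample limitation described in the first bullet, and that a small p-value says nothing about how large or how useful the predictor's effect is.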

There is one predictive modeling context in which they can be useful: eliminating variables to reduce the dimensionality of the data. A high p-value (say, above 0.20) is a good sign that the predictor’s contribution to the model is not much greater than what random chance would produce.
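As a rough sketch of this screening step, the example below fits an ordinary least squares model on made-up data and sets aside predictors whose p-value exceeds 0.20. Everything here is an assumption for illustration: the helper names (`invert`, `ols_p_values`), the column names, and the use of a normal approximation in place of the t distribution that statistical packages typically use. In practice you would read the p-values from your modeling software rather than compute them by hand.

```python
import random
from statistics import NormalDist

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols_p_values(columns, y):
    """Fit y on the named columns plus an intercept; return {name: two-sided p-value}.

    Uses a normal approximation to the t distribution, which is reasonable
    only when there are many more rows than predictors.
    """
    names = list(columns)
    n = len(y)
    rows = [[1.0] + [columns[name][i] for name in names] for i in range(n)]
    k = len(names) + 1
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    residuals = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in residuals) / (n - k)  # residual variance
    p_values = {}
    for j, name in enumerate(names, start=1):  # index 0 is the intercept
        z = beta[j] / (sigma2 * xtx_inv[j][j]) ** 0.5
        p_values[name] = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_values

# Made-up data: only x1 and x2 actually influence y.
rng = random.Random(7)
n_rows = 200
columns = {name: [rng.gauss(0, 1) for _ in range(n_rows)]
           for name in ["x1", "x2", "noise1", "noise2", "noise3"]}
y = [3.0 * a + 1.5 * b + rng.gauss(0, 1)
     for a, b in zip(columns["x1"], columns["x2"])]

p_values = ols_p_values(columns, y)
threshold = 0.20  # the cutoff suggested in the text
kept = [name for name, p in p_values.items() if p <= threshold]
dropped = [name for name, p in p_values.items() if p > threshold]
print("kept:", kept)
print("dropped:", dropped)
```

With a 0.20 cutoff, roughly one in five pure-noise predictors will still pass the screen by chance, so this is a coarse filter rather than a guarantee. The sketch also drops everything in one pass; removing one predictor at a time and refitting is a common alternative, since p-values change as the model changes.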

About Statistics.com

Statistics.com offers academic and professional education in statistics, analytics, and data science at beginner, intermediate, and advanced levels of instruction. Statistics.com is a part of Elder Research, a data science consultancy with 25 years of experience in data analytics.

Contact

The Institute for Statistics Education
2107 Wilson Blvd
Suite 850 
Arlington, VA 22201
(571) 281-8817

ourcourses@statistics.com

© Copyright 2022 - Statistics.com, LLC | All Rights Reserved | Privacy Policy | Terms of Use
