Statistics.com: Data Science, Analytics & Statistics Courses

Blog

Same thing, different terms...

July 3, 2018, 4:06 pm

Data science is rife with terminology anomalies, because the field draws on multiple disciplines, each with its own vocabulary.

The most striking one is “sample”: to statisticians it means a dataset selected from a larger dataset. To computer scientists and machine learners it often means a single observation or record in a dataset. Other synonyms for the single record include example, instance, pattern (from machine learning), case (statistics), and row (database technology).

Outliers are also called anomalies.

An outcome variable is also called a dependent variable, a response variable, or a target variable.

In predictive modeling, a subset of the data is typically withheld from the model fitting process, and the fitted model is then tested on the withheld data. The withheld data are called the holdout data, the validation data, or the test data. In some applications two extra datasets are withheld, one of them to be used only at the very end, to assess bias in the ultimately chosen model (not to tune models further). This third set, counting the data used for fitting as the first, is the test data in the original SAS terminology.
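As a rough illustration of the three-set arrangement described above, here is a minimal sketch using only the Python standard library. The function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are assumptions made for this example, not part of any particular tool's terminology or API.

```python
import random

def three_way_split(records, valid_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle records and partition them into training, validation, and test sets.

    Hypothetical helper for illustration; the fractions are arbitrary choices.
    """
    if not 0 <= valid_frac + test_frac < 1:
        raise ValueError("valid_frac + test_frac must be in [0, 1)")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_frac)
    n_valid = round(len(shuffled) * valid_frac)
    test = shuffled[:n_test]                     # touched only once, at the very end
    valid = shuffled[n_test:n_test + n_valid]    # used to tune and compare models
    train = shuffled[n_test + n_valid:]          # used to fit models
    return train, valid, test

train, valid, test = three_way_split(range(100))
print(len(train), len(valid), len(test))  # 60 20 20
```

The point of keeping `test` separate is that it plays no part in choosing among models, so it can give a less optimistic read on the final one.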

In predictive modeling, a classification model predicts which category a record belongs to; in biostatistics, a diagnostic test plays a similar role. In biostatistics, sensitivity is the proportion of 1’s (cases of interest) correctly identified by the test. The same metric is called recall in predictive modeling.
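To make the equivalence concrete, the sketch below computes the metric once and exposes it under both names. The function names and the small label lists are made up for illustration.

```python
def recall(actual, predicted, positive=1):
    """Proportion of actual positives (1's) that were predicted as positive."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in actual")
    return true_pos / (true_pos + false_neg)

# Same calculation, biostatistics name.
sensitivity = recall

# Illustrative data: four cases of interest, three of them caught.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
print(recall(actual, predicted))       # 0.75
print(sensitivity(actual, predicted))  # 0.75
```

Note that the false positive in the fifth position does not affect the result: the metric looks only at the records that truly are 1's.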

 

The term “decision tree,” by contrast, is a single term used in two very different ways. In predictive modeling, decision trees learned from data establish rules on predictor variables that can be used to predict unknown outcome variables. In decision analysis, a decision-maker constructs a branching decision tree to account for the different possible outcomes of events, along with their probabilities and costs or benefits, to identify the decision path with the maximum expected value.
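The decision-analysis sense can be sketched in a few lines: each decision leads to chance outcomes with probabilities and payoffs, and the decision-maker picks the branch with the highest expected value. The decision names, probabilities, and payoffs below are invented purely for illustration.

```python
def expected_value(outcomes):
    """Sum of probability * payoff over the chance outcomes of one decision."""
    return sum(prob * payoff for prob, payoff in outcomes)

# Hypothetical one-level decision tree: decision -> [(probability, payoff), ...]
decisions = {
    "launch": [(0.6, 100_000), (0.4, -50_000)],
    "hold": [(1.0, 10_000)],
}

# Evaluate each branch and choose the one with the maximum expected value.
values = {name: expected_value(outcomes) for name, outcomes in decisions.items()}
best = max(values, key=values.get)
print(best)
```

Nothing here is learned from data, which is the contrast with the predictive-modeling sense: the tree's structure, probabilities, and payoffs are all supplied by the decision-maker.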

 


