ROC, Lift and Gains Curves

There are various metrics for assessing the performance of a classification model.  It matters which one you use. The simplest is accuracy – the proportion of cases correctly classified.  In classification tasks where the outcome of interest (“1”) is rare, though, accuracy as a metric falls short – high accuracy can be achieved by classifying everything as a “0.”  Here are three alternatives:
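To make the shortcoming of accuracy concrete, here is a minimal sketch using made-up data (1,000 records with a 2% rate of 1's; the numbers are purely illustrative). A "model" that labels everything as a 0 scores well on accuracy while finding none of the cases of interest.

```python
# Hypothetical data: 1,000 records, of which only 20 are 1's (2% prevalence).
actual = [1] * 20 + [0] * 980

# A "model" that ignores its inputs and labels every record as 0.
predicted = [0] * len(actual)

# Accuracy: share of all records classified correctly.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Recall: share of the actual 1's that the model identified.
true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_positives / sum(actual)

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.98
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```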

  1. The Receiver Operating Characteristic (ROC) curve is widely used, as is its associated metric, AUC (area under the curve).  This curve plots recall (% of 1’s correctly classified, called sensitivity in the medical sciences) on the y-axis and specificity (% of 0’s correctly classified) on a reversed x-axis, which is equivalent to plotting 1 – specificity (the false positive rate) on a conventional x-axis.  A near-perfect model – one that ranks nearly all the 1’s ahead of nearly all the 0’s – would have an ROC curve that hugs the upper left corner, and it would have an AUC of nearly 1.
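As a rough illustration of how the ROC curve and AUC can be computed, the sketch below sweeps a cutoff down through the records in order of decreasing score and records one point per cutoff. The function names (`roc_points`, `auc`) and the ten-record data set are hypothetical, chosen only for this example. The x-coordinate is the false positive rate, 1 – specificity, which is the same as specificity on a reversed axis.

```python
def roc_points(actual, scores):
    """Return (false positive rate, recall) pairs, one per distinct cutoff."""
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    # Rank records from highest to lowest predicted score.
    ranked = sorted(zip(scores, actual), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last record in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Illustrative data: actual classes and the model's predicted probabilities.
actual = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

model_auc = auc(roc_points(actual, scores))
print(f"AUC = {model_auc:.3f}")  # AUC = 0.833
```

A model that ranked every 1 ahead of every 0 would give an AUC of exactly 1 with this code, and a random ranking would be expected to give about 0.5.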

Often the business goal is not to classify every case, but rather to do a good job in identifying the 1’s without capturing too many 0’s, so we focus on the cases most likely to be 1’s.  For example, a direct marketer wants to expend effort on reaching only the most probable purchasers, and an insurance investigator wants to spend time on the likeliest frauds.  

  2. Gains, and the gains chart (or cumulative gains chart), measure the number of 1’s captured on the y-axis (or the total value, if the model is predicting a numerical quantity) as you move along the count of records on the x-axis, arrayed left to right in order of decreasing probability of being a 1 (or decreasing predicted value).  It looks like the ROC curve, but it does not generate an overall measure of model performance like the AUC. It does, however, measure units that relate more directly to business goals than do recall and specificity.
  3. Lift is like gains, except that it measures not the actual count of the 1’s (or the total predicted value), but rather the ratio of that count or value to the baseline count/value that you would achieve by selecting randomly.  
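Gains and lift as described above can be sketched in a few lines. The function names and the ten-record data set here are hypothetical and serve only as an illustration: records are sorted by decreasing predicted probability, gains is the running count of 1's, and lift divides that count by the number of 1's you would expect from the same number of randomly selected records.

```python
def cumulative_gains(actual, scores):
    """Running count of 1's captured, with records ranked by decreasing score."""
    ranked = [label for _, label in sorted(zip(scores, actual), key=lambda p: -p[0])]
    gains = []
    running = 0
    for label in ranked:
        running += label
        gains.append(running)
    return gains


def cumulative_lift(gains):
    """Ratio of 1's captured to the count expected under random selection."""
    total_ones = gains[-1]
    n = len(gains)
    # Selecting k of n records at random captures total_ones * k / n on average.
    return [g / (total_ones * k / n) for k, g in enumerate(gains, start=1)]


# Illustrative data: actual classes and the model's predicted probabilities.
actual = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

gains = cumulative_gains(actual, scores)
lift = cumulative_lift(gains)

print(gains)  # [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print([round(v, 2) for v in lift])
```

In this example the top two records capture two of the four 1's, where random selection would be expected to capture 0.8, so the lift at that point is 2.5. Lift always ends at 1 once every record has been selected.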

Lift and gains are often presented, for visual clarity, in a decile chart.  This allows the direct marketer, for example, or the fraud analyst, to easily consider the implications of actions for conveniently sized groups of customers.

Figure: Decile Lift Chart

The x-axis represents cases ranked by probability and grouped into deciles, with the bar on the left representing the decile with the highest probability of being a 1 (or, in the case of predicting numeric values, the highest predicted value).  The y-axis is the ratio of that decile’s count of 1’s (or predicted value) to the average count or value across all deciles.  
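The bar heights in such a chart can be computed as in the following sketch. The function name `decile_lift` and the 20-record data set are hypothetical, and the code assumes that the number of records divides evenly into ten groups and that the data contain at least one 1.

```python
def decile_lift(actual, scores, n_groups=10):
    """Lift per group: the group's count of 1's over the average count per group."""
    n = len(actual)
    if n % n_groups != 0:
        raise ValueError("this sketch assumes equal-sized groups")
    # Rank records from highest to lowest predicted score.
    ranked = [label for _, label in sorted(zip(scores, actual), key=lambda p: -p[0])]
    size = n // n_groups
    counts = [sum(ranked[i * size:(i + 1) * size]) for i in range(n_groups)]
    mean_count = sum(counts) / n_groups
    return [c / mean_count for c in counts]


# Illustrative data: 20 records listed in order of decreasing predicted probability.
actual = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0,
          0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [(20 - i) / 20 for i in range(20)]

lifts = decile_lift(actual, scores)

# Print a rough text version of the decile chart, top decile first.
for decile, value in enumerate(lifts, start=1):
    print(f"decile {decile:2d}: {value:4.2f} " + "#" * round(value * 4))
```

Here the top decile holds 2 of the 6 ones, against an average of 0.6 per decile, so its bar has a height of about 3.3.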
