“Defiant” Supervision

How did the phrase “defiantly recommend,” as in “I defiantly recommend this product,” come into common usage on the internet? The answer offers a good look inside the workings of supervised learning.

Supervision, generally from humans, is instrumental in much of statistical and machine learning. Google’s precise search algorithms are not public, but the general approach is to return to the user a set of links that are statistically “close” to the search string. A similar approach is used in spell-check. If the user types something that is not in the dictionary, the spell-checker provides legitimate dictionary words that are close to the misspelled term.

Big Data Enables Supervision

Next comes the supervision part. Users choose the link or the word that matches what they are looking for, and they do so again and again: Google processes over 40,000 search queries per second. Over time, for each search string and each misspelling, Google observes which of its suggestions receives the most votes and moves it to the top of the list.
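The vote-counting idea can be sketched in a few lines of Python. This is only an illustration of the general mechanism described above, not Google's actual system; the class name `SuggestionRanker` and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class SuggestionRanker:
    """Hypothetical re-ranker: orders suggestions by how often users chose them."""

    def __init__(self):
        # One vote counter per query string (or misspelling).
        self.votes = defaultdict(Counter)

    def record_choice(self, query, chosen):
        """Register that a user picked `chosen` after typing `query`."""
        self.votes[query][chosen] += 1

    def rank(self, query, candidates):
        """Return candidates with the most-chosen first.

        sorted() is stable, so candidates with equal votes keep
        their original (unsupervised) order.
        """
        counts = self.votes[query]
        return sorted(candidates, key=lambda c: -counts[c])


ranker = SuggestionRanker()
candidates = ["defiantly", "definitely"]
before = ranker.rank("definatly", candidates)  # no votes yet: order unchanged

for _ in range(3):
    ranker.record_choice("definatly", "definitely")
ranker.record_choice("definatly", "defiantly")
after = ranker.rank("definatly", candidates)   # "definitely" now leads
```

With no votes recorded, the ranking falls back to whatever order the candidates arrived in, which is why the initial, unsupervised ordering matters so much.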

“Defiant” recommendations arose out of an initial misspelling: users meant “definitely recommend” but some typed “definatly recommend.” Recognizing that “definatly” was not a correct spelling, Google suggested alternatives. In the early days, without any supervision to go on, Google listed “defiantly” ahead of “definitely” because it was closer to “definatly” in its spelling: it has the same set of letters, with just two adjacent letters (“n” and “a”) swapped. “Definitely” needs more changes (lose the “a”, and add an “e” and an “i”), so early on it was listed second.
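The notion of spelling “closeness” can be made concrete with an edit distance. The sketch below uses the optimal string alignment distance (a restricted form of Damerau-Levenshtein), which counts an adjacent swap as a single edit. This choice of metric is an assumption made for illustration; the actual distance used by any search engine is not public.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between strings a and b.

    Insertions, deletions, substitutions, and swaps of two adjacent
    characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i          # delete all i characters of a
    for j in range(cols):
        d[0][j] = j          # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "na" -> "an"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


typed = "definatly"
candidates = ["definitely", "defiantly"]
ranked = sorted(candidates, key=lambda word: osa_distance(typed, word))

print(osa_distance(typed, "defiantly"))   # 1 (swap "n" and "a")
print(osa_distance(typed, "definitely"))  # 2 (replace "a" with "i", insert "e")
print(ranked)                             # ['defiantly', 'definitely']
```

The treatment of swaps matters here. Under plain Levenshtein distance, which has no transposition operation, both candidates are two edits away from “definatly” and the ranking would be a tie.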

When supervision works properly, users correct the error by choosing the correct spelling. Unfortunately, the early human supervisors were lazy. They simply OK’d the first option on offer, “defiantly,” and Google learned that this was the proper correction for “definatly.”
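This feedback loop can be illustrated with a small deterministic simulation. Everything in it is an assumption made for the sake of the sketch: the function name `simulate_feedback`, the rule that a fixed number of users out of every ten accept the top suggestion, and the rule that the list is re-sorted by votes after each click.

```python
from collections import Counter


def simulate_feedback(initial_order, intended, n_users, lazy_per_ten):
    """Simulate users voting on spelling suggestions.

    Out of every ten users, the first `lazy_per_ten` are "lazy" and accept
    whatever is currently listed first; the rest pick the intended word.
    After each click the list is re-sorted by vote count (stable sort, so
    ties keep the current order).
    """
    order = list(initial_order)
    votes = Counter()
    for user in range(n_users):
        is_lazy = (user % 10) < lazy_per_ten
        choice = order[0] if is_lazy else intended
        votes[choice] += 1
        order.sort(key=lambda word: -votes[word])
    return order, votes


start = ["defiantly", "definitely"]  # unsupervised order, by spelling distance

# Mostly lazy users: the initial mistake is reinforced.
lazy_order, lazy_votes = simulate_feedback(start, "definitely", 100, lazy_per_ten=8)

# Mostly careful users: the correct word overtakes the wrong one.
careful_order, careful_votes = simulate_feedback(start, "definitely", 100, lazy_per_ten=2)
```

In the first run “defiantly” stays on top, because lazy clicks keep voting for whatever is already first. In the second, enough careful users choose “definitely” for it to take the lead, after which even the lazy clicks reinforce the correct spelling.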

Supervised learning, in its first appearances, was a modified form of traditional statistical modeling, in which a model is fit to a set of data in order to describe and, hopefully, explain the relationship between predictor variables and an outcome. Supervised learning adds two elements:

  • The primary purpose becomes predicting outcomes for new records

  • The model is validated and adjusted using out-of-sample data (a sample withheld from the original model fit)
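These two elements can be sketched as follows: fit a model on part of the data, then measure its error on records withheld from the fit. The data are made up for illustration, and the helper names (`train_holdout_split`, `fit_line`, `mse`) are hypothetical; a real project would more likely use a statistics library and a random split.

```python
def train_holdout_split(records, holdout_every=4):
    """Withhold every `holdout_every`-th record from the model fit."""
    train = [r for i, r in enumerate(records) if i % holdout_every != 0]
    holdout = [r for i, r in enumerate(records) if i % holdout_every == 0]
    return train, holdout


def fit_line(pairs):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(pairs, slope, intercept):
    """Mean squared prediction error of the fitted line on `pairs`."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in pairs) / len(pairs)


# Made-up data: y is roughly 2x + 1 with a small repeating disturbance.
records = [(x, 2 * x + 1 + 0.5 * (x % 3 - 1)) for x in range(20)]

train, holdout = train_holdout_split(records)
slope, intercept = fit_line(train)           # model sees only the training part
train_error = mse(train, slope, intercept)
holdout_error = mse(holdout, slope, intercept)  # error on records not used in the fit
```

The holdout error, not the training error, is the number that speaks to the primary purpose named above: how well the model predicts outcomes for new records.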

Predicting credit scores using logistic regression was an early (and continuing) application of supervised learning. The same paradigm was later applied to data-centric algorithms that do not impose a linear or other structural model, such as:

  • Nearest neighbor algorithms (label new cases as other, similar, cases are labeled)

  • Tree algorithms (repeatedly split the data according to predictors that do a good job of separating the outcomes)

  • Naive Bayes (identify the most probable outcome, given the predictors)

  • Neural nets (repeatedly pass the cases through a set of weights applied to predictors, iteratively adjusting the weights to improve predictions)
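As one concrete instance from the list above, here is a minimal nearest-neighbor classifier. The records, their two numeric predictors, and the labels “repaid” and “defaulted” are invented for illustration, and the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(labeled, new_point, k=3):
    """Label `new_point` by majority vote among its k nearest labeled cases.

    `labeled` is a list of (predictors, label) pairs, where predictors is a
    tuple of numbers. Distance is Euclidean.
    """
    nearest = sorted(labeled, key=lambda rec: math.dist(rec[0], new_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented training cases: (predictor_1, predictor_2) -> known outcome
labeled = [
    ((1.0, 1.0), "repaid"),
    ((1.2, 0.8), "repaid"),
    ((0.9, 1.1), "repaid"),
    ((4.0, 4.2), "defaulted"),
    ((4.1, 3.9), "defaulted"),
    ((3.8, 4.0), "defaulted"),
]

print(knn_predict(labeled, (1.1, 1.0)))  # repaid
print(knn_predict(labeled, (4.0, 4.0)))  # defaulted
```

No linear or other structural model is imposed here: the prediction comes entirely from the labels that humans (or past outcomes) attached to similar cases, which is the supervision.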

 

The term “artificial intelligence” has somewhat eclipsed “machine learning” in the popular imagination, but supervised learning remains a core function in data science.

About Statistics.com

Statistics.com offers academic and professional education in statistics, analytics, and data science at beginner, intermediate, and advanced levels of instruction. Statistics.com is a part of Elder Research, a data science consultancy with 25 years of experience in data analytics.

 The Institute for Statistics Education is certified to operate by the State Council of Higher Education for Virginia (SCHEV)

© Copyright 2023 - Statistics.com, LLC | All Rights Reserved | Privacy Policy | Terms of Use
