

Coffee causes cancer?

"Any claim coming from an observational study is most likely to be wrong." Thus begins "Deming, data and observational studies," just published in "Significance Magazine" (Sept. 2011).

Statisticians have long been wary of scientific claims based on observational data – there is too much room to “torture the data long enough, until Nature confesses,” in the words of Ronald Coase, the Nobel Prize-winning economist. Fiddle with the variables and model parameters sufficiently, run enough comparisons, look at enough subgroups, and you can find statistical significance in almost any observed data. Statisticians prefer controlled experiments, but those are expensive and hard to do, and most published epidemiological research is based on observational data.
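To put a rough number on "run enough comparisons," here is a small worked calculation. It is a simplification not taken from the article: it assumes m independent tests, each run at significance level α, with no real effect anywhere in the data. The symbols m and α are introduced here for illustration only.

```latex
% Probability of at least one false positive among m independent tests,
% each at level \alpha, when every null hypothesis is true:
\[
  P(\text{at least one ``significant'' result}) = 1 - (1 - \alpha)^{m}
\]
% Example with \alpha = 0.05 and m = 20:
\[
  1 - (0.95)^{20} \approx 1 - 0.358 = 0.642
\]
```

Under these assumptions, twenty looks at pure noise produce at least one "significant" finding roughly 64% of the time. Real analyses rarely involve fully independent tests, so the exact figure will differ, but the direction of the effect is the same.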

Nonetheless, Stanley Young and Alan Karr at the National Institute of Statistical Sciences located 12 observational studies that yielded claims that were subsequently tested in randomized controlled experiments. The results?

There were 52 “statistically significant” claims arising from the original observational studies. None replicated in the randomized controlled studies. Five actually achieved statistical significance in the opposite direction.

Young and Karr cite several factors:

– Multiple comparisons and tests. Reviewing data for numerous possible correlations, differences, etc. is bound to yield something of significance unless (1) the questions are stated in advance and (2) the significance threshold is made more stringent as more questions are asked.

– Bias. Systematic error results from missing factors, confounding variables, and subject attrition.

– Multiple modeling. As with multiple testing, the more variables, variable combinations and interactions you try in your model, the greater the probability that you will derive a “significant” model just by chance.
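The multiple-testing factor above can be illustrated with a small simulation. This is a minimal sketch, not Young and Karr's method: it assumes independent comparisons of two groups drawn from the same normal distribution with known unit variance (so a simple z-test applies), and it uses a Bonferroni correction (dividing the threshold by the number of tests) as one common way to tighten the threshold as more questions are asked. All function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def z_test_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means, assuming known unit variance."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    z = diff / math.sqrt(1.0 / n_a + 1.0 / n_b)
    return math.erfc(abs(z) / math.sqrt(2.0))


def familywise_error_rate(n_tests, alpha, n_studies=1000, n_per_group=20, seed=0):
    """Fraction of simulated studies reporting at least one p-value below alpha
    when every comparison is pure noise (no real effect exists)."""
    rng = random.Random(seed)
    studies_with_false_positive = 0
    for _ in range(n_studies):
        for _ in range(n_tests):
            a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
            b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
            if z_test_p_value(a, b) < alpha:
                studies_with_false_positive += 1
                break  # one spurious "finding" is enough for this study
    return studies_with_false_positive / n_studies


N_TESTS = 20   # illustrative number of comparisons per study (assumption)
ALPHA = 0.05   # conventional per-test threshold

naive = familywise_error_rate(N_TESTS, ALPHA)
bonferroni = familywise_error_rate(N_TESTS, ALPHA / N_TESTS)

print(f"Uncorrected threshold: {naive:.1%} of noise-only studies find something")
print(f"Bonferroni threshold:  {bonferroni:.1%} of noise-only studies find something")
```

With the uncorrected threshold, well over half of the noise-only studies should report at least one "significant" difference, while the corrected threshold should bring that share back to around the nominal 5%. The correction addresses only multiple testing; it does nothing about bias or confounding.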

The problem is serious – much medical advice (whether from your family physician, periodicals like the Harvard Health Letter, or the popular press) draws upon observational studies like the ones that Young and Karr found had a 0-for-52 track record. Much energy is later spent retracting the advice, which drains confidence from the system.

The solution? Read the article, but you can guess from the title that Young and Karr recommend a Deming-like “process control” solution. “Reproducible research” is the byword.

http://www.significancemagazine.org/details/magazine/1324539/Deming-data-and-observational-studies.html

