The Curious Statistician

Sep 13, 2011

Coffee causes cancer?

"Any claim coming from an observational study is most likely to be wrong."  Thus begins "Deming, data and observational studies," just published in "Significance Magazine" (Sept. 2011).

 

Statisticians have long been wary of scientific claims based on observational data - there is too much room to "torture the data long enough, until Nature confesses," in the words of Ronald Coase, the Nobel Prize-winning economist.  Fiddle with the variables and model parameters sufficiently, run enough comparisons, look at enough subgroups, and you can find statistical significance in almost any observed data.  Statisticians prefer controlled experiments, but those are expensive and hard to do, and most published epidemiological research is based on observational data.

Nonetheless, Stanley Young and Alan Karr of the National Institute of Statistical Sciences located 12 observational studies whose claims were subsequently tested in randomized controlled experiments.  The results?

There were 52 "statistically significant" claims arising from the original observational studies.  None replicated in the randomized controlled experiments.  Five actually achieved statistical significance in the opposite direction.

Young and Karr cite several factors behind this failure to replicate:

- Multiple comparisons and tests.  Reviewing data for numerous possible correlations, differences, etc. is bound to yield something of significance, unless (1) the questions are stated in advance, and (2) the significance threshold is made more stringent as more questions are asked.

- Bias.  Systematic error results from missing factors, confounding variables, and subject attrition.

- Multiple modeling.  As with multiple testing, the more variables, variable combinations and interactions you try in your model, the greater the probability that you will derive a "significant" model just by chance.
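The multiple-testing effect described above can be illustrated with a small simulation. This is a minimal sketch, not anything taken from Young and Karr's article: it assumes independent comparisons on pure noise, a two-sample z-test with known unit variance, and a Bonferroni correction (dividing the threshold by the number of tests) as one common way of raising the bar when more questions are asked. All function names, sample sizes and the random seed are illustrative choices.

```python
import math
import random


def two_sided_p_value(group_a, group_b):
    """Two-sample z-test p-value, assuming unit variance in both groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = sum(group_a) / n_a - sum(group_b) / n_b
    z = mean_diff / math.sqrt(1.0 / n_a + 1.0 / n_b)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def count_false_positives(num_tests, group_size, alpha, rng):
    """Compare two noise-only groups num_tests times.

    Returns (hits at plain alpha, hits at the Bonferroni threshold alpha / num_tests).
    Every hit is a false positive, because no real difference exists.
    """
    naive = 0
    corrected = 0
    for _ in range(num_tests):
        a = [rng.gauss(0.0, 1.0) for _ in range(group_size)]
        b = [rng.gauss(0.0, 1.0) for _ in range(group_size)]
        p = two_sided_p_value(a, b)
        if p < alpha:
            naive += 1
        if p < alpha / num_tests:
            corrected += 1
    return naive, corrected


def familywise_error_rate(num_tests, alpha):
    """Chance of at least one false positive among independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    rng = random.Random(2011)  # arbitrary seed, for repeatability only
    naive, corrected = count_false_positives(1000, 50, 0.05, rng)
    print("false positives at alpha = 0.05:", naive)
    print("false positives after Bonferroni:", corrected)
    print("chance of at least one false positive in 20 tests:",
          round(familywise_error_rate(20, 0.05), 3))
```

With no real effect anywhere, roughly 5% of the uncorrected comparisons should still come out "significant", while the corrected threshold should let through very few. The same logic applies to multiple modeling, where each candidate model or variable combination acts as another implicit test.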

The problem is serious - much medical advice (whether from your family physician, periodicals like the Harvard Health Letter or the popular press) draws upon observational studies like the ones that Young and Karr found had a 0-for-52 track record.  Much energy is later spent retracting the advice, which drains confidence from the system.

The solution?  Read the article, but you can guess from the title that Young and Karr recommend a Deming-like "process control" solution.  "Reproducible research" is the byword.

http://www.significancemagazine.org/details/magazine/1324539/Deming-data-and-observational-studies.html

© Statistics.com 2004-2012