Big Data and Clinical Trials in Medicine

A couple of weeks ago, the New York Times Magazine ran an interesting article on the role that Big Data can play in treating patients: discovering things that clinical trials are too slow, too expensive, and too blunt to find. The story concerned a very particular set of lupus symptoms, and how a doctor, acting on a hunch, searched a large database and found that those symptoms were associated with an increased propensity for blood clots.

However, a search of the medical literature turned up nothing on the subject. What to do? The patient was treated with anticoagulant medication and did not develop a blood clot. Of course, this does nothing to prove that the association was real in the first place: the patient might never have developed a clot anyway. And on the flip side of the coin lies recent research on the non-replicability of scientific findings.

A recent study looked at more than four dozen health claims that researchers had arrived at by examining existing data for possible associations, not by conducting controlled experiments. These claims all had one thing in common: each was later tested in a controlled experiment. Astonishingly, not one of the claims held up.

Various reasons have been posited for the parlous state of scientific and medical research, including fraud and outright error, but a key issue is what statisticians call the “multiple comparisons problem.” Even in completely randomly generated data, interesting patterns appear: when enough possible associations are examined, some will look statistically significant by chance alone. If the data are big enough and the search exhaustive enough, the patterns can be very compelling.
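The multiple comparisons problem is easy to reproduce. The following is a minimal sketch, not taken from any of the studies mentioned here: it generates a purely random outcome and several hundred purely random patient characteristics, then counts how many characteristics appear "significantly" associated with the outcome. The function names, sample sizes, and the 0.05 threshold are all assumptions chosen for illustration, and the test used is a simple two-proportion z-test under the normal approximation.

```python
import math
import random


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for a difference in proportions (normal approximation)."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (hits_a / n_a - hits_b / n_b) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def count_spurious_associations(n_patients=2000, n_features=400,
                                alpha=0.05, seed=0):
    """Count random features that look 'significantly' tied to a random outcome."""
    rng = random.Random(seed)
    # Hypothetical outcome (say, a blood clot), generated independently
    # of every feature, so any association found is a fluke by construction.
    outcome = [rng.random() < 0.10 for _ in range(n_patients)]
    significant = 0
    for _ in range(n_features):
        # Hypothetical yes/no patient characteristic, pure noise.
        feature = [rng.random() < 0.50 for _ in range(n_patients)]
        with_f = [o for o, f in zip(outcome, feature) if f]
        without_f = [o for o, f in zip(outcome, feature) if not f]
        p = two_proportion_p_value(sum(with_f), len(with_f),
                                   sum(without_f), len(without_f))
        if p < alpha:
            significant += 1
    return significant


if __name__ == "__main__":
    hits = count_spurious_associations()
    print(f"{hits} of 400 random features look significant at the 0.05 level")
```

With a 0.05 threshold, roughly one in twenty of the noise features should be expected to pass, so an exhaustive search of a large database will almost always turn up something that looks like a finding.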

So was the lupus association for real, or a fluke of Big Data? There’s no way to know, ex post. The best we can do is to conduct what Lopiano, Obenchain and Young call “fair comparisons.” One principle is that the researcher should begin with a hypothesis to be tested, then proceed to test it on the available data, without letting any knowledge of the outcomes guide the analysis. This eliminates the erroneous results that arise when you simply “look for something interesting until you find it.”

There remains the problem of hidden differences among patients, a problem that, in large controlled experiments, is effectively “washed out” by the random assignment process. Random assignment of treatment is not possible in observational data, so Lopiano et al. propose clustering the patients into relatively homogeneous, and possibly quite small, clusters, within which the effects of treatments can be examined for groups of similar patients. In this way, different treatment effects for different sorts of patients can be identified.
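The value of comparing like with like can be sketched with simulated data. This is not the procedure of Lopiano et al.; it is a simplified illustration in which the "clusters" are just three hypothetical severity levels rather than the output of a real clustering algorithm, and every probability below is invented. Sicker patients are more likely to be treated and more likely to have a bad outcome, so a naive comparison makes a helpful treatment look harmful, while the within-cluster comparisons recover the benefit and show that it differs by patient type.

```python
import random
from collections import defaultdict

# All numbers are invented for illustration (keyed by severity level 0-2).
BASE_RISK = {0: 0.10, 1: 0.30, 2: 0.60}        # risk of a bad event, untreated
TREAT_PROB = {0: 0.10, 1: 0.50, 2: 0.90}       # sicker patients get treated more
TREAT_EFFECT = {0: -0.05, 1: -0.10, 2: -0.20}  # true change in risk if treated


def simulate_patients(n=30000, seed=1):
    """Return a list of (severity, treated, event) tuples."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        severity = rng.choice([0, 1, 2])
        treated = rng.random() < TREAT_PROB[severity]
        risk = BASE_RISK[severity] + (TREAT_EFFECT[severity] if treated else 0.0)
        event = rng.random() < risk
        patients.append((severity, treated, event))
    return patients


def risk_difference(rows):
    """Event rate among treated minus event rate among untreated.

    Assumes rows contains at least one treated and one untreated patient.
    """
    treated = [event for _, t, event in rows if t]
    untreated = [event for _, t, event in rows if not t]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def within_cluster_differences(patients):
    """Risk difference computed separately inside each severity cluster."""
    clusters = defaultdict(list)
    for row in patients:
        clusters[row[0]].append(row)
    return {key: risk_difference(rows) for key, rows in sorted(clusters.items())}


if __name__ == "__main__":
    patients = simulate_patients()
    print(f"naive risk difference: {risk_difference(patients):+.3f}")
    for severity, diff in within_cluster_differences(patients).items():
        print(f"severity {severity}: {diff:+.3f}")
```

In real observational data the relevant patient characteristics are not handed to the analyst as a single label, which is why the choice of clustering variables matters so much; the sketch only shows why the within-cluster view can differ sharply from the pooled one.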

The moral? Rapid growth in the digitization and availability of patient data and health data in general holds great potential for medical research and personalized medicine. However, appropriate statistical methodology and sound study design are needed to unlock this potential, and guard against error.

About Statistics.com

Statistics.com offers academic and professional education in statistics, analytics, and data science at beginner, intermediate, and advanced levels of instruction. Statistics.com is a part of Elder Research, a data science consultancy with 25 years of experience in data analytics.

 The Institute for Statistics Education is certified to operate by the State Council of Higher Education for Virginia (SCHEV)

© Copyright 2023 - Statistics.com, LLC | All Rights Reserved | Privacy Policy | Terms of Use

By continuing to use this website, you consent to the use of cookies in accordance with our Cookie Policy.

Accept