Big Sample, Unreliable Result

Which would you rather have: a large sample that is biased, or a representative sample that is small? The American Statistical Association committee that reviewed the 1948 Kinsey report on male sexual behavior, based on interviews with over 5,000 men, left no doubt of its preference for the latter. The statisticians – William Cochran, Frederick Mosteller, John Tukey, and W. O. Jenkins – were leaders in their profession, and they identified multiple sources of bias in the Kinsey data collection effort. Participation was voluntary and driven to some degree by referral, leading to self-selection bias. Prison populations were substantially over-represented. One result was an overestimate of the prevalence of homosexuality among men. Tukey dismissively said that he would put greater stock in a randomly selected sample of 3 than in 300 men selected by Kinsey.

Nonetheless, Sample Size Matters

On the other hand, sample size does matter, even if it is secondary to proper sample selection methods.  As Daniel Kahneman put it in Thinking, Fast and Slow:

The exaggerated faith in small samples is only one example of a more general illusion – we pay more attention to the content of messages than to information about their reliability, and as a result end up with a view of the world around us that is simpler and more coherent than the data justify.

The smaller the sample, the more it is prone to misinterpretation.  Random variation makes it unreliable as a tool for estimation, and also gives scope for interesting chance events to attract the attention of the investigator.

How Big?

How big should your sample be?  You can find general guidance associated with particular tasks (polling, auditing, behavioral studies), but a more analytical approach exists, based on the principles of statistical inference.

This approach presumes that you are gathering data to investigate a hypothesis, typically concerning the effect some condition or treatment has on subjects, an effect that shows up in a difference between, or among, groups that experience different treatments or conditions.  The basic idea is to gather a sample that is big enough to assure you that, if the effect you are investigating exists, your study will find it. This involves balancing three parameters set by the user:

  • Effect size
  • Level of significance
  • Power

Setting the Parameters

Effect size:  The smaller the effect size you hope to find, the bigger the sample needed.  A useful analogy is finding stars with a telescope – the dimmer the star, the bigger the telescope you need to distinguish it.  “Effect size” is the difference you hope exists in the population(s) you are investigating. For continuous numeric data, it would be expressed as a difference in means of the distributions.  What does “find” mean? Here it means to conclude that there is a statistically significant difference, or effect. For example, if you are testing two different colors for a “buy” button on a web site, finding a difference means that a difference between two groups of web users experiencing different colors is statistically significant at a pre-chosen level of significance.  
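To make "finding a difference" concrete, here is a minimal sketch of the button-color example, using a pooled two-proportion z-test with a normal approximation. The conversion counts are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided p-value for comparing two proportions, using a
    pooled two-proportion z-test (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                 # pooled conversion rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical button-color test: 120/1000 vs. 150/1000 conversions
p_value = two_proportion_ztest(120, 1000, 150, 1000)
print(round(p_value, 3))  # close to the conventional 0.05 threshold
```

With these (made-up) counts the p-value sits right around 0.05, a reminder that "finding" an effect depends on the significance level you chose in advance.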

Level of Significance:  The “tighter” the definition of statistical significance (e.g., 0.01 instead of 0.05), the bigger the sample needed.  P-values and the whole idea of statistical significance have fallen into some disfavor as a result of their abuse: as the number of academic researchers seeking to publish has risen, the p-value has become a “necessary and sufficient” publishing criterion, opening the door to great numbers of published studies whose only “virtue” is that they contain a statistically significant result, while lacking practical significance or proper study design.  Nevertheless, calculating sample size requires settling on a significance level as the criterion for declaring a finding valid.

Power:  Power is the probability of achieving a statistically significant result in a sample study, if the specified effect size is real in the population being studied.  For example, if a medication has a real effect of reducing blood pressure by 10%, and you conduct a study (at your specified significance level) between a medication group and a control group, power is the probability that the study will return a result of “significant.”  Note that the study does not necessarily have to yield a 10% difference between the two groups – rather it simply has to yield a statistically significant difference. The more power you seek, the bigger the sample needed.

Tradeoffs

Specifying the three parameters is an exercise in tradeoffs.  The smaller the effect you want to be able to find, and the greater the power (probability of finding that effect), the bigger the sample you need.  If your initial goals with respect to these key parameters yield a sample requirement that is beyond your budget or capability, you must compromise something; that is, be willing to set a larger effect size threshold (meaning that you might well miss a desired effect), or you must tolerate a lower power, or both.  The level of statistical significance is not so malleable; it is usually set by external requirements, e.g. regulators or journal publishers who often specify a traditional level of 5%.

Variance

Setting the three parameters is a necessary, but not sufficient condition to find sample size.  A fourth factor affecting sample size is the variance in the data. This, of course, is not a parameter set by the user.  The greater the variance in the data, the greater the sample size needed to identify a given effect of interest. Thus, any estimate of required sample size must necessarily incorporate an assumption about variance in the data.  This might be estimated from earlier samples of data, or from knowledge about the process or population involved.

Putting it All Together

Once you have some estimate of the variance in the data, you can calculate sample size via a resampling simulation, illustrated here for the case of two samples of continuous numeric data:

  1. Specify the desired effect size, level of significance, and power.
  2. Specify two random data generators that produce normally distributed data from populations whose means differ by the desired effect size, with variance as estimated from prior information.*
  3. Generate two samples of size n1, one from each generator.
  4. Conduct a significance test on the two samples; record whether the difference is significant.
  5. Repeat steps 3–4, say, 1,000 times; the proportion of trials in which the difference is significant is the power.
  6. If the power is right on target, n1 is the appropriate sample size. If the power is too low, increase the sample size; if it is higher than needed, reduce it.
  7. Iteratively try different values of n until the power is where you need it.

*If you actually have real data appropriate to the study, you can substitute two bootstrap generators (one shifted by the effect size) for the normally distributed data generators.
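The procedure above can be sketched in code. This is a minimal illustration, assuming normally distributed data and using a normal (z) approximation in place of a full t-test; the effect size, standard deviation, and candidate sample sizes are hypothetical choices.

```python
import math
import random

def simulate_power(n, effect=0.5, sd=1.0, alpha=0.05, trials=1000, seed=1):
    """Estimate power by repeatedly drawing two normal samples whose
    population means differ by `effect`, testing each pair for a
    significant difference (z approximation), and counting the hits."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sd) for _ in range(n)]     # control group
        b = [rng.gauss(effect, sd) for _ in range(n)]  # treatment group
        mean_a, mean_b = sum(a) / n, sum(b) / n
        var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
        var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
        z = (mean_b - mean_a) / math.sqrt(var_a / n + var_b / n)
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        if p < alpha:
            hits += 1
    return hits / trials

# Step 7: try several candidate sample sizes and watch the power climb
for n in (20, 40, 64, 100):
    print(n, simulate_power(n))
```

Running this shows power rising steadily with n; you would stop at the smallest n whose estimated power meets your target.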

In most cases, power and sample size will be determined by software implementing standard formulas, though the bootstrap simulation approach can be used where the situation and the statistic of interest do not fit the scenarios the software supports.
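As an example of the formula-based route, the standard normal-approximation formula for two groups with continuous data gives a per-group sample size of n = 2 * ((z_alpha/2 + z_power) * sd / effect)^2. A minimal sketch using only the standard library (exact t-based software will give slightly larger answers):

```python
import math
from statistics import NormalDist

def sample_size_two_means(effect, sd, alpha=0.05, power=0.80):
    """Per-group sample size for detecting a difference `effect`
    between two means with common standard deviation `sd`,
    using the standard normal-approximation formula."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)

# Roughly 63 per group for a half-standard-deviation effect
print(sample_size_two_means(effect=0.5, sd=1.0, alpha=0.05, power=0.80))
```

Note the tradeoffs discussed earlier show up directly: halving the effect size quadruples the required n, and raising the power target from 0.80 to 0.90 also increases it.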
