Mar 24: Statistics in Practice

In this week’s Brief, we look again at the statistics of Coronavirus. We also spotlight our Health Analytics Mastery – a 3-course series in which you can choose from among Biostatistics 1 and 2; Designing Valid Statistical Studies; Epidemiologic Statistics; and Introduction to Statistical Issues in Clinical Trials. You can start July 1 with Biostatistics … Continue reading “Mar 24: Statistics in Practice”

Covid-19 Parameters

There are many moving parts in modeling the spread of an epidemic, a subject that has lately attracted the attention of great numbers of statistically-oriented non-epidemiologists (like me). I’ve put together a “lay statistician’s guide” to some of the important parameters and factors (and I welcome corrections/additions!). Terms: Case fatality rate or CFR: Deaths as … Continue reading “Covid-19 Parameters”

Coronavirus – in Search of the Elusive Denominator

Anyone with internet access these days has their eyes on two constellations of data – the spread of the coronavirus, and the resulting collapse of the financial markets. Following the 13% one-day drop of the stock market a week ago, The Wall Street Journal forecast a quarterly GDP drop of as much as 10% – … Continue reading “Coronavirus – in Search of the Elusive Denominator”

Coronavirus: To Test or Not to Test

In recent years, under the influence of statisticians, the medical profession has dialed back on screening tests. With relatively rare conditions, widespread testing yields many false positives and doctor visits, whose collective cost can outweigh benefits. Coronavirus advice follows this line – testing is limited to the truly ill (this is also due to a … Continue reading “Coronavirus: To Test or Not to Test”
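
To see why widespread testing for a rare condition yields so many false positives, here is a minimal sketch using Bayes' rule. The prevalence, sensitivity, and specificity are made-up illustrative numbers, not figures for any actual coronavirus test, and the function name is just for this example.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive test results that are true positives (Bayes' rule)."""
    true_pos = prevalence * sensitivity               # ill and test positive
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy but test positive
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: 1% of those tested have the condition, and the test
# is 95% sensitive and 95% specific.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.95, specificity=0.95)
print(f"Share of positives that are real: {ppv:.1%}")
```

Under these assumed numbers, most positive results come from healthy people, simply because healthy people so greatly outnumber the ill.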

Regularized Model

In building statistical and machine learning models, regularization is the addition of penalty terms on the size of predictor coefficients to the fitting criterion, to discourage complex models that would otherwise overfit the data. An example is ridge regression, which penalizes the sum of squared coefficients.
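
As a rough illustration of how a penalty shrinks a coefficient, here is a minimal sketch of ridge regression for the simplest case: one predictor, with both variables centered so the intercept drops out. In that case the ridge slope is the sum of cross-products divided by (sum of squares of x + lambda). The data and the name `ridge_slope` are invented for this example.

```python
def ridge_slope(x, y, lam):
    """Ridge estimate of the slope for a single centered predictor.

    lam is the penalty weight; lam = 0 gives ordinary least squares.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / (sxx + lam)  # the penalty enlarges the denominator


# Toy data (hypothetical): y is exactly 2 * x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

for lam in (0, 1, 10, 100):
    print(f"lambda = {lam:>3}: slope = {ridge_slope(x, y, lam):.3f}")
```

The larger the penalty, the more the slope is pulled toward zero; in practice the penalty weight is usually chosen by cross-validation.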

Big Sample, Unreliable Result

Which would you rather have? A large sample that is biased, or a representative sample that is small? The American Statistical Association committee that reviewed the 1948 Kinsey report on male sexual behavior, based on interviews with over 5000 men, left no doubt of their preference for the latter. The statisticians – William Cochran, Frederick … Continue reading “Big Sample, Unreliable Result”
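
The large-biased versus small-representative tradeoff can be explored with a small simulation. This is only a sketch under invented assumptions: a population in which 30% have some trait, a large "volunteer" sample in which people with the trait are twice as likely to take part, and a simple random sample of 500. None of these numbers comes from the Kinsey data.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 people; about 30% have the trait (coded 1)
population = [1 if random.random() < 0.30 else 0 for _ in range(100_000)]
true_rate = sum(population) / len(population)

# Large biased sample: people with the trait volunteer at twice the rate
# (20% versus 10%) of people without it.
biased_sample = [
    person for person in population
    if random.random() < (0.20 if person else 0.10)
]
biased_estimate = sum(biased_sample) / len(biased_sample)

# Small representative sample: a simple random sample of 500 people
small_sample = random.sample(population, 500)
small_estimate = sum(small_sample) / len(small_sample)

print(f"True rate:                        {true_rate:.3f}")
print(f"Biased sample (n={len(biased_sample)}):     {biased_estimate:.3f}")
print(f"Random sample (n={len(small_sample)}):       {small_estimate:.3f}")
```

The point of the sketch is that collecting more volunteers does not remove the bias; it only produces a more precise estimate of the wrong number.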

Problem of the Week: Notify or Don’t Notify?

Our problem of the week is an ethical dilemma, posed by the New England Journal of Medicine to its readers 10 days ago. Volunteers contributed DNA samples to investigators building a genetic database for study, on condition the data would be deidentified and kept confidential and that they themselves would not learn results. Should participants … Continue reading “Problem of the Week: Notify or Don’t Notify?”

Factor

The term “factor” has several meanings in statistics, and they can be confusing because they conflict. In statistical programming languages like R, factor acts as an adjective, used synonymously with categorical – a factor variable is the same thing as a categorical variable. These factor variables have levels, which are the same thing as categories (a … Continue reading “Factor”