Puzzle: Surgery or Radiation

Several decades ago, the dominant therapies for lung cancer were radiation, which offered better short-term survival rates, and surgery, which offered better long-term rates. A thought experiment was conducted in which surgeons were randomly assigned to one of two groups and asked whether they would choose surgery.

Group 1 was told: The one-month survival rate is 90%.
Group 2 was told: There is 10% mortality in the first month.

Yes, the two statements say the same thing. What did the two physician groups choose?

Problem of the Week: Notify or Don’t Notify?

Our problem of the week is an ethical dilemma, posed by the New England Journal of Medicine to its readers 10 days ago. Volunteers contributed DNA samples to investigators building a genetic database for study, on condition that the data would be deidentified and kept confidential and that they themselves would not learn the results. Should participants… Continue reading “Problem of the Week: Notify or Don’t Notify?”

Problem of the Week: Simpson’s Paradox – baseball

Question: A baseball team is comparing two of its hitters, Hernandez and Dimock. Hernandez hit .250 in 2017 and .275 in 2018. Dimock did worse in both years: .245 in 2017 and .270 in 2018. Overall, though, Dimock hit better across the two years, .263 versus .258 for Hernandez. How can this be? Answer: … Continue reading “Problem of the Week: Simpson’s Paradox – baseball”
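One way to see how these averages can coexist is to try concrete numbers. The sketch below uses hypothetical at-bat and hit counts (they are not from the original post; they were chosen only so that the rounded averages match the figures in the question). It suggests the reversal can arise when the two hitters have very different shares of their at-bats in the higher-average year.

```python
# Hypothetical (at_bats, hits) per season. These counts are assumptions
# picked to reproduce the averages quoted in the question.
seasons = {
    "Hernandez": {2017: (340, 85), 2018: (160, 44)},
    "Dimock": {2017: (200, 49), 2018: (500, 135)},
}


def batting_average(at_bats: int, hits: int) -> float:
    """Hits divided by at-bats, rounded to three decimals."""
    return round(hits / at_bats, 3)


def combined_average(player: str) -> float:
    """Pool at-bats and hits across seasons before dividing."""
    total_at_bats = sum(ab for ab, _ in seasons[player].values())
    total_hits = sum(h for _, h in seasons[player].values())
    return batting_average(total_at_bats, total_hits)


for player, by_year in seasons.items():
    yearly = {year: batting_average(ab, h) for year, (ab, h) in by_year.items()}
    print(player, yearly, "combined:", combined_average(player))
```

With these assumed counts, Dimock takes most of his at-bats in 2018, when both hitters batted better, while Hernandez takes most of his in 2017. The pooled average weights each season by at-bats, so Dimock's combined figure is pulled toward his stronger year.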

Problem of the Week: The Value of Bedrooms

Question: You work for an internet real-estate company, building statistical models to predict home price on the basis of square footage, number of bedrooms, number of bathrooms, property type (single family home, townhouse, multiplex), and age. Surprisingly, you find the coefficient for bedrooms is negative, meaning that adding bedrooms decreases value. What might account for… Continue reading “Problem of the Week: The Value of Bedrooms”

Problem of the Week: Missing Data

Question: You have a supervised learning task with 30 predictors, in which 5% of the observations are missing. The missing data are randomly distributed across variables and records. If your strategy for coping with missing data is to drop records with missing data, what proportion of the records will be dropped? Is the assumption of… Continue reading “Problem of the Week: Missing Data”
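The arithmetic behind this question can be sketched in a few lines. The sketch assumes one reading of the setup: each individual value is missing with probability 5%, independently of every other value. Under that assumption a record is kept only if all 30 of its predictor values are present. The function name is illustrative, not from the original post.

```python
def fraction_of_records_dropped(n_predictors: int, p_missing: float) -> float:
    """Share of records with at least one missing value.

    Assumes each value is missing independently with probability
    p_missing, so a record is complete with probability
    (1 - p_missing) ** n_predictors.
    """
    return 1.0 - (1.0 - p_missing) ** n_predictors


dropped = fraction_of_records_dropped(n_predictors=30, p_missing=0.05)
print(f"Fraction of records dropped: {dropped:.1%}")  # prints 78.5%
```

Under this independence assumption, dropping incomplete records would discard roughly four out of five rows. If missingness is instead concentrated in a few records or a few variables, the fraction dropped could be much smaller.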