Blog

Posted on Oct 11, 2019 By: Peter Bruce
Gambler’s Fallacy I: forgetting that the “coin has no memory”
Gamblers often believe that after a long streak of one outcome, the probability of a different outcome has increased. Sports commentators often say that a batter in a slump is “due” for a hit. Psychologically, they think that an outcome opposite to the streak of poor hitting is needed to balance the sample and bring the data into conformity with the known long-term average. In the Gambler’s Fallacy I, the ga...
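The “coin has no memory” idea can be checked with a small simulation. The sketch below is illustrative only and is not taken from the post: the function name, the streak length of 5, and the number of flips are all assumptions. It flips a simulated fair coin many times and looks only at the flips that come right after a run of tails.

```python
import random

def next_flip_after_streak(n_flips=200_000, streak_len=5, seed=0):
    """Estimate P(heads) on the flip that immediately follows a run of
    at least `streak_len` consecutive tails (hypothetical helper)."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    run = 0                        # current count of consecutive tails
    trials = 0                     # flips that followed a qualifying streak
    heads_after = 0                # how many of those flips were heads
    for _ in range(n_flips):
        is_heads = rng.random() < 0.5
        if run >= streak_len:      # the previous flips formed a tails streak
            trials += 1
            heads_after += is_heads
        run = 0 if is_heads else run + 1
    return heads_after / trials, trials

if __name__ == "__main__":
    p, trials = next_flip_after_streak()
    print(f"P(heads | preceded by 5+ tails) ~ {p:.3f} over {trials} streaks")
```

If the coin were “due” for heads after a streak of tails, the estimated proportion would sit well above one half. For independent fair flips it should stay close to one half, whatever streak length is chosen.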
Posted on Oct 04, 2019 By: Peter Bruce
Anyone who has worked in retail knows the anxiety that attends workforce scheduling for both manager and employee.  The manager wonders “Will my employees show up at the right times?” The employee wonders “Will I be scheduled for inconvenient times?  Enough hours? Too many hours?”   The ability of Uber and Lyft to attract drivers, despite average hourly wages that, according to one study, place them in the bottom tenth of all workers, lies in drivers’ ability to control their ow...
Posted on Oct 04, 2019 By: Peter Bruce
Does better AI offer the hope of prejudice-free decision-making?  Ironically, the reverse might be true, especially with the advent of deep learning.     Bias in hiring is one area where private companies move with great care, since there are thickets of laws and regulations in most countries governing bias in employment.  The total cost of recruiting, interviewing, reviewing candidates, and hiring is substantial (up to $50,000 for software engineers, by one estimate), so it is no wonder ...
Posted on Sep 27, 2019 By: Peter Bruce
Burtch Works, the analytics recruiting firm, publishes a periodic survey of tool use among analytics and data science professionals. The latest survey, from August, shows a striking pattern: Note the extreme dropoff in SAS preference - from 54% in the oldest group to just 9% among those with 1-5 years of experience. This dropoff probably exaggerates the likely future erosion of SAS in the market. Most experienced analysts use multiple tools, but the survey asked only about the one pre...
Posted on Sep 27, 2019 By: Peter Bruce
Independent truck drivers are a quintessential image in American blue-collar life. They have their own iconic sub-genre in country music, and are culturally the polar opposite of the hip urban tech-savvy professionals whose life and work are governed by the latest trends on the technology front. Mention Mountain View, and they are more likely to think of the Tennessee town in the Smokies than the California home of Google. But the gig economy has come t...
Posted on Sep 27, 2019 By: Peter Bruce
Statisticians have long been a quarrelsome bunch. A hundred years ago it was Pearson versus Yule, Fisher versus Neyman-Pearson. Subsequently it was Bayesians versus non-Bayesians. Now, with statistical practice settled into long-worn ruts, the “reproducibility crisis” has blossomed - the awkward fact that statistical hypothesis-testing methods are being used to validate bad science. Or worse - as Stan Young put it, “There are science crooks and statistical crooks and there are no ...
Posted on Sep 23, 2019 By: Peter Bruce
You know statistics has made it big when the literary magazine The New Yorker (Sept. 9, 2019) features an article by Hannah Fry that talks about big data, p-values, sub-group analysis, and effect size.  What Statistics Can and Can’t Tell Us About Ourselves discusses some of the “statistical truths” that should inform everyday professional life and decision-making.  The approach is anecdotal yet clear, touching on some illuminating examples: In 1998, Harold Shipman, a family doctor...
Posted on Sep 23, 2019 By: Peter Bruce
There are more than three dozen curses in Harry Potter. Data scientists have only one - the “curse of dimensionality.” Dimensionality is the number of predictors or input variables in a model, and the “curse” refers to the problems that result from including too many of these predictor variables (features) in a model. As variables are added, the data space becomes increasingly sparse, and classification and prediction models fail because...
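The sparsity claim can be illustrated with a short simulation. This is a rough sketch under assumptions that are not from the post: points are drawn uniformly from the unit hypercube, sparsity is measured by the average distance from each point to its nearest neighbor, and the function name and sample sizes are made up for illustration.

```python
import math
import random

def mean_nearest_neighbor_distance(n_points, n_dims, seed=0):
    """Average Euclidean distance from each point to its nearest neighbor,
    for points drawn uniformly from the unit hypercube (hypothetical helper)."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    total = 0.0
    for i, p in enumerate(pts):
        # Brute-force search over all other points; fine for small samples.
        total += min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
    return total / n_points

if __name__ == "__main__":
    # Same number of records each time; only the number of variables changes.
    for d in (1, 2, 10, 50):
        avg = mean_nearest_neighbor_distance(200, d)
        print(f"{d:>2} dimensions: mean nearest-neighbor distance = {avg:.3f}")
```

With the number of records held fixed, the nearest neighbor moves farther away as variables are added. This is one way to picture why methods that rely on "nearby" records have less to work with in high dimensions.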
Posted on Sep 13, 2019 By: Peter Bruce
Last week, the Trump administration announced a forthcoming ban on e-cigarettes, following news stories of a spate of deaths from vaping. The Wall Street Journal, on Friday the 13th, published both an editorial and an op-ed piece suggesting that any harm from e-cigarettes is minor and unproven, and counterbalanced by the good they do in helping smokers quit. That e-cigarettes help smokers quit seems reasonable. The Journal opinion writers therefore accept it as fact, as we a...
Posted on Sep 13, 2019 By: Peter Bruce
In this “small ball” space a month ago, I wondered why airlines did not take better advantage of information that is in their hands to reduce missed connections.  Since then, United has implemented new technology - “Connection Saver” - to do just that. Connection Saver is AI/Optimization software that makes decisions about whether to hold a flight for late connecting passengers, according to an article in the Seattle Times.  Presumably, elements of the algorithm would include Pr...