In deployed machine learning pipelines, “drift” refers to changes in the model environment that cause model performance to degrade over time. Drift might result from data quality changes, for example, increasing amounts of missing values in the input data. Or a company might alter the definitions of categories (e.g. product groupings) that are features in … Continue reading “Word of the Week – Drift”
Category Archives: Word of the Week
Word of the Week – Label Spreading
A common problem in machine learning is the “rare case” situation. In many classification problems, the class of interest (fraud, purchase by a web visitor, death of a patient) is rare enough that a data sample may not have enough instances to generate useful predictions. One way to deal with this problem is, in essence, … Continue reading “Word of the Week – Label Spreading”
Word of the Week – Incidence versus Prevalence
Epidemiological terms are top of mind now, due to the pandemic. Here are two that are often confused: incidence and prevalence. For example, I encountered the following sentence on a popular medical web site: “Knee meniscal injuries are common with an incidence of 61 cases per 100,000 persons and a prevalence of 12% to 14%.” I … Continue reading “Word of the Week – Incidence versus Prevalence”
Words of the Week – Inference and Confidence
An often-overlooked basic part of learning new things is vocabulary: if you don’t fully understand the meaning of terms, you are handicapped. Worse, if you think you do understand, but that understanding is wrong, you are deprived of the ability to identify the gap in your understanding. This can happen in data science, where different … Continue reading “Words of the Week – Inference and Confidence”
Word of the Week – Ruin Theory
The classic Gambler’s Ruin puzzle has an actuarial parallel: “Ruin Theory,” the calculations that govern what an insurance company should charge in premiums to reduce the probability of “ruin” for a given insurance line. “Ruin” means encountering claims that exhaust initial reserves plus accumulated premiums. The process can be depicted as a time plot, where … Continue reading “Word of the Week – Ruin Theory”