Statistical Word of the Week
Week #30 - Discriminant analysis
In discriminant analysis, the values of several attributes (variables) are measured for each object, forming the columns of a data matrix in which each row is an object. A linear classification function is then developed that maximizes the ratio of between-class variability to within-class variability. The function measures the statistical distance between an observation and each class, and is used to assign each object to a class.
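One common way to write this down is Fisher's linear discriminant, sketched below. The symbols are assumptions introduced here for illustration, not notation from the glossary entry: w is the vector of weights in the linear function, S_B and S_W are the between-class and within-class scatter matrices, mu_k is the mean of class k, and Sigma is the pooled within-class covariance matrix.

```latex
% Criterion maximized by the linear classification function
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \, \mathbf{w}}{\mathbf{w}^{\top} S_W \, \mathbf{w}}

% Statistical (Mahalanobis) distance from observation x to class k
d_k^2(\mathbf{x}) = (\mathbf{x} - \boldsymbol{\mu}_k)^{\top} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}_k)
```

Under this formulation, and assuming the classes are equally likely beforehand, an observation is assigned to the class with the smallest distance.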
For example, a rule is desired to distinguish between responders and non-responders to a particular medication for multiple sclerosis. The medication has potentially harmful side effects, so it is desirable to discontinue its use in non-responders (while not removing responders from the medication). We could measure:
- Number of brain lesions in the past month
- Average brain lesions per month since medication started
- Average brain lesions per month before medication started
- Average number of seizures per month since medication started
- Average number of seizures per month before medication started
Discriminant analysis seeks to establish a rule that accurately divides patients into responders and non-responders based on the above variables. Typically, the rule is established using one portion of the data (the training data) and then tested on a separate portion (the test data) to check how well it classifies patients it has not seen.
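The train-then-test workflow described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a clinical method: the data are synthetic random numbers standing in for the five measurements, the class means are made up, and the helper names fit_lda and predict_lda are hypothetical. It assumes two classes with equal prior probabilities and a shared covariance structure.

```python
import numpy as np

def fit_lda(X, y):
    """Fit a two-class linear discriminant (equal priors assumed)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # Direction that maximizes between-class relative to within-class variability
    w = np.linalg.solve(Sw, mu1 - mu0)
    # Cut point halfway between the two projected class means
    threshold = 0.5 * (w @ mu0 + w @ mu1)
    return w, threshold

def predict_lda(X, w, threshold):
    """Return 1 where the discriminant score exceeds the cut point, else 0."""
    return (X @ w > threshold).astype(int)

rng = np.random.default_rng(0)
n = 200  # patients per class (synthetic)

# Five columns stand in for the five measurements; the means are invented.
non_responders = rng.normal(loc=[4, 4, 4, 3, 3], scale=1.0, size=(n, 5))
responders = rng.normal(loc=[1, 2, 4, 1, 3], scale=1.0, size=(n, 5))
X = np.vstack([non_responders, responders])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])  # 1 = responder

# Shuffle, then keep 70% for training and 30% for testing
idx = rng.permutation(len(y))
split = int(0.7 * len(y))
train, test = idx[:split], idx[split:]

w, threshold = fit_lda(X[train], y[train])
predictions = predict_lda(X[test], w, threshold)
accuracy = (predictions == y[test]).mean()
print(f"Accuracy on the test portion: {accuracy:.2f}")
```

Because the rule is scored only on patients it was not fitted to, the reported accuracy is a fairer estimate of how the rule would perform on new patients than accuracy on the training data would be.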
The Institute for Statistics Education offers an extensive glossary of statistical terms, available to all for reference and research.