The values of various attributes (variables) of an object are measured (the matrix columns) and a linear classification function is developed that maximizes the ratio of between-class variability to within-class variability. The function measures statistical distance between an observation and each class, and is used to assign a classification to each object..
For example, a rule is desired to distinguish between responders and non-responders to a particular medication for multiple sclerosis. The medication has potentially harmful side effects, so it is desirable to discontinue its use in non-responders (while not removing responders from the medication). We could measure:
- # of brain lesions in the past month
- Average brain lesions per month since medication started
- Average brain lesions per month before medication started
- Average number of seizures per month since medication started
- Average number of seizures per month before medication started
Discriminant analysis seeks to establish a rule that accurately divides patients into responders and non-responders based on the above variables. Typically, the rule will be established using a portion of the data (the training data) and tested on another portion of the data.