Statistical Glossary
Classification and Regression Trees (CART): Classification and regression trees (CART) are a set of techniques for classification and prediction. They aim to find one or more rules that predict the value of a dependent variable from known values of explanatory variables (predictors). The predictor variables may be a mixture of categorical and continuous variables.
The initial data represent a set of objects with known values of the dependent variable and predictors. CART builds trees - i.e. it formulates simple if/then rules for recursive partitioning (splitting) of all the objects into smaller subgroups. Each such step may give rise to new "branches". The goal of this process is to maximize the homogeneity of the values of the dependent variable within each of the resulting subgroups.
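As an illustration of a single partitioning step, here is a minimal sketch in Python. It assumes one continuous predictor, a categorical dependent variable, and Gini impurity as the measure of (in)homogeneity; the glossary entry does not name a specific measure, so that choice, the function names gini and best_split, and the toy data are all assumptions made for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a group: 0.0 when every label is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, ys):
    """Find the rule 'x <= threshold' that gives the most homogeneous subgroups.

    Returns (threshold, weighted_impurity). The threshold is None when no
    candidate split improves on the unsplit group.
    """
    best = (None, gini(ys))
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between adjacent distinct predictor values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Impurity of the two subgroups, weighted by their sizes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data: predictor = age, dependent variable = purchase (yes/no).
ages = [22, 25, 30, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, bought)
print(f"if age <= {threshold} then 'no' else 'yes' (impurity {impurity})")
```

A full tree would apply the same search recursively to each resulting subgroup, over all predictors, until a stopping rule is met; that part is omitted here.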
All the CART techniques are essentially non-parametric - they do not rely on any particular assumptions about the form of the dependence of the dependent variable on the predictors (in contrast to various regression techniques) or about the statistical properties of the data. This is an important practical advantage in cases where a priori information about the data is limited.
There are two main approaches in CART - classification trees (used to predict the class or category of records) and regression trees (used to predict a continuous value).
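The difference between the two approaches shows up most clearly in what a terminal subgroup (leaf) predicts. The sketch below assumes the common convention that a classification leaf predicts the most frequent class among its records and a regression leaf predicts the mean of their values; the function names and sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean

def classification_leaf(labels):
    """Predict the most frequent class among the records in the leaf."""
    return Counter(labels).most_common(1)[0][0]

def regression_leaf(values):
    """Predict the mean of the continuous values in the leaf."""
    return mean(values)

# Hypothetical leaf contents.
class_prediction = classification_leaf(["spam", "ham", "spam"])
value_prediction = regression_leaf([10.0, 20.0, 30.0])
print(class_prediction, value_prediction)
```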
Also see: classification trees, regression trees, CHAID.
And the short course Introduction to Data Mining

