The kappa statistic is a generic term for several similar measures of agreement used with categorical data. Typically it is used to assess the degree to which two or more raters, examining the same data, agree in assigning the data to categories. For example, kappa might be used to assess the extent to which (1) a radiologist's analysis of an X-ray, (2) computer analysis of the same X-ray, and (3) a biopsy agree in labeling a growth “malignant” or “benign.”
Suppose each object in a group of M objects is assigned to one of n categories, where the categories are at the nominal scale. For each object, such assignments are made by k raters.
The kappa measure of agreement is the ratio

K = (P(A) − P(E)) / (1 − P(E)),
where P(A) is the proportion of times the k raters agree, and P(E) is the proportion of times the k raters are expected to agree by chance alone.
Complete agreement corresponds to K = 1, and lack of agreement (i.e., purely random coincidences of ratings) corresponds to K = 0. A negative value of kappa would indicate negative agreement, that is, a propensity of the raters to avoid the assignments made by other raters.
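For the two-rater case (k = 2), the definitions above can be sketched in code. The following is a minimal illustration, not a reference implementation: P(A) is the observed proportion of objects on which the two raters agree, and P(E) is estimated from each rater's marginal category frequencies, as in Cohen's kappa. The function name and the example labels are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Kappa for two raters assigning the same M objects to categories."""
    assert len(ratings_a) == len(ratings_b)
    m = len(ratings_a)
    # P(A): observed proportion of objects on which the raters agree
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / m
    # P(E): chance agreement, from each rater's marginal category frequencies
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum((freq_a[c] / m) * (freq_b[c] / m) for c in freq_a)
    return (p_a - p_e) / (1 - p_e)

# Hypothetical example: two raters labeling ten growths
a = ["malignant", "benign", "benign", "malignant", "benign",
     "benign", "malignant", "benign", "benign", "benign"]
b = ["malignant", "benign", "benign", "benign", "benign",
     "benign", "malignant", "benign", "malignant", "benign"]
print(round(cohen_kappa(a, b), 3))  # → 0.524
```

Here the raters agree on 8 of 10 objects, so P(A) = 0.8, while the marginal frequencies give P(E) = 0.3·0.3 + 0.7·0.7 = 0.58, yielding K = (0.8 − 0.58)/(1 − 0.58) ≈ 0.524.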