Glossary

Transformation

Transformation is the conversion of a data set into a transformed data set by applying a function to each value. Its statistical purpose is to produce data that better meet the requirements of a statistical procedure. A typical transformation is to take the log of each value, which requires all values to be positive; this shortens the long right tail of a right-skewed distribution and gives it a more nearly normal shape.
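As a rough illustration of the log transformation described above, the following Python sketch compares the skewness of a small right-skewed data set before and after taking logs. The data values and the `skewness` helper are hypothetical, made up for this illustration; the helper uses the simple moment-based definition (third central moment divided by the second central moment raised to the power 3/2).

```python
import math

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (hypothetical helper)."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Made-up, strictly positive data with a long right tail.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

# Apply the natural log to each value.
logged = [math.log(x) for x in raw]

print(f"skewness before log: {skewness(raw):.2f}")
print(f"skewness after log:  {skewness(logged):.2f}")
```

For these values the skewness stays positive after the transformation but is much smaller, which is the sense in which the log pulls in the right tail.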

Note that transformation can change the properties of a distribution in ways that might invalidate the analysis. For example, in the following data the ordering of the two sample means reverses after transformation:

A: 15.5, 25.4, 10.5, 13.8 (mean: 16.3)

B: 15.5, 13.2, 15.3, 18.4 (mean: 15.6)

Sample A has a larger mean than sample B.

After transforming using the natural log:

A: 2.741, 3.235, 2.351, 2.625 (mean: 2.738)

B: 2.741, 2.580, 2.728, 2.912 (mean: 2.740)

Sample B has a larger mean than sample A.
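The reversal can be checked with a short Python sketch that uses only the standard library and the sample values given above. The variable names are chosen for this illustration. The last step rests on a general fact not stated in the example itself: the mean of the logs is the log of the geometric mean, so comparing log-scale means amounts to comparing geometric means rather than arithmetic means.

```python
import math
from statistics import mean

# Sample values from the example above.
a = [15.5, 25.4, 10.5, 13.8]
b = [15.5, 13.2, 15.3, 18.4]

# Natural-log transform of each value.
log_a = [math.log(x) for x in a]
log_b = [math.log(x) for x in b]

print(f"raw means: A = {mean(a):.1f}, B = {mean(b):.1f}")
print(f"log means: A = {mean(log_a):.3f}, B = {mean(log_b):.3f}")

# Back-transforming the mean of the logs gives the geometric mean.
geo_a = math.exp(mean(log_a))
geo_b = math.exp(mean(log_b))
print(f"geometric means: A = {geo_a:.2f}, B = {geo_b:.2f}")
```

Sample A's arithmetic mean is pulled up by the single large value 25.4; on the log scale that value carries less weight, which is why the ordering switches.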

An alternative to transformation is to use non-parametric techniques, which do not depend on the data following a particular distribution. See permutation tests and bootstrap.
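As one possible illustration of the permutation-test alternative, the sketch below applies an exact two-sample permutation test to samples A and B from the example, using the absolute difference in means as the test statistic. The choice of statistic, the two-sided comparison, and the variable names are assumptions made for this sketch, not a procedure prescribed by this entry. Because the samples are small, all 70 ways of splitting the eight pooled values into two groups of four can be enumerated instead of sampled at random.

```python
from itertools import combinations
from statistics import mean

# Sample values from the example above.
a = [15.5, 25.4, 10.5, 13.8]
b = [15.5, 13.2, 15.3, 18.4]

pooled = a + b
observed = abs(mean(a) - mean(b))

# Enumerate every way of assigning len(a) of the pooled values to group 1.
# Indices are used so that the repeated value 15.5 is handled correctly.
diffs = []
for idx in combinations(range(len(pooled)), len(a)):
    group1 = [pooled[i] for i in idx]
    group2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diffs.append(abs(mean(group1) - mean(group2)))

# Two-sided p-value: share of splits at least as extreme as the observed one.
# The small tolerance guards against floating-point ties.
p_value = sum(d >= observed - 1e-12 for d in diffs) / len(diffs)

print(f"splits enumerated: {len(diffs)}")
print(f"permutation p-value: {p_value:.3f}")
```

No distributional shape is assumed here; the reference distribution comes from the data themselves.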

(Thanks to David Parkhurst, Environmental Science Research Center, Indiana University in Bloomington for the example.)
