Glossary of statistical terms

k-Means Clustering:

The k-means clustering method is used in non-hierarchical cluster analysis. The goal is to divide the whole set of objects into a predefined number (k) of clusters. The criterion for such a subdivision is normally minimal dispersion inside the clusters, typically the minimal sum of squared distances from each object to the mean vector (centroid) of its cluster. A direct, rigorous solution to this problem requires testing an impractically large number of possible subdivisions of the data. K-means clustering is a fast heuristic method that provides a reasonably good solution, although not necessarily the optimal one.
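The heuristic described above can be illustrated with a minimal sketch. It assumes the common alternating scheme (often called Lloyd's algorithm), which this entry does not name: assign each object to its nearest centroid, recompute each centroid as the mean vector of its cluster, and repeat until the assignments stop changing. The function names (`k_means`, `within_cluster_ss`) and the parameters `init`, `max_iter` and `seed` are illustrative choices, not part of any particular package.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Coordinate-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, init=None, max_iter=100, seed=0):
    """Heuristic k-means: returns (labels, centroids).

    `init` is an optional list of k starting centroids (hypothetical
    parameter for reproducibility); otherwise k objects are drawn at random.
    """
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean vector of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_vector(members)
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each object to its cluster centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
```

A short usage example on two well-separated groups of points:

```python
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = k_means(points, k=2, init=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(within_cluster_ss(points, labels, centroids))
```

Because the result depends on the starting centroids, the method can settle on a subdivision that is good but not the best possible one. A common remedy is to run it from several starting points and keep the run with the smallest within-cluster sum of squares.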

For more details, see the chapter in the XLMiner help.


