A hold-out sample is a random sample from a data set that is withheld and not used in the model fitting process. After the model is fit to the main data (the "training" data), it is then applied to the hold-out sample. This gives an unbiased assessment of how well...
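The hold-out idea can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the 20% hold-out fraction, the toy data, and the trivial "predict the training mean" model are all assumptions made for the example, not part of the glossary entry.

```python
import random

def train_holdout_split(records, holdout_fraction=0.2, seed=0):
    """Randomly withhold a fraction of the records; return (training, holdout)."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

def fit_mean_model(training):
    """Toy model (an assumption for this sketch): predict the training mean of y."""
    return sum(y for _, y in training) / len(training)

def mean_squared_error(prediction, sample):
    """Average squared difference between the constant prediction and y."""
    return sum((y - prediction) ** 2 for _, y in sample) / len(sample)

# Hypothetical data: 50 (x, y) records.
data = [(x, 2.0 * x + 1.0) for x in range(50)]
training, holdout = train_holdout_split(data)

model = fit_mean_model(training)                  # fit on the training data only
holdout_mse = mean_squared_error(model, holdout)  # assess on the withheld records
```

Because the hold-out records play no part in fitting, `holdout_mse` is not flattered by the model having already seen those records.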

Image Processing: In image processing, the initial data are images, i.e. functions of two coordinates. Normally, images are represented in discrete form as two-dimensional arrays of image elements, or "pixels", i.e. sets of non-negative values ordered by two indexes, one for rows and one for columns. A major class...

Hierarchical Cluster Analysis: Hierarchical cluster analysis (or hierarchical clustering) is a general approach to cluster analysis, in which the object is to group together objects or records that are "close" to one another. A key component of the analysis is repeated calculation of distance measures between objects, and between...

Harmonic Mean: The harmonic mean is a measure of central location. The harmonic mean $H$ of positive values $x_1, \dots, x_N$ is defined by the formula $H = N / \sum_{i=1}^{N} (1/x_i)$. Let the path between two cities $A$ and $B$ be divided into $N$ parts of equal length. One drives the $i$-th part at velocity $v_i$. Then, the average speed on...
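A small Python sketch may help connect the definition to the driving example. The function names and the sample speeds (40 and 60) are assumptions chosen for illustration. The sketch computes the harmonic mean directly, and separately computes average speed as total distance over total time for equal-length segments, so the two can be compared.

```python
def harmonic_mean(values):
    """Harmonic mean of positive values: N divided by the sum of reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1.0 / v for v in values)

def average_speed(segment_length, speeds):
    """Average speed over equal-length segments: total distance / total time."""
    total_distance = segment_length * len(speeds)
    total_time = sum(segment_length / v for v in speeds)
    return total_distance / total_time

# Hypothetical trip: two equal segments of 10 units, driven at 40 and 60.
speeds = [40.0, 60.0]
h = harmonic_mean(speeds)
avg = average_speed(10.0, speeds)
```

For these inputs both quantities agree, and both are lower than the arithmetic mean of 50, because more time is spent on the slower segment.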

Geometric Distribution: A random variable x obeys the geometric distribution with parameter p (0<p<1) if $P(x = k) = p(1-p)^k$ for $k = 0, 1, 2, \dots$. If each of a sequence of independent trials obeys the Bernoulli distribution with probability of success p, then x might be the number of trials before the first "success" occurs.
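As a rough numerical check, the sketch below evaluates a geometric probability function in Python. It assumes the convention in which x counts the failed trials before the first success (so x starts at 0). The other common convention counts the trial on which the success occurs and starts at 1. The function name and the value p = 0.3 are illustrative assumptions.

```python
def geometric_pmf(k, p):
    """P(x = k) when x counts failures before the first success (assumed convention)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    if k < 0:
        return 0.0
    return p * (1.0 - p) ** k

p = 0.3
# Truncate the infinite sums at 500 terms; the tail is negligible for p = 0.3.
total_probability = sum(geometric_pmf(k, p) for k in range(500))
mean_failures = sum(k * geometric_pmf(k, p) for k in range(500))
```

Under this convention the probabilities sum to 1 and the mean number of failures is (1 - p) / p.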

Gini coefficient: The Gini coefficient is used in economics to measure income inequality. Generally speaking, it is used to measure the extent of departure from a perfectly even distribution of income. A "0" indicates no departure, i.e. everyone has the same income. A "1" indicates complete departure - all income...
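One common way to compute the coefficient for a finite list of incomes is via the mean absolute difference between all pairs. The sketch below assumes that formulation. The function name and the sample incomes are hypothetical.

```python
def gini_coefficient(incomes):
    """Gini coefficient as the sum of |x_i - x_j| over all pairs, divided by 2 * n * total."""
    n = len(incomes)
    total = sum(incomes)
    if n == 0 or total <= 0 or any(x < 0 for x in incomes):
        raise ValueError("need non-negative incomes with a positive total")
    abs_diff_sum = sum(abs(a - b) for a in incomes for b in incomes)
    return abs_diff_sum / (2.0 * n * total)

equal = gini_coefficient([5, 5, 5, 5])        # everyone has the same income
concentrated = gini_coefficient([0, 0, 0, 10])  # one person holds all income
```

With this formula a finite population of n people in which one person holds everything gives (n - 1) / n rather than exactly 1, and the value approaches 1 as n grows.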

Gaussian Filter: The Gaussian filter is a linear filter that is usually used as a smoother. The output of the Gaussian filter at a given moment is the weighted mean of the input values, and the weights are given by a Gaussian function of d, where d is the "distance" in time from the current...
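A minimal Python sketch of such a smoother follows. It assumes weights proportional to exp(-d^2 / (2 * sigma^2)), a finite window of `radius` samples on each side, and renormalization of the weights at the edges of the series. The names `radius` and `sigma` and their default values are assumptions for this example, not taken from the entry.

```python
import math

def gaussian_weights(radius, sigma):
    """Normalized Gaussian weights for time offsets d = -radius, ..., +radius."""
    raw = [math.exp(-(d * d) / (2.0 * sigma * sigma))
           for d in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def gaussian_smooth(signal, radius=2, sigma=1.0):
    """Replace each value by the Gaussian-weighted mean of its neighbors."""
    weights = gaussian_weights(radius, sigma)
    smoothed = []
    for t in range(len(signal)):
        acc = 0.0
        norm = 0.0
        for offset, w in zip(range(-radius, radius + 1), weights):
            j = t + offset
            if 0 <= j < len(signal):   # skip offsets that fall outside the series
                acc += w * signal[j]
                norm += w
        smoothed.append(acc / norm)    # renormalize so edge outputs stay weighted means
    return smoothed

spike = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
smoothed_spike = gaussian_smooth(spike)
```

A constant input passes through unchanged, while an isolated spike is spread over its neighbors, which is the smoothing behavior the entry describes.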

General Linear Model for a Latin Square: In the design of experiments, a Latin square is a three-factor experiment in which, for each pair of factors, every combination of factor values occurs exactly once. Consider the following Latin Square, where rows correspond to 4 values of factor I, columns -...

Gamma Distribution: A random variable x is said to have a gamma distribution with parameters $a > 0$ and $\lambda > 0$ if its probability density $p(x)$ is $p(x) = \frac{\lambda^{a}}{\Gamma(a)} x^{a-1} e^{-\lambda x}$ for $x > 0$, and $p(x) = 0$ otherwise.

Functional Data Analysis (FDA): In functional data analysis (FDA), data are considered as continuous functions (or curves). This is in contrast to multivariate statistics, where data are considered as vectors (finite sets of values). Real data are usually collected as discrete samples. In FDA, such discrete data are transformed to...

Farthest Neighbor Clustering: Farthest neighbor clustering is a synonym for complete linkage clustering.

Fourier Spectrum: Any continuous function $f(t)$ defined on a finite interval of length $T$ can be represented as a weighted sum of cosine functions with periods $T, T/2, T/3, \dots$: $f(t) = \sum_{i=0}^{\infty} A_i \cos(2\pi \nu_i t + \varphi_i)$, where $\nu_i = i/T$ is the frequency of the i-th Fourier component; $A_i$ is the amplitude of the i-th component; $\varphi_i$ is the phase of the i-th component. The function...

Fleming Procedure: The Fleming procedure (or O'Brien-Fleming multiple testing procedure) is a simple multiple testing procedure for comparing two treatments when the response to treatment is dichotomous. This procedure is used in clinical trials. The procedure provides an opportunity to terminate the trial early when one treatment performs markedly...

Fixed Effects: The term "fixed effects" (as contrasted with "random effects") is related to how particular coefficients in a model are treated - as fixed or random values. Which approach to choose depends on both the nature of the data and the objective of the study. A fixed effect approach...

F Distribution: The F distribution is a family of distributions differentiated by two parameters: m1 (degrees of freedom, numerator) and m2 (degrees of freedom, denominator). If x1 and x2 are independent random variables with a chi-square distribution with m1 and m2 degrees of freedom respectively, then the random variable f...

Family-wise Type I Error: In multiple comparison procedures, family-wise type I error is the probability that, even if all samples come from the same population, you will wrongly conclude that at least one pair of populations differs. If $\alpha$ is the probability of comparison-wise type I error, then the probability of...
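To give a feel for how quickly the family-wise error grows, the sketch below uses the standard result for independent comparisons, 1 - (1 - alpha)^k. Independence of the k comparisons is an assumption of this example (pairwise comparisons on shared samples are generally not independent), and the function name is hypothetical.

```python
def family_wise_error(alpha, n_comparisons):
    """Probability of at least one type I error across independent comparisons."""
    if not 0.0 <= alpha <= 1.0 or n_comparisons < 1:
        raise ValueError("need 0 <= alpha <= 1 and at least one comparison")
    return 1.0 - (1.0 - alpha) ** n_comparisons

single = family_wise_error(0.05, 1)   # one comparison: just alpha
ten = family_wise_error(0.05, 10)     # ten independent comparisons
```

With a comparison-wise level of 0.05, ten independent comparisons already give a family-wise error of roughly 0.40, which is why multiple comparison procedures adjust the per-comparison level.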

Face Validity: The face validity of survey instruments and tests used in psychometrics is assessed by cursory review of the items (questions) by untrained individuals. The individuals make their judgments on whether the items are relevant. For example, a researcher developing an IQ-test might ask his friends and relatives...

Exponential Distribution: The exponential distribution is a one-sided distribution completely specified by one parameter $\lambda > 0$; the density of this distribution is $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ for $x < 0$. The mean of the exponential distribution is $1/\lambda$. The exponential distribution is a model for the length of intervals between two consecutive random events in time, or between a...

Explanatory Variable: Explanatory variable is a synonym for independent variable. See also: dependent and independent variables.

Exogenous Variable: Exogenous variables in causal modeling are the variables with no causal links (arrows) leading to them from other variables in the model. In other words, exogenous variables have no explicit causes within the model. The concept of exogenous variable is fundamental in path analysis and structural equation modeling...

