#### Latent Variable Growth Curve Models

Latent Variable Growth Curve Models: These techniques, also called Latent Curve Models (LCM), take traditional modeling of growth curves for repeated measures data and extend it to cover the use of latent variables. In Latent Variable Growth Curve Models, Structural Equation Modeling (SEM) methods are…

#### Latent Variable Models

Latent Variable Models: Latent variable models are a broad subclass of latent structure models. They postulate some relationship between the statistical properties of observable variables (or "manifest variables", or "indicators") and latent variables. A special kind of statistical analysis corresponds to each kind of…

#### Latin Square

Latin Square: The Latin Square is a square array in which every letter or symbol appears exactly once in each row and in each column. For example:

B C D A
C D A B
D A B C
A B C D
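The defining property is easy to check mechanically. The following is a minimal sketch, not part of the original entry; the function names `is_latin_square` and `cyclic_latin_square` are illustrative, and the cyclic construction is only one of many ways to build such a square.

```python
def is_latin_square(grid):
    """Return True if every symbol appears exactly once in each row and column."""
    n = len(grid)
    symbols = set(grid[0])
    # A Latin square of order n uses exactly n distinct symbols in n rows of length n.
    if len(symbols) != n or any(len(row) != n for row in grid):
        return False
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all(set(col) == symbols for col in zip(*grid))
    return rows_ok and cols_ok


def cyclic_latin_square(symbols):
    """Build a Latin square by shifting the symbol list one step per row."""
    n = len(symbols)
    return [[symbols[(r + c) % n] for c in range(n)] for r in range(n)]


# The 4 x 4 square from the glossary entry, one string per row.
example = [list("BCDA"), list("CDAB"), list("DABC"), list("ABCD")]
print(is_latin_square(example))
print(cyclic_latin_square(list("ABCD")))
```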

#### Law Of Large Numbers

Law of Large Numbers: According to the Law of Large Numbers, the probability that the proportion of successes in a sample will differ from the population proportion by less than c (any positive constant) approaches 1 as the sample size tends to infinity.
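A small simulation can illustrate this statement. The sketch below is an illustration under assumed values (population proportion 0.3, tolerance c = 0.05, 200 simulated samples per size); the function name `fraction_within` is hypothetical. It estimates how often the sample proportion falls within c of the population proportion as the sample size grows.

```python
import random


def fraction_within(p, n, c, trials, seed=0):
    """Estimate P(|sample proportion - p| < c) for samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        successes = sum(rng.random() < p for _ in range(n))
        if abs(successes / n - p) < c:
            hits += 1
    return hits / trials


# The estimated probability should climb toward 1 as n increases.
for n in (10, 100, 1000, 10000):
    print(n, fraction_within(p=0.3, n=n, c=0.05, trials=200))
```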

#### Lawley-Hotelling Trace

Lawley-Hotelling Trace: See Hotelling Trace coefficient.

#### Least Squares Method

Least Squares Method: In a narrow sense, the Least Squares Method is a technique for fitting a straight line through a set of points in such a way that the sum of the squared vertical distances from the observed points to the fitted line is…
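As an illustration of the narrow sense described above, the sketch below fits a straight line using the standard closed-form least squares solution for slope and intercept. The data values and the helper names `fit_line` and `sum_squared_residuals` are made up for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # illustrative data, not from the glossary
slope, intercept = fit_line(xs, ys)
print(slope, intercept, sum_squared_residuals(xs, ys, slope, intercept))
```

Any other line through the same points, for example one with a slightly different slope, gives a larger sum of squared residuals.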

#### Level of a Factor

Level of a Factor: In design of experiments, levels of a factor are the values it takes on. The values are not necessarily numbers - they may be at a nominal scale, ordinal scale, etc. See Variables (in design of experiments) for an explanatory example.…

#### Level Of Significance

Level of Significance: In hypothesis testing, you seek to decide whether observed results are consistent with chance variation under the "null hypothesis," or, alternatively, whether they are so different that chance variability can be ruled out as an explanation for the observed sample. The range…

#### Life Tables

Life Tables: In survival analysis, life tables summarize lifetime data or, generally speaking, time-to-event data. Rows in a life table usually correspond to time intervals, columns to the following categories: (i) not "failed", (ii) "failed", (iii) censored (withdrawn), and the sum of the three called…

#### Likelihood Function

Likelihood Function: Likelihood function is a fundamental concept in statistical inference. It indicates how likely a particular population is to produce an observed sample. Let P(X; T) be the distribution of a random vector X, where T is the vector of parameters of the distribution.…

#### Likelihood Function (Graphical)

Likelihood Function: Likelihood function is a fundamental concept in statistical inference. It indicates how likely a particular population is to produce an observed sample. Let P(X; T) be the distribution of a random vector X, where T is the vector of parameters of the distribution.…

#### Likelihood Ratio Test

Likelihood Ratio Test: The likelihood ratio test is aimed at testing a simple null hypothesis against a simple alternative hypothesis. (See Hypothesis for an explanation of "simple hypothesis"). The likelihood ratio test is based on the likelihood ratio r as the test statistic: r =…

#### Likelihood Ratio Test (Graphical)

Likelihood Ratio Test: The likelihood ratio test is aimed at testing a simple null hypothesis against a simple alternative hypothesis. (See Hypothesis for an explanation of "simple hypothesis"). The likelihood ratio test is based on the likelihood ratio r as the test statistic: where X…

#### Likert Scales

Likert Scales: Likert scales are categorical ordinal scales used in social sciences to measure attitude. Measurements on Likert scales usually take on an odd number of values with a middle point, e.g. "strongly agree", "agree", "undecided", "disagree", "strongly disagree". The middle value is usually…

#### Lilliefors Statistic

Lilliefors Statistic: The Lilliefors statistic is used in a goodness-of-fit test of whether an observed sample distribution is consistent with normality. The statistic measures the maximum distance between the observed distribution and a normal distribution with the same mean and standard deviation as…

#### Lilliefors test for normality

Lilliefors test for normality: The Lilliefors test is a special case of the Kolmogorov-Smirnov goodness-of-fit test. In the Lilliefors test, the Kolmogorov-Smirnov test is implemented using the sample mean and standard deviation as the mean and standard deviation of the theoretical (benchmark) population…

#### Line of Regression

Line of Regression: The line of regression is the line that best fits the data in simple linear regression, i.e. the line that corresponds to the "best-fit" parameters (slope and intercept) of the regression equation.

#### Linear Filter

Linear Filter: A linear filter is a filter whose output is a linear function of the input. Any output value of a linear filter is a weighted mean of input values. In other words, to form one element of the output at time…

#### Linear Model

Linear Model: A linear model specifies a linear relationship between a dependent variable and n independent variables: y = a0 + a1 x1 + a2 x2 + ... + an xn, where y is the dependent variable, {xi} are independent variables, {ai} are parameters of the…

#### Linear Model (Graphical)

Linear Model: A linear model specifies a linear relationship between a dependent variable and n independent variables: y = a0 + a1 x1 + a2 x2 + ... + an xn, where y is the dependent variable, {xi} are independent variables, {ai} are parameters of the model. For example, consider that for a sample of 25 cities, the following…

#### Linear Regression

Linear Regression: Linear regression is aimed at finding the "best-fit" linear relationship between the dependent variable and independent variable(s). See also: Regression analysis, Simple linear regression, Multiple regression.

#### Linkage Function

Linkage Function: A linkage function is an essential prerequisite for hierarchical cluster analysis. Its value is a measure of the "distance" between two groups of objects (i.e. between two clusters). Algorithms for hierarchical clustering normally differ by the linkage function used. The most common…

#### Local Independence

Local Independence: The local independence postulate plays a central role in latent variable models. Local independence means that all the manifest variables are independent random variables if the latent variables are controlled (fixed). Technically, the local independence may be described…

#### Log-log Plot

Log-log Plot: A log-log plot represents observed units described by two variables, say x and y, as a scatter graph. In a log-log plot, the two axes display the logarithm of values of the variables, not the values themselves. If the relationship between…

#### Log-Normal Distribution

Log-Normal Distribution: A random variable X has a log-normal distribution if ln(X) is normally distributed.

#### Logistic Regression

Logistic Regression: Logistic regression is used with binary data when you want to model the probability that a specified outcome will occur. Specifically, it is aimed at estimating parameters a and b in the following model: Li = log(pi / (1 - pi)) = a + b…

#### Logistic Regression (Graphical)

Logistic Regression: Logistic regression is used with binary data when you want to model the probability that a specified outcome will occur. Specifically, it is aimed at estimating parameters a and b in the following model: where pi is the probability of a success for…

#### Logit

Logit: Logit is a nonlinear function of probability. If p is the probability of an event, then the corresponding logit is given by the formula: logit(p) = log(p / (1 - p)). Logit is widely used to construct statistical models, for example in logistic regression…
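The logit is the logarithm of the odds p / (1 - p). The short sketch below computes it along with its inverse. It assumes the natural logarithm, which is the usual convention but is not stated in the entry; `inverse_logit` is a helper added for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1 (natural log assumed)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map a logit value back to a probability (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))


for p in (0.1, 0.5, 0.9):
    print(p, logit(p), inverse_logit(logit(p)))
```

A probability of 0.5 has logit 0, and probabilities p and 1 - p have logits of equal size and opposite sign.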

#### Logit and Odds Ratio

Logit and Odds Ratio: The following relation between the odds ratio and logit is often used for constructing statistical models: log OR(p1, p2) = logit(p1) - logit(p2), where p1, p2 are probabilities and OR(p1, p2) is the odds ratio for p1 and p2. See also: Logit…
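The relation follows from the definition of the odds ratio as the ratio of the two odds, since the logarithm of a ratio is the difference of the logarithms. A quick numerical check, using illustrative probabilities 0.8 and 0.5 and assuming natural logarithms:

```python
import math


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def odds_ratio(p1, p2):
    """Ratio of the odds for p1 to the odds for p2."""
    return odds(p1) / odds(p2)


def logit(p):
    """Log-odds of p (natural log assumed)."""
    return math.log(odds(p))


p1, p2 = 0.8, 0.5  # illustrative values
lhs = math.log(odds_ratio(p1, p2))
rhs = logit(p1) - logit(p2)
print(lhs, rhs)
```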

#### Logit and Probit Models

Logit and Probit Models: Logit and probit models postulate some relation (usually a linear relation) between nonlinear functions of the observed probabilities and unknown parameters of the model. Logit and probit here are nonlinear functions of probability. See also: Logit Models, Probit Models…

#### Logit Models

Logit Models: Logit models postulate some relation between the logit of observed probabilities (not the probabilities themselves), and unknown parameters of the model. For example, logit models used in logistic regression postulate a linear relation between the logit and parameters of the model. The major…

#### Loglinear models

Loglinear models: Loglinear models are models that postulate a linear relationship between the independent variables and the logarithm of the dependent variable, for example: log(y) = a0 + a1 x1 + a2 x2 + ... + aN xN where y is the dependent variable; xi, i=1,...,N…

#### Loglinear regression

Loglinear regression: Loglinear regression is a kind of regression aimed at finding the best fit between the data and a loglinear model. The major assumption of loglinear regression is that a linear relationship exists between the log of the dependent variable and the independent…

#### Longitudinal Analysis

Longitudinal Analysis: Longitudinal analysis is concerned with statistical inference from longitudinal data.

#### Longitudinal Data

Longitudinal Data: Longitudinal data refer to observations of given units made over time. A simple example of longitudinal data is the gross annual income of, say, 1000 households from New York City for the years 1991-2000. See also: cross-sectional data, panel data, Cohort…

#### Longitudinal study

Longitudinal study: Longitudinal studies are those that record data for subjects or variables over time. If a longitudinal study uses the same subjects at each point where data are recorded, it is a panel study. If a longitudinal study samples from the same group…

#### Loss Function

A loss function specifies a penalty for an incorrect estimate from a statistical model. Typical loss functions might specify the penalty as a function of the difference between the estimate and the true value, or simply as a binary value depending on whether the estimate…

#### Machine Learning

Machine Learning: Analytics in which computers "learn" from data to produce models or rules that apply to those data and to other similar data. Predictive modeling techniques such as neural nets, classification and regression trees (decision trees), naive Bayes, k-nearest neighbor, and support vector machines…

#### MANCOVA

MANCOVA: See Multiple analysis of covariance.

#### Manifest Variable

Manifest Variable: In latent variable models, a manifest variable (or indicator) is an observable variable - i.e. a variable that can be measured directly. A manifest variable can be continuous or categorical. The opposite concept is the latent variable. See also latent variable…

#### Mann – Whitney U Test

Mann - Whitney U Test: See Wilcoxon - Mann - Whitney Test.

#### MANOVA

MANOVA: See Multiple analysis of variance.

#### Mantel-Cox Test

Mantel-Cox Test: The Mantel-Cox test is aimed at testing the null hypothesis that survival functions do not differ across groups.

#### Mantel-Haenszel test

Mantel-Haenszel test: See Cochran-Mantel-Haenszel test.

#### MapReduce

MapReduce: In computer science, MapReduce is a procedure that prepares data for parallel processing on multiple computers. The "map" function sorts the data, and the "reduce" function generates frequencies of items. The combined overall system manages the parceling out of the data to multiple processors,…

#### Margin of Error

Margin of Error: A margin of error typically refers to a range within which an unknown parameter is estimated to fall, given the variation that can arise from one sample to another. For example, in an opinion survey based on a randomly-drawn sample from a…

#### Marginal Density

Marginal Density: If X and Y are continuous random variables, and f(x, y) is the joint density of X and Y, then the marginal density of X, g(x), is given by g(x) = ∫ f(x, y) dy, where the integral is taken over all values of y.

#### Marginal Distribution

Marginal Distribution: If X and Y are discrete random variables and f(x, y) is their joint probability distribution, the marginal distribution of X, g(x), is given by g(x) = Σ f(x, y), where the sum is taken over all values of y.
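For discrete variables, the marginal distribution of X is obtained by summing the joint probabilities over all values of y. The sketch below does this for a small joint table; the probabilities and the function name `marginal_of_x` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def marginal_of_x(joint):
    """Sum the joint probabilities f(x, y) over y for each value of x."""
    g = defaultdict(float)
    for (x, _y), prob in joint.items():
        g[x] += prob
    return dict(g)


# Hypothetical joint distribution keyed by (x, y) pairs; probabilities sum to 1.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}
g = marginal_of_x(joint)
print(g)
```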

#### Markov Chain

Markov Chain: A Markov chain is a series of random values x1, x2, ... in which the probabilities associated with a particular value xi depend only on the prior value xi-1. For this reason, a Markov chain is a special case of "memoryless"…
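The dependence on the prior value alone can be seen in a simulation, where each step looks only at the current state. This is a minimal sketch with a made-up two-state weather chain; the transition probabilities and the name `simulate_chain` are assumptions for illustration.

```python
import random


def simulate_chain(transition, start, steps, seed=0):
    """Simulate a Markov chain; each next state depends only on the current one."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    state = start
    path = [state]
    for _ in range(steps):
        next_states = list(transition[state].keys())
        weights = list(transition[state].values())
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


# Hypothetical transition probabilities: transition[current][next].
transition = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}
path = simulate_chain(transition, "sunny", 10)
print(path)
```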

#### Markov Chain (Graphical)

Markov Chain: A Markov chain is a series of random values x1, x2, ... in which the probabilities associated with a particular value xi depend only on the prior value xi-1. For this reason, a Markov chain is a special case of "memoryless"…

#### Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC): A Markov chain is a probability system that governs transition among states or through successive events. For example, in the American game of baseball, the probability of reaching base differs depending on the "count" -- the number of balls and…

#### Markov Property

Markov Property: Markov property means "absence of memory" of a random process - that is, independence of conditional probabilities P( U(t1 > t) | U(t) ) on values U(t2 < t). In simpler words, this property means that future behavior depends only on…

#### Markov Property (Graphical)

Markov Property: Markov property means "absence of memory" of a random process - that is, independence of conditional probabilities on values U(t2 < t). In simpler words, this property means that future behavior depends only on the current state, but not on the…

#### Markov Random Field

Markov Random Field: See Markov Chain, Random Field.

#### Maximum Likelihood Estimator

Maximum Likelihood Estimator: The method of maximum likelihood is the most popular method for deriving estimators - the value of the population parameter T maximizing the likelihood function is used as the estimate of this parameter. The general idea behind maximum likelihood estimation is to…

#### Maximum Likelihood Estimator (Graphical)

Maximum Likelihood Estimator: The method of maximum likelihood is the most popular method for deriving estimators - the value of the population parameter T maximizing the likelihood function is used as the estimate of this parameter. The general idea behind maximum likelihood estimation is to…

#### Mean

Mean: For a population or a sample, the mean is the arithmetic average of all values. The mean is a measure of central tendency or location. See also: Expected Value.

#### Mean Deviation

Mean Deviation: See Average deviation.

#### Mean Score Statistic

Mean Score Statistic: The mean score statistic is one of the statistics used in the generalized Cochran-Mantel-Haenszel tests. It is applicable when the response levels (columns) are measured at an ordinal scale. If the two variables are independent of each other…

#### Mean Squared Error

Mean Squared Error: The mean squared error is a measure of performance of a point estimator. It measures the average squared difference between the estimator and the parameter. For an unbiased estimator, the mean squared error is equal to the variance of the…

#### Mean Values (Comparison)

Mean Values (Comparison): The numerical example below illustrates basic properties of various descriptive statistics with "mean" in their name, like the arithmetic mean, the trimmed mean, the geometric mean, the harmonic mean, and several power means. Because the…

#### Measurement Error

Measurement Error: The measurement error is the deviation of the outcome of a measurement from the true value. For example, if electronic scales are loaded with a 1 kilogram standard weight and the reading is 1002 grams, the measurement error is +2 gram…

#### Median

Median: In a population or a sample, the median is the value that has just as many values above it as below it. If there are an even number of values, the median is the average of the two middle values. The median is a…
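The rule for odd and even numbers of values can be written out directly. This is a minimal sketch with a hand-written function named `median` for illustration; Python's standard library also provides `statistics.median`, which follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))     # odd count: the middle value
print(median([4, 1, 3, 2]))  # even count: average of the two middle values
```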

#### Median Filter

Median Filter: The median filter is a robust filter. Median filters are widely used as smoothers for image processing, as well as in signal processing and time series processing. A major advantage of the median filter over linear filters is that…

#### Meta-analysis

Meta-analysis: Meta-analysis takes the results of two or more studies of the same research question and combines them into a single analysis. The purpose of meta-analysis is to gain greater accuracy and statistical power by taking advantage of the large sample size resulting from the…

#### Minimax Decision Rule

Minimax Decision Rule: A minimax decision rule has the smallest possible maximum risk. All other decision rules will have a higher maximum risk.

#### Missing Data Imputation

Missing Data Imputation: "Imputing missing data" is a process by which the missing values in a data set are estimated from the remaining data, for the purpose of allowing statistical procedures to be performed on a complete data set. (Most statistical procedures fail…

#### Mixed Models

Mixed Models: In mixed effects models (or mixed random and fixed effects models) some coefficients are treated as fixed effects and some as random effects. See fixed effects for detailed explanations of the concepts "random effects" and "fixed effects".

#### Mode

Mode: The mode is a value that occurs with the greatest frequency in a population or a sample. It could be considered as the single value most typical of all the values.

#### Moment Generating Function

Moment Generating Function: The moment generating function is associated with a probability distribution. The moment generating function can be used to generate moments. However, the main use of the moment generating function is not in generating moments but to help in characterizing a distribution. The…

#### Moments

Moments: For a random variable x, its Nth moment is the expected value of the Nth power of x, where N is a positive integer. The Nth moment of the deviation of x from its mean is called "the Nth central moment". The 1st moment…
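For a discrete distribution, both kinds of moment are weighted sums. The sketch below uses a fair six-sided die as an illustrative example (not from the glossary entry) and exact fractions to avoid rounding; the function names are hypothetical.

```python
from fractions import Fraction


def raw_moment(dist, n):
    """Nth moment: expected value of x**n for a discrete distribution {x: prob}."""
    return sum(p * x ** n for x, p in dist.items())


def central_moment(dist, n):
    """Nth central moment: expected value of (x - mean)**n."""
    mean = raw_moment(dist, 1)
    return sum(p * (x - mean) ** n for x, p in dist.items())


# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(raw_moment(die, 1))      # 1st moment
print(central_moment(die, 2))  # 2nd central moment
```

For the die, the 1st moment is 7/2 and the 2nd central moment is 35/12.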

#### Monte Carlo Simulation

Monte Carlo Simulation: Monte Carlo simulation is simulation of random phenomena using pseudo-random numbers. This type of simulation is widely used in practical statistics, e.g. in resampling and in queuing theory. The goal of Monte Carlo simulation is not necessarily simulation of…

#### Moving Average (MA) Models

Moving Average (MA) Models: Moving average (MA) models are used in time series analysis to describe stationary time series. The MA models represent time series that are generated by passing white noise through a non-recursive linear filter. A moving average model of a…

#### Multicollinearity

Multicollinearity: In regression analysis, multicollinearity refers to a situation of collinearity of independent variables, often involving more than two independent variables, or more than one pair of collinear variables. Multicollinearity means redundancy in the set of variables. This can render ineffective the numerical methods…

#### Multidimensional Scaling

Multidimensional Scaling: Multidimensional scaling (MDS) is an approach to multivariate analysis aimed at producing a spatial or geometrical representation of complex data. MDS helps to explain the observed distance matrix or dissimilarity matrix for a set of N objects in terms of a much smaller…

#### Multiple analysis of covariance (MANCOVA)

Multiple analysis of covariance (MANCOVA): Multiple analysis of covariance (MANCOVA) is similar to multiple analysis of variance (MANOVA), but allows you to control for the effects of supplementary continuous independent variables - covariates. If there are some covariates, MANCOVA should be used instead…

#### Multiple analysis of variance (MANOVA)

Multiple analysis of variance (MANOVA): MANOVA is a technique which determines the effects of independent categorical variables on multiple continuous dependent variables. It is usually used to compare several groups with respect to multiple continuous variables. The main distinction between MANOVA and ANOVA is that…

#### Multiple Comparison

Multiple Comparison: Multiple comparisons are used in the same context as analysis of variance (ANOVA) - to check whether there are differences in population means among more than two populations. In contrast to ANOVA, which simply tests the null hypothesis that all means are equal,…

#### Multiple Correspondence Analysis (MCA)

Multiple Correspondence Analysis (MCA): Multiple correspondence analysis (MCA) is an extension of correspondence analysis (CA) to the case of more than two variables. The initial data for MCA are three-way or m-way contingency tables. In case of three variables, a common approach to MCA is…

#### Multiple discriminant analysis (MDA)

Multiple discriminant analysis (MDA): Multiple Discriminant Analysis (MDA) is an extension of discriminant analysis; it shares ideas and techniques with multiple analysis of variance (MANOVA). The goal of MDA is to classify cases into three or more categories using continuous or dummy categorical…

#### Multiple Least Squares Regression

Multiple Least Squares Regression: Multiple least squares regression is a special (and the most common) type of multiple regression. It relies on the least squares method to fit the regression model to the data. See also: ordinary least squares regression.

#### Multiple looks

Multiple looks: In a classic statistical experiment, treatment(s) and placebo are applied to randomly assigned subjects, and, at the end of the experiment, outcomes are compared. With multiple looks, the investigator does not wait until the end of the experiment -- outcomes are compared…

#### Multiple Regression

Multiple Regression: Multiple (linear) regression is a regression technique aimed at finding a linear relationship between the dependent variable and multiple independent variables. (See regression analysis.) The multiple regression model is as follows: Yi = B0 + B1 X1i + B2 X2i + ... + Bm…

#### Multiple Regression (Graphical)

Multiple Regression: Multiple (linear) regression is a regression technique aimed at finding a linear relationship between the dependent variable and multiple independent variables. (See regression analysis.) The multiple regression model is as follows: where Yi are values of the dependent variable, X1i, X2i, ... ,…

#### Multiple Testing

Multiple Testing: See Multiple comparison.

#### Multiplicative Error

Multiplicative Error: A multiplicative error is proportional to the true value of the quantity being measured. An example of a multiplicative error is when electronic scales provide readings 1% higher than the true weight - i.e. 1.01 kg for 1.0 kg standard weight,…

#### Multiplicity Issues

Multiplicity Issues: Multiplicity issues arise in a number of contexts, but they generally boil down to the same thing: repeated looks at a data set in different ways, until something "statistically significant" emerges. See multiple comparisons for how to handle multiple pairwise testing in conjunction…

#### Multivariate

Multivariate: Multivariate analysis involves more than one variable of interest.

#### Naive bayes classifier

Naive bayes classifier: A full Bayesian classifier is a supervised learning technique that assigns a class to a record by finding other records with attributes just like it has, and finding the most prevalent class among them. Naive Bayes (NB) recognizes that finding exact matches…

#### Natural Language

Natural Language: A natural language is what most people outside the field of computer science think of as just a language (Spanish, English, etc.). The term "natural" simply signifies that the reference is not to a programming language (C++, Java, etc.). The context is usually…

#### Nearest Neighbor Clustering

Nearest Neighbor Clustering: The nearest neighbor clustering is a synonym for single linkage clustering.

#### Negative Binomial

Negative Binomial: The negative binomial distribution is the probability distribution of the number of Bernoulli (yes/no) trials required to obtain r successes. Contrast it with the binomial distribution - the probability of x successes in n trials. Also with the Poisson distribution - the probability…
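Under the definition above (the number of trials needed to obtain r successes), the probability that the r-th success arrives on trial n is C(n-1, r-1) p^r (1-p)^(n-r), where p is the per-trial success probability. The sketch below evaluates that standard formula; the function name and the example values are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def neg_binomial_pmf(n, r, p):
    """P(the r-th success occurs exactly on trial n), success probability p per trial."""
    if n < r:
        return 0.0  # r successes cannot occur in fewer than r trials
    # The last trial is a success; the other r - 1 successes fall in the first n - 1 trials.
    return comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)


# Illustrative values: probability that the 3rd success comes on trial n, with p = 0.5.
for n in range(3, 8):
    print(n, neg_binomial_pmf(n, r=3, p=0.5))
```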

#### Netflix Prize

Netflix Prize: The Netflix prize was a famous early application of crowdsourcing to predictive modeling. In 2006, Netflix published customer movie rating data and challenged analysts to come up with a predictive model that would improve Netflix's prediction of what your rating would be for…

#### Network Analytics

Network Analytics: Network analytics is the science of describing and, especially, visualizing the connections among objects. The objects might be human, biological or physical. Graphical representation is a crucial part of the process; Wayne Zachary's classic 1977 network diagram of a karate club reveals the…

#### Neural Network

Neural Network: A neural network (NN) is a network of many simple processors ("units"), each possibly having a small amount of local memory. The units are connected by communication channels ("connections") which usually carry numeric (as opposed to symbolic) data, encoded by any of various…

#### Node

Node: A node is an entity in a network. In a social network, it would be a person. In a digital network, it would be a computer or device. Nodes can be of different types in the same network - a criminal network might contain…

#### Noise

Noise: The noise is the component of the observed data (e.g. of a time series) that is random and carries no useful information. The presence of noise is often the major nuisance factor that makes statistical inference from the data more difficult. The complementary…

#### Nominal Scale

Nominal Scale: A nominal scale is really a list of categories to which objects can be classified. For example, people who receive a mail order offer might be classified as "no response," "purchase and pay," "purchase but return the product," and "purchase and neither pay…

#### Non-parametric Regression

Non-parametric Regression: Non-parametric regression methods are aimed at describing a relationship between the dependent and independent variables without specifying the form of the relationship between them a priori. See also: Regression analysis.