Relative Frequency Distribution: A relative frequency distribution is a tabular summary of a set of data showing the relative frequency of items in each of several non-overlapping classes. The relative frequency is the fraction or proportion of the total number of items belonging to a class. This definition...
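
As a minimal sketch of the definition above, the relative frequency of each class can be computed by dividing its count by the total number of items. The function name `relative_frequencies` and the survey data are hypothetical, chosen only for illustration.

```python
from collections import Counter

def relative_frequencies(items):
    """Return a dict mapping each class to its proportion of all items."""
    counts = Counter(items)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Hypothetical data: eight responses falling into three non-overlapping classes.
responses = ["A", "B", "A", "C", "A", "B", "A", "C"]
rel_freq = relative_frequencies(responses)

# Print the tabular summary, one class per line.
for label in sorted(rel_freq):
    print(label, rel_freq[label])
```

The proportions always sum to 1, which is a quick check that every item was assigned to exactly one class.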

Resampling: See bootstrapping and permutation tests.

Robustness: Many statistical methods (particularly classical inference methods) rely upon assumptions about the distribution of the population the sample is drawn from. The robustness of a statistical method is its insensitivity to departures from these assumptions. The less sensitive a method is to departures from assumptions, the more robust the...

Sample: A sample is a portion of the elements of a population. A sample is chosen to make inferences about the population by examining or measuring the elements in the sample.

Sample Space: The set of all possible outcomes of a particular experiment is called the sample space for the experiment. If a coin is tossed twice, the sample space is {HH, HT, TH, TT}, where TH, for example, means getting tails on the first toss and heads on the second...
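
The two-toss example can be enumerated directly. This is a small sketch; the helper name `coin_sample_space` is an assumption made for illustration, not part of the entry.

```python
from itertools import product

def coin_sample_space(n_tosses):
    """List every possible sequence of heads (H) and tails (T) for n tosses."""
    return ["".join(outcome) for outcome in product("HT", repeat=n_tosses)]

space = coin_sample_space(2)
print(space)  # ['HH', 'HT', 'TH', 'TT']
```

Each additional toss doubles the size of the sample space, so three tosses give eight outcomes.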

Sampling Distribution: When a sample is drawn, some summary value (called a statistic) is usually computed. For example, the sample mean and the sample variance are two statistics. The value of the statistic changes with the sample we have. The probability distribution of the statistic is called the sampling distribution....
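
One way to see a sampling distribution is to draw many samples and record the statistic each time. The sketch below approximates the sampling distribution of the sample mean by simulation; the population (the integers 1 to 100), the sample size, and the function name `simulate_sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated samples without replacement and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population: the integers 1..100 (population mean 50.5).
population = list(range(1, 101))
means = simulate_sample_means(population, sample_size=10, n_samples=2000)

# The simulated means cluster around the population mean and vary
# less than individual observations do.
print(statistics.mean(means), statistics.stdev(means))
```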

Simulation: In general, simulation is modelling of a process or phenomenon. In statistics, Monte Carlo simulation is often used to model outcomes of a random experiment. This kind of simulation rests on generation of pseudo-random numbers - that is, numbers which behave like truly random numbers, though generated by a...
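
A short Monte Carlo sketch, assuming a simple random experiment not taken from the entry: rolling two fair dice and estimating the probability that they sum to seven (the exact value is 1/6). The function name and the seed are hypothetical. Because the numbers are pseudo-random, fixing the seed reproduces the same result on every run.

```python
import random

def estimate_probability_sum_is_seven(n_trials, seed=42):
    """Estimate P(two dice sum to 7) as the fraction of simulated rolls that hit."""
    rng = random.Random(seed)  # seeded pseudo-random number generator
    hits = 0
    for _ in range(n_trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / n_trials

estimate = estimate_probability_sum_is_seven(100_000)
print(estimate)  # should be close to 1/6
```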

Singularity: In regression analysis, singularity is the extreme form of multicollinearity - when a perfect linear relationship exists between variables or, in other terms, when the correlation coefficient is equal to 1.0 or -1.0. Such absolute multicollinearity could arise when independent variables are linearly related in their definition. A...

Six-Sigma: Six sigma means literally six standard deviations. The phrase refers to the limits drawn on statistical process control charts used to plot statistics from samples taken regularly from a production process. Consider the process mean. A process is deemed to be "in control" at any given point in time...

Standard Score: The standard score of an observation is the number of standard deviation units it is above or below the mean. The standard score of an observation is calculated by subtracting the mean from the observation, then dividing by the standard deviation.
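
The calculation described above can be sketched as follows. The entry does not say whether the population or the sample standard deviation is meant; this example assumes the population standard deviation (`statistics.pstdev`), and the data are hypothetical.

```python
import statistics

def standard_scores(values):
    """Return (x - mean) / standard deviation for each observation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    return [(x - mean) / sd for x in values]

# Hypothetical data with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = standard_scores(data)
print(z)
```

Here the observation 9 lies two standard deviations above the mean, and the observation 2 lies one and a half below it.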

State Space: State space is an abstract space representing possible states of a system. A point in the state space is a vector of the values of all relevant parameters of the system. It is often assumed that the system is dynamic - that is, its state at...

Statistic: 1. A number measuring something 2. A measure calculated from a sample of data. Contrast "statistic" (drawn from a sample) with "parameter," which is a characteristic of a population. For example, the sample mean is a statistic; the population mean is a parameter of a population. See also: statistics...

Statistics: 1. A collection of numerical data that measure something. 2. The science of recording, organizing, analyzing and reporting quantitative information. See also: statistic

Survival Analysis: Survival analysis is concerned with "time-to-event" data. In medical statistics, the data are often in the form of "time-to-death". In the analysis of production or industrial data, "time-to-failure" is a typical application. However, the event of interest need not be either failure or death - for example, one...

Time Series: Time series data are measurements of a variable taken at regular intervals over time. Time series are represented as sequences of values like x(1), x(2), ... . A wide class of practically important data are represented as time series: economic and social data, weather records, sports data, to...

Time Series Analysis: Time series analysis is a branch of statistics dealing with data represented as time series. Time series analysis includes almost all classes of statistical approaches and problems: data description, hypothesis testing, parameter estimation, regression, etc. The practical importance of time series analysis stems...

Transformation: Transformation is the conversion of a data set into a transformed data set by the application of a function. The statistical purpose of transformation is to produce a transformed data set that better conforms to the requirements of a statistical procedure. A typical use of transformation is to take...
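
As an illustration of applying a function to a data set, the sketch below uses the natural logarithm. This is one commonly used transformation and is an assumption here; it is not necessarily the example the entry goes on to give. The income figures are hypothetical.

```python
import math

def log_transform(values):
    """Apply the natural logarithm to every (strictly positive) value."""
    return [math.log(v) for v in values]

# Hypothetical right-skewed data: one value far larger than the rest.
incomes = [20_000, 25_000, 30_000, 40_000, 1_000_000]
logged = log_transform(incomes)

# The transformed values keep their order but are far less spread out.
print(logged)
```

The logarithm is defined only for positive values, so data containing zeros or negatives would need a different function or an adjustment first.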

Truncation: Truncation, generally speaking, means to shorten. In statistics it can mean the process of limiting consideration or analysis to data that meet certain criteria (for example, the patients still alive at a certain point). Or it can refer to a data distribution where values above or below a certain...

Uncertainty and Statistics: A main goal of statistics is to quantify or measure uncertainty; this branch of statistics is called "inferential statistics." Classical statistics measures uncertainty using fundamental concepts and theories of probability and randomness. Modern statistics often applies Monte Carlo simulation as well. For example, suppose you...

Univariate: Univariate analysis involves a single variable of interest.

