In statistics, “n” denotes the size of a dataset, typically a sample, in terms of the number of observations or records.
Monthly Archives: April 2016
Week #17 – Corpus
A corpus is a body of documents to be used in a text mining task. Some corpuses are standard public collections of documents that are commonly used to benchmark and tune new text mining algorithms. More typically, the corpus is a body of documents for a specific text mining task – e.g. a set of …
Historical Spotlight: Eugenics – journey to the dark side at the dawn of statistics
April 27 marks the 80th anniversary of the death of Karl Pearson, who contributed to statistics the correlation coefficient, principal components, the (increasingly-maligned) p-value, and much more. Pearson was one of a trio of founding fathers of modern statistics, the others being Francis Galton and Ronald Fisher. Galton, Pearson and Fisher were deeply involved with …