Week #17 – Bootstrapping

Bootstrapping is sampling with replacement from observed data to estimate the variability in a statistic of interest. See also permutation tests, a related form of resampling. A common application
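
As a rough illustration of the idea, here is a minimal sketch that estimates the standard error of a sample mean by resampling with replacement. The function name `bootstrap_se`, the number of resamples, and the data values are all made up for this example.

```python
import random
import statistics

def bootstrap_se(data, stat=statistics.mean, n_resamples=2000, seed=0):
    """Estimate the standard error of `stat` by resampling `data` with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)  # draw n values with replacement
        estimates.append(stat(resample))
    # The spread of the resampled statistics approximates its variability
    return statistics.stdev(estimates)

sample = [12, 15, 9, 20, 14, 17, 11, 13]  # hypothetical observations
se = bootstrap_se(sample)
print(f"bootstrap standard error of the mean: {se:.2f}")
```

Any other statistic (median, trimmed mean, and so on) can be passed as `stat` in place of the mean.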

Week #16 – Binomial Distribution

A binomial distribution is used to describe an experiment, event, or process for which the probability of success is the same for each trial and each trial has only two possible outcomes.
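
To make the definition concrete, the sketch below computes the probability of exactly k successes in n independent trials, each with success probability p. The function name `binomial_pmf` and the coin-flip example are illustrative choices, not part of the original entry.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: probability of exactly 3 heads in 5 flips of a fair coin
prob = binomial_pmf(3, 5, 0.5)
print(prob)  # 0.3125
```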

Week #15 – Uplift or Persuasion Modeling

A combination of treatment comparisons (e.g. send a sales solicitation, or send nothing) and predictive modeling to determine which cases or subjects respond (e.g. purchase or not) to which treatments.

Week #13 – Multiplicity issues

Multiplicity issues arise in a number of contexts, but they generally boil down to the same thing:  repeated looks at a data set in different ways, until something "statistically significant" emerges.
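
A quick calculation shows why repeated looks are a problem. The sketch below assumes the tests are independent and that every null hypothesis is true (a simplification; real analyses of one data set are usually correlated), and computes the chance that at least one test comes out "significant" by luck alone. The function name is hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive among m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 20):
    rate = familywise_error_rate(0.05, m)
    print(f"{m:2d} tests at alpha=0.05: P(at least one false positive) = {rate:.3f}")
```

With 20 such tests, the chance of at least one spurious finding is already about 64%.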

Week #12 – Support vector machines

Support vector machines are used in data mining (predictive modeling, to be specific) for classification of records, by learning from training data.

Week #11 – Attribute

In data analysis or data mining, an attribute is a characteristic or feature that is measured for each observation (record) and can vary from one observation to another.  It might

Week #10 – Negative Binomial

The negative binomial distribution is the probability distribution of the number of Bernoulli (yes/no) trials required to obtain r successes.
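
Under this "number of trials" definition, and assuming independent trials with a common success probability p, the probability that the r-th success arrives on trial n can be sketched as follows. The function name and example values are illustrative only.

```python
from math import comb

def negative_binomial_pmf(n, r, p):
    """Probability that the r-th success occurs on trial n (n >= r)."""
    if n < r:
        return 0.0
    # The first n-1 trials contain exactly r-1 successes, then trial n is a success
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

# Example: probability that the 2nd head appears on the 3rd flip of a fair coin
prob = negative_binomial_pmf(3, 2, 0.5)
print(prob)  # 0.25
```

Note that some references instead count the number of failures before the r-th success, which shifts the distribution by r.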

Week #9 – Random Walk

A random walk is a process of random steps, motions, or transitions.  It might be in one dimension (movement along a line), in two dimensions (movements in a plane), or in three dimensions or more.
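
The one-dimensional case is easy to simulate. The sketch below assumes the simplest version, where each step is +1 or -1 with equal probability; the function name and the step rule are choices made for this example.

```python
import random

def random_walk_1d(n_steps, seed=None):
    """Simulate a simple random walk on a line, starting at 0."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))  # step left or right with equal probability
        path.append(position)
    return path

walk = random_walk_1d(10, seed=1)
print(walk)
```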

Week #5 – Differencing of a Time Series

Differencing of a time series in discrete time is the transformation of the series to a new time series where the values are the differences between consecutive values of the original series.
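
In code, first-order differencing is a one-liner. This is a minimal sketch with a hypothetical function name and made-up values; note that the differenced series is one element shorter than the original.

```python
def difference(series):
    """Return the series of differences between consecutive values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

original = [3, 5, 4, 8]
print(difference(original))  # [2, -1, 4]
```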

Week #1 – Data Partitioning

In predictive modeling, data partitioning is the division of the data available for analysis into two or three non-overlapping
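
One possible way to carry out such a division is to shuffle the records and slice them. In the sketch below, the partition names (training, validation, test), the 60/20/20 split, and the function name are assumptions following common convention; they are not taken from the entry above.

```python
import random

def partition(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split records into three non-overlapping parts."""
    rng = random.Random(seed)
    shuffled = list(records)  # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fractions[0] * n)
    n_valid = int(fractions[1] * n)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains
    return train, valid, test

train, valid, test = partition(range(10))
print(len(train), len(valid), len(test))
```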

Churn Trigger

Last year's popular story out of the Predictive Analytics World conference series was Andrew Pole's presentation of Target's methodology for predicting which customers were pregnant.

Randomized Trials on online learning

Evidence shows that there is no significant difference between taking an online introductory statistics course and a traditional in-person class.

Facebook IPO

Facebook began trading around 11:30 this morning, and I spent 8 minutes

Congratulations to Thomas Lumley!

Newly elected American Statistical Association (ASA) Fellow, and recognized for his outstanding professional contributions to and leadership in the field of statistical science.

Immigration

Arizona's immigration law goes before the Supreme Court this week...

Julian Simon birthday

February 12 was the 80th anniversary of the birth of Julian Simon, an early pioneer in resampling methods.

Statistics for Future Presidents

Steve Pierson, Director of Science Policy at ASA, wrote an interesting blog post wondering how statistics for future presidents (or policymakers more generally) would compare with the statistical skills and concepts recommended for others. Take a look and let him know!

The Data Scientist

The stories of the prospective Facebook IPO and the prior IPOs of LinkedIn, Pandora, and Groupon all involve "data scientists". Read an interview with Monica Rogati, Senior Data Scientist at LinkedIn, to see the connection.

Popular Mistakes in Data Mining

John Elder's presentations on common data mining mistakes are a must-see if you have any experience or plans in the data mining arena.

Coffee causes cancer?

"Any claim coming from an observational study is most likely to be wrong." Thus begins "Deming, data and observational studies," just published in "Significance Magazine" (Sept. 2011).

The sacrifice bunt

I was watching a Washington Nationals game on TV a couple of days ago, and the concept of "expected value" ...

Epidemiologist joke

A neurosurgeon, pathologist and epidemiologist are each told to examine a can of sardines on a table in a closed room, and present a report.

The Power of Round

Advertisers shy away from round numbers, believing that $99 appears significantly cheaper than $100...

March Madness

Did the NCAA get the March Madness rankings right? Check out SportsMeasures.com

Bees on the attack

What does Matt Asher's article "Attack of the Hair Trigger Bees" have to do with global warming? Matt Asher runs statisticsblog.com ...

Catastrophe Modeling Assistant

Thinking about careers that use statistics? The job title "catastrophe modeling assistant" caught my eye recently in a job announcement. ...

Random Monkeys

One of my gifts this holiday season was "The Drunkard's Walk: How Randomness Rules Our Lives,"
