Week #11 – Distance

Statistical distance is a measure calculated between two records that are typically part of a larger dataset, where rows are records and columns are variables.  To calculate...

Week #10 – Decile Lift

In predictive modeling, the goal is to make predictions about outcomes on a case-by-case basis:  an insurance claim will be fraudulent or not, a tax return will be correct or in error, a subscriber...

Week #9 – Decision Trees

In the machine learning community, a decision tree is a branching set of rules used to classify a record, or predict a continuous value for a record.  For example...

Week #8 – Feature Selection

In predictive modeling, feature selection, also called variable selection, is the process (usually automated) of sorting through variables to retain variables that are likely...

Week #7 – Bagging

In predictive modeling, bagging is an ensemble method that uses bootstrap replicates of the original training data to fit predictive models.
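
As a rough illustration of that definition, the sketch below draws bootstrap replicates (samples taken with replacement) from a small training set, fits a model to each, and averages the resulting predictions. The toy data, the choice of a simple least-squares line as the base model, and the names bootstrap_replicate, fit_line, and bagged_predict are all assumptions made for this example; they do not come from the post.

```python
import random
from statistics import mean

def bootstrap_replicate(rows, rng):
    """Draw len(rows) records with replacement from the training data."""
    return [rng.choice(rows) for _ in rows]

def fit_line(rows):
    """Least-squares fit of y = a + b*x on (x, y) pairs; returns (a, b)."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # The replicate contains a single distinct x: fall back to a flat line.
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in rows) / sxx
    return y_bar - b * x_bar, b

def bagged_predict(rows, x_new, n_models=50, seed=0):
    """Fit one model per bootstrap replicate and average their predictions."""
    rng = random.Random(seed)
    models = [fit_line(bootstrap_replicate(rows, rng)) for _ in range(n_models)]
    return mean(a + b * x_new for a, b in models)

# Hypothetical toy training data: (x, y) pairs.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
prediction = bagged_predict(train, 6)
```

A straight line is a stable model, so bagging changes little here; the technique is usually applied to unstable base models such as decision trees, where averaging over replicates reduces variance.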

Week #6 – Boosting

In predictive modeling, boosting is an iterative ensemble method that starts out by applying a classification algorithm and generating classifications.

Week #5 – Ensemble Methods

In predictive modeling, ensemble methods refer to the practice of taking multiple models and averaging their predictions.
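
A minimal sketch of the averaging step, assuming each model can be called as a function that returns a numeric prediction. The three models and the name ensemble_predict are hypothetical and exist only for illustration.

```python
from statistics import mean

def ensemble_predict(models, x):
    """Average the predictions of several models for a single input."""
    return mean(model(x) for model in models)

# Three hypothetical models, each a plain function of x.
models = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 1.0,
    lambda x: 1.5 * x + 2.0,
]

averaged = ensemble_predict(models, 4.0)
```

A plain average treats every model equally; weighting models by their validation performance is a common variation.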
