Predictive Analytics 3 – Dimension Reduction, Clustering, and Association Rules

This course will teach you key unsupervised learning techniques (association rules, principal components analysis, and clustering) and will include an integration of supervised and unsupervised learning techniques.

Overview

In this course, you will cover key unsupervised learning techniques including association rules, principal components analysis, and clustering. You will also review integration of supervised and unsupervised learning techniques.

Participants will apply data mining algorithms to real data, and will interpret the results. A final project will integrate an unsupervised task with supervised methods covered in our Predictive Analytics 1 and Predictive Analytics 2 courses. This course uses Analytic Solver Data Mining (previously called XLMiner), a data-mining add-in for Excel.

Note: If you prefer to work in R or Python, a version of this course using R or Python is also offered.

  • Introductory, Intermediate
  • 4 Weeks
  • Expert Instructor
  • Tuition-Back Guarantee
  • 100% Online
  • TA Support

Learning Outcomes

After completing this course, you will understand the issues that arise from using too many predictors and how to reduce them to a smaller number of usable “components.” You will use various clustering techniques and association rules to describe clusters of similar records and to find patterns in your data. You will learn to use Excel-based tools to implement the models covered in this course, and how to combine supervised and unsupervised models.

  • Understand the issues related to using too many predictors (the “curse of dimensionality”)
  • Use principal components analysis to reduce the number of predictors to a smaller number of “components” of correlated predictors
  • Use hierarchical clustering and k-means clustering to find and describe clusters of similar records
  • Use association rules to find patterns of “what goes with what” in transaction data
  • Combine unsupervised and supervised learning methods in a final project

Who Should Take This Course

Marketers seeking to specify customer segments and identify associations among products purchased, environmental scientists seeking to cluster observations, analysts who need to identify the key variables out of many, MBAs seeking to update their knowledge of quantitative techniques, managers and scientists who want to see what data mining can do, and anyone who wants a practical, hands-on grounding in basic data-mining techniques.

Course Syllabus

Week 1

Dimension Reduction

  • Detecting information overlap using domain knowledge and data summaries and charts
  • Removing or combining redundant variables and categories
  • Dealing with multi-category variables
  • Automated dimension reduction techniques
    • Principal Components Analysis (PCA)
    • Predictive algorithms with variable selection techniques
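
As a rough illustration of the PCA topic listed above (this is not course material), the sketch below finds the first principal component of a small table using power iteration on the covariance matrix. The data, the function name `first_principal_component`, and the choice to work on unstandardized covariance rather than correlation are all assumptions made for the example; the course itself uses Analytic Solver Data Mining, R, or Python libraries.

```python
import math

# Hypothetical toy data: two perfectly correlated predictors (y = 2x).
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def first_principal_component(rows, iterations=200):
    """Return (unit direction, share of total variance) of the first component.

    Assumption: PCA on the sample covariance matrix, without standardizing.
    """
    n = len(rows)
    p = len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (divide by n - 1).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v, as a share of the total variance.
    variance_along_v = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    return v, variance_along_v / total_variance


component, share = first_principal_component(rows)
print(component, share)
```

Because the two toy predictors overlap completely, a single component captures essentially all of the variance, which is the situation in which replacing many correlated predictors with a few components loses little information.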

Week 2

Cluster Analysis

  • Popular uses of cluster analysis
  • Clustering approaches
  • Hierarchical Clustering
    • Distances between records
    • Distances between clusters
    • Dendrograms
    • Validating clusters
    • Strengths and weaknesses
  • K-Means Clustering
    • Initializing the k clusters
    • Distance of a record from a cluster
    • Within-cluster homogeneity
    • Elbow charts
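
The k-means topics listed above (initializing clusters, distance of a record from a cluster, within-cluster homogeneity, and elbow charts) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the course's implementation: the points are invented, the function `k_means` is hypothetical, and the centroids are initialized from the first k records purely to keep the example repeatable (random initialization is more common in practice).

```python
# Hypothetical toy records forming two obvious groups.
points = [
    (1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
    (8.0, 8.0), (8.0, 9.0), (9.0, 8.0),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100):
    """Return (centroids, within-cluster sum of squares)."""
    # Assumption: use the first k records as starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assign each record to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its records (keep it if empty).
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Within-cluster homogeneity: distance of each record to its nearest centroid.
    wcss = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, wcss


# Values an elbow chart would plot: within-cluster sum of squares for each k.
wcss_by_k = {k: k_means(points, k)[1] for k in range(1, 4)}
print(wcss_by_k)
```

On this toy data the within-cluster sum of squares drops sharply from k = 1 to k = 2 and does not improve at k = 3, which is the kind of "elbow" used to choose the number of clusters.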

Week 3

Association Rules and Recommender Systems

  • Discovering association rules in transaction databases
    • Support, confidence and lift
    • The apriori algorithm
    • Shortcomings
  • Collaborative filtering
    • Item-based
    • Person-based
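
For readers new to the association-rule measures listed above, the following short Python sketch computes support, confidence, and lift for one rule. The transactions and function names are hypothetical, chosen only to illustrate the standard definitions; it does not implement the Apriori search itself.

```python
# Hypothetical toy transaction database (each set is one basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the full rule divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support (1.0 means independence)."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


# Rule: {bread} -> {milk}
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_lift = lift({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence, rule_lift)
```

In this toy data the lift comes out below 1, meaning bread buyers are slightly less likely than average to buy milk, so the rule would not be an interesting one despite its moderate confidence.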

Week 4

Integrating Supervised and Unsupervised Methods; Introduction to Network and Text Analytics

  • The role of unsupervised methods in predictive analytics
    • Dimension reduction of predictor space
    • Predictive models on subsets of homogeneous records
  • Advantages and weaknesses of combining unsupervised and supervised methods
  • Network analytics
  • Text analytics
  • Unsupervised methods used in network and text analytics

Class Dates

2024

01/12/2024 to 02/09/2024
Instructors: Mr. Anthony Babinec
05/10/2024 to 06/07/2024
Instructors: Mr. Anthony Babinec
09/13/2024 to 10/11/2024
Instructors: Mr. Anthony Babinec

2025

01/10/2025 to 02/07/2025
Instructors: Mr. Anthony Babinec
05/09/2025 to 06/06/2025
Instructors: Mr. Anthony Babinec
09/12/2025 to 10/10/2025
Instructors: Mr. Anthony Babinec

Prerequisites

One lesson in this course uses supervised and unsupervised learning techniques in combination, so unless you do not need that portion, you should be familiar with supervised learning methods, such as those presented in our Predictive Analytics 1 course.

Predictive Analytics 1 – Machine Learning Tools

This online course introduces the basic paradigm of predictive modeling: classification and prediction.
  • Skill: Introductory, Intermediate
  • Credit Options: ACE, CAP, CEU

Additional Information

Homework

Homework in this course consists of short answer questions to test concepts, and guided data analysis problems using software.

In addition to assigned readings, this course has supplemental video lectures and an end-of-course data modeling project.

Course Text

If you are using Analytic Solver Data Mining (previously XLMiner)
The required text for this course is Machine Learning for Business Analytics: Concepts, Techniques, and Applications in Analytic Solver Data Mining, 4th Edition (2023), by Galit Shmueli, Peter Bruce, Kuber Deokar, and Nitin Patel. Also available at Amazon.

If you are using Python
The required text for this course is Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python (2019), by Galit Shmueli, Peter Bruce, Peter Gedeck, Inbal Yahav, and Nitin Patel. Also available at Amazon.

If you are using R
The required text for this course is Machine Learning for Business Analytics: Concepts, Techniques, and Applications in R, 2nd Edition (2023), by Galit Shmueli, Peter Bruce, Peter Gedeck, Inbal Yahav, and Nitin Patel. Also available at Amazon.

Software

This is a hands-on course, and participants will apply data mining algorithms to real data.

This course uses Analytic Solver Data Mining (previously called XLMiner), a data-mining add-in for Excel. We also offer a course using R or Python.

Course participants will receive a license for Analytic Solver Data Mining (previously XLMiner) at nominal cost; this is a special version for this course.

IMPORTANT:  Do NOT download the free trial version available at solver.com.

Supplemental Information

Literacy, Accessibility, and Dyslexia

At Statistics.com, we aim to provide a learning environment suitable for everyone. To help you get the most out of your learning experience, we have researched and tested several assistance tools. For students with dyslexia, colorblindness, or reading difficulties, we recommend the following web browser add-ons and extensions:

 

Chrome

Firefox

Safari

  • Navidys (for colorblindness, dyslexia, and reading difficulties)
  • HelperBird for Safari (for colorblindness, dyslexia, and reading difficulties)
