Statistics.com

Predictive Analytics 3 – Dimension Reduction, Clustering, and Association Rules with Python

This course, with a focus on Python, will teach you key unsupervised learning techniques (association rules, principal components analysis, and clustering) and will include an integration of supervised and unsupervised learning techniques.

$549 | Enroll Now

Overview

In this course, you will cover key unsupervised learning techniques including association rules, principal components analysis, and clustering. You will also review the integration of supervised and unsupervised learning techniques. Participants will apply data mining algorithms to real data and interpret the results. A final project will integrate an unsupervised task with supervised methods covered in Predictive Analytics 1 and 2. Students will use Python, a free software environment with statistical computing and graphics capabilities. Note: If you prefer to work in R or XLMiner, this course is also offered in R and XLMiner versions.

Learning Outcomes

After completing this course, you will understand the issues that arise from using too many predictors and how to reduce the predictors to a smaller number of usable “components.” You will use various clustering techniques and association rules to describe clusters of similar records and to find patterns in your data. You will learn to use Python to implement the models covered in this course, and how to combine supervised and unsupervised models.

  • Use principal components analysis and variable selection techniques to reduce dimensionality
  • Cluster records using hierarchical and k-means clustering
  • Discover association rules in transaction databases
  • Specify how collaborative filtering can be used to develop automated recommendations
  • Integrate unsupervised and supervised data mining methods in a case study
  • Use Python’s scikit-learn package to implement the models in the course

Who Should Take This Course

This course is for marketers seeking to specify customer segments, identify associations among products purchased, and design recommender systems; MBAs seeking to update their knowledge of quantitative techniques; managers and scientists who want to see what data mining can do; and anyone who wants a practical, hands-on grounding in basic data mining techniques.

Instructors

Dr. Peter Gedeck

Peter Gedeck is at the forefront of the use of data science in drug discovery. He is a Senior Data Scientist at Collaborative Drug Discovery, which offers the pharmaceutical industry cloud-based software to manage the huge amount of data involved in the drug discovery process. Drug discovery involves the exploration and testing of huge numbers of molecule combinations, and much of that testing takes place analytically, hence the need for robust software to handle the data and provide a framework for analyzing it. Peter's specialty is the development of machine learning algorithms to predict biological and physicochemical properties of drug candidates. Prior to this, he worked for twenty y...

See Instructor Bio

Course Syllabus

Week 1

Dimension Reduction

  • Detecting information overlap using domain knowledge and data summaries and charts
  • Removing or combining redundant variables and categories
  • Dealing with multi-category variables
  • Automated dimension reduction techniques
    • Principal Components Analysis (PCA)
    • Predictive algorithms with variable selection techniques
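
As an illustration only (it is not taken from the course materials), the idea behind PCA can be sketched for two overlapping variables. The sketch assumes made-up data and uses only the Python standard library so it runs anywhere; the function name `pca_2d` is hypothetical, and it works on the raw covariance matrix without standardizing the variables. In practice a library such as scikit-learn would be used instead.

```python
import math


def pca_2d(xs, ys):
    """Principal components of two variables via their 2x2 covariance matrix."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    det = sxx * syy - sxy ** 2
    gap = math.sqrt(max(half_trace ** 2 - det, 0.0))
    eigenvalues = (half_trace + gap, half_trace - gap)
    # Direction of the first component (eigenvector of the larger eigenvalue).
    if sxy != 0:
        vx, vy = sxy, eigenvalues[0] - sxx
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    length = math.hypot(vx, vy)
    return eigenvalues, (vx / length, vy / length)


# Made-up data: y is roughly 2 * x, so the two variables overlap heavily.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

eigenvalues, first_component = pca_2d(xs, ys)
share_first = eigenvalues[0] / sum(eigenvalues)
print(f"Variance explained by the first component: {share_first:.1%}")
```

When one component carries nearly all of the variance, as it does for this toy data, the two original variables can be replaced by that single component with little loss of information.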

Week 2

Cluster Analysis

  • Popular uses of cluster analysis
  • Clustering approaches
  • Hierarchical Clustering
    • Distances between records
    • Distances between clusters
    • Dendrograms
    • Validating clusters
    • Strengths and weaknesses
  • K-Means Clustering
    • Initializing the k clusters
    • Distance of a record from a cluster
    • Within-cluster homogeneity
    • Elbow charts
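
The k-means topics listed above (initializing clusters, distance of a record from a cluster, within-cluster homogeneity, and the elbow chart) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the course's implementation: the data are invented, the helper names are hypothetical, and the initial centroids are picked deterministically from the data rather than at random so the result is reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, iterations=100):
    """Basic k-means; returns centroids, clusters, and within-cluster sum of squares."""
    # Deterministic initialization (an assumption): spread the picks across the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assign each record to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute centroids; an empty cluster keeps its old centroid.
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    wss = sum(sq_dist(p, centroids[i]) for i, c in enumerate(clusters) for p in c)
    return centroids, clusters, wss


# Made-up records forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]

# Within-cluster sum of squares for each k: the numbers behind an elbow chart.
wss_by_k = {k: kmeans(points, k)[2] for k in (1, 2, 3)}
for k, wss in wss_by_k.items():
    print(f"k={k}: within-cluster sum of squares = {wss:.2f}")
```

For this toy data the within-cluster sum of squares drops sharply from k=1 to k=2 and only slightly from k=2 to k=3, which is the "elbow" that suggests two clusters.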

Week 3

Association Rules and Recommender Systems

  • Discovering association rules in transaction databases
    • Support, confidence and lift
    • The apriori algorithm
    • Shortcomings
  • Collaborative filtering
    • Person-based
    • Item-based
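
To make support, confidence, and lift concrete, here is a small worked sketch. It is an illustration only: the transactions are invented, the function names are hypothetical, and it evaluates a single rule directly rather than running the apriori algorithm to search for rules.

```python
# Invented market-basket transactions (each is a set of purchased items).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    rule_support = support(antecedent | consequent)
    confidence = rule_support / support(antecedent)
    # Lift compares confidence with the consequent's baseline frequency.
    lift = confidence / support(consequent)
    return rule_support, confidence, lift


rule_support, confidence, lift = rule_metrics({"diapers"}, {"beer"})
print(f"support={rule_support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the antecedent and consequent appear together more often than would be expected if they were independent; a lift near 1 suggests the rule adds little.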

 

Week 4

Integrating Supervised and Unsupervised Methods; Introduction to Network and Text Analytics

  • The role of unsupervised methods in predictive analytics
    • Dimension reduction of predictor space
    • Predictive models on subsets of homogeneous records
  • Advantages and weaknesses of combining unsupervised and supervised methods
  • Network analytics
  • Text analytics
  • Unsupervised methods used in network and text analytics

Class Dates

2021

May 14, 2021 to Jun 11, 2021

Sep 10, 2021 to Oct 8, 2021

2022

Jan 14, 2022 to Feb 11, 2022

May 13, 2022 to Jun 10, 2022

2023

No classes scheduled at this time.

Prerequisites

Introductory Statistics

We assume you are versed in statistics or have an equivalent understanding of the topics covered in our Statistics 1 and Statistics 2 courses, but we do not require those courses as eligibility to enroll in this course. Please review the course description for each of our introductory statistics courses, estimate which best matches your level of understanding of the material covered in these courses, then take the short assessment test for that course. If you cannot answer more than half of the questions correctly, we suggest you take our Statistics 1 and Statistics 2 courses prior to taking this course.

    • For Statistics 1 – Probability and Study Design, take this assessment test.
    • For Statistics 2 – Inference and Association, take this assessment test.

In addition, there is a lesson in the course where supervised and unsupervised learning techniques are used in combination, so, unless you do not need this portion, you should be familiar with supervised learning methods, such as those presented in Predictive Analytics 1 with Python.

Recommended

We recommend, but do not require as eligibility to enroll in this course, an understanding of the material covered in the following course.

Predictive Analytics 1 – Machine Learning Tools with Python

This course introduces the basic paradigm for predictive modeling: classification and prediction.
Topic: Data Science, Analytics, Machine Learning, Prediction/Forecasting, Using Python | Skill: Introductory, Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: May 14, 2021, Sep 10, 2021, Jan 14, 2022, May 13, 2022

What Our Students Say

I have truly enjoyed this course with Professor Babinec!  All the courses I have taken statistics.com have met my expectations and made me happy because of my professional growth, but this course as well as the course on Likelihood Estimation have surpassed all my expectations!!

Beatriz C. Santiago

Thank you for the excellent course. I appreciated the timely and detailed responses from Dr. Babinec (for e.g. normalization example when non-numeric predictors are involved). Discussions initiated by other students were also very valuable - esp for PCA and Clusters

Satish Rao
IBM

Frequently Asked Questions

What is your satisfaction guarantee and how does it work?

We offer a “Student Satisfaction Guarantee” that includes a tuition-back guarantee, so go ahead and take our courses risk free. That’s our commitment to student satisfaction. Students may cancel, transfer, or withdraw from a course under certain conditions. If you’re not satisfied with a course, you may withdraw from the course and receive a tuition refund.

Please see our knowledge center for more information.

Who are the instructors at the Institute?

The Institute has more than 60 instructors who are recruited based on their expertise in various areas in statistics. Our faculty members are:

  • Authors of well-regarded texts in their area;
  • Advisory board members;
  • Senior faculty; and
  • Educators who have made important contributions to the field of statistics or online education in statistics.

The majority of our instructors have more than five years of teaching experience online at the Institute.

Please visit our faculty page for more information on each instructor at The Institute for Statistics Education.

What type of courses does the Institute offer?

The Institute offers approximately 80 courses each year. Offerings include basic survey courses for novices, a full sequence of introductory statistics courses, and bridge courses to more advanced topics. Our courses cover a range of topics including biostatistics, research statistics, data mining, business analytics, survey statistics, and environmental statistics.

Please see our course search or knowledge center for more information.

Do your courses have for-credit options?

Our courses have several for-credit options:

  • Continuing education units (CEU)
  • College credit through The American Council on Education (ACE CREDIT)
  • Course credits that are transferable to the INFORMS Certified Analytics Professional (CAP®)

Is the Institute for Statistics Education certified?

The Institute for Statistics Education is certified to operate by the State Council of Higher Education for Virginia (SCHEV). For more information visit: https://www.schev.edu/

Visit our knowledge base and learn more.

Related Courses

Predictive Analytics 1 – Machine Learning Tools with Python

This course introduces the basic paradigm for predictive modeling: classification and prediction.
Topic: Data Science, Analytics, Machine Learning, Prediction/Forecasting, Using Python | Skill: Introductory, Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: May 14, 2021, Sep 10, 2021, Jan 14, 2022, May 13, 2022

Predictive Analytics 2 – Neural Nets and Regression with Python

As a continuation of Predictive Analytics 1, this course introduces the basic concepts in predictive analytics, with a focus on Python, to visualize and explore predictive modeling.
Topic: Data Science, Analytics, Machine Learning, Prediction/Forecasting, SQL, Using Python | Skill: Introductory, Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: Mar 12, 2021, Jul 9, 2021, Nov 12, 2021, Mar 11, 2022, Jul 8, 2022
Predictive Analytics 3 – Dimension Reduction, Clustering, and Association Rules

This course will teach you key unsupervised learning techniques (association rules, principal components analysis, and clustering) and will include an integration of supervised and unsupervised learning techniques.
Topic: Analytics, Prediction/Forecasting | Skill: Introductory, Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: May 14, 2021, Sep 10, 2021, Jan 14, 2022, May 13, 2022
Predictive Analytics 3 – Dimension Reduction, Clustering, and Association Rules with R

This course, with a focus on R, will teach you key unsupervised learning techniques (association rules, principal components analysis, and clustering) and will include an integration of supervised and unsupervised learning techniques.
Topic: Data Science, Analytics, Machine Learning, Prediction/Forecasting, Using R | Skill: Introductory, Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: May 14, 2021, Sep 10, 2021, Jan 14, 2022, May 13, 2022

Additional Course Information

Organization of Course

This course takes place online at The Institute for 4 weeks. During each course week, you participate at times of your own choosing – there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirements

This is a 4-week course requiring 10-15 hours per week of review and study, at times of your choosing.

Homework

Homework in this course consists of short answer questions to test concepts, and guided data analysis problems using software.

In addition to assigned readings, this course also has supplemental video lectures and an end-of-course data modeling project.

Course Text

The required text for this course is Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python, by Shmueli, Bruce, Gedeck and Patel. This same text is also used in the previous courses: “Predictive Analytics 1 – Machine Learning Tools – with Python” and “Predictive Analytics 2 – Neural Nets and Regression – with Python”. Please order a copy of your course textbook prior to course start date.

Software

This is a hands-on course, and participants will apply data mining algorithms to real data. The course will use Python, a free software environment with statistical computing and graphics capabilities. We also offer an R section and an XLMiner section (Excel add-in) for this course.

Software Uses and Descriptions | Available Free Versions
To learn more about the software used in this course, or how to obtain free versions of software used in our courses, please read our knowledge base article “What software is used in courses?” 

Course Fee & Information

Enrollment
Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date unless you specify otherwise.

Transfers and Withdrawals
We have flexible policies to transfer to another course or withdraw if necessary.

Group Rates
Contact us to get information on group rates.

Discounts
Academic affiliation?  In most courses you are eligible for a discount at checkout.

New to Statistics.com?  Click here for a special introductory discount code.  

Invoice or Purchase Order
Add $50 service fee if you require a prior invoice, or if you need to submit a purchase order or voucher, pay by wire transfer or EFT, or refund and reprocess a prior payment.

Options for Credit and Recognition

This course is eligible for the following credit and recognition options:

No Credit
You may take this course without pursuing credit or a record of completion.

Mastery or Certificate Program Credit
If you are enrolled in a mastery or certificate program that requires demonstration of proficiency in this subject, your course work may be assessed for a grade.

CEUs and Proof of Completion
If you require a “Record of Course Completion” along with professional development credit in the form of Continuing Education Units (CEUs), The Institute will issue both upon your request once you have successfully completed the course.

Have a Question About This Course?

Janet Dobbins

Sales and Business Development

Phone

(571) 281-8817

About Statistics.com

Statistics.com offers academic and professional education in statistics, analytics, and data science at beginner, intermediate, and advanced levels of instruction. Statistics.com is a part of Elder Research, a data science consultancy with 25 years of experience in data analytics.

Contact

The Institute for Statistics Education
4075 Wilson Blvd, 8th Floor
Arlington, VA 22203
(571) 281-8817

ourcourses@statistics.com

© Copyright 2021 - Statistics.com, LLC | All Rights Reserved | Privacy Policy | Terms of Use
