Flexible, affordable statistics education.

Designed to help you master the software you need to enhance your skills and the practical experience you need to get ahead.

Data Mining: Unsupervised Techniques

taught by Tony Babinec


Brief Description:

This course covers key unsupervised learning techniques - association rules, principal components analysis, and clustering. The course will include an integration of supervised and unsupervised learning techniques.

Instructor(s): Tony Babinec
Level: Intermediate/Introductory

Who Should Take This Course:

Marketers seeking to specify customer segments and identify associations among products purchased, environmental scientists seeking to cluster observations, analysts who need to identify the key variables out of many, MBAs seeking to update their knowledge of quantitative techniques, managers and scientists who want to see what data mining can do, and anyone who wants a practical, hands-on grounding in basic data mining techniques.

Dates:
October 12, 2012 to November 09, 2012
October 11, 2013 to November 08, 2013

Registration:
Please read the syllabus tab, noting the prerequisites, text and software requirements.

Register Online - $499
Register Online - $399 (you must be affiliated with a college, university or high school)

Add a $50 service fee if you require a prior invoice, need to submit a purchase order or voucher, pay by wire transfer or EFT, or need a prior payment refunded and reprocessed. Please use this printed registration form for these and other special orders.

Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise. Multiple course registrations may be entitled to tuition discounts; read more.


Aim of Course:

Data mining, the art and science of learning from data, covers a number of different procedures. This course covers key unsupervised learning techniques: association rules, principal components analysis, and clustering. (Introduction to Data Mining: Supervised Learning covers techniques used to predict a record's class, or the value of an outcome variable, on the basis of a set of records with known outcomes.) The course will include an integration of supervised and unsupervised learning techniques.


This is a hands-on course -- participants in the course will have access to an Excel-based comprehensive tool for data-mining, XLMiner, the use of which will be explained in the course. Participants will apply data mining algorithms to real data, and will interpret the results.


An online bulletin board enables you to interact with the instructor and your fellow students throughout the course and to submit your own findings for discussion. The course should take about 15 hours per week. Regular visits to the course discussion board are required, but you can arrange these at your own convenience. (Follow-up consultation is available after completion of the course for an additional fee.)

Prerequisite(s):
Participants should be familiar with the fundamentals of statistical inference, such as are provided in Basic Concepts in Probability and Statistics, Introduction to Statistics 1: Inference for a Single Variable, and Introduction to Statistics 2: Working with Bivariate Data. In addition, there is a lesson in the course where supervised and unsupervised learning techniques are used in combination, so, unless you do not need this portion, you should be familiar with supervised learning methods, such as those presented in Introduction to Predictive Modeling.
Course Program:

SESSION 1: Principal Components Analysis

  • The goal - dimensionality reduction
  • The principal components
  • Scale variance estimation
  • Normalizing the data
  • Principal components and orthogonal least squares
  • Exercises
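
As a rough illustration of the Session 1 topics (this is not course material; the course itself uses XLMiner), the sketch below normalizes two variables measured on different scales and then finds the first principal component by power iteration. The data values and the function names standardize, covariance_matrix and first_component are hypothetical, chosen only for this example.

```python
import math

# Hypothetical records: two variables on very different scales.
data = [(150.0, 50.0), (160.0, 56.0), (170.0, 65.0), (180.0, 72.0), (190.0, 80.0)]

def standardize(rows):
    """Z-score each column: subtract its mean, divide by its sample std."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def covariance_matrix(rows):
    """Sample covariance matrix of the columns."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]

def first_component(cov, iterations=500):
    """Power iteration for the leading eigenvector and its eigenvalue.

    The uneven start vector is an arbitrary choice; power iteration fails
    if the start happens to be orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue

z = standardize(data)
cov = covariance_matrix(z)          # equals the correlation matrix after z-scoring
weights, variance = first_component(cov)
share = variance / len(cov)         # fraction of total variance on component 1
```

After normalization each variable has unit variance, so neither dominates the component simply because of its units; share then shows how much of the total variance a single dimension retains.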

SESSION 2: Clustering

  • What is cluster analysis?
  • Hierarchical methods
  • Nearest neighbor (single linkage)
  • Farthest neighbor (complete linkage)
  • Group average (average linkage)
  • Optimization and the k-means algorithm
  • Similarity measures
  • Other distance measures
  • The curse of dimensionality
  • Exercises
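
To make the k-means item above concrete, here is a minimal sketch, assuming Euclidean distance and hand-picked starting centroids. The points and the function names euclidean, assign and k_means are hypothetical and are not taken from the course or from XLMiner.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two numeric records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]

def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the column-wise mean of the cluster members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids

# Hypothetical two-dimensional observations forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, [points[0], points[3]])
```

The result depends on the starting centroids and on the scale of each variable, which is one reason similarity and distance measures get their own entries in the outline.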

SESSION 3: Association Rules

  • Discovering association rules in transaction databases
  • Support and confidence
  • The apriori algorithm
  • Shortcomings
  • Exercises
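
Support and confidence can be sketched in a few lines. The example below uses a made-up set of transactions and a simplified level-wise search in the spirit of the apriori algorithm; the item names and the functions support, confidence and frequent_itemsets are assumptions for illustration, not the course's implementation.

```python
from itertools import combinations

# Hypothetical toy transactions; item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: a size-k candidate is counted only if every one of
    its (k-1)-item subsets was already found to be frequent."""
    items = sorted(set().union(*transactions))
    current = [
        frozenset([i]) for i in items if support([i], transactions) >= min_support
    ]
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s, transactions)
        k += 1
        previous = set(current)
        pool = sorted(set().union(*current))
        current = [
            frozenset(c)
            for c in combinations(pool, k)
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
            and support(c, transactions) >= min_support
        ]
    return frequent

itemsets = frequent_itemsets(transactions, min_support=0.6)
rule_conf = confidence({"bread"}, {"milk"}, transactions)
```

Here bread and milk appear together in 3 of 5 transactions (support 0.6), and bread appears in 4 of 5, so the rule "bread implies milk" has confidence 0.6 / 0.8 = 0.75.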

SESSION 4: Integration of Supervised and Unsupervised Learning

  • Clustering into customer segments
  • Profiling of customer segments
  • Classifying new records by segment

The final lesson is an integration of supervised and unsupervised techniques. To get the full benefit of this course, familiarity with supervised learning is needed, but those not requiring this integration can learn about clustering, association rules and principal components without having had a course in supervised learning.
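
One way to picture the Session 4 workflow is sketched below, assuming the segments have already been found by clustering. Each segment is profiled by its average record, and a new record is classified by the nearest profile. The nearest-centroid rule is a stand-in chosen for brevity, not necessarily the classifier used in the course, and the records, variables and function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical customer records: (annual_spend, visits_per_month).
records = [
    (200.0, 1.0), (250.0, 2.0), (220.0, 1.0),
    (900.0, 8.0), (950.0, 10.0), (1000.0, 9.0),
]
# Hypothetical segment labels, as if produced by an earlier clustering step.
segments = [0, 0, 0, 1, 1, 1]

def profile_segments(records, segments):
    """Average record per segment, used as a simple segment profile."""
    grouped = defaultdict(list)
    for rec, seg in zip(records, segments):
        grouped[seg].append(rec)
    return {
        seg: tuple(sum(col) / len(rows) for col in zip(*rows))
        for seg, rows in grouped.items()
    }

def classify(record, profiles):
    """Assign a new record to the segment with the closest profile."""
    return min(profiles, key=lambda seg: math.dist(record, profiles[seg]))

profiles = profile_segments(records, segments)
low_spender = classify((300.0, 3.0), profiles)
high_spender = classify((800.0, 7.0), profiles)
```

Because the variables are left unscaled here, annual spend dominates the distance; in practice the records would usually be normalized first.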

HOMEWORK:

Homework in this course consists of short answer questions to test concepts, and guided data analysis problems using software.

Organization of the Course:

This course takes place online at the Institute over 4 weeks. During each course week, you participate at times of your own choosing; there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.


Credit:
Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:
  1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. You may be enrolled in PASS (Programs in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
  3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, 5.0 CEUs and a record of course completion will be issued by The Institute upon request.

Course Text:

The required text for this course is Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner, 2nd Edition, by Shmueli, Patel and Bruce, and it can be ordered from Wiley by clicking here. Wiley typically offers statistics.com customers a discount of up to 15% on this book (and all other statistics titles): enter the code aff15 in the Promotion Code field when prompted during checkout and click the Apply Discount button. (If you are located in Asia, the web procedure for your location may not accept this discount – try calling your regional Wiley representative.)

PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.

Software:

This is a hands-on course. Participants will apply data mining algorithms to real data, and interpret the results. Course illustrations and homework assignments will use XLMiner, a data mining add-in for Excel. Teaching assistants will be able to offer feedback on assignments completed using XLMiner. Other data mining programs may be used by participants, but support will not be available. A six-month license to XLMiner comes bundled with the course text. For information on XLMiner or other software, click here.


What our students say:

"I know I am not the perfect student but I really am learning an awful lot and absolutely adore this module [Resampling Methods]. But it doesn't matter how I am doing, I still know I have already learnt a lot about something that I knew nothing about. Thank you for this opportunity and all the support received so far."
S. Hurlburt
University of Canterbury
"Considering all of the material that needed to be covered, I thought the course was well written and thought provoking."
P. Anderson
Albion College
"I look forward to taking another course on statistics.com - a great way to continue learning in a structured manner, but flexible enough to participate while Life continues."
B. Berg
AMPS Intl.
© Statistics.com 2004-2012