Data Mining: Unsupervised Techniques
Dr. Anthony Babinec
Aim of Course:
Data mining, the art and science of learning from data, covers a number of different procedures. This course covers key unsupervised learning techniques: association rules, principal components analysis, and clustering. (Introduction to Data Mining: Supervised Learning covers techniques that are used to predict a record's class, or the value of an outcome variable, on the basis of a set of records with known outcomes.) The course will include an integration of supervised and unsupervised learning techniques.
This is a hands-on course -- participants will have access to XLMiner, a comprehensive Excel-based data mining tool, the use of which will be explained in the course. Participants will apply data mining algorithms to real data and interpret the results.
An online bulletin board enables you to interact with the instructor and your fellow students throughout the course and to submit your own findings for discussion. The course should take about 15 hours per week. Regular visits to the course discussion board are required, but you can arrange these at your own convenience. (Follow-up consultation is available after completion of the course for an additional fee.)
Who Should Take This Course:
Marketers seeking to specify customer segments and identify associations among products purchased, environmental scientists seeking to cluster observations, analysts who need to identify the key variables out of many, MBAs seeking to update their knowledge of quantitative techniques, managers and scientists who want to see what data mining can do, and anyone who wants a practical hands-on grounding in basic data mining techniques.
For those enrolled in a Program of Advanced Statistical Studies, this is a required or elective course in the following Programs:
- Statistics in Business & Marketing - elective
- Data Mining - required
Course Program:
The course is structured as follows:
- What is cluster analysis?
- Hierarchical methods
- Nearest neighbor (single linkage)
- Farthest neighbor (complete linkage)
- Group average (average linkage)
- Optimization and the k-means algorithm
- Similarity measures
- Other distance measures
- The curse of dimensionality
- Exercises
- The goal - dimensionality reduction
- The principal components
- Scale variance estimation
- Normalizing the data
- Principal components and orthogonal least squares
- Exercises
- Discovering association rules in transaction databases
- Support and confidence
- The apriori algorithm
- Shortcomings
- Exercises
- Clustering into customer segments
- Profiling of customer segments
- Classifying new records by segment
The final lesson is an integration of supervised and unsupervised techniques. To get the full benefit of this course, familiarity with supervised learning is needed, but those not requiring this integration can learn about clustering, association rules and principal components without having had a course in supervised learning.
The Instructor:
Dr. Anthony Babinec is President of AB Analytics. For over two decades, Tony Babinec has specialized in the application of statistical and data mining methods to the solution of business problems. Tony has multiple degrees from the University of Chicago, where he studied Advanced Statistics and Survey Research. Before forming AB Analytics, Babinec was Director of Advanced Products Marketing at SPSS; he worked on the marketing of Clementine and introduced CHAID, neural nets and other advanced technologies to SPSS users. He has presented at the AMA's Applied Research Methods Conference and Advanced Research Techniques Forum, the Sawtooth Software Conference, Statistical Innovation's Statistical Modeling Week, and numerous professional meetings. He is on the Board of Directors of the Chicago Chapter of the American Statistical Association, where he has held various offices including President. He is on the Editorial Board of the Journal of Targeting, Measurement and Analysis for Marketing.
Organization of the Course:
The course takes place over the internet, at statistics.com. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor. The course is scheduled to take place over 4 weeks, and typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials and work through exercises. Discussion among participants is encouraged. The instructor will provide answers and comments.
Certificates and Grades:
You may be interested only in learning the material presented, and not be concerned with grades or certificates. Or you may be enrolled in a statistics.com Program in Advanced Statistical Studies that requires demonstration of proficiency in the subject, in which case your work will be assessed for purposes of issuing a grade. Or you may require only a "Certificate of Course Completion," along with professional development credit in the form of Continuing Education Units (CEU's). As you begin the class, you will be asked to specify your category.
Credit:
This course offers continuing education units (CEU's). For those successfully completing the course (generally this means marks of 50% or better on the homework), 5.0 CEU's and a certificate will be issued by statistics.com, upon request.
Dates:
Oct. 15 - Nov. 12, 2010
Click here to be notified of future course offerings.
Participants gain access to the online materials on the first day of the course, and typically spend about 15 hours per week (at their convenience). You retain full access to course materials, including the discussion board, for two weeks after the course closing date.
Level:
Intermediate/Introductory
Prerequisite:
Participants should be familiar with the fundamentals of statistical inference, such as are provided in Basic Concepts in Probability and Statistics, Introduction to Statistics 1: Inference for a Single Variable, and Introduction to Statistics 2: Working with Bivariate Data. In addition, there is a lesson in the course where supervised and unsupervised learning techniques are used in combination, so, unless you do not need this portion, you should be familiar with supervised learning methods, such as those presented in Introduction to Data Mining. For additional information about course prerequisites, click here.
Course Text:
The required text for this course is Data Mining for Business Intelligence by Shmueli, Patel and Bruce, published by Wiley; it can be ordered from Wiley by clicking here. Wiley typically offers statistics.com customers a discount of up to 15% on this book (and all other statistics titles): enter the code aff15 in the Promotion Code field when prompted during checkout and click the Apply Discount button. (If you are located in Asia, the web procedure for your location may not accept this discount -- try calling your regional Wiley representative.) A six-month license to XLMiner comes with this text. PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.
Software:
This is a hands-on course. Participants will apply data mining algorithms to real data, and interpret the results. Course illustrations and homework assignments will use XLMiner, a data mining add-in for Excel. Teaching assistants will be able to offer feedback on assignments completed using XLMiner. Other data mining programs may be used by participants, but support will not be available. A six-month license to XLMiner comes bundled with the course text. For information on XLMiner or other software, click here.
Registration:
Register Online - $469
Register Online (academic) - $369 (you must be affiliated with a college, university or high school)
Add a $50 service fee if you require a prior invoice, or if you need to submit a purchase order or voucher, pay by wire transfer or EFT, or refund and reprocess a prior payment. Please use this printed registration form for these and other special orders.
Consider registering for this course together with two other Data Mining courses as part of our special 3-course package registration for tuition savings.
Note: Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise.
© statistics.com 2004-2009
