Flexible, affordable statistics education.

Designed to help you master the software you need to enhance your skills and the practical experience you need to get ahead.

Introduction to Data Mining


Brief Description:

This course covers the two core paradigms that account for most business applications of data mining: classification and prediction. The course includes hands-on work with XLMiner, a data-mining add-in for Excel.

Instructor(s):
Level: Novice/Intermediate

Who Should Take This Course:

Analysts of business data, consultants, MBAs seeking to update their knowledge of quantitative techniques, managers who want to see what data mining can do, and anyone who wants a practical, hands-on grounding in basic data-mining techniques.

Dates:
March 09, 2012 to April 06, 2012
September 07, 2012 to October 05, 2012


Registration:
Please read the syllabus tab, noting the prerequisites, text and software requirements.

Register Online - $499
Register Online - $399 (you must be affiliated with a college, university or high school)

Add a $50 service fee if you require a prior invoice, or if you need to submit a purchase order or voucher, pay by wire transfer or EFT, or refund and reprocess a prior payment. Please use this printed registration form for these and other special orders.

Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise. Multiple course registrations may be entitled to tuition discounts; read more.



Aim of Course:

This course will introduce you to the basic concepts in data mining. Data mining, the art and science of learning from data, covers a number of different procedures. This course covers the two core paradigms that account for most business applications of data mining: classification and prediction. In both cases, data mining takes data where a variable of interest is known and develops a model that relates this variable to a series of predictor variables. In classification, the variable of interest is categorical ("purchased something" vs. "has not purchased anything"). In prediction, the variable of interest is continuous ("dollars spent"). Five techniques will be used: k-nearest neighbors, classification and regression trees (CART), neural nets, logistic regression and multiple linear regression. The course will also cover the use of partitioning to divide the data into training data (data used to build a model), validation data (data used to assess the performance of different models, or, in some cases, to fine tune the model) and test data (data used to predict the performance of the final model). The course includes hands-on work with XLMiner, a data-mining add-in for Excel.
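The partitioning step described above can be sketched in a few lines of Python. This is only an illustration, not the XLMiner procedure used in the course: the function name `partition`, the 50/30/20 split and the fixed random seed are all assumptions chosen for the example.

```python
import random

def partition(records, train_pct=50, valid_pct=30, seed=12345):
    """Randomly split records into training, validation and test lists.

    The percentages and the seed are illustrative assumptions; the
    remainder after training and validation becomes the test set.
    """
    if train_pct < 0 or valid_pct < 0 or train_pct + valid_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    shuffled = list(records)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = len(shuffled) * train_pct // 100
    n_valid = len(shuffled) * valid_pct // 100
    train = shuffled[:n_train]                   # used to build the model
    valid = shuffled[n_train:n_train + n_valid]  # used to compare or tune models
    test = shuffled[n_train + n_valid:]          # used to assess the final model
    return train, valid, test

rows = list(range(100))  # stand-in for 100 data records
train, valid, test = partition(rows)
print(len(train), len(valid), len(test))
```

Every record lands in exactly one partition, so the model is never judged on the data used to build it.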

Prerequisite(s):

If you are unsure whether you have mastered the requirements, test yourself with these placement exams.

Participants should also be familiar with multiple linear regression, which can be studied in the "Regression Analysis" course.


Course Program:

SESSION 1: Introduction

  • Core ideas in data mining
  • Supervised and unsupervised learning
  • The steps in data mining
  • SEMMA
  • Preliminary steps
    • Sampling from a database
    • Pre-processing and cleaning the data
    • Partitioning the data
  • Building a model
    • An example with linear regression
  • K-nearest neighbor
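As a rough illustration of the k-nearest neighbor idea listed in Session 1, the sketch below classifies a new record by majority vote among its k closest training records. It is not taken from the course materials: the function `knn_classify`, the two predictors and the customer data are made up for the example, and in practice predictors are usually normalized before distances are computed.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    """Return the majority class among the k training records nearest to new_point.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda rec: math.dist(rec[0], new_point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up training data: (predictor 1, predictor 2) -> class
customers = [
    ((1.0, 1.0), "no purchase"),
    ((1.5, 2.0), "no purchase"),
    ((2.0, 1.0), "no purchase"),
    ((6.0, 5.0), "purchased"),
    ((7.0, 7.0), "purchased"),
    ((6.5, 6.0), "purchased"),
]

print(knn_classify(customers, (6.0, 6.0), k=3))
print(knn_classify(customers, (1.2, 1.5), k=3))
```

An odd k avoids tied votes when there are two classes.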

SESSION 2: Classification

  • Judging the performance of classification algorithms
  • Classification trees
  • Logistic regression
  • Lift
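Two of the Session 2 topics, judging classifier performance and lift, can be sketched as follows. This is a minimal example under stated assumptions: the function names, the 0.5 cutoff and the ten validation records are invented for illustration and do not come from the course or from XLMiner.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for 0/1 class labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["TP" if a == 1 else "FP"] += 1
        else:
            counts["FN" if a == 1 else "TN"] += 1
    return counts

def top_lift(actual, scores, top_n):
    """Response rate among the top_n highest-scored records, divided by the overall rate."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(a for _, a in ranked[:top_n]) / top_n
    base_rate = sum(actual) / len(actual)
    return top_rate / base_rate

# Made-up validation data: actual class (1 = responder) and model score
actual = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1, 0.05]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed cutoff of 0.5

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
print(top_lift(actual, scores, top_n=4))
```

A lift above 1 for the top-scored records means the model finds responders at a higher rate than picking records at random.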

SESSION 3: Neural nets

  • Neural nets
  • Comparing different models

SESSION 4: Prediction

  • Multiple linear regression
  • Regression trees
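To give a feel for how a regression tree handles a continuous outcome, the sketch below finds the single best split on one predictor by minimizing the sum of squared errors around each side's mean. It is a simplified illustration only: a full regression tree repeats this search across all predictors and recursively within each branch, and the function names and data here are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, total_sse) for the best single split of the form x <= threshold.

    Returns None if all x values are identical, since no split is possible.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

# Made-up data: one predictor and a continuous outcome such as dollars spent
xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 7, 50, 52, 54]
threshold, total = best_split(xs, ys)
print(threshold, total)
```

Each side of the split would then predict the mean outcome of its training records.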

Organization of the Course:

This course takes place over the internet at statistics.com and runs for 4 weeks. During each course week, you participate at times of your own choosing; there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and you will receive individual feedback on your homework answers.


Credit:
Students come to The Institute for a variety of reasons:
  1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. You may be enrolled in PASS (Program in Advanced Statistical Studies), which requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
  3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs).

As you begin the class, you will be asked to specify your category.

This course offers continuing education units (CEUs). For those who successfully complete the course (generally this means marks of 50% or better on the homework), 5.0 CEUs and a record of course completion will be issued by Statistics.com upon request.


Course Text:

The text is Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner, 2nd Edition by Shmueli, Patel and Bruce, from Wiley. Please PREORDER* this text. It can be ordered directly from Wiley using the previous link. Wiley offers statistics.com customers a 15% discount on this book (and all other statistics titles): enter the code aff15 in the Promotion Code field when prompted during checkout and click the Apply Discount button. (If you are located in Asia, the web procedure for your location may not accept this discount -- try calling your regional Wiley representative.) A six-month license to XLMiner comes bundled with the course text.

*If the text does not arrive in time for the course, please provide us with proof of purchase and we will supply you with a .pdf of appropriate chapters in the course itself.

Software:

This is a hands-on course. Participants will apply data mining algorithms to real data and interpret the results. XLMiner, a data-mining add-in for Excel, will be illustrated in the course, and its use and output will be explained. A six-month license to XLMiner comes bundled with the course text. The instructor is familiar with XLMiner, and the illustrations and assignments are fully integrated with XLMiner. Any software capable of handling the routines covered (i.e. most data mining software) may be used. For more information on the above-mentioned statistical software, please click here.

What our students say:

"Interaction with the instructor was good - he encouraged questions and they were answered quickly and professionally."
J. Johnston
Colorado State University

"You really have come up with an ideal method for working academicians to improve their quantitative skills without spending a fortune and taking time off from work to travel."
R. Handel
Eastern Virginia Medical School

"I really enjoyed this course and like the instructor. The discussion board provides a valuable venue to discuss questions and clarify doubts. The instructor's feedback is prompt and helpful. I not only got my questions answered but also learned a lot from other's questions."
R. Yang
Purdue University
© Statistics.com 2004-2012