Predictive Analytics 1 - Machine Learning Tools

taught by Anthony Babinec and Galit Shmueli



Aim of Course:

In this online course, “Predictive Analytics 1 - Machine Learning Tools,” you will be introduced to the basic concepts in predictive analytics, also called predictive modeling, the most prevalent form of data mining. This course covers the two core paradigms that account for most business applications of predictive modeling: classification and prediction. In both cases, predictive modeling takes data where a variable of interest (a target variable) is known and develops a model that relates this variable to a series of predictor variables, also called features. In classification, the target variable is categorical ("purchased something" vs. "has not purchased anything"). In prediction, the target variable is continuous ("dollars spent"). You will learn how to explore and visualize the data to get a preliminary idea of which variables are important and how they relate to one another. Three machine learning techniques will be used: k-nearest neighbors, classification and regression trees (CART), and Bayesian classifiers. Then you will learn how to combine different models to obtain results that are better than any of the individual models produces on its own. The course will also cover the use of partitioning to divide the data into training data (data used to build a model), validation data (data used to assess the performance of different models or, in some cases, to fine-tune the model), and test data (data used to estimate the performance of the final model).
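
The three-way partition described above can be sketched in a few lines of Python. This is only an illustration, not the course's method: the course itself uses an Excel add-in, and the function name `partition`, the 50/30/20 split, and the fixed seed are all assumptions chosen for the example.

```python
import random

def partition(records, train_frac=0.5, valid_frac=0.3, seed=1):
    """Shuffle records and split them into training, validation, and test lists.

    The fractions and seed are illustrative assumptions, not course settings.
    Whatever is left after the training and validation shares becomes test data.
    """
    if train_frac <= 0 or valid_frac < 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must be positive and leave room for test data")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

if __name__ == "__main__":
    train, valid, test = partition(range(100))
    print(len(train), len(valid), len(test))
```

A model would be built on `train`, compared against alternatives on `valid`, and scored once on `test` at the very end, so the test data never influences model choice.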

SOFTWARE: This section of the course uses Analytic Solver Data Mining, a data-mining add-in for Excel (previously called XLMiner).

Anticipated learning outcomes:

  • Visualize and explore data to better understand relationships among variables
  • Partition data to provide an assessment basis for predictive models
  • Choose and implement appropriate performance measures for predictive models
  • Specify and implement models with the following algorithms:
    • k-nearest-neighbors
    • Naive Bayes
    • Classification and Regression Trees (CART)
  • Understand how ensemble models improve predictions




This course may be taken individually (one-off) or as part of a certificate program.
Course Program:

WEEK 1: Preparation

  • What is supervised learning
  • Data partitioning and holdout samples
  • Choosing variables (features)
  • Handling missing data
  • Visualization and exploration

WEEK 2: Classification and Prediction

  • Assessing classification models
    • Confusion matrix
    • Misclassification costs
    • Lift
  • Assessing prediction models
    • Common metrics
  • K-Nearest-Neighbors (KNN)
    • Measuring distance
    • Choosing k
    • Generating classifications and predictions
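
As a rough illustration of how the Week 2 topics fit together, the sketch below classifies validation records with k-nearest-neighbors (Euclidean distance, majority vote) and tallies a confusion matrix. It is not the course's software or workflow; the helper names `knn_classify` and `confusion_matrix` are hypothetical, and the data points are made up for the example.

```python
import math
from collections import Counter

def knn_classify(train_x, train_y, query, k=3):
    """Return the majority label among the k training records nearest to query."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

# Toy, made-up data: two predictors per record, target is "yes"/"no".
train_x = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_y = ["no", "no", "no", "yes", "yes", "yes"]
valid_x = [(2, 2), (6, 5), (3, 3)]
valid_y = ["no", "yes", "yes"]

predicted = [knn_classify(train_x, train_y, q, k=3) for q in valid_x]
matrix = confusion_matrix(valid_y, predicted)

if __name__ == "__main__":
    for (actual, pred), count in sorted(matrix.items()):
        print(f"actual={actual} predicted={pred} count={count}")
```

In practice k would be chosen by comparing validation performance across several values, and predictors on different scales would usually be normalized before distances are computed.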

WEEK 3: Bayesian Classifiers; CART

  • Full Bayes classifier
  • Naive Bayes classifier
  • Classification and Regression Trees (CART)
    • Growing the tree
    • Avoiding overfit - pruning
    • Using trees for classifications and predictions

WEEK 4: Ensembles

  • Combine multiple algorithms
  • Improve results


Homework in this course consists of short-answer questions to test concepts, guided data analysis problems using software, and an end-of-course data modeling project. Note: There will be a mid-week discussion exercise in the first week of the course.

In addition to assigned readings, this course also has supplemental video lectures and an end-of-course data modeling project.


Who Should Take This Course:
Marketing and IT managers, financial analysts and risk managers, accountants, data analysts, data scientists, forecasters.  This course is especially useful if you want to understand what predictive modeling might do for your organization, undertake pilots with minimum setup costs, manage predictive modeling projects, or work with consultants or technical experts involved with ongoing predictive modeling deployments.
Level: Introductory / Intermediate

You should be familiar with introductory statistics. Try these self tests to check your knowledge. You will benefit from some familiarity with regression, which is covered in Statistics 2.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:
About 15 hours per week, at times of your choosing.

Options for Credit and Recognition:
Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:

  1. No credit - You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. Certificate - You may be enrolled in PASS (Programs in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
  3. CEUs and/or proof of completion - You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, CEUs and a record of course completion will be issued by The Institute upon request.
  4. Digital Badge - Courses evaluated by the American Council on Education have a digital badge available for successful completion of the course.
  5. Other options - Specializations, INFORMS CAP recognition, and academic (college) credit are available for some courses.

College credit:
Predictive Analytics 1 - Machine Learning Tools has been evaluated by the American Council on Education (ACE) and is recommended for the upper-division baccalaureate degree category, 3 semester hours in predictive analytics, data mining, or data sciences. Note: The decision to accept specific credit recommendations is up to each institution. More info here.

This course is also recognized by the Institute for Operations Research and the Management Sciences (INFORMS) as helpful preparation for the Certified Analytics Professional (CAP®) exam, and can help CAP® analysts accrue Professional Development Units to maintain their certification.
Course Text:

The required text for this course is Data Mining for Business Analytics: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner, 3rd Edition, by Shmueli, Patel and Bruce.



This is a hands-on course, and participants will apply data mining algorithms to real data.  





  • January 17, 2020 to February 14, 2020
  • May 22, 2020 to June 19, 2020
  • September 04, 2020 to October 02, 2020
  • January 15, 2021 to February 12, 2021


Course Fee: $549

Do you meet course prerequisites? What about book & software? (Click here to learn more)

We have flexible policies to transfer to another course, or withdraw if necessary (modest fee applies)

Group rates: Email jdobbins "at" to get information on group rates. 

First time student or academic? Click here for an introductory offer on select courses. Academic affiliation?  You may be eligible for a discount at checkout.


Add a $50 service fee if you require a prior invoice, or if you need to submit a purchase order or voucher, pay by wire transfer or EFT, or refund and reprocess a prior payment. Please use this printed registration form for these and other special orders.

Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise.

The Institute for Statistics Education is certified to operate by the State Council of Higher Education in Virginia (SCHEV).

Contact Us
Have a question about a course before you register? Call us. We're here for you. (571) 281-8817 or ourcourses (at)
