Flexible, affordable statistics education.

Designed to help you master the software you need to enhance your skills and the practical experience you need to get ahead.


Data Mining - R



June 27, 2014 to July 25, 2014


Data Mining in R - Learning with Case Studies

taught by Luis Torgo

Aim of Course:

The main goal of this course is to teach users how to perform data mining tasks using R. The course follows a learn-by-doing approach: data mining topics are introduced as they are needed to address a series of real-world case studies. See also the related courses for different perspectives on the topic, e.g. a more detailed conceptual approach (Predictive Analytics 1/2).

This course may be taken individually (one-off) or as part of a certificate program.

Course Program:

WEEK 1: Predicting Algae Blooms (Case Study 1)

  • Descriptive statistics
  • Data visualization
  • Strategies to handle unknown variable values
  • Regression tasks
  • Evaluation metrics for regression tasks

WEEK 2: Predicting Algae Blooms (Continuation of Case Study 1)

  • Multiple linear regression
  • Regression trees
  • Model selection/comparison through k-fold cross-validation
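As a rough illustration of the last Week 2 topic, the sketch below compares two simple regression models by their average held-out error under k-fold cross-validation. It is written in Python with only the standard library, whereas the course itself works in R. The function names (`k_fold_indices`, `fit_mean`, `fit_line`, `cv_mse`) and the synthetic data are assumptions made for this example and are not taken from the course materials.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Simple least-squares line with one predictor."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cv_mse(fit, xs, ys, k=5, seed=0):
    """Average mean squared error over k held-out folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        mse = sum((ys[i] - model(xs[i])) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

mse_mean = cv_mse(fit_mean, xs, ys)
mse_line = cv_mse(fit_line, xs, ys)
print(f"mean-only model CV MSE: {mse_mean:.3f}")
print(f"linear model CV MSE:    {mse_line:.3f}")
```

Both models are scored on the same folds because `cv_mse` reuses the same seed, so the comparison is not affected by how the data happen to be split.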

WEEK 3: Detecting Fraudulent Transactions (Case Study 2)

  • Clustering methods
  • Classification methods
  • Imbalanced class distributions and methods for handling this type of problem
  • Naive Bayes classifiers
  • Precision/recall and precision/recall curves
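To give a flavor of why precision and recall matter when classes are imbalanced, here is a minimal sketch in Python (standard library only; the course itself uses R). The helper names `precision_recall` and `pr_curve`, the tiny labelled data set, and the convention of returning 0.0 when a ratio is undefined are all assumptions made for this illustration, not part of the course materials.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pr_curve(y_true, scores):
    """One (threshold, precision, recall) point per distinct score, highest first."""
    points = []
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        points.append((thr, *precision_recall(y_true, y_pred)))
    return points

# Hypothetical imbalanced data: 2 positive cases among 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.1, 0.3, 0.8, 0.2, 0.1, 0.4, 0.9, 0.35]

y_pred = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")

for thr, p, r in pr_curve(y_true, scores):
    print(f"threshold={thr:.2f} precision={p:.2f} recall={r:.2f}")
```

In this toy data set a classifier that labels every case as negative would match the 0.80 accuracy obtained at threshold 0.5 while finding none of the positive cases, which is the kind of situation precision and recall are meant to expose.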

WEEK 4: Classifying Microarray Samples (Case Study 3)

  • Feature selection methods for problems with a very large number of predictors
  • Random forests
  • k-Nearest neighbors


Homework in this course consists of short-answer questions that test concepts and guided data analysis problems using software.

Be sure you meet all of the minimum requirements before you register.

Course Fee: $549

Tuition Savings: When you register online for 3 or more courses, $200 is automatically deducted from the total tuition. (This offer cannot be combined and applies only to courses of 3 weeks or longer.)


Add a $50 service fee if you require a prior invoice, need to submit a purchase order or voucher, pay by wire transfer or EFT, or need a prior payment refunded and reprocessed. Please use the printed registration form for these and other special orders.

Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise.

Who Should Take This Course:

R users who want to learn how to apply R to data mining.  Data mining analysts in search of new tools.  Students in statistics.com's PASS program in Data Mining seeking an affordable data mining tool.  Note that working in R will be more involved than using a specially designed interface for data mining, such as those found in major commercial data mining programs.



These are listed for your benefit so you can determine for yourself whether you have the needed background, either from taking the listed courses or from other experience.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:
  1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. You may be enrolled in a PASS (Programs in Analytics and Statistical Studies) program that requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
  3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, 5.0 CEUs and a record of course completion will be issued by The Institute, upon request.

Course Text:

The course text is Data Mining with R: Learning with Case Studies, by Luis Torgo. CRC Press typically gives students a generous discount when they order the text using this form rather than ordering it online.



You must have a copy of R for the course; R is available free of charge. After installing R on your computer, you must also install several R add-on packages. Instructions for this installation will be provided as needed.


© statistics.com 2004-2014