Flexible, affordable statistics education.

Designed to help you master the software you need to enhance your skills and the practical experience you need to get ahead.

Data Mining - R

Instructor(s): Luis Torgo

Dates:

June 28, 2013 to July 26, 2013
January 10, 2014 to February 07, 2014

Data Mining in R - Learning with Case Studies

taught by Luis Torgo

Aim of Course:

The main goal of this course is to teach users how to perform data mining tasks using R. The course follows a learn-by-doing strategy, in which data mining topics are introduced as needed while working through a series of real-world data mining case studies.

This course is a core requirement or elective in the following Program(s) in Analytics and Statistical Studies (PASS):

  • Data Analytics
  • Using R

Course Program:

    SESSION 1: Predicting Algae Blooms (Case Study 1)

    • Descriptive statistics
    • Data visualization
    • Strategies to handle unknown variable values
    • Regression tasks
    • Evaluation metrics for regression tasks

    SESSION 2: Predicting Algae Blooms (Continuation of Case Study 1)

    • Multiple linear regression
    • Regression trees
    • Model selection/comparison through k-fold cross-validation

    SESSION 3: Detecting Fraudulent Transactions (Case Study 2)

    • Clustering methods
    • Classification methods
    • Imbalanced class distributions and methods for handling this type of problem
    • Naive Bayes classifiers
    • Precision/recall and precision/recall curves

    SESSION 4: Classifying Microarray Samples (Case Study 3)

    • Feature selection methods for problems with a very large number of predictors
    • Random forests
    • k-Nearest neighbors


    HOMEWORK:

    Homework in this course consists of short-answer questions that test concepts and guided data analysis problems using software.

    Course Fee: $499

    Add a $50 service fee if you require a prior invoice, need to submit a purchase order or voucher, pay by wire transfer or EFT, or need to refund and reprocess a prior payment. Please use the printed registration form for these and other special orders.

    Courses may fill up at any time, and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date unless you specify otherwise. Those registering for multiple courses, Statistics.com's PASS students, and those affiliated with other academic institutions may be entitled to tuition discounts.

    Who Should Take This Course:

    R users who want to learn how to apply R to data mining. Data mining analysts in search of new tools. Students in Statistics.com's PASS program in Data Mining seeking an affordable data mining tool. Note that working in R is more involved than using a purpose-built data mining interface, such as those found in major commercial data mining programs.

    Level:

    Intermediate

    Prerequisite:
    1. Knowledge of the R programming language - the equivalent of either Introduction to R - Data Handling or Introduction to R - Statistical Analysis.
    2. Introduction to Predictive Modeling, or equivalent data mining experience.

    Organization of the Course:

    This course takes place online at the Institute over 4 weeks. During each course week, you participate at times of your own choosing; there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

    The course typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.


    Credit:
    Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:
    1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
    2. You may be enrolled in a PASS (Programs in Analytics and Statistical Studies) program that requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
    3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, 5.0 CEUs and a record of course completion will be issued by the Institute upon request.

    Course Text:

    The course text is Data Mining with R: Learning with Case Studies, by Luis Torgo, which you can order from CRC Press or by using the order form. CRC Press typically gives students a generous discount when they order the text using that form rather than online.

    PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.

    Software:

    You must have a copy of R for the course; R can be obtained free of charge. After installing R on your computer, you must also install several R add-on packages. Instructions for this installation will be provided as needed.


    Students comment on our courses:

    "This course could serve as a model in the field."
    G. Vidmar
    Biostatistician, University of Ljubljana
    "The course was very good and well presented. The material in the notes was self-explanatory for a non-technical person, and the supplementary book provided good reading for the person who is interested in more technical details."
    Gichangi
    Dept. of Statistics, Univ. of Southern Denmark (doctoral student)
    © statistics.com 2004-2011