Course Spotlight: Text Mining

The term text mining is used in two different senses in computational statistics:

  • Using predictive modeling to label many documents (e.g. legal docs might be “relevant” or “not relevant”) – this is what we call text mining.
  • Using grammar and syntax to parse the meaning of individual documents – we use the term natural language processing.
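To make the first sense concrete, here is a minimal sketch of labeling documents with a predictive model. It uses a tiny naive Bayes classifier written in plain Python; that choice of model, the function names (`tokenize`, `train`, `predict`), and the four training sentences are all assumptions made up for illustration, not material from the courses.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()

def train(labeled_docs):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior, then add one log-likelihood term per word.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data, invented for this example only.
training = [
    ("contract breach damages claim", "relevant"),
    ("breach of contract lawsuit filed", "relevant"),
    ("office lunch menu friday", "not relevant"),
    ("parking lot closed friday", "not relevant"),
]
model = train(training)
label = predict("damages for breach of contract", *model)
print(label)
```

A real project would train on far more documents and would typically rely on an established library rather than hand-rolled counting; the point here is only that the model learns labels from examples instead of parsing grammar.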

Both approaches are covered in the courses below, offered here at the Institute for Statistics Education:

Jun 8 – Jul 6: Text Mining (hands-on with Python)

Jul 13 – Aug 10: Natural Language Processing (conceptual – no software)

Sep 14 – Oct 12: Natural Language Processing with NLTK (uses Python)

Nitin Indurkhya, co-author of Fundamentals of Predictive Text Mining (Wiley) and a senior data scientist with experience at eBay, Samsung and elsewhere, teaches the NLP courses. His colleague from his eBay days, Anurag Bhardwaj, now a data scientist at QuadAnalytix, teaches the Text Mining class.

Registration options:

  1. Sign up for any individual course using the above links
  2. Sign up for all three courses for just $399 each – earn a Specialization in Text Analytics and save $450; use the code “text-specialization”
  3. Need better grounding in Python first? Add our May 11 Python for Analytics for the same $399.

The courses take place online at Statistics.com in a series of weekly lessons and assignments, and require about 15 hours per week. Participate at your own convenience; there are no set times when you are required to be online.

We hope to see you in one or more of our text analytics courses!