
Statistics.com

Introduction to NLP and Text Mining

This course will teach you the essential techniques of text mining, understood here as the extension of data mining's standard predictive methods to unstructured text.

$549 | Enroll Now

Overview

In this course you will be introduced to the essential techniques of natural language processing (NLP) and text mining with Python. The course will discuss how to apply unsupervised and supervised modeling techniques to text, and devote considerable attention to data preparation and data handling methods required to transform unstructured text into a form in which it can be mined.
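
To make the idea of turning unstructured text into a minable form concrete, here is a minimal sketch using only the Python standard library. It is not taken from the course materials: the function names, the sample sentence and the tiny stop-word list are assumptions chosen for illustration, and a real project would more likely rely on an NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def preprocess(text):
    """Tokenize the text, then drop stop words."""
    return [tok for tok in tokenize(text) if tok not in STOP_WORDS]


def term_counts(text):
    """Map each remaining token to its frequency in the text."""
    return Counter(preprocess(text))


sample = "The cat sat in the hat, and the hat sat on the mat."
print(term_counts(sample))
```

The resulting counts are a simple structured representation of the sentence, which is the kind of input that later modeling steps can work with.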

Learning Outcomes

This course focuses on learning key concepts, tools and methodologies for natural language processing with an emphasis on hands-on learning through guided tutorials and real-world examples.  You will learn how to:

  • Process text data and strings, and perform pattern matching with regular expressions in Python
  • Preprocess and wrangle noisy text data via stemming, lemmatization, tokenization, removal of stop-words and more
  • Represent text data in structured and easy-to-consume formats for machine learning and text mining
  • Represent text documents using features related to text word frequency, parts of speech and sentiment
  • Represent text documents using vectorized features like bag-of-words, TF-IDF, and document similarity
  • Use the concepts of information retrieval and document similarity (e.g. in applications like recommender systems)
  • Perform unsupervised NLP using techniques like keyphrase extraction, topic modeling and text summarization
  • Leverage pre-trained models for part-of-speech (POS) tagging and named entity recognition (NER)
  • Develop supervised models to classify documents
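
Two of the outcomes above, TF-IDF features and document similarity, can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the course's implementation: it uses raw term frequency with idf = log(N / df), which is only one of several common weighting variants, and the three toy documents and both function names are made up for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = raw term count * log(N / document frequency).
    Libraries often use smoothed variants of this formula.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (assumption): already tokenized by whitespace.
docs = [
    "text mining with python".split(),
    "topic modeling with python".split(),
    "baking bread at home".split(),
]
vectors = tf_idf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))  # shares "with", "python"
print(cosine_similarity(vectors[0], vectors[2]))  # no shared terms
```

Documents that share terms get a similarity above zero, while documents with no terms in common score zero, which is the basic signal behind similarity-based retrieval and recommendation.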

Who Should Take This Course

Data scientists and aspiring data scientists who want to analyze text data and build models that use text data.

Instructors

Mr. Dipanjan Sarkar

Dipanjan (DJ) Sarkar is a Data Science Lead and published author who was recognized by Google as a Google Developer Expert in Machine Learning in 2019. He was also named one of the Top Ten Data Scientists in India for 2020 by a few leading technology magazines and publishing houses. Dipanjan has led advanced analytics initiatives with several Fortune 500 companies, such as Applied Materials and Intel, and with open source organizations such as Red Hat (now part of IBM). He primarily works on leveraging data science, machine learning and deep learning to build large-scale intelligent systems.

He holds a master of technology degree from IIIT Ban...


Dr. Anurag Bhardwaj

Dr. Anurag Bhardwaj is Senior Manager, Data Scientist at Apple.

Course Syllabus

Week 1

Introduction and Text Data Preparation

  • Introduction to NLP & NLP applications
  • Python for NLP
  • NLP basics – Parsing Text and Exploring Text Corpora
  • Tokenization and POS Tags
  • Shallow Parsing
  • Constituency Parsing
  • Corpus Analysis
  • WordNet & Synsets
  • Working with Text and Regular Expressions

Week 2

Feature Engineering and Representation

  • Introduction to text pre-processing and wrangling
  • Text pre-processing and wrangling – methodologies
  • Build your own text pre-processor
  • Non-vectorized text feature engineering
  • Vectorized representations of text features
  • Keyphrase Extraction – Concepts and Methodologies

Week 3

Unsupervised Natural Language Processing

  • Introduction to text pre-processing and wrangling
  • Text pre-processing and wrangling – methodologies
  • Build your own text pre-processor
  • Non-vectorized text feature engineering
  • Vectorized representations of text features
  • Keyphrase Extraction – Concepts and Methodologies

Week 4

Information Extraction

  • Introduction to Supervised natural language processing
  • Text Classification – concepts and methodologies
  • Machine Learning for Text Classification
  • Sequential Tagging Models
  • Parts of Speech Tagging
  • Named Entity Recognition
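
As one illustration of the text classification topic in this week's outline, the sketch below implements a small multinomial naive Bayes classifier with add-one smoothing in plain Python. Naive Bayes is chosen here only as a familiar example; the outline does not say which algorithms the course uses, and the class name and the toy training data are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (assumption): four tokenized reviews with labels.
train_docs = [
    "great course clear examples".split(),
    "helpful instructor great feedback".split(),
    "boring lectures confusing homework".split(),
    "confusing examples poor feedback".split(),
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesTextClassifier().fit(train_docs, train_labels)
print(model.predict("great helpful examples".split()))
print(model.predict("boring confusing lectures".split()))
```

With so little training data the result is only a demonstration of the mechanics: each class is scored by its prior and by how often the document's words appeared in that class, and the highest-scoring class is returned.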

Class Dates

2021

May 14, 2021 to Jun 11, 2021

Sep 10, 2021 to Oct 8, 2021

2022

Jan 14, 2022 to Feb 11, 2022

2023

No classes scheduled at this time.

Prerequisites

The following prerequisite courses are recommended but not required. This course assumes familiarity with the topics they cover:

Introduction to Python Programming

This course will introduce you to the basics of programming in Python on either the Windows or Mac platform.
Topic: Data Science, Using Python | Skill: Introductory | Credit Options: CEU
Class Start Dates: May 14, 2021, Sep 10, 2021, Jan 14, 2022
Predictive Analytics 1 – Machine Learning Tools

This online course introduces the basic paradigm of predictive modeling: classification and prediction.
Topic: Analytics, Prediction/Forecasting | Skill: Introductory, Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: May 14, 2021, Sep 10, 2021, Jan 14, 2022, May 13, 2022, Sep 9, 2022

What Our Students Say

I much enjoyed your text mining course last year and plan to take the deep learning one - thanks for creating it

Milan Hejtmanek
Seoul National University

I like the format of those courses. There is a lot to learn, but it's concentrated on the key points. For somebody working full time it beats a semester-long course.

Myriam Abramson
Computer Scientist, Naval Research Laboratory

Frequently Asked Questions

What is your satisfaction guarantee and how does it work?

We offer a “Student Satisfaction Guarantee” that includes a tuition-back guarantee, so go ahead and take our courses risk free. That’s our commitment to student satisfaction. Students may cancel, transfer, or withdraw from a course under certain conditions. If you’re not satisfied with a course, you may withdraw from the course and receive a tuition refund.

Please see our knowledge center for more information.

Can I transfer or withdraw from a course?

We have a flexible transfer and withdrawal policy that recognizes circumstances may arise to prevent you from taking a course as planned. You may transfer or withdraw from a course under certain conditions.

  • Students are entitled to a full refund if a course they are registered for is canceled.
  • You can transfer your tuition to another course at any time prior to the course start date or the drop date; however, a transfer is not permitted after the drop date.
  • Students who withdraw on or after the first day of class are entitled to a percentage refund of tuition.

Please see this page for more information.

Who are the instructors at the Institute?

The Institute has more than 60 instructors who are recruited based on their expertise in various areas in statistics. Our faculty members are:

  • Authors of well-regarded texts in their area;
  • Advisory board members;
  • Senior faculty; and
  • Educators who have made important contributions to the field of statistics or online education in statistics.

The majority of our instructors have more than five years of teaching experience online at the Institute.

Please visit our faculty page for more information on each instructor at The Institute for Statistics Education.

Please see our knowledge center for more information.

Do your courses have for-credit options?

Our courses have several for-credit options:

  • Continuing education units (CEU)
  • College credit through The American Council on Education (ACE CREDIT)
  • Course credits that are transferable to the INFORMS Certified Analytics Professional (CAP®)

Please see our knowledge center for more information.

Are Statistics.com courses certified?

The Institute for Statistics Education is certified to operate by the State Council of Higher Education for Virginia (SCHEV). For more information, go to https://www.schev.edu/

Visit our knowledge base and learn more.

Related Courses

NLP and Deep Learning

In this course you will learn about deep neural networks, and how to use them in processing text with Python (Natural Language Processing or NLP).
Topic: Data Science, Machine Learning, Text Mining | Skill: Intermediate | Credit Options: ACE, CAP, CEU
Class Start Dates: Mar 12, 2021, Jul 9, 2021, Nov 12, 2021, Mar 11, 2022

Additional Course Information

Organization of Course

This course takes place online at The Institute for 4 weeks. During each course week, you participate at times of your own choosing – there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirements

This is a 4-week course requiring 10-15 hours per week of review and study, at times of your choosing.

Homework

Homework in this course consists of short answer questions to test concepts and guided data analysis problems using software.

In addition to assigned readings, this course also has a get started guide, and supplemental readings available online.

Course Text

The text used for the practical work in this course is Text Analytics with Python (Apress, 2019) by Dipanjan Sarkar, chosen for its wealth of hands-on Python illustrations and code. The code for these illustrations is organized here:

https://github.com/dipanjanS/text-analytics-with-python/tree/master/New-Second-Edition

Note: this text is also used in the follow-on course, NLP and Deep Learning.

For a well-written guide to foundational concepts and context, you may wish to consider Fundamentals of Predictive Text Mining (Springer, 2015) by Weiss, Indurkhya and Zhang.

Please order a copy of your course textbook prior to course start date.

Software

This course provides problems and illustrations in Python, and assumes some familiarity with that language.

Software Uses and Descriptions | Available Free Versions
To learn more about the software used in this course, or how to obtain free versions of software used in our courses, please read our knowledge base article “What software is used in courses?” 

Course Fee & Information

Enrollment
Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date unless you specify otherwise.

Transfers and Withdrawals
We have flexible policies to transfer to another course or withdraw if necessary.

Group Rates
Contact us to get information on group rates.

Discounts
Academic affiliation?  In most courses you are eligible for a discount at checkout.

New to Statistics.com?  Click here for a special introductory discount code.  

Invoice or Purchase Order
Add a $50 service fee if you require a prior invoice, or if you need to submit a purchase order or voucher, pay by wire transfer or EFT, or refund and reprocess a prior payment.

Options for Credit and Recognition

This course is eligible for the following credit and recognition options:

No Credit
You may take this course without pursuing credit or a record of completion.

Mastery or Certificate Program Credit
If you are enrolled in a mastery or certificate program that requires demonstration of proficiency in this subject, your course work may be assessed for a grade.

CEUs and Proof of Completion
If you require a “Record of Course Completion” along with professional development credit in the form of Continuing Education Units (CEUs), The Institute will issue both upon your request once you have successfully completed the course.

ACE CREDIT | College Credit
This course has been evaluated by the American Council on Education (ACE) and is recommended for college credit.  For recommendation details (level, and number of credits), please see this page. Please note that the decision to accept specific credit recommendations is up to the academic institution accepting the credit.

ACE Digital Badge
Courses evaluated by the American Council on Education (ACE) have a digital badge available for successful completion of the course.

INFORMS-CAP
This course is recognized by the Institute for Operations Research and the Management Sciences (INFORMS) as helpful preparation for the Certified Analytics Professional (CAP®) exam and can help CAP® analysts accrue Professional Development Units to maintain their certification.

Have a Question About This Course?

Janet Dobbins

Sales and Business Development

Phone

(571) 281-8817

About Statistics.com

Statistics.com offers academic and professional education in statistics, analytics, and data science at beginner, intermediate, and advanced levels of instruction. Statistics.com is a part of Elder Research, a data science consultancy with 25 years of experience in data analytics.

Contact

The Institute for Statistics Education
4075 Wilson Blvd, 8th Floor
Arlington, VA 22203
(571) 281-8817

ourcourses@statistics.com

© Copyright 2021 - Statistics.com, LLC | All Rights Reserved | Privacy Policy | Terms of Use
