In this course, you will cover key unsupervised learning techniques, including association rules, principal components analysis, and clustering. You will also review how supervised and unsupervised learning techniques can be integrated. You will apply data mining algorithms to real data and interpret the results. A final project will integrate an unsupervised task with the supervised methods covered in Predictive Analytics 1 and 2. The course uses Python, a free programming language with statistical computing and graphics capabilities. Note: If you prefer to work in R or XLMiner, this course is also offered using R or XLMiner.
Dr. Peter Gedeck
Peter Gedeck is at the forefront of the use of data science in drug discovery. He is a Senior Data Scientist at Collaborative Drug Discovery, which offers the pharmaceutical industry cloud-based software to manage the large volumes of data involved in the drug discovery process. Drug discovery involves exploring and testing huge numbers of molecule combinations, and much of that testing takes place analytically, so robust software is needed to handle the data and provide a framework for analyzing it. Peter's specialty is the development of machine learning algorithms to predict biological and physicochemical properties of drug candidates. Prior to this, he worked for twenty y...