Structured vs. unstructured data:
Structured data is data in a form that can be used directly to develop statistical or machine learning models, typically a matrix in which rows are records and columns are variables (features), or data that can be extracted and turned into such a matrix fairly easily (e.g. database tables). Unstructured data is data, often text, that is heterogeneous in format and requires considerable pre-processing before it can be used in a model. Examples are tweets, social network profiles and postings, and tech support cases or maintenance requests.
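
To make the distinction concrete, here is a minimal sketch of one common pre-processing step, a bag-of-words count, that turns free text into a records-by-features matrix. The sample messages and the helper names `tokenize` and `bag_of_words` are invented for illustration; real pipelines usually involve considerably more cleaning than this.

```python
import re
from collections import Counter

# Unstructured input: made-up support messages (illustrative only).
texts = [
    "Printer won't start!",
    "printer jams, printer offline",
]

def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(docs):
    """Return (vocab, matrix): one row per document, one column per word."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))  # column names, in sorted order
    matrix = [[c[w] for w in vocab] for c in counts]  # missing words count 0
    return vocab, matrix

vocab, matrix = bag_of_words(texts)
print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` is now a record and each column a feature (a word count), which is the structured form described above.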