Data science is one of a host of similar terms. “Artificial intelligence” has been around since the 1950s, and “data mining” for at least a couple of decades. “Machine learning” came out of the computer science community, and “analytics,” “data analytics,” and “predictive analytics” came out of the statistics and operations research (OR) communities. Among all of them, data science may have the greatest staying power. Unlike the others, it has yielded its own job title, data scientist, which appears in hundreds of thousands of job postings. The other terms often appear as skills, but rarely as the title of the job itself.
Data science covers sufficiently broad terrain that no one person could be an expert in all its disciplines: statistics, computer science, database management, language processing, and the storage, flow, and processing of huge amounts of data. Harlan Harris, Sean Murphy, and Marck Vaisman, in Analyzing the Analyzers, describe the ideal data scientist as a “T”: the top of the T represents familiarity with the broad range of disciplines in data science, and the vertical bar represents deeper expertise in one of them. Data scientist has now become a hot job, so much so that a number of jobs with a more limited analytical focus, or that might not require advanced analytics, are being described as data scientist positions.