In computer science, MapReduce is a programming model for processing large datasets in parallel across multiple computers.
The "map" function transforms each input record into intermediate key-value pairs; the system then sorts and groups those pairs by key, and the "reduce" function aggregates the values for each key (for example, tallying the frequency of each item). The overall framework manages the parceling out of the data to multiple processors and the scheduling of the tasks. Apache Hadoop is a popular open-source implementation of MapReduce, notable for its ability to run on low-cost commodity hardware.
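The classic illustration of MapReduce is counting word frequencies in a collection of documents. Below is a minimal single-machine sketch of that idea; the sample corpus and the function names `map_fn` and `reduce_fn` are illustrative, and the grouping step stands in for the distributed shuffle-and-sort that a framework like Hadoop performs across machines.

```python
from collections import defaultdict

# Illustrative input: a small corpus of text lines.
documents = ["the quick brown fox", "the lazy dog", "the fox"]

# Map: emit a (word, 1) pair for every word in a document.
def map_fn(doc):
    for word in doc.split():
        yield (word, 1)

# Shuffle/sort: group intermediate pairs by key. In a real
# MapReduce system the framework does this between the phases.
groups = defaultdict(list)
for doc in documents:
    for key, value in map_fn(doc):
        groups[key].append(value)

# Reduce: aggregate the values for each key (here, sum the counts).
def reduce_fn(key, values):
    return key, sum(values)

counts = dict(reduce_fn(k, v) for k, v in groups.items())
print(counts)
```

Because each map call touches only one document and each reduce call touches only one key, both phases can run independently on many machines, which is what makes the model easy to parallelize.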
The Institute for Statistics Education offers an extensive glossary of statistical terms, available to all for reference and research. We will provide a statistical term every week, delivered directly to your inbox. To improve your own statistical knowledge, sign up here.