The “map” function sorts the data, and the “reduce” function generates frequencies of items. The combined overall system manages the parceling out of the data to multiple processors, and managing the tasks. Apache Hadoop is a popular open-source implementation of MapReduce, featuring the ability to use low cost commodity hardware for the processing tasks.
Week #45 – MapReduce
In computer science, MapReduce is a procedure that prepares data for parallel processing on multiple computers.