
MapReduce – Map Function

Input

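The map function takes a key-value pair as input. The key is a reference to the input value (for line-oriented text input in Hadoop, typically the byte offset of the line within the file), and the value is the actual data to be processed.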
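To make the shape of the input concrete, here is a minimal Python sketch of how a text file could be turned into (key, value) pairs, with the byte offset of each line as the key and the line itself as the value. The helper name `to_records` is hypothetical and is not part of any Hadoop API; it only imitates the idea.

```python
def to_records(data: bytes):
    """Yield (byte_offset, line) pairs from raw file contents.

    Hypothetical helper for illustration only: the key is the position
    where the line starts, the value is the line without its terminator.
    """
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)  # advance by the full line, terminator included


records = list(to_records(b"hello world\nfoo bar\n"))
for key, value in records:
    print(key, repr(value))
```

The key here carries no business meaning; it only identifies where the value came from, which is why many map functions ignore it.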

Processing

Processing is done by the map function, which is defined by the user. In the map function, the user can implement custom business logic.
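As an illustration of user-defined logic, the following Python sketch shows a word-count style map function. It is a plain-Python imitation rather than Hadoop code, and the name `map_fn` and the choice to lowercase words are assumptions made for the example.

```python
def map_fn(key, value):
    """Hypothetical user-defined map function for a word count.

    key:   reference to the input (ignored here)
    value: one line of text
    Emits one (word, 1) pair per word in the line.
    """
    for word in value.split():
        yield word.lower(), 1


pairs = list(map_fn(0, "Hello hello world"))
print(pairs)
```

Any other logic (filtering, parsing, transforming) could be placed in the body of the function instead; the framework only needs it to accept one pair and emit pairs.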

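The terms mapper and map are closely related: the mapper is the program (task) that performs the map phase, and it calls the map function once for each key-value pair, processing one pair at a time. The number of mappers depends on the size of the input: the framework divides the input into splits and runs one mapper per split.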
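The relationship between input size and mapper count can be sketched with a rough calculation. The function name `estimate_num_mappers` is hypothetical, and the 128 MB default is an assumption matching the common HDFS block size; the real count depends on the input format and configuration, and this sketch assumes a single non-empty file.

```python
def estimate_num_mappers(file_size_bytes, split_size_bytes=128 * 1024 * 1024):
    """Rough estimate: one mapper per input split (ceiling division).

    Assumes a single non-empty, splittable file and a fixed split size;
    actual frameworks apply extra rules that are ignored here.
    """
    if file_size_bytes <= 0 or split_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    return (file_size_bytes + split_size_bytes - 1) // split_size_bytes


one_gb = 1024 ** 3
print(estimate_num_mappers(one_gb))              # 1 GB of input
print(estimate_num_mappers(300 * 1024 * 1024))   # 300 MB of input
```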

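No mapper depends on any other mapper to continue; they are all independent of each other. This means mappers can run in parallel, and each mapper finishes as soon as it completes its own work, without waiting for the others. When there are more splits than available capacity, the next mapper is started as soon as a running one completes.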
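Because mappers share no state, they can be run concurrently. The Python sketch below imitates this with a thread pool on a single machine; in a real cluster the mappers would be separate tasks on different nodes. The names `run_mapper` and `splits` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_mapper(split):
    """Process one input split independently of all other splits."""
    output = []
    for _offset, line in split:
        for word in line.split():
            output.append((word, 1))
    return output


# Three hypothetical input splits, each a list of (offset, line) records.
splits = [
    [(0, "a b")],
    [(0, "c")],
    [(0, "a")],
]

# Each split goes to its own worker; no mapper reads another's data.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_mapper, splits))

print(results)
```

Since no mapper reads another mapper's input or output, the result for each split is the same whether the mappers run one after another or all at once.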

Output

The output is also in the form of key-value pairs. The output of the map phase is called intermediate output, and it is written to the local disk of the node running the mapper, not to HDFS. For each input record, the map function can emit zero or more output records.
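The sketch below illustrates both points in plain Python: one input record may produce zero, one, or several output records, and the intermediate output is stored in a local file. The tab-separated temporary file is an assumption for illustration only; it is not the on-disk format Hadoop uses.

```python
import os
import tempfile


def map_fn(offset, line):
    """Hypothetical map function: emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1


records = [(0, ""), (1, "hadoop"), (8, "map reduce map")]

# Number of output records produced by each input record.
emitted_per_input = [len(list(map_fn(k, v))) for k, v in records]
print(emitted_per_input)

# Write the intermediate output to a local temporary file (illustrative format).
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as f:
    for k, v in records:
        for out_key, out_value in map_fn(k, v):
            f.write(f"{out_key}\t{out_value}\n")
    path = f.name

with open(path) as f:
    lines = f.read().splitlines()
os.remove(path)
print(lines)
```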