Reduce tasks have no data locality:
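- Reduce tasks consume Map output. Map output is intermediate data: it is written to the local disk of the node that ran the Map task, copied over the network to the node running the Reduce task, and discarded once the job has completed. That is why Map output is not stored on HDFS, where replicating it would be wasted effort.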
- The output of a Reduce task is stored on HDFS.
- For each HDFS block of the Reduce output, the first replica is stored on the node that ran the Reduce task; the remaining replicas are placed on other nodes.
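
The data flow in the points above can be sketched as a toy model. This is not Hadoop code: `Node`, `run_map`, `run_reduce`, and `cleanup` are hypothetical names made up for illustration, the word count is just a stand-in job, and the way the "other" replica nodes are picked is a simplification (real HDFS takes rack placement into account).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    local_disk: dict = field(default_factory=dict)   # intermediate map output
    hdfs_blocks: dict = field(default_factory=dict)  # HDFS replicas held by this node


def run_map(node, split):
    """Map: count words and keep the result on the map node's local disk only."""
    counts = {}
    for word in split.split():
        counts[word] = counts.get(word, 0) + 1
    node.local_disk["map_output"] = counts


def run_reduce(reduce_node, map_nodes, cluster, replication=3):
    """Reduce: copy every map output over the network, merge, write to HDFS."""
    totals = {}
    for m in map_nodes:
        # Shuffle copy: the input lives on other nodes, hence no data locality.
        for word, n in m.local_disk["map_output"].items():
            totals[word] = totals.get(word, 0) + n
    # First replica on the node that ran the reduce task, the rest elsewhere
    # (assumption: any other node will do in this toy model).
    others = [n for n in cluster if n is not reduce_node]
    targets = [reduce_node] + others[: replication - 1]
    for t in targets:
        t.hdfs_blocks["part-r-00000"] = dict(totals)
    return [t.name for t in targets]


def cleanup(map_nodes):
    """Discard the intermediate map output once the job has finished."""
    for m in map_nodes:
        m.local_disk.pop("map_output", None)


cluster = [Node("n1"), Node("n2"), Node("n3"), Node("n4")]
run_map(cluster[0], "a b a")
run_map(cluster[1], "b c")
placement = run_reduce(cluster[2], cluster[:2], cluster)
cleanup(cluster[:2])
print(placement)  # the first entry is the reduce node, n3
print(cluster[2].hdfs_blocks["part-r-00000"])
```

In this sketch the map output never reaches `hdfs_blocks`; only the reduce output does, and its first replica lands on the node that ran the reduce.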