HadoopConceptsNote

Chapter 1: Data Flow

One Map task for each Split
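Overhead: a Split can't be too small, or there will be too many of them, and the cost of managing the Splits and starting a Map task for each one will dominate the job.
The suggested Split size is the same as the HDFS block size. If a Split is larger than a block, it spans more than one block, those blocks are unlikely to all be stored on the same node, and the Map task has to fetch the rest of its Split from other nodes over the network.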
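
The trade-off above can be checked with a little arithmetic. The sketch below is only an illustration, not Hadoop code: the 128 MB block size and the 1 GB file are assumed values, and plan_splits is a hypothetical helper. It also ignores details of the real split logic, such as the slack allowed on the last Split. For each Split size it counts how many Map tasks would be created and how many Splits touch more than one block.

```python
MB = 1024 * 1024


def plan_splits(file_size, split_size, block_size):
    """Cut a file into consecutive splits (hypothetical helper).

    Returns a list of (offset, length, blocks_touched) tuples, where
    blocks_touched is the number of HDFS blocks the split overlaps.
    """
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        first_block = offset // block_size
        last_block = (offset + length - 1) // block_size
        splits.append((offset, length, last_block - first_block + 1))
        offset += length
    return splits


def summarize(file_size, split_size, block_size):
    """Return (number of map tasks, number of splits spanning >1 block)."""
    splits = plan_splits(file_size, split_size, block_size)
    crossing = sum(1 for _, _, touched in splits if touched > 1)
    return len(splits), crossing


if __name__ == "__main__":
    block_size = 128 * MB   # assumed block size for this example
    file_size = 1024 * MB   # assumed input file of 1 GB

    for split_mb in (1, 128, 192):
        tasks, crossing = summarize(file_size, split_mb * MB, block_size)
        print(f"split={split_mb} MB: {tasks} map tasks, "
              f"{crossing} splits span more than one block")
```

With a Split equal to the block size, every Split sits inside exactly one block, so a Map task can be scheduled on a node that already holds all of its data. Smaller Splits multiply the number of tasks, and larger Splits start to cross block boundaries.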