HadoopConceptsNote
Differences between YARN and MR1
| MapReduce 1 | YARN |
| --- | --- |
| Jobtracker | ResourceManager, ApplicationMaster, Timeline Server |
| Tasktracker | NodeManager |
| Slot | Container |
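Scalability
Availability
The ResourceManager's state is easier to replicate for high availability, because it does not have to track the status of every job every few seconds. In YARN that per-job tracking is done by each application's ApplicationMaster.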
Utilization
In MR1, each slot is designated for either map tasks or reduce tasks.
Slots are allocated statically and have a fixed size.
In YARN, a container can be used for either a map task or a reduce task.
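The utilization difference can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the function names `mr1_running` and `yarn_running` and the node sizes are hypothetical, and capacity is counted in whole tasks, whereas real YARN containers are sized by resources such as memory and CPU.

```python
def mr1_running(map_slots, reduce_slots, pending_maps, pending_reduces):
    """MR1-style node: slots are typed, so a map task cannot use a reduce slot."""
    running_maps = min(map_slots, pending_maps)
    running_reduces = min(reduce_slots, pending_reduces)
    return running_maps, running_reduces


def yarn_running(capacity, pending_maps, pending_reduces):
    """YARN-style node: one shared pool; a container can host either task type.

    Maps are served first here, which is an arbitrary choice for the sketch.
    """
    running_maps = min(capacity, pending_maps)
    running_reduces = min(capacity - running_maps, pending_reduces)
    return running_maps, running_reduces


# Hypothetical node with room for 6 concurrent tasks in total.
# Map-only phase of a job: 6 maps pending, no reduces yet.
mr1 = mr1_running(map_slots=4, reduce_slots=2, pending_maps=6, pending_reduces=0)
yarn = yarn_running(capacity=6, pending_maps=6, pending_reduces=0)

print("MR1 :", mr1, "idle capacity:", 6 - sum(mr1))
print("YARN:", yarn, "idle capacity:", 6 - sum(yarn))
```

In this model the MR1 node leaves its two reduce slots idle while maps are still waiting, and the YARN node uses its full capacity for maps.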
Multitenancy
Different versions of an application can run on the same cluster.
Resources are isolated between applications.