YARN: The Resource Manager for Hadoop

After this video you will be able to:
• Outline how YARN provides flexible resource management for a Hadoop cluster
• Explain how YARN extends Hadoop to enable multiple frameworks such as MapReduce, Giraph, Spark and Flink

[Figure: Hadoop ecosystem stack. HBase, Hive, Pig, ZooKeeper, Giraph, Storm, Spark, Flink, MongoDB and Cassandra, with MapReduce and YARN on top of HDFS]

Cluster utilization: share Hadoop across applications

Hadoop evolved over time!
[Figure: the Hadoop 1.0 stack (Hive, Pig, others, MapReduce, HDFS) next to the Hadoop 2.0 stack shown above]

Hadoop 1.0
• Only MapReduce jobs
• Other applications not supported
• Poor resource utilization

One dataset, many applications
[Figure: HADOOP 1.0 (HDFS, MapReduce) versus HADOOP 2.0 (HDFS, YARN, MapReduce, Spark, others)]

YARN (Yet Another Resource Negotiator)
• Central Resource Manager = ultimate decision maker
• Each machine gets a Node Manager
• Data computation framework: Resource Manager + Node Manager
• Application Master = personal negotiator: it negotiates resources with the Resource Manager and works with the Node Manager to get the job done
• Container = a machine (more precisely, a share of one machine's resources, such as CPU and memory)

Essential gears in the YARN engine
• Resource Manager
• Application Master
• Node Manager
• Container

* Source: “Apache Hadoop YARN: Yet Another Resource Negotiator.” In Proceedings of the 4th Annual Symposium on Cloud Computing, 5:1–5:16. SOCC ’13.

• 2.5X ↑ Number of tasks from all jobs
• 2X ↑ CPU utilization
• 2X ↑ Jobs per day

YARN: more applications
Apache Hama and growing …

[Figure labels: Data, Value]
• Higher resource utilization
• Lower cost

One dataset, many applications: many choices in Hadoop 2.0
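
The division of labor among the four "gears" can be made concrete with a small toy model. The sketch below is not the YARN API: every class, method and name in it (`NodeManager.launch`, `ResourceManager.allocate`, `ApplicationMaster.run`, the slot counts) is a hypothetical simplification, and a container is reduced to a single "slot" on a node. It only illustrates the flow described in the slides: each application's Application Master asks the central Resource Manager for containers, then works with the Node Managers to start its tasks.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Hypothetical per-machine agent; one slot stands in for one container."""
    name: str
    free_slots: int
    running: list = field(default_factory=list)

    def launch(self, app_name, task):
        # Start a task in a container on this node and return the container id.
        container_id = f"{self.name}/container-{len(self.running)}"
        self.running.append((container_id, app_name, task))
        return container_id


class ResourceManager:
    """Hypothetical central decision maker: picks the node for each container."""

    def __init__(self, nodes):
        self.nodes = list(nodes)  # assumes at least one node

    def allocate(self, count):
        # Grant up to `count` containers, each on the node with the most free slots.
        granted = []
        for _ in range(count):
            node = max(self.nodes, key=lambda n: n.free_slots)
            if node.free_slots == 0:
                break  # cluster is full: grant fewer containers than requested
            node.free_slots -= 1
            granted.append(node)
        return granted


class ApplicationMaster:
    """Hypothetical per-application negotiator."""

    def __init__(self, app_name, tasks):
        self.app_name = app_name
        self.tasks = list(tasks)

    def run(self, resource_manager):
        # Negotiate containers, then ask each granted node to run one task.
        granted = resource_manager.allocate(len(self.tasks))
        return {
            task: node.launch(self.app_name, task)
            for task, node in zip(self.tasks, granted)
        }


# Two different frameworks share one small cluster (3 slots in total).
cluster = ResourceManager([NodeManager("node1", 2), NodeManager("node2", 1)])
print(ApplicationMaster("mapreduce", ["map-0", "map-1"]).run(cluster))
print(ApplicationMaster("spark", ["stage-0", "stage-1"]).run(cluster))
```

In this toy run the second application receives only one container because the cluster has three slots in total. A real scheduler would queue the remaining request rather than drop it; that detail is left out here to keep the sketch short.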