Hadoop - OSC2013 .Enterprise

DESCRIPTION

12/13 OSC 2013 .Enterprise Hadoop

Transcript

  • 1. Apache Hadoop Hadoop
  • 2. (@_sinchii_) Hadoop Hadoop Advent Calendar : 12/21 http://qiita.com/advent-calendar/2013/hadoop 12/1 OSC .Enterprise 2013 2
  • 3. Hadoop Hadoop http://hugjp.org/index.php : 12/20 Advent Calendar @
  • 4. Hadoop Hadoop Hadoop Hadoop
  • 5. Hadoop: Google MapReduce (2004) corresponds to MapReduce; Google File System (2003) corresponds to HDFS. OS / Java
  • 6. MapReduce (diagram): Map tasks emit records, the Shuffle groups them by Key, and Reduce tasks process each group
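The Map, Shuffle, and Reduce stages named on slide 6 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it runs in one process, uses word counting as the example job, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) are hypothetical, not Hadoop APIs.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit one (key, value) pair per word in the input line.
    return [(word, 1) for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all values that share the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key, values):
    # Reduce: collapse the grouped values for one key into a single result.
    return key, sum(values)


def run_job(lines):
    # Chain the three stages; a real cluster would spread them over many nodes.
    mapped = [pair for line in lines for pair in map_phase(line)]
    shuffled = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in shuffled.items())


counts = run_job(["hadoop hdfs", "hadoop yarn", "hdfs hadoop"])
print(counts)
```

In a real cluster the shuffle moves data between nodes over the network; here it is just a dictionary grouping.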
  • 7. Hadoop architecture (diagram). MapReduce side: JobClient, JobTracker, TaskTracker(s) running Map (M) and Reduce (R) tasks. HDFS side: DFSClient, NameNode, DataNode(s)
  • 9. Hadoop history (timeline, 2003 to 2013): The Google File System (2003); MapReduce (2004); 2005; 2006; 0.20 (2009); 1.0 (2011); 2, HA (2013). Also on the timeline: EMR, HDP, CDH; API; Sqoop, Hive, Flume, Pig, HBase, Impala, Oozie, Ambari, Spark
  • 10. Hadoop ecosystem: Hive (SQL style); Pig (DSL); Mahout; HBase; Flume; Sqoop (DB); Oozie; Spark; Ambari; Impala; ZooKeeper; with MapReduce and HDFS underneath
  • 12. YARN: Yet Another Resource Negotiator. Hadoop 1.0 MapReduce: JobTracker (MapReduce); TaskTracker (Map / Reduce)
  • 13. YARN: Yet Another Resource Negotiator. The JobTracker's work is split between the ResourceManager and a per-application ApplicationMaster (runs on a NodeManager; for MapReduce, the MapReduce ApplicationMaster, in a Container). NodeManager: manages per-node resources (CPU)
  • 14. MapReduce on YARN (diagram): Resource Manager; Node Managers hosting the Application Master (AM) and the Containers that run Map (Reduce) tasks; CPU; JobHistory Server
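The division of labor on slides 13 and 14 (a ResourceManager handing out Containers on NodeManagers, with the ApplicationMaster itself running in a Container) can be illustrated with a toy allocator. This is a sketch under stated assumptions, not YARN's scheduler: the classes, the first-fit policy, and the use of CPU cores as the only resource are all simplifications chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    name: str
    free_vcores: int  # CPU capacity this node can still hand out
    containers: list = field(default_factory=list)


class ResourceManager:
    """Toy cluster-wide allocator (hypothetical; first-fit over nodes)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def allocate(self, owner, vcores):
        # Place a container on the first node with enough free CPU.
        for node in self.nodes:
            if node.free_vcores >= vcores:
                node.free_vcores -= vcores
                node.containers.append((owner, vcores))
                return node.name
        return None  # no capacity left anywhere


rm = ResourceManager([NodeManager("nm1", 2), NodeManager("nm2", 2)])
am_node = rm.allocate("application-master", 1)  # the AM runs in a container too
map_node = rm.allocate("map-task", 2)
reduce_node = rm.allocate("reduce-task", 1)
rejected = rm.allocate("map-task", 1)
print(am_node, map_node, reduce_node, rejected)
```

Unlike fixed Map and Reduce slots, a container here is just an amount of resource, so the same capacity can serve either kind of task.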
  • 15. YARN and MapReduce: API; Hadoop 1.0 MapReduce; ApplicationMaster; MapReduce
  • 16. YARN beyond MapReduce: Apache Spark; Apache Storm; Apache Giraph; Apache Tez (Hive/Pig); HOYA (Apache HBase); Impala
  • 17. YARN stable... (2013-12-12)... YARN HA: ResourceManager, ApplicationMaster; CapacityScheduler; FairScheduler; ApplicationMaster
  • 18. HDFS 2.0: NameNode HA; HDFS Snapshot; HDFS Cache; NFS; (HDFS Federation)
  • 19. NameNode HA (diagram): ZooKeeper; QJM (edits); ZKFC with NameNode (active), fsimage; ZKFC with NameNode (standby); edits pass through three JournalNodes
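Slide 19 shows edits going from the active NameNode to three JournalNodes through QJM. The quorum idea behind that arrangement can be sketched as follows, assuming a write counts as durable once a majority of JournalNodes acknowledge it. This is a deliberate simplification: epochs, recovery, and ZKFC-driven failover are left out, and every name in the code is hypothetical.

```python
def quorum_size(total_nodes):
    # A majority: more than half of the JournalNodes.
    return total_nodes // 2 + 1


def write_edit(journal_nodes, txid, edit):
    # Send the edit to every reachable JournalNode and count acknowledgements.
    acks = 0
    for node in journal_nodes:
        if node["up"]:
            node["edits"].append((txid, edit))
            acks += 1
    # The write succeeds only if a majority stored it.
    return acks >= quorum_size(len(journal_nodes))


journal_nodes = [{"name": f"jn{i}", "up": True, "edits": []} for i in range(1, 4)]

ok_all_up = write_edit(journal_nodes, 1, "mkdir /user/hoge")
journal_nodes[2]["up"] = False  # one JournalNode fails
ok_one_down = write_edit(journal_nodes, 2, "create /user/hoge/file1")
journal_nodes[1]["up"] = False  # a second one fails
ok_two_down = write_edit(journal_nodes, 3, "delete /user/hoge/file1")
print(ok_all_up, ok_one_down, ok_two_down)
```

With three JournalNodes the sketch tolerates one failure, which is why an odd number of nodes is the usual choice.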
  • 20. HDFS Snapshot (diagram): directory tree under /user with hoge, fuga, dir1 and file1 to file5; 65535; Read-Only
  • 21. HDFS Snapshot commands. Create: hdfs dfs -createSnapshot. Delete: hdfs dfs -deleteSnapshot. Rename: hdfs dfs -renameSnapshot. Diff: hdfs snapshotDiff. List: hdfs dfs -ls /.snapshot/
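The commands on slide 21 take a directory path and one or two snapshot names. A small sketch that assembles the full command lines may make the argument order easier to see. It only builds the argument lists and does not run anything; the helper names, the path /user/hoge (borrowed from slide 20), and the snapshot names s1 and s2 are assumptions for illustration. The directory would first need to be made snapshottable, for example with `hdfs dfsadmin -allowSnapshot <path>`.

```python
def create_snapshot(path, name):
    # hdfs dfs -createSnapshot <path> <snapshotName>
    return ["hdfs", "dfs", "-createSnapshot", path, name]


def rename_snapshot(path, old_name, new_name):
    # hdfs dfs -renameSnapshot <path> <oldName> <newName>
    return ["hdfs", "dfs", "-renameSnapshot", path, old_name, new_name]


def delete_snapshot(path, name):
    # hdfs dfs -deleteSnapshot <path> <snapshotName>
    return ["hdfs", "dfs", "-deleteSnapshot", path, name]


def snapshot_diff(path, from_name, to_name):
    # hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
    return ["hdfs", "snapshotDiff", path, from_name, to_name]


def list_snapshots(path):
    # Snapshots appear under the read-only .snapshot directory of the path.
    return ["hdfs", "dfs", "-ls", path.rstrip("/") + "/.snapshot"]


commands = [
    create_snapshot("/user/hoge", "s1"),
    create_snapshot("/user/hoge", "s2"),
    snapshot_diff("/user/hoge", "s1", "s2"),
    list_snapshots("/user/hoge"),
    delete_snapshot("/user/hoge", "s1"),
]
for argv in commands:
    print(" ".join(argv))
```

Each list could be handed to `subprocess.run` on a machine with an HDFS client configured.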
  • 22. Apache Pig: 0.12.0 (AvroStorage; ASSERT; IN; CASE; HCatalog, Hive). Apache Hive: 0.12.0 (Date; Parallel ORDER BY)
  • 23. Java 7; Windows; Hadoop; audit; stacktrace
  • 25. Hadoop release lines (diagram): Trunk; Hadoop 2 (2.2 current), 2.3, 2.4 ?; 2.2.1; 2.3: YARN HA (RM Fail Over via ZKFC) ?, Application History Server, Long-running applications, HDFS Trace ?, HDFS Symlink ?; Hadoop 1 ?
  • 26. Around Hadoop: Apache Sentry; Apache Tez (YARN, Pig / Hive); Stinger (Hadoop, Hive 100...); OpenStack Savanna
  • 27. Hadoop 2. HDFS; YARN (Hive/Pig ...); (HDFS+MR); HDFS