Hadoop ch1

  • Published on
    25-Jan-2015

Transcript

  • 1.
  • 2.
  • 3. 1.1, 1.2, 1.3, 1.4, 1.5
  • 4. 1.1
  • 5. (Big data?!) NYSE, 1; Facebook, 10; 15
  • 6. MyLifeBits (Microsoft Research); 1GB
  • 7. astrometry.net
  • 8. Big Data vs; Big Data, ...
  • 9. 1.2
  • 10. but, 1. 2. cost
  • 11. but, 1. 2. cost; 1 -> HDFS, 2 -> MapReduce
  • 12. 1.3 MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Map(k1, v1) -> list(k2, v2); Reduce(k2, list(v2)) -> list(v3)
  • 13. 1.3.1
  • 14. 1.3.2 Hadoop
  • 15. 1.3.3 SETI@home, Folding@home (http://cafe.naver.com/setikah); CPU
  • 16. 1.4 (made-up name)
  • 17. 1.4 GFS -> NDFS
  • 18. 1.5 I/O, Avro, RPC, HDFS
  • 19. 1.5 HBase: DB; DB, HDFS
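
Slide 12 gives the MapReduce signatures as Map(k1, v1) -> list(k2, v2) and Reduce(k2, list(v2)) -> list(v3). As a rough illustration of that model, the sketch below runs a word count in a single Python process. It is not Hadoop code: the names `map_fn`, `reduce_fn`, and `run_mapreduce`, and the sample input lines, are assumptions made for this example, and the in-memory grouping step only stands in for the shuffle that a real cluster performs between the two phases.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def map_fn(_offset: int, line: str) -> List[Tuple[str, int]]:
    """Map(k1, v1) -> list(k2, v2): emit (word, 1) for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(_word: str, counts: List[int]) -> List[int]:
    """Reduce(k2, list(v2)) -> list(v3): sum all counts seen for one word."""
    return [sum(counts)]


def run_mapreduce(
    records: Iterable[Tuple[int, str]],
    mapper: Callable[[int, str], List[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], List[int]],
) -> Dict[str, List[int]]:
    """Hypothetical single-process driver: map, group by key, then reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    # Map phase: each input record may emit any number of (k2, v2) pairs.
    for k1, v1 in records:
        for k2, v2 in mapper(k1, v1):
            # Grouping by k2 stands in for the shuffle between the phases.
            grouped[k2].append(v2)
    # Reduce phase: one reducer call per distinct key, in sorted key order.
    return {k2: reducer(k2, values) for k2, values in sorted(grouped.items())}


# Made-up sample input; (line number, line text) plays the role of (k1, v1).
lines = ["big data big cluster", "data on a cluster"]
result = run_mapreduce(enumerate(lines), map_fn, reduce_fn)
print(result)
```

Because each `map_fn` call depends only on its own record and each `reduce_fn` call only on one key's values, both phases could in principle be spread across many machines, which is the property the slide's definition relies on.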