Java Parallel Computation on Hadoop

Course Summary

Learn to write real, working data-driven Java programs that can run in parallel on multiple machines by using Hadoop.


    Course Syllabus

    • Overview
      • Welcome!
    • Background knowledge about Hadoop
      • Existing Technical Limitations
      • Requirements for the new approach
      • Hadoop solving the limitations
    • The Hadoop Ecosystem
      • Overview of HDFS
      • Overview of MapReduce
      • Overview of Hadoop clusters
    • Get Ready in pseudo-distributed mode
      • Cloudera VM
      • Demonstration: Using the VM
      • Shared Folders between your host OS and VM
      • Tips about Shared Folders
      • Accessing HDFS
      • Running MapReduce
      • Demonstration: Accessing HDFS
      • Demonstration: Running MapReduce
      • Demonstration: Web Console for HDFS
      • Demonstration: Web Console for MapReduce
    • Get Ready in distributed mode
      • About the Environment
      • Set up the Master node - Exercise Manual
      • Set up the Slave node - Exercise Manual
      • Start the Master node - Exercise Manual
      • Start the Slave node - Exercise Manual
    • Large-scale Word Counting
      • The Problem and Design
      • Demonstration: Develop and Run the program
      • Word Counting - Source Code
    • Large-scale Data Sorting
      • The Problem and Design
      • Demonstration: Develop and Run the program
      • Data Sorting - Source Code
    • Large-scale Pattern Searching
      • The Problem and Design
      • Demonstration: Develop and Run the program
      • Pattern Searching - Source Code
    • Large-scale Item Co-occurrence
      • The Problem and Design
      • Demonstration: Develop and Run the program
      • Item Co-occurrence - Source Code
    • Large-scale Inverted Index
      • The Problem and Design
      • Demonstration: Develop and Run the program
      • Inverted Index - Source Code
    • Large-scale Data Aggregation
      • The Problem and Design
      • Demonstration: Develop and Run the program
      • Data Aggregation - Source Code
    • Data Preparation
      • Dataset 0
      • Dataset 1
      • Dataset 2


Course Fee:

USD 19

Course Type:

Self-Study

Course Status:

Active

Workload:

1 - 4 hours / week
