The Building Blocks of Hadoop - HDFS, MapReduce, and YARN

Pluralsight

Course Summary

Processing billions of records requires a deep understanding of distributed computing. In this course, you'll be introduced to Hadoop, an open-source distributed computing framework that can help you do just that.

Course Description

You know how to write Java code and you know what processing you want to perform on your huge dataset. But can you use the Hadoop distributed framework effectively to get your work done? This course, The Building Blocks of Hadoop - HDFS, MapReduce, and YARN, gives you a fundamental understanding of the building blocks of Hadoop: HDFS for storage, MapReduce for processing, and YARN for cluster management, to help you bridge the gap between programming and big data analysis. First, you'll get a complete architecture overview of Hadoop. Next, you'll learn how to set up a pseudo-distributed Hadoop environment and submit and monitor tasks on that environment. Finally, you'll understand the configuration choices you can make for stability, reliability, and optimized task scheduling on your distributed system. By the end of this course, you'll have gained a strong understanding of the building blocks you need to use Hadoop effectively.

Course Syllabus

Course Overview - 1m 32s

- Course Overview (1m 32s)

Introducing Hadoop - 20m 34s

- The Need for Distributed Computing (4m 58s)
- Two Ways to Build a System (5m 0s)
- Introducing Hadoop (5m 34s)
- Other Technologies in the Hadoop Ecosystem (5m 0s)

Installing Hadoop - 33m 24s

- Hadoop Install Modes (4m 37s)
- Installing Hadoop in Standalone Mode (6m 59s)
- Pseudo-distributed Mode: Setting up SSH (4m 40s)
- Pseudo-distributed Mode: The JAVA_HOME Environment Variable (3m 11s)
- Pseudo-distributed Mode: Configuration Settings (3m 40s)
- Pseudo-distributed Mode: Starting HDFS and YARN (5m 3s)
- Pseudo-distributed Mode: Monitoring the Cluster (5m 10s)

Storing Data with HDFS - 34m 8s

- The Name Node and Data Nodes (4m 52s)
- Storing and Reading Files from HDFS (5m 21s)
- Introduction to HDFS Commands (4m 40s)
- Copying Files to and from Hadoop (4m 44s)
- Fault Tolerance with Replication (6m 54s)
- Name Node Failure Management (7m 34s)

Processing Data with MapReduce - 26m 19s

- The Map and Reduce Phases to Process Data (4m 52s)
- Data Flow in a MapReduce (5m 1s)
- Implement MapReduce in Java (3m 38s)
- Set up the Map, Reduce, and Main Classes (5m 42s)
- Submit a Jar to Hadoop (3m 53s)
- Monitor the MapReduce Job Using the Web Interface (3m 11s)

Scheduling and Managing Tasks with YARN - 22m 39s

- Anatomy of a Job Run in YARN (6m 14s)
- The First In, First Out Scheduler (4m 18s)
- The Capacity Scheduler (3m 34s)
- The Fair Scheduler (2m 31s)
- Running Jobs on a Specific Queue (6m 0s)

Course Fee:

USD 29

Course Type:	Self-Study
Course Status:	Active
Workload:	1 - 4 hours / week

This course is listed under Development & Implementations and Industry Specific Applications Community

Java

Hadoop

MapReduce

Secure Shell (SSH)

Java ARchive (JAR) file
