Big Data Hadoop Developer Certification Training Course
Intellipaat
Course Summary
Big Data Hadoop developer training by Intellipaat helps you master HDFS, MapReduce, YARN, Hive, Pig, Oozie, Flume and more. In this Big Data Hadoop developer online course you will work on four real-life projects and prepare for the Cloudera Spark and Hadoop Developer Certification (CCA175) exam. The course includes six months of access to the Intellipaat Hadoop cloudlab.
Course Description
About Hadoop Developer Training Course
What will you learn in this Hadoop Developer Online Training Course?
- Learn the Hadoop Architecture and Hadoop basics for beginners
- Learn what Hadoop, HDFS and the MapReduce framework are
- Write MapReduce programs and deploy Hadoop clusters
- Develop applications for Big Data using Hadoop Technology
- Develop YARN programs on the Hadoop 2.X version
- Work on Big Data analytics using Hive, Pig and YARN
- Integrate MapReduce and HBase for advanced usage and indexing
- Learn the fundamentals of the Spark framework and how it works
- Understand RDDs in Apache Spark
- Learn Hadoop development best practices
- Schedule jobs using Oozie
- Prepare for the Cloudera Spark and Hadoop Developer Certification
Who should take this Hadoop Developer Certification Training Course?
- Software developers and analytics, BI, ETL and data warehousing professionals
- Big Data Hadoop developers, architects and testing personnel
What are the prerequisites for Hadoop Developer Training?
You don’t need prior knowledge of Apache Hadoop.
Why should you take Online Hadoop Developer Training?
- Global Hadoop Market to Reach $84.6 Billion by 2021 – Allied Market Research
- Shortage of 1.4-1.9 million Big Data Hadoop analysts in the US alone by 2018 – McKinsey
- A Hadoop developer in the US can earn a salary of $100,000 – indeed.com
Course Syllabus
Big Data Hadoop Developer Course Content
Introduction to Big Data & Hadoop and its Ecosystem, MapReduce and HDFS
What is Big Data, Where does Hadoop fit in, Hadoop Distributed File System – Replications, Block Size, Secondary Namenode, High Availability, Understanding YARN – ResourceManager, NodeManager, Difference between 1.x and 2.x
Hadoop Installation & Setup
Hadoop 2.x Cluster Architecture, Federation and High Availability, A Typical Production Cluster Setup, Hadoop Cluster Modes, Common Hadoop Shell Commands, Hadoop 2.x Configuration Files, Cloudera Single Node Cluster
Deep Dive in MapReduce
How MapReduce Works, How Reducer Works, How Driver Works, Combiners, Partitioners, Input Formats, Output Formats, Shuffle and Sort, Map Side Joins, Reduce Side Joins, MRUnit, Distributed Cache
Lab exercises: Working with HDFS, Writing WordCount Program, Writing Custom Partitioner, MapReduce with Combiner, Map Side Join, Reduce Side Joins, Unit Testing MapReduce, Running MapReduce in Local Job Runner Mode
Graph Problem Solving
What is Graph, Graph Representation, Breadth First Search Algorithm, Graph Representation of MapReduce, How to do the Graph Algorithm, Example of Graph MapReduce
Exercise 1, Exercise 2, Exercise 3
Detailed understanding of Pig
A. Introduction to Pig
Understanding Apache Pig, its features and various uses, and learning to interact with Pig
B. Deploying Pig for data analysis
The syntax of Pig Latin, the various definitions, data sort and filter, data types, deploying Pig for ETL, data loading, schema viewing, field definitions, commonly used functions
C. Pig for complex data processing
Various data types including nested and complex, processing data with Pig, grouped data iteration, practical exercise
D. Performing multi-dataset operations
Data set joining, data set splitting, various methods for data set combining, set operations, hands-on exercise
E. Extending Pig
Understanding user defined functions, performing data processing with other languages, imports and macros, using streaming and UDFs to extend Pig, practical exercises
F. Pig Jobs
Working with real data sets involving Walmart and Electronic Arts as case studies
Detailed understanding of Hive
A. Hive Introduction
Understanding Hive, traditional database comparison with Hive, Pig and Hive comparison, storing data in Hive and Hive schema, Hive interaction and various use cases of Hive
B. Hive for relational data analysis
Understanding HiveQL, basic syntax, the various tables and databases, data types, data set joining, various built-in functions, deploying Hive queries on scripts, shell and Hue
C. Data management with Hive
The various databases, creation of databases, data formats in Hive, data modeling, Hive-managed tables, self-managed tables, data loading, changing databases and tables, query simplification with views, storing query results, data access control, managing data with Hive, Hive Metastore and Thrift server
D. Optimization of Hive
Learning query performance, data indexing, partitioning and bucketing
E. Extending Hive
Deploying user defined functions for extending Hive
F. Hands-on exercises – working with large data sets and extensive querying
Deploying Hive for huge volumes of data sets and large amounts of querying
G. UDF, query optimization
Working extensively with User Defined Queries, learning how to optimize queries, various methods for performance tuning
(AVRO) Data Formats
Selecting a File Format, Tool Support for File Formats, Avro Schemas, Using Avro with Hive and Sqoop, Avro Schema Evolution, Compression
Introduction to HBase architecture
What is HBase, Where does it fit, What is NoSQL
Hadoop Cluster Setup and Running MapReduce Jobs
Multi Node Cluster Setup using Amazon EC2 – Creating 4 node cluster setup, Running MapReduce Jobs on Cluster
Advanced MapReduce
Delving Deeper Into The Hadoop API, More Advanced MapReduce Programming, Joining Data Sets in MapReduce, Graph Manipulation in Hadoop
Big Data Hadoop Developer Project
Project Work
1. Project – Working with MapReduce, Hive, Sqoop
Problem Statement – Describes how to import MySQL data using Sqoop and query it using Hive, and how to run the word count MapReduce job.
2. Project – Hadoop YARN Project – End to End PoC
Problem Statement – It includes: importing movie data, appending the data, using Sqoop commands to bring the data into HDFS, the end-to-end flow of transaction data, and processing real-world data or huge amounts of data (such as movie data) using a MapReduce program.
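The word count MapReduce job named in the first project can be pictured as three phases: map, shuffle and sort, and reduce. The sketch below imitates those phases in plain Python on a single machine. It is only an illustration and not the course's project code or the Hadoop API; the function names (map_phase, shuffle_and_sort, reduce_phase, word_count) and the sample lines are assumptions made for this example.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_and_sort(pairs):
    """Shuffle and sort: group all values that share a key, in key order."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle_and_sort(mapped))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored in HDFS.
    sample = ["big data hadoop", "hadoop mapreduce", "big data"]
    print(word_count(sample))
    # prints {'big': 2, 'data': 2, 'hadoop': 2, 'mapreduce': 1}
```

On a real cluster the mapper and reducer run as separate tasks on many nodes and the framework performs the shuffle between them; here all three steps run in one process purely to show the data flow.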
This course is listed under Open Source, Development & Implementations, Industry Specific Applications, Data & Information Management, Networks & IT Infrastructure, Operating Systems and Quality Assurance & Testing