Hadoop Testing Course Content
Introduction to Hadoop and its Ecosystem, MapReduce and HDFS
Introduction to Hadoop and its constituent ecosystem, understanding MapReduce and HDFS, Big Data, Factors constituting Big Data, Hadoop and Hadoop Ecosystem, MapReduce – Concepts of Map, Reduce, Ordering, Concurrency, Shuffle, Reducing, Hadoop Distributed File System (HDFS) Concepts and its Importance, Deep Dive in MapReduce – Execution Framework, Partitioner, Combiner, Data Types, Key-Value Pairs, HDFS Deep Dive – Architecture, Data Replication, Name Node, Data Node, Data Flow, Parallel Copying with DistCp, Hadoop Archives
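To make the map, combine, partition, shuffle, and reduce stages listed above concrete, here is a minimal in-memory sketch in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, combine, partition, shuffle, reduce_phase, word_count) and the toy byte-sum partitioner are assumptions chosen for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate counts locally on one mapper's output."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: pick a reducer for a key (toy deterministic hash, an assumption)."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(mapper_outputs, num_reducers):
    """Shuffle: route each pair to its partition and group values by key."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for key, value in pairs:
            partitions[partition(key, num_reducers)][key].append(value)
    return partitions


def reduce_phase(key, values):
    """Reduce: sum all counts collected for one key."""
    return key, sum(values)


def word_count(lines, num_reducers=2):
    """Run the whole pipeline, treating each input line as one mapper's split."""
    mapper_outputs = [combine(map_phase(line)) for line in lines]
    result = {}
    for part in shuffle(mapper_outputs, num_reducers):
        for key in sorted(part):  # each reducer sees its keys in sorted order
            word, total = reduce_phase(key, part[key])
            result[word] = total
    return result
```

Calling word_count(["big data big", "data node"]) counts "big" twice, "data" twice, and "node" once. In a real cluster each stage runs on different machines; this sketch only mirrors the data flow.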
Hands-on Exercises
Installing Hadoop in Pseudo Distributed Mode, Understanding Important configuration files, their Properties and Daemon Threads, Accessing HDFS from Command Line, MapReduce – Basic Exercises, Understanding Hadoop Ecosystem, Introduction to Sqoop, use cases and Installation, Introduction to Hive, use cases and Installation, Introduction to Pig, use cases and Installation, Introduction to Oozie, use cases and Installation, Introduction to Flume, use cases and Installation, Introduction to YARN, Mini Project – Importing MySQL Data using Sqoop and Querying it using Hive
How to develop a MapReduce application, Writing unit tests, Best practices for developing and writing MapReduce applications, Debugging MapReduce applications
Introduction to Pig & its features
What Is Pig?, Pig’s Features, Pig Use Cases, Interacting with Pig, Basic Data Analysis with Pig, Pig Latin Syntax, Loading Data, Simple Data Types, Field Definitions, Data Output, Viewing the Schema, Filtering and Sorting Data, Commonly-Used Functions, Hands-On Exercise: Using Pig for ETL Processing
Introduction to Hive
What Is Hive?, Hive Schema and Data Storage, Comparing Hive to Traditional Databases, Hive vs. Pig, Hive Use Cases, Interacting with Hive, Relational Data Analysis with Hive, Hive Databases and Tables, Basic HiveQL Syntax, Data Types, Joining Data Sets, Common Built-in Functions, Hands-On Exercise: Running Hive Queries on the Shell, Scripts, and Hue
Hadoop Stack Integration Testing
Why Hadoop testing is important, Unit testing, Integration testing, Performance testing, Diagnostics, Nightly QA test, Benchmark and end to end tests, Functional testing, Release certification testing, Security testing, Scalability Testing, Commissioning and Decommissioning of Data Nodes Testing, Reliability testing, Release testing
Roles and Responsibilities of Hadoop Testing
Understanding the Requirements, Preparation of the Testing Estimation, Test Cases, Test Data, Test bed creation, Test Execution, Defect Reporting, Defect Retest, Daily Status report delivery, Test completion, ETL testing at every stage (HDFS, Hive, HBase) while loading the input (logs, files, records, etc.) using Sqoop or Flume, including but not limited to data verification and reconciliation, User Authorization and Authentication testing (Groups, Users, Privileges, etc.), Reporting defects to the development team or manager and driving them to closure, Consolidating all defects and creating defect reports, Validating new features and issues in Core Hadoop.
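As an illustration of the data verification and reconciliation step in ETL testing, the sketch below compares rows exported from a source system with rows read back from the target by count and by per-row fingerprint. It is plain Python with no Hadoop, Hive, or Sqoop calls; the names row_fingerprint and reconcile are hypothetical, and it assumes both sides can be read as lists of tuples whose fields compare equal once converted to strings.

```python
import hashlib
from collections import Counter


def row_fingerprint(row):
    """Hash a row's fields into a stable digest.

    Assumption: fields are compared by their string form, so 1 and "1"
    are treated as the same value.
    """
    joined = "\x1f".join(str(field) for field in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows):
    """Compare two row collections by count and by per-row fingerprint."""
    source = Counter(row_fingerprint(r) for r in source_rows)
    target = Counter(row_fingerprint(r) for r in target_rows)
    missing = source - target      # rows present in source but not in target
    unexpected = target - source   # rows present in target but not in source
    return {
        "source_count": sum(source.values()),
        "target_count": sum(target.values()),
        "missing_in_target": sum(missing.values()),
        "unexpected_in_target": sum(unexpected.values()),
        "match": not missing and not unexpected,
    }
```

A report with "match" set to False and a non-zero "missing_in_target" suggests rows were dropped during the load; duplicates show up as "unexpected_in_target" because the comparison counts repeated rows.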
MRUnit Framework for Testing MapReduce Programs
Responsibility for setting up a testing framework, MRUnit, for testing MapReduce programs.
Automation testing using Oozie, Data validation using the QuerySurge tool.
Test Execution of Hadoop (Customized)
Test plan for HDFS upgrade, Test automation and results
Test Plan, Strategy, and Test Cases for Hadoop Testing
How to test installation and configuration
Hadoop Testing Projects
Project Work
Project 1 – Working with MapReduce, Hive, Sqoop
Problem Statement – Describes how to import MySQL data using Sqoop and query it using Hive, and how to run the word count MapReduce job.
Project 2 – Hadoop Testing using MR
Problem Statement – Describes how to test MapReduce code with MRUnit.