SQL on Hadoop - Analyzing Big Data with Hive

Course Summary

From developer to analyst, this course tackles a few big questions about big data: Why does this technology exist and why do I need it? How can I get the best out of it utilizing something familiar like SQL and how does this all fit together in an ever-ev


    Course Syllabus

    ● Introduction to Hadoop
        ◦ Introduction
        ◦ Motivation for Hadoop
        ◦ Distributed Computing Challenges
        ◦ Hadoop File System (HDFS)
        ◦ MapReduce
        ◦ Word Count Example
        ◦ Summary
    ● Introduction to Hive
        ◦ Introduction
        ◦ Hive Motivation
        ◦ Hive Architecture
        ◦ Hive Principles - Schema on Read
        ◦ Hive Principles - The Hive Warehouse
        ◦ Hive Query Language Basics - SELECT and Sub Queries
        ◦ Creating Databases and Tables with HiveQL
        ◦ Loading Data - Hive Managed and External Tables
        ◦ Summary
    ● Hive Query Language
        ◦ Introduction
        ◦ Data Types
        ◦ Type Conversions
        ◦ Managed Partitioned Tables
        ◦ External Partitioned Tables
        ◦ Multi Inserts and Dynamic Partition Inserts
        ◦ Data Retrieval - Group By and Functions
        ◦ Sorting and Controlling Data Flow
        ◦ The CLI and Variable Substitution
        ◦ Summary
    ● Advanced HiveQL
        ◦ Introduction
        ◦ Bucketing
        ◦ Bucket and Block Sampling
        ◦ Joins
        ◦ Joins in Depth and Join Optimizations
        ◦ Map-side Joins for Bucketed Tables
        ◦ Distributed Cache
        ◦ UDTFs, Explode and Lateral View
        ◦ Extending Hive - Custom UDF Recap
        ◦ Accessing The Distributed Cache
        ◦ Hadoop Streaming and Transform()
        ◦ Windowing and Analytics Functions
        ◦ Summary
    ● Storage and The Eco-System
        ◦ Create Table Statement - File Formats and SerDes
        ◦ HCatalog
        ◦ Sqoop
        ◦ DistCP
        ◦ Hadoop Eco-System Projects
        ◦ References and Resources
        ◦ Summary

    More Info

    Level: Intermediate
    Duration: 4h 16m
    Released: 09 Oct 2013

    Tags

    ● developer (1076)
    ● data (161)
    ● sql (66)
    ● cloud (56)
    ● big-data (10)
    ● hadoop (4)
    ● hive (1)
    ● mapreduce (1)

    Related Courses

    ● Big Data Analytics with Tableau
    ● SQL Big Data Convergence - The Big Picture
    ● SQL Azure

     


Course Fee: USD 29

Course Type: Self-Study

Course Status: Active

Workload: 1 - 4 hours / week
