SQL on Hadoop - Analyzing Big Data with Hive
Pluralsight
Course Summary
This course will teach you the Hive query language and how to apply it to solve common Big Data problems. It includes an introduction to distributed computing, Hadoop, and MapReduce fundamentals, as well as the latest features released with Hive 0.11.
Course Description
From developer to analyst, this course tackles a few big questions about big data: Why does this technology exist, and why do I need it? How can I get the best out of it using something familiar like SQL, and how does it all fit together in an ever-evolving eco-system? The course introduces the concepts of distributed computing, Hadoop, and MapReduce, and then goes into great detail on Apache Hive, which provides an SQL-like query language that can be used with Hadoop and NoSQL databases like HBase and Cassandra. It also presents some challenges you might encounter when solving real production problems and shows how Hive makes those tasks easier to accomplish.
Course Syllabus
Introduction to Hadoop - 24m 5s
—Introduction 1m 48s
—Motivation for Hadoop 1m 12s
—Distributed Computing Challenges 3m 1s
—Hadoop File System (HDFS) 1m 30s
—MapReduce 4m 8s
—Word Count Example 1m 27s
—Demo: Basic Hadoop Commands and Environment Setup 9m 49s
—Summary 1m 10s
Introduction to Hive - 46m 17s
—Introduction 1m 17s
—Hive Motivation 1m 36s
—Hive Architecture 2m 24s
—Hive Principles - Schema on Read 1m 3s
—Hive Principles - The Hive Warehouse 2m 29s
—Hive Query Language Basics - SELECT and Sub Queries 4m 9s
—Creating Databases and Tables with HiveQL 7m 51s
—Demo: Working with Hive Tables and Loading Data into Warehouse 12m 9s
—Loading Data - Hive Managed and External Tables 2m 27s
—Demo: External Tables and Create Table Alternatives 9m 48s
—Summary 1m 4s
Hive Query Language - 1h 28m
Advanced HiveQL - 1h 18m
Storage and The Eco-System - 19m 9s