Big Data Power Tools Bundle

Skillwise
Course Description
Scala's functional programming style and its REPL environment make it particularly well suited to a distributed computing framework like Spark. Used in tandem, the two technologies let you analyze and explore data interactively with extremely fast feedback. This course teaches you how to combine Spark and Scala, making it a good fit for aspiring data analysts and Big Data engineers.
- Access 51 lectures & 8.5 hours of content 24/7
- Use Spark for a variety of analytics & machine learning tasks
- Understand functional programming constructs in Scala
- Implement complex algorithms like PageRank & Music Recommendations
- Work w/ a variety of datasets from airline delays to Twitter, web graphs, & Product Ratings
- Use the different features & libraries of Spark, like RDDs, DataFrames, Spark SQL, MLlib, Spark Streaming, & GraphX
- Write code in Scala REPL environments & build Scala applications w/ an IDE
- Length of time users can access this course: lifetime
- Access options: web streaming, mobile streaming
- Certification of completion not included
- Redemption deadline: redeem your code within 30 days of purchase
- Experience level required: all levels, but some knowledge of Java or C++ is assumed
- Internet required
Course Syllabus
- Introduction
- Connect the Dots with Linear Regression
- Basic Statistics Used for Regression
- Simple Regression
- Applying Simple Regression Using Excel
- Multiple Regression
- Applying Multiple Regression using Excel
- Logistic Regression for Categorical Dependent Variables
- Solving Logistic Regression
- Applying Logistic Regression
- Introduction
- Factor Analysis and PCA
- Basic Statistics Required for PCA
- Diving into Principal Components Analysis
- PCA in Excel
- PCA in R
- PCA in Python
- Introduction
- Diving into R
- Vectors
- Arrays
- Matrices
- Factors
- Lists and Data Frames
- Descriptive Statistics
- Data Visualization in R
- You, Us & This Course
- Introducing Hive
- Built-in Functions
- Sub-Queries
- Partitioning
- Bucketing
- Windowing
- Understanding MapReduce
- MapReduce logic for queries: Behind the scenes
- Join Optimizations in Hive
- Hadoop and Hive Install
- Appendix
- Introduction
- Getting Started
- Loading Data into a QV App
- Exploring Data using the UI
- Transforming Data in Load Scripts
- Effectively presenting data
- Advanced Load Transformations
- You, This Course and Us
- Stream Processing with Storm
- Implementing a Hello World Topology
- Processing Data using Files
- Running a Topology in the Remote Mode
- Adding Parallelism to a Storm Topology
- Building a Word Count Topology
- Remote Procedure Calls Using Storm
- Managing Reliability of Topologies
- Integrating Storm with Different Sources/Sinks
- Using the Storm Multilang Protocol
- Complex Transformations using Trident
- Machine Learning using Storm
- You, This Course and Us
- Introducing Scala
- Expressions or Statements?
- First Class Functions
- Collections
- Classes and Objects
- You, This Course and Us
- Introduction to Spark
- Resilient Distributed Datasets
- Advanced RDDs: Pair Resilient Distributed Datasets
- Advanced Spark: Accumulators, Spark Submit, MapReduce, Behind the Scenes
- PageRank: Ranking Search Results
- Spark SQL
- MLlib in Spark: Build a recommendations engine
- Spark Streaming
- Graph Libraries
- Scala Language Primer
- Supplementary Installs