Supercharge R with SparkR - Apply your R chops to Big Data!

Udemy
Course Summary
Extend R with Spark and SparkR - Create clusters on AWS, perform distributed modeling, and access HDFS and S3
Course Description
In this class you will learn how to:
- use R in a distributed environment
- create Spark clusters on Amazon Web Services (AWS)
- perform distributed modeling with GLM
- evaluate distributed regression and classification predictions
- access data from CSV, JSON, HDFS, and S3
All our examples are performed on real clusters: no training wheels, no single-node local clusters, and no third-party tools.
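To give a flavor of the workflow the course covers, here is a minimal SparkR sketch (not course material). It assumes a Spark 2.x installation with the SparkR package on the library path; the file flights.csv and the column names arr_delay, dep_delay, and distance are hypothetical. It reads a CSV into a distributed SparkDataFrame, fits a GLM, and scores predictions.

library(SparkR)

# Start a SparkR session (shown here locally; in the course this targets a
# real cluster on AWS)
sparkR.session(appName = "sparkr-glm-demo")

# Read a CSV into a distributed SparkDataFrame. "flights.csv" is a hypothetical
# file; an S3 path such as "s3a://my-bucket/flights.csv" or an HDFS path works
# the same way.
df <- read.df("flights.csv", source = "csv", header = "true", inferSchema = "true")

# Fit a generalized linear model on the distributed data
# (arr_delay, dep_delay, and distance are hypothetical column names)
model <- spark.glm(df, arr_delay ~ dep_delay + distance, family = "gaussian")
summary(model)

# Score the fitted model and inspect a few predictions
predictions <- predict(model, df)
head(select(predictions, "arr_delay", "prediction"))

sparkR.session.stop()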
Note 1: you will need to know how to SSH into your Amazon AWS instance (I demonstrate this on a Mac; Windows and Linux are not covered).
Note 2: There is a small cost involved in using Amazon AWS instances. The biggest machine we will use costs roughly US$0.05 per hour per machine.
This course is listed under Open Source, Cloud Computing, Development & Implementations, Data & Information Management, Operating Systems, and Server & Storage Management.