Data Science Architect Masters Program

Course Summary

Our Data Science Architect masters program lets you gain proficiency in Data Science. You will work on real-world projects in Data Science with R, Apache Spark, Scala, Deep Learning, Tableau, Data Science with SAS, Hadoop Developer, Mahout and more. In this program you will cover 10 courses and 20 industry-based projects.

    Course Syllabus

    Data Science Course Content

    Introduction to Data Science and Statistical Analytics
    Introduction to Data Science, Use cases, Need of Business Analytics, Data Science Life Cycle, Different tools available for Data Science
    Introduction to R
    Installing R and R-Studio, R packages, R Operators, if statements and loops (for, while, repeat, break, next), switch case
    Data Exploration, Data Wrangling and R Data Structure
    Importing and Exporting data from external sources, Exploratory data analysis, R Data Structure (Vector, Scalar, Matrices, Array, Data frame, List), Functions, Apply Functions
    Data Visualization
    Bar Graph (Simple, Grouped, Stacked), Histogram, Pie Chart, Line Chart, Box (Whisker) Plot, Scatter Plot, Correlogram
    Introduction to Statistics
    Terminologies of Statistics, Measures of Centers, Measures of Spread, Probability, Normal Distribution, Binary Distribution, Hypothesis Testing, Chi Square Test, ANOVA
    Predictive Modeling – 1 (Linear Regression)
    Supervised Learning – Linear Regression, Bivariate Regression, Multiple Regression Analysis, Correlation (Positive, negative and neutral), Industrial Case Study, Machine Learning Use-Cases, Machine Learning Process Flow, Machine Learning Categories
    Predictive Modeling – 2 (Logistic Regression)
    Logistic Regression
    Decision Trees
    What is Classification and its use cases?, What is Decision Tree?, Algorithm for Decision Tree Induction, Creating a Perfect Decision Tree, Confusion Matrix
    Random Forest
    Random Forest, What is Naive Bayes?
    Unsupervised learning
    What is Clustering & its Use Cases?, What is K-means Clustering?, What is Canopy Clustering?, What is Hierarchical Clustering?
    Association Analysis and Recommendation engine
    Market Basket Analysis (MBA), Association Rules, Apriori Algorithm for MBA, Introduction of Recommendation Engine, Types of Recommendation – User-Based and Item-Based, Recommendation Use-case
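
    The support and confidence measures behind association rules can be illustrated with a small sketch. It is written in Python (one of the languages this program covers) rather than R, and the basket data and the helper names `support` and `confidence` are made up for illustration. It is not the Apriori algorithm itself, only the two measures that association rules are ranked by.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)


bread_milk_support = support({"bread", "milk"})
bread_to_milk_confidence = confidence({"bread"}, {"milk"})
print(bread_milk_support, bread_to_milk_confidence)
```

    Here two of the four baskets contain both bread and milk, and two of the three bread baskets also contain milk, which is what the two values express.
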
    Sentiment Analysis
    Introduction to Text Mining, Introduction to Sentiment, Setting up an API bridge between R and a Twitter account, Extracting tweets from the Twitter account, Scoring the tweets
    Time Series
    What is Time Series data?, Time Series variables, Different components of Time Series data, Visualize the data to identify Time Series Components, Implement ARIMA model for forecasting, Exponential smoothing models, Identifying the time series scenarios to which each Exponential Smoothing model applies, Implement the respective ETS model for forecasting

    R Programming Course Content

    Introduction to R

    R language for statistical programming, the various features of R, introduction to R Studio, the statistical packages, familiarity with different data types and functions, learning to deploy them in various scenarios, use SQL to apply ‘join’ function, components of R Studio like code editor, visualization and debugging tools, learn about R-bind.


    R Functions, code compilation and data in well-defined format called R-Packages, learn about R-Package structure, Package metadata and testing, CRAN (Comprehensive R Archive Network), Vector creation and variable value assignment.

    Sorting Dataframe

    R functionality, Rep Function, generating Repeats, Sorting and generating Factor Levels, Transpose and Stack Function.

    Matrices and Vectors

    Introduction to matrix and vector in R, understanding the various functions like Merge, Strsplit, Matrix manipulation, rowSums, rowMeans, colMeans, colSums, sequencing, repetition, indexing and other functions.

    Reading data from external files

    Understanding subscripts in plots in R, how to obtain parts of vectors, using subscripts with arrays, as logical variables, with lists, understanding how to read data from external files.

    Generating plots

    Generate plot in R, Graphs, Bar Plots, Line Plots, Histogram, components of Pie Chart.

    Analysis of Variance (ANOVA)

    Understanding Analysis of Variance (ANOVA) statistical technique, working with Pie Charts, Histograms, deploying ANOVA with R, one way ANOVA, two way ANOVA.

    K-means Clustering

    K-Means Clustering for Cluster & Affinity Analysis, Cluster Algorithm, cohesive subset of items, solving clustering issues, working with large datasets, association rule mining affinity analysis for data mining and analysis and learning co-occurrence relationships.

    Association Rule Mining

    Introduction to Association Rule Mining, the various concepts of Association Rule Mining, various methods to predict relations between variables in large datasets, the algorithm and rules of Association Rule Mining, understanding single cardinality.

    Regression in R

    Understanding what is Simple Linear Regression, the various equations of Line, Slope, Y-Intercept Regression Line, deploying analysis using Regression, the least square criterion, interpreting the results, standard error to estimate and measure of variation.
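
    The least-squares criterion for a simple regression line can be sketched in a few lines. This example uses plain Python instead of R, and the function names `fit_line` and `r_squared` as well as the sample points are assumptions made for illustration; in R the same fit would normally come from the built-in `lm` function.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: share of variation explained by the line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Illustrative points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(xs, ys, slope, intercept)
print(slope, intercept, fit_quality)
```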

    Analyzing Relationship with Regression

    Scatter Plots, Two variable Relationship, Simple Linear Regression analysis, Line of best fit

    Advanced Regression

    Deep understanding of the measure of variation, the concept of coefficient of determination, F-Test, the test statistic with an F-distribution, advanced regression in R, prediction with linear regression.

    Logistic Regression

    Logistic Regression Mean, Logistic Regression in R.

    Advanced Logistic Regression

    Advanced logistic regression, understanding how to do prediction using logistic regression, ensuring the model is accurate, understanding sensitivity and specificity, confusion matrix, what is ROC, a graphical plot illustrating binary classifier system, ROC curve in R for determining sensitivity/specificity trade-offs for a binary classifier.
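
    Sensitivity and specificity both fall out of the four confusion-matrix counts. The sketch below is plain Python rather than R, and the labels, predictions and the helper name `confusion_counts` are hypothetical; it assumes a binary classifier whose outputs have already been thresholded to 0 or 1.

```python
def confusion_counts(actual, predicted):
    """Return (tp, tn, fp, fn) for binary labels coded as 0 and 1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


# Made-up ground truth and model predictions for ten cases.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(actual, predicted)
sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # true negative rate
print(tp, tn, fp, fn, sensitivity, specificity)
```

    Repeating this calculation at many classification thresholds and plotting sensitivity against 1 - specificity gives the ROC curve.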

    Receiver Operating Characteristic (ROC)

    Detailed understanding of ROC, area under ROC Curve, converting the variable, data set partitioning, understanding how to check for multicollinearity, how two or more variables are highly correlated, building of model, advanced data set partitioning, interpreting of the output, predicting the output, detailed confusion matrix, deploying the Hosmer-Lemeshow test for checking whether the observed event rates match the expected event rates.

    Kolmogorov Smirnov Chart

    Data analysis with R, understanding the Wald test, McFadden’s pseudo R-squared, the significance of the area under ROC Curve, Kolmogorov Smirnov Chart which is a non-parametric test of one-dimensional probability distributions.

    Database connectivity with R

    Connecting to various databases from the R environment, deploying the ODBC tables for reading the data, visualization of the performance of the algorithm using Confusion Matrix.

    Integrating R with Hadoop

    Creating an integrated environment for deploying R on Hadoop platform, working with R Hadoop, RMR package and R Hadoop Integrated Programming Environment, R programming for MapReduce jobs and Hadoop execution.

    R Case Studies

    Logistic Regression Case Study

    In this case study you will get a detailed understanding of the advertisement spends of a company that will help to drive more sales. You will deploy logistic regression to forecast the future trends, detect patterns, uncover insights and more all through the power of R programming. Due to this the future advertisement spends can be decided and optimized for higher revenues.

    Multiple Regression Case Study

    You will understand how to compare the miles per gallon (MPG) of a car based on the various parameters. You will deploy multiple regression and note down the MPG for car make, model, speed, load conditions, etc. It includes the model building, model diagnostic, checking the ROC curve, among other things.

    Receiver Operating Characteristic (ROC) case study

    You will work with various data sets in R, deploy data exploration methodologies, build scalable models, predict the outcome with highest precision, diagnose the model that you have created with various real world data, check the ROC curve and more.

    Scala Course Content

    Introduction of Scala

    Introducing Scala and deployment of Scala for Big Data applications and Apache Spark analytics.

    Pattern Matching

    The importance of Scala, the concept of REPL (Read Evaluate Print Loop), deep dive into Scala pattern matching, type inference, higher order function, currying, traits, application space and Scala for data analysis.

    Executing the Scala code

    Learning about the Scala Interpreter, static object timer in Scala, testing String equality in Scala, Implicit classes in Scala, the concept of currying in Scala, various classes in Scala.

    Classes concept in Scala

    Learning about the Classes concept, understanding the constructor overloading, the various abstract classes, the hierarchy types in Scala, the concept of object equality, the val and var methods in Scala.

    Case classes and pattern matching

    Understanding Sealed traits, wild, constructor, tuple, variable pattern, and constant pattern.

    Concepts of traits with example

    Understanding traits in Scala, the advantages of traits, linearization of traits, the Java equivalent and avoiding of boilerplate code.

    Scala Java Interoperability

    Implementation of traits in Scala and Java, handling of multiple traits extending.

    Scala collections

    Introduction to Scala collections, classification of collections, the difference between Iterator, and Iterable in Scala, example of list sequence in Scala.

    Mutable collections vs. Immutable collections

    The two types of collections in Scala, Mutable and Immutable collections, understanding lists and arrays in Scala, the list buffer and array buffer, Queue in Scala, double-ended queue Deque, Stacks, Sets, Maps, Tuples in Scala.

    Use Case bobsrockets package

    Introduction to Scala packages and imports, the selective imports, the Scala test classes, introduction to JUnit test class, JUnit interface via JUnit 3 suite for Scala test, packaging of Scala applications in Directory Structure, example of Spark Split and Spark Scala.

    Spark Course Content

    Introduction to Spark

    Introduction to Spark, how Spark overcomes the drawbacks of working with MapReduce, understanding in-memory MapReduce, interactive operations on MapReduce, Spark stack, fine vs. coarse grained update, Spark Hadoop YARN, HDFS Revision, YARN Revision, the overview of Spark and how it is better than Hadoop, deploying Spark without Hadoop, Spark history server, Cloudera distribution.

    Spark Basics

    Spark installation guide, Spark configuration, memory management, executor memory vs. driver memory, working with Spark Shell, the concept of Resilient Distributed Datasets (RDD), learning to do functional programming in Spark, the architecture of Spark.

    Working with RDDs in Spark

    Spark RDD, creating RDDs, RDD partitioning, operations & transformation in RDD, deep dive into Spark RDDs, the RDD general operations, a read-only partitioned collection of records, using the concept of RDD for faster and efficient data processing, RDD actions such as collect, count, collectAsMap and saveAsTextFile, pair RDD functions.

    Aggregating Data with Pair RDDs

    Understanding the concept of Key-Value pair in RDDs, learning how Spark makes MapReduce operations faster, various operations of RDD, MapReduce interactive operations, fine & coarse grained update, Spark stack.

    Writing and Deploying Spark Applications

    Comparing the Spark applications with Spark Shell, creating a Spark application using Scala or Java, deploying a Spark application, Scala built application, creation of mutable list, set & set operations, list, tuple, concatenating list, creating application using SBT, deploying application using Maven, the web user interface of Spark application, a real world example of Spark and configuring of Spark.

    Parallel Processing

    Learning about Spark parallel processing, deploying on a cluster, introduction to Spark partitions, file-based partitioning of RDDs, understanding of HDFS and data locality, mastering the technique of parallel operations, comparing repartition & coalesce, RDD actions.

    Spark RDD Persistence

    The execution flow in Spark, understanding the RDD persistence overview, Spark execution flow & Spark terminology, distribution shared memory vs. RDD, RDD limitations, Spark shell arguments, distributed persistence, RDD lineage, Key/Value pair for sorting implicit conversion like CountByKey, ReduceByKey, SortByKey, AggregateByKey.

    Spark Streaming & MLlib

    Spark Streaming Architecture, writing streaming program code, processing of Spark streams, processing Spark Discretized Stream (DStream), the context of Spark Streaming, streaming transformation, Flume Spark streaming, request count and DStream, multi batch operation, sliding window operations and advanced data sources. Different Algorithms, the concept of iterative algorithm in Spark, analyzing with Spark graph processing, introduction to K-Means and machine learning, various variables in Spark like shared variables, broadcast variables, learning about accumulators.

    Improving Spark Performance

    Introduction to various variables in Spark like shared variables, broadcast variables, learning about accumulators, the common performance issues and troubleshooting the performance problems.

    Spark SQL and Data Frames

    Learning about Spark SQL, the context of SQL in Spark for providing structured data processing, JSON support in Spark SQL, working with XML data, parquet files, creating HiveContext, writing Data Frame to Hive, reading JDBC files, understanding the Data Frames in Spark, creating Data Frames, manual inferring of schema, working with CSV files, reading JDBC tables, Data Frame to JDBC, user defined functions in Spark SQL, shared variable and accumulators, learning to query and transform data in Data Frames, how Data Frame provides the benefit of both Spark RDD and Spark SQL, deploying Hive on Spark as the execution engine.

    Scheduling/ Partitioning

    Learning about the scheduling and partitioning in Spark, hash partition, range partition, scheduling within and around applications, static partitioning, dynamic sharing, fair scheduling, Map partition with index, the Zip, GroupByKey, Spark master high availability, standby Masters with Zookeeper, Single Node Recovery With Local File System, High Order Functions.

    Python Course Content

    Introduction to Python
    What is Python Language and features, Why Python and why it is different from other languages, Installation of Python, Anaconda Python distribution for Windows, Mac, Linux. Run a sample Python script, working with Python IDEs. Running basic Python commands – Data types, Variables, Keywords, etc. Hands-on Exercise – Install Anaconda Python distribution for your OS (Windows/Linux/Mac)
    Basic constructs of Python language
    Indentation (Tabs and Spaces) and Code Comments (Pound # character); Variables and Names; Built-in Data Types in Python – Numeric: int, float, complex – Containers: list, tuple, set, dict – Text Sequence: str (String) – Others: Modules, Classes, Instances, Exceptions, Null Object, Ellipsis Object – Constants: False, True, None, NotImplemented, Ellipsis, __debug__; Basic Operators: Arithmetic, Comparison, Assignment, Logical, Bitwise, Membership, Identity; Slicing and The Slice Operator [n:m]; Control and Loop Statements: if, for, while, range(), break, continue, else. Hands-on Exercise – Write your first Python program, Write a Python Function (with and without parameters), Use Lambda expression, Write a class, create a member function and a variable, Create an object, Write a for loop to print all odd numbers
    Writing Object Oriented Program in Python and connecting with Database
    Classes – classes and objects, access modifiers, instance and class members; OOPS paradigm – Inheritance, Polymorphism and Encapsulation in Python. Functions: Parameters and Return Types; Lambda Expressions, Making connection with Database for pulling data.
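
    The hands-on items in this module (a function, a lambda expression, a class with a member function and a variable, and a for loop over odd numbers) could look roughly like the sketch below. All names, such as `odd_numbers`, `square` and `Counter`, are invented for illustration and are not taken from the course material.

```python
def odd_numbers(limit):
    """Return the odd numbers from 1 up to and including limit."""
    result = []
    for n in range(1, limit + 1):
        if n % 2 == 1:  # the remainder is 1 for odd numbers
            result.append(n)
    return result


# A lambda expression: an anonymous one-line function bound to a name.
square = lambda x: x * x


class Counter:
    """A small class with a class variable, an instance variable and a method."""

    step = 1  # class variable shared by all instances

    def __init__(self):
        self.count = 0  # instance variable

    def increment(self):
        self.count += self.step
        return self.count


odds = odd_numbers(10)
counter = Counter()  # create an object
counter.increment()
counter.increment()
print(odds, square(4), counter.count)
```
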
    File Handling, Exception Handling in Python
    Open a File, Read from a File, Write into a File; Resetting the current position in a File; The Pickle (Serialize and Deserialize Python Objects); The Shelve (Overcome the limitation of Pickle); What is an Exception; Raising an Exception; Catching an Exception. Hands-on Exercise – Open a text file and read the contents, Write a new line in the opened file, Use pickle to serialize a Python object, deserialize the object, Raise an exception and catch it
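
    As a rough illustration of this module's exercise, the sketch below writes and reads a text file, round-trips an object through `pickle`, and raises and catches an exception. It uses only the standard library; the file names and the sample `record` dictionary are hypothetical, and the files live in a temporary directory so nothing is left behind.

```python
import os
import pickle
import tempfile

record = {"course": "Python", "topic": "pickle"}  # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp:
    # Write a text file, append a new line, then read the contents back.
    text_path = os.path.join(tmp, "notes.txt")
    with open(text_path, "w") as f:
        f.write("first line\n")
    with open(text_path, "a") as f:
        f.write("second line\n")
    with open(text_path) as f:
        lines = f.read().splitlines()

    # Serialize the object to disk and deserialize it again.
    pickle_path = os.path.join(tmp, "record.pkl")
    with open(pickle_path, "wb") as f:
        pickle.dump(record, f)
    with open(pickle_path, "rb") as f:
        restored = pickle.load(f)

# Raise an exception and catch it.
try:
    raise ValueError("invalid input")
except ValueError as err:
    message = str(err)

print(lines, restored, message)
```
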
    Mathematical Computing with Python (NumPy)
    Arrays and Matrices, ND-array object, Array indexing, Datatypes, Array math, Broadcasting, Std Deviation, Conditional Prob, Covariance and Correlation. Hands-on Exercise – Import numpy module, Create an array using ND-array, Calculate std deviation on an array of numbers, Calculate correlation between two variables
    Scientific Computing with Python (SciPy)
    Builds on top of NumPy, SciPy and its characteristics, subpackages: cluster, fftpack, linalg, signal, integrate, optimize, stats; Bayes Theorem using SciPy. Hands-on Exercise – Import SciPy, Apply Bayes theorem using SciPy on the given dataset
    Data Visualization (Matplotlib)
    Plotting Graphs and Charts (Line, Pie, Bar, Scatter, Histogram, 3-D); Subplots; The Matplotlib API. Hands-on Exercise – Plot Line, Pie, Scatter, Histogram and other charts using Matplotlib
    Data Analysis and Machine Learning (Pandas) OR Data Manipulation with Python
    Dataframes, NumPy array to a dataframe; Import Data (csv, json, excel, sql database); Data operations: View, Select, Filter, Sort, Groupby, Cleaning, Join/Combine, Handling Missing Values; Introduction to Machine Learning (ML); Linear Regression; Time Series. Hands-on Exercise – Import Pandas, Use it to import data from a json file, Select records by a group and apply filter on top of that, View the records, Perform Linear Regression analysis, Create a Time Series
    Natural Language Processing, Machine Learning (Scikit-Learn)
    Introduction to Natural Language Processing (NLP); NLP approach for Text Data; Environment Setup (Jupyter Notebook); Sentence Analysis; ML Algorithms in Scikit-Learn; What is Bag of Words Model; Feature Extraction from Text; Model Training; Search Grid; Multiple Parameters; Build a Pipeline. Hands-on Exercise – Setup Jupyter Notebook environment, Load a dataset in Jupyter, Use algorithm in Scikit-Learn package to perform ML techniques, Train a model, Create a search grid
    Web Scraping for Data Science
    What is Web Scraping; Web Scraping Libraries (Beautifulsoup, Scrapy); Installation of Beautifulsoup; Install lxml Python Parser; Making a Soup Object using an input html; Navigating Py Objects in the Soup Tree; Searching the Tree; Output Print; Parsing Full or Partial. Hands-on Exercise – Install Beautifulsoup and lxml Python parser, Make a Soup object using an input html file, Navigate Py objects in the soup tree, Search tree, Print output
    Python on Hadoop
    Understanding Hadoop and its various components; Hadoop ecosystem and Hadoop common; HDFS and MapReduce Architecture; Python scripting for MapReduce Jobs on Hadoop framework. Hands-on Exercise – Write a basic MapReduce Job in Python and connect with Hadoop Framework to perform the task
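
    The shape of a basic MapReduce job can be tried out locally before involving a cluster. The sketch below is a word count that simulates the map, shuffle-and-sort, and reduce phases in a single Python process; the names `mapper`, `reducer` and `run_job` and the input lines are assumptions for illustration, and on a real Hadoop cluster the framework, not this script, would perform the shuffle and distribute the work.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce phase: sum all counts emitted for one word."""
    return word, sum(counts)


def run_job(lines):
    """Run map, shuffle-and-sort, and reduce in one local process."""
    mapped = [pair for line in lines for pair in mapper(line)]
    mapped.sort(key=itemgetter(0))  # stands in for Hadoop's shuffle and sort
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(mapped, key=itemgetter(0))
    )


# Illustrative input, standing in for lines read from HDFS.
result = run_job(["big data big cluster", "data node"])
print(result)
```
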
    Writing Spark code using Python
    What is Spark, understanding RDDs, Spark Libs, writing Spark code using Python, Spark machine learning library MLlib, Regression, Classification and Clustering using Spark MLlib. Hands-on Exercise – Implement sandbox, Run a Python code in sandbox, Work with HDFS file system from sandbox

    SAS Course Content

    Introduction to SAS

    Introduction to Base SAS, Installation of SAS tool, Getting started with SAS, various SAS Windows – Log, Explorer, Output, Search, Editor, etc., working with data sets, overview of SAS Functions, Library Types and programming files

    SAS Enterprise Guide

    Import/Export Raw Data files, reading and subsetting the data set, various statements like WHERE, SET, MERGE

    Hands-on Exercise – Import Excel file in workspace, Read data, Export the workspace to save data

    SAS Operators & Functions

    Various SAS Operators – Arithmetic, Logical, Comparison, various SAS Functions – NUMERIC, CHARACTER, IS NULL, CONTAINS, LIKE, Input/Put, Date/Time, Conditional Statements (Do While, Do Until, If, Else)

    Hands-on Exercise – Apply logical, arithmetic operators and SAS functions to perform operations

    Compilation & Execution

    Understanding about Input Buffer, PDV (Backend), learning what is Missover

    Using Variables

    Defining and Using KEEP and DROP statements, apply these statements, Format and Labels in SAS.

    Hands-on Exercise – Use KEEP and DROP statements

    Creation and Compilation of SAS Data sets

    Understanding Delimiter, dataline rules, DLM, Delimiter DSD, raw data files and execution, list input for standard data.

    Hands-on Exercise – Use delimiter rules on raw data files

    SAS Procedures

    The various SAS standard Procedures built-in for popular programs – PROC SORT, PROC FREQ, PROC SUMMARY, PROC RANK, PROC EXPORT, PROC DATASETS, PROC TRANSPOSE, PROC CORR etc.

    Hands-on Exercise – Use SORT, FREQ, SUMMARY, EXPORT and other procedures

    Input statement and formatted input

    Reading standard and non-standard numeric inputs with Formatted inputs, Column Pointer Controls, Controlling while a record loads, Line pointer control / Absolute line pointer control, Single Trailing, Multiple IN and OUT statements, DATALINES statement and rules, List Input Method, comparing Single Trailing and Double Trailing.

    Hands-on Exercise – Read standard and non-standard numeric inputs with Formatted inputs, Control while a record loads, Control a Line pointer, Write Multiple IN and OUT statements


    SAS FORMAT statements – standard and user-written, associating a format with a variable, working with SAS FORMAT, deploying it on PROC Data sets, comparing ATTRIB and FORMAT statements.

    Hands-on Exercise – Format a variable, deploy format rule on PROC DATA set, Use ATTRIB statement

    SAS Graphs

    Understanding PROC GCHART, various Graphs, Bar Charts – Pie, Bar, 3D, plotting variables with PROC GPLOT.

    Hands-on Exercise – Plot graphs using PROC GPLOT, Display charts using PROC GCHART

    Interactive Data Processing

    SAS advanced data discovery and visualization, point-and-click analytics capabilities, powerful reporting tools.

    Data Transformation Function

    Character Functions, Numeric Functions, Converting Variable Type.

    Hands-on Exercise – Use Functions in data transformation

    Output Delivery System (ODS)

    Introduction to ODS, Data Optimization, How to generate files (rtf, pdf, html, doc) using SAS

    Hands-on Exercise – Optimize data, generate rtf, pdf, html and doc files


    Macro Syntax, Macro Variables, Positional Parameters in a Macro, Macro Step

    Hands-on Exercise – Write a macro, Use positional parameters


    SQL Statements in SAS, SELECT, CASE, JOIN, UNION, Sorting Data

    Hands-on Exercise – Create SQL query to select and add a condition, Use a CASE in select query

    Advanced Base SAS

    Base SAS web-based interface and ready-to-use programs, advanced data manipulation, storage and retrieval, descriptive statistics.

    Hands-on Exercise – Use web UI to do statistical operations

    Summarization Reports

    Report Enhancement, Global Statements, User-defined Formats, PROC SORT, ODS Destinations, ODS Listing, PROC FREQ, PROC Means, PROC UNIVARIATE, PROC REPORT, PROC PRINT

    Hands-on Exercise – Use PROC SORT to sort the results, List ODS, Find mean using PROC Means, print using PROC PRINT

    Tableau Course Content

    Introduction to Data Visualization and Power of Tableau
    What is data visualization, Comparison and benefits against reading raw numbers, Real usage examples from various business domains, Some quick powerful examples using Tableau without going into the technical details of Tableau
    Architecture of Tableau
    Installation of Tableau Desktop, Architecture of Tableau, Interface of Tableau (Layout, Toolbars, Data Pane, Analytics Pane etc), How to start with Tableau, Ways to share and export the work done in Tableau. Hands-on Exercise – Play with the Tableau Desktop interface to learn its user interface, Share an existing work, Export an existing work
    Working with Metadata & Data Blending
    Connection to Excels, PDFs and Cubes, Managing Metadata and Extracts, Data Preparation and dealing with NULL values, Data Joins (Inner, Left, Right, Outer) and Union, Cross Database joining, Data Blending. Hands-on Exercise – Connect to an Excel sheet and import data, Use metadata and extracts, Handle NULL values, Clean up the data before the actual use, Perform various join techniques, Perform data blending from more than one source
    Creation of sets
    Marks, Highlighting, Sort and Group, Working with Sets (Creation of sets, Editing sets, IN/OUT, Sets in Hierarchies). Hands-on Exercise – Create and edit sets using Marks, Highlight desired items, Make groups, Apply sorting on result, Make hierarchies in the created set
    Working with Filters
    Filters (Addition and Removal), Filtering continuous dates, dimensions, measures, Interactive Filters. Hands-on Exercise – Add Filter on data set by date/dimensions/measures, Use interactive filter to views, Remove some filters to see the result
    Organizing Data and Visual Analytics
    Formatting Data (Labels, Annotations, Tooltips, Edit axes), Formatting Pane (Menu, Settings, Font, Alignment, Copy-Paste), Trend and Reference Lines, Forecasting, k-means Cluster Analysis in Tableau. Hands-on Exercise – Apply labels, annotations, tooltips to graphs, Edit the attributes of axes, Set a reference line, Do k-means cluster analysis on a dataset
    Working with Mapping
    Coordinate points, Plotting Longitude and Latitude, Editing Unrecognized Locations, Custom Geocoding, Polygon Maps, WMS: Web Mapping Services, Background Image (Add Image, Plot Points on Image, Generate coordinates from Image). Hands-on Exercise – Plot latitude and longitude on geo map, Edit locations on the map, Create custom geocoding, Use images of a map and plot points on it, Find coordinates in the image, Create a polygon map, Use WMS
    Working with Calculations & Expressions
    Calculation Syntax and Functions in Tableau, Types of Calculations (Table, String, Logic, Date, Number, Aggregate), LOD Expressions (concept and syntax), Aggregation and Replication with LOD Expressions, Nested LOD Expressions
    Working with Parameters
    Create Parameters, Parameters in Calculations, Using Parameters with Filters, Column Selection Parameters, Chart Selection Parameters. Hands-on Exercise – Create new parameters to apply on a filter, Pass parameters to filters to select columns, Pass parameters to filters to select charts
    Charts and Graphs
    Dual Axes Graphs, Histogram (Single and Dual Axes), Box Plot, Pareto Chart, Motion Chart, Funnel Chart, Waterfall Chart, Tree Map, Heat Map, Market Basket analysis. Hands-on Exercise – Plot a histogram, heat map, tree map, funnel chart and others using the same data set, Do market basket analysis on a given dataset
    Dashboards and Stories
    Build and Format a Dashboard (Size, Views, Objects, Legends and Filters), Best Practices for Creative and Interactive Dashboards using Actions, Create Stories (Intro of Story Points, Creating and Updating Story Points, Adding Visuals in Stories, Annotations with Description). Hands-on Exercise – Create a dashboard view, Include objects, legends and filters, Make the dashboard interactive, Create and edit a story with visual effects, annotation, description
    Integration of Tableau with R and Hadoop
    Introduction to R Language, Applications and Use Cases of R, Deploying R on Tableau Platform, Learning R functions in Tableau, Integration with Hadoop. Hands-on Exercise – Deploy R on Tableau, Create a line graph using R interface, Connect Tableau with Hadoop and extract data

    Deep Learning Course Content

    Introduction to Machine Learning

    The domain of machine learning and its implications to the artificial intelligence sector, the advantages of machine learning over other conventional methodologies.

    Deep Learning Techniques

    Introduction to Deep Learning within machine learning, how it differs from all other methods of machine learning, training the system with training data, supervised and unsupervised learning, classification and regression supervised learning, clustering and association unsupervised learning, the algorithms used in these types of learning.

    TensorFlow for Training Deep Learning Model

    Introduction to TensorFlow, an open source software library for designing, building and training Deep Learning models, Python Library behind TensorFlow, Tensor Processing Unit (TPU), a programmable AI accelerator by Google.

    Introduction to Neural Networks

    Mapping the human mind with Deep Neural Networks, the various building blocks of Artificial Neural Networks, the architecture of DNN, the concept of reinforcement learning in DNN, the various parameters, layers, activation functions and optimization algorithms in DNN.

    Using GPUs to train Deep Learning networks

    Introduction to GPUs and how they differ from CPUs, the importance of GPUs in training Deep Learning Networks, the forward pass and backward pass training technique, the GPU constituent with simpler core and concurrent hardware.

    Big Data Hadoop Developer Course Content

    Introduction to Big Data & Hadoop and its Ecosystem, Map Reduce and HDFS
    What is Big Data, Where does Hadoop fit in, Hadoop Distributed File System – Replications, Block Size, Secondary Namenode, High Availability, Understanding YARN – ResourceManager, NodeManager, Difference between 1.x and 2.x
    Hadoop Installation & setup
    Hadoop 2.x Cluster Architecture, Federation and High Availability, A Typical Production Cluster setup, Hadoop Cluster Modes, Common Hadoop Shell Commands, Hadoop 2.x Configuration Files, Cloudera Single node cluster
    Deep Dive in MapReduce
    How MapReduce Works, How Reducer works, How Driver works, Combiners, Partitioners, Input Formats, Output Formats, Shuffle and Sort, Map Side Joins, Reduce Side Joins, MRUnit, Distributed Cache
    Lab exercises:
    Working with HDFS, Writing WordCount Program, Writing custom partitioner, MapReduce with Combiner, Map Side Join, Reduce Side Joins, Unit Testing MapReduce, Running MapReduce in Local Job Runner Mode
    Graph Problem Solving
    What is Graph, Graph Representation, Breadth first Search Algorithm, Graph Representation of Map Reduce, How to do the Graph Algorithm, Example of Graph Map Reduce
    Detailed understanding of Pig
    A. Introduction to Pig: Understanding Apache Pig, the features, various uses and learning to interact with Pig
    B. Deploying Pig for data analysis: The syntax of Pig Latin, the various definitions, data sort and filter, data types, deploying Pig for ETL, data loading, schema viewing, field definitions, functions commonly used.
    C. Pig for complex data processing: Various data types including nested and complex, processing data with Pig, grouped data iteration, practical exercise
    D. Performing multi-dataset operations: Data set joining, data set splitting, various methods for data set combining, set operations, hands-on exercise
    E. Extending Pig: Understanding user defined functions, performing data processing with other languages, imports and macros, using streaming and UDFs to extend Pig, practical exercises
    F. Pig Jobs: Working with real data sets involving Walmart and Electronic Arts as case study
    Detailed understanding of Hive
    A. Hive Introduction: Understanding Hive, traditional database comparison with Hive, Pig and Hive comparison, storing data in Hive and Hive schema, Hive interaction and various use cases of Hive
    B. Hive for relational data analysis: Understanding HiveQL, basic syntax, the various tables and databases, data types, data set joining, various built-in functions, deploying Hive queries on scripts, shell and Hue
    C. Data management with Hive: The various databases, creation of databases, data formats in Hive, data modeling, Hive-managed Tables, self-managed Tables, data loading, changing databases and Tables, query simplification with Views, result storing of queries, data access control, managing data with Hive, Hive Metastore and Thrift server
    D. Optimization of Hive: Learning performance of query, data indexing, partitioning and bucketing
    E. Extending Hive: Deploying user defined functions for extending Hive
    F. Hands on Exercises – working with large data sets and extensive querying: Deploying Hive for huge volumes of data sets and large amounts of querying
    G. UDF, query optimization: Working extensively with User Defined Functions, learning how to optimize queries, various methods to do performance tuning
    (AVRO) Data Formats
    Selecting a File Format, Tool Support for File Formats, Avro Schemas, Using Avro with Hive and Sqoop, Avro Schema Evolution, Compression
    Introduction to HBase architecture
    What is HBase, Where does it fit, What is NoSQL
    Hadoop Cluster Setup and Running Map Reduce Jobs
    Multi Node Cluster Setup using Amazon EC2 – Creating 4 node cluster setup, Running Map Reduce Jobs on Cluster
    Advanced MapReduce
    Delving Deeper Into The Hadoop API, More Advanced Map Reduce Programming, Joining Data Sets in Map Reduce, Graph Manipulation in Hadoop

    Excel Training Course Content

    Entering Data

    Introduction to Excel spreadsheet, learning to enter data, filling of series and custom fill list, editing and deleting fields.

    Referencing in Formulas

    Learning about relative and absolute referencing, the concept of relative formulae, the issues in relative formulae, creating of absolute and mixed references and various other formulae.

    Name Range

    Creating named ranges, using names in new formulae, working with the name box, selecting range, names from a selection, pasting names in formulae, selecting names and working with Name Manager.

    Understanding Logical Functions

    The various logical functions in Excel, the IF function for calculating values and displaying text, nested IF functions, VLOOKUP and IFERROR functions.

    Getting started with Conditional Formatting

    Learning about conditional formatting, the options for formatting cells, various operations with icon sets, data bars and color scales, creating and modifying sparklines.

    Advanced-level Validation

    Multi-level drop-down validation, restricting values to a list only, learning about error messages and cell drop-downs.

    Important Formulas in Excel

    Introduction to the various formulae in Excel like Sum, SumIF & SumIFs, Count, CountA, CountIF and CountBlank, Networkdays, Networkdays International, Today & Now function, Trim (Eliminating undesirable spaces), Concatenate (Consolidating columns)

    Working with Dynamic table

    Introduction to dynamic table in Excel, data conversion, table conversion, tables for charts and VLOOKUP.

    Data Sorting

    Sorting in Excel, various types of sorting including alphabetical, numerical, row and multiple column, working with paste special, hyperlinking and using subtotal.

    Data Filtering

    The concept of data filtering, understanding compound filter and its creation, removing of filter, using custom filter and multiple value filters, working with wildcards.

    Chart Creation

    Creation of Charts in Excel, performing operations in embedded chart, modifying, resizing, and dragging of chart.

    Various Techniques of Charting

    Introduction to the various types of charting techniques, creating titles for charts, axes, learning about data labels, displaying data tables, modifying axes, displaying gridlines and inserting trendlines, textbox insertion in a chart, creating a 2-axis chart, creating combination chart.

    Pivot Tables in Excel

    The concept of Pivot tables in Excel, report filtering, shell creation, working with Pivot for calculations, formatting of reports, dynamic range assigning, the slicers and creating of slicers.

    Ensuring Data and File Security

    Data and file security in Excel, protecting row, column, and cell, the different safeguarding techniques.

    Getting started with VBA Macros

    Learning about VBA macros in Excel, executing macros in Excel, the macro shortcuts, applications, the concept of relative reference in macros.

    Core concepts of VBA

    In-depth understanding of Visual Basic for Applications, the VBA Editor, module insertion and deletion, performing action with Sub and ending Sub if condition not met.

    Ranges and Worksheet in VBA

    Learning about the concepts of workbooks and worksheets in Excel, protection of macro codes, range coding, declaring a variable, the concept of Pivot Table in VBA, introduction to arrays, user forms, getting to know how to work with databases within Excel.

    IF condition

    Learning how the If condition works and knowing how to apply it in various scenarios, working with multiple Ifs in Macro.

    Loops in VBA

    Understanding the concept of looping, deploying looping in VBA Macros.

    Debugging in VBA

    Studying about debugging in VBA, the various steps of debugging like running, breaking, resetting, understanding breakpoints and way to mark it, the code for debugging and code commenting.

    Messaging in VBA

    The concept of message box in VBA, learning to create the message box, various types of message boxes, the IF condition as related to message boxes.

    Practical Projects in VBA

    Mastering the various tasks and functions using VBA, understanding data separation, auto filtering, formatting of report, combining multiple sheets into one, merging multiple files together.

    Best Practices of Dashboards Visualization

    Introduction to powerful data visualization with Excel Dashboard, important points to consider while designing the dashboards like loading the data, managing data and linking the data to tables and charts, creating Reports using dashboard features.

    Principles of Charting

    Learning to create charts in Excel, the various charts available, the steps to successfully build a chart, personalization of charts, formatting and updating features, various special charts for Excel dashboards, understanding how to choose the right chart for the right data.

    Getting started with Pivot Tables

    Creation of Pivot Tables in Excel, learning to change the Pivot Table layout, generating Reports, the methodology of grouping and ungrouping of data.

    Creating Dashboards

    Learning to create Dashboards, the various rules to follow while creating Dashboards, creation of dynamic dashboards, knowing what is data layout, introduction to thermometer chart and its creation, how to use alerts in the Dashboard setup.

    Creation of Interactive Components

    How to insert a Scroll bar to a data window?, Concept of Option buttons in a chart, Use of combo box drop-down, List box control Usage, How to use Checkbox Control?

    Data Analysis

    Understanding data quality issues in Excel, linking of data, consolidating and merging data, working with dashboards for Excel Pivot Tables.

    Mahout Course Content

    Mahout Overview

    Classification and Recommendation, Clustering in Mahout, Pattern Mining, Understanding machine Learning, Using Model diagram to decide the approach, Data flow, Supervised and Unsupervised learning

    Mahout Recommendations

    Concept of Recommendation, Recommendations by E-commerce site, Comparison between User Recommendations and Item recommendation, Define recommenders and Classifiers, Process of Collaborative Filtering, Explaining Pearson coefficient algorithm, Euclidean distance measure, Implementing a recommender using map reduce
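    The two similarity measures named above can be sketched in a few lines of plain Python. This is an illustration of the formulas only, not Mahout's implementation; the function names and the convention of returning 0.0 when one input has no variance are assumptions for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length rating lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant ratings as uncorrelated
    return covariance / math.sqrt(var_x * var_y)

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length rating lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

if __name__ == "__main__":
    # Hypothetical ratings given by two users to the same three items
    print(pearson([1, 2, 3], [2, 4, 6]))
    print(euclidean_distance([0, 0], [3, 4]))
```

    Pearson compares the shape of two users' ratings and ignores their scale, whereas Euclidean distance is sensitive to how far apart the raw ratings are.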

    Clustering Session 1

    Defining Clustering, User-to-user similarity, Clustering Illustration, Euclidean distance measure, Distance measure vector, Understanding the process of Clustering, Vectorizing documents-Unstructured data

    Clustering Session 2

    Document clustering, Sequence-to-sparse Utility, K-Means Clustering
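
    The K-Means procedure alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. Below is a minimal pure-Python sketch, assuming Euclidean distance and, for reproducibility, seeding the centroids with the first k points. The seeding choice and the name kmeans are assumptions for this example, not how Mahout initializes clusters.

```python
import math

def kmeans(points, k, iterations=10):
    """Cluster points into k groups; returns (centroids, assignments)."""
    # Assumption: seed centroids with the first k points (deterministic).
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, point in enumerate(points):
            assignments[i] = min(
                range(k), key=lambda c: math.dist(point, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments

if __name__ == "__main__":
    # Hypothetical 2-D data with two visibly separate groups
    data = [(1, 1), (1.5, 2), (8, 8), (9, 9.5), (1, 0.5), (8.5, 9)]
    print(kmeans(data, k=2))
```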

    Classification Session 1

    Terminology, Predictor and Target variable, Classifiable Data, Key Challenges in Classification algorithm, Vectorizing Continuous data, Classification Examples, Logistic Regression and its examples

    Clustering and Classification Session 2

    Clustering, Clustering Process, Transaction Clustering, Different techniques of Vectorization, Distance measure, Clustering algorithm: K-Means, Clustering Application-1, Clustering Application-2, Sentiment Analyzer

    Pattern Mining

    Pearson Coefficient, Collaborative Filtering Process, Collaborative Filtering, Similarity Algorithms, Pearson Correlation, Euclidean Distance Measure, Frequent Pattern & Association rules, Frequent Pattern Growth

    Data Science Project

    Project 1 – Understanding Cold Start Problem in Data Science

    Topics: This project involves understanding the cold start problem associated with recommender systems. You will gain hands-on experience in information filtering and in working on systems with zero historical data to refer to, as in the case of launching a new product. You will gain proficiency in working with personalized applications such as movie, book, song and news recommendations. This project includes the following:

    • Algorithms for Recommender
    • Ways of Recommendation
    • Types of Recommendation: Collaborative Filtering Based Recommendation, Content-Based Recommendation
    • Complete mastery in working with the Cold Start Problem.

    Project 2 – Movie Recommendation

    Topics: This is a real-world project that gives you hands-on experience in working with a movie recommender system. Depending on which movies a particular user likes, you will be in a position to provide data-driven recommendations. This project involves understanding recommender systems, information filtering, predicting ‘rating’, learning about user ‘preference’ and so on. You will work exclusively on data related to user details, movie details and others. The main components of the project include the following:

    • Recommendation for movie
    • Two Types of Predictions – Rating Prediction, Item Prediction
    • Important Approaches: Memory Based and Model-Based
    • Knowing User Based Methods in K-Nearest Neighbor
    • Understanding Item Based Method
    • Matrix Factorization
    • Singular Value Decomposition
    • Data Science Project discussion
    • Collaborative Filtering
    • Business Variables Overview

    Case Study: Market Basket Analysis (MBA)

    This case study covers the modeling technique of Market Basket Analysis, where you will learn about loading data, various techniques for plotting the items and running the algorithms. It includes finding out which items tend to be bought together and can therefore be grouped. This is used in various real-world scenarios such as a supermarket shopping cart.
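
    The core of finding items that go together can be sketched as counting item pairs and scoring each rule by support, confidence and lift. The Python sketch below does this for pairs only; it is not the algorithm or tooling used in the case study, and the function name pair_rules, the min_support default and the sample baskets are assumptions for this example.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return (lhs, rhs, support, confidence, lift) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    return rules

if __name__ == "__main__":
    # Hypothetical shopping baskets
    baskets = [
        ["bread", "milk"],
        ["bread", "butter"],
        ["bread", "milk", "butter"],
        ["milk"],
    ]
    for rule in pair_rules(baskets):
        print(rule)
```

    A lift above 1 suggests the two items appear together more often than they would if bought independently.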

    R Programming Projects

    Project 1

    Domain – Restaurant Revenue Prediction

    Data set – Sales

    Project Description – This project involves predicting the sales of a restaurant on the basis of certain objective measurements. It gives real-world industry experience in handling multiple use cases and deriving solutions, and provides insight into feature engineering and selection.

    Project 2

    Domain – Data Analytics

    Objective – To predict the class of a flower from its petal dimensions

    Project 3

    Domain – Finance

    Objective – The project aims to find the factors with the greatest impact on preference for the pre-paid model, and to identify which variables are highly correlated with those factors

    Project 4

    Domain – Stock Market

    Objective – This project focuses on machine learning, creating a predictive data model to forecast future stock prices

    Apache Spark – Scala Project
    Project 1: Movie Recommendation
    Topics – In this project you will gain hands-on experience in deploying Apache Spark for movie recommendation. You will be introduced to MLlib, the Spark machine learning library, with a guide to its algorithms and coding. You will understand how to deploy collaborative filtering, clustering, regression, and dimensionality reduction in MLlib. Upon completion of the project you will have gained experience in working with streaming data, sampling, testing and statistics.
    Project 2: Twitter API Integration for Tweet Analysis
    Topics – With this project you will learn to integrate the Twitter API for analyzing tweets. You will write server-side code in a scripting language such as PHP, Ruby or Python to call the Twitter API and get the results in JSON format. You will then read the results and perform operations such as aggregation, filtering and parsing as needed to produce the tweet analysis.
    Project 3: Data Exploration Using Spark SQL – Wikipedia data set
    Topics – This project lets you work with Spark SQL. You will gain experience in combining Spark SQL with ETL applications, real-time analysis of data, performing batch analysis, deploying machine learning, creating visualizations and processing graphs.
    Python Projects
    Project 1: Python Web Scraping for Data Science
    In this project you will be introduced to the process of web scraping using Python. It involves installing Beautiful Soup and other web scraping libraries, working with common data and page formats on the web, learning the important kinds of objects such as NavigableString, navigating and searching the parse tree, choosing a parser, and searching by CSS class, list, function and keyword argument.

    Project 2

    Objective – To generate a password using Python code which would be tough to guess
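
    One possible approach to this objective, sketched with Python's standard secrets module: draw at least one character from each character class, fill the rest from the combined alphabet, then shuffle. The function name generate_password, the default length and the four-class rule are assumptions for this example, not requirements stated by the project.

```python
import secrets
import string

def generate_password(length=16):
    """Return a random password containing all four character classes."""
    pools = [
        string.ascii_lowercase,
        string.ascii_uppercase,
        string.digits,
        string.punctuation,
    ]
    if length < len(pools):
        raise ValueError("length must be at least 4")
    # One guaranteed character from each class
    chars = [secrets.choice(pool) for pool in pools]
    # Fill the remainder from the combined alphabet
    alphabet = "".join(pools)
    chars += [secrets.choice(alphabet) for _ in range(length - len(pools))]
    # Shuffle so the guaranteed characters are not always in front
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)

if __name__ == "__main__":
    print(generate_password(20))
```

    The secrets module draws from the operating system's cryptographic random source, which is why it is preferred over the random module for passwords.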

    Project 3

    Domain – Finance

    Objective – The project aims to find the factors with the greatest impact on preference for the pre-paid model, and to identify which variables are highly correlated with those factors

    Project 4

    Domain – Stock Market

    Objective – This project focuses on machine learning, creating a predictive data model to forecast future stock prices

    Project 5: Server logs/Firewall logs

    Objective – This includes the process of loading the server logs into the cluster using Flume. The logs can then be refined using Pig scripts, Ambari and HCatalog, and visualized using Elasticsearch and Excel. This project task includes:
    • Server logs
    • Potential uses of server log data
    • Pig script
    • Firewall logs
    • Work flow editor
    SAS Projects

    Project 1 – Build analytical solution for patients taking medicines

    Domain: Health Care

    Objective – This project aims to compute descriptive statistics and subsets for specific clinical data problems. It gives a brief insight into Base SAS procedures and DATA steps.

    Project 2 – Build revenue projections reports

    Domain: Sales

    Objective – This project will give you hands-on experience in working with the SAS data analytics and business intelligence tool. You will be working on the data entered in a business enterprise setup, aggregate, retrieve and manage that data. You will learn to create insightful reports and graphs and come up with statistical and mathematical analysis to scientifically predict the revenue projection for a particular future time frame. Upon completion of the project you will be well-versed in the practical aspects of data analytics, predictive modeling, and data mining.

    Project 3

    Domain: Finance Market

    Objective – The project aims to find the factors with the greatest impact on preference for the pre-paid model, and to identify which variables are highly correlated with those factors

    Project 4

    Domain: Analytics

    Objective – k-Means cluster analysis on the Iris dataset to predict the class of a flower from its petal dimensions

    Tableau Projects
    Project 1 –

Course Fee:
USD 1053

1 - 4 hours / week
