Data Science Certification

This Data Science course covers the whole data life cycle: data acquisition and data storage using R and Hadoop, modelling with machine learning algorithms in R, and data visualization using R's capabilities. With companies across industries striving to bring their research and analysis (R&A) departments up to speed, the demand for qualified data scientists is rising.



Leading Tools

Master Data Science using leading tools such as SAS, R, Python and Tableau

Flexible Schedule

Set and maintain flexible deadlines.


Experiential Learning

Hands-on learning through 6 industry projects, across multiple tools and industries

100% online courses

Start instantly and learn at your own schedule.


Get career assistance & certificate

Extensive support via resume building, interview prep, mentorship and interview opportunities

World class faculty

Learn from industry experts with years of data science experience and excellent teaching skills

Advantages of This Course

Learn how to analyze large amounts of data to bring out insights
Gain hands-on knowledge through the course's problem-solving approach and a project at the end of the course
Relevant examples and cases make the learning more effective and easier

This course is for students pursuing an undergraduate or postgraduate degree and for working professionals

Learn to analyze data using SQL, R, SAS, Python, Predictive Analytics & Machine Learning
Become one of the most in-demand data analytics professionals in the world today

Course Curriculum

Getting Started With Data Science And Recommender Systems

  • Data Science Overview
  • Reasons to use Data Science
  • Project Lifecycle
  • Data Acquisition
  • Evaluation of Input Data
  • Transforming Data
  • Statistical and analytical methods to work with data
  • Machine Learning basics
  • Introduction to Recommender systems
  • Apache Mahout Overview

Reasons To Use, Project Lifecycle

  • What is Data Science?
  • What Kind of Problems can you solve?
  • Data Science Project Life Cycle
  • Data Science-Basic Principles
  • Data Acquisition
  • Data Collection
  • Understanding Data: Attributes in a Data Set, Different types of Variables
  • Build the Variable type Hierarchy
  • Two Dimensional Problem
  • Correlation between the Variables (explained using the Paint tool)
  • Outliers, Outlier Treatment
  • Boxplot, How to Draw a Boxplot

Acquiring Data

  • Discussion on Boxplot, with explanation
  • Example to understand variable Distributions
  • What is Percentile? Example using the RStudio tool
  • How do we identify outliers?
  • How do we handle outliers?
  • Outlier Treatment: Using Capping/Flooring General Method
  • Distribution: What is Normal Distribution?
  • Why Normal Distribution is so popular
  • Uniform Distribution
  • Skewed Distribution
  • Transformation
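
As an illustration of the percentile and capping/flooring topics listed above, here is a minimal sketch in Python. The course itself demonstrates percentiles in RStudio; Python is used here only for illustration. The function names, the sample `sales` values, and the 10th/90th percentile cut-offs are assumptions chosen for the example, not part of the course material.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def cap_and_floor(values, low_pct=5, high_pct=95):
    """Replace values below the low percentile or above the high percentile
    with the percentile value itself (flooring and capping)."""
    floor = percentile(values, low_pct)
    cap = percentile(values, high_pct)
    return [min(max(v, floor), cap) for v in values]


# Hypothetical daily sales figures with one obvious outlier (250).
sales = [12, 14, 15, 15, 16, 17, 18, 19, 20, 250]
treated = cap_and_floor(sales, low_pct=10, high_pct=90)
print(treated)
```

Capping keeps every observation in the data set but limits the influence of extreme values; the right cut-off percentiles depend on the data and the business problem.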

Machine Learning In Data Science

  • Discussion about Box plot and Outlier
  • Goal: Increase Profits of a Store
  • Areas of increasing the efficiency
  • Data Request
  • Business Problem: To maximize shop Profits
  • What are Interlinked variables
  • What is Strategy
  • Interaction between the Variables
  • Univariate analysis
  • Multivariate analysis
  • Bivariate analysis
  • Relation between Variables
  • Standardize Variables
  • What is Hypothesis?
  • Interpret the Correlation
  • Negative Correlation
  • Machine Learning

Statistical And Analytical Methods Dealing With Data, Implementation Of Recommenders Using Apache Mahout And Transforming Data

  • Correlation between Nominal Variables
  • Contingency Table
  • What is Expected Value?
  • What is Mean?
  • How Expected Value differs from Mean
  • Experiment – Controlled Experiment, Uncontrolled Experiment
  • Degrees of Freedom
  • Dependency between Nominal Variable & Continuous Variable
  • Linear Regression
  • Extrapolation and Interpolation
  • Univariate Analysis for Linear Regression
  • Building Model for Linear Regression
  • What does Pattern of Data mean?
  • Data Processing Operation
  • What is sampling?
  • Sampling Distribution
  • Stratified Sampling Technique
  • Disproportionate Sampling Technique
  • Balanced Allocation (part of Disproportionate Sampling)
  • Systematic Sampling
  • Cluster Sampling
  • 2 angles of Data Science: Statistical Learning, Machine Learning

Testing And Assessment, Production Deployment And More

  • Multivariable analysis
  • Linear regression
  • Simple linear regression
  • Hypothesis testing
  • Speculation vs. claim (Query)
  • Sample
  • Steps to test your hypothesis
  • Performance measure
  • Generate null hypothesis
  • Alternative hypothesis
  • Testing the hypothesis
  • Threshold value
  • Hypothesis testing explanation by example
  • Null Hypothesis
  • Alternative Hypothesis
  • Probability
  • Histogram of mean value
  • Revisit Chi-square independence test
  • Correlation between Nominal Variables
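
The chi-square independence test listed above can be sketched as follows. This is a minimal Python illustration, not the course's own implementation: the function name and the 2x2 table of counts are hypothetical. The expected count for each cell is (row total × column total) / grand total, and the degrees of freedom are (rows − 1) × (columns − 1).

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency
    table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if the two nominal variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = customer segment, columns = bought / did not buy.
observed = [[30, 10],
            [20, 40]]
statistic, dof = chi_square_statistic(observed)

# 3.841 is the standard chi-square critical value for 1 degree of freedom
# at the 5% significance level.
reject_null = statistic > 3.841
print(statistic, dof, reject_null)
```

If the statistic exceeds the threshold value, the null hypothesis that the two nominal variables are independent is rejected at that significance level.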

Business Algorithms, Simple Approaches To Prediction, Building Model, Model Deployment

  • Machine Learning
  • Importance of Algorithms
  • Supervised and Unsupervised Learning
  • Various Algorithms on Business
  • Simple approaches to Prediction
  • Predict Algorithms
  • Population data
  • Sampling
  • Disproportionate Sampling
  • Steps in Model Building
  • Sample the data
  • What is K?
  • Training Data
  • Test Data
  • Validation data
  • Model Building
  • Find the accuracy
  • Rules
  • Iteration
  • Deploy the model
  • Linear regression
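
The model-building steps listed above (sample the data, split it into training and test data, build the model, find the accuracy) can be sketched with a simple linear regression. This is an assumed illustration in plain Python: the data points, the 7/3 split, and the function names are hypothetical, and mean squared error stands in for "accuracy" because the target is continuous.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between predictions and actual values."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data that roughly follows y = 2x + 1.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.0, 17.1, 18.9, 21.0]

# Training data: first 7 points. Test data: last 3 points.
train_x, test_x = xs[:7], xs[7:]
train_y, test_y = ys[:7], ys[7:]

slope, intercept = fit_simple_linear_regression(train_x, train_y)
test_mse = mean_squared_error(test_x, test_y, slope, intercept)
print(slope, intercept, test_mse)
```

In practice the split is usually random rather than positional, and the model is refined over several iterations before it is deployed.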

Getting Started With Segmentation Of Prediction And Analysis

  • Clustering
  • Cluster and Clustering with Example
  • Data Points, Grouping Data Points
  • Manual Profiling
  • Horizontal & Vertical Slicing
  • Clustering Algorithm
  • Criteria to take into Consideration before doing Clustering
  • Graphical Example
  • Clustering & Classification: Exclusive Clustering, Overlapping Clustering, Hierarchical Clustering
  • Simple Approaches to Prediction
  • Different types of Distances: 1. Manhattan, 2. Euclidean, 3. Cosine Similarity
  • Clustering Algorithm in Mahout
  • Probabilistic Clustering
  • Pattern Learning
  • Nearest Neighbor Prediction
  • Nearest Neighbor Analysis
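
The distance measures and nearest neighbor prediction listed above can be sketched in a few lines of Python. The course covers clustering in Mahout; this standalone sketch only illustrates the definitions. The function names, the `customers` points, and their labels are hypothetical.

```python
import math


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, labeled_points, distance=euclidean):
    """Return the label of the labeled point closest to the query point."""
    _, label = min(labeled_points, key=lambda item: distance(query, item[0]))
    return label


# Hypothetical customers described by (visits per month, basket size).
customers = [
    ((1.0, 1.0), "low spend"),
    ((1.5, 2.0), "low spend"),
    ((8.0, 9.0), "high spend"),
]
predicted = nearest_neighbor((7.0, 8.0), customers)
print(predicted)
```

Manhattan and Euclidean distance measure how far apart two points are, while cosine similarity compares only their direction, so the choice of measure can change which neighbor counts as nearest.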

Integration Of R And Hadoop

  • R introduction
  • How R is typically used
  • Features of R
  • Introduction to Big data
  • R+Hadoop
  • Ways to connect R and Hadoop
  • Products
  • Case Study
  • Architecture
  • Steps for Installing RIMPALA
  • How to create IMPALA packages