Big Data Analytics
2. Introduction to R
Data types, Subsetting, Writing Data, Reading Tabular Data files, Creating a
Vector & Vector operations, Initializing Data frame, Control Structures &
Functions, Loop functions & Debugging
Statistics in R: Computing basic statistics, Comparing means of two samples,
Testing a correlation for significance, Classical Tests (t, z, F), ANOVA
Data Visualization in R: Creating a bar chart, dot plot, scatter plot, pie
chart, histogram, and box plot
Statistical Modelling in R & Data mining in R
4. Machine Learning
Introduction to machine learning
Regression: Least Squares, Ridge Regression, Lasso Regression, k-nearest
neighbors Regression & Classification
Supervised Learning: Discriminative Algorithms (Linear & Quadratic), Generative
Algorithms, Support Vector Machines, Learning Theory, Regularization & Model
Selection, Perceptron Algorithm
Centre for Development of Advanced Computing
8. Introduction to Spark
Introduction to Spark, Components of Spark Unified Stack
Resilient Distributed Dataset (RDD)
Creation of Parallelized Collection & External Datasets, RDD operations
Usage of SparkContext, submission of application to the cluster
Analyzing Big Data using Self-service BI Tools, e.g. Impala, Hive, Stinger
Big Data Analytics query performance enablers
Managing stream computing in a Big Data environment
Various techniques for streaming analytics