CS3353 Foundations of Data Science L T P C 3 0 0 3

This document outlines the course objectives and content for a Foundations of Data Science course. The course contains 5 units that cover topics such as data science processes, describing and analyzing relationships in data, Python libraries for data wrangling, and data visualization. Students will learn basic statistical and probability concepts, perform descriptive analytics on benchmark datasets, and apply correlation and regression analyses. The accompanying laboratory course involves hands-on experiments with Python packages like NumPy, Pandas, and Matplotlib to work with datasets and visualize data.

Uploaded by

arunasekaran


CS3353 FOUNDATIONS OF DATA SCIENCE L T P C 3 0 0 3

UNIT I INTRODUCTION 9
Data Science: Benefits and uses – Facets of data – Data Science Process: Overview – Defining research goals –
Retrieving data – Data preparation – Exploratory data analysis – Building the model – Presenting findings and
building applications – Data mining – Data warehousing – Basic statistical descriptions of data
UNIT II DESCRIBING DATA 9
Types of Data – Types of Variables – Describing Data with Tables and Graphs – Describing Data with Averages
– Describing Variability – Normal Distributions and Standard (z) Scores
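As an illustration of the last Unit II topic, the following is a minimal sketch of computing standard (z) scores with NumPy. The `scores` array holds made-up values chosen only for the example, and the choice of the population standard deviation (`ddof=0`) is an assumption; a sample estimate would use `ddof=1`.

```python
import numpy as np

# Hypothetical exam scores (made-up values, for illustration only).
scores = np.array([62.0, 70.0, 75.0, 81.0, 92.0])

mean = scores.mean()
# ddof=0 gives the population standard deviation; use ddof=1 for a sample estimate.
std = scores.std(ddof=0)

# A z score says how many standard deviations a value lies from the mean.
z_scores = (scores - mean) / std

print("mean:", mean)
print("z scores:", np.round(z_scores, 3))
```

By construction, the z scores have mean 0 and standard deviation 1, which makes values from differently scaled variables comparable.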
UNIT III DESCRIBING RELATIONSHIPS 9
Correlation – Scatter plots – Correlation coefficient for quantitative data – Computational formula for correlation
coefficient – Regression – Regression line – Least squares regression line – Standard error of estimate –
Interpretation of r² – Multiple regression equations – Regression towards the mean
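The Unit III topics can be sketched in a few lines of NumPy. The example below assumes a small made-up data set (`x`, `y`) and uses the sum-of-squares form of the computational formula for r; it also assumes the standard error of estimate is taken with n − 2 degrees of freedom, a common textbook convention.

```python
import numpy as np

# Hypothetical paired observations (made-up values, for illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)

# Computational formula: sums of squares and sum of products.
ss_x = np.sum(x**2) - np.sum(x) ** 2 / n
ss_y = np.sum(y**2) - np.sum(y) ** 2 / n
sp_xy = np.sum(x * y) - np.sum(x) * np.sum(y) / n

# Pearson correlation coefficient.
r = sp_xy / np.sqrt(ss_x * ss_y)

# Least squares regression line: y' = intercept + slope * x.
slope = sp_xy / ss_x
intercept = y.mean() - slope * x.mean()

# Standard error of estimate (n - 2 degrees of freedom assumed here).
predicted = intercept + slope * x
residual_ss = np.sum((y - predicted) ** 2)
std_error = np.sqrt(residual_ss / (n - 2))

# r squared: proportion of variance in y predictable from x.
r_squared = r**2

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"y' = {intercept:.2f} + {slope:.2f} x, standard error = {std_error:.4f}")
```

The hand-computed values can be cross-checked against `np.corrcoef(x, y)` and `np.polyfit(x, y, 1)`.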
UNIT IV PYTHON LIBRARIES FOR DATA WRANGLING 9
Basics of NumPy arrays – Aggregations – Computations on arrays – Comparisons, masks, Boolean logic – Fancy
indexing – Structured arrays – Data manipulation with Pandas – Data indexing and selection – Operating on data
– Missing data – Hierarchical indexing – Combining datasets – Aggregation and grouping – Pivot tables
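A few of the Unit IV topics (masks, fancy indexing, missing data, grouping and pivot tables) are shown in the minimal sketch below. All array values, column names (`city`, `year`, `sales`) and the mean-imputation strategy are assumptions made for the example, not part of the syllabus.

```python
import numpy as np
import pandas as pd

values = np.array([3, 8, 1, 9, 4, 7])

# Comparison operators build a Boolean mask; & combines conditions element-wise.
mask = (values > 2) & (values < 9)
selected = values[mask]

# Fancy indexing: select elements by a list of positions.
picked = values[[0, 3, 5]]

# Hypothetical sales table with one missing value.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10.0, np.nan, 7.0, 9.0],
})

# Missing data: fill the gap with the column mean (one of several possible strategies).
filled = df.fillna({"sales": df["sales"].mean()})

# Aggregation and grouping: mean sales per city.
mean_by_city = filled.groupby("city")["sales"].mean()

# Pivot table: cities as rows, years as columns.
table = filled.pivot_table(values="sales", index="city", columns="year", aggfunc="sum")

print(selected, picked)
print(mean_by_city)
print(table)
```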
UNIT V DATA VISUALIZATION 9
Importing Matplotlib – Line plots – Scatter plots – Visualizing errors – Density and contour plots – Histograms
– Legends – Colors – Subplots – Text and annotation – Customization – Three-dimensional plotting – Geographic
Data with Basemap – Visualization with Seaborn.
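Several Unit V topics (line plots, error bars, histograms, legends, colors, subplots and annotation) can be combined in one figure. The sketch below uses synthetic data from a seeded random generator, and it assumes the non-interactive `Agg` backend so that it runs without a display; in a notebook that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded so the synthetic data is reproducible
x = np.linspace(0, 10, 50)
noise = rng.normal(0, 0.3, size=x.size)

# Three subplots side by side.
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Line plot with a legend and an explicit color.
axes[0].plot(x, np.sin(x), color="tab:blue", label="sin(x)")
axes[0].set_title("Line plot")
axes[0].legend()

# Scatter-style points with error bars to visualize uncertainty.
axes[1].errorbar(x[::5], np.sin(x[::5]) + noise[::5], yerr=0.3,
                 fmt="o", color="tab:red", label="noisy samples")
axes[1].set_title("Errors")
axes[1].legend()

# Histogram of normally distributed values, with a text annotation.
axes[2].hist(rng.normal(size=500), bins=20, color="tab:green")
axes[2].set_title("Histogram")
axes[2].annotate("centre near 0", xy=(0, 0), xytext=(1.5, 40),
                 arrowprops={"arrowstyle": "->"})

fig.tight_layout()
```

To view the result, call `fig.savefig("figure.png")` with a file name of your choice, or `plt.show()` under an interactive backend.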
TOTAL: 45 PERIODS
TEXT BOOKS:
1. David Cielen, Arno D. B. Meysman, and Mohamed Ali, “Introducing Data Science”, Manning Publications,
2016. (Unit I)
2. Robert S. Witte and John S. Witte, “Statistics”, Eleventh Edition, Wiley Publications, 2017. (Units II and III)
3. Jake VanderPlas, “Python Data Science Handbook”, O’Reilly, 2016. (Units IV and V)

REFERENCES:
1. Allen B. Downey, “Think Stats: Exploratory Data Analysis in Python”, Green Tea Press, 2014.
CS3362 DATA SCIENCE LABORATORY L T P C 0 0 4 2
COURSE OBJECTIVES:
• To understand the Python libraries for data science.
• To understand the basic statistical and probability measures for data science.
• To learn descriptive analytics on the benchmark data sets.
• To apply correlation and regression analytics on standard data sets.
• To present and interpret data using visualization packages in Python.

LIST OF EXPERIMENTS:
1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and Pandas packages.
2. Working with NumPy arrays
3. Working with Pandas data frames
4. Reading data from text files, Excel and the web and exploring various commands for doing descriptive
analytics on the Iris data set.
5. Use the diabetes data set from UCI and Pima Indians Diabetes data set for performing the following:
a. Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and
Kurtosis.
b. Bivariate analysis: Linear and logistic regression modeling
c. Multiple Regression analysis
d. Also compare the results of the above analysis for the two data sets.
6. Apply and explore various plotting functions on UCI data sets.
a. Normal curves.
b. Density and contour plots.
c. Correlation and scatter plots.
d. Histograms.
e. Three dimensional plotting.
7. Visualizing Geographic Data with Basemap
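
The univariate measures named in experiment 5(a) map directly onto Pandas `Series` methods. The sketch below is only an illustration: the `glucose` values are made up, not taken from the UCI or Pima Indians Diabetes data sets, and in the lab the series would instead come from a column of the loaded data frame.

```python
import pandas as pd

# Hypothetical readings (made-up values; not from the actual data sets).
glucose = pd.Series([85, 90, 90, 100, 110, 125, 140], name="glucose")

summary = {
    "frequency": glucose.value_counts().to_dict(),  # count of each distinct value
    "mean": glucose.mean(),
    "median": glucose.median(),
    "mode": glucose.mode().tolist(),                # may contain several values
    "variance": glucose.var(),                      # sample variance (ddof=1 by default)
    "std_dev": glucose.std(),                       # sample standard deviation
    "skewness": glucose.skew(),
    "kurtosis": glucose.kurt(),                     # excess kurtosis (normal = 0)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that Pandas reports sample (not population) variance and standard deviation by default, and `kurt()` returns excess kurtosis, so results may differ from formulas that use other conventions.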

LIST OF EQUIPMENT: (30 Students per Batch)

Tools: Python, NumPy, SciPy, Matplotlib, Pandas, statsmodels, seaborn, plotly, bokeh
Note: Example data sets include UCI, Iris, Pima Indians Diabetes, etc.
TOTAL: 60 PERIODS
