Data science

The document outlines the syllabus for a Data Science course offered by Jawaharlal Nehru Technological University, Kakinada, for the academic year 2019-20. It includes course objectives, outcomes, and detailed content across five units covering topics such as statistics, machine learning, data visualization, and ethics in data science. Additionally, it lists textbooks and e-resources for further study.


R-19 Syllabus for CSE. JNTUK w.e.f. 2019-20

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY: KAKINADA


KAKINADA – 533 003, Andhra Pradesh, India

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


IV Year – I Semester
L T P C
3 0 0 3
DATA SCIENCE
Course Objectives:
From this course the student will learn to:
• Acquire the knowledge and expertise needed to become a proficient data scientist
• Demonstrate an understanding of the statistics and machine learning concepts that are vital for data science
• Statistically analyze a dataset
• Explain the significance of exploratory data analysis (EDA) in data science
• Critically evaluate data visualizations based on their design and use for communicating stories from data
Course Outcomes:
At the end of the course, the student will be able to:
• Describe what Data Science is and the skill sets needed to be a data scientist
• Illustrate in basic terms what statistical inference means, identify probability distributions commonly used as foundations for statistical modelling, and fit a model to data
• Use R to carry out basic statistical modeling and analysis
• Apply basic tools (plots, graphs, summary statistics) to carry out EDA
• Describe the Data Science Process and how its components interact
• Use APIs and other tools to scrape the Web and collect data
• Apply EDA and the Data Science process in a case study
UNIT I
Introduction, The Ascendance of Data, Motivating Hypothetical: DataSciencester, Finding Key
Connectors, The Zen of Python, Getting Python, Virtual Environments, Whitespace Formatting,
Modules, Functions, Strings, Exceptions, Lists, Tuples, Dictionaries, defaultdict, Counters, Sets,
Control Flow, Truthiness, Sorting, List Comprehensions, Automated Testing and assert,
Object-Oriented Programming, Iterables and Generators, Randomness, Regular Expressions,
Functional Programming, zip and Argument Unpacking, args and kwargs, Type Annotations,
How to Write Type Annotations.
UNIT II
Visualizing Data: matplotlib, Bar Charts, Line Charts, Scatterplots. Linear Algebra: Vectors,
Matrices. Statistics: Describing a Single Set of Data, Correlation, Simpson’s Paradox, Some
Other Correlational Caveats, Correlation and Causation.
Gradient Descent: The Idea Behind Gradient Descent, Estimating the Gradient, Using the
Gradient, Choosing the Right Step Size, Using Gradient Descent to Fit Models, Minibatch and
Stochastic Gradient Descent.
UNIT III
Getting Data: stdin and stdout, Reading Files, Scraping the Web, Using APIs.
Working with Data: Exploring Your Data, Using NamedTuples, Dataclasses, Cleaning and
Munging, Manipulating Data, Rescaling, Dimensionality Reduction.
Probability: Dependence and Independence, Conditional Probability, Bayes’s Theorem, Random
Variables, Continuous Distributions, The Normal Distribution, The Central Limit Theorem.
UNIT IV
Machine Learning: Modeling, Overfitting and Underfitting, Correctness, The Bias-Variance
Tradeoff, Feature Extraction and Selection, k-Nearest Neighbors, Naive Bayes, Simple Linear
Regression, Multiple Regression, Digression, Logistic Regression.
UNIT V
Clustering: The Idea, The Model, Choosing k, Bottom-Up Hierarchical Clustering.
Recommender Systems: Manual Curation, Recommending What’s Popular, User-Based
Collaborative Filtering, Item-Based Collaborative Filtering, Matrix Factorization.
Data Ethics: Building Bad Data Products, Trading Off Accuracy and Fairness, Collaboration,
Interpretability, Recommendations, Biased Data, Data Protection.
IPython, Mathematics, NumPy, pandas, scikit-learn, Visualization, R.
Textbooks:
1) Joel Grus, “Data Science From Scratch”, O’Reilly.
2) Allen B. Downey, “Think Stats”, O’Reilly.
Reference Books:
1) “Doing Data Science: Straight Talk From The Frontline”, 1st Edition, Cathy O’Neil and
Rachel Schutt, O’Reilly, 2013
2) “Mining of Massive Datasets”, 2nd Edition, Jure Leskovec, Anand Rajaraman and Jeffrey
Ullman, v2.1, Cambridge University Press, 2014
3) “The Art of Data Science”, 1st Edition, Roger D. Peng and Elizabeth Matsui, Leanpub,
2015
4) “Algorithms for Data Science”, 1st Edition, Brian Steele, John Chandler and Swarna
Reddy, Springer, 2016
e-Resources:
1) https://github.com/joelgrus/data-science-from-scratch
2) https://github.com/donnemartin/data-science-ipython-notebooks
3) https://github.com/academic/awesome-datascience
