Data Analyst Nanodegree Program - Syllabus
Syllabus
udacity.com
Data Analyst
Overview:
Learn how to analyze data using in-demand Python libraries
like NumPy and pandas. You will start by reviewing the
basics of the data analysis process, then dive into advanced
data wrangling skills to work with messy, complex real-world
datasets. Finally, you will create highly customized
visualizations using the Matplotlib Python library.
Software/Hardware and version requirements:
For this Nanodegree program, you will need access to the Internet.
Additional software such as Python and its common data analysis libraries (e.g., pandas and Matplotlib) will be
required, but the program includes Udacity Workspaces with all of the relevant packages installed.
*The length of this program is an estimation of total hours the average student may take to complete all required coursework,
including lecture and project time. If you spend about 5-10 hours per week working through the program, you should finish within
the time provided. Actual hours may vary.
Course #1:
Introduction to Data Analysis
with Pandas and NumPy
Project #1
Investigate a Dataset
NumPy, pandas, and Matplotlib
Gather data
Use the pandas query function to filter data
Use pandas explode to expand data
Jupyter Notebooks
Explain that Jupyter Notebooks can combine explanatory text, math equations, code, and visualizations
Create a new Jupyter Notebook
Communicating Results
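As a rough illustration of the pandas query and explode objectives listed above, the sketch below filters a small table and then expands a list-valued column into one row per element. The DataFrame and its column names (`title`, `year`, `genres`) are hypothetical and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
movies = pd.DataFrame({
    "title": ["A", "B", "C"],
    "year": [1999, 2010, 2015],
    "genres": [["Drama"], ["Action", "Comedy"], ["Comedy"]],
})

# query() filters rows using a boolean expression written as a string.
recent = movies.query("year >= 2010")

# explode() gives each element of a list-like cell its own row,
# repeating the other column values and the original index label.
by_genre = recent.explode("genres")

# Once exploded, ordinary aggregations work per genre.
genre_counts = by_genre["genres"].value_counts()

print(by_genre)
print(genre_counts)
```

Exploding before counting is one common way to handle multi-valued cells; whether it fits depends on how the real dataset stores such values.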
Course #2:
Advanced Data Wrangling
Project #2
Supporting Lesson Content
Identify each step of the data wrangling process (gathering,
Identify additional file formats that data analysts might
Assessing Data
Describe the assessing phase
Identify data quality issues and categorize them
Test cleaning programmatically using Python
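To make "identify data quality issues" and "test cleaning programmatically" more concrete, here is a minimal sketch under assumed data. The table, the column names (`patient_id`, `height_cm`), and the choice to drop rather than impute missing values are all illustrative assumptions, not the course's prescribed method.

```python
import pandas as pd

# Hypothetical messy data: a duplicated row, a missing value,
# and numbers stored as text.
raw = pd.DataFrame({
    "patient_id": [1, 2, 2, 3],
    "height_cm": ["170", "165", "165", None],
})

# Assess: quantify each issue before changing anything.
n_duplicates = int(raw.duplicated().sum())
n_missing = int(raw["height_cm"].isna().sum())
stored_as_text = not pd.api.types.is_numeric_dtype(raw["height_cm"])

# Clean: work on a new object so the raw data stays available.
clean = (
    raw.drop_duplicates()
       .dropna(subset=["height_cm"])
       .assign(height_cm=lambda d: pd.to_numeric(d["height_cm"]))
       .reset_index(drop=True)
)

# Test: assert that each issue found during assessment is gone.
assert not clean.duplicated().any()
assert clean["height_cm"].notna().all()
assert pd.api.types.is_numeric_dtype(clean["height_cm"])

print(n_duplicates, n_missing, stored_as_text)
print(clean)
```

Pairing every cleaning step with an assertion means a failed cleaning step raises an error instead of passing silently.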
In Part II, Explanatory data visualization, you will produce a short presentation that illustrates interesting properties, trends,
and relationships that you discovered in your selected dataset. The primary method of conveying your findings will be
through transforming your exploratory visualizations from the first part into polished, explanatory visualizations.
Understand why visualization is important in the practice of data analysis.
Know what distinguishes exploratory analysis from explanatory analysis, and the role of data visualization in each.
Know different encodings that can be used to depict data in visualizations.
Understand various pitfalls that can affect the effectiveness and truthfulness of visualizations.
Use axis limits and different scales to change how your data is interpreted.

Bivariate Exploration of Data
Use scatterplots to depict relationships between numeric variables.
Use violin and box charts to depict relationships between categorical and numeric variables.
Use clustered bar charts to depict relationships between categorical variables.
Use faceting to create plots across different subsets of the data.

Multivariate Exploration of Data
Use encodings like size, shape, and color to encode values of the third variable in a visualization.
Use feature engineering to capture relationships between variables.

Explanatory Visualizations
Understand what it means to tell a compelling story with data.
Choose the best plot type, encodings, and annotations to polish your plots.
Create high-quality image files using a Jupyter Notebook to convey your findings.

Visualization Case Study
Apply your knowledge of data visualization to a dataset involving the characteristics of diamonds and their prices.
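Several of the objectives above (scatterplots, a color encoding for a third variable, axis limits and scales, and exporting an image) can be combined in one short Matplotlib sketch. The data here is synthetic and only loosely inspired by the diamonds case study; the variable names, the numeric relationship between them, and the styling choices are assumptions made for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (not the real diamonds dataset).
rng = np.random.default_rng(0)
carat = rng.uniform(0.2, 3.0, size=200)
depth = rng.uniform(55, 70, size=200)  # third variable, shown as color
price = 4000 * carat**1.7 * rng.lognormal(0, 0.2, size=200)

fig, ax = plt.subplots(figsize=(6, 4))

# Scatterplot of two numeric variables; color encodes the third.
points = ax.scatter(carat, price, c=depth, cmap="viridis", s=15, alpha=0.8)

# A log scale and explicit axis limits change how the data reads.
ax.set_yscale("log")
ax.set_xlim(0, 3.2)

# Labels, a title, and a colorbar move the plot toward explanatory quality.
ax.set_xlabel("Carat")
ax.set_ylabel("Price (log scale)")
ax.set_title("Price rises steeply with carat (synthetic data)")
fig.colorbar(points, ax=ax, label="Depth")

# Export as a PNG; an in-memory buffer is used here, a file path also works.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
plt.close(fig)
```

Passing a file name such as `"figure.png"` to `savefig` instead of the buffer would write the image to disk for use in a presentation.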
Course #1 Instructor
Matt Maybeno
Principal Software Engineer
Course #2 Instructor
Ria Cheruvu
Intel NEX AI Ethics Lead Architect
Ria is Intel NEX AI Ethics Lead Architect, leading trustworthy AI. She is an emerging
industry speaker and has a master’s in data science from Harvard University. Ria
previously served as a Teaching Fellow for Harvard's 2021 Data Science graduate
curriculum and Lead Instructor for Eduonix's ML Deployment course.
Course #3 Instructor
Josh Magee
Senior Data Scientist
Josh is a Senior Data Scientist at Local Logic, where he models commercial real
estate trends, acquisitions, and sustainable cities. He was formerly Assistant
Professor of Data Analytics at Stonehill College, and was a postdoctoral researcher
in nuclear physics at Lawrence Livermore National Laboratory.
Learn More at
www.udacity.com