Data Architect Nanodegree Program Syllabus

The Data Architect Nanodegree Program equips learners with skills to design and implement enterprise data infrastructure solutions, including relational databases, data warehouses, and data governance practices. The program spans four months and covers topics such as data modeling, big data systems, and data governance, requiring prior knowledge in databases and programming. Hands-on projects and mentorship support are integral to the learning experience, preparing graduates for advanced roles in data architecture.


INDIVIDUAL LEARNERS

SCHOOL OF DATA SCIENCE

Data Architect
Nanodegree Program Syllabus
Overview
In this program, learners will plan, design, and implement enterprise data infrastructure solutions and create the blueprints
for an organization’s data management system. Learners will create a relational database with PostgreSQL, design an Online
Analytical Processing (OLAP) data model to build a cloud-based data warehouse, and design scalable data lake architecture
that meets the needs of big data. Finally, learners will apply the principles of data governance to an organization’s data
management system.

Learning Objectives

A graduate of this program will be able to:

• Build conceptual, logical, and physical entity relationship diagrams (ERDs).

• Architect a physical database in PostgreSQL.

• Transform data from transactional systems into an operational data store.

• Create a data warehouse system using dimensional data models.

• Use appropriate storage and processing frameworks to manage big data.

• Design end-to-end batch and stream processing architecture.

• Establish data governance best practices including metadata management, master data management,
and data quality management.

Program information

Estimated Time: 4 months at 10 hrs/week*

Skill Level: Advanced

Prerequisites

A well-prepared learner should have knowledge of:

• Relational database management systems or foundational database skills

• Intermediate Python

• Intermediate SQL

• Batch processing and stream processing frameworks

• Operating systems, including UNIX, Linux, and MS Windows

• Basics of ETL/data pipelines

Required Hardware/Software

Learners need access to the internet and a 64-bit computer.

*The length of this program is an estimation of total hours the average student may take to complete all required
coursework, including lecture and project time. If you spend about 5-10 hours per week working through the program, you
should finish within the time provided. Actual hours may vary.

Course 1

Data Architecture Foundations


Learn about the principles of data architecture. Begin by learning the characteristics of good data architecture and how
to apply them. Next, move on to the topic of data modeling. Learn to design a data model, normalize data, and create
a professional ERD. Finally, take everything you learned and create a physical database using PostgreSQL.

Course Project

Designing an HR Database
In this project, learners will design, build, and populate a database for the human resources (HR)
department at the imaginary Tech ABC Corp, a video game company. This project will start with a request
from the HR manager. From there, learners will need to design a database using the foundational principles
of data architecture that is best suited to the department’s needs. They will go through the steps of
database architecture, creating database proposals, database entity relationship diagrams, and finally
creating the database itself. This project is a scaled-down simulation of the kind of real-world assignments
data architects work on every day.

Lesson 1
What is Data Architecture?

• Define data architecture characteristics.

• Define data governance and its role.

• Define scalability and flexibility in database design.

Lesson 2
Database Framework

• Introduction to ERDs.

• Develop a database schema.

• Understand normalization and its use cases.

• Learn to normalize data to the 3rd Normal Form.
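
As a rough illustration of normalizing to the 3rd Normal Form, the sketch below splits a flat employee table into two tables so that the department name is stored only once. The table layout, field names, and the `to_third_normal_form` helper are hypothetical and are not taken from the course material.

```python
# Hypothetical flat employee records. dept_name depends on dept_id rather
# than directly on emp_id, which is a transitive dependency that 3NF removes.
flat_rows = [
    {"emp_id": 1, "emp_name": "Ana", "dept_id": 10, "dept_name": "HR"},
    {"emp_id": 2, "emp_name": "Raj", "dept_id": 20, "dept_name": "IT"},
    {"emp_id": 3, "emp_name": "Lee", "dept_id": 10, "dept_name": "HR"},
]


def to_third_normal_form(rows):
    """Split flat rows into employee and department tables keyed by ID."""
    employees = {}
    departments = {}
    for row in rows:
        # Each department name is stored once, keyed by dept_id.
        departments[row["dept_id"]] = {"dept_name": row["dept_name"]}
        # Employees keep only a foreign key to the department.
        employees[row["emp_id"]] = {
            "emp_name": row["emp_name"],
            "dept_id": row["dept_id"],
        }
    return employees, departments


employees, departments = to_third_normal_form(flat_rows)
print(employees)
print(departments)
```

After the split, renaming a department means changing one row in `departments` instead of every matching employee row.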

Lesson 3
Relational Data Design

• Introduction to ERDs.

• Build a conceptual ERD.

• Build a logical ERD.

• Learn about cardinality and Crow’s Foot notation.

• Build a physical ERD.

Lesson 4
Creating a Physical Database

• Learn about factors that affect database performance.

• Learn about file and data storage solutions.

• Use DDL SQL to create database objects in PostgreSQL.

• Learn about data ingestion methods, including ETL, pipelines, APIs, and direct feeds.

• Use DML SQL to populate a database with data in PostgreSQL.

• Use CRUD SQL commands to demonstrate proper operation of a database.
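
The DDL, DML, and CRUD steps listed above can be sketched in a few lines. The course uses PostgreSQL; this sketch assumes Python's built-in `sqlite3` module as a stand-in so that it runs without a database server, and the `employee` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create a database object.
cur.execute(
    """
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        salary   INTEGER NOT NULL
    )
    """
)

# DML, create: populate the table.
cur.executemany(
    "INSERT INTO employee (emp_id, emp_name, salary) VALUES (?, ?, ?)",
    [(1, "Ana", 50000), (2, "Raj", 60000)],
)

# DML, update and delete.
cur.execute("UPDATE employee SET salary = ? WHERE emp_id = ?", (65000, 2))
cur.execute("DELETE FROM employee WHERE emp_id = ?", (1,))

# DML, read: confirm the database behaves as expected.
remaining = cur.execute(
    "SELECT emp_id, emp_name, salary FROM employee ORDER BY emp_id"
).fetchall()
conn.commit()
conn.close()

print(remaining)
```

The SQL statements are deliberately plain, so they should carry over to PostgreSQL with few changes, although type names and placeholder syntax differ between database drivers.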

Course 2

Designing Data Systems


Learn to design enterprise data architecture and build a cloud-based data warehouse with Snowflake. Learners will evaluate
various data assets of an organization and characteristics of these data sources, design a staging area for ingesting varieties
of data coming from source systems, and design an operational data store (ODS). Finally, learn to design OLAP dimensional
data models, design ELT data processing that is capable of moving data from an ODS to a data warehouse, and write SQL
queries for the purpose of building reports.

Course Project

Design a Data Warehouse for Reporting & OLAP


In this project, learners will design an end-to-end data architecture, build ingestion of data from Yelp and
Climatic source systems, design operational data store and data warehouse systems, transform data from
staging to ODS, and finally from ODS to a data warehouse system. The Yelp source carries a list of businesses,
restaurants, reviews, and ratings. The Climatic data source keeps track of temperature and precipitation data.
Both of these websites are independent sources and not related to each other. The final objective of this
project is to write appropriate SQL to find the impact of weather on restaurant ratings.

Lesson 1
Enterprise Data Architecture

• Understand the importance of data architecture in any organization.

• Learn the benefits of executing a data architecture.

• Learn the business and technical artifacts required.

• Understand business and functional requirements.

• Learn how OLTP, ODS, and OLAP models are designed.

Lesson 2
Staging Data

• Build staging area for data ingestion.

• Learn to organize data assets based on schemas.

• Design schedules for data processing based on the requirements.

• Learn to manage staging area through metadata.

Lesson 3
Operational Data Store

• Build an integrated ER model connecting distributed data assets.

• Learn to design data dictionary and master data.

• Apply normalization rules to eliminate redundancies.

• Learn when to use ETL vs. ELT techniques.

• Learn to cleanse data anomalies.

Lesson 4
Data Warehouse

• Learn two OLAP modeling designs: Star and Snowflake schemas.

• Learn various dimensional and fact table types.

• Build ELT data processing from ODS to data warehouse.

• Write SQL queries for the purpose of reporting.
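
To make the star schema idea concrete, here is a minimal sketch with one fact table and two dimension tables, followed by a reporting query in the spirit of the course project (average rating by weather). The course uses Snowflake; this sketch assumes Python's built-in `sqlite3` module instead, and every table name, column name, and data row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: fact_review in the center, joined to two dimensions.
conn.executescript(
    """
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, precipitation TEXT);
    CREATE TABLE dim_business (business_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_review (
        review_id   INTEGER PRIMARY KEY,
        business_id INTEGER REFERENCES dim_business (business_id),
        date_id     INTEGER REFERENCES dim_date (date_id),
        stars       REAL
    );
    INSERT INTO dim_date VALUES (1, 'rain'), (2, 'dry');
    INSERT INTO dim_business VALUES (1, 'Cafe A');
    INSERT INTO fact_review VALUES (1, 1, 1, 3.0), (2, 1, 1, 4.0), (3, 1, 2, 5.0);
    """
)

# Reporting query: average star rating grouped by a dimension attribute.
report = conn.execute(
    """
    SELECT d.precipitation, AVG(f.stars) AS avg_stars
    FROM fact_review AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.precipitation
    ORDER BY d.precipitation
    """
).fetchall()
conn.close()

print(report)
```

A Snowflake schema would normalize the dimensions further, for example by moving weather attributes out of `dim_date` into their own table.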

Course 3

Big Data Systems


Learn to help organizations with massive amounts of data, including identification of big data problems and how to design
big data solutions. Learn about internal architecture of many of the big data tools such as HDFS, MapReduce, Hive, and Spark,
and how these tools work internally to provide distributed storage, distributed processing capabilities, fault tolerance, and
scalability. Next, learn about evaluating NoSQL databases, their use cases, and creating a NoSQL database with Amazon
DynamoDB. Finally, learn to implement data lake design patterns and enable transactional capabilities in a data lake.

Course Project

Design an Enterprise Data Lake System


Act as a big data architect and work on a real use case faced by a medical processing company. Start by
analyzing the current architecture of the company. Then understand technical and business requirements
and propose a new data lake-based solution to both technical and executive audiences. For technical
audiences, develop a design document outlining a solution and rationale, and for the executive audience,
record a short presentation pitching a solution. This is a real-world scenario where learners will act as an
expert data infrastructure consultant to the company and solve challenges the company is facing today.
Learners will also hone their presentation skills and learn to articulate complex technical terminology.

Lesson 1
Characteristics of Big Data

• Explain what “big data” is.

• Articulate the business value of big data.

• Describe the characteristics of big data.

• Distinguish between horizontal scaling vs. vertical scaling.

• Describe the components of a big data ecosystem.

Lesson 2
Ingestion, Storage & Processing Frameworks

• Explain how distributed storage works in HDFS.

• Explain how distributed processing works.

• Explain how resources are managed in a Hadoop cluster.

• Distinguish between different distributed processing frameworks.

• Apply frameworks to appropriate use cases.
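
The distributed processing model behind MapReduce can be imitated on a single machine. The sketch below is a plain Python word count that walks through the map, shuffle, and reduce phases; it does not use Hadoop or Spark, and the function names and input lines are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# In a real cluster each line (or file block) could be mapped on a different node.
lines = ["big data big ideas", "data lake"]
mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(shuffle_phase(mapped))

print(word_counts)
```

Because each call to `map_phase` depends only on its own line, the map step is the part a framework can spread across many machines.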

Lesson 3
NoSQL Databases

• Explain the difference between SQL and NoSQL databases.

• Differentiate between ACID and CAP properties of SQL and NoSQL databases.

• Implement create, read, write, and update operations on a NoSQL database with DynamoDB.

• Create a simple NoSQL data model.

Lesson 4
Scalable Data Lake Architecture

• Explain data lakes and their business value.

• Distinguish between different data formats and their applications.

• Articulate data lake design patterns and challenges.

• Explain how to enable transactional capabilities in data lakes.

Course 4

Data Governance
Learn how to design a data governance solution that meets a company’s needs. First, learn about the different types of
metadata, and how to build a metadata management system, enterprise data model, and enterprise data catalog. Next,
learn how to perform data profiling using various techniques including data quality dimensions, identify remediation options
for data quality issues, and measure and monitor data quality using data quality scores, thresholds, dashboards, exception
reports, and trend reports. Finally, learn the concepts of master data and the golden record, different types of master data
management architectures, as well as the golden record creation and master data governance processes.

Course Project

Data Governance at SneakerPark


In this project, learners will implement data governance solutions for an online shoe reseller,
SneakerPark, to better manage its data now and in the future. First, create an enterprise data model that
provides a holistic view of all the data in its systems. Next, document the metadata in an enterprise data
catalog and profile the data in its systems to identify data quality issues, suggest remediation strategies
for each of these issues, and design a data quality dashboard. Finally, sketch out a proposed MDM
implementation architecture, define a set of matching rules for the creation of customer and item master
data, and define the data governance roles and responsibilities that are necessary to oversee this data
governance initiative.

Lesson 1
Introduction to Data Governance

• Understand data governance and its importance.

• Learn about the different disciplines of data governance.

• Understand the different stakeholders involved in data governance projects.

Lesson 2
Metadata Management

• Understand the different types of metadata.

• Understand the components and capabilities of a metadata management system.

• Create conceptual and logical enterprise data models.

• Create an enterprise data catalog.

Lesson 3
Data Quality Management

• Perform data profiling with various techniques, including data quality dimensions.

• Identify remediation options for data quality issues.

• Measure data quality using data quality scores and thresholds.

• Monitor data quality using dashboards, exception reports, and trend reports.
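
One simple way to picture a data quality score and threshold is a completeness check on a single field. The sketch below is only an illustration: the records, the `completeness_score` helper, and the 0.9 threshold are all hypothetical, and completeness is just one of several possible data quality dimensions.

```python
# Hypothetical customer records with some missing email values.
records = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 3, "email": "c@example.com"},
    {"customer_id": 4, "email": ""},
]


def completeness_score(rows, field):
    """Return the fraction of rows whose field is neither None nor empty."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


# Assumed threshold: at least 90% of rows must have an email address.
THRESHOLD = 0.9
score = completeness_score(records, "email")
passes = score >= THRESHOLD

print(score, passes)
```

Scores like this, tracked over time, are the kind of input a dashboard or trend report could be built from, with failed thresholds feeding an exception report.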

Lesson 4
Master Data Management

• Understand the concepts of master data and golden record.

• Understand different types of master data management architectures.

• Create a golden record using various match and merge techniques.

• Understand data governance processes for authoring, monitoring, and approval of master data.
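
A golden record is built by first matching records that refer to the same entity and then merging them. The sketch below assumes one very simple matching rule (same email after trimming and lowercasing) and one survivorship rule (the newest non-empty value wins). Both rules, the helper names, and the sample records are hypothetical, and real match and merge techniques are usually more involved.

```python
def match_key(record):
    """Matching rule (assumed): same email after trimming and lowercasing."""
    return record["email"].strip().lower()


def merge(records):
    """Survivorship rule (assumed): the newest non-empty value wins per field."""
    golden = {}
    # ISO-formatted dates sort correctly as strings, oldest first.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if value not in (None, ""):
                golden[field] = value
    return golden


def build_golden_records(records):
    """Group records by match key, then merge each group into one record."""
    groups = {}
    for record in records:
        groups.setdefault(match_key(record), []).append(record)
    return {key: merge(group) for key, group in groups.items()}


source_records = [
    {"email": "Ana@Example.com ", "name": "Ana", "phone": "", "updated": "2022-01-05"},
    {"email": "ana@example.com", "name": "Ana Diaz", "phone": "555-0100", "updated": "2022-03-01"},
    {"email": "raj@example.com", "name": "Raj", "phone": None, "updated": "2022-02-10"},
]
golden_records = build_golden_records(source_records)

print(golden_records)
```

Here the two "Ana" records collapse into one golden record that keeps the newer name and the only available phone number.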

Meet your instructors.

Ben Larson
Data Architect & Analytics Consultant

Benjamin has over 15 years of experience working as a data professional in fields including
medicine, telecom, and finance, in roles ranging from data architect to data scientist and analytics
consultant. He holds a PhD in decision sciences, where his research was focused on rare event
detection.

Shankar Korrapolu
CEO at OK2

Shankar Korrapolu is the cofounder and CEO of startup OK2, a cross-platform mobile gaming
engine that builds games cheaper and faster without compromising quality. For over 30 years
he has offered his data processing services to organizations in investment banking, pharma,
government, and education sectors.

Shrinath Parikh
Senior Data Architect

Shrinath is an entrepreneur and data architect passionate about helping enterprise companies
transform and engineer their big data analytics applications on Cloud. He has worked with AWS,
Google, and Microsoft cloud platforms, has over 15 certifications and an MS in computer science
from the University of Texas at Dallas.

Vijaya Nelavelli
Founder & Principal Data Architect

Vijaya is the founder and principal data architect for Great View Data Corp., which provides
data architecture consulting and implementation services. Vijaya has extensive experience with
creating architecture strategy and roadmaps, establishing frameworks and best practices, and
data management.

Rostislav Rabotnik
Principal Data Architect

Rostislav is an enterprise data architect and data management leader whose expertise covers
data governance, architecture, and integration practices across a diverse range of technologies.
He has worked at companies of all sizes and in various industries. His musings can be found at
learndataarchitecture.com.

Nicholas DeGiacomo
Data Scientist

Nick has built and managed teams of scientists and engineers for political campaigns, social media,
and supply chain companies. With experiences ranging from startups to Amazon, he balances
speed and scale. In his free time, Nick enjoys teaching graduate statistics courses at both Columbia
and Yeshiva Universities.

Udacity’s learning
experience

Hands-on Projects

Open-ended, experiential projects are designed to reflect actual workplace challenges. They aren’t
just multiple choice questions or step-by-step guides, but instead require critical thinking.

Quizzes

Auto-graded quizzes strengthen comprehension. Learners can return to lessons at any time during
the course to refresh concepts.

Knowledge

Find answers to your questions with Knowledge, our proprietary wiki. Search questions asked by
other students, connect with technical mentors, and discover how to solve the challenges that
you encounter.

Custom Study Plans

Create a personalized study plan that fits your individual needs. Utilize this plan to keep track of
movement toward your overall goal.

Workspaces

See your code in action. Check the output and quality of your code by running it on interactive
workspaces that are integrated into the platform.

Progress Tracker

Take advantage of milestone reminders to stay on schedule and complete your program.

Our proven approach for building
job-ready digital skills.
Experienced Project Reviewers

Verify skills mastery.


• Personalized project feedback and critique includes line-by-line code review from
skilled practitioners with an average turnaround time of 1.1 hours.

• Project review cycle creates a feedback loop with multiple opportunities for
improvement—until the concept is mastered.

• Project reviewers leverage industry best practices and provide pro tips.

Technical Mentor Support

24/7 support unblocks learning.


• Learning accelerates as skilled mentors identify areas of achievement and potential
for growth.

• Unlimited access to mentors means help arrives when it’s needed most.

• 2 hr or less average question response time assures that skills development stays on track.

Personal Career Services

Empower job-readiness.
• Access to a Github portfolio review that can give you an edge by highlighting your
strengths, and demonstrating your value to employers.*

• Get help optimizing your LinkedIn and establishing your personal brand so your profile
ranks higher in searches by recruiters and hiring managers.

Mentor Network

Highly vetted for effectiveness.


• Mentors must complete a 5-step hiring process to join Udacity’s selective network.

• After passing an objective and situational assessment, mentors must demonstrate
communication and behavioral fit for a mentorship role.

• Mentors work across more than 30 different industries and often complete a Nanodegree
program themselves.

*Applies to select Nanodegree programs only.

Learn more at
www.udacity.com/online-learning-for-individuals →

12.29.22 | V1.0
