SnowPro Advanced: Data Engineer: Exam Study Guide
DATA ENGINEER
EXAM STUDY GUIDE
Last Updated: August 1, 2023
SNOWPRO ADVANCED: DATA ENGINEER STUDY GUIDE OVERVIEW
This study guide highlights concepts that may be covered on Snowflake’s SnowPro Advanced:
Data Engineer Certification exam.
Holding the SnowPro Core certification in good standing is a prerequisite for taking the
SnowPro Advanced: Data Engineer certification exam.
For an overview and more information on the SnowPro Core Certification exam or SnowPro
Advanced Certification series, please navigate here.
This guide lists the Snowflake topics and subtopics covered on the exam. The topics are
followed by additional resources (videos, documentation, blogs, and/or exercises)
to help you understand data engineering with Snowflake.
Some links may be more valuable than others, depending on your experience, so you do not
need to spend the same amount of time on each one. Some links may appear in more than one
domain.
TABLE OF CONTENTS
SNOWPRO ADVANCED: DATA ENGINEER STUDY GUIDE OVERVIEW
RECOMMENDATIONS FOR USING THE GUIDE
SNOWPRO ADVANCED: DATA ENGINEER CERTIFICATION OVERVIEW
SNOWPRO ADVANCED: DATA ENGINEER PREREQUISITE
SNOWPRO ADVANCED: DATA ENGINEER SUBJECT AREA BREAKDOWN
SNOWPRO ADVANCED: DATA ENGINEER DOMAINS & OBJECTIVES
Domain 1.0: Data Movement
Domain 1.0: Data Movement Study Resources
Domain 2.0: Performance Optimization
Domain 2.0: Performance Optimization Study Resources
Domain 3.0: Storage & Data Protection
Domain 3.0: Storage & Data Protection Study Resources
Domain 4.0: Security
Domain 4.0: Security Study Resources
Domain 5.0: Data Transformation
Domain 5.0: Data Transformation Study Resources
SNOWPRO ADVANCED: DATA ENGINEER SAMPLE QUESTIONS
SNOWPRO ADVANCED: DATA ENGINEER CERTIFICATION OVERVIEW
The SnowPro Advanced: Data Engineer tests advanced knowledge and skills used to apply
comprehensive data engineering principles using Snowflake. The exam will assess skills through
scenario-based questions and real-world examples.
Target Audience:
2+ years of data engineering experience, including practical experience using Snowflake for
data engineering tasks. In addition, successful candidates may have:
● A working knowledge of RESTful APIs, SQL, semi-structured datasets, and cloud-native
concepts.
This exam is designed for:
● Data Engineers
● Software Engineers
Eligible individuals must hold an active SnowPro Core Certified credential. If you feel you need
more guidance on the fundamentals, please see the SnowPro Core Exam Study Guide.
STEPS TO SUCCESS
1. Review the Data Engineer Exam Guide
2. Attend Snowflake’s Instructor-Led Data Engineering Course
3. Review and study applicable white papers and documentation
4. Get hands-on practical experience with relevant business requirements using Snowflake
5. Attend Snowflake Webinars
6. Attend Snowflake Virtual Hands-on Labs for more hands-on practical experience
7. Schedule your exam
8. Take your exam!
SNOWPRO ADVANCED: DATA ENGINEER SUBJECT AREA BREAKDOWN
This exam guide includes test domains, weightings, and objectives. It is not a comprehensive
listing of all the content that will be presented on this examination. The table below lists the main
content domains and their weightings.
1.7 Design and build data sharing solutions.
● Implement a data share
● Create a secure view
● Implement row level filtering
1.8 Outline when to use external tables and define how they work.
● Partitioning external tables
● Materialized views
● Partitioned data unloading
Lab Guides
Accelerating Data Engineering with Snowflake & dbt
Auto-Ingest Twitter Data into Snowflake
Automating Data Pipelines to Drive Marketing Analytics with Snowflake & Fivetran
Additional Assets
Support for Calling External functions via Google Cloud API Gateway Now in Public
Preview (blog)
Snowflake and Spark, Part 2: Pushing Spark Query (blog)
Fetching Query Results From Snowflake (blog)
Moving from On-Premises ETL to Cloud-Driven ELT (white paper)
Domain 2.0: Performance Optimization
Lab Guides
Resource Optimization: Performance
Resource Optimization: Usage Monitoring
Building a Data Application
Additional Assets
Performance Impact from Local and Remote Disk Spilling (blog)
Snowflake: Visualizing Warehouse Performance (blog)
Caching in Snowflake Data Warehouse (blog)
SHOW STREAMS
System Functions
TASK_HISTORY
Virtual Warehouses
3.4 Use Time Travel and cloning to create new development environments.
● Clone objects
● Validate changes before promoting
● Rollback changes
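As a rough illustration of the cloning step, the Python sketch below builds a `CREATE ... CLONE` statement, optionally with a Time Travel `AT (OFFSET => ...)` clause. The helper name `clone_statement` and the object names `prod_db` and `dev_db` are hypothetical; the sketch only assembles SQL text and does not connect to Snowflake, so the statement should be checked against the current CLONE documentation before use.

```python
def clone_statement(object_type, source, target, offset_seconds=None):
    """Build a zero-copy clone statement, optionally as of a past point in time.

    offset_seconds is how far back to go (a positive number of seconds);
    Snowflake expects a negative OFFSET value for Time Travel.
    """
    sql = f"CREATE {object_type} {target} CLONE {source}"
    if offset_seconds is not None:
        sql += f" AT (OFFSET => -{int(offset_seconds)})"
    return sql + ";"


# Clone the current state of a (hypothetical) production database.
print(clone_statement("DATABASE", "prod_db", "dev_db"))

# Clone the same database as it existed one hour ago.
print(clone_statement("DATABASE", "prod_db", "dev_db", offset_seconds=3600))
```

A clone created this way can serve as a development environment in which changes are validated before being promoted.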
Lab Guides
Getting Started with Time Travel
4.2 Outline the system-defined roles and when they should be applied.
● The purpose of each system-defined role, including best-practice usage
in each case
● The primary differences between the SECURITYADMIN and USERADMIN
roles
● The difference between the purpose and usage of the USERADMIN/
SECURITYADMIN roles and the SYSADMIN role
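The relationships between these roles can be sketched as a small graph. The Python below is an illustrative model only, assuming the default hierarchy of system-defined roles described in Snowflake's access control documentation (ACCOUNTADMIN inherits SECURITYADMIN and SYSADMIN, SECURITYADMIN inherits USERADMIN, and every role inherits PUBLIC). The `INHERITS` mapping and the `effective_roles` helper are hypothetical names, not Snowflake objects.

```python
# Default hierarchy of system-defined roles (assumed from Snowflake's
# access control documentation). Each role inherits the privileges of the
# roles listed for it.
INHERITS = {
    "ACCOUNTADMIN": ["SECURITYADMIN", "SYSADMIN"],
    "SECURITYADMIN": ["USERADMIN"],
    "USERADMIN": ["PUBLIC"],
    "SYSADMIN": ["PUBLIC"],
    "PUBLIC": [],
}


def effective_roles(role):
    """Return the role plus every role it inherits, directly or indirectly."""
    seen = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(INHERITS[current])
    return seen


for role in INHERITS:
    print(role, "->", sorted(effective_roles(role)))
```

In this model, SECURITYADMIN includes USERADMIN but not SYSADMIN, which mirrors the separation between managing users, roles, and grants on one side and managing warehouses and databases on the other.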
Additional Assets
Snowflake RBAC Security Prefers Role Inheritance to Role Composition (blog)
5.1 Define User-Defined Functions (UDFs) and outline how to use them.
● Snowpark UDFs (for example, Java, Python, Scala)
● Secure UDFs
● SQL UDFs
● JavaScript UDFs
● User-Defined Table Functions (UDTFs)
5.3 Design, build, and leverage stored procedures.
● Snowpark stored procedures (for example, Java, Python, Scala)
● SQL Scripting stored procedures
● JavaScript stored procedures
● Transaction management
Additional Assets
Snowflake For Data Engineering – Easily Ingest, Transform and Deliver Data for
Up-To-The Moment Insight (white paper)
Bringing Extensibility to Data Pipelines: What’s New with Snowflake External Functions
(blog)
Generating a JSON Dataset Using Relational Data in Snowflake (blog)
Best Practices for Managing Unstructured Data (white paper)
SNOWPRO ADVANCED: DATA ENGINEER SAMPLE QUESTIONS
2. A Data Engineer has inherited a database and is monitoring a table with the query below
every 30 days:
The Engineer gets the first two results (e.g., Day 0 and Day 30).
-- DAY 0 -------
{
  "cluster_by_keys" : "LINEAR(o_orderdate)",
  "total_partition_count" : 3218,
  "total_constant_partition_count" : 0,
  "average_overlaps" : 20.4133,
  "average_depth" : 11.4326,
  "partition_depth_histogram" : {
    "00000" : 0,
    "00001" : 0,
    "00002" : 0,
    "00003" : 0,
    "00004" : 0,
    "00005" : 0,
    "00006" : 0,
    "00007" : 0,
    "00008" : 0,
    "00009" : 0,
    "00010" : 993,
    "00011" : 841,
    "00012" : 748,
    "00013" : 413,
    "00014" : 121,
    "00015" : 74,
    "00016" : 16,
    "00032" : 12
  }
}
-- DAY 30 -------
{
  "cluster_by_keys" : "LINEAR(o_orderdate)",
  "total_partition_count" : 3240,
  "total_constant_partition_count" : 0,
  "average_overlaps" : 64.1185,
  "average_depth" : 33.4704,
  "partition_depth_histogram" : {
    "00000" : 0,
    "00001" : 0,
    "00002" : 0,
    "00003" : 0,
    "00004" : 0,
    "00005" : 0,
    "00006" : 0,
    "00007" : 0,
    "00008" : 0,
    "00009" : 0,
    "00010" : 0,
    "00011" : 0,
    "00012" : 0,
    "00013" : 0,
    "00014" : 0,
    "00015" : 0,
    "00016" : 0,
    "00032" : 993,
    "00064" : 2247
  }
}
a. The table is well organized for queries that range over column o_orderdate.
Over time, this organization is degrading.
b. The table was initially well organized for queries that range over column
o_orderdate. Over time this organization has improved further.
c. The table was initially not organized for queries that range over column
o_orderdate. Over time, this organization has changed.
d. The table was initially poorly organized for queries that range over column
o_orderdate. Over time, this organization has improved.
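The two outputs in this question can also be compared programmatically. The following is a minimal Python sketch, assuming the Day 0 and Day 30 values are copied by hand from the results shown in the question; the names `day_0`, `day_30`, `histogram_total`, and `depth_change` are hypothetical and not part of any Snowflake API. It checks that each histogram accounts for every micro-partition and reports how the average depth changed. In Snowflake's clustering metrics, a smaller average depth generally indicates better clustering on the listed keys.

```python
# Values copied from the Day 0 and Day 30 outputs in the question.
# Zero-count histogram buckets are omitted for brevity.
day_0 = {
    "total_partition_count": 3218,
    "average_overlaps": 20.4133,
    "average_depth": 11.4326,
    "partition_depth_histogram": {
        "00010": 993, "00011": 841, "00012": 748, "00013": 413,
        "00014": 121, "00015": 74, "00016": 16, "00032": 12,
    },
}
day_30 = {
    "total_partition_count": 3240,
    "average_overlaps": 64.1185,
    "average_depth": 33.4704,
    "partition_depth_histogram": {"00032": 993, "00064": 2247},
}


def histogram_total(info):
    """Sum the micro-partition counts across all depth buckets."""
    return sum(info["partition_depth_histogram"].values())


def depth_change(before, after):
    """Return the ratio of the later average depth to the earlier one."""
    return after["average_depth"] / before["average_depth"]


for label, info in (("Day 0", day_0), ("Day 30", day_30)):
    # Every micro-partition should fall into exactly one bucket.
    assert histogram_total(info) == info["total_partition_count"]
    print(label, "average depth:", info["average_depth"])

ratio = depth_change(day_0, day_30)
trend = "degrading" if ratio > 1 else "improving or unchanged"
print(f"Average depth changed by a factor of {ratio:.2f}: clustering is {trend}")
```

Both the average depth and the average overlaps rise between the two snapshots, and the histogram shifts from the 10 to 16 buckets into the 32 and 64 buckets.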
3. A Data Engineer is preparing to load staged data from an external stage using a
task object.
Which of the following practices will provide the MOST efficient load performance?
4. A Data Engineer is working on a project that requires data to be moved directly from an
internal stage to an external stage.
5. The S1 schema contains two permanent tables that were created as shown below:
a. The retention time on table_a does not change; table_b is set to 20 days.
b. An error will be generated; a data retention time on a schema cannot be set.
c. The retention time on both tables will be set to 20 days.
d. The retention time will not change on either table.
Correct responses for sample questions:
1: b, 2: a, 3: d, 4: a, 5: a
The information provided in this study guide is provided for your purposes only and may not be provided
to third parties.
IN ADDITION, THIS STUDY GUIDE IS PROVIDED “AS IS”. NEITHER SNOWFLAKE NOR ITS
SUPPLIERS MAKES ANY OTHER WARRANTIES, EXPRESS OR IMPLIED, STATUTORY OR
OTHERWISE, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY,
TITLE, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT.