Snowflake Scenario Based Interview Questions
Q1: How to give sequence numbers to a newly
added column in a table in Snowflake?
We can do this by using sequences in Snowflake. Execute the code below to
understand.
Example:
create table abc(col1 varchar);
insert into abc values ('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h');
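The example above only creates the table and loads rows; the steps that actually assign the sequence numbers are not shown. A hedged sketch of the remaining steps follows (the sequence name `seq1` and column name `id` are illustrative assumptions):

```sql
-- Create a sequence to generate the numbers (name is illustrative)
create or replace sequence seq1 start = 1 increment = 1;

-- Add the new column to the existing table
alter table abc add column id int;

-- Populate the new column from the sequence
update abc set id = seq1.nextval;

select * from abc order by id;
```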
Note (from a masking policy example): if the date is '2022-11-30', it will be shown as '2022-01-01' because of the masking policy applied to the column.
Example:
Input:
create or replace table monthly_sales(empid int, dept varchar, jan int, feb int, mar int);
insert into monthly_sales values
(1, 'electronics', 100, 200, 300),
(2, 'clothes', 100, 300, 150),
(3, 'cars', 200, 400, 100);
Query:
select * from monthly_sales
unpivot (sales for month in (jan, feb, mar))
order by empid;
Output:
+-------+-------------+-------+-------+
| EMPID | DEPT        | MONTH | SALES |
|-------+-------------+-------+-------|
| 1     | electronics | JAN   | 100   |
| 1     | electronics | FEB   | 200   |
| 1     | electronics | MAR   | 300   |
| 2     | clothes     | JAN   | 100   |
| 2     | clothes     | FEB   | 300   |
| 2     | clothes     | MAR   | 150   |
| 3     | cars        | JAN   | 200   |
| 3     | cars        | FEB   | 400   |
| 3     | cars        | MAR   | 100   |
+-------+-------------+-------+-------+
Examples to practice Pivot and Unpivot
Pivot:
https://docs.snowflake.com/en/sql-reference/constructs/pivot.html
Unpivot:
https://docs.snowflake.com/en/sql-reference/constructs/unpivot.html
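For practice in the other direction, a hedged PIVOT sketch (the source table `sales_rows(empid, month, sales)` is an assumption, holding one row per employee per month):

```sql
-- Rows back to columns: one output column per month value listed in IN (...)
select *
from sales_rows
pivot (sum(sales) for month in ('JAN', 'FEB', 'MAR'))
order by empid;
```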
Q6: How to get dropped table data if we recreate
a table with the same name as the dropped table?
There is a table named EMP with 1000 records
• Day1: dropped EMP table
• Day2: created a table with name EMP and inserted 5000 records
• Day3: I need data from EMP table that I had dropped on Day1
How to get that Day1 data?
Ans:
ALTER TABLE EMP RENAME TO EMP_Day2;
UNDROP TABLE EMP;
ALTER TABLE EMP RENAME TO EMP_Day1;
ALTER TABLE EMP_Day2 RENAME TO EMP;
Question:
Let A be the original table and B the cloned table. If we insert data into table B, will it
reflect in table A, and what happens to the storage?
Answer:
After cloning, row insertions or deletions in one table have no impact on the other.
If you insert or delete records in table A, it will not reflect in table B, and vice versa.
Storage: a clone is zero-copy, so at creation B shares A's existing micro-partitions and consumes no extra storage; only data inserted or modified after cloning creates new micro-partitions, which are owned by the table that changed.
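A small sketch of this behavior (table names and columns are illustrative):

```sql
-- Demo table (assumed for illustration)
create or replace table A (id int, val varchar);
insert into A values (1, 'x'), (2, 'y');

-- Zero-copy clone: B initially shares A's micro-partitions, nothing is copied
create or replace table B clone A;

-- New rows in B create micro-partitions owned by B only; A is unchanged
insert into B values (3, 'z');

select count(*) from A;  -- 2
select count(*) from B;  -- 3
```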
Question:
When we can schedule queries by using Tasks in Snowflake, why do we go for third-party
scheduling tools?
Answer:
By using Tasks we can schedule queries and monitor them through TASK_HISTORY, but that
monitoring is relatively difficult. Third-party scheduling tools offer UI-based monitoring and make it
very easy to control the job flow: holding jobs, releasing dependencies, cancelling or killing jobs, etc.
Some commonly used scheduling tools:
Control-M
Airflow
Ansible
TWS
Active Batch
JAMS Scheduler
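For reference, a hedged sketch of Task-based scheduling and monitoring (warehouse, task, and table names are assumptions):

```sql
-- A task that runs a statement on a schedule (created in suspended state)
create or replace task refresh_sales_task
  warehouse = my_wh
  schedule = 'USING CRON 0 2 * * * UTC'
as
  insert into sales_agg select dept, sum(sales) from sales group by dept;

-- Tasks must be resumed before the schedule takes effect
alter task refresh_sales_task resume;

-- Monitoring runs is done via the TASK_HISTORY table function --
-- this is the part that UI-based schedulers make easier
select *
from table(information_schema.task_history(task_name => 'REFRESH_SALES_TASK'))
order by scheduled_time desc;
```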
Q9: How can I convert my Teradata DDL to Snowflake
DDL?
We can't convert the DDL of hundreds of tables from a traditional database to Snowflake
manually, but we can use the ways below.
1. Use the roboquery tool, which can convert DDL from many databases to any other
database, including Snowflake; in the free version we can convert only 5 per day.
https://roboquery.com/app/
2. You can develop a Python script to convert all the DDL at once.
3. You can write a procedure in Snowflake to convert the DDLs to Snowflake.
Q10: What is the difference between Full Load and
Incremental/Delta Load? And how to choose?
In a full load, the complete data set is loaded every time: we delete/truncate the
target table data and load the new dataset.
In incremental loads we fetch only the data that was inserted/updated after the previous load
and load it to the target table using UPSERT operations (update existing records, insert new
records). We can pull incremental data from the source with the help of a
LAST_UPDATE_TIMESTAMP column and perform the UPSERT using key fields.
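The incremental pattern can be sketched with a Snowflake MERGE; all table and column names here are assumptions, and the watermark subquery assumes the target keeps the source's LAST_UPDATE_TIMESTAMP:

```sql
-- Upsert only the rows changed since the last load (hedged sketch)
merge into target_tbl t
using (
    select *
    from source_tbl
    where last_update_timestamp >
          (select max(last_update_timestamp) from target_tbl)
) s
on t.key_id = s.key_id
when matched then
    update set t.val = s.val,
               t.last_update_timestamp = s.last_update_timestamp
when not matched then
    insert (key_id, val, last_update_timestamp)
    values (s.key_id, s.val, s.last_update_timestamp);
```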
As Snowflake doesn't have its own cloud, accounts are hosted on other cloud
providers (AWS, Azure, GCP). This doesn't mean that if you host your account on
Azure you can only extract data from Azure cloud; you can extract the data from
any of these 3 clouds.
Q12: Write a Query to get below output from given
input.
Input:
+--------+-------------+---------+
| EMP_ID | FULL_NAME   | LANG    |
|--------+-------------+---------|
| 101    | Virat Kohli | English |
| 101    | Virat Kohli | Hindi   |
| 101    | Virat Kohli | Marathi |
| 102    | Mahesh Babu | English |
| 102    | Mahesh Babu | Telugu  |
| 102    | Mahesh Babu | Tamil   |
| 102    | Mahesh Babu | Hindi   |
| 103    | Janardhan   | Telugu  |
| 103    | Janardhan   | English |
+--------+-------------+---------+
Output:
+-------------+----------------------------+
| FULL_NAME   | LANG_SPEAK                 |
|-------------+----------------------------|
| Virat Kohli | English,Hindi,Marathi      |
| Mahesh Babu | English,Telugu,Tamil,Hindi |
| Janardhan   | Telugu,English             |
+-------------+----------------------------+
Query:
select FULL_NAME, LISTAGG(LANG, ',') as LANG_SPEAK
from EMP_LANG
group by FULL_NAME;
Q13: Write a Query to get below output from given
input.
Input:
id  val
1   A
1   B
1   C
2   X
2   Y
3   A
3   B
Output:
id  values
1   A|B|C
2   X|Y
3   A|B
Query:
select id, LISTAGG(val, '|') as "values"
from tab
group by id;
Queries to practice above 2 examples
create or replace table EMP_LANG(EMP_ID int, FULL_NAME varchar(50), LANG varchar(20));
insert into EMP_LANG values
(101, 'Virat Kohli', 'English'), (101, 'Virat Kohli', 'Hindi'), (101, 'Virat Kohli', 'Marathi'),
(102, 'Mahesh Babu', 'English'), (102, 'Mahesh Babu', 'Telugu'), (102, 'Mahesh Babu', 'Tamil'),
(102, 'Mahesh Babu', 'Hindi'), (103, 'Janardhan', 'Telugu'), (103, 'Janardhan', 'English');
select * from EMP_LANG;
select FULL_NAME, LISTAGG(LANG, ',') as LANG_SPEAK from EMP_LANG group by FULL_NAME;
-------------------------
create or replace table tab(id int, val varchar(10));
insert into tab values (1, 'A'), (1, 'B'), (1, 'C'), (2, 'X'), (2, 'Y'), (3, 'A'), (3, 'B');
select id, LISTAGG(val, '|') as "values" from tab group by id;
Q14: What are Slowly Changing Dimensions(SCD)
and how do you implement them in Snowflake?
SCDs are dimensions that change slowly over time. There are many types of SCDs,
from SCD Type 1 to Type 6, but we mostly use the 3 below.
SCD Type 1: Contains only current data (insert new records, overwrite old existing records).
SCD Type 2: Contains current + full history data (insert new records, logically expire the
existing record, load the newer version of the existing record).
SCD Type 3: Contains current + recent or limited history data (just maintain the current and
previous versions of records if available).
We can implement these SCDs in Snowflake by using Streams and Tasks. I have already
explained about Streams and Tasks in my videos.
And SCD implementation was explained clearly in below links with examples.
https://www.phdata.io/blog/implementing-slowly-changing-dimensions-in-snowflake/
https://community.snowflake.com/s/article/Building-a-Type-2-Slowly-Changing-Dimension-in-Snowflake-Using-Streams-and-Tasks-Part-1
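As a minimal SCD Type 2 sketch (not the full stream/task pipeline from the links; dimension, staging, and column names are assumptions):

```sql
-- 1. Logically expire the current version of records that changed
update dim_customer d
set end_date = current_date, is_current = false
from stg_customer s
where d.customer_id = s.customer_id
  and d.is_current
  and d.attr <> s.attr;

-- 2. Insert the newer version (also covers brand-new customers,
--    since their expired/changed rows are no longer marked current)
insert into dim_customer (customer_id, attr, start_date, end_date, is_current)
select s.customer_id, s.attr, current_date, null, true
from stg_customer s
left join dim_customer d
  on d.customer_id = s.customer_id and d.is_current
where d.customer_id is null;
```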
Q15: In which scenarios you have written stored
procedures and UDFs(User Defined Functions)?
We can write stored procedures:
1. When there is a need to execute multiple SQL statements in some order.
2. When the same logic must be applied multiple times, or in multiple places, with
different parameters.
3. To automate certain things.
Note: In Snowflake we can write procedures and UDFs in multiple languages, like SQL,
JavaScript, Python, Java and Scala.
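Two minimal SQL examples of these scenarios (function, procedure, table, and column names are illustrative assumptions):

```sql
-- Scenario 2: the same logic reused with different parameters, as a SQL UDF
create or replace function tax_amount(amount number, rate number)
returns number
as
$$
    amount * rate / 100
$$;

select tax_amount(1000, 18);  -- 180

-- Scenarios 1 and 3: several statements in order, packaged for automation,
-- as a Snowflake Scripting procedure
create or replace procedure purge_old_rows(tbl string, days int)
returns string
language sql
as
$$
begin
    execute immediate 'delete from ' || tbl ||
        ' where load_ts < dateadd(day, -' || days || ', current_timestamp())';
    return 'done';
end;
$$;
```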
Thank You
You can get all videos, PPTs, queries and files in my Udemy course at a
very low price. I will keep updating this content and uploading all new
videos in this course.
The link for this course and the discount coupon details are given in the
description of this video.