Assignment: Case Study - 1: Operation Analytics

This document contains two case studies: one on operation analytics and one on investigating a metric spike. For the first case study, it provides SQL queries to calculate jobs reviewed per hour, a 7-day rolling average of throughput, and the percentage share of jobs by language, and to identify duplicate rows. For the second case study, it provides SQL queries to calculate user engagement, user growth, weekly retention, engagement by device, and email engagement metrics.

Uploaded by

Prerna Bhandari

ASSIGNMENT

Case Study -1: Operation Analytics

• Points to be considered:
o What does the event mean? What to consider for reviewing?
o Candidate should spend some time understanding the table

QA: Calculate the number of jobs reviewed per hour per day for November 2020?
QB: Let’s say the above metric is called throughput. Calculate the 7-day rolling
average of throughput. For throughput, do you prefer the daily metric or the 7-day
rolling average, and why?
QC: Calculate the percentage share of each language in the last 30 days?
QD: Let’s say you see some duplicate rows in the data. How will you display
duplicates from the table?

QA.
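SELECT
    ds,
    -- jobs per hour = jobs reviewed * 3600 / seconds spent (time_spent is assumed to be in seconds)
    ROUND(1.0 * COUNT(job_id) * 3600 / SUM(time_spent), 2) AS throughput
FROM
    job_data
WHERE
    event IN ('transfer', 'decision')
    AND ds BETWEEN '2020-11-01' AND '2020-11-30'
GROUP BY
    ds
ORDER BY
    ds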

QB.
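WITH CTE AS (
    -- daily totals for November 2020
    SELECT
        ds,
        COUNT(job_id) AS num_jobs,
        SUM(time_spent) AS total_time
    FROM
        job_data
    WHERE
        event IN ('transfer', 'decision')
        AND ds BETWEEN '2020-11-01' AND '2020-11-30'
    GROUP BY
        ds
)
SELECT
    ds,
    -- jobs over the current day and the 6 preceding days, divided by the time spent
    -- over the same window; multiplied by 3600 so the unit is jobs per hour, as in QA
    ROUND(3600.0
          * SUM(num_jobs) OVER (ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
          / SUM(total_time) OVER (ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW),
          2) AS throughput_7d
FROM
    CTE
ORDER BY
    ds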
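
The question of daily versus 7-day rolling throughput can be explored on toy data. The sketch below is only an illustration: the function name rolling_throughput and the three sample rows are made up, and time is assumed to be recorded in seconds. It computes the rolling value the same way as the query above (windowed sum of jobs divided by windowed sum of time), which is not the same as averaging the daily ratios.

```python
def rolling_throughput(daily, window=7):
    """Return (ds, daily_jobs_per_hour, rolling_jobs_per_hour) per day.

    daily: list of (ds, num_jobs, total_seconds) tuples, sorted by ds.
    Assumes total_seconds is greater than zero for every day.
    """
    out = []
    for i, (ds, jobs, secs) in enumerate(daily):
        # Current row plus up to (window - 1) preceding rows,
        # like ROWS BETWEEN 6 PRECEDING AND CURRENT ROW for window=7.
        win = daily[max(0, i - window + 1): i + 1]
        win_jobs = sum(r[1] for r in win)
        win_secs = sum(r[2] for r in win)
        daily_tp = round(3600.0 * jobs / secs, 2)
        rolling_tp = round(3600.0 * win_jobs / win_secs, 2)
        out.append((ds, daily_tp, rolling_tp))
    return out


# Hypothetical toy rows: (ds, num_jobs, total_seconds)
toy = [
    ("2020-11-01", 10, 3600),
    ("2020-11-02", 2, 3600),
    ("2020-11-03", 30, 3600),
]
rows = rolling_throughput(toy)
for row in rows:
    print(row)
```

On these toy rows the daily value swings between 2 and 30 jobs per hour while the rolling value moves far less. That smoothing is one commonly cited reason to prefer the rolling metric for trend monitoring, with the daily metric kept for spotting single-day anomalies.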

QC.
WITH CTE AS (
    -- jobs per language
    SELECT
        language,
        COUNT(job_id) AS num_jobs
    FROM
        job_data
    WHERE
        event IN ('transfer', 'decision')
        AND ds BETWEEN '2020-11-01' AND '2020-11-30'
    GROUP BY
        language
),
total AS (
    -- overall job count: a single row, so no GROUP BY here
    SELECT
        COUNT(job_id) AS total_jobs
    FROM
        job_data
    WHERE
        event IN ('transfer', 'decision')
        AND ds BETWEEN '2020-11-01' AND '2020-11-30'
)
SELECT
    language,
    ROUND(100.0 * num_jobs / total_jobs, 2) AS perc_jobs
FROM
    CTE
CROSS JOIN
    total
ORDER BY
    perc_jobs DESC
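
The percentage-share arithmetic can be checked by hand on a small list. The sketch below is an illustration only: language_share and the sample list are hypothetical. Each language count is divided by one overall total, which is why the total in the query should be a single row.

```python
from collections import Counter


def language_share(languages):
    """Return [(language, percent_of_jobs)] sorted by share, largest first."""
    counts = Counter(languages)
    total = sum(counts.values())  # one overall total, shared by every language
    shares = [(lang, round(100.0 * n / total, 2)) for lang, n in counts.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy input: one entry per job
toy_languages = ["Persian"] * 3 + ["English"] * 2 + ["Arabic"]
shares = language_share(toy_languages)
print(shares)
```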

QD.
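WITH CTE AS (
    SELECT
        *,
        -- number the rows that share the same ds, job_id and actor_id
        ROW_NUMBER() OVER (PARTITION BY ds, job_id, actor_id ORDER BY ds) AS rownum
    FROM
        job_data
)
-- the question asks to display the duplicates, so select them rather than delete them
SELECT
    *
FROM
    CTE
WHERE
    rownum > 1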
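
Another way to display duplicates is to group on the identifying columns and keep the groups that occur more than once. The sketch below runs that idea against an in-memory SQLite table. It is a swapped-in technique (GROUP BY with HAVING instead of ROW_NUMBER), and the column types and sample rows are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names follow the queries above; the types are assumed.
conn.execute(
    "CREATE TABLE job_data ("
    "ds TEXT, job_id INTEGER, actor_id INTEGER, "
    "event TEXT, language TEXT, time_spent INTEGER)"
)

# Hypothetical toy rows; the last two are exact duplicates.
toy_rows = [
    ("2020-11-30", 21, 1001, "skip", "English", 15),
    ("2020-11-30", 22, 1006, "transfer", "Arabic", 25),
    ("2020-11-29", 23, 1003, "decision", "Persian", 20),
    ("2020-11-29", 23, 1003, "decision", "Persian", 20),
]
conn.executemany("INSERT INTO job_data VALUES (?, ?, ?, ?, ?, ?)", toy_rows)

# One output row per duplicated key, with the number of copies found.
dupes = conn.execute(
    """
    SELECT ds, job_id, actor_id, COUNT(*) AS copies
    FROM job_data
    GROUP BY ds, job_id, actor_id
    HAVING COUNT(*) > 1
    """
).fetchall()
print(dupes)
conn.close()
```

This form reports each duplicated key once with a count, while the ROW_NUMBER form lists every extra copy as its own row. Which one fits better depends on whether the goal is an audit summary or a list of rows to clean up.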

Case Study - 2: Investigating Metric Spike


QA: Calculate the weekly user engagement?
QB: Calculate the user growth for the product?
QC: Calculate the weekly retention of the user sign-up cohort?
QD: Calculate the weekly engagement per device?
QE: Calculate the email engagement metrics?

QA.
-- distinct users with at least one login event, per calendar week
SELECT DATE_TRUNC('week', e.occurred_at) AS week,
       COUNT(DISTINCT e.user_id) AS weekly_active_users
  FROM events e
 WHERE e.event_type = 'engagement'
   AND e.event_name = 'login'
 GROUP BY 1
 ORDER BY 1

QB.
-- new accounts per day, and how many of them were activated
SELECT DATE_TRUNC('day', u.created_at) AS day,
       COUNT(*) AS all_users,
       COUNT(CASE WHEN u.activated_at IS NOT NULL THEN u.user_id ELSE NULL END) AS activated_users
  FROM users u
 WHERE u.created_at >= '2021-04-01'
   AND u.created_at < '2021-04-30'  -- note: this upper bound excludes 2021-04-30 itself
 GROUP BY 1
 ORDER BY 1

QC.
SELECT DATE_TRUNC('week', z.occurred_at) AS "week",
       AVG(z.age_at_event) AS "Average age during week",
       -- user_age is in whole days; >= 70 keeps day 70 from falling between buckets
       COUNT(DISTINCT CASE WHEN z.user_age >= 70 THEN z.user_id ELSE NULL END) AS "10+ weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 70 AND z.user_age >= 63 THEN z.user_id ELSE NULL END) AS "9 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 63 AND z.user_age >= 56 THEN z.user_id ELSE NULL END) AS "8 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 56 AND z.user_age >= 49 THEN z.user_id ELSE NULL END) AS "7 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 49 AND z.user_age >= 42 THEN z.user_id ELSE NULL END) AS "6 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 42 AND z.user_age >= 35 THEN z.user_id ELSE NULL END) AS "5 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 35 AND z.user_age >= 28 THEN z.user_id ELSE NULL END) AS "4 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 28 AND z.user_age >= 21 THEN z.user_id ELSE NULL END) AS "3 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 21 AND z.user_age >= 14 THEN z.user_id ELSE NULL END) AS "2 weeks",
       COUNT(DISTINCT CASE WHEN z.user_age < 14 AND z.user_age >= 7 THEN z.user_id ELSE NULL END) AS "1 week",
       COUNT(DISTINCT CASE WHEN z.user_age < 7 THEN z.user_id ELSE NULL END) AS "Less than a week"
  FROM (
        -- one row per login event, with the user's age in days at the event
        -- and at the end of the analysis window
        SELECT e.occurred_at,
               u.user_id,
               DATE_TRUNC('week', u.activated_at) AS activation_week,
               EXTRACT('day' FROM e.occurred_at - u.activated_at) AS age_at_event,
               EXTRACT('day' FROM '2014-09-01'::TIMESTAMP - u.activated_at) AS user_age
          FROM tutorial.yammer_users u
          JOIN tutorial.yammer_events e
            ON e.user_id = u.user_id
           AND e.event_type = 'engagement'
           AND e.event_name = 'login'
           AND e.occurred_at >= '2014-05-01'
           AND e.occurred_at < '2014-09-01'
         WHERE u.activated_at IS NOT NULL
       ) z
 GROUP BY 1
 ORDER BY 1
 LIMIT 100
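
The cohort buckets in this query are seven-day bands of user age. The sketch below restates that bucketing as a small function so the boundaries can be checked directly. The name age_bucket is hypothetical, the labels are modeled on the column aliases, and an age of exactly 70 days is assumed to belong to the "10+ weeks" bucket.

```python
def age_bucket(user_age_days):
    """Map a user's age in whole days to a weekly cohort label."""
    if user_age_days < 7:
        return "Less than a week"
    weeks = user_age_days // 7  # 7-13 days -> 1, 14-20 days -> 2, ...
    if weeks >= 10:
        return "10+ weeks"
    return "1 week" if weeks == 1 else f"{weeks} weeks"


# Boundary values on both sides of each cut-off
for days in (0, 6, 7, 13, 14, 62, 63, 69, 70, 200):
    print(days, age_bucket(days))
```

Checking the values on either side of each cut-off (6 and 7, 69 and 70) is a quick way to confirm that no age falls outside every bucket.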

QD.
-- weekly active users overall and split by device category
SELECT DATE_TRUNC('week', e.occurred_at) AS week,
       COUNT(DISTINCT e.user_id) AS weekly_active_users,
       COUNT(DISTINCT CASE WHEN e.device IN ('macbook pro', 'lenovo thinkpad', 'macbook air',
                                             'dell inspiron notebook', 'asus chromebook',
                                             'dell inspiron desktop', 'acer aspire notebook',
                                             'hp pavilion desktop', 'acer aspire desktop', 'mac mini')
                           THEN e.user_id ELSE NULL END) AS computer,
       COUNT(DISTINCT CASE WHEN e.device IN ('iphone 5', 'samsung galaxy s4', 'nexus 5',
                                             'iphone 5s', 'iphone 4s', 'nokia lumia 635',
                                             'htc one', 'samsung galaxy note', 'amazon fire phone')
                           THEN e.user_id ELSE NULL END) AS phone,
       COUNT(DISTINCT CASE WHEN e.device IN ('ipad air', 'nexus 7', 'ipad mini', 'nexus 10',
                                             'kindle fire', 'windows surface', 'samsung galaxy tablet')
                           THEN e.user_id ELSE NULL END) AS tablet
  FROM events e
 WHERE e.event_type = 'engagement'
   AND e.event_name = 'login'
 GROUP BY 1
 ORDER BY 1
 LIMIT 100

QE.
-- weekly counts of each email action
-- (the action values are assumed to use underscores, like the identifiers;
--  adjust them to match the values stored in the table)
SELECT DATE_TRUNC('week', e.occurred_at) AS week,
       COUNT(CASE WHEN e.action = 'sent_weekly_digest' THEN e.user_id ELSE NULL END) AS weekly_emails,
       COUNT(CASE WHEN e.action = 'sent_reengagement_email' THEN e.user_id ELSE NULL END) AS reengagement_emails,
       COUNT(CASE WHEN e.action = 'email_open' THEN e.user_id ELSE NULL END) AS email_opens,
       COUNT(CASE WHEN e.action = 'email_clickthrough' THEN e.user_id ELSE NULL END) AS email_clickthroughs
  FROM email_events e
 GROUP BY 1
 ORDER BY 1
