Metric Spike.

Operation Analytics provides insights into a company's entire operations process from start to finish. As a data analyst, the role involves working with operations, support, and marketing teams to analyze collected data and gain insights. This type of analysis can predict a company's overall growth or decline. As a result, workflows become more effective, cross-functional teams are more aware, and automation is more efficient.

Uploaded by

param

Project Description:

Operation Analytics provides an overview of a company's operations from start to finish. Based on this, the company determines what needs to be improved. The data analyst's role involves working closely with the operations, support, and marketing teams to gain insights from the data they collect. This type of analysis can also be used to predict the overall growth or decline of a company. As a result, workflows become more effective, cross-functional teams become more aware, and automation becomes more efficient.

Tech Stack: MySQL Workbench

Approach:
I built a clear understanding of the data and prepared it for analysis.
I created the schema and tables in MySQL from the given datasets.
I defined the problem statements and set up the business context needed to answer them.
As a data analyst, I carried out operational analytics to surface insights that help teams make better decisions, and investigated a metric spike to answer business questions.
Case study 1 (Job Data)
Creating Schema:
Create database METRIC_SPIKE;

Creating job_data table:


CREATE TABLE job_data (
    ds VARCHAR(100),
    job_id INT NOT NULL,
    actor_id VARCHAR(200),
    event VARCHAR(200),
    language VARCHAR(200),
    time_spent INT,
    org VARCHAR(200)
);

Performing Analysis:
Question-1

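Number of jobs reviewed: the number of jobs reviewed over time.

Task: Calculate the number of jobs reviewed per hour per day for November 2020.

Answer:

-- time_spent is assumed to be in seconds, hence the factor of 3600.
-- ds is stored as text; the date literals used here imply a dd-mm-yyyy format,
-- so it is converted to a DATE before filtering (a plain text comparison
-- would also match days from other months).
SELECT ds,
       ROUND(COUNT(job_id) * 3600.0 / SUM(time_spent), 2) AS throughput
FROM job_data
WHERE event IN ('transfer', 'decision')
  AND STR_TO_DATE(ds, '%d-%m-%Y') BETWEEN '2020-11-01' AND '2020-11-30'
GROUP BY ds;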
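As a quick sanity check of the jobs-per-hour arithmetic, the same calculation can be run on a few made-up rows. This is only a sketch: it uses Python's built-in sqlite3 module in place of MySQL, the sample rows are hypothetical, and ds is assumed to hold ISO dates (yyyy-mm-dd) so that a plain text BETWEEN filters correctly.

```python
import sqlite3

# Hypothetical sample rows: (ds, job_id, event, time_spent in seconds).
rows = [
    ("2020-11-29", 21, "decision", 30),
    ("2020-11-29", 22, "transfer", 60),
    ("2020-11-30", 23, "decision", 20),
    ("2020-11-30", 24, "skip", 40),  # filtered out by the event condition
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_data (ds TEXT, job_id INTEGER, event TEXT, time_spent INTEGER)"
)
conn.executemany("INSERT INTO job_data VALUES (?, ?, ?, ?)", rows)

# jobs per hour = number of jobs * 3600 / total seconds spent on them
query = """
    SELECT ds,
           ROUND(COUNT(job_id) * 3600.0 / SUM(time_spent), 2) AS jobs_per_hour
    FROM job_data
    WHERE event IN ('transfer', 'decision')
      AND ds BETWEEN '2020-11-01' AND '2020-11-30'
    GROUP BY ds
    ORDER BY ds
"""
jobs_per_hour = dict(conn.execute(query).fetchall())
print(jobs_per_hour)  # {'2020-11-29': 80.0, '2020-11-30': 180.0}
```

On 29 November two jobs took 90 seconds in total, so 2 * 3600 / 90 = 80 jobs per hour.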
Question-2

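Throughput: the number of events happening per second.

Task: Let's say the above metric is called throughput. Calculate the 7-day rolling average of throughput.

Answer:

-- ds is converted to a DATE (dd-mm-yyyy format assumed from the literals used)
-- so that filtering and window ordering follow calendar order.
-- The window covers the current row plus the six preceding rows, which equals
-- seven calendar days only when no day is missing from the data.
WITH job_count AS (
    SELECT STR_TO_DATE(ds, '%d-%m-%Y') AS job_date,
           COUNT(job_id) AS num_jobs,
           SUM(time_spent) AS total_time
    FROM job_data
    WHERE event IN ('transfer', 'decision')
      AND STR_TO_DATE(ds, '%d-%m-%Y') BETWEEN '2020-11-01' AND '2020-11-30'
    GROUP BY job_date
)
SELECT job_date,
       ROUND(1.0 * SUM(num_jobs) OVER w / SUM(total_time) OVER w, 2) AS throughput_7d
FROM job_count
WINDOW w AS (ORDER BY job_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW);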
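The rolling calculation divides the sum of jobs over the trailing window by the sum of time over the same window, rather than averaging the daily ratios. A minimal pure-Python sketch of that logic follows; the daily figures and the helper name rolling_throughput are hypothetical.

```python
from collections import deque


def rolling_throughput(daily, window=7):
    """Return (day, jobs per second) over a trailing window of `window` rows."""
    recent = deque(maxlen=window)  # keeps only the last `window` daily rows
    result = []
    for day, num_jobs, total_time in daily:
        recent.append((num_jobs, total_time))
        jobs = sum(j for j, _ in recent)
        seconds = sum(t for _, t in recent)
        result.append((day, round(jobs / seconds, 2)))
    return result


# Hypothetical daily aggregates: one job taking 50 s on days 1-7,
# then eight jobs taking 50 s on day 8.
daily = [(f"2020-11-{d:02d}", 1, 50) for d in range(1, 8)]
daily.append(("2020-11-08", 8, 50))

rolling = rolling_throughput(daily)
print(rolling[0])   # ('2020-11-01', 0.02)
print(rolling[-1])  # ('2020-11-08', 0.04)
```

On day 8 the window holds days 2 to 8: 14 jobs over 350 seconds, or 0.04 jobs per second. Day 1 has dropped out of the window.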
Question-3

Percentage share of each language: the share of each language across the content.

Task: Calculate the percentage share of each language in the last 30 days.

Answer:

-- November 2020 is used as the 30-day window.
-- ds is stored as text in an assumed dd-mm-yyyy format, so it is converted before filtering.
WITH per_language AS (
    SELECT language,
           COUNT(job_id) AS num_jobs
    FROM job_data
    WHERE event IN ('transfer', 'decision')
      AND STR_TO_DATE(ds, '%d-%m-%Y') BETWEEN '2020-11-01' AND '2020-11-30'
    GROUP BY language
),
-- A single-row total: no GROUP BY here, otherwise the cross join would multiply rows.
job_total AS (
    SELECT COUNT(job_id) AS total_jobs
    FROM job_data
    WHERE event IN ('transfer', 'decision')
      AND STR_TO_DATE(ds, '%d-%m-%Y') BETWEEN '2020-11-01' AND '2020-11-30'
)
SELECT language,
       ROUND(100.0 * num_jobs / total_jobs, 2) AS perc_jobs
FROM per_language
CROSS JOIN job_total
ORDER BY perc_jobs DESC;
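The percentage-share pattern pairs a per-language count with a single-row grand total through a cross join. The sketch below runs that pattern on hypothetical rows, using Python's built-in sqlite3 module in place of MySQL; the date and event filters are left out to keep the example short.

```python
import sqlite3

# Hypothetical sample rows: (job_id, language).
rows = [(1, "English"), (2, "Persian"), (3, "Persian"), (4, "Arabic")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_data (job_id INTEGER, language TEXT)")
conn.executemany("INSERT INTO job_data VALUES (?, ?)", rows)

query = """
    WITH per_language AS (
        SELECT language, COUNT(job_id) AS num_jobs
        FROM job_data
        GROUP BY language
    ),
    job_total AS (
        -- exactly one row, so the cross join adds a column without adding rows
        SELECT COUNT(job_id) AS total_jobs
        FROM job_data
    )
    SELECT language,
           ROUND(100.0 * num_jobs / total_jobs, 2) AS perc_jobs
    FROM per_language
    CROSS JOIN job_total
    ORDER BY perc_jobs DESC, language
"""
shares = conn.execute(query).fetchall()
print(shares)  # [('Persian', 50.0), ('Arabic', 25.0), ('English', 25.0)]
```

The shares add up to 100 because every counted job appears in exactly one language group.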


Question-4

Duplicate rows: rows in which every column holds the same values as another row.

Task: Let's say you see some duplicate rows in the data. How will you display duplicates from the table?

Answer:

-- De-duplicated view of the table, for comparison.
SELECT DISTINCT *
FROM job_data;

-- Number the rows within each group of identical rows;
-- any row numbered above 1 is a repeat of an earlier row.
WITH duplicates AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ds, job_id, actor_id, event, language, time_spent, org
           ) AS row_num
    FROM job_data
)
SELECT *
FROM duplicates
WHERE row_num > 1;
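A different way to look at duplicates is to group on every column and keep only the groups that occur more than once. This returns one row per duplicated group together with its number of copies, rather than each extra copy. The sketch below uses Python's built-in sqlite3 module in place of MySQL; the sample rows and the copies alias are hypothetical.

```python
import sqlite3

# Hypothetical sample rows; the first three are identical.
rows = [
    ("2020-11-28", 23, "1005", "transfer", "Persian", 22, "D"),
    ("2020-11-28", 23, "1005", "transfer", "Persian", 22, "D"),
    ("2020-11-28", 23, "1005", "transfer", "Persian", 22, "D"),
    ("2020-11-29", 25, "1002", "decision", "Hindi", 11, "B"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE job_data (
           ds TEXT, job_id INTEGER, actor_id TEXT, event TEXT,
           language TEXT, time_spent INTEGER, org TEXT)"""
)
conn.executemany("INSERT INTO job_data VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

query = """
    SELECT ds, job_id, actor_id, event, language, time_spent, org,
           COUNT(*) AS copies
    FROM job_data
    GROUP BY ds, job_id, actor_id, event, language, time_spent, org
    HAVING COUNT(*) > 1
"""
duplicates = conn.execute(query).fetchall()
print(duplicates)
```

With these rows the query reports a single group with copies equal to 3. A ROW_NUMBER() approach filtered on row_num > 1 would list the two extra copies instead.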


Case Study 2 (Investigating metric spike)
Creating/Using Schema:

USE METRIC_SPIKE;

Creating tables:

Users table (created first, because the events table references it)

CREATE TABLE users (
    user_id INT NOT NULL,
    created_at DATE,
    company_id INT,
    language VARCHAR(200),
    activated_at DATE,
    state VARCHAR(200),
    PRIMARY KEY (user_id)
);

Events table

CREATE TABLE events (
    user_id INT NOT NULL,
    occurred_at DATE,
    event_type VARCHAR(200),
    event_name VARCHAR(200),
    location VARCHAR(200),
    device VARCHAR(200),
    user_type INT,
    FOREIGN KEY (user_id) REFERENCES users (user_id)
);


Email events table

CREATE TABLE email_events (
    user_id INT NOT NULL,
    occurred_at VARCHAR(200),
    action VARCHAR(200),
    user_type INT
);

Question-1

User Engagement: measures how active users are, as an indication of whether they find value in a product/service.

Task: Calculate the weekly user engagement.

Answer:

-- Distinct users who logged in at least once in each week.
SELECT EXTRACT(WEEK FROM e.occurred_at) AS week,
       COUNT(DISTINCT e.user_id) AS active_users
FROM events e
WHERE e.event_type = 'engagement'
  AND e.event_name = 'login'
GROUP BY 1
ORDER BY 1;
Question-2

User Growth: the number of users added over time for a product.

Task: Calculate the user growth for the product.

Answer:

-- New sign-ups per day, and how many of them have been activated.
SELECT EXTRACT(DAY FROM u.created_at) AS day,
       COUNT(*) AS all_users,
       COUNT(CASE WHEN u.activated_at IS NOT NULL THEN u.user_id END) AS activated_users
FROM users u
WHERE u.created_at BETWEEN '2021-04-01' AND '2021-04-30'
GROUP BY day
ORDER BY day;

Question-3

Weekly Retention: users who keep using a product week after week following sign-up.

Task: Calculate the weekly retention of users by sign-up cohort.

Answer:

-- user_age is the number of days between activation and 2014-09-01;
-- each cohort column counts the distinct users in one 7-day band of user_age.
SELECT EXTRACT(WEEK FROM z.occurred_at) AS `week`,
       AVG(z.age_at_event) AS average_age_during_week,
       COUNT(DISTINCT CASE WHEN z.user_age >= 70 THEN z.user_id END) AS `10+ weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 70 AND z.user_age >= 63 THEN z.user_id END) AS `9 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 63 AND z.user_age >= 56 THEN z.user_id END) AS `8 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 56 AND z.user_age >= 49 THEN z.user_id END) AS `7 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 49 AND z.user_age >= 42 THEN z.user_id END) AS `6 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 42 AND z.user_age >= 35 THEN z.user_id END) AS `5 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 35 AND z.user_age >= 28 THEN z.user_id END) AS `4 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 28 AND z.user_age >= 21 THEN z.user_id END) AS `3 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 21 AND z.user_age >= 14 THEN z.user_id END) AS `2 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 14 AND z.user_age >= 7 THEN z.user_id END) AS `1 weeks`,
       COUNT(DISTINCT CASE WHEN z.user_age < 7 THEN z.user_id END) AS `Less than a week`
FROM (
    SELECT e.occurred_at,
           u.user_id,
           EXTRACT(WEEK FROM u.activated_at) AS activation_week,
           DATEDIFF(e.occurred_at, u.activated_at) AS age_at_event,
           DATEDIFF('2014-09-01', u.activated_at) AS user_age
    FROM users u
    JOIN events e
      ON e.user_id = u.user_id
     AND e.event_type = 'engagement'
     AND e.event_name = 'login'
     AND e.occurred_at >= '2014-05-01'
     AND e.occurred_at < '2014-09-01'
    WHERE u.activated_at IS NOT NULL
) AS z
GROUP BY `week`
ORDER BY `week`
LIMIT 100;
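The cohort columns in the retention query are intended as 7-day bands of user_age, which amounts to an integer division by 7 with a cap at ten weeks. A small Python sketch of that mapping follows; the function name retention_bucket is hypothetical, and the labels follow the column aliases used in the query.

```python
def retention_bucket(user_age_days: int) -> str:
    """Map a user's age in days to the cohort label used as a column alias."""
    weeks = user_age_days // 7  # whole weeks since activation
    if weeks < 1:
        return "Less than a week"
    if weeks >= 10:
        return "10+ weeks"
    return f"{weeks} weeks"


labels = [retention_bucket(age) for age in (0, 6, 7, 20, 21, 69, 70, 120)]
print(labels)
```

Checking the band edges this way (6 versus 7 days, 69 versus 70 days) is a cheap way to confirm that no age falls between two CASE conditions or into two of them at once.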

Question-4

Weekly Engagement: measures how active users are week by week, as an indication of whether they find value in a product/service.

Task: Calculate the weekly engagement per device.

Answer:

-- Weekly active users overall and split by device category.
SELECT EXTRACT(WEEK FROM e.occurred_at) AS week,
       COUNT(DISTINCT e.user_id) AS weekly_active_users,
       COUNT(DISTINCT CASE WHEN e.device IN (
                 'macbook pro', 'lenovo thinkpad', 'macbook air', 'dell inspiron notebook',
                 'asus', 'chromebook', 'dell inspiron desktop', 'acer aspire notebook',
                 'hp pavilion desktop', 'acer aspire desktop', 'mac mini')
             THEN e.user_id END) AS computer,
       COUNT(DISTINCT CASE WHEN e.device IN (
                 'iphone 5', 'samsung galaxy', 'nexus 5', 'iphone 5s', 'iphone 4s',
                 'nokia lumia 635', 'htc', 'htc1', 'samsung galaxy note', 'amazon fire phone')
             THEN e.user_id END) AS phone,
       COUNT(DISTINCT CASE WHEN e.device IN (
                 'ipad air', 'nexus 7', 'ipad mini', 'nexus 10',
                 'kindle fire', 'windows surface', 'samsung galaxy tablet')
             THEN e.user_id END) AS tablet
FROM events e
WHERE e.event_type = 'engagement'
  AND e.event_name = 'login'
GROUP BY week
ORDER BY week
LIMIT 100;

Question-5

Email Engagement: users engaging with the email service.

Task: Calculate the email engagement metrics.

Answer:

-- Weekly counts of each email action.
SELECT EXTRACT(WEEK FROM e.occurred_at) AS week,
       COUNT(CASE WHEN e.action = 'sent weekly digest' THEN e.user_id END) AS weekly_emails,
       COUNT(CASE WHEN e.action = 'sent reengagement email' THEN e.user_id END) AS reengagement_emails,
       COUNT(CASE WHEN e.action = 'email open' THEN e.user_id END) AS email_opens,
       COUNT(CASE WHEN e.action = 'email clickthrough' THEN e.user_id END) AS email_clickthroughs
FROM email_events e
GROUP BY 1;
