Metric Spike.
Approach:
I first built a clear understanding of the data and prepared it for analysis,
created the schema and tables in MySQL from the given datasets,
defined the problem statements, and set up the business context needed to answer them.
As a data analyst, I undertook operational analytics to discover insights that
help teams make better decisions, and investigated metric spikes to answer
business questions.
Case study 1 (Job Data)
Creating Schema:
Create database METRIC_SPIKE;
Performing Analysis:
Question-1
task: Calculate the number of jobs reviewed per hour per day for November 2020?
Answer:
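SELECT ds,
       -- Assumes time_spent is in seconds: 3600 * jobs / total seconds gives jobs per hour.
       ROUND(3600.0 * COUNT(job_id) / SUM(time_spent), 2) AS jobs_per_hour
FROM job_data
GROUP BY ds;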
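To sanity-check the per-day throughput calculation, here is a minimal sketch that runs the same kind of query on a few made-up rows using Python's built-in sqlite3 module instead of MySQL. The sample rows, the jobs_per_hour alias, and the assumption that time_spent is measured in seconds are illustrative and not taken from the dataset.

```python
import sqlite3

# Hypothetical sample rows: (ds, job_id, time_spent in seconds).
# The values are invented for illustration only.
rows = [
    ("2020-11-28", 21, 30),
    ("2020-11-28", 22, 90),
    ("2020-11-29", 23, 20),
    ("2020-11-30", 24, 40),
    ("2020-11-30", 25, 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_data (ds TEXT, job_id INTEGER, time_spent INTEGER)")
conn.executemany("INSERT INTO job_data VALUES (?, ?, ?)", rows)

# Jobs per hour = jobs reviewed * 3600 / total seconds spent that day
# (assumes time_spent is recorded in seconds).
query = """
SELECT ds,
       ROUND(COUNT(job_id) * 3600.0 / SUM(time_spent), 2) AS jobs_per_hour
FROM job_data
GROUP BY ds
ORDER BY ds
"""
throughput = conn.execute(query).fetchall()
for ds, jobs_per_hour in throughput:
    print(ds, jobs_per_hour)
```

On 2020-11-28 the two sample jobs took 120 seconds in total, so the result for that day is 2 * 3600 / 120 = 60 jobs per hour.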
Question-2
task: Let’s say the above metric is called throughput. Calculate the 7-day rolling average of throughput.
Answer:
WITH job_count AS (
    SELECT ds,
           COUNT(job_id) AS num_jobs,
           SUM(time_spent) AS total_time
    FROM job_data
    WHERE event IN ('transfer', 'decision')
      AND ds BETWEEN '01-11-2020' AND '30-11-2020'
    GROUP BY ds
)
SELECT ds,
       ROUND(1.0 * SUM(num_jobs) OVER (ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
                 / SUM(total_time) OVER (ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2) AS 7d_throughput
FROM job_count;
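To make the rolling window concrete, here is a minimal sketch on made-up daily totals, again using Python's built-in sqlite3 module (which supports window functions from SQLite 3.25 onward) instead of MySQL. The job_count rows, the throughput_7d alias, and the factor 3600 (which assumes time_spent is in seconds) are illustrative assumptions.

```python
import sqlite3

# Hypothetical daily totals: (ds, num_jobs, total_time in seconds).
daily = [
    ("2020-11-01", 2, 120),
    ("2020-11-02", 1, 60),
    ("2020-11-03", 3, 60),
    ("2020-11-04", 2, 120),
    ("2020-11-05", 2, 60),
    ("2020-11-06", 1, 30),
    ("2020-11-07", 1, 30),
    ("2020-11-08", 4, 60),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_count (ds TEXT, num_jobs INTEGER, total_time INTEGER)")
conn.executemany("INSERT INTO job_count VALUES (?, ?, ?)", daily)

# Each output row divides the jobs summed over the current row and the six
# rows before it by the time summed over the same frame.
query = """
SELECT ds,
       ROUND(SUM(num_jobs) OVER (ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
             * 3600.0
             / SUM(total_time) OVER (ORDER BY ds ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2)
           AS throughput_7d
FROM job_count
ORDER BY ds
"""
rolling = dict(conn.execute(query).fetchall())
for ds, value in rolling.items():
    print(ds, value)
```

On 2020-11-08 the daily figure alone would be 4 * 3600 / 60 = 240 jobs per hour, while the rolling figure over 2020-11-02 to 2020-11-08 is 14 * 3600 / 420 = 120, which shows how the window smooths a one-day spike. Note that ROWS BETWEEN counts rows, not calendar days, so a date with no jobs makes the window span more than seven days.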
Question-3
Percentage share of each language: Share of each language for different contents.
task: Calculate the percentage share of each language in the last 30 days?
Answer:
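WITH per_language AS (
    SELECT language,
           COUNT(job_id) AS num_jobs
    FROM job_data
    GROUP BY language
),
job_total AS (
    -- Single-row total, so it can be cross joined onto every language row.
    SELECT COUNT(job_id) AS total_jobs
    FROM job_data
)
SELECT language,
       ROUND(100.0 * num_jobs / total_jobs, 2) AS perc_jobs
FROM per_language
CROSS JOIN job_total;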
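As a cross-check on the percentage-share logic, here is a minimal sketch on made-up rows using Python's built-in sqlite3 module. It is a variation, not the query from this report: it gets the grand total with a window function, SUM(COUNT(*)) OVER (), instead of a separate total. The sample rows are hypothetical, and no 30-day filter is applied.

```python
import sqlite3

# Hypothetical sample rows: (job_id, language).
rows = [
    (21, "English"),
    (22, "Arabic"),
    (23, "Persian"),
    (25, "Persian"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_data (job_id INTEGER, language TEXT)")
conn.executemany("INSERT INTO job_data VALUES (?, ?)", rows)

# COUNT(*) is the per-language count; SUM(COUNT(*)) OVER () adds those
# counts across all groups, giving the grand total on every output row.
query = """
SELECT language,
       ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS perc_jobs
FROM job_data
GROUP BY language
ORDER BY language
"""
shares = conn.execute(query).fetchall()
for language, perc_jobs in shares:
    print(language, perc_jobs)
```

With four sample jobs, the two Persian rows account for 50 percent and the other two languages for 25 percent each, so the shares sum to 100.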
Question-4
Duplicate rows: Rows that contain the same values as another row.
task: Let’s say you see some duplicate rows in the data. How will you display the duplicates
from the table?
Answer:
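-- Rows with exact duplicates collapsed.
SELECT DISTINCT *
FROM job_data;

-- Assumes rows sharing a job_id count as duplicates; partition by every
-- column instead to flag only fully identical rows.
WITH duplicates AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY job_id) AS row_num
    FROM job_data
)
SELECT *
FROM duplicates
WHERE row_num > 1;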
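Two common ways to surface duplicates can be compared on made-up rows; the sketch below uses Python's built-in sqlite3 module instead of MySQL. The sample rows and the choice to treat a row as a duplicate only when every listed column matches are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample rows: (ds, job_id, event, language).
rows = [
    ("2020-11-28", 23, "decision", "Persian"),
    ("2020-11-28", 23, "decision", "Persian"),
    ("2020-11-29", 24, "transfer", "English"),
    ("2020-11-30", 25, "skip", "Arabic"),
    ("2020-11-30", 25, "skip", "Arabic"),
    ("2020-11-30", 25, "skip", "Arabic"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_data (ds TEXT, job_id INTEGER, event TEXT, language TEXT)")
conn.executemany("INSERT INTO job_data VALUES (?, ?, ?, ?)", rows)

# Approach 1: group on every column and keep groups seen more than once.
duplicate_groups = conn.execute("""
    SELECT ds, job_id, event, language, COUNT(*) AS copies
    FROM job_data
    GROUP BY ds, job_id, event, language
    HAVING COUNT(*) > 1
    ORDER BY job_id
""").fetchall()

# Approach 2: number the rows inside each identical group and keep every
# row after the first, i.e. the surplus copies themselves.
extra_copies = conn.execute("""
    WITH numbered AS (
        SELECT job_id,
               ROW_NUMBER() OVER (
                   PARTITION BY ds, job_id, event, language
                   ORDER BY job_id
               ) AS row_num
        FROM job_data
    )
    SELECT job_id, row_num
    FROM numbered
    WHERE row_num > 1
    ORDER BY job_id, row_num
""").fetchall()

print(duplicate_groups)
print(extra_copies)
```

The GROUP BY form reports each duplicated row once with its number of copies, while the ROW_NUMBER form returns one row per surplus copy, which is the more convenient shape when the extra rows are to be deleted afterwards.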
USE METRIC_SPIKE;
Creating tables:
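Events table
CREATE TABLE events (
    user_id INT,  -- assumed column and type, inferred from the e.user_id references in the queries
    occurred_at DATE,
    event_type VARCHAR(200),
    event_name VARCHAR(200),
    location VARCHAR(200),
    device VARCHAR(200),
    user_type INT
);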
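Users table
CREATE TABLE users (
    user_id INT,  -- assumed column and type, inferred from the u.user_id references in the queries
    created_at DATE,
    company_id INT,
    language VARCHAR(200),
    activated_at DATE,
    state VARCHAR(200)
);
Email events table
-- Assumed to be the email_events table queried in Question-5.
CREATE TABLE email_events (
    user_id INT,  -- assumed column and type, inferred from the e.user_id references in the queries
    occurred_at VARCHAR(200),
    user_type INT
);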
Question-1
User Engagement: Measures how active users are, which indicates whether they find quality
in the product/service.
Answer:
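-- Assumes engagement is counted as distinct active users per week of occurred_at.
SELECT WEEK(e.occurred_at) AS week,
       COUNT(DISTINCT e.user_id) AS active_users
FROM events e
GROUP BY 1
ORDER BY 1;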
Question-2
Answer:
-- Assumes day is the signup date, users.created_at.
SELECT u.created_at AS day,
       COUNT(*) AS all_users
FROM users u
GROUP BY day
ORDER BY day;
Question-3
Weekly Retention: Users getting retained weekly after signing-up for a product.
Answer:
FROM
ON e.user_id = u.user_id
as z
GROUP BY week
ORDER BY week
LIMIT 100;
Question-4
Weekly Engagement: Measures how active users are each week, which indicates whether they find
quality in the product/service on a weekly basis.
Answer:
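-- Assumes weekly buckets on occurred_at and distinct users per bucket; only the tablet device group is shown.
SELECT WEEK(e.occurred_at) AS week,
       COUNT(DISTINCT CASE WHEN e.device IN ('ipad air', 'nexus 7', 'ipad mini', 'nexus 10',
                                             'kindle fire', 'windows surface', 'samsung galaxy tablet')
                           THEN e.user_id END) AS tablet_users
FROM events e
GROUP BY week
ORDER BY week
LIMIT 100;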
Question-5
Answer:
THEN e.user_id
THEN e.user_id
FROM email_events e
GROUP BY 1 ;