Google (DA) Interview Prep
*Za/2 is notation for Z1-a/2, and Zb is notation for Z1-b; the two have the same absolute value
*sigma^2 = p*(1-p) for a binary metric; p*(1-p) is maximized at p=0.5, so sigma increases as p grows toward 0.5
● Test: z-test
● Sigma:
1) If the OEC is a rate (a binary choice per individual), then I need two different
variances, one under H1 and one under H0:
H1: p1*(1-p1) + p2*(1-p2)
H0: 2 * [(p1+p2)/2] * [1 - (p1+p2)/2] (pooled, with p_bar = (p1+p2)/2)
2) If the OEC is a continuous number, then I estimate the value of sigma from
experience (e.g. the historical variance among users, assuming it is the same for
the two groups)
● Z: the 1-a/2 quantile of the standard normal for significance, plus the 1-b quantile for power (a=0.05 gives Z ≈ 1.96; b=0.2 gives Z ≈ 0.84)
● Effect size (delta)
1) If the OEC is a rate: the minimum detectable effect on top of the baseline
conversion rate (p2 = p1 ± MDE)
2) If the OEC is a continuous number: the minimum detectable difference between the means of the two groups
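Pulling the pieces above together, a sketch of the standard two-proportion power calculation for a rate OEC, with p_bar = (p1+p2)/2 and delta = p2-p1 as the minimum detectable effect:

test statistic: z = (p1_hat - p2_hat) / sqrt( p_bar*(1-p_bar)*(1/n1 + 1/n2) )
n per group = [ Z(1-a/2)*sqrt(2*p_bar*(1-p_bar)) + Z(1-b)*sqrt(p1*(1-p1) + p2*(1-p2)) ]^2 / delta^2

With a=0.05 and b=0.2, Z(1-a/2) ≈ 1.96 and Z(1-b) ≈ 0.84. For a continuous OEC, replace the two variance terms with the estimated sigma^2 of each group and delta with the minimum detectable difference in means.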
2. If after running an A/B test you find that the desired metric (e.g., click-through
rate) is going up while another metric is decreasing (e.g., total clicks), how would you
make a decision?
1) One metric can rise while another falls; it depends on what the other metric is
and what the goal of our experiment is. For example, CTR can go up because
users are more willing to click on what they see, while the same change reduces
how many clicks a user needs to complete a task, so total clicks (and total
pageviews) go down even as CTR rises.
2) Understand the goal of the company or of the experiment, and consult the
relevant decision makers; this lets us re-evaluate how important each metric is
3) Consider looking for an alternative OEC. Metrics moving in opposite directions
may indicate a conflict between a long-term goal (users returning to the site) and
a short-term goal (CTR). A good OEC balances the two
4) By intuition & experience: it may be fine to have one metric change significantly
while the other does not change significantly (this heuristic can be wrong if the
two metrics move together in large ways)
3. Now assume your A/B test result is somewhat negative (e.g., p-value ≈ 20%). How
will you communicate with the product manager?
1) The result is not statistically significant, indicating that the experiment group
and the control group do not differ on the metric we tested. This means that the
change may not affect the aspect of user experience we chose to measure, given
all the assumptions we made
2) The A/B test itself can also be improved. We may need an A/A test to make sure
the users in the two groups were randomly assigned, i.e., they are similar on the
factors we test and represent the population. We may also check whether any
other significant change in the product could have made a difference, since clean
random assignment is hard in a real-world setting. Depending on those answers,
we can decide whether to rerun the test
3) We may need to rethink our metrics and how to measure them correctly, and
discuss further what the goal of the analysis is and whether we chose the right
sample. The effect could differ across subgroups, and hidden patterns could also
change the statistical result. For example, new users and experienced users may
respond differently to the change; there could be regional or cultural differences
depending on the product; and seasonal patterns (holiday / summer vacation /
weekend) can influence how users use the product
4) Statistical significance is not everything. Depending on the practical goal
(practical significance) we have, even a statistically non-significant test can
reflect a real improvement for at least some users (especially based on comparing
the confidence interval against the minimum detectable effect we set); it may be
worth a further experiment with more users
5) Data does not capture everything behind a product, and A/B tests can sometimes
be inconclusive. It is always a good idea to bring in other sources for the
decision, such as qualitative research methods and strategic business
considerations, to complement the quantitative findings from the A/B test
4. Given the 20% p-value above, the product manager still decides to launch this new
feature. How would you present your suggestions and caveats?
1) Find a proper slice of the user population and see how the feature performs there.
From the initial experiment, some subgroups may have received a larger
influence from the change, so we may want to change which users we target in
the analysis (could also try different days of the week; this can help us debug and
find new hypotheses)
2) The new feature itself could be adjusted; for example, launch only part of it and
be more conservative about its effect on the user experience
3) Consider changing the OEC, the minimum detectable effect, or the metrics, then
rerun the test and cross-check results with other methods
5. You run an A/B test, find a very positive result, and decide to launch. For the first
2 weeks the website's performance is very positive, but over time all metrics drift
back to normal. How will you explain this result?
It is true that the effect of a change can gradually shrink, and the lift can fade as
the rollout wraps up (a novelty effect). Possible reasons:
1) the effect may not be repeatable when there is a seasonal pattern, i.e., users were
showing seasonal / holiday behavior
2) if there is a version change, then as users adopt it, their behavior also changes.
We can use cohorts to track how users change their behavior and look for
patterns useful for the future
3) it is a good idea to keep tracking a control group of users who never receive the
change, as a clean baseline against any seasonal pattern. This shows how large
the seasonal influence is, and may tell us whether the original effect is
repeatable
7. Facebook is testing different designs of the user homepage. They have come up with 50
variations. They would like to test all of them and choose the best. How would you set up
this experiment? What metrics would you calculate and how would you report the
results?
Step 1: check data accuracy and whether the change is significant
Step 2: check for outliers (filter spam)
Step 3: check the A/A test result and look for any pre-experiment difference
Step 4: external (PR) reasons - talk with the product/engineering/operations teams
about potential causes and any changes they made
1) did we launch any new functions, features, or internal products at the same time?
2) did we run any recent marketing or PR campaign?
3) is there any big news about the product on social media?
Step 5: is there any seasonal/time pattern? Will the pattern affect different groups
differently?
Step 6: split the data into categories and calculate an influence factor for each
category: factor of category i = (today's metric of category i -
yesterday's metric of category i) / (today's total - yesterday's total)
Step 7: check whether a similar change appears in other metrics, or also happens in
competitor products
1) per user: average purchase amount per user per day ($), average number of
purchases per user per day, time to a user's first purchase, what time of day users
usually make a purchase, what kind of purchase the user makes each time
2) per app: number of purchases received per app, purchase amount per app,
profit rate per app (cost & profit), most popular item/product that users purchase
3) platform: average revenue per purchase/user, customer acquisition cost, cost of
services (maintenance, cloud, marketing), promotion ROI for each app
3. Assume we are Facebook and we would like to add a new 'love' button. Should we do
this? / A product adds a new function; how do you test its effect?
4. We are running 30 tests at the same time, trying different versions of our home page. In
only one case does a test win against the old home page, with a p-value of 0.04. Would
you make the change?
Assuming we set alpha, the significance level (the probability of a type I
error), at 0.05, then across 30 tests we expect about 30*0.05 = 1.5 significant results
purely by chance. A p-value of 0.04 is not small enough to change this conclusion:
at a 0.04 threshold we would still expect about 30*0.04 = 1.2 chance significant results
out of 30 tests. A single significant result is therefore entirely consistent with chance
rather than a real treatment effect, so we should not make the change based on this alone.
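As a worked check on the multiple-testing problem (standard calculations): with 30
independent tests at alpha = 0.05, P(at least one false positive) = 1 - (1 - 0.05)^30 ≈ 0.79,
so one "winner" is unsurprising even if nothing works. A Bonferroni correction would
require p < 0.05/30 ≈ 0.0017 for any single test to count as significant, and 0.04 is far
above that bar.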
5. Assume you are assigned to estimate the LTV (lifetime value) of our game app's
players. What metrics would you calculate to make a good prediction? Assume you
have already collected everything you want; how would you make this
prediction/estimation? / How would you estimate the lifetime value of a click on an ad?
Focus on how you would communicate this in simple terms to an advertiser.
User lifetime value is the amount of net profit generated from a user before
they churn from the app or stop converting on in-app purchases. There are
different ways to calculate it. Some metrics I would need:
1) how much a user spends per visit/click on average
2) how many visits a user makes per week (or per purchase cycle)
3) the customer retention rate
4) the average user lifespan (derived from the churn rate)
To figure out your churn rate, track how many customers who purchased in one
period also purchased in the next period, divide by the total number of customers
who bought in the first period, and subtract that retention ratio from one.
To determine your annual customer churn rate, you can use this formula:
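One standard formulation (a common industry definition; treat the exact form as an assumption):

annual churn rate = (customers at start of year - customers at end of year) / customers at start of year

Combining the metrics above, a common LTV estimate is:

LTV = (average spend per visit) * (visits per period) * (average lifespan in periods), where average lifespan ≈ 1 / churn rate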
6. If you got a chance to add new features to our app to increase profit within a very
short term, what would you do?
7. We have developed a new ad model. How would you test whether it is better than the
current model? Consider not just revenue but also user experience
1) quantitative metrics: ad CTR, profit rate (ROI), ad requests, ad watch/skip
time, total time users spend on pages with ads
2) qualitative UX research: survey users on how they like the ads
8. What is your favorite Google product? How can you improve it? How do you
measure the success of your improvement?
YouTube (900M DAU)
Goal: video-sharing platform where users can upload, watch, share, rate, and comment
on videos; they can also subscribe and communicate with other users. Business users
can also find potential advertisement opportunities.
Dislikes (problems):
1) User engagement: social network experience
- diverse content categories & community culture
- social functions (the comments area sits low in the app)
- when traffic is large, it is hard to interact and be noticed
2) User experience: advertising during the video
3) Increase revenue: membership services
- how to attract users to pay for membership
- how to balance the conflict between membership and ads revenue
How to solve:
1) UX design improvement: Ads algorithms, social functions (comments, messages)
2) Diverse membership services
Pro:
- short video by ordinary users
- advertisement advantage
- users can watch and upload
Cons:
- how to survive as a smaller youtuber
- large work of content management
- membership revenue
How to implement / difficulties:
- cost of the new algorithm
- a lot of research needed on user experience for social-network features
Clarify: is there a control group, or did we launch this campaign for the whole country?
If yes: how do we assign companies/users to the two groups?
If no: observational study (average treatment effect on the treated / LATE)
Matching + difference-in-differences:
1) pick proper control and treatment groups: who is the target of the campaign?
is the whole country available for the campaign? who is eligible for the campaign?
2) collect data before and after the campaign
3) propensity score matching
4) DiD + parallel-trends check
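For reference, the basic difference-in-differences estimate (standard definition):

DiD = (Y_treatment,after - Y_treatment,before) - (Y_control,after - Y_control,before)

The parallel-trends check in step 4 verifies that the two groups were moving along similar trajectories before the campaign, which is what makes the control group's before/after change a valid counterfactual.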
10. How to attract daily active users (DAU)
A cohort is a group of users who share a common characteristic that is identified in this
report by an Analytics dimension. For example, all users with the same Acquisition Date belong
to the same cohort. The Cohort Analysis report lets you isolate and analyze cohort behavior.
Cohort analysis helps you understand the behavior of component groups of users apart
from your user population as a whole. Examples of how you can use cohort analysis include:
1. Examine individual cohorts to gauge response to short-term marketing efforts like
single-day email campaigns.
2. See how the behavior and performance of individual groups of users changes day to day,
week to week, and month to month, relative to when you acquired those users.
3. Organize users into groups based on shared characteristics like Acquisition Date, and
then examine the behavior of those groups according to metrics like User Retention or
Revenue.
e.g. launch a new feature/event and acquire some new users. The new users from this month form a
cohort, and we can compare them with a cohort acquired earlier this year on revenue/retention
rate, to check the effect of the new feature/event.
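A minimal SQL sketch of this kind of comparison, assuming a hypothetical activity table with columns user_id and event_date (table and column names are assumptions):

SELECT c.cohort_month,
       DATE_FORMAT(a.event_date, '%Y-%m') AS active_month,
       COUNT(DISTINCT a.user_id) AS active_users
FROM
(
  # each user's cohort = month of first activity
  SELECT user_id, DATE_FORMAT(MIN(event_date), '%Y-%m') AS cohort_month
  FROM activity
  GROUP BY user_id
) c
JOIN activity a ON a.user_id = c.user_id
GROUP BY c.cohort_month, active_month
ORDER BY c.cohort_month, active_month

Dividing active_users by each cohort's size in its first month gives the retention curve per cohort, which can then be compared across cohorts.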
● Longer working hours per day:
Group 1: require operators to work 10 hours a day at $30/hour
Group 2: require operators to work 6 hours a day at $20/hour
Group 3: let operators choose to work 6 or 10 hours a day at $20/$30
Check whether there are significant differences in customer satisfaction across the
groups, so that we can test which factor actually influences it.
● Longer job tenure: match operators on the features that may influence length
of working experience, such as education, communication ability, proficiency in
problem solving, etc. After finding the matched operators, regress the satisfaction
level on years of experience.
Simpson’s Paradox: there are different subgroups in the population, and within each
subgroup the results are stable, but when aggregated, the changing mix of subgroups
drives the overall result.
* reasons: wrong setup (run sanity checks); the change affects different groups differently
(e.g., when splitting by user id, the change may increase events for new users more than
for experienced users)
* frequent subgroup differences:
1) weekends vs weekdays
2) new users vs experienced users
3) men/women or other demographic groups
4) region, language
* how to make a decision: dig deeper to check why the effect differs across
different users; a numeric illustration follows below
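A standard numeric illustration of the reversal (classic textbook numbers, recast here with hypothetical user groups):

New users: treatment 81/87 = 93% success vs control 234/270 = 87% (treatment better)
Experienced users: treatment 192/263 = 73% vs control 55/80 = 69% (treatment better)
Aggregated: treatment 273/350 = 78% vs control 289/350 = 83% (control looks better)

The treatment wins inside every subgroup, but because the treatment group contains far more of the harder (experienced) segment, the pooled comparison reverses.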
16. At work, a customer comes to you to request a project. What questions should you ask the customer?
17. A new game has just launched. How would you estimate its number of downloads?
18. A piece of software shipped an update. How would you measure the update's impact on how users use the software?
1) AB test:
- based on what kind of influence we care about (for example, if we care
about consumption, how long people use the app; if social networking,
how many comments or messages people send; similarly for content
production, user experience, etc.), pick the metric and randomly select
users for the A/B test
- may need to differentiate users on Android vs iOS (and think about
how to pick the treatment group, because some users upgrade
by themselves while others might be force-upgraded)
- use a Z test on the difference between the two groups and compare it
with the minimum detectable effect we expect to see
- sanity check (A/A test), sign check
2) observational studies
- obtain data before and after the upgrade; again, pick the proper metric
based on the research goal and the exact changes that were made
- matching and difference-in-differences approach
- parallel-trends check, then calculate the result
3) qualitative research: UX research such as surveys/interviews, or usability tests.
19. Google's self-driving car technology: what business model could monetize it? What challenges
do you foresee?
Business model:
1) personal vehicles
2) ride services (Uber-style ride hailing, taxi, bus, commute, parking)
3) transportation and logistics
Challenges:
1) how to decrease error rates in object identification (safety issues)
2) the cost of such technology
3) the accuracy of navigation and maps
4) cooperation with urban infrastructure systems, and its cost
5) how to protect information and data privacy
How to check: a Z test on the average profit rate of Google Express against the population
average (a one-sample z test or t test depending on the sample size, i.e., how many shampoo
brands we have).
Why this happens: Simpson's Paradox (a trend in subgroups can disappear or reverse
when aggregated). Some brands have a high profit rate, but Google Express customers
buy them less; customers may purchase more of the brands with lower profit rates.
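For reference, the one-sample test statistics (standard formulas): z = (x_bar - mu0) / (sigma / sqrt(n)) when the population standard deviation sigma is known, or t = (x_bar - mu0) / (s / sqrt(n)) with n-1 degrees of freedom when using the sample standard deviation s. Here x_bar is Google Express's average profit rate over its n shampoo brands and mu0 is the population average.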
21. Gmail wants to launch a new feature that auto-suggests email recipients. How do we
decide whether to launch, and how do we design the experiment?
1) First clarify the goal: e.g., is it to attract new users, or to increase engagement of existing users?
2) Based on the goal, determine the relevant metrics and how to measure them
3) randomly assign users to the control and treatment groups if possible. If not, we may
need to choose the users based on the goal
4) run the A/B test (Z test) to get the overall result and compare it with the minimum
detectable effect size we expect
5) Then break the users into subgroups to see whether the conclusions differ
6) Finally, summarize the action plan for each possible outcome
It depends on the goal of the research and the goal of the product design.
1) user engagement: total usage time, frequency of opening the app
2) social network: use of like, comment, share, and messaging features
3) increase retention: 7/14/28/180/365-day retention rates
4) attract new users: number of sign-ups and referrals
5) solve a specific problem in the old product: complaint rates
6) monetize: revenue
1) A/B test
2) observational study: matching and difference-in-differences
- obtain data before and after the upgrade; again, pick the proper metric
based on the research goal and the exact changes that were made
- matching and difference-in-differences approach
- parallel-trends check, then calculate the result
3) cohort analysis for a promotion (compare the new users from the promotion with
earlier regular new users on their retention and revenue)
28. Google has released a new version of their search algorithm, for which they used A/B
testing. During the testing process, engineers realized that the new algorithm was not
implemented correctly and returned less relevant results. 2 things happened during
testing:
a. People in the treatment group performed more queries than the control group.
b. Advertising revenue was higher in the treatment group as well.
Q1: What may be the cause of people in the treatment group performing more searches
than the control group?
We know the new algorithm produced less relevant search results, which means users
may have to make additional searches to clarify what they are looking for. To test
this hypothesis, we could study how close together searches are in time: if additional
searches are made very soon after the previous one, we can classify them as
clarifying searches.
Q2: What do you think caused the new algorithm to generate more advertising revenue,
even though the results were less relevant?
1) We know that more searches are being conducted; since advertisements are
served along with every new search, there are more opportunities for users to
click on an advertisement.
2) Another possibility is that the search algorithm is different from the algorithm used
to display ads. In that case, the ads themselves may be more relevant than the
search results, causing more ad clicks.
Q3: Since the less relevant algorithm resulted in higher advertising revenue, should it be
implemented anyway?
1) The effects described are probably only short-term effects due to the problems
with the algorithm. We shouldn’t sacrifice the long term potential of the site for a
temporary increase in revenue and searches. Google is probably best positioned
to win in the long term when it has the most relevant search algorithm.
2) It also depends on the major goal of the research: increasing revenue or
improving users' search experience
3) Refer to the relevant decision makers (re-evaluate how important the metric is), and
use intuition & experience: it may be fine to have one metric change significantly while
the other does not change significantly (this could be wrong if the two metrics move
together in large ways)
29. A car company produces all the cars for a country, we’ll call them Car X and Car Y. 50%
of the population drives Car X, the other 50% drives Car Y. Two potential technologies
have just been discovered that help reduce gasoline usage!
Technology A: Increases the MPG of Car X from 50 MPG to 75 MPG
Technology B: Increases the MPG of Car Y from 10 MPG to 11 MPG
Which technology should be implemented to save the most gasoline for the country?
Assume the average commute distance for a car is D, and define total gas used
(G) as G = D/MPG. Gas saved with Technology A: (D/50) - (D/75) = D/150.
Gas saved with Technology B: (D/10) - (D/11) = D/110. Since D/110 > D/150 and the two
fleets are the same size, Technology B saves more gasoline for the country, despite the
much smaller MPG gain.
30. In 2011 Facebook launched Messenger as a standalone app for mobile devices (it used
to only be part of the Facebook App). How would you track the performance of this new
application? What metrics would you use?
31. Facebook is testing different designs of the user homepage. They have come up with 50
variations. They would like to test all of them and choose the best. How would you set up
this experiment? What metrics would you calculate and how would you report the
results?
SQL
1. date | user_num: asked how to find the top 100 week-over-week increases/drops. In the end
I only needed to list the increase/drop between week-1 and week, using DATEADD or DATEDIFF.
SELECT tb2.week_num,
       tb2.cnt - tb3.cnt AS wow_change
FROM
(
  SELECT WEEK(date) AS week_num,
         SUM(user_num) AS cnt
  FROM table
  GROUP BY WEEK(date)
) tb2
LEFT JOIN
(
  SELECT WEEK(date) AS week_num,
         SUM(user_num) AS cnt
  FROM table
  GROUP BY WEEK(date)
) tb3 ON tb3.week_num = tb2.week_num - 1
ORDER BY ABS(tb2.cnt - tb3.cnt) DESC
LIMIT 100
2. view | video_id | date: for each video, find the earliest date on which it reached 100 views.
# If view is the cumulative view count:
SELECT video_id, min(d_date) as first_day_100
FROM table
WHERE view>=100
GROUP BY video_id
# If view is the number of new views generated each day:
SELECT tb.video_id, min(tb.d_date) as first_day_100
FROM
(
SELECT video_id,
sum(view) over(partition by video_id order by d_date) as cnt,
d_date
FROM table
) tb
WHERE tb.cnt>=100
GROUP BY tb.video_id
5. Given three variables (user id, date, transaction id), write SQL to show the user ids that have
transactions in Nov 2017 but not in Sep 2017
SELECT DISTINCT t1.user_id
FROM
(
SELECT *
FROM table
WHERE day LIKE '2017-11%'
) t1
LEFT JOIN
(
SELECT *
FROM table
WHERE day LIKE '2017-09%'
) t2 on t1.user_id=t2.user_id
WHERE t2.user_id is NULL
6. Table【in_app_purchase】:
uid: unique user id.
timestamp: specific timestamp detailed to seconds.
purchase amount: the amount of a one-time purchase.
This is a table containing in-app purchase data. A certain user could have multiple
purchases on the same day
Question 1: List out the top 3 names of the users who have the most purchase amount
on '2018-01-01'
SELECT tb.uid, tb.total_amount
FROM
(
  SELECT uid,
         DATE(FROM_UNIXTIME(timestamp)) AS d_date,
         SUM(purchase_amount) AS total_amount
  FROM in_app_purchase
  GROUP BY uid, d_date
) tb
WHERE tb.d_date = '2018-01-01'
ORDER BY tb.total_amount DESC
LIMIT 3
Question 2: Sort the table by timestamp for each user. Create a new column named "cum
amount" which calculates the cumulative amount of a certain user of purchase on the
same day.
SELECT uid,
       DATE(FROM_UNIXTIME(timestamp)) AS d_date,
       purchase_amount,
       SUM(purchase_amount) OVER(PARTITION BY uid, DATE(FROM_UNIXTIME(timestamp))
                                 ORDER BY timestamp) AS cum_amount
FROM in_app_purchase
ORDER BY uid, timestamp
Question 3: For each day, calculate the growth rate of purchase amount compared to the
previous day. if no result for a previous day, show 'Null'.
SELECT A.d_date,
       CASE WHEN B.d_date IS NULL THEN NULL
            ELSE (A.total_amount - B.total_amount) / B.total_amount
       END AS growth_rate
FROM
(
  SELECT DATE(FROM_UNIXTIME(timestamp)) AS d_date,
         SUM(purchase_amount) AS total_amount
  FROM in_app_purchase
  GROUP BY d_date
) A
LEFT JOIN
(
  SELECT DATE(FROM_UNIXTIME(timestamp)) AS d_date,
         SUM(purchase_amount) AS total_amount
  FROM in_app_purchase
  GROUP BY d_date
) B ON A.d_date = DATE_ADD(B.d_date, INTERVAL 1 DAY)
Question 4: For each day, calculate a 30day rolling average purchase amount.
SELECT tb.d_date,
       AVG(tb.total_amount) OVER(ORDER BY tb.d_date ASC
                                 ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) AS avg_30
FROM
(
  SELECT DATE(FROM_UNIXTIME(timestamp)) AS d_date,
         SUM(purchase_amount) AS total_amount
  FROM in_app_purchase
  GROUP BY d_date
) tb
ORDER BY tb.d_date
# For each user/ad pair with more than 2 impressions, compute the time between
# consecutive impressions (the aggregate in the outer SELECT is assumed)
SELECT A.user_id, A.ad_id,
       AVG(A.ts - A.previous_t) AS avg_gap
FROM
(
  SELECT t1.user_id, t1.ad_id, t2.timestamp AS ts,
         LAG(t2.timestamp) OVER(PARTITION BY t1.user_id, t1.ad_id
                                ORDER BY t2.timestamp) AS previous_t
  FROM
  (
    SELECT user_id, ad_id
    FROM table
    GROUP BY user_id, ad_id
    HAVING COUNT(ad_id) > 2
  ) t1
  JOIN table t2 ON t1.user_id = t2.user_id AND t1.ad_id = t2.ad_id
) A
GROUP BY 1, 2
11. Use SQL to compute retention rate, like rate, and friend-request acceptance rate
1) Retention rate
SELECT A.first_day AS install_dt,
       COUNT(DISTINCT A.player_id) AS installs,
       ROUND(SUM(IF(B.player_id IS NULL, 0, 1)) / COUNT(DISTINCT A.player_id), 2) AS day1_retention
FROM
(
  SELECT player_id, MIN(event_date) AS first_day
  FROM Activity
  GROUP BY player_id
) A
LEFT JOIN
(
  SELECT player_id, event_date
  FROM Activity
) B ON B.event_date = DATE_ADD(A.first_day, INTERVAL 1 DAY) AND A.player_id = B.player_id
GROUP BY A.first_day
15. Given a product table with roughly these columns: order date, user id, item name, price,
quantity, country. Asked for the best-selling item in Germany (I don't remember it exactly,
but it was roughly this question).
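A minimal sketch, assuming "best-selling" means highest total quantity sold and using the column names as recalled above (the table name product is an assumption):

SELECT item_name,
       SUM(quantity) AS total_sold
FROM product
WHERE country = 'Germany'
GROUP BY item_name
ORDER BY total_sold DESC
LIMIT 1

If "best-selling" instead means highest revenue, order by SUM(price * quantity) DESC.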
16. There is a table of YouTube users' daily view counts; date and user_id are complete, and if a
user watched nothing that day, view is 0. Columns: date, user_id, view.
SQL1: find the active users (view count > 0) of a given month (sketch below)
SQL2: for each day, find the users who watched at least one video within the last 7 days
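For SQL1, a minimal sketch, assuming the target month is January 2019 (the specific month is a hypothetical choice):

SELECT DISTINCT user_id
FROM table
WHERE date LIKE '2019-01%'
  AND view > 0

For SQL2, a 7-day rolling window sum per user does the job: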
SELECT tb.date, tb.user_id
FROM
(
  SELECT user_id, date,
         SUM(view) OVER(PARTITION BY user_id ORDER BY date ASC
                        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS view_in_7days
  FROM table
) tb
WHERE tb.view_in_7days > 0
https://www.1point3acres.com/bbs/page-322459.html