Google (DA) Interview Prep

The document discusses A/B testing methodologies, including sample size calculation, metric evaluation, and decision-making based on test results. It emphasizes the importance of understanding the context of metrics, the significance of statistical results, and the need for continuous evaluation and adjustment of testing strategies. Additionally, it outlines approaches for communicating findings and suggestions to product managers and stakeholders.


AB Test

1. How do you calculate the sample size for an A/B testing?

*Z_{a/2} is the notation for Z_{1-a/2}, and Z_b is the notation for Z_{1-b}; the two have the same absolute value.
*sigma^2 = p(1-p) is maximized at p = 0.5; so as p increases toward 0.5, sigma increases.

● Test: z-test
● Sigma:
1) If the OEC is a rate (a binary choice per user), I need to calculate two different
sigmas, one under H1 and one under H0:
H1: p1(1-p1) + p2(1-p2)
H0: 2 * (p1+p2)/2 * [1 - (p1+p2)/2]
2) If the OEC is a continuous number, I estimate sigma from experience (e.g., the
population variance among users, assumed to be the same for the two groups)
● Z: the (1 - a/2) quantile of the standard normal distribution (typically a = 0.05, b = 0.2); Z_{1-a/2} for the significance level and Z_{1-b} for the power, which have the same absolute values as Z_{a/2} and Z_b
● Effect size
1) If the OEC is a rate: the minimum detectable effect on top of the baseline conversion rate (delta = p2 - p1)
2) If the OEC is a continuous number: the minimum difference between the two group means we care to detect
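The pieces above can be combined into a quick per-group sample-size calculator. A minimal sketch in Python; the 10% baseline rate and 2-percentage-point minimum detectable effect below are hypothetical examples:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, beta=0.20):
    """Per-group sample size for a two-proportion z-test.

    Uses the pooled variance under H0 and the unpooled variance
    under H1, as in the formulas above.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{1-a/2}
    z_beta = NormalDist().inv_cdf(1 - beta)        # Z_{1-b}
    p_bar = (p1 + p2) / 2
    sigma0 = sqrt(2 * p_bar * (1 - p_bar))         # H0: pooled
    sigma1 = sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # H1: unpooled
    delta = abs(p2 - p1)                           # minimum detectable effect
    return ceil(((z_alpha * sigma0 + z_beta * sigma1) / delta) ** 2)

# Example: 10% baseline conversion, 12% target (2pp MDE).
n = sample_size_two_proportions(0.10, 0.12)
print(n)  # roughly 3,800-3,900 users per group
```

As expected, a smaller minimum detectable effect requires a larger sample.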

2. After running an A/B test you find that the desired metric (e.g., click-through
rate) is going up while another metric (e.g., total clicks) is decreasing. How would you
make a decision?

1) It is entirely possible for one metric to rise while another falls; the interpretation
depends on what the other metric is and on the goal of our experiment. For
example, CTR can go up even as total clicks go down: if the change lets users
fulfill a task with fewer clicks, total pageviews and clicks both drop, and the
ratio of clicks to pageviews can still rise.
2) Understand the goal of the company or the experiment, and consult the relevant
decision makers. Then we can re-evaluate how important the different metrics are.
3) Consider looking for an alternative OEC. Metrics moving in opposite
directions may indicate a conflict between long-term investment (returning
to the site) and a short-term goal (CTR). A good OEC balances the two.
4) Use intuition & experience: it may be fine to have one metric significant while the
other shows no significant change (this can be misleading if the two metrics both
move a lot together)

3. Now assume your A/B test result is somewhat negative (e.g., p-value ≈ 20%).
How would you communicate with the product manager?

1) The result is not statistically significant: we cannot conclude that the experiment
group and the control group differ on the metric we tested. Under all the
assumptions we made, the change may not have affected the aspect of user
experience we chose to measure.
2) The A/B test itself can also be improved. We may need an A/A test to make sure the
users in the two groups were randomly selected; that is, they are similar on the
factor we test and can represent the population. We may also check whether any
other significant changes in the product could make a difference, because random
selection is hard in a real-world setting. Depending on the answers, we can decide
whether to rerun the test.
3) We may need to re-think about our metrics and how to measure them correctly.
We may need to have some further discussion on what is our goal in the analysis,
and do we choose the right sample in the study. The effect could be different in
various subgroups, and some hidden pattern could also change our statistical
results. For example, new users and experienced users may have different
experiences with the change; there could be regional or cultural difference based
on what products we are talking about; there can be seasonal pattern that
influence how users use our product (holiday/summer vacation/weekend)
4) Statistical significance is not everything. Depending on the practical goal
(practical significance) we have, even a test that is not statistically significant
can still reflect a real increase for at least some users (especially based on the
comparison between the confidence interval and the minimum detectable
effect we set); it may warrant further experiments with more users.
5) Data alone does not capture everything about the product, and A/B tests can
be inconclusive. It is always a good idea to use other sources
to make the decision, such as qualitative research methods and strategic business
considerations, to complement the findings from the A/B test and quantitative research.

4. If given the above 20% p-value, the product manager still decides to launch this new
feature, how would you claim your suggestions and alerts?

1) We may want to find a proper slice of the user population and see how the feature works
there. From the initial experiment, we may find some subgroups that received a larger
effect from the change, and may want to target those users for analysis
(could also try different days of the week - this can help us debug and form new
hypotheses)
2) Maybe the new function can be changed - for example, launch only part
of it and be more conservative about the effect on user experience
3) may want to change: OEC, minimum detectable effect, metrics, and re-run the
test again, and cross-checking results with other methods

5. You run an A/B test, find that the result is very positive, and decide
to launch. In the first 2 weeks, the performance of the website is very positive.
However, as time goes by, all metrics seem to return to normal. How would you explain
this result?

It is common for the effect of a change to shrink gradually, and the lift
can disappear after the rollout settles. Possible reasons:
1) a novelty effect or seasonal pattern: the initial lift may not be repeatable if
users were reacting to newness, or to seasonal/holiday behavior
2) a version change: as users adopt the change, their behavior also shifts. We can
use cohort groups to track how users change their behavior and look for
patterns for the future
3) It is a good idea to keep a group of users who never receive the change (and are not
influenced by any seasonal pattern) as a long-term control group and keep tracking.
This shows how large the seasonal influence is, and may tell us whether the lift is
repeatable

6. Close-friend notification: we plan to send users a text notification about their close friends' updates.


1. How to evaluate if we want to add this feature?
2. How to evaluate negative impact of this feature?
3. The engineering team rolls out the feature to 1,000 users; time spent on
Facebook was 23 minutes the week before rollout and 25 minutes the week after. What
would you say about this result?
4. Experiment arm: 1,000 users, 24 minutes before, 26 minutes after. Control arm: 1,000
users, 24 minutes before, 24 minutes after. What would you say about this result?
5. The conversion rate in the experiment is 10%, and the engineering team rolls this
feature out globally. Estimate the total number of users who will use this feature.
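Sub-question 4 is a difference-in-differences setup: the control arm nets out the shared time trend. A minimal sketch of the calculation, using the numbers from the question:

```python
def diff_in_diff(treat_before, treat_after, ctrl_before, ctrl_after):
    """Difference-in-differences: the treatment group's change minus the
    control group's change, netting out the shared time trend."""
    return (treat_after - treat_before) - (ctrl_after - ctrl_before)

# Experiment arm: 24 -> 26 minutes; control arm: 24 -> 24 minutes.
effect = diff_in_diff(24, 26, 24, 24)
print(effect)  # 2 minutes attributable to the feature (before testing significance)
```

A significance test on this 2-minute estimate would still be needed before drawing conclusions.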

7. Facebook is testing different designs of the user homepage. They have come up with 50
variations. They would like to test all of them and choose the best. How would you set up
this experiment? What metrics would you calculate and how would you report the
results?

Run t-tests among pairs of treatments and apply the Bonferroni correction, dividing
the alpha value by the number of comparisons: 0.05/50 = 0.001 (multiple hypothesis
tests increase the likelihood of a rare event happening by chance). Note: this
needs a large population for testing.
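The correction can be sketched in a few lines; the p-values below are invented for illustration:

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that stay significant after Bonferroni correction:
    each p-value is compared against alpha / (number of tests)."""
    m = len(p_values)
    threshold = alpha / m
    return [p < threshold for p in p_values]

# 50 variations: only p-values below 0.05 / 50 = 0.001 count as wins.
p_vals = [0.0004, 0.03, 0.2, 0.0009] + [0.5] * 46
flags = bonferroni_significant(p_vals)
print(sum(flags))  # 2 variations survive the correction
```

Note that 0.03 would look "significant" against the uncorrected 0.05 threshold but does not survive the correction.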

Product Sense (Experiment Design & Metrics Selection)
1. Today you notice that our app's new users have doubled. What could be the
reason? Is it good or not? / If Gmail signups have increased, why? Does
it meet expectations? / If user subscriptions suddenly drop sharply, how would you diagnose the cause?

Step 1: check data accuracy & whether the change is significant
Step 2: check for outliers (filter spam)
Step 3: check the result of an A/A test and look for any difference before the experiment
Step 4: external (PR) reasons - talk with product/engineering/operations teams to
discuss potential causes and learn about any changes they made
1) did we launch any new functions, features, or internal products at the same time?
2) do we have any recent marketing or PR campaigns?
3) is there any big news about the product on social media?
Step 5: is there a seasonal/time pattern? Would the pattern affect different
groups differently?
Step 6: split the data into different categories and calculate the contribution
factor for each category: factor of category 1 = (today's metric of category 1 -
yesterday's metric of category 1) / (today's total - yesterday's total)
Step 7: check whether a similar change appears in other metrics or in competitor products
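Step 6 can be sketched as a simple decomposition; the category names and numbers below are hypothetical, and the factors should sum to 1 by construction:

```python
def contribution_factors(yesterday, today):
    """Share of the total day-over-day change attributable to each
    category: (today_i - yesterday_i) / (today_total - yesterday_total)."""
    total_change = sum(today.values()) - sum(yesterday.values())
    return {k: (today[k] - yesterday[k]) / total_change for k in today}

yesterday = {"organic": 500, "paid": 300, "referral": 200}   # new users
today     = {"organic": 550, "paid": 1200, "referral": 250}  # doubled overall

factors = contribution_factors(yesterday, today)
print(factors)  # "paid" accounts for 90% of the jump -> check marketing first
```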

Having new users is good, but:
1) can our servers handle the larger traffic so that we still provide a good user
experience?
2) the retention rate matters even more than the absolute number of new users: are
we losing those new users? Are we losing old users? How can we keep them
active?
2. If we have an app with in-app purchase, name at least 4 metrics you would like to
monitor in your dashboard.

1) per user: average purchase amount per user per day ($), average number of
purchases per user per day, time to a user's first purchase, what time users
usually make a purchase, what kinds of purchases each user makes
2) per app: purchases received per app, purchase amount per app,
profit rate per app (cost & profit), most popular items/products that users purchase
3) platform-wide: average revenue per purchase/user, customer acquisition cost, cost
of services (maintenance, cloud, marketing), promotion ROI per app

3. Assume we are Facebook and we would like to add a new 'love' button. Should we do
this? / A product adds a new function; how do you test its effect?

1) What is the goal of this feature? To increase users' engagement/social network? Are
there complaints on this issue? Or is it just a redesign of the like button?
Do we want this button to replace the like button?
2) Based on the research goal, choose the metrics, such as time spent on
social-networking actions (comments, replies, likes, etc.)
3) Randomly pick two groups of users. The sample size depends on
alpha, beta, the effect size, and the population sigma. Also think about using filters
(such as only a certain platform/country/language, etc.). We should also run an
A/A test to make sure the groups are similar to each other on the chosen metric
before any experiment
4) Run the z-test in the A/B setting and calculate the 95% confidence interval for
the metric. Compare it with the minimum detectable effect we expect.
5) Do sanity checks and a sign test to make sure the result is in the direction we want.
We also want to conduct some qualitative study (survey/interview) to understand
user experience
6) Also think about the cost of the new feature and compare it with the effect
size

4. We are running 30 tests at the same time, trying different versions of our home page. In
only one case test wins against the old home page. P-value is 0.04. Would you make the
change?

Assume we set alpha, the significance level (probability of a Type I
error), at 0.05. Across 30 tests we then expect 30 * 0.05 = 1.5 significant results
by chance alone. A p-value of 0.04 is not small enough to change this conclusion:
with a threshold of 0.04 we would still expect 30 * 0.04 = 1.2 chance wins out of 30 tests.
A single significant result out of 30 is therefore entirely consistent with chance, not
with a real treatment effect. Thus, we should not make the change based on this.

5. Assume that you are assigned to estimate the LTV (lifetime value) of our game app
player. what kind of metrics would you like to calculate so as to make a good prediction?
Assume that you already collect all that you want. How would you make this
prediction/estimation? / How would you estimate the life-time value of a click on ad?
Focus on how you would communicate this in simple terms to an advertiser.

The user lifetime value is the net profit you generate from a
user before they churn from the app or stop converting on in-app purchases. There are
different ways to calculate it. Some metrics I would need:
1) how much a user spends per visit/click on average
2) how many visits a user makes per week (or per purchase cycle)
3) the customer retention rate
4) the average user lifespan (derived from the churn rate)

1) average purchase value (total revenue / # of orders)
2) average purchase frequency (# of purchases / # of users)
3) average customer lifetime span (total span / # of users)
To do the calculation, I can compute LTV in several different ways and use the mean of
those approaches. For example, two common ways to calculate:
1) $ spent per visit * visits per cycle * # of cycles per year * # of years of
total lifespan * retention rate
2) average purchase value * average purchase frequency * average customer
lifetime span

To figure out your churn rate, track how many customers purchased in
two sequential periods and divide that by the total number of customers who bought
in the first period; churn is one minus that ratio:

churn rate = 1 - (repeat customers / first-period customers)
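The pieces above can be combined into a simple LTV estimate; a minimal sketch with made-up numbers for a hypothetical game app:

```python
def lifetime_value(avg_purchase_value, purchases_per_year, lifespan_years):
    """LTV = average purchase value x purchase frequency x customer lifespan."""
    return avg_purchase_value * purchases_per_year * lifespan_years

def churn_rate(first_period_customers, repeat_customers):
    """Churn = 1 - fraction of first-period customers who bought again."""
    return 1 - repeat_customers / first_period_customers

# Hypothetical game app: $8 per purchase, 6 purchases per year.
churn = churn_rate(1000, 750)            # 0.25 -> avg lifespan ~ 1/churn = 4 years
ltv = lifetime_value(8.0, 6, 1 / churn)
print(ltv)  # 192.0 dollars of expected revenue per user
```

The 1/churn lifespan approximation assumes a constant churn rate, which is a simplification.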

6. If you got a chance to add new features to our app to increase profit within a very
short term, what would you do?

1) Email list marketing


2) Advertisement
3) Sponsors and Partnerships
4) Creating Strong Code
If you develop your own code from the ground up and it proves to be
successful, other brands may approach you and offer to re-skin your app (either
for their purposes or yours). By licensing your code to other developers, you can
make money without disrupting your users’ experiences.
5) In-App Purchases
6) SMS Marketing
7) Free/Premium Versions + Multiple Payment Options for Subscription Services
8) Strong Content Strategies
9) Data-Driven Strategies
Using this method, you can figure out who is spending the most time and money
on your app and place a primary focus on those users instead of spending all of your
development time on new user acquisition.

7. We have developed a new ad model. How would you test if this is better than the current
model? Not just regarding revenue, but also thinking about user experience
1) quantitative metrics: ad CTR, profit rate (ROI), ad requests, ad watch/skip
time, total time users spend on pages with ads
2) qualitative UX research: survey on how users like the ads

8. What is your favorite google product? How can you improve the product? How do you
measure the success of your improvement?
YouTube (900M DAU)
Goal: video-sharing platform where users can upload, watch, share, rate, and comment
on videos; they can also subscribe and communicate with other users. Business users
can also find potential advertisement opportunities.

Like (how it serves the goal):
1) Encourage content production:
- Use spare moments of time to watch, increasing activity
- Encourage UGC (User Generated Content) and original content via advertising
and copyright revenue for YouTubers
- YouTube Analytics & other user-friendly instructions
2) Different strategies for different users:
- Children
- Disability & languages
- Music
3) Advertising:
- Incentives to create high-quality ads
- Low cost due to the large base of users & videos

Dislike (problems):
1) User engagement: social-network experience
- Diverse content categories & community culture
- Social functions (the comments area is buried in the app)
- When traffic is large, it is hard to interact and be noticed
2) User experience: advertising during the video
3) Revenue growth: membership services
- How to attract users to pay for the membership
- How to balance the conflict between membership & ad revenue

How to solve:
1) UX design improvements: ad algorithms, social functions (comments, messages)
2) Diverse membership services

Competitor: Netflix/Hulu/HBO with membership/subscription fees

Pros:
- short videos by ordinary users
- advertising advantage
- users can both watch and upload
Cons:
- how to survive as a smaller YouTuber
- large workload of content management
- membership revenue

How to implement:
Difficulty:
- cost of new algorithms
- a lot of research needed on the social-network user experience

How to validate the solution (metrics):
- Traffic: daily active users
- Engagement: total in-app time
- Retention & referral
- Cost & revenue: # of videos produced, YouTuber income, advertising ROI
- Survey on user experience

9. The product team launched a campaign in country X, with a goal such as increasing
metric xx. How do you determine whether the increase in xx was caused by the campaign?

Clarify: is there a control group, or did we launch this campaign for the whole country?
If yes: how did we assign companies/users to the two groups?
If no: observational study (average treatment effect on the treated / LATE)
Matching + difference-in-differences:
1) pick proper control and treatment groups: who is the target of the campaign?
Is the whole country eligible for the campaign? Who is eligible?
2) collect data from before and after the campaign
3) propensity score matching
4) DiD + parallel-trends check
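Steps 1-4 can be sketched end to end. Here matching is simplified to nearest pre-period outcome (a real propensity score match would model treatment assignment on covariates); all numbers are made up:

```python
def match_and_did(treated, controls):
    """For each treated unit (pre, post), find the control unit with the
    closest pre-period value, then average the difference-in-differences.
    A simplified stand-in for propensity score matching."""
    effects = []
    for t_pre, t_post in treated:
        c_pre, c_post = min(controls, key=lambda c: abs(c[0] - t_pre))
        effects.append((t_post - t_pre) - (c_post - c_pre))
    return sum(effects) / len(effects)

# (pre-campaign metric, post-campaign metric) per unit; hypothetical data.
treated  = [(10, 15), (20, 26), (30, 37)]
controls = [(11, 12), (19, 21), (29, 30)]

print(match_and_did(treated, controls))  # ~4.67 average lift over matched controls
```

The parallel-trends check (step 4) would additionally require pre-campaign data from multiple periods, which this sketch omits.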

10. How do you increase daily active users?

1) Easy onboarding: providing a seamless onboarding experience can
significantly reduce abandonment rates. The more difficult it is to begin using an
app - too many steps to sign up, too many information fields, complex
features/functions, etc. - the more likely users are to abandon it.
2) Use Push Notifications (The Right Way)
3) Include Elements of Mobile Personalization: mobile personalization is arguably
one of the most important aspects of a compelling application. Personalization
helps provide a more unique, relevant experience to the user. The more aligned
the experience is with a user’s needs and preferences, the more likely they are to
continue to use the application.
4) Offer an incentivization program: to drive engagement and retention you need to
give users an incentive to use your app. Mobile-specific rewards, specialized content
access, coupons, special promotions, and other offers will help drive conversions
and encourage engagement.
5) Encourage Two-Way Communication: asking your users for feedback will show
that their input is being considered to drive the app in the direction they want it to
go. The added benefit of opening these lines of communication with your users is
that they won’t be as likely to post a negative review on the app stores if they can
tell you first. Showing responsiveness and addressing any questions or concerns
will boost your engagement and retention rates, encourage positive reviews, and
build long-term brand loyalty.

11. For a product, you have ticket data from several previous feature launches (bugs,
system errors, and the like), plus how many people, and which people, spent how long
resolving each ticket. What suggestions would you give the tech support team?

People-allocation strategy: put more people on the devices where more tickets occur.

12. Cohort analysis

A cohort is a group of users who share a common characteristic that is identified in this
report by an Analytics dimension. For example, all users with the same Acquisition Date belong
to the same cohort. The Cohort Analysis report lets you isolate and analyze cohort behavior.
Cohort analysis helps you understand the behavior of component groups of users apart
from your user population as a whole. Examples of how you can use cohort analysis include:
1. Examine individual cohorts to gauge response to short-term marketing efforts like
single-day email campaigns.
2. See how the behavior and performance of individual groups of users changes day to day,
week to week, and month to month, relative to when you acquired those users.
3. Organize users into groups based on shared characteristics like Acquisition Date, and
then examine the behavior of those groups according to metrics like User Retention or
Revenue.

e.g., launch a new feature/event and acquire some new users. The new users this month form a
cohort, and we can compare them with a cohort acquired earlier this year on revenue/retention
rate, so that we can check the effect of the new feature/event.
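A minimal cohort-retention sketch along these lines, with made-up cohort sizes and activity counts (the second cohort is imagined to be acquired after a new feature):

```python
def retention_table(cohorts):
    """cohorts maps cohort name -> (cohort size, {months since acquisition:
    active users}). Returns retention rate per cohort per period."""
    return {
        name: {period: active / size for period, active in activity.items()}
        for name, (size, activity) in cohorts.items()
    }

cohorts = {
    "2023-01": (1000, {1: 400, 2: 300, 3: 250}),
    "2023-02": (1200, {1: 600, 2: 480}),  # cohort acquired after the new feature
}
table = retention_table(cohorts)
print(table["2023-02"][1])  # 0.5 month-1 retention vs 0.4 for the earlier cohort
```

Comparing the same period-offset across cohorts, as here, is what keeps the comparison fair regardless of when each cohort was acquired.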

13. A new manager believes that operators who work longer have higher customer
satisfaction. How would you determine whether this conjecture is correct, with what
data, and how would you analyze it?

● Longer working hours per day:
Group 1: require operators to work 10 hours a day at $30/hour
Group 2: require operators to work 6 hours a day at $20/hour
Group 3: let operators choose to work 6 or 10 hours a day at $20/$30
Check whether there are significant differences in customer satisfaction across the
groups, so we can test which factor actually influences it.
● Longer tenure: match operators on features that may influence length
of working experience, such as education, communication ability, and proficiency in
problem solving; after finding the matched operators, regress the satisfaction
level on years of experience.

14. A global sales ratio went up. Is it possible that, broken down by region, the ratio went down in every region?

Simpson's Paradox: there are different subgroups in the population, and within each
subgroup the results are stable, but when aggregated, the changing mix of subgroups
drives the overall result.
* reasons: wrong setup (run sanity checks); the change affects different groups
differently (e.g., when splitting by user id, the change may increase events for
new users more than for experienced users)
* frequent subgroup differences:
1) weekends vs. weekdays
2) new vs. experienced users
3) men/women or other demographic groups
4) region, language
* how to make a decision: dig deeper to check why the effect differs across
different users

         Total   Subregion 1   Subregion 2   Subregion 3
Before   35%     90%           10%           5%
After    50%     50%           50%           50%
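Simpson's paradox can be demonstrated numerically. With the made-up region data below, the rate drops in every region, yet the aggregate rises because the mix shifts toward the high-rate region:

```python
def rate(successes, totals):
    """Aggregate success rate across subgroups."""
    return sum(successes) / sum(totals)

# Per-region (successes, trials); the rate DROPS in every region...
before = {"region_a": (90, 100), "region_b": (90, 900)}    # 90% and 10%
after  = {"region_a": (720, 900), "region_b": (5, 100)}    # 80% and 5%

total_before = rate(*zip(*before.values()))
total_after = rate(*zip(*after.values()))
print(total_before, total_after)  # 0.18 -> 0.725: the aggregate RISES (mix shift)
```

Most of the volume moved into region_a, whose rate is high even after falling, which is exactly the mix effect described above.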

15. We want to build a new scheduling button for coffee shops on Google Maps
(for example, clicking the button takes you to a new page where you can arrange your
schedule, or links the schedule directly to your calendar). Question: how do we know
whether this new feature is good (in other words, is it worth building)?
Research questions: why do we want to design such a new feature? Did we receive
any complaints, or do we want to attract new users, increase the retention rate, or
increase revenue (partnerships with businesses)?
1) qualitative research
● survey: how do users notice it, how do they learn to use it, what functions
do they use, how do they like it
● usability test: give instructions and test whether users can complete each task,
plus questions on user experience afterward
2) quantitative research
● A/B test: retention rate, ROI, complaint rate
● matching and difference-in-differences: who is inclined to use it, or which
functions have higher CTR
● logistic regression

16. At work, a customer comes to you to request a project. What questions should you ask the customer?

1) the reason for conducting the research - the goal of the project
2) the major outcomes they care about
3) the details of the change and what each change means
4) other changes that may influence the outcome at the same time
5) the deadline, and who to contact with questions

17. A new game has been released. How do you estimate the number of downloads?

1) user base size and the general scale of downloads from past experience
2) how many download channels/platforms we have and their general/average
downloads in the past

18. A piece of software has been updated. How do you measure the impact of the update on how users use it?

1) A/B test:
- based on the kind of impact we care about (for example, if we care
about consumption, how long people use the app; if social network,
how many comments or messages people send; likewise for content
production or user experience, etc.), pick the metric and randomly select
users for the A/B test
- may need to separate Android and iOS users (and think about how to choose
the treatment group, because some users upgrade by themselves
while others may be force-upgraded)
- use a z-test to compare the two groups and compare the difference
with the minimum detectable effect we expect to see
- sanity check (A/A test), sign check
2) observational studies
- obtain data before and after the upgrade; again, pick up the proper metric
based on research goal and what exact changes have been made
- matching and difference in difference approach
- parallel check and calculate the result
3) qualitative research: UX research such as survey/interview, or usability test.

19. Google has self-driving car technology. What business model could monetize it? What
challenges do you foresee?

Business models:
1) personal vehicles
2) ride services (Uber-style rides, taxis, buses, commuting, parking)
3) transportation and logistics
Challenges:
1) how to decrease the error rate in object identification (safety issues)
2) the cost of the technology
3) the accuracy of navigation and maps
4) cooperation with urban infrastructure systems, and its cost
5) how to protect information and data privacy

20. Suppose Google Express has a new VP who believes Google Express's profit margin on
shampoo is below the market average. How would you test whether this is correct? You have
two charts: one showing the distribution of shampoo brands sold by Express vs. other
sellers, and one showing the profit margin of each shampoo brand.

How to check: a z-test (or one-sample t-test, depending on the sample size, i.e., how many
shampoo brands we have) comparing Google Express's average profit margin against the
population average.
Why this can happen: Simpson's Paradox (a trend in subgroups can disappear or reverse
when aggregated). Some brands have high profit margins, but Google Express customers
buy less of them; they may purchase more of the lower-margin brands.

21. Gmail wants to launch a new feature that automatically suggests email recipients. How
do we decide whether to launch, and how do we design the experiment?

1) First clarify the goal: for example, is it to attract new users, or to increase existing users' engagement?
2) From the goal, determine the relevant metrics and how to measure them
3) randomly assign users to the control and treatment groups if possible. If not, we may
need to choose the users based on the goal
4) run the A/B test (z-test) to find the overall result and compare it to the minimum
detectable effect size we expect
5) slice the subjects into groups and check whether any subgroup shows a different conclusion
6) finally, summarize the action plan for each possible outcome

22. What metrics would you use to measure the quality of Google Scholar's service? For
example, if one download out of 1,000 fails, the user will be unhappy.

1) successful download rate
2) how many searches a user needs before finding and downloading the correct paper
3) load time for all search results
4) number of complaints received
5) UX research data from opinion surveys

23. We want to update the information on Google Maps; for example, some addresses and
phone numbers are out of date, and we want to refresh them. What do you think?

1) First I would like to understand the value of this information.
- Why do users use it? What about the usage for
merchants? What about the company's ad revenue?
- Where does this research question come from? User complaints, or
decreasing revenue?
2) How to measure such value?
a) What metrics?
- For users, we can track DAU, Events per user.
- For merchants, we can check complaints rates
- For Google, we can track ad revenue

b) Does updating info improve these metrics?


- any history data? compare places with contact info and without?
- design AB test or observational study (matching and DiD)
3) Decision?
- How much incremental value does it bring?
- How much investment in engineering and operation is required?
- Does the benefit justify the investment?
- Make a call with business partners.

24. If you could use only one metric to judge whether a new product is successful, which would you choose?

It depends on the goal of the research and the goal of the product design.
1) user engagement: total usage time, frequency of opening the app
2) social network: use of like, comment, share, and messaging features
3) retention: 7/14/28/180/365-day retention rates
4) attracting new users: # of sign-ups and referrals
5) solving a specific problem in the old product: complaint rate
6) monetization: revenue

25. A new UI was launched, released on 6/1. You now have data for the six months before
and the six months after release. How would you evaluate the new UI?

1) With treatment and control groups: A/B test / difference-in-differences
2) Without treatment and control groups: fixed-effects model: outcome (engagement
metrics) ~ binary treatment variable + date (in months) + control variables
3) qualitative research
* matching and difference-in-differences:
- obtain data from before and after the upgrade; again, pick the proper metric based
on the research goal and the exact changes that were made
- apply matching and the difference-in-differences approach
- run the parallel-trends check and calculate the result

26. How do you forecast how many people will sign up for YouTube TV? What data do you need, and what method would you use?

1) try to estimate the size of the market/subscriber base from competing products
2) design and run a smaller-scale experiment and collect data about who
signs up for the service
3) match the sign-ups with similar users who did not sign up, so that we
have a binary label for prediction
4) use a machine learning approach to train on the important features of the
training set and then make the prediction

27. If we want to make a big change to a web page, how do we decide whether to release
the change? / We are running a promotion for a new enterprise-facing product to attract
users. How do you measure the promotion's impact on the product, and whether it worked?
What methods would you use to analyze it?

1) AB test
2) observational study: matching and difference-in-differences
- obtain data from before and after the change; pick the proper metric
based on the research goal and the exact changes that were made
- apply matching and the difference-in-differences approach
- run the parallel-trends check and calculate the result
3) cohort analysis for the promotion (compare the new users from the promotion with
earlier regular new users on retention and revenue)

28. Google has released a new version of their search algorithm, for which they used A/B
testing. During the testing process, engineers realized that the new algorithm was not
implemented correctly and returned less relevant results. 2 things happened during
testing:
a. People in the treatment group performed more queries than the control group.
b. Advertising revenue was higher in the treatment group as well.
Q1: What may be the cause of people in the treatment group performing more searches
than the control group?

We know the new algorithm produced less relevant search results. This means users
may need to make additional searches to clarify what they are looking for
under the new algorithm. To test this hypothesis, we could study how close together
searches are in time: if additional searches occur very soon after the first, we
can classify them as clarifying searches.

Q2: What do you think caused the new algorithm to generate more advertising revenue,
even though the results were less relevant?

1) We know that more searches are being conducted, since advertisements are
served along with every new search, there are more opportunities for users to
click on the advertisement.
2) Another possibility is that the search algorithm is different than the algorithm used
to display ads. In this case, the ads themselves may be more relevant than the
search results, causing more ad clicks.

Q3: Since the less relevant algorithm resulted in higher advertising revenue, should it be
implemented anyways?

1) The effects described are probably only short-term effects due to the problems
with the algorithm. We shouldn’t sacrifice the long term potential of the site for a
temporary increase in revenue and searches. Google is probably best positioned
to win in the long term when it has the most relevant search algorithm.
2) Also depend on what is the major goal of the research, which is to increase the
revenue or user’s experience on searching
3) Refer to diverse decision makers (re-evaluate how important each metric is) and
use intuition & experience: it may be fine to have one metric significant while
the other shows no significant change (this could be misleading if both metrics
actually move a lot together)

29. A car company produces all the cars for a country, we’ll call them Car X and Car Y. 50%
of the population drives Car X, the other 50% drives Car Y. Two potential technologies
have just been discovered that help reduce gasoline usage!
Technology A: Increases the MPG of Car X from 50 MPG to 75 MPG
Technology B: Increases the MPG of Car Y from 10 MPG to 11 MPG
Which technology should be implemented to save the most gasoline for the country?

Assume that the average commute distance for a car is D. Then we define total gas used
(G) as G = D/MPG. So the Total Gas Used Change with Policy A: (D/50)-(D/75) = D/150
Total Gas Used Change with Policy B: (D/10)-(D/11) = D/110
Since D/110 > D/150, Technology B saves more gasoline for the country, despite the much
smaller MPG improvement.
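A two-line script confirms the arithmetic (D is an arbitrary shared distance; only the comparison matters):

```python
def gas_saved(distance, mpg_old, mpg_new):
    """Gallons saved when a car's MPG improves from mpg_old to mpg_new."""
    return distance / mpg_old - distance / mpg_new

D = 1500  # any common distance works; 1500 keeps the numbers tidy
saving_a = gas_saved(D, 50, 75)  # Technology A on Car X: D/150
saving_b = gas_saved(D, 10, 11)  # Technology B on Car Y: D/110

print(saving_a)            # 10.0
print(saving_b)            # ~13.64
print(saving_b > saving_a)  # True: Technology B saves more gas
```

The counterintuitive result comes from gas usage being inversely proportional to MPG: a small improvement on a very inefficient car dominates a large improvement on an already efficient one.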

30. In 2011 Facebook launched Messenger as a stand alone app for mobile devices (it used
to only be part of the Facebook App). How would you track the performance of this new
application? What metrics would you use?

1) DAU, MAU, and new user sign-ups


2) Total time spent in the app (interpret with care: more time could also mean the
app is harder to use)
3) Number of messages being sent before and after change.
4) Response time to messages received
5) Retention

31. Facebook is testing different designs of the user homepage. They have come up with 50
variations. They would like to test all of them and choose the best. How would you setup
this experiment? What metrics would you calculate and how would you report the
results?

T-tests among pairs of treatments


1) This is similar to an AB test except we randomly assign all 50 versions to users.
With so many versions you will need to make sure you have enough users to get
a statistically significant result (this usually is not an issue for Facebook)
2) With multiple hypothesis tests you increase the likelihood of a rare event
occurring, meaning you need to adjust the alpha value you compare your
p-values against for significance accordingly.
- For X metrics, what is the probability of at least 1 false positive? Assuming
independence at alpha = 0.05: P(FP = 0) = 0.95^X, so P(FP >= 1) = 1 - 0.95^X. If
the metrics are correlated, this overestimates P(FP >= 1).
- Bonferroni correction: simple to calculate and assumption-free, but can be too
conservative because multiple metrics can move together. To calculate:
individual alpha = overall alpha / n
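The two calculations above, as a short script (alpha = 0.05 and 50 variants are taken from this question):

```python
def prob_at_least_one_fp(alpha, n_tests):
    """P(at least one false positive) across n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(overall_alpha, n_tests):
    """Per-test alpha that caps the family-wise error rate at overall_alpha."""
    return overall_alpha / n_tests

# With 50 homepage variants each tested against control at alpha = 0.05:
print(prob_at_least_one_fp(0.05, 50))  # ~0.92 -- almost certain to see a false winner
print(bonferroni_alpha(0.05, 50))      # 0.001 per comparison
```

This is why reporting a single "best of 50" variant without correction is misleading: some variant will look significant by chance alone.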

SQL
1. date | user_num. Find the top 100 week-over-week increases/drops. (In the end the
interviewer only asked for the increase/drop between week-1 and the current week, which
DATEADD or DATEDIFF handles.)

SELECT cur.week_num, ABS(cur.cnt - prev.cnt) as change_by_last_week

FROM

(
SELECT WEEK(date) as week_num, SUM(user_num) as cnt
FROM table
GROUP BY WEEK(date)
) cur

LEFT JOIN

(
SELECT WEEK(date) as week_num, SUM(user_num) as cnt
FROM table
GROUP BY WEEK(date)
) prev on prev.week_num = cur.week_num - 1

ORDER BY change_by_last_week DESC
LIMIT 100
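The week-over-week logic can be sanity-checked with an in-memory SQLite database (a sketch: SQLite has no WEEK(), so strftime('%W', ...) stands in, and the table daily and its rows are made-up sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (date TEXT, user_num INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", [
    ("2019-01-07", 100), ("2019-01-08", 120),   # strftime('%W') week 1, total 220
    ("2019-01-14", 300), ("2019-01-15", 280),   # strftime('%W') week 2, total 580
])

# Aggregate to weeks first, then self-join week w to week w-1.
rows = conn.execute("""
    WITH weekly AS (
        SELECT CAST(strftime('%W', date) AS INTEGER) AS week_num,
               SUM(user_num) AS cnt
        FROM daily
        GROUP BY week_num
    )
    SELECT cur.week_num, ABS(cur.cnt - prev.cnt) AS change_by_last_week
    FROM weekly cur
    JOIN weekly prev ON prev.week_num = cur.week_num - 1
    ORDER BY change_by_last_week DESC
    LIMIT 100
""").fetchall()
print(rows)  # [(2, 360)]: week 2 total 580 vs week 1 total 220
```

An inner JOIN drops the first week; a LEFT JOIN would keep it with a NULL change instead.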

2. view|video_id|date. For each video, find the earliest date on which it reached 100 views.

# If view is a cumulative view count:
SELECT video_id, min(d_date) as first_day_100
FROM table
WHERE view>=100
GROUP BY video_id

# If view is the number of new views each day:
SELECT tb.video_id, min(tb.d_date) as first_day_100
FROM

(
SELECT video_id,
sum(view) over(partition by video_id order by d_date) as cnt,
d_date
FROM table
) tb

WHERE tb.cnt>=100
GROUP BY tb.video_id
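The running-sum version can be verified the same way (a sketch with made-up data; SQLite >= 3.25 is assumed for window-function support):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite >= 3.25
conn.execute("CREATE TABLE views (video_id TEXT, d_date TEXT, view INTEGER)")
conn.executemany("INSERT INTO views VALUES (?, ?, ?)", [
    ("v1", "2020-01-01", 60), ("v1", "2020-01-02", 30), ("v1", "2020-01-03", 50),
    ("v2", "2020-01-01", 120),
])

rows = conn.execute("""
    SELECT video_id, MIN(d_date) AS first_day_100
    FROM (
        SELECT video_id, d_date,
               SUM(view) OVER (PARTITION BY video_id ORDER BY d_date) AS cnt
        FROM views
    )
    WHERE cnt >= 100
    GROUP BY video_id
    ORDER BY video_id
""").fetchall()
print(rows)  # v1 crosses 100 on Jan 3 (60+30+50 = 140), v2 on day one
```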

3. When you're given a database, what do you look at at the very beginning?

Data quality and accuracy, descriptive statistics, missing data, outliers

4. When joining multiple tables is slow, what do you do?

Use Spark SQL; add more filters; optimize the query; select only the necessary columns
when reading tables; when joining, pay attention to which table (large vs. small) drives
the join, and set up indexes

5. Given three variables: user_id, date, transaction_id, write SQL to show the user_ids that
have transactions in Nov 2017 but not in Sep 2017.

SELECT DISTINCT t1.user_id
FROM

(
SELECT user_id
FROM table
WHERE date LIKE '2017-11%'
) t1

LEFT JOIN

(
SELECT user_id
FROM table
WHERE date LIKE '2017-09%'
) t2 on t1.user_id=t2.user_id
WHERE t2.user_id is NULL
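The anti-join pattern (LEFT JOIN, then keep rows where the right side is NULL) can be demonstrated with toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (user_id TEXT, date TEXT, transaction_id TEXT)")
conn.executemany("INSERT INTO txn VALUES (?, ?, ?)", [
    ("a", "2017-11-05", "t1"),                               # Nov only -> returned
    ("b", "2017-11-10", "t2"), ("b", "2017-09-02", "t3"),    # both months -> excluded
    ("c", "2017-09-20", "t4"),                               # Sep only -> excluded
])

rows = conn.execute("""
    SELECT DISTINCT nov.user_id
    FROM (SELECT user_id FROM txn WHERE date LIKE '2017-11%') nov
    LEFT JOIN (SELECT user_id FROM txn WHERE date LIKE '2017-09%') sep
           ON nov.user_id = sep.user_id
    WHERE sep.user_id IS NULL
""").fetchall()
print(rows)  # [('a',)]
```

NOT EXISTS or NOT IN subqueries express the same thing; the anti-join version tends to be the easiest to reason about in interviews.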

6. Table【in_app_purchase】:
uid: unique user id.
timestamp: specific timestamp detailed to seconds.
purchase amount: the amount of a one-time purchase.
This is a table containing in-app purchase data. A certain user could have multiple
purchases on the same day
Question 1: List out the top 3 names of the users who have the most purchase amount
on '2018-01-01'

SELECT uid, SUM(purchase_amount) as total_amount

FROM in_app_purchase
WHERE DATE(FROM_UNIXTIME(timestamp)) = '2018-01-01'
GROUP BY uid
ORDER BY total_amount DESC
LIMIT 3

Question 2: Sort the table by timestamp for each user. Create a new column named "cum
amount" which calculates the cumulative amount of a certain user of purchase on the
same day.

SELECT A.uid, A.d_date,

SUM(A.purchase_amount) OVER(partition by A.uid, A.d_date ORDER BY A.ts
rows between unbounded preceding and current row) as cum_amount
FROM

(
SELECT uid, timestamp as ts,
DATE(FROM_UNIXTIME(timestamp)) as d_date,
purchase_amount
FROM in_app_purchase
) A
ORDER BY A.uid, A.ts
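A runnable sketch of the per-user, per-day cumulative sum, using SQLite's DATE(ts, 'unixepoch') in place of MySQL's FROM_UNIXTIME (the table and its rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite >= 3.25
conn.execute("CREATE TABLE p (uid TEXT, ts INTEGER, amount REAL)")
conn.executemany("INSERT INTO p VALUES (?, ?, ?)", [
    ("u1", 1514793600, 10.0),  # 2018-01-01 08:00 UTC
    ("u1", 1514797200, 5.0),   # 2018-01-01 09:00 UTC
    ("u1", 1514880000, 7.0),   # 2018-01-02 08:00 UTC
])

rows = conn.execute("""
    SELECT uid, DATE(ts, 'unixepoch') AS d_date,
           SUM(amount) OVER (PARTITION BY uid, DATE(ts, 'unixepoch')
                             ORDER BY ts
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_amount
    FROM p
    ORDER BY uid, ts
""").fetchall()
print(rows)  # running total resets on the new day: 10, 15, then 7
```

Partitioning by both uid and the derived date is what makes the running total restart each day, matching "cumulative amount ... on the same day".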
Question 3: For each day, calculate the growth rate of purchase amount compared to the
previous day. if no result for a previous day, show 'Null'.

SELECT A.d_date,
CASE WHEN B.d_date IS NULL THEN NULL
ELSE (A.total_amount - B.total_amount) / B.total_amount
END AS growth_rate

FROM

(
SELECT SUM(purchase_amount) as total_amount,
DATE(FROM_UNIXTIME(timestamp)) as d_date
FROM in_app_purchase
GROUP BY d_date
) A

LEFT JOIN

(
SELECT SUM(purchase_amount) as total_amount,
DATE(FROM_UNIXTIME(timestamp)) as d_date
FROM in_app_purchase
GROUP BY d_date
) B on A.d_date = DATE_ADD(B.d_date, INTERVAL 1 DAY)

Question 4: For each day, calculate a 30day rolling average purchase amount.

SELECT tb.d_date,
AVG(tb.total_amount) OVER(ORDER BY tb.d_date ASC
rows between 29 preceding and current row) as avg_30

FROM
(
SELECT SUM(purchase_amount) as total_amount,
DATE(FROM_UNIXTIME(timestamp)) as d_date
FROM in_app_purchase
GROUP BY d_date
) tb

ORDER BY tb.d_date

(Note: ROWS BETWEEN 29 PRECEDING assumes every day has at least one purchase; with
gaps, join against a calendar table first so the frame really covers 30 calendar days.)
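The same frame clause can be exercised on toy data; here the window is shrunk from 29 PRECEDING to 2 PRECEDING (a 3-day window) purely to keep the example small:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite >= 3.25
conn.execute("CREATE TABLE daily (d_date TEXT, total_amount REAL)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", [
    ("2018-01-01", 10.0), ("2018-01-02", 20.0),
    ("2018-01-03", 30.0), ("2018-01-04", 40.0),
])

# 2 PRECEDING + current row = 3-day rolling window
rows = conn.execute("""
    SELECT d_date,
           AVG(total_amount) OVER (ORDER BY d_date
                                   ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS avg_3
    FROM daily
    ORDER BY d_date
""").fetchall()
print(rows)
# [('2018-01-01', 10.0), ('2018-01-02', 15.0), ('2018-01-03', 20.0), ('2018-01-04', 30.0)]
```

Note the early rows average over fewer days (the frame is truncated at the start); some analyses prefer to show NULL until a full window is available.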

7. A table has three columns: user_id, ad_id, time_stamp. For each user-ad pair where the
user saw the ad more than twice, find the time difference between the last and the
second-to-last view.

SELECT A.user_id, A.ad_id,


MAX(A.last)-MAX(A.previous_t) as diff
FROM

(
SELECT t1.user_id, t1.ad_id, t2.timestamp as last,
LAG(t2.timestamp) OVER(PARTITION BY t1.user_id, t1.ad_id ORDER BY t2.timestamp) as
previous_t
FROM
(
SELECT user_id, ad_id
FROM table
GROUP BY user_id, ad_id
HAVING COUNT(ad_id)>2
) t1

JOIN

(
SELECT *
FROM table
) t2 ON t1.user_id=t2.user_id AND t1.ad_id=t2.ad_id
)A
GROUP BY 1, 2
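The LAG-plus-MAX trick (MAX(ts) is the last view; MAX(prev_ts) is the second-to-last, since aggregates ignore the NULL that LAG produces on the first row) can be checked on a toy table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # LAG() needs SQLite >= 3.25
conn.execute("CREATE TABLE impressions (user_id TEXT, ad_id TEXT, ts INTEGER)")
conn.executemany("INSERT INTO impressions VALUES (?, ?, ?)", [
    ("u1", "ad1", 100), ("u1", "ad1", 250), ("u1", "ad1", 400),  # 3 views -> kept
    ("u1", "ad2", 10), ("u1", "ad2", 20),                        # only 2 -> filtered
])

rows = conn.execute("""
    SELECT user_id, ad_id, MAX(ts) - MAX(prev_ts) AS diff
    FROM (
        SELECT user_id, ad_id, ts,
               LAG(ts) OVER (PARTITION BY user_id, ad_id ORDER BY ts) AS prev_ts,
               COUNT(*) OVER (PARTITION BY user_id, ad_id) AS n_views
        FROM impressions
    )
    WHERE n_views > 2
    GROUP BY user_id, ad_id
""").fetchall()
print(rows)  # [('u1', 'ad1', 150)]: last view at 400, second-to-last at 250
```

Using a COUNT(*) window instead of a separate HAVING subquery avoids the extra self-join in the version above.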

11. Use SQL to compute retention rate, like rate, and friend-request acceptance rate

1) Retention rate (day-1 retention)
SELECT A.first_day AS install_dt,
COUNT(DISTINCT A.player_id) AS installs,
ROUND(SUM(IF(B.player_id is null,0,1))/COUNT(DISTINCT A.player_id),2) as Day1_retention
FROM

(
SELECT player_id, MIN(event_date) as first_day
FROM Activity
GROUP BY player_id
)A

LEFT JOIN

(
SELECT player_id, event_date
FROM Activity
) B on B.event_date = DATE_ADD(A.first_day, INTERVAL 1 DAY) AND A.player_id=B.player_id

GROUP BY A.first_day
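A runnable check of the day-1 retention query on toy data, using SQLite's DATE(x, '+1 day') for the date shift:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Activity (player_id TEXT, event_date TEXT)")
conn.executemany("INSERT INTO Activity VALUES (?, ?)", [
    ("p1", "2020-05-01"), ("p1", "2020-05-02"),  # came back on day 1 -> retained
    ("p2", "2020-05-01"),                        # never returned -> not retained
])

rows = conn.execute("""
    SELECT A.first_day AS install_dt,
           COUNT(DISTINCT A.player_id) AS installs,
           ROUND(SUM(CASE WHEN B.player_id IS NULL THEN 0 ELSE 1 END) * 1.0
                 / COUNT(DISTINCT A.player_id), 2) AS day1_retention
    FROM (SELECT player_id, MIN(event_date) AS first_day
          FROM Activity GROUP BY player_id) A
    LEFT JOIN Activity B
           ON B.player_id = A.player_id
          AND B.event_date = DATE(A.first_day, '+1 day')
    GROUP BY A.first_day
""").fetchall()
print(rows)  # [('2020-05-01', 2, 0.5)]: 1 of 2 installers returned the next day
```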

12. Given a table with columns productid, platform, revenue, find the top 10 by revenue
for (productid, platform).

SELECT productid, platform, SUM(revenue) as revenue

FROM table
GROUP BY productid, platform
ORDER BY revenue DESC
LIMIT 10

Alternatively, to get the top 10 products within each platform (note: partitioning by both
productid and platform would make every row rank 1):

SELECT A.revenue, A.productid, A.platform

FROM
(
SELECT revenue, productid, platform,
RANK() OVER(PARTITION BY platform ORDER BY revenue DESC) as rnk
FROM table
) A
WHERE A.rnk<=10

13. Given a table with columns productid, developer name, find the top 10 developers by
revenue. Here you need to consider whether to use a LEFT JOIN, because a primary key
in one table may be missing from the other table.

SELECT t2.developer, SUM(IFNULL(t1.revenue,0)) as revenue

FROM t2
LEFT JOIN t1 ON t2.productid=t1.productid
GROUP BY t2.developer
ORDER BY revenue DESC
LIMIT 10

14. Compute the revenue for every developer. The trap: some revenues are 0, so there is no
record in the first table even though the developer name appears in the other table; the
question tests whether you account for this.

SELECT t2.developer, SUM(IFNULL(t1.revenue,0)) as revenue

FROM t2
LEFT JOIN t1 ON t2.productid=t1.productid
GROUP BY t2.developer
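A toy example of why the LEFT JOIN plus IFNULL matters (table and column names here are invented): the developer with no revenue rows still appears, with revenue 0, instead of being silently dropped by an inner join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue_t (productid TEXT, revenue REAL)")
conn.execute("CREATE TABLE product_t (productid TEXT, developer TEXT)")
conn.executemany("INSERT INTO revenue_t VALUES (?, ?)", [("p1", 100.0)])
conn.executemany("INSERT INTO product_t VALUES (?, ?)",
                 [("p1", "dev_a"), ("p2", "dev_b")])  # p2 has no revenue row

rows = conn.execute("""
    SELECT p.developer, SUM(IFNULL(r.revenue, 0)) AS revenue
    FROM product_t p
    LEFT JOIN revenue_t r ON p.productid = r.productid
    GROUP BY p.developer
    ORDER BY revenue DESC
""").fetchall()
print(rows)  # dev_b appears with revenue 0 instead of being dropped
```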

15. Given a product table with roughly these columns: order date, user id, item name, price,
quantity, country. Find the best-selling item in Germany (question remembered only
approximately).

SELECT item_name, SUM(quantity) as total_q


FROM table
WHERE country='Germany'
GROUP BY item_name
ORDER BY 2 DESC
LIMIT 1

16. A table of YouTube users' daily view counts; every (date, user_id) pair is present, and
view is 0 if the user watched nothing that day. Columns: date, user_id, view.
SQL1: Find the active users (view count > 0) in a given month.

SELECT DISTINCT user_id

FROM table
WHERE view>0
AND date LIKE '2019-12%'

SQL2: For each day, find the users who watched at least one video in the previous 7 days (inclusive).

SELECT date, user_id

FROM

(
SELECT user_id, date,
SUM(view) OVER(partition by user_id order by date ASC rows between 6 preceding and current
row) as view_in_7days
FROM table
) tb
WHERE view_in_7days>0
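The 7-day rolling activity window can be checked on a small complete date-by-user grid (made-up data; the frame works because every day is present for every user):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite >= 3.25
conn.execute("CREATE TABLE plays (date TEXT, user_id TEXT, view INTEGER)")
# The (date, user_id) grid is complete; view = 0 on days with no watching.
conn.executemany("INSERT INTO plays VALUES (?, ?, ?)", [
    ("2019-12-01", "u1", 3), ("2019-12-02", "u1", 0), ("2019-12-03", "u1", 0),
    ("2019-12-01", "u2", 0), ("2019-12-02", "u2", 0), ("2019-12-03", "u2", 0),
])

rows = conn.execute("""
    SELECT date, user_id
    FROM (
        SELECT date, user_id,
               SUM(view) OVER (PARTITION BY user_id ORDER BY date
                               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS view_7d
        FROM plays
    )
    WHERE view_7d > 0
    ORDER BY date, user_id
""").fetchall()
print(rows)  # u1 counts as 7-day-active on all three days; u2 never does
```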

https://www.1point3acres.com/bbs/page-322459.html
