The Netflix Recommender System:

2019-04-30  xiiatuuo

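User research shows that a Netflix member loses interest after 60 to 90 seconds of choosing, having looked at 10 to 20 titles across one or two screens. The goal of the recommender system is to help the member find something interesting within those two screens.
The system also takes into account how each member watches (e.g., the device, time of day, day of week, intensity of watching).
There are several recommendation strategies: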
1) Personalized Video Ranker (PVR)
Orders the entire catalog of videos (or subsets selected by genre or other filtering) for each member profile in a personalized way.
Because we use PVR so widely, it must be good at general-purpose relative rankings throughout the entire catalog; this limits how personalized it can actually be.
PVR has to rank every video within a category, and has to do so for every category, which in practice limits how personalized it can be.
2) Top-N Video Ranker
Finds the best few personalized recommendations in the entire catalog for each member, that is, focusing only on the head of the ranking, a freedom that PVR does not have because it gets used to rank arbitrary subsets of the catalog.
The Top-N ranker only ranks the head of the catalog and picks out the top N, so it has more freedom in its methods than PVR. The two still share many of the same attributes.
3) Trending Now
Used to drive the Trending Now row; it performs well in two kinds of situations.

  1. Business value
    The effective catalog size (ECS) is a metric that describes how spread viewing is across the items in our catalog. It tells us how many videos are required to account for a typical hour streamed.
    ECS is computed as follows:


    $$\mathrm{ECS} = 2\left(\sum_{i=1}^{N} p_i \, i\right) - 1$$

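    Here p_i is the share of all hours streamed that came from the video ranked i by hours streamed. Note that p_i ≥ p_{i+1} for i = 1, ..., N-1, and the p_i sum to 1.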
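
A minimal sketch of the ECS calculation, assuming the definition above (shares sorted in descending order and summing to 1). The function name `effective_catalog_size` and the input format (a list of hours streamed per video) are illustrative choices, not taken from the paper.

```python
def effective_catalog_size(hours_per_video):
    """ECS = 2 * sum_i(p_i * i) - 1, with shares p_i sorted in descending order."""
    total = sum(hours_per_video)
    if total <= 0:
        raise ValueError("total hours streamed must be positive")
    # p_i: share of hours for the video at rank i (rank 1 = most watched)
    shares = sorted((h / total for h in hours_per_video), reverse=True)
    weighted_rank = sum(p * rank for rank, p in enumerate(shares, start=1))
    return 2 * weighted_rank - 1


# All viewing concentrated on a single video gives ECS = 1;
# viewing spread evenly over N videos gives ECS = N.
concentrated = effective_catalog_size([10, 0, 0])
uniform = effective_catalog_size([5, 5, 5, 5])
```

The two extremes bracket the metric: ECS ranges from 1 (one video accounts for everything) to N (all N videos are watched equally).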

  2. Evaluation criteria
    Intuition does not necessarily match online results. For example, for "House of Cards", related recommendations that looked more similar performed worse than a broader set of results.
    We have observed that improving engagement (the time that our members spend viewing Netflix content) is strongly correlated with improving retention.
    Statistical significance depends heavily on the number of members in each test cell. For example, if we find that 50% of the members in the test have retained when we compute our retention metric, then we need roughly 2 million members per cell to measure a retention delta of 50.05% - 49.95% = 0.1% with statistical confidence. This type of plot can be used as a guide to choose the sample size for the cells in a test. For example, detecting a retention delta of 0.2% requires the sample size traced by the black line labeled 0.2%, which changes as a function of the average retention rate when the experiment stops, and is largest (south of 500k members per cell) when the retention rate is 50%.


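    [Figure: required sample size per cell as a function of the average retention rate, one curve per retention delta]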

    Offline testing speeds up iteration: offline experiments allow us to iterate quickly on algorithm prototypes, and to prune the candidate variants that we use in actual A/B experiments.
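
The sample-size figures quoted above for retention tests can be roughly reproduced with a back-of-the-envelope calculation. This is a sketch under stated assumptions, not the paper's documented method: it uses a normal approximation for the difference of two proportions, a two-sided 95% confidence level (z = 1.96), and no explicit statistical-power term. The function name `members_per_cell` is hypothetical.

```python
import math


def members_per_cell(retention_rate, delta, z=1.96):
    """Approximate members needed per cell to detect a retention delta.

    Assumes two equally sized cells with the same underlying retention rate.
    The variance of the difference of the two observed proportions is
    2 * p * (1 - p) / n, and we require delta >= z * sqrt(variance).
    """
    variance_term = 2 * retention_rate * (1 - retention_rate)
    return math.ceil(z ** 2 * variance_term / delta ** 2)


# A 0.1% delta at 50% retention needs about 1.9 million members per cell;
# a 0.2% delta needs a quarter of that, a little under 500k.
n_small_delta = members_per_cell(0.5, 0.001)
n_large_delta = members_per_cell(0.5, 0.002)
```

Because p(1 - p) peaks at p = 0.5, the required cell size is largest at a 50% retention rate, which matches the shape of the curves described above.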

  3. Key open problems
    1) Better Experimentation Protocols
    We still need better offline and online evaluation metrics that capture the overall benefit, for example when weighing long-term gains against short-term gains.
    2) Global Algorithms
    3) Controlling for Presentation Bias
    Introduce randomness into the recommendations.
    4) Page Construction
    It took us a couple of years to find a fully personalized algorithm to construct a page of recommendations that A/B tested better than a page based on a template (itself optimized through years of A/B testing).
    5) Member Coldstarting
    Today, our member coldstart approach has evolved into a survey given during the sign-up process, during which we ask new members to select videos from an algorithmically populated set that we use as input into all of our algorithms.
    6) Choosing the Best Evidence to Support Each Recommendation
    Highlight different aspects of a video, such as an actor or director involved in it.
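
As an illustration of item 3) above, one generic way to introduce randomness into a ranked list is to occasionally swap a position with a randomly chosen later one. This is only a sketch of the general idea; the notes do not say how Netflix randomizes recommendations, and the function name `explore_ranking` and the `epsilon` parameter are hypothetical.

```python
import random


def explore_ranking(ranked, epsilon, rng=None):
    """Return a copy of `ranked` with some positions randomly perturbed.

    With probability `epsilon`, each position is swapped with a uniformly
    chosen position at or after it, so items that would normally sit lower
    sometimes get shown higher up.
    """
    if rng is None:
        rng = random.Random()
    result = list(ranked)
    for i in range(len(result) - 1):
        if rng.random() < epsilon:
            j = rng.randrange(i, len(result))
            result[i], result[j] = result[j], result[i]
    return result


# epsilon = 0 leaves the ranking untouched; larger values explore more.
unchanged = explore_ranking(["a", "b", "c", "d"], epsilon=0.0)
perturbed = explore_ranking(["a", "b", "c", "d"], epsilon=0.3, rng=random.Random(0))
```

Logging which positions were randomized would let later models separate the effect of where an item was shown from how relevant it was.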

  4. Further reading
    Learning a Personalized Homepage



    We want our recommendations to be accurate in that they are relevant to the tastes of our members, but they also need to be diverse so that we can address the spectrum of a member’s interests versus only focusing on one. We want to be able to highlight the depth in the catalog we have in those interests and also the breadth we have across other areas to help our members explore and even find new interests. We want our recommendations to be fresh and responsive to the actions a member takes, such as watching a show, adding to their list, or rating; but we also want some stability so that people are familiar with their homepage and can easily find videos they’ve been recommended in the recent past.
    With a two-dimensional layout of multiple rows, the horizontal direction naturally serves relevance (the videos within a row are related) and the vertical direction naturally serves diversity (the rows differ from one another).
    we consider important

A simple way to add in diversity is to switch from a row-ranking approach to a stage-wise approach using a scoring function that considers both a row as well as its relationship to both the previous rows and the previous videos already chosen for the page. Other approaches to greedily add diversity based on submodular function maximization can also be used.
Diversity can also be additionally incorporated into the scoring model when considering the features of a row compared to the rest of the page by looking at how similar the row is to the rest of the rows or the videos in the row to the videos on the rest of the page.
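
The stage-wise approach described above can be sketched as a greedy loop that picks one row at a time and penalizes rows whose videos are already on the page. This is a minimal sketch, not the actual Netflix scoring function: the name `build_page`, the `diversity_weight` parameter, and the use of video overlap as the similarity measure are all assumptions made for illustration.

```python
def build_page(candidate_rows, num_rows, diversity_weight=0.5):
    """Greedily pick rows, scoring each against what is already on the page.

    candidate_rows maps a row name to (relevance_score, set_of_video_ids).
    """
    page = []
    shown = set()  # videos already placed on the page by earlier rows
    remaining = dict(candidate_rows)

    def stage_score(name):
        relevance, videos = remaining[name]
        # Fraction of this row's videos that earlier rows already show
        overlap = len(videos & shown) / len(videos) if videos else 0.0
        return relevance - diversity_weight * overlap

    while remaining and len(page) < num_rows:
        best = max(remaining, key=stage_score)
        page.append(best)
        shown |= remaining.pop(best)[1]
    return page


rows = {
    "Political Dramas": (0.90, {1, 2, 3}),
    "Critically Acclaimed Dramas": (0.85, {1, 2, 3}),
    "Stand-up Comedy": (0.60, {7, 8, 9}),
}
diverse_page = build_page(rows, num_rows=2)
relevance_only_page = build_page(rows, num_rows=2, diversity_weight=0.0)
```

With the penalty switched off, the page takes the two highest-scoring rows even though they show the same videos; with it on, the second slot goes to the row that adds new videos.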
