Feedback Loop and Causal Inference

2021-02-22 shudaxu

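Systems keep getting more complex
Many of the models and strategies inside a system determine the prior, i.e. the distribution of the input data, so the inputs that downstream models see are themselves biased (and the more complex the system, the larger the bias). One of the central problems of a feedback loop is how to remove these biases, and debiasing [3] has become an active research topic in industry in recent years: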

Debiasing

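What should our approach to this class of problems be? A good starting point is causal inference. For example, at any given moment we can take the confounder perspective and study how the strategy currently running in the system (which plays the role of a confounder) affects the model.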

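In fact, many problems can be explained from the viewpoint of confounding bias in causal inference, because they all aim to eliminate this kind of bias. For example, OLS versus WLS: why is OLS biased when the residual is correlated with a regressor [4], and how can WLS address it [5]? How can we verify whether ESMM has a fundamental problem, why is it biased [3], and how can that be fixed [8]? Why do we need to down-weight high-frequency users/items, i.e. the popularity bias problem [6]? In uplift modeling, why do we need unbiased samples, and what effect does a confounder have [9]? In ad CTR estimation, among the several ways of calibrating for sampling, how does a weighting strategy keep the estimate unbiased [14]? What kinds of selection bias are there, which of them are confounding, and which can be recovered from [15]? Under what conditions is Y on X unconfounded given (conditioning on) S [17], how does that condition relate to back-door paths [16], and how do we adjust the estimator [18]?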

Evaluation

For example, when an A/B experiment cannot be fully randomized, how do we deal with confounders [7]?

Dilemma

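Often we still have to focus on the main problem and not chase debiasing blindly. This is the old question again: is an unbiased estimator always better than a biased one [11]? It is really a trade-off, because bringing confounders into the model adds collinearity to the feature space, which raises the model's dimensionality, destabilizes the coefficients, and can increase the variance [12]. PCA is a common example: it can directly introduce a confounder problem of its own, yet it remains effective, because when what you need is a more stable and effective estimator, bias is not necessarily the primary concern [13].
Another core issue is that causality itself comes from our domain knowledge, not from the data [19]. Building the causal graph is therefore critical: because it is not derived from the data, it necessarily encodes our prior knowledge, and if those priors are wrong, the final conclusions will be affected as well.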
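
The trade-off can be made concrete with the standard decomposition of mean squared error. The symbols below ($\hat\theta$ for a generic estimator and $\theta$ for the true parameter) are not from the original text and are used only for illustration:

```latex
\mathrm{MSE}(\hat\theta)
  = \mathbb{E}\big[(\hat\theta - \theta)^2\big]
  = \underbrace{\big(\mathbb{E}[\hat\theta] - \theta\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta])^2\big]}_{\text{variance}}
```

Under this decomposition, a biased estimator with a sufficiently smaller variance can have a lower MSE than an unbiased one.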

References:
[1]How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility

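[2]: On propensity scores, see https://en.wikipedia.org/wiki/Propensity_score_matching
and https://www.methodology.psu.edu/resources/propensity-scores/ (on why we control for the PS rather than for the covariates directly: controlling for the covariates directly is too strong a requirement [it forces the assumption that the PS matches only when the covariates match]).
The basic idea of the propensity score is this: a fully randomized experiment is trustworthy, but in many settings (e.g. observational studies) the assignment of treatments is not random. So we take the covariates that influence treatment into account, use them to estimate the probability of being treated, and thereby emulate a fully random process in order to reduce bias due to confounding (which is really selection bias).
PS-based methods come in several forms: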

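Propensity Score Matching: use the PS for matching. Typically the treatment group is the smaller one, so we draw from the control group a subset whose propensity scores are similar to those of the treatment group and compare the two.
Propensity Score Subclassification: stratify the samples into several subsets by propensity score, and compare treated and control units within each stratum of similar PS.
Propensity Score Weighting: weight the treated units by $\frac 1 {P(T=1|X)}$ and the untreated units by $\frac 1 {P(T=0|X)}$, where $X$ denotes the covariates.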
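
The subclassification and weighting variants can be sketched on simulated data. This is a minimal illustration under stated assumptions, not a reference implementation: the data-generating process, the true effect of 1.0, the known propensities 0.8/0.2, and all function names are hypothetical. In practice the propensity usually has to be estimated from the covariates.

```python
import random

def simulate(n=20000, seed=0):
    """Toy observational data: x confounds both treatment t and outcome y.

    The true treatment effect is fixed at 1.0 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p = 0.8 if x == 1 else 0.2           # propensity P(T=1|X=x), known here
        t = 1 if rng.random() < p else 0
        y = 1.0 * t + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((t, y, p))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_diff(rows):
    """Difference in means that ignores the confounder."""
    return (mean([y for t, y, _ in rows if t == 1])
            - mean([y for t, y, _ in rows if t == 0]))

def stratified_ate(rows):
    """Subclassification: compare within strata of equal PS, then average."""
    strata = {}
    for t, y, p in rows:
        strata.setdefault(p, {0: [], 1: []})[t].append(y)
    return sum(
        (mean(g[1]) - mean(g[0])) * (len(g[0]) + len(g[1])) / len(rows)
        for g in strata.values()
    )

def ipw_ate(rows):
    """Weighting: 1/P(T=1|X) for treated units, 1/P(T=0|X) for untreated ones."""
    w1 = [(1.0 / p, y) for t, y, p in rows if t == 1]
    w0 = [(1.0 / (1.0 - p), y) for t, y, p in rows if t == 0]
    treated = sum(w * y for w, y in w1) / sum(w for w, _ in w1)
    control = sum(w * y for w, y in w0) / sum(w for w, _ in w0)
    return treated - control

rows = simulate()
print("naive:", round(naive_diff(rows), 3))
print("stratified:", round(stratified_ate(rows), 3))
print("ipw:", round(ipw_ate(rows), 3))
```

In this setup the naive difference should land well above the true effect (around 2.2 in expectation), while the stratified and weighted estimates should land near 1.0. The weighted version normalizes by the sum of weights, which is one common choice among several.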

[3] Bias消解 (resolving bias): https://www.jianshu.com/p/35ad4e5b5b21 (last section, on confounders), and the later related articles listed under reference [2].
An earlier article on bias: https://www.jianshu.com/p/be1383b7b7bc

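[4]: Confounding bias: https://statisticsbyjim.com/regression/confounding-variables-bias/#:~:text=Omitted%20variable%20bias%20occurs%20when,which%20biases%20the%20coefficient%20estimates.
The OLS example is simple. If a confounder is left out of the model, then because it is correlated with the dependent variable, the residual of the new model is correlated with the confounder; and because the confounder is correlated with the independent variable, the residual ends up correlated with the independent variable. That violates the exogeneity assumption of OLS (the error term must be uncorrelated with the regressors), which is what biases the coefficient. See: https://www.jianshu.com/p/c553bbffc99f
Note: a confounding (omitted) variable must be related to both the dependent variable and the independent variable. If the omitted variables are uncorrelated with the independent variable, excluding them does not bias the estimator. (This is easy to see: excluding variables that are related only to the dependent variable at most increases the variance.)
Why excluding a confounder causes bias: the omitted variable forces the model to attribute the effects of the excluded variable to the one in the model. (In a fully randomized experiment, confounders are evenly distributed across the experiment groups.)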
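
The omitted-variable argument can be checked with a small simulation. This is a sketch under assumed settings: the model `y = 1.0*x + 2.0*c + noise`, the sample size, and the helper names are all hypothetical and chosen only to make the effect visible.

```python
import random

def centered(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def slope_simple(x, y):
    """Coefficient of x in the regression y ~ x (confounder omitted)."""
    xc, yc = centered(x), centered(y)
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)

def slope_with_control(x, c, y):
    """Coefficient of x in the regression y ~ x + c, via the normal equations."""
    xc, cc, yc = centered(x), centered(c), centered(y)
    sxx = sum(a * a for a in xc)
    scc = sum(a * a for a in cc)
    sxc = sum(a * b for a, b in zip(xc, cc))
    sxy = sum(a * b for a, b in zip(xc, yc))
    scy = sum(a * b for a, b in zip(cc, yc))
    return (sxy * scc - scy * sxc) / (sxx * scc - sxc * sxc)

def make_data(n, x_depends_on_c, seed=0):
    """y = 1.0*x + 2.0*c + noise; the true coefficient of x is 1.0 (assumed)."""
    rng = random.Random(seed)
    c = [rng.gauss(0, 1) for _ in range(n)]
    x = [(ci if x_depends_on_c else 0.0) + rng.gauss(0, 1) for ci in c]
    y = [1.0 * xi + 2.0 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]
    return x, c, y

x, c, y = make_data(20000, x_depends_on_c=True)
biased = slope_simple(x, y)             # c omitted and correlated with x
adjusted = slope_with_control(x, c, y)  # c included in the model
x2, c2, y2 = make_data(20000, x_depends_on_c=False)
unconfounded = slope_simple(x2, y2)     # c omitted but unrelated to x
print(round(biased, 2), round(adjusted, 2), round(unconfounded, 2))
```

With these settings the first slope should come out near 2.0 (the effect of `c` is attributed to `x`), while the other two should stay near the true value of 1.0, matching the note that omitting a variable unrelated to the independent variable does not bias the estimator.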

[5]:Weighting Regressions by Propensity Scores.

[6]: An intuitive take is at https://www.jianshu.com/p/dc0f33079391 and a more theoretical treatment of popularity bias, which can also be read from the propensity score (weighting) perspective, is in: Item Popularity and Recommendation Accuracy

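[7] In small-sample experiments the underlying problem is the same: some variables are correlated with the outcome (they affect GMV, PV) and also with the group assignment (they are not evenly distributed across the treatment variable). Another perspective: https://www.jianshu.com/p/6e391a9ea367 ; the confounder perspective: https://www.cnblogs.com/gogoSandy/p/11796536.html and https://cloud.tencent.com/developer/inventory/9162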

[8]:An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies
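Conditions for treatment assignment to be strongly ignorable (i.e. the conditions under which IPW applies):
Let Y be the response, with potential outcomes Y(1) and Y(0), Z the treatment, and X the covariates.
$(Y(1), Y(0)) \perp Z|X$
$0<P(Z=1|X)<1$
The first condition is called no unmeasured confounders: every variable that affects both the outcome and the treatment assignment has been taken into account. The second says that every unit has a nonzero probability of receiving either treatment.
For how IPW is used in practice, see Recommendations as Treatments: Debiasing Learning and Evaluation, which shows that IPS is unbiased for any probabilistic assignment mechanism.
In CVR estimation, the treatment assignment (click/rating) is self-selected by the user, and $P(Z=1|X)$ is the propensity score, i.e. the probability of observing the sample (the pCTR).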
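
Why inverse propensity weighting removes the bias can be sketched in a few lines. The notation below is assumed for illustration and is not taken from the cited papers verbatim: $N$ samples, $\delta_i$ the per-sample loss, $Z_i \in \{0,1\}$ the observation indicator, and $p_i = P(Z_i=1|X_i) > 0$ the propensity.

```latex
\hat{R}_{\mathrm{IPS}} = \frac{1}{N}\sum_{i=1}^{N} \frac{Z_i\,\delta_i}{p_i},
\qquad
\mathbb{E}_{Z}\big[\hat{R}_{\mathrm{IPS}}\big]
  = \frac{1}{N}\sum_{i=1}^{N} \frac{\mathbb{E}[Z_i]\,\delta_i}{p_i}
  = \frac{1}{N}\sum_{i=1}^{N} \frac{p_i\,\delta_i}{p_i}
  = \frac{1}{N}\sum_{i=1}^{N} \delta_i
```

The last expression is the loss averaged over all samples, observed or not, which is the quantity a naive average over observed samples fails to estimate. The cancellation relies on the true $p_i$; with an estimated propensity the result holds only approximately.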

[9] Biased samples are very common in this setting. If we are careless when constructing samples, bias will heavily distort the results.

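[10] Analysis of variance, regression analysis, and analysis of covariance in A/B testing; how to do inference and analysis when an A/B test is not randomized (or in an observational study), and the use of propensity scores there.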

[11] The last pitfall in https://www.jianshu.com/p/35ad4e5b5b21

[12] This is similar to the bias-variance trade-off: multicollinearity causes its own problems, including unstable coefficient estimates, lower statistical power, and less precise estimates.

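[13] In many cases bias is not a big problem. For example, non-random sampling of training data (say, conversion estimation in a news feed, where samples are so scarce that the negatives have to be downsampled separately) also introduces bias, but in some ranking tasks this bias is not the primary concern.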

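[14]: The weighting strategy in https://www.jianshu.com/p/6b26798f4903 can also be understood as inverse propensity weighting, since the two are mathematically equivalent. For the paper, see Logistic Regression in Rare Events Data. Political Methodology. 2002. For IPW, see Recommendations as Treatments: Debiasing Learning and Evaluation. There is a difference, though: here the sampling mechanism is known, whereas in many settings that call for IPW the propensity is unknown and has to be estimated by modeling the covariates.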
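
The equivalence between sampling weights and inverse propensity weights can be illustrated with negative downsampling, where the sampling probability is known by construction. This is a sketch with assumed numbers (a 2% base rate and a hypothetical `keep_rate` of 0.1); the function names are illustrative, and `recalibrate` is a commonly used correction formula rather than something taken from the cited paper.

```python
import random

def downsample_negatives(labels, keep_rate, rng):
    """Keep every positive; keep each negative with probability keep_rate."""
    return [y for y in labels if y == 1 or rng.random() < keep_rate]

def weighted_positive_rate(sample, keep_rate):
    """Weight each kept example by the inverse of its known sampling probability."""
    positives = sum(sample)                      # kept with probability 1
    negatives = len(sample) - positives          # kept with probability keep_rate
    return positives / (positives + negatives / keep_rate)

def recalibrate(p_sampled, keep_rate):
    """Map a probability fitted on downsampled data back to the original scale."""
    return p_sampled / (p_sampled + (1.0 - p_sampled) / keep_rate)

rng = random.Random(0)
labels = [1 if rng.random() < 0.02 else 0 for _ in range(200000)]
true_rate = sum(labels) / len(labels)

keep_rate = 0.1   # hypothetical negative sampling rate
sample = downsample_negatives(labels, keep_rate, rng)
sampled_rate = sum(sample) / len(sample)         # inflated by the downsampling

print("true:", round(true_rate, 4))
print("sampled:", round(sampled_rate, 4))
print("weighted:", round(weighted_positive_rate(sample, keep_rate), 4))
print("recalibrated:", round(recalibrate(sampled_rate, keep_rate), 4))
```

The weighted rate and the recalibrated rate are algebraically the same quantity, and both should sit close to the true rate, while the raw sampled rate is several times larger.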

[15]: Recovering from Selection Bias in Causal and Statistical Inference. Note: type (d) in its graph 1 cannot be recovered. The CVR bias is not of this type, because click is correlated with x (keep in mind that correlation and causality are different things).

[16]: Causal inference can be carried out by marginalizing conditional probabilities: Marginal integration for nonparametric causal inference. This requires certain conditions to hold, namely the backdoor criterion:
https://medium.com/data-for-science/causal-inference-part-xi-backdoor-criterion-e29627a1da0e
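
For reference, the marginalization takes the following form when a set of variables $S$ satisfies the backdoor criterion relative to $(X, Y)$. This is the standard back-door adjustment formula, written for discrete $S$; the sum becomes an integral for continuous $S$.

```latex
P\big(Y = y \mid do(X = x)\big)
  = \sum_{s} P\big(Y = y \mid X = x,\, S = s\big)\, P(S = s)
```

Note that the weights are the marginal $P(S=s)$, not $P(S=s \mid X=x)$; using the latter would simply give back the observational conditional $P(Y=y \mid X=x)$.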

[17]: Selection bias and Confounder, in CAUSAL DIAGRAMS [Sander Greenland]
[18]: Observational Studies and Confounding [Matthew Blackwell], section: Estimating causal effects under no unmeasured confounders.
Controlling Confounding Bias, 3.3.2 Back Door Adjustment.
Marginal integration for nonparametric causal inference.
[19] https://towardsdatascience.com/use-causal-graphs-4e3af630cf64 (data doesn't speak for itself)
