Data Processing
1. Missing-value imputation
Figure: 代码调包示例.png (example of code that calls a library for imputation)
P.S. Link: https://blog.csdn.net/shunqixing/article/details/80045189
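A minimal sketch of library-based imputation, assuming scikit-learn's SimpleImputer is the kind of call meant here (the notes do not say which library the screenshot used). The toy matrix X is made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy feature matrix (hypothetical data); missing entries are marked as NaN.
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan]])

# strategy="mean" fills each column with that column's mean.
# "median", "most_frequent" and "constant" are the other built-in strategies.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)
```

Mean filling suits roughly symmetric numeric columns; for skewed or categorical columns, "median" or "most_frequent" is usually the safer choice.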
2. sklearn preprocessing package: https://blog.csdn.net/weixin_40807247/article/details/82793220
3. Data normalization: https://blog.csdn.net/bbbeoy/article/details/70185798
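A small sketch of the two most common rescalings, written in plain NumPy so the formulas stay visible. The sample array is made up; sklearn.preprocessing offers MinMaxScaler and StandardScaler for the same operations on feature matrices.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical sample

# Min-max scaling: map the values linearly onto [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score standardization: zero mean, unit (population) standard deviation.
x_zscore = (x - x.mean()) / x.std()

print(x_minmax)
print(x_zscore)
```

Min-max scaling is sensitive to extreme values because they set the range, so outliers are usually handled before applying it.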
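4. Outlier detection: https://blog.csdn.net/panda_zjd/article/details/71810859
4.1. Outlier replacement: the statistical Bootstrap method: https://www.applysquare.com/topic-cn/QlJiMm6X5/ ; replace outliers in continuous variables with the mean, and outliers in discrete variables with the median or mode.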
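A rough sketch of the replacement rule above (mean for continuous variables, mode for discrete ones). The function name replace_outliers, the hand-written outlier flags, and the choice to compute the fill value from the non-flagged values only are all assumptions for illustration; the Bootstrap method itself is not shown.

```python
from statistics import mean, mode

def replace_outliers(values, is_outlier, kind):
    """Replace flagged values with the mean (continuous) or mode (discrete).

    The fill value is computed from the non-flagged values only, so the
    outliers do not distort it (an assumption, not stated in the notes).
    """
    kept = [v for v, flag in zip(values, is_outlier) if not flag]
    fill = mean(kept) if kind == "continuous" else mode(kept)
    return [fill if flag else v for v, flag in zip(values, is_outlier)]

# Hypothetical data: the last entry of each list is flagged as an outlier.
heights = [1.60, 1.70, 1.80, 17.0]
rooms = [2, 2, 3, 2, 40]

heights_fixed = replace_outliers(heights, [False, False, False, True], "continuous")
rooms_fixed = replace_outliers(rooms, [False, False, False, False, True], "discrete")
print(heights_fixed)
print(rooms_fixed)
```

To use the median for discrete variables instead, swap mode for statistics.median.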
Outlier detection and handling: https://wenku.baidu.com/view/4c9a5a13910ef12d2bf9e703.html
https://wenku.baidu.com/view/d75046d176eeaeaad1f330aa.html
Differencing: https://segmentfault.com/q/1010000005888855
OneClassSVM [outlier detection; handling extremely imbalanced data]: https://www.cnblogs.com/coshaho/p/9925862.html
https://www.cnblogs.com/damumu/p/7320334.html
https://blog.csdn.net/YE1215172385/article/details/79750703
https://blog.csdn.net/tandelin/article/details/88784501
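A minimal OneClassSVM sketch, assuming the model is trained on "normal" samples only and then used to flag new points. The synthetic data and the parameter values (nu=0.05, gamma="scale") are illustrative choices, not taken from the linked posts.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic "normal" training data: 200 points around the origin.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# nu is an upper bound on the fraction of training points treated as errors
# and a lower bound on the fraction of support vectors.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(X_train)

# predict() returns +1 for inliers and -1 for outliers.
X_new = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = model.predict(X_new)
print(labels)
```

For extremely imbalanced data the same idea applies: fit on the majority class alone and treat points predicted as -1 as candidates for the rare class.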
Box plot (Tukey's method) for detecting outliers: https://blog.csdn.net/zhuiqiuuuu/article/details/82721935
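A short sketch of Tukey's rule, which marks as outliers the points lying more than 1.5 × IQR below the first quartile or above the third quartile. The data are made up, and the quartiles use NumPy's default linear interpolation; other quartile conventions give slightly different fences.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], dtype=float)  # hypothetical sample

# Quartiles and interquartile range.
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Tukey's fences: anything outside [lower, upper] is flagged.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < lower) | (data > upper)]
print(lower, upper, outliers)
```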
sklearn anomaly detection: https://blog.csdn.net/hustqb/article/details/75216241
5. Data warehouse: star schema vs. snowflake schema: https://blog.csdn.net/ecjtuxuan/article/details/6273983
6. Hypothesis tests for data distributions: https://segmentfault.com/a/1190000007626742
Normality test: https://blog.csdn.net/cyan_soul/article/details/81236124
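One possible normality check is the Shapiro-Wilk test from SciPy; the notes do not say which test the linked post uses, so this choice is an assumption. The two synthetic samples and the 0.05 threshold are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
samples = {
    "normal": rng.normal(size=500),            # drawn from a normal distribution
    "exponential": rng.exponential(size=500),  # strongly right-skewed
}

# Null hypothesis: the sample comes from a normal distribution.
# A small p-value (e.g. below 0.05) is evidence against normality.
p_values = {}
for name, sample in samples.items():
    statistic, p_value = stats.shapiro(sample)
    p_values[name] = p_value
    print(name, round(statistic, 4), p_value)
```

A large p-value does not prove normality; it only means the test found no evidence against it at this sample size.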