Machine Learning Foundations Notes: 13 Hazard of Overfitting
2019-05-01
cherryleechen
![](https://img.haomeiwen.com/i8016875/44b7bf250ceed95b.png)
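Overfitting is caused by:
- excessive VC dimension (high model complexity), together with deterministic noise (the part of the target that the hypothesis set cannot capture);
- stochastic noise;
- a limited number of samples.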
![](https://img.haomeiwen.com/i8016875/486f60fb702e8c7f.png)
![](https://img.haomeiwen.com/i8016875/7a4616d8bf2344d8.png)
Concrete experiments show how model complexity/deterministic noise, stochastic noise, and sample size each affect overfitting:
![](https://img.haomeiwen.com/i8016875/dca32d8eee690f35.png)
![](https://img.haomeiwen.com/i8016875/60fe5039c2dde0be.png)
![](https://img.haomeiwen.com/i8016875/84f3671c74825fb3.png)
![](https://img.haomeiwen.com/i8016875/bcccb21bb629303e.png)
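An experiment of this kind can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the exact setup behind the figures: the target is a random Legendre series of order `q_f`, inputs are uniform on [-1, 1], the noise is Gaussian with standard deviation `sigma`, and E_out is only estimated on a fresh test set. The function names (`make_data`, `squared_error`, `overfit_measure`) are hypothetical. Overfitting is measured as E_out(g10) - E_out(g2), where g2 and g10 are the least-squares fits of order 2 and order 10.

```python
import numpy as np
from numpy.polynomial import legendre


def make_data(n, target_coefs, sigma, rng):
    """Draw n points x ~ U[-1, 1] and noisy labels y = f(x) + sigma * eps."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = legendre.legval(x, target_coefs) + sigma * rng.normal(size=n)
    return x, y


def squared_error(coefs, x, y):
    """Mean squared error of the Legendre series `coefs` on (x, y)."""
    return float(np.mean((legendre.legval(x, coefs) - y) ** 2))


def overfit_measure(n, q_f, sigma, trials=100, n_test=2000, seed=0):
    """Median over trials of the estimated E_out(g10) - E_out(g2).

    A positive value means the 10th-order fit generalizes worse
    than the 2nd-order fit.
    """
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(trials):
        # Assumption: random target of order q_f with N(0, 1) coefficients.
        target = rng.normal(size=q_f + 1)
        x, y = make_data(n, target, sigma, rng)
        x_test, y_test = make_data(n_test, target, sigma, rng)
        g2 = legendre.legfit(x, y, 2)     # least-squares fit of order 2
        g10 = legendre.legfit(x, y, 10)   # least-squares fit of order 10
        gaps.append(squared_error(g10, x_test, y_test)
                    - squared_error(g2, x_test, y_test))
    return float(np.median(gaps))


if __name__ == "__main__":
    # Vary the sample size while keeping target order and noise level fixed.
    for n in (15, 50, 200):
        print(n, overfit_measure(n, q_f=10, sigma=1.0))
```

Varying `n`, `sigma`, and `q_f` one at a time mirrors the three factors discussed above. Under these assumptions the measure would typically be positive for small `n` and turn negative once `n` is large enough for the 10th-order fit to pay off.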
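Common ways to avoid overfitting:
- start from a simple model: lower the model complexity;
- data cleaning/data pruning: reduce noise by correcting or removing suspect examples;
- data hinting: enlarge the sample with additional examples derived from hints about the target;
- regularization;
- validation.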
![](https://img.haomeiwen.com/i8016875/c174ae5e91e42580.png)
![](https://img.haomeiwen.com/i8016875/a690fadc7e0c5f38.png)
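Regularization and validation, the last two remedies in the list, can be combined in a small sketch. The notes only name the two techniques, so everything here is an assumption chosen for illustration: the regularizer is an L2 (ridge) penalty on the weights of a 10th-order Legendre feature transform, the validation scheme is a single held-out split, and the noisy sine target is hypothetical. `ridge_fit` and `select_lambda` are hypothetical helper names.

```python
import numpy as np
from numpy.polynomial import legendre


def ridge_fit(z, y, lam):
    """Minimize ||z w - y||^2 + lam * ||w||^2 in closed form."""
    d = z.shape[1]
    return np.linalg.solve(z.T @ z + lam * np.eye(d), z.T @ y)


def select_lambda(x, y, degree, lambdas, n_val, rng):
    """Pick lam by error on a held-out validation split, then refit on all data."""
    idx = rng.permutation(len(x))
    val, train = idx[:n_val], idx[n_val:]
    z_train = legendre.legvander(x[train], degree)
    z_val = legendre.legvander(x[val], degree)
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        w = ridge_fit(z_train, y[train], lam)
        e_val = float(np.mean((z_val @ w - y[val]) ** 2))
        if e_val < best_err:
            best_lam, best_err = lam, e_val
    # Refit with the chosen lam on the full sample.
    w_final = ridge_fit(legendre.legvander(x, degree), y, best_lam)
    return best_lam, w_final


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=30)
    # Hypothetical noisy target, used only to exercise the code.
    y = np.sin(np.pi * x) + 0.3 * rng.normal(size=30)
    lam, w = select_lambda(x, y, degree=10,
                           lambdas=(1e-4, 1e-2, 1e-1, 1.0, 10.0),
                           n_val=10, rng=rng)
    print("chosen lambda:", lam)
```

The penalty shrinks the weights of the complex model toward zero, which lowers its effective complexity, while the validation split supplies an estimate of out-of-sample error that the training points alone cannot give. For simplicity this sketch also penalizes the constant term.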