Statistical Plotting | Normalization vs. Standardization

2022-08-19  shwzhao

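For details, see:
https://en.wikipedia.org/wiki/Feature_scaling
https://en.wikipedia.org/wiki/Normalization_(statistics)
CSDN | Why do feature normalization/standardization?
CSDN | Standardization and normalization: do not conflate the two (a thorough look at data transformations)
WeChat official account | Standardization and normalization in data processing: what exactly are they?
WeChat official account | Data standardization: z-score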

First, look at the distributions of the three variables (mpg, disp, hp) before any processing.

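library(tidyverse)  # provides dplyr, tidyr, tibble, and ggplot2

mtcars %>%
  select(mpg, disp, hp) %>%
  rownames_to_column("car") %>%
  # long format: one row per (car, variable) pair
  pivot_longer(-car, names_to = "terms", values_to = "values") %>%
  ggplot() +
  geom_violin(aes(terms, values)) +
  theme_bw()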
[Figure: violin plots of the raw mpg, disp, and hp values]

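Now look at the distributions after processing. The three variables are on a common, unit-free scale, and the shapes of their distributions look broadly similar. Normalization (min-max scaling) maps the values into a fixed interval, [0, 1]; standardization (z-score) centers them at 0 with a standard deviation of 1 but has no fixed bounds.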

mtcars %>%
  select(mpg, disp, hp) %>%
  rownames_to_column("car") %>%
  pivot_longer(-car, names_to = "terms", values_to = "values") %>%
  # transform each variable separately
  group_by(terms) %>%
  mutate(normal_values = (values-min(values))/(max(values)-min(values)),   # min-max normalization
         standard_values = (values-mean(values))/sd(values)) %>%           # z-score standardization
  # stack the raw, normalized, and standardized columns, then drop the raw values
  pivot_longer(-c(car, terms), names_to = "termss", values_to = "valuess") %>%
  filter(termss != "values") %>%
  ggplot() +
  geom_violin(aes(termss, valuess)) +
  # alternative: geom_boxplot(aes(termss, valuess)) +
  facet_wrap(~terms) +
  theme_bw()
[Figure: violin plots of the normalized and standardized values, one panel each for disp, hp, and mpg]
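The claim that normalization has a fixed interval while standardization does not can be checked numerically. The following is a minimal base-R sketch; the helper names `min_max` and `z_score` are hypothetical and do not appear in the original post.

```r
# Hypothetical helpers (names are illustrative, not from the original post)
min_max <- function(x) (x - min(x)) / (max(x) - min(x))  # min-max normalization
z_score <- function(x) (x - mean(x)) / sd(x)             # z-score standardization

vars <- mtcars[, c("mpg", "disp", "hp")]

normalized   <- as.data.frame(lapply(vars, min_max))
standardized <- as.data.frame(lapply(vars, z_score))

# Normalization: every column spans exactly [0, 1]
print(sapply(normalized, range))

# Standardization: mean 0 and sd 1 in every column,
# but the minimum and maximum differ from column to column
print(sapply(standardized, range))
print(round(sapply(standardized, mean), 10))
print(sapply(standardized, sd))
```

The first `range` table should show 0 and 1 for all three variables, while the second shows different bounds for each variable, which is why the standardized violins in the figure do not share a common top and bottom.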
# Base R's scale() standardizes each column by default (center = TRUE, scale = TRUE)
mtcars %>%
  select(mpg, disp, hp) %>%
  scale() %>%
  head()
#>                          mpg        disp         hp
#> Mazda RX4          0.1508848 -0.57061982 -0.5350928
#> Mazda RX4 Wag      0.1508848 -0.57061982 -0.5350928
#> Datsun 710         0.4495434 -0.99018209 -0.7830405
#> Hornet 4 Drive     0.2172534  0.22009369 -0.5350928
#> Hornet Sportabout -0.2307345  1.04308123  0.4129422
#> Valiant           -0.3302874 -0.04616698 -0.6080186
# The manual z-score computation gives the same values as scale()
mtcars %>%
  select(mpg, disp, hp) %>%
  rownames_to_column("car") %>%
  pivot_longer(-car, names_to = "terms", values_to = "values") %>%
  group_by(terms) %>%
  mutate(normal_values = (values-min(values))/(max(values)-min(values)),
         standard_values = (values-mean(values))/sd(values)) %>%
  # back to wide format, keeping only the standardized values
  pivot_wider(id_cols = car, names_from = terms, values_from = standard_values) %>%
  head()
#> # A tibble: 6 x 4
#>   car                  mpg    disp     hp
#>   <chr>              <dbl>   <dbl>  <dbl>
#> 1 Mazda RX4          0.151 -0.571  -0.535
#> 2 Mazda RX4 Wag      0.151 -0.571  -0.535
#> 3 Datsun 710         0.450 -0.990  -0.783
#> 4 Hornet 4 Drive     0.217  0.220  -0.535
#> 5 Hornet Sportabout -0.231  1.04    0.413
#> 6 Valiant           -0.330 -0.0462 -0.608
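The two printed tables agree to the digits shown. As a rough check over all rows rather than the first six, the sketch below compares `scale()` against the manual formula in base R; the variable names `scaled` and `manual` are illustrative only.

```r
vars <- mtcars[, c("mpg", "disp", "hp")]

# scale() returns a matrix and records the centers and scales it used as attributes
scaled <- scale(vars)

# Manual z-score, column by column
manual <- sapply(vars, function(x) (x - mean(x)) / sd(x))

# Compare the numeric values only (the two matrices carry different attributes)
print(all.equal(as.vector(scaled), as.vector(manual)))

# The stored attributes are the column means and standard deviations
print(all.equal(attr(scaled, "scaled:center"), colMeans(vars)))
print(all.equal(attr(scaled, "scaled:scale"), sapply(vars, sd)))
```

Each comparison should print `TRUE`, since `scale()` with its default arguments subtracts the column mean and divides by the column standard deviation.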