Xiaojie's Detailed Notes on "R for Data Science": Chapter 21, Graphics for Communication
1. Preparation
library(ggplot2)
library(tidyverse)
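## The other two extension packages are called explicitly with ::, so they are not loaded here
2. Labels
Labels are set with labs(). The following can be modified:
(1) Titles
The title (title), the subtitle (subtitle), and the caption at the bottom right (caption).
# The code in the book is wrong: the paste() call is missing a comma and a closing parenthesis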
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(color = class)) +
geom_smooth(se = FALSE) +
labs(title = "Fuel efficiency generally decreases with engine size")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Adding a subtitle and a caption:
# The book's code again has several errors
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(color = class)) +
geom_smooth(se = FALSE) +
labs(title = "Fuel efficiency generally decreases with engine size",
subtitle =
"Two seaters (sports cars) are an exception because of their light weight",
caption = "Data from fueleconomy.gov"
)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
(2) Axis and legend titles
By default, the axis and legend titles are the variable names.
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(color = class)) +
geom_smooth(se = FALSE) +
labs(
x = "Engine displacement (L)",
y = "Highway fuel economy (mpg)",
colour = "Car type"
)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Replacing axis titles with mathematical expressions:
df <- tibble(
x = runif(10),
y = runif(10)
)
ggplot(df, aes(x, y)) +
geom_point() +
labs(
x = quote(sum(x[i] ^ 2, i == 1, n)),
y = quote(alpha + beta + frac(delta, theta))
)
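3. Annotations
Text annotations are added with geom_text() together with the label aesthetic.
I'll skip the problematic examples and go straight to the better one.
Highlights: the points of interest are circled, and the label positions are adjusted automatically to avoid overlap.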
best_in_class <- mpg %>%
group_by(class) %>%
filter(row_number(desc(hwy)) == 1)
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(color = class)) +
geom_point(size = 3, shape = 1, data = best_in_class) +
ggrepel::geom_label_repel(
aes(label = model),
data = best_in_class
)
Dropping the legend and placing the class labels directly on the plot instead:
class_avg <- mpg %>%
group_by(class) %>%
summarize(
displ = median(displ),
hwy = median(hwy)
)
ggplot(mpg, aes(displ, hwy, colour = class)) +
ggrepel::geom_text_repel(aes(label = class),
data = class_avg,
size = 6,
label.size = 0,
segment.color = "blue"
) +
geom_point() +
theme(legend.position = "none")
## Warning: Ignoring unknown parameters: label.size
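My result differs from both the Chinese and the English editions of the book, which made me question everything. After briefly looking into the ggrepel package, I concluded that the Chinese edition's code probably changed geom_label_repel to geom_text_repel; label.size is a geom_label_repel parameter, which is why the warning above appears. segment.color sets the color of the leader lines, and there is no real need to set it here: so few points are labeled that no leader lines appear. (Later I felt the whole detour was pointless.)
To understand leader lines, run the code below; changing the colors shows which element each setting controls.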
ggplot(mtcars,
aes(wt, mpg, label = rownames(mtcars), colour = factor(cyl))) +
geom_point()+
ggrepel::geom_label_repel(aes(fill=factor(cyl)), colour="white", segment.colour="blue")
(label <- mpg %>%
summarize(
displ = max(displ),
hwy = max(hwy), # together with max(displ), the top-right corner of the data range
label =
"Increasing engine size is \nrelated to decreasing fuel economy."
))
## # A tibble: 1 x 3
## displ hwy label
## <dbl> <dbl> <chr>
## 1 7 44 "Increasing engine size is \nrelated to decreasing fuel eco…
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
geom_text(
aes(label = label), # map the label column to the label aesthetic
data = label, # the label data frame
vjust = "top",
hjust = "right" # anchor the label by its top-right corner at the point
)
Making the label sit flush against the top-right corner of the plot:
# Another snippet with a paste error in the book
label <- tibble(
displ = Inf,
hwy = Inf,
label =
"Increasing engine size is \nrelated to decreasing fuel economy."
)
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
geom_text(
aes(label = label),
data = label,
vjust = "top",
hjust = "right"
)
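4. Scales
Scales control the tick marks and titles on the axes, and the labels of the individual legend keys.
Legend layout: the overall position is set with theme(legend.position); the options are left, right, top, bottom, and none.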
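As a minimal sketch of the legend placement described above, the following builds one base plot and applies two of the listed positions. The object names (base, p_bottom, p_none) are made up for illustration.

```r
library(ggplot2)

# Base plot reused for each legend position
base <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class))

# Legend placed below the panel
p_bottom <- base + theme(legend.position = "bottom")

# Legend suppressed entirely
p_none <- base + theme(legend.position = "none")
```

Printing p_bottom or p_none draws the corresponding plot; "left", "right", and "top" work the same way.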
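Color mappings can be adjusted with scale_color_brewer(palette = ""). http://colorbrewer2.org lists many color schemes for categorical scales; some are tuned so that red and green contrast strongly enough to be told apart even by people with red-green color blindness. scale_color_viridis() (from the viridis package) provides the continuous analog, known as a gradient.
A shape mapping, in turn, is useful in black-and-white plots.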
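The color and shape points above can be sketched as follows. The choice of the "Set1" palette and of the drv and cty columns is an assumption made for illustration, and the second plot uses ggplot2's built-in scale_color_viridis_c() in place of viridis::scale_color_viridis() so that no extra package is needed.

```r
library(ggplot2)

# Discrete colors from the ColorBrewer "Set1" palette (assumed choice);
# drv is also mapped to shape so the groups stay distinguishable in
# black and white
p_brewer <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = drv, shape = drv)) +
  scale_color_brewer(palette = "Set1")

# Continuous gradient: scale_color_viridis_c() is ggplot2's built-in
# viridis scale, swapped in here for viridis::scale_color_viridis()
p_viridis <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = cty)) +
  scale_color_viridis_c()
```

Mapping the same variable to both color and shape is redundant on screen, but it keeps the groups readable when the figure is printed in grayscale.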
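5. Zooming
coord_cartesian() takes xlim and ylim arguments that set the zoom range. This is a local zoom: the smoothed curve is still fitted to all the data, unlike when you take a subset.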
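To make the difference between zooming and subsetting concrete, here is a small sketch. The zoom limits (displ from 5 to 7, hwy from 10 to 30) and the object names are assumptions chosen for illustration.

```r
library(ggplot2)
library(dplyr)

# Zoom: every row still feeds geom_smooth(); only the visible window changes
p_zoom <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth() +
  coord_cartesian(xlim = c(5, 7), ylim = c(10, 30))

# Subset: rows outside the range are dropped before the curve is fitted
mpg_sub <- mpg %>%
  filter(displ >= 5, displ <= 7, hwy >= 10, hwy <= 30)

p_subset <- ggplot(mpg_sub, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth()
```

Comparing the two plots side by side shows different smoothed curves, because p_subset fits the curve to fewer observations.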
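To split one plot into a series of (zoomed) plots that share the same scales:
define the scales first and assign them to variables, then add them to each plot.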
suv <- mpg %>% filter(class == "suv")
compact <- mpg %>% filter(class == "compact")
x_scale <- scale_x_continuous(limits = range(mpg$displ))
y_scale <- scale_y_continuous(limits = range(mpg$hwy))
col_scale <- scale_color_discrete(limits = unique(mpg$drv))
ggplot(suv, aes(displ, hwy, color = drv)) +
geom_point() +
x_scale +
y_scale +
col_scale
ggplot(compact, aes(displ, hwy, color = drv)) +
geom_point() +
x_scale +
y_scale +
col_scale
6. Themes (theme)
The default theme has a gray background with white grid lines; it can be changed with the theme functions.
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(color = class)) +
geom_smooth(se = FALSE) +
theme_bw()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
7. Saving plots: ggsave()
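A minimal sketch of saving a plot with ggsave() might look like this. The file name, the use of tempdir(), and the 6 x 4 inch size are assumptions for illustration; adjust them to your own project.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# Hypothetical output path inside the session's temporary directory
out_file <- file.path(tempdir(), "my-plot.pdf")

# The device is chosen from the file extension; width and height are
# in inches by default
ggsave(out_file, plot = p, width = 6, height = 4)
```

If the plot argument is omitted, ggsave() saves the most recently displayed plot.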
I have written a detailed series on ggplot, which should make all of this easy to follow.