ggplot2 Plotting Summary

2019-07-16 Liam_ml

Aesthetics

xmin

xmax

ymin

ymax

xend

yend

weight

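color: outline color

fill: fill color

shape: point shape

linetype: line type, e.g. dotted, dashed

size: point size, line width (thickness)

alpha: transparency, from 0 (fully transparent) to 1 (fully opaque)

width: width (bar charts, etc.)

binwidth: bin width (histograms, etc.)

label: label text (e.g. names for x, y, the legend)

angle: angle

hjust: horizontal justification

vjust: vertical justification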

lower

middle

upper

map_id

group: grouping

position: position adjustment

Scatter plots

ggplot(data=mpg,aes(x=displ,y=hwy))+

geom_point(color='grey') # outline color is grey (in effect the interior of each point is grey too)

ggplot(data=mpg,aes(x=displ,y=hwy))+

geom_point(fill='blue') # the default point shape has no fill, so this setting has no effect; color defaults to black
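
The fill setting only takes effect for point shapes that have a separate interior. A minimal sketch, assuming shape 21 (one of the fillable shapes 21 to 25); the variable name p_filled is only illustrative:

```r
library(ggplot2)

# Shape 21 is a circle with a separate outline (color) and interior (fill),
# so fill = "blue" is visible here, unlike with the default point shape.
p_filled <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point(shape = 21, color = "black", fill = "blue")
p_filled
```

With this shape, color controls the ring around each point and fill controls its interior.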

ggplot(data=mpg,aes(x=displ,y=hwy))+

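geom_point(aes(color=cyl),alpha=I(0.6)) # set transparency; alpha ranges from 0 (fully transparent) to 1 (fully opaque). I() marks a value as set rather than mapped, the opposite of aes(); outside aes() the value is already set, so I() is optional here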

ggplot(data=mpg,aes(x=displ,y=hwy))+

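geom_point(aes(color=factor(cyl)),alpha=0.6) # cyl is an integer, so it is treated as a continuous variable by default and gets a continuous color legend; factor() turns it into a discrete variable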

ggplot(data=mpg,aes(x=displ,y=hwy))+

geom_point(aes(color=factor(cyl),shape=factor(cyl)),alpha=0.6)

This maps cyl to the shape aesthetic as well.

Line plots

ggplot(data=mpg,aes(x=displ,y=hwy))+

geom_line(color='grey',size=2)

Bar charts

ggplot(data=mpg,aes(x=factor(displ),y=hwy))+

geom_bar(stat='identity',width=0.8,color='green',fill='grey')

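stat is the statistical transformation; stat='identity' means no transformation is applied, so hwy is used directly as the y value. Because displ contains many repeated values, several rows fall into the same bar and their hwy values are stacked on top of one another.

width is the bar width, color is the outline color (the outlines show the individual values stacking up), and fill is the fill color.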
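
To make the stacking concrete, here is a minimal sketch that computes the stacked totals directly and, as one possible alternative, plots one bar per displ value using the mean. The names hwy_total, hwy_mean, and p_mean are illustrative, and the choice of the mean is an assumption rather than something the original plot does:

```r
library(ggplot2)

# Sum of hwy within each distinct displ value: these totals are the bar
# heights that stacking with stat = 'identity' produces.
hwy_total <- aggregate(hwy ~ displ, data = mpg, FUN = sum)

# Alternative: summarise first, then draw exactly one bar per displ value.
hwy_mean <- aggregate(hwy ~ displ, data = mpg, FUN = mean)
p_mean <- ggplot(data = hwy_mean, aes(x = factor(displ), y = hwy)) +
  geom_col(width = 0.8, color = "green", fill = "grey")
p_mean
```

geom_col() is the bar geom that uses the y values as given, so it behaves like geom_bar(stat='identity').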

ggplot(data=mpg,aes(x=displ))+

geom_bar(stat='density')

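Here the statistical transformation is density, so y is the density distribution of displ. Note that no variable needs to be mapped to y.

When the statistical transformation is bin (binning), it generates the variables count (the number of observations in each bin), density (the density of observations in each bin), and x (the bin center); count and x are used by default. To refer to one of these computed variables, wrap its name in double dots, for example ..density..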
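
The variables generated by the bin transformation can be inspected directly. A minimal sketch, assuming an illustrative bin width of 0.5 and using layer_data() to pull out the computed data for the first layer:

```r
library(ggplot2)

bin_width <- 0.5  # illustrative value, not taken from the text above
p_hist <- ggplot(data = mpg, aes(x = displ)) +
  geom_histogram(binwidth = bin_width)

# layer_data() returns the data frame computed for one layer of the plot.
bins <- layer_data(p_hist, 1)

# x is the bin center, count the number of rows in the bin,
# density the count rescaled so that the bars integrate to 1.
head(bins[, c("x", "count", "density")])
```

The counts add up to the number of rows in mpg, while the densities multiplied by the bin width add up to 1.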

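Histograms:

geom_histogram only works when x is a continuous variable; if x is discrete the function raises an error. In that case a bar chart can be used in place of a histogram.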

ggplot(data=mpg,aes(x=displ,fill=fl))+

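geom_histogram(binwidth=0.2,position='stack')

binwidth is the width of the bins; it is not the same as width in a bar chart.

position is the position adjustment; stack stacks the geometric objects that fall in the same bin.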

ggplot(data=mpg,aes(x=displ,fill=fl))+

geom_histogram(binwidth=0.4,position='dodge')

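position is the position adjustment; dodge places the geometric objects that fall in the same bin side by side.

Other position adjustments are: fill, jitter, identity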
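
As a minimal sketch of one of these, position='fill' stacks the objects and rescales every stack to a total height of 1, which shows proportions rather than counts. The class and drv columns of mpg are used here only for illustration:

```r
library(ggplot2)

# Each bar is rescaled to height 1, so the segments show the share of
# each drv value within a vehicle class.
p_fill <- ggplot(data = mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "fill")
p_fill
```

Replacing "fill" with "stack" or "dodge" in the same call gives the stacked and side-by-side versions described above.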

Box plots:

ggplot(data=mpg,aes(x=factor(fl),y=hwy))+

geom_boxplot(color='grey')

ggplot(data=mpg,aes(x=1,y=hwy))+

geom_boxplot(fill='grey',color='blue') # hwy is not split into groups

ggplot(data=mpg,aes(x=1,y=hwy))+

geom_boxplot(fill='grey',color='blue',outlier.colour="red",outlier.shape=1) # highlight the outliers and give them their own color and shape
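
The lower, middle, and upper entries in the aesthetics list at the top are the values a box plot draws. A minimal sketch that reads them back from the ungrouped plot with layer_data(); the names p_box and box_stats are illustrative:

```r
library(ggplot2)

p_box <- ggplot(data = mpg, aes(x = 1, y = hwy)) +
  geom_boxplot()

# One row per box: whisker ends (ymin, ymax), hinges (lower, upper)
# and the median (middle).
box_stats <- layer_data(p_box, 1)
box_stats[, c("ymin", "lower", "middle", "upper", "ymax")]
```

The middle value is the median of hwy, and lower and upper are its first and third quartiles.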

Density curves

Use geom_density

ggplot(data=mpg,aes(x=displ,fill=fl))+geom_density(color='white',size=0.1,alpha=I(0.3))

ggplot(data=mpg,aes(x=displ,y=..density..))+geom_histogram(fill='grey',binwidth=0.18,alpha=I(0.3))+geom_density(color='white',size=0.8)

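In the plot above, y=..density.. puts the histogram on the same scale as the density curve. Without it, as in the next example, the density curve is almost invisible when the two are combined. The reason is that the bin transformation behind the histogram generates both count and density as y variables and uses count by default; the counts are large while the density values are small by comparison, so the density curve sits at the very bottom of the plot and is hard to see. That is why y=..density.. is needed (variables generated by the bin transformation must be wrapped in double dots when they are referenced).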

ggplot(data=mpg,aes(x=displ))+geom_histogram(fill='grey',binwidth=0.18,alpha=I(0.3))+geom_density(color='white',size=0.8)
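
In ggplot2 3.3.0 and later, after_stat(density) is an alternative way to write ..density... A minimal sketch of the same overlay under that assumption about the installed version; the blue line color and the name p_overlay are illustrative choices:

```r
library(ggplot2)

bin_width <- 0.18
p_overlay <- ggplot(data = mpg, aes(x = displ)) +
  # Map y to the computed density only for the histogram layer.
  geom_histogram(aes(y = after_stat(density)), fill = "grey",
                 binwidth = bin_width, alpha = 0.3) +
  # geom_density already uses density as its y value.
  geom_density(color = "blue")
p_overlay
```

Because both layers now use the density scale, the curve and the bars have comparable heights.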

Pie charts

ggplot(data=mpg,aes(x=1,fill=fl))+geom_bar()+

coord_polar(theta='y')

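coord_polar stands for polar coordinates, as opposed to the Cartesian coordinates used so far. coord_polar() transforms Cartesian coordinates into polar coordinates. The function takes the parameters theta, start, and direction; the latter two are at most fine-tuning of the plot and can be looked up in the help file, while theta is the key one. Polar coordinates have two components, radius and angle. In a pie chart the parts differ in angle and share the same radius. The default is theta='x', which maps x to the angle and the remaining y to the radius, so a pie chart needs theta='y' instead. That is the background; the process is broken down step by step below.

First step: create a bar chart in which the relative proportions of the parts are reflected by y.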

ggplot(data=mpg,aes(x=1,fill=fl))+geom_bar()
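
To see what y reflects in this bar, here is a minimal sketch that computes the segment heights and the share of the full circle each one would take once y is mapped to the angle. The names fl_counts, fl_share, and fl_angle are illustrative:

```r
library(ggplot2)

fl_counts <- table(mpg$fl)          # heights of the stacked segments (the y values)
fl_share  <- prop.table(fl_counts)  # each segment as a proportion of the whole bar
fl_angle  <- fl_share * 360         # degrees each slice would span in the pie
round(fl_angle, 1)
```

The segment heights add up to the number of rows in mpg, and the angles add up to a full 360 degrees.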