Learning R

R

2020-04-24  TX_ab85

0228

1.Some reference material
R Cookbook
R in Action
ggplot2
Advanced R
2.Installing and Loading Packages
installing:install.packages('ggplot2')
loading:library(ggplot2)
updating:update.packages()
3.R language basics
create a vector: v=c(1,4,4,3,2,2,3) or w=c("apple","banana","orange")
return certain elements: v[c(2,3,4)] or v[2:4] or v[c(2,4,3)]
delete certain elements: v[-2] drops the second element; v[-2:-4] drops elements 2 through 4 (assign the result back, e.g. v=v[-2], to keep the change)
extract elements: v[v<3] extracts all elements smaller than 3
find elements: which(v==3) NOTE: the return values are the indices of the matching elements; "=" is assignment, "==" tests equality.
which.max(v) index of the maximum value
which.min(v) index of the minimum value
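The indexing rules above can be tried on the same toy vector. This is only a small sketch; the variable names on the left-hand side are made up for illustration.

```r
# Toy vector taken from the notes above.
v <- c(1, 4, 4, 3, 2, 2, 3)

second_to_fourth <- v[2:4]         # elements at positions 2, 3, 4
without_second   <- v[-2]          # copy of v with position 2 dropped
small_values     <- v[v < 3]       # logical filter: values below 3
positions_of_3   <- which(v == 3)  # indices, not values
first_max_pos    <- which.max(v)   # index of the first maximum

print(second_to_fourth)
print(positions_of_3)
```

Note that v itself is never modified here; negative indexing returns a new vector.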

0229

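4.Numbers (random numbers)
Repeatable Random Numbers: set.seed(250) [12:00] makes the computer reproduce the same random numbers on every run
Random Number: a=runif(3,min=0,max=100)
Rounding of Numbers: floor(a) rounds down, ceiling(a) rounds up, round(a,4) rounds to the given number of decimal places (here 4)
Random Numbers from Other Distributions: rnorm() normal, rexp() exponential, rbinom() binomial, rgeom() geometric, rnbinom() negative binomial
?round opens the help page of a function
??round searches the help system for pages related to the keyword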
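A minimal sketch of what "repeatable" means: re-seeding before each draw gives identical numbers. The seed 250 comes from the notes; the other variable names are illustrative.

```r
# Re-seeding before each draw makes the "random" numbers repeat exactly.
set.seed(250)
a <- runif(3, min = 0, max = 100)
set.seed(250)
b <- runif(3, min = 0, max = 100)

same_draws <- identical(a, b)   # same seed, same sequence
rounded    <- round(a, 4)       # keep 4 decimal places
lower      <- floor(a)          # round down
upper      <- ceiling(a)        # round up
print(same_draws)
```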
5.Data Input
loading local data: ?read.csv; read.csv(file="……") or read.table(file="……")
loading online data: read.csv("http://……")
attach:attach()
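6.Graphs
plot: plot()
histograms: hist()
density plot: plot(density())
scatter plot: plot()
box plot: boxplot(time~)
Q-Q plot: qqnorm(), qqline() and qqplot() (quantile-quantile plots)
par() sets the graphics parameters
hist(x,breaks=20,col="blue") draws a histogram with about 20 bars, filled in blue
plot(density(x)) draws the density curve
plot(x,type="…") draws a scatter plot; type selects how the points are drawn
boxplot(x,y) draws box plots; boxplot(time~***) splits the boxes by sex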

0330

a=c(1,2,3,4,5,6)
b=c("one","two","three")
c=c(TRUE,TRUE,FALSE)
x=matrix(1:20,nrow=5,ncol=4,byrow=TRUE)
y=matrix(1:20,nrow=5,ncol=4,byrow=FALSE)
x[2,]
x[,2]
x[2,c(2,4)]
x[3:5,2]
rnames=c("apple","banana","orange","melon","corn")
cnames=c("cat","dog","bird","pig")
rownames(x)=rnames
colnames(x)=cnames
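As a small sketch of the matrix notes above: byrow changes how the values are laid out, and once row and column names are set, cells can be addressed by name as well as by position.

```r
# byrow = TRUE fills row by row; byrow = FALSE (the default) fills column by column.
x <- matrix(1:20, nrow = 5, ncol = 4, byrow = TRUE)
y <- matrix(1:20, nrow = 5, ncol = 4, byrow = FALSE)

rownames(x) <- c("apple", "banana", "orange", "melon", "corn")
colnames(x) <- c("cat", "dog", "bird", "pig")

second_row <- x[2, ]              # a named vector holding row 2
by_name    <- x["banana", "dog"]  # same cell as x[2, 2]
print(x)
```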

0302

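Switching Jianshu to the markdown editor

Level-1 heading

Level-2 heading

Level-3 heading

*italic
I am from Hunan
**bold
The yellow crane, once gone, never returns

Code quote

Hello, everyone

Image reference

Quoting a passage

A lone fortress amid mountains ten thousand feet high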

0303

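1.Installing PuTTY

Download PuTTY from the official website, enter the IP address in PuTTY, click Open, then enter the user name and password

2.Working on a Linux system through PuTTY

Mind map: TIM图片20200303211602.png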

0304

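Downloading miniconda

1.Search for the conda official website and find the download link
2.Copy the link with ctrl+c
3.Open PuTTY, cd biosoft
4.wget + right-click to paste the link
5.bash + the file just downloaded
6.Press enter to skip through the prompts and answer yes to install
7.Finally activate: source ~/.bashrc
8.Add the Tsinghua mirror: conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda/
conda config --set show_channel_urls yes

Running miniconda

1.conda list shows the installed packages
2.conda search fastqc searches for the fastqc software
3.conda install fastqc -y installs it
4.conda remove fastqc -y uninstalls it

Notes:

I got an error while adding the Tsinghua mirror. In that case, try rm ~/.condarc to delete the previous configuration; after that the channels can be added again.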

0305

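R basics

1.Installing R and RStudio

1.Search Baidu for R and RStudio, then download and install them from their websites
Make sure your computer's user name contains only English characters

2.Drawing some plots with R

-Dot plot of 50 normally distributed random numbers: plot(rnorm(50))

Box plot: boxplot()

3.Basic operations in R

1.dir(path) lists the files under a path (use setwd() to change the working directory)
2.<- assignment
3.Delete a variable: rm(x)
4.List the command history: history()
5.Clear the console: ctrl+l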

0306

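R data structures

Assignment

1.A single value: v<-8
2.Several non-consecutive values: v<-c(3,5,7)
3.Consecutive values (1 to 10): v<-1:10
4.Sequence with a step: v<-seq(1,10,by=0.5)
5.Repetition: v<-rep(1:3,times=3)

Indexing

1.v[2]
2.v[1:3] elements 1 through 3
3.v[c(1,3,4)] elements 1, 3 and 4
4.v[-4] everything except the fourth element
5.v[v==10] elements of v equal to 10
6.v[v<4] elements of v smaller than 4
7.v[v %in% c(1,3,4,5,5)] elements of v that occur in the given vector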
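A short sketch combining the assignment and indexing forms above; the names in_set and below_4 are illustrative only.

```r
v <- seq(1, 10, by = 0.5)        # 1.0, 1.5, ..., 10.0
r <- rep(1:3, times = 3)         # the block 1 2 3 repeated three times

in_set  <- v[v %in% c(1, 3, 4, 5, 5)]  # duplicates in the lookup set are harmless
below_4 <- v[v < 4]
print(in_set)
```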

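Reading local data and working with it

1.Read data: read.csv(); the file should be in the working directory
2.Export data: write.table()
3.Set row and column names
Column name: colnames(v)[1]<-"xxxxx" renames the first column to xxxxx
Row name: rownames(v)[1]<-"xxxxx"
4.Save the workspace: save.image(file="xxxx.Rdata")
Save a single variable: save(x,file="xxxx.Rdata")
5.Extract elements from a data frame:
x[i,j] row i, column j
x[i,] row i
x[,j] column j
6.Add a data frame to the search path: attach(x)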

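Question

If save(X,file="test.RData") fails with "X not found", why is that and how can it be fixed?

Answer: there is no variable named X in the environment. Check whether the variable name was mistyped, for example an upper-case X where a lower-case x was meant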
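The answer above can be checked with a small sketch. The temporary file path is an assumption made for the example; any writable path would do.

```r
x <- c(3, 5, 7)
rdata_path <- file.path(tempdir(), "test.RData")  # hypothetical location

# Names are case-sensitive: "x" exists here, "X" does not.
has_lower <- exists("x", inherits = FALSE)
has_upper <- exists("X", inherits = FALSE)

save(x, file = rdata_path)        # works because x exists
rm(x)
loaded_names <- load(rdata_path)  # load() returns the names it restored
print(loaded_names)
```

Calling exists() before save() is one way to confirm the spelling of a variable name before writing it to disk.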

0307


0309

array

dim1=c("A1","A2")
dim2=c("B1","B2","B3")
dim3=c("C1","C2","C3","C4")
dim4=c("D1","D2","D3")
z=array(1:72, c(2,3,4,3),dimnames=list(dim1,dim2,dim3,dim4))
z[1,2,3,]
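A sketch of what the slice above returns: with three indices fixed and the fourth left empty, the result is a named vector along the remaining dimension. R fills arrays in column-major order, which determines the values.

```r
dim1 <- c("A1", "A2")
dim2 <- c("B1", "B2", "B3")
dim3 <- c("C1", "C2", "C3", "C4")
dim4 <- c("D1", "D2", "D3")
z <- array(1:72, c(2, 3, 4, 3), dimnames = list(dim1, dim2, dim3, dim4))

# Fixing three indices and leaving the fourth empty returns a named vector.
slice   <- z[1, 2, 3, ]
by_name <- z["A1", "B2", "C3", "D1"]   # the same cell as z[1, 2, 3, 1]
print(slice)
```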

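data frame

attach(mtcars)#add mtcars to the R search path
par(mfrow=c(1,4))#number of plots per page (1 row, 4 columns)
plot(rnorm(50),pch=17)#pch: point shape
plot(rnorm(20),type = "l",lty=5)#lty: line type
plot(rnorm(100),cex=3)#cex: point size
plot(rnorm(200),lwd=2)#lwd: line width
?pch
title(main = "normal list")
#axis(side, ...) adds an axis; side is required
#legend(x, y, legend, ...) adds a legend; position and legend text are required
layout(matrix(c(1,1,2,3),2,2,byrow=TRUE))#layout matrix: plot 1 spans the top row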

0311

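Graphics settings

par(mfrow=c(1,4))#number of plots per page (1 row, 4 columns), i.e. partition the device
plot(rnorm(50),pch=17)#pch: point shape
plot(rnorm(20),type = "l",lty=5)#lty: line type
plot(rnorm(100),cex=3)#cex: point size
plot(rnorm(200),lwd=2)#lwd: line width
?pch
title(main = "normal list")
#axis(side, ...) adds an axis; side is required
#legend(x, y, legend, ...) adds a legend; position and legend text are required
attach(mtcars)#add mtcars to the R search path
layout(matrix(c(1,1,2,3),2,2,byrow=TRUE))#layout matrix: plot 1 spans the top row
hist(wt)#histogram
hist(mpg)
hist(disp)
#hist(mtcars) fails: hist() needs a numeric vector, not a whole data frame
?pch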

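for and while loops

for (i in 1:10) {#for loop: iterate over each element of 1:10
print(i)
}#no i=i+1 is needed: for advances i by itself
i=1
while(i <= 10){#while loop: repeat while the condition holds
print(i)
i=i+1
}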
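To make the difference concrete, here is a sketch that computes the same sum with both loops; the accumulator names are made up.

```r
# Sum 1..10 twice: once with for, once with while.
total_for <- 0
for (i in 1:10) {        # i takes each value of 1:10 in turn
  total_for <- total_for + i
}

total_while <- 0
i <- 1
while (i <= 10) {        # the counter must be advanced by hand
  total_while <- total_while + i
  i <- i + 1
}
print(c(total_for, total_while))
```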

0312

if and switch conditions

i=1
if(i==1){
print("hello world")
}else{
print("goodbye world")
}
i=3
if(i==1){#if statement
print("hello")
}else if (i==3) {#chain further conditions with else if
print("goodbye")
}else{
print("good game")
}
feelings=c("sad","afraid")
for(i in feelings){
print(
switch(i,#switch picks the branch whose name matches i, like a chain of else if
happy="i am glad",
afraid="something to fear",
sad="cheer up",
angry="calm down"
)
)
}
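One detail the loop above does not show: switch() accepts an unnamed last argument as a default. The sketch below wraps the same branches in a hypothetical helper, describe_feeling, and feeds it one value ("bored") that has no branch.

```r
describe_feeling <- function(feeling) {   # hypothetical helper name
  switch(feeling,
         happy  = "i am glad",
         afraid = "something to fear",
         sad    = "cheer up",
         angry  = "calm down",
         "no advice")                     # unnamed last argument = default
}
answers <- vapply(c("sad", "afraid", "bored"), describe_feeling, character(1))
print(answers)
```

Without the default, switch() returns NULL invisibly for an unmatched string.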

0313

User-defined functions in R

myfunction=function(x){#a user-defined function
for(i in x){#loop over the argument x, not over a global variable
print(
switch(i,
happy="i am glad",
afraid="something to fear",
sad="cheer up",
angry="calm down"
)
)
}
}
myfunction=function(x,a,b,c){
return(a*sin(x)^2-b*x+c)
}
curve(myfunction(x,20,3,4),xlim=c(1,20))#plot the function just defined
myfeeling=function(x){
for(i in x){
print(
switch(i,
happy="i am glad",
afraid="something to fear",
sad="cheer up",
angry="calm down"
)
)
}
}
feelings=c("sad","afraid")
myfeeling(feelings)
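A sketch of the numeric function on its own, assuming the formula in the notes reads a*sin(x)^2 - b*x + c. Because sin() and the arithmetic operators are vectorised, the function also works on a whole vector of x values, which is what curve() relies on.

```r
# Assumed reading of the formula in the notes: a*sin(x)^2 - b*x + c.
myfunction <- function(x, a, b, c) {
  a * sin(x)^2 - b * x + c
}

at_zero   <- myfunction(0, 20, 3, 4)             # sin(0) = 0, so only c remains
on_vector <- myfunction(c(0, pi / 2), 20, 3, 4)  # vectorised over x
print(on_vector)
```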

0314

bar plot

library(vcd)
counts <- table(Arthritis$Improved)

counts

barplot(counts,
main="Simple Bar Plot",
xlab="Improvement", ylab="Frequency")
barplot(counts,
main="Horizontal Bar Plot",
xlab="Frequency", ylab="Improvement",
horiz=TRUE)

counts <- table(Arthritis$Improved, Arthritis$Treatment)
barplot(counts,
main="Grouped Bar Plot",
xlab="Treatment", ylab="Frequency",
col=c("red", "yellow","green"),
legend=rownames(counts),
beside = TRUE)

pie plot

install.packages("plotrix")
library(plotrix)
slices <- c(10,12,4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")
pie(slices, labels = lbls,main="Simple Pie Chart",edges=300,radius=1)



0315

fan plot

slices <- c(10,12,4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")
fan.plot(slices,labels=lbls,main = "fan plot")



dot chart

dotchart(mtcars$mpg,
labels=row.names(mtcars),cex=0.7,
main="Gas Mileage for Car Models",
xlab="Miles Per Gallon")

Basic operations on data

head(mtcars)#show the first six rows
summary(mtcars)
attach(mtcars)
table(cyl)#frequency of each value in the column
table(cut(mpg,seq(10,34,by=2)))#frequencies within the given intervals
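The same counts can be computed without attach(), which avoids name clashes in the search path. A minimal sketch using the built-in mtcars data; the result names are illustrative.

```r
# mtcars ships with base R (package datasets).
cyl_counts <- table(mtcars$cyl)                     # frequency of each cylinder count
mpg_bins   <- cut(mtcars$mpg, seq(10, 34, by = 2))  # intervals (10,12], (12,14], ...
bin_counts <- table(mpg_bins)
print(cyl_counts)
print(bin_counts)
```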

0317

x = rnorm(100, mean = 10, sd = 1)
y = rnorm(100, mean = 30, sd = 10)
t.test(x, y, alt = "two.sided",paired=TRUE)#two-sided paired t-test
set.seed(123)
A = matrix(sample(100,15), nrow=5, ncol=3)

t(A)#transpose: swap rows and columns

A+2
A-2
A*2
A/2
set.seed(234)
B = matrix(sample(100,15), nrow=5, ncol=3)
t(A)
t(A) %*% B#matrix product of t(A) (3x5) and B (5x3)
colMeans(A)#column means
colSums(A)#column sums
crossprod(A,B)#same as t(A) %*% B
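A short sketch confirming the relationship noted above between crossprod() and the explicit product; it reuses the seeds from the notes.

```r
set.seed(123)
A <- matrix(sample(100, 15), nrow = 5, ncol = 3)
set.seed(234)
B <- matrix(sample(100, 15), nrow = 5, ncol = 3)

elementwise <- A * 2          # scalar operations act on every cell
product     <- t(A) %*% B     # (3 x 5) times (5 x 3) gives a 3 x 3 matrix
same_result <- isTRUE(all.equal(product, crossprod(A, B)))
print(dim(product))
```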

0318

FACTOR

factor=factor(rep(c(1:3),times=5))#a factor labels each observation with a group
x=sample(100,15)
tapply(x,factor,mean)#mean of x within each group of factor
rbind(x,factor)
boo=rbind(x,factor)[2,]==2
which(boo)
rbind(x,factor)[1,which(boo)]
mean(rbind(x,factor)[1,which(boo)])
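The rbind/which steps above compute by hand what tapply() already returns. A sketch of the comparison; the grouping variable is named group here (instead of factor) to avoid masking the factor() function, and the seed is arbitrary.

```r
set.seed(1)                               # seed chosen only for this sketch
group <- factor(rep(1:3, times = 5))      # labels 1,2,3,1,2,3,...
x <- sample(100, 15)

group_means <- tapply(x, group, mean)     # one mean per level
manual_mean <- mean(x[group == "2"])      # the same value computed by hand for level 2
print(group_means)
```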

bilibili:AV5625356

Bar charts: 0319

Single-sample bar chart

library(ggplot2)
file1="Anr.lib.stat.txt"
dat2=read.table(file=file1,check.names=F,header=T,sep="\t",comment.char = "")

Sort the data

dat2=dat2[order(dat2[,2],decreasing=T),]
head(dat2)

Plot

bar1=ggplot(dat2,aes(x=Species_Name,y=Homologous_Number))+
geom_bar(stat = "identity",position = "dodge",width = 0.8)
bar1
ggsave(bar1,filename = "hello.png",width = 12,height = 9)
dat2
?read.table

Change the bar order

dat2[,1]=factor(dat2[,1],levels = dat2[,1],ordered=T)

Put "Other" last

ending=c("Other")
level=as.character(dat2[!dat2[,1]==ending,1])
level=unique(c(level,ending))
dat2[,1]=factor(dat2[,1],levels=level,ordered=T)
bar1=ggplot(dat2,aes(x=Species_Name,y=Homologous_Number))+
geom_bar(stat = "identity",position = "dodge",width = 0.8)
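The reordering trick can be seen without the input file. The sketch below uses a made-up table (the species names and counts are invented) and shows that the factor levels end up sorted by count with "Other" moved to the end, which is the order ggplot2 then uses on the axis.

```r
# Toy data standing in for the species table; names and counts are made up.
dat <- data.frame(Species_Name = c("Other", "B", "A", "C"),
                  Homologous_Number = c(50, 30, 80, 10),
                  stringsAsFactors = FALSE)
dat <- dat[order(dat$Homologous_Number, decreasing = TRUE), ]

ending <- "Other"
level <- dat$Species_Name[dat$Species_Name != ending]  # sorted names without "Other"
level <- unique(c(level, ending))                      # append "Other" at the end
dat$Species_Name <- factor(dat$Species_Name, levels = level, ordered = TRUE)
print(levels(dat$Species_Name))
```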

0320

Setting versus mapping

p_bar=ggplot(dat2,aes(x=dat2[,1],y=dat2[,2],fill=dat2[,1]))+
geom_bar(stat="identity",position ="dodge",width = 0.8)+
scale_fill_brewer(palette="Paired",direction=-1)
p_bar

Show the palettes available in RColorBrewer

RColorBrewer::display.brewer.all()

Text labels

p_bar=p_bar+
geom_text(aes(label=paste(as.character(dat2[,3]*100),"%",sep="")),vjust = 1,size=3)
p_bar

Adjusting the label text

p_bar=p_bar+labs(x="species",y="hello",title = "nothing")
p_bar
p_bar=p_bar+ggtitle(label="nothing",subtitle = "something")
p_bar

Using theme to change the title, legend and background

p_bar=p_bar+theme(
plot.title = element_text(size = 25,face = "bold", vjust = 0.5, hjust = 0.5),##title position
axis.text.x=element_text(size = 10,face = "bold", vjust = 1, hjust = 1,angle = 45),##x-axis text position
panel.background = element_rect(fill = "transparent",color = "black"),##panel background
plot.background = element_rect(fill = "lightblue",colour = "red"),##plot background
axis.ticks.x=element_blank(),
panel.grid.minor = element_blank(), ##panel grid lines
panel.grid.major = element_blank())
p_bar

321

Multi-sample bar chart

file2="nr.lib.stat.txt"

Read the data

dat2=read.table(file2,sep="\t",check.names = F,header = T,comment.char = "")
head(dat2)
dat2=dat2[order(dat2[,3],decreasing = T),]
head(dat2)

Label settings

ending=c("other")
xlab="Species_Name"
ylab="Unigenes_num"
title="Nr"
subtitle="Homologous_Number"

Fix the order

level=as.character(dat2[!dat2[,1]==ending,1])
level=unique(c(level,ending))

dat2[,1]=factor(dat2[,1],levels=level,ordered=T)

dat2[,2]=factor(dat2[,2],ordered=T)

Basic plot

library(ggplot2)
p_bar2=ggplot(dat2,aes(x=dat2[,1],y=dat2[,3],fill=dat2[,2]))+
geom_bar(stat="identity",width=0.7,position ="dodge",color="darkgrey")
p_bar2

p_bar2=ggplot(dat2,aes(x=dat2[,1],y=dat2[,3],fill=dat2[,2]))+
geom_bar(stat="identity",width=0.7,position =position_dodge(width=0.9),color="darkgrey")
p_bar2

Change labels and colours #set the legend generated by aes(fill)

p_bar2=p_bar2+labs(x=xlab,y=ylab)+
ggtitle(label=title,subtitle = subtitle)
p_bar2

fill="Cultivar"
p_bar2=p_bar2+labs(x=xlab,y=ylab,fill=fill)+
ggtitle(label=title,subtitle = subtitle)
p_bar2

322

Control the colour order, based on the ordered factor created above

p_bar2=p_bar2+scale_fill_brewer(palette="Set3",direction=1)
p_bar2
p_bar2=p_bar2+scale_fill_manual(values=c("red","turquoise"))
p_bar2

Text labels

p_bar2=p_bar2+
geom_text(aes(label=paste(as.character(dat2[,4]*100),"%",sep="")),position=position_dodge(width=0.9),vjust = -0.5,size=2)
p_bar2

Fine-tuning

p_bar2=p_bar2+
theme(
plot.title = element_text(size = 25,face = "bold", vjust = 0.5, hjust = 0.5),
legend.title = element_text(size = 15,face = "bold", vjust = 0.5, hjust = 0.5),
legend.text = element_text(size = 10, face = "bold"),
legend.position = 'right',
legend.key.size=unit(0.5,'cm'),
axis.text.x=element_text(size = 10,face = "bold", vjust = 1, hjust = 1,angle = 45),
axis.text.y=element_text(size = 10,face = "bold", vjust = 0.5, hjust = 0.5),

axis.title.x = element_text(size = 15,face = "bold", vjust = 0.5, hjust = 0.5),
axis.title.y = element_text(size = 15,face = "bold", vjust = 0.5, hjust = 0.5),

panel.background = element_rect(fill = "transparent",colour = "black"), 
panel.grid.minor = element_blank(), 
panel.grid.major = element_blank(),
plot.background = element_rect(fill = "transparent",colour = "black"))

p_bar2

Stacked bar chart

p_barS=ggplot(dat2,aes(x=dat2[,1],y=dat2[,3],fill=dat2[,2]))+
geom_bar(stat="identity",width=0.8,position ="stack")
p_barS

Reverse the stacking order

p_barS=ggplot(dat2,aes(x=dat2[,1],y=dat2[,3],fill=dat2[,2]))+
geom_bar(stat="identity",width=0.8,position =position_stack(reverse = T))
p_barS

Text labels, legend order, etc.

p_barS=ggplot(dat2,aes(x=dat2[,1],y=dat2[,3],fill=dat2[,2]))+
geom_bar(stat="identity",width=0.8,position =position_stack(reverse = T))+
labs(x=xlab,y=ylab,fill=fill)+
ggtitle(label=title,subtitle = subtitle)+
scale_fill_manual(values=c("red","turquoise"))+
guides(fill=guide_legend(reverse=F))
p_barS
p_barS=p_barS+
geom_text(aes(label=paste(as.character(dat2[,4]*100),"%",sep="")),position=position_stack(vjust = 0.5,reverse = T),size=3)
p_barS

323

Basic pie chart

Read the data

file1="Anr.lib.stat.txt"
dat1=read.table(file=file1,check.names=F,header=T,sep="\t",comment.char = "")
head(dat1,3)
dat1=dat1[order(dat1[,2],decreasing = T),]
head(dat1)

Ordering

ending="Other"
fill="Species"
title="Nr"
subtitle="Homologous_Number"
level=as.character(dat1[!dat1[,1]==ending,1])
level=unique(c(level,ending))
dat1[,1]=factor(dat1[,1],levels=level,order=T)

Stacked bar

p_pie=ggplot(dat1,aes(x="",y=dat1[,2],fill=dat1[,1]))+
geom_bar(stat="identity",width=1,position = position_stack(reverse = T))
p_pie

Map y to polar coordinates and set the direction

p_pie=p_pie+
coord_polar(theta="y",direction=-1)
p_pie

Fill colours

p_pie=p_pie+
scale_fill_brewer(palette ="Set3",direction = 1)
p_pie

Change the legend title

p_pie=p_pie+
labs(x="",y="",fill=fill)+
ggtitle(label =title,subtitle=subtitle)
p_pie

Text labels

p_pie=p_pie+geom_text(aes(label=paste(as.character(dat1[,3]*100),"%",sep="")),position =position_stack(vjust=0.5,reverse = T),size=3)
p_pie

Adjust the ticks and background

p_pie=p_pie+
theme(
plot.title = element_text(hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
legend.title = element_text(hjust = 0.5),
axis.ticks = element_blank(),
axis.text.x = element_blank(),
panel.background = element_rect(fill = "transparent",colour = NA),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
plot.background = element_rect(fill = "transparent",colour = NA)
)
p_pie

Data distributions: density plot, box plot, histogram

325

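Read the data

library(ggplot2)
library(reshape2)#provides melt()
file="gene_fpkm.xls"
demo_fpkm=read.table(file,header = T,sep = "\t",check.names = F)#header=T so that the sample names become column names

Reshape the data to long format

fpkm=melt(demo_fpkm,variable.name = "Sample",value.name = "FPKM")
head(fpkm,10)

Density plot

p_density=ggplot(fpkm,aes(x=log10(fpkm[,3]),color=fpkm[,2],fill=fpkm[,2]))+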
geom_density(alpha=0.25,size=0.5)
p_density

Change the scale limits

p_density=p_density+xlim(-3,5)
p_density

Title and theme

p_density=p_density+
ggtitle("Gene expression density")+
labs(x="log10FPKM",color="Samples",fill="Samples")+
theme_bw()
p_density

Box plot

p_box=ggplot(fpkm,aes(x=fpkm[,2],y=log10(fpkm[,3]),fill=fpkm[,2]))+
geom_boxplot(size=0.5,width=0.8,notch=T,outlier.shape = NA)
p_box

Limit the y-axis range

p_box=p_box+ylim(-3,5)
p_box

Title and theme

p_box=p_box+
ggtitle("Gene expression distribution")+
labs(y="log10FPKM",x="",fill="Samples")+
theme_bw()
p_box

Histogram

326

Single-sample histogram

p_histogram=ggplot(fpkm,aes(x=log10(fpkm[,3]),fill=fpkm[,2]))+
geom_histogram(binwidth = 1,alpha=0.5,size=1,stat="bin")+
xlim(-3,5)
p_histogram

Multi-sample histogram

Default stack mode

p_histogram=ggplot(fpkm,aes(x=log10(fpkm[,3]),fill=fpkm[,2]))+
geom_histogram(binwidth = 1,alpha=0.5,size=1,stat="bin")+
xlim(-3,5)
p_histogram

Facets

p_histogram=p_histogram+
facet_grid(.~fpkm[,2],scales = "free")
p_histogram

Density (relative-frequency) histogram

p_histogram=ggplot(fpkm,aes(x=log10(fpkm[,3]),y=..density..))+
geom_histogram(aes(fill=fpkm[,2]),binwidth = 1,alpha=0.5,size=1,stat="bin")+
xlim(-3,5)
p_histogram=p_histogram+
facet_grid(.~fpkm[,2],scales = "free")
p_histogram

Frequency polygon

p_freqpoly=ggplot(fpkm,aes(x=log10(fpkm[,3]),color=fpkm[,2]))+
geom_freqpoly(binwidth = 1,alpha=0.5,size=1,stat="bin")+
xlim(-3,5)
p_freqpoly

Histogram with the frequency polygon overlaid

p_h_f=p_histogram+
geom_freqpoly(aes(color=fpkm[,2]),binwidth = 1,alpha=0.5,size=1,stat="bin")
p_h_f

327

Read the data

file="CK-WT_vs_T-WT.xls"
demo_DEG=read.table(file,check.names = F,header = T,sep = "\t")
head(demo_DEG)

Set the thresholds

line_FC=2
line_FDR=0.01
col=c("red","blue","grey")
AFPKM=c(2:4)
BFPKM=c(5:7)

Label up- and down-regulated genes using the thresholds

demo_DEG[demo_DEG[,"FDR"] <line_FDR & demo_DEG[,"log2FC"] >= log2(line_FC),ncol(demo_DEG)+1]="Up"
demo_DEG[demo_DEG[,"FDR"] <line_FDR & -log2(line_FC) >= demo_DEG[,"log2FC"],ncol(demo_DEG)]="Down"
demo_DEG[demo_DEG[,"FDR"] >=line_FDR | log2(line_FC) > abs(demo_DEG[,"log2FC"]),ncol(demo_DEG)]="Normal"
colnames(demo_DEG)[ncol(demo_DEG)]="Regulate"
head(demo_DEG)
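The three assignments above can be written more compactly by starting from a default class. The sketch below runs the same thresholds on an invented five-gene table (IDs and values are made up) so the resulting labels can be checked by eye.

```r
# Toy table; IDs and values are invented for illustration only.
deg <- data.frame(ID = paste0("g", 1:5),
                  FDR = c(0.001, 0.001, 0.5, 0.001, 0.009),
                  log2FC = c(2.5, -3.0, 4.0, 0.2, -1.0))
line_FC <- 2
line_FDR <- 0.01

deg$Regulate <- "Normal"                                   # default class
significant <- deg$FDR < line_FDR
deg$Regulate[significant & deg$log2FC >=  log2(line_FC)] <- "Up"
deg$Regulate[significant & deg$log2FC <= -log2(line_FC)] <- "Down"
deg$Regulate <- factor(deg$Regulate, levels = c("Up", "Down", "Normal"))
print(table(deg$Regulate))
```

Gene g3 has a large fold change but stays "Normal" because its FDR misses the threshold.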

Volcano plot

volcano=demo_DEG

Create an ordered factor so that the colours can be controlled

volcano$Regulate=factor(volcano$Regulate,levels = c("Up","Down","Normal"),ordered=T)

First volcano plot

p_volcano=ggplot(volcano,aes(x=log2FC,y=-log10(FDR)))+
geom_point(aes(color=Regulate),alpha=0.5)
p_volcano

Change the colours

p_volcano=p_volcano + scale_color_manual(values =col)
p_volcano

Add threshold lines to the basic plot

p_volcano=p_volcano +
geom_hline(yintercept=c(-log10(line_FDR)),linetype=4)+
geom_vline(xintercept=c(-log2(line_FC),log2(line_FC)),linetype=4)
p_volcano

Theme

p_volcano=p_volcano+theme_bw()
p_volcano

Save

ggsave(p_volcano,filename = "six.png")



MA plot

328

Inspect the data

head(demo_DEG,10)

Build the x-axis values

MA=demo_DEG[,c("ID","log2FC","Regulate")]
head(MA,2)
MA[,4]=1/2*log2(rowMeans(demo_DEG[,AFPKM])*rowMeans(demo_DEG[,BFPKM]))
colnames(MA)[4]="1/2log2FPKM"
head(MA,3)

Set up the ordered factor

MA$Regulate=factor(MA$Regulate,levels = c("Up","Down","Normal"),ordered=T)

Draw the MA plot

p_MA=ggplot(MA,aes(x=`1/2log2FPKM`,y=log2FC))+
geom_point(aes(color=Regulate),alpha=0.5)
p_MA
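For reference, a sketch of the two MA-plot coordinates on invented numbers. The x value is half the log2 of the product of the two group means, as in the notes. The y value here is computed as log2(B/A), which is an assumption made for the example; the notes take log2FC directly from the input table.

```r
# Hypothetical mean FPKM values of two groups for three genes.
mean_A <- c(4, 16, 1)
mean_B <- c(16, 16, 4)

A_value <- 1/2 * log2(mean_A * mean_B)   # x-axis: average log2 expression
M_value <- log2(mean_B / mean_A)         # y-axis: log2 fold change (B over A assumed)
print(cbind(A_value, M_value))
```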

Set the x-axis range

p_MA=p_MA+scale_x_continuous(limits = c(-5,15))
p_MA

Colours, threshold lines and theme

p_MA=p_MA+
scale_color_manual(values =c("red","darkgreen", "darkgrey"))+
geom_hline(yintercept=c(-log2(line_FC),log2(line_FC)),linetype=4)+
theme_bw()
p_MA

Save

ggsave(p_MA,filename = "six(2).png")



329

Bar chart (visualising enrichment analysis results)

library(ggplot2)

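Read the data

enrich_file="GO.enrich.txt"
demo_go=read.table(enrich_file,check.names = F,sep = "\t",header = T,comment.char = "")
head(demo_go)

Keep terms with Pvalue below the threshold, sort by P value and take the 20 smallest

enrich=0.01#P-value threshold
go=demo_go[demo_go$Pvalue<enrich,]
go=go[order(go$Pvalue,decreasing = F),]
head(go)
if(nrow(go)>=20){
go=go[1:20,]
}
head(go)

Sort by the second-level category (Term_type)

go=go[order(go$Term_type,decreasing = F),]
head(go)

Put the most significant terms first

go$Description=factor(go$Description,levels = rev(go$Description),ordered = T)

Replace P values of 0 with a small positive value so that -log10 stays finite

go[go$Pvalue==0,"Pvalue"]=1e-15

Plot, then flip the x and y axes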

go_bar=ggplot(go,aes(x=Description,y=-log10(Pvalue),fill=Term_type))+
geom_bar(stat="identity",width = 0.8)
go_bar
go_bar=go_bar+
coord_flip()
go_bar

Text labels and theme

go_bar=go_bar+
geom_text(aes(label=as.character(DEGs)),position = "stack",vjust=0,hjust=0,size=3)+
theme_bw()
go_bar

Save

ggsave(go_bar,filename = "seven(1).png")



330

Bubble chart

Set the thresholds

enrich=0.01
minPvalue=1e-15

Read the data

demo_go=read.table("GO.enrich.txt",header = T,check.names = F,comment.char = "",sep = "\t")
head(demo_go)

Keep terms with Pvalue below the threshold, sort by P value and take the 20 smallest

go=demo_go[demo_go$Pvalue<enrich,]
go=go[order(go$Pvalue,decreasing = F),]
head(go)
if(nrow(go)>=20){
go=go[1:20,]
}
head(go)
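The filter, sort and truncate steps can be tried on a made-up table. In the sketch below the term names, the seed and the P values are all invented; only the threshold 0.01 and the floor 1e-15 come from the notes.

```r
# Toy enrichment table; terms and P values are invented.
set.seed(42)                                  # arbitrary seed for the sketch
demo <- data.frame(Description = paste0("term", 1:40),
                   Pvalue = c(0, runif(39, 0, 0.02)))
enrich <- 0.01

go <- demo[demo$Pvalue < enrich, ]            # 1) filter rows by the threshold
go <- go[order(go$Pvalue), ]                  # 2) sort the filtered rows
if (nrow(go) >= 20) go <- go[1:20, ]          # 3) keep at most 20
go$Pvalue[go$Pvalue == 0] <- 1e-15            # avoid -log10(0) = Inf
print(head(go, 3))
```

Filtering first and sorting second keeps the row indices consistent, which is why the two steps are not nested into a single subscript.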

Replace P values of 0 with a small positive value so that -log10 stays finite

go[go$Pvalue==0,"Pvalue"]=minPvalue

Draw the bubble chart

go_point=ggplot(go,aes(x=Description,y=Rich_factor))+
geom_point(aes(color=-log10(Pvalue),size=DEGs),alpha=0.8)+coord_flip()
go_point

Change the colours (gradient)

go_point=go_point+
scale_color_gradient(low = "green",high = "red")
go_point

Save

ggsave(go_point,filename = "seven(2).png")



331

Drawing heatmaps with pheatmap

install.packages("pheatmap")
library(pheatmap)

Read the data

file="All.DEG_final_3000.xls"
mat=read.table(file,check.names = F,header = T,sep = "\t",comment.char = "")
head(mat,3)
dim(mat)

A simple pheatmap plot

pheatmap(mat)
options(stringsAsFactors = TRUE)

Standardise each gene (by row)

pheatmap(mat,scale = "row")

Hide the row names

pheatmap(mat,scale="row",show_rownames=F)

Change the colours

library(RColorBrewer)
RColorBrewer::display.brewer.all()
col=c("blue","white","red")
color=colorRampPalette(col)(100)
pheatmap(mat,scale="row",show_rownames=F,color=color)

Cell size

cellheight=300/nrow(mat)
cellwidth=300/ncol(mat)
cellwidth=10
cellheight=10
pheatmap(mat,scale="row",show_rownames=F,color=color,cellwidth = cellwidth,cellheight = cellheight,border_color="black")
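A sketch of how the colour vector passed to pheatmap is built. colorRampPalette() is part of base R (package grDevices); the name make_palette is made up for the example.

```r
# colorRampPalette() returns a function; calling it with n gives n interpolated colours.
make_palette <- colorRampPalette(c("blue", "white", "red"))
color <- make_palette(100)

first_colour <- color[1]     # blue end of the ramp
last_colour  <- color[100]   # red end of the ramp
print(color[c(1, 50, 100)])
```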



424

Adding annotations to the heatmap

col_file="annotation_col1.xls"
annotation_col=read.table(col_file,header = T,row.names=1,sep="\t",check.names = F,comment.char = "")
annotation_col
pheatmap(mat,scale="row",show_rownames=F,annotation_col = annotation_col)
#pheatmap(mat,scale="row",show_rownames=F,annotation_col = annotation_col,annotation_colors = ann_colors)#ann_colors is not defined at this point; the colour list is built below as annotation_color

Colours for discrete categories

brewer.pal.info
qual=rownames(brewer.pal.info[brewer.pal.info[,"category"]=="qual",])
qualColor=c()
for(i in qual){
qualColor=c(qualColor,brewer.pal(brewer.pal.info[i,"maxcolors"], i))
}
length(qualColor)
length(unique(qualColor))
qualColor=unique(qualColor)
qualColor

Sequential palettes for numeric annotations

seqColor=list(Blues=c("#F7FBFF","#08306B"),Reds=c("#FFF5F0","#67000D"),
Greys=c("#FFFFFF","#000000"))
seqColor

Set the annotation colours

annotation_color=list()

Assign colours by column type

char=1
num=1
for(i in colnames(annotation_col)){

if(is.numeric(annotation_col[,i])){
annotation_color[[i]]=seqColor[[num]]
num=num+1

}else{
n=length(table(annotation_col[,i]))
annotation_color[[i]]=qualColor[char:(char+n-1)]
names(annotation_color[[i]])=names(table(annotation_col[,i]))
char=char+n
}

}

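Inspect

annotation_color

Plot

workdir=getwd()#assumption: workdir is not defined in these notes, so the current working directory is used here
pdf(file=paste(workdir,"/gene_heatmap.pdf",sep=""),width = 9,height = 9)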
heatmap=pheatmap(mat,color=color,cellwidth = cellwidth,cellheight = cellheight,scale="row",
annotation_col = annotation_col,annotation_colors = annotation_color,
show_rownames=F,fontsize_col=8,fontsize=7)
dev.off()



Reorder the rows

newOrder=mat[heatmap$tree_row$order,]

Add cluster labels

cluster=10
row_cluster=cutree(heatmap$tree_row,k=cluster)
newOrder[,ncol(newOrder)+1]=row_cluster[match(rownames(newOrder),names(row_cluster))]
colnames(newOrder)[ncol(newOrder)]="Cluster"
head(newOrder,2)
write.table(newOrder,file = paste(workdir,"/gene_newOrder_withCluster.xls",sep = ""),sep="\t",row.names = T,col.names = T,quote = F)
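The reorder-and-label steps can be sketched with base R alone. Here hclust() stands in for the row tree that pheatmap returns in tree_row (also an hclust object when rows are clustered), and the small matrix is invented.

```r
# Toy matrix: two obvious groups of rows; values are invented.
mat <- rbind(g1 = c(1, 2, 1), g2 = c(1, 2, 2), g3 = c(9, 8, 9), g4 = c(9, 9, 8))
tree <- hclust(dist(mat))            # stands in for heatmap$tree_row
row_cluster <- cutree(tree, k = 2)   # named vector: row name -> cluster id

new_order <- as.data.frame(mat[tree$order, ])   # rows in dendrogram order
new_order$Cluster <- row_cluster[match(rownames(new_order), names(row_cluster))]
print(new_order)
```

Matching on row names, as the notes do, keeps each cluster id attached to the right gene after the rows are reordered.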
