A Beginner's RNA-seq Learning Journey

RNA-seq Learning: No.8 IsoformSwitchAnaly

2020-03-19  小贝学生信

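In the previous steps we filtered down to the isoforms of interest and added several kinds of annotation. Now we can run the final analysis, for example finding which gene's isoform usage changes the most and how it changes.
The earlier example file contained relatively few genes and covered three conditions, which made it fairly complex, so here we switch to a cleaner example dataset for practice.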

library(IsoformSwitchAnalyzeR)
data("exampleSwitchListAnalyzed")
sar <- subsetSwitchAnalyzeRlist(
  exampleSwitchListAnalyzed, 
  exampleSwitchListAnalyzed$isoformFeatures$condition_1 == 'COAD_ctrl'
)
sar

Step 3: Post Analysis

1. Ranking the switches

extractTopSwitches(
  sar, 
  filterForConsequences = TRUE, 
  n = 2,   # set n = NA to return a table of all ranked switches
  sortByQvals = TRUE    # if FALSE, rank by dIF instead; the default is TRUE
)
sar.ge_df=extractTopSwitches( 
  sar, 
  filterForConsequences = TRUE, 
  n = NA,                
  extractGenes = TRUE,    # when FALSE isoforms are returned
  sortByQvals = TRUE
)
sar.ge_df
sar.is_df=extractTopSwitches( 
  sar, 
  filterForConsequences = TRUE, 
  n = NA,                
  extractGenes = FALSE,    # when FALSE isoforms are returned
  sortByQvals = TRUE
)
head(sar.is_df,2)
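To make the two ranking modes concrete, here is a minimal base-R sketch on a toy table. The isoform IDs and numbers are hypothetical, and the ordering logic is only an approximation of what `sortByQvals` is described as doing (it assumes dIF ranking uses the absolute value), not the package's internal code.

```r
# Toy table; the isoform IDs and values are hypothetical.
toy <- data.frame(
  isoform_id = c("iso_A", "iso_B", "iso_C", "iso_D"),
  dIF = c(0.15, -0.40, 0.25, -0.12),
  isoform_switch_q_value = c(1e-8, 1e-3, 1e-5, 1e-2),
  stringsAsFactors = FALSE
)

# Analogous to sortByQvals = TRUE: smallest q-value first.
by_q <- toy[order(toy$isoform_switch_q_value), ]

# Analogous to sortByQvals = FALSE: largest absolute dIF first (assumed).
by_dif <- toy[order(-abs(toy$dIF)), ]

print(by_q$isoform_id)
print(by_dif$isoform_id)
```

The two orderings differ, which is why it is worth checking both when picking genes to plot.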

2. Visualizing the switch in a single gene ★

(1) Taking the highly significant ZAK gene as an example
subset(sar.is_df, gene_name == 'ZAK')
The two transcripts of the ZAK gene
switchPlot(sar, gene = 'ZAK')

The resulting plot:

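The bracket annotations above the bars appear to indicate p-values, that is, whether the difference is significant: ns means not significant, and * through *** mark significant differences.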
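As a rough illustration of how such labels are usually derived, the sketch below maps p-values to star labels. `p_to_stars` is a hypothetical helper, and the thresholds (0.05, 0.01, 0.001) are the common convention, assumed here rather than taken from the switchPlot source.

```r
# Hypothetical helper: convert p-values to conventional significance labels.
# Thresholds are the usual convention and are an assumption, not switchPlot's code.
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
    labels = c("***", "**", "*", "ns"),
    right = FALSE  # intervals are [lower, upper), so p = 0.05 is "ns"
  ))
}

labels <- p_to_stars(c(0.2, 0.03, 0.004, 0.0002))
print(labels)
```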

Figure: switchPlot output for ZAK
pdf(file = '~/li/test/ZAK.pdf', onefile = FALSE, height=6, width = 9)
switchPlot(sar, gene='ZAK')
dev.off()
(2) Batch plotting
switchPlotTopSwitches(
    switchAnalyzeRlist = sar, 
    n = 10,   # plot the 10 top-ranked switches
    filterForConsequences = FALSE, 
    splitFunctionalConsequences = TRUE
)
Figure: switchPlotTopSwitches output

3. Genome-wide Consequences Summaries

extractConsequenceSummary(
  sar,
  consequencesToAnalyze='all',  # or specify only the consequences of interest
  plotGenes = FALSE,           # enables analysis of genes (instead of isoforms)
  asFractionTotal = FALSE      # enables analysis of fraction of significant features
)
extractConsequenceEnrichment(
    exampleSwitchListAnalyzed,
    consequencesToAnalyze='all',
    analysisOppositeConsequence = TRUE,
    returnResult = FALSE # if TRUE returns a data.frame with the results
)

From the analysis above it is therefore quite clear that many of the opposing consequences are significantly unevenly distributed. In other words, many types of consequences seem to be used in a condition-specific manner.
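To give a feel for what "significantly unevenly distributed" means, here is a small sketch that tests whether one of two opposing consequences (say, domain gain versus domain loss) occurs more often than a 50/50 split would predict. The counts are made up, and the binomial test is used only for illustration; it is not necessarily the exact test extractConsequenceEnrichment applies.

```r
# Hypothetical counts of switches showing each of two opposing consequences.
n_gain <- 30
n_loss <- 10

# Null hypothesis: both consequences are equally likely (proportion = 0.5).
res <- binom.test(n_gain, n_gain + n_loss, p = 0.5)

print(res$estimate)  # observed fraction of "gain" switches
print(res$p.value)   # a small p-value suggests an uneven distribution
```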

4. Genome-wide Splicing Summaries

extractSplicingSummary(sar)
Figure: extractSplicingSummary output
extractSplicingEnrichment(
  sar,
  splicingToAnalyze = c('A3','MES','ATSS','ATTS'), 
  returnResult = TRUE # if TRUE returns a data.frame with the results
)

5. Finally, a volcano plot

library(ggplot2)
ggplot(data=sar$isoformFeatures, aes(x=dIF, y=-log10(isoform_switch_q_value))) +
  geom_point(
    aes( color=abs(dIF) > 0.1 & isoform_switch_q_value < 0.05 ), # default cutoff
    size=1
  ) +
  geom_hline(yintercept = -log10(0.05), linetype='dashed') + # default cutoff
  geom_vline(xintercept = c(-0.1, 0.1), linetype='dashed') + # default cutoff
  scale_color_manual('Significant\nIsoform Switch', values = c('black','red')) +
  labs(x='dIF', y='-Log10 ( Isoform Switch Q Value )') +
  theme_bw()
Figure: volcano plot

Many isoforms have a dIF value (effect size) very close to zero and yet a significant isoform switch q-value (black dots above the dashed horizontal line). This nicely illustrates why a cutoff on both the dIF and the q-value is necessary.
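The same point can be checked numerically. The sketch below uses a toy table with hypothetical values and counts how many isoforms pass the q-value cutoff alone versus both cutoffs, using the same thresholds as the plot (|dIF| > 0.1 and q < 0.05).

```r
# Toy isoform table; all values are hypothetical.
toy <- data.frame(
  dIF = c(0.02, -0.30, 0.05, 0.45, -0.01, 0.20),
  isoform_switch_q_value = c(1e-6, 1e-4, 0.01, 0.20, 1e-10, 0.03)
)

# Significant by q-value only.
sig_q <- toy$isoform_switch_q_value < 0.05

# Significant by q-value and with a large enough effect size.
sig_both <- sig_q & abs(toy$dIF) > 0.1

n_q <- sum(sig_q)
n_both <- sum(sig_both)
print(c(q_only = n_q, q_and_dIF = n_both))
```

In this toy table most of the isoforms that pass the q-value cutoff have a negligible dIF, so the combined filter keeps far fewer of them.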

