ballgown下载及使用

2020-09-05 本文已影响0人就是大饼

下载

我用的是R3.6.2的版本，对应BiocManager应该是3.10的版本

R版本对应BiocManager版本

if (!requireNamespace("BiocManager", quietly = TRUE))
+ install.packages("BiocManager")
BiocManager::install(version = "3.10")
Update all/some/none? [a/s/n]:
a
BiocManager::install("ballgown")
library("ballgown")

报错

错误: package or namespace load failed for ‘ballgown’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
 不存在叫‘XML’这个名字的程辑包

解决

install.packages("XML", type = "binary")

即可

 library("ballgown")
载入程辑包：‘ballgown’

The following object is masked from ‘package:base’:

    structure

使用

将xshell的文件下载到D:/R-3.6.2/library/ballgown/extdata中

data_directory = system.file('extdata', package='ballgown')
data_directory
bg = ballgown(dataDir=data_directory, samplePattern='Rice_D908', meas='all')
bg
##ballgown instance with 46933 transcripts and 14 samples

Accessing assembly data

structure(bg)$exon
GRanges object with 204872 ranges and 2 metadata columns:
           seqnames            ranges strand |        id transcripts
              <Rle>         <IRanges>  <Rle> | <integer> <character>
       [1]    chr01       30715-30813      - |         1           1
       [2]    chr01       31493-31600      - |         2           1
       [3]    chr01     165271-165369      - |         3           2
       [4]    chr01     166049-166156      - |         4           2
       [5]    chr01     265805-265938      + |         5           3
       ...      ...               ...    ... .       ...         ...
  [204868]    chr12 26990249-26990482      - |    204868       46932
  [204869]    chr12 26990579-26990815      - |    204869       46932
  [204870]    chr12 26990912-26991370      - |    204870       46932
  [204871]    chr12 27014897-27015357      - |    204871       46933
  [204872]    chr12 27018053-27018149      - |    204872       46933
  -------
  seqinfo: 12 sequences from an unspecified genome; no seqlengths

structure(bg)$intron
GRanges object with 157939 ranges and 2 metadata columns:
           seqnames            ranges strand |        id transcripts
              <Rle>         <IRanges>  <Rle> | <integer> <character>
       [1]    chr01       30814-31492      - |         1           1
       [2]    chr01     165370-166048      - |         2           2
       [3]    chr01     265939-269542      + |         3           3
       [4]    chr01     270455-274027      + |         4           3
       [5]    chr01     274746-275038      + |         5           3
       ...      ...               ...    ... .       ...         ...
  [157935]    chr12 26987793-26989097      - |    157935       46932
  [157936]    chr12 26989337-26990248      - |    157936       46932
  [157937]    chr12 26990483-26990578      - |    157937       46932
  [157938]    chr12 26990816-26990911      - |    157938       46932
  [157939]    chr12 27015358-27018052      - |    157939       46933
  -------
  seqinfo: 12 sequences from an unspecified genome; no seqlengths

structure(bg)$trans
GRangesList object of length 46933:
$`1`
GRanges object with 2 ranges and 2 metadata columns:
      seqnames      ranges strand |        id transcripts
         <Rle>   <IRanges>  <Rle> | <integer> <character>
  [1]    chr01 30715-30813      - |         1           1
  [2]    chr01 31493-31600      - |         2           1
  -------
  seqinfo: 12 sequences from an unspecified genome; no seqlengths

$`2`
GRanges object with 2 ranges and 2 metadata columns:
      seqnames        ranges strand |        id transcripts
         <Rle>     <IRanges>  <Rle> | <integer> <character>
  [1]    chr01 165271-165369      - |         3           2
  [2]    chr01 166049-166156      - |         4           2
  -------
  seqinfo: 12 sequences from an unspecified genome; no seqlengths

...
<46931 more elements>

transcript_fpkm = texpr(bg, 'FPKM')
transcript_cov = texpr(bg, 'cov')
whole_tx_table = texpr(bg, 'all')
exon_mcov = eexpr(bg, 'mcov')
junction_rcount = iexpr(bg)
whole_intron_table = iexpr(bg, 'all')
gene_expression = gexpr(bg)

结果

image.png

指定分组，及重复样本数目

pData(bg) =data.frame(id=sampleNames(bg), group=rep(c(1,0), each=14))
phenotype_table=pData(bg)

差异表达分析

stat_results = stattest(bg, feature='transcript', meas='FPKM', covariate='group')
报错  Error in model.frame.default(object, data, xlev = xlev) : 
  变数的长度不一样('libadjust')

pData(bg)=data.frame(pData(bg),time=rep(1:14,2))  ##用stattest 检验每个转录本在时间刻度上的差异表达
timecourse_results=stattest(bg,feature='transcript',meas='FPKM',covariate='time',timecourse=TRUE)
Error in model.frame.default(object, data, xlev = xlev) : 
  变数的长度不一样('libadjust')

待解决！！

注：

参考说明书：https://github.com/alyssafrazee/ballgown*
如果不下载到ballgown包的exdata文件夹里的话，可以If your data is stored in a directory structure other than the one specified above, you can use the samples
argument in the ballgown function: samples should be a vector (1-d array) with one entry per sample,where the entry gives the path to the folder containing that sample's .ctab files.
但是我没看懂，就下exdata了
在前面的准备过程中还下了好多好多包，反正各种报错又各种debug就是了T T

ballgown下载及使用

下载

使用

Accessing assembly data

指定分组，及重复样本数目

待解决！！

猜你喜欢

热点阅读