RNA-seq Data Processing (3): Preparing the Input Files for Differential Expression Analysis
2023-09-17
风知秋
DESeq2 and edgeR are two popular Bioconductor packages for analyzing differential expression, which take as input a matrix of read counts mapped to particular genomic features (e.g., genes).
Prepare the input files for differential expression analysis:
prepDE.py [options]
The main parameters are:
-i # input: a folder containing all sample sub-directories, or a text file listing each sample ID and the path to its GTF file
-g # output: the gene count matrix
-t # output: the transcript count matrix
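When -i points to a text file, each line pairs a sample ID with the path to that sample's GTF. The sketch below shows one way such a file could be generated. It assumes a hypothetical layout of one sub-directory per sample with a single GTF inside (<gtf_root>/<sample_id>/<name>.gtf), a whitespace-separated two-column format, and paths without spaces; the function name write_sample_list is made up for illustration and is not part of prepDE.py.

```python
from pathlib import Path


def write_sample_list(gtf_root, out_path):
    """Write one 'sample_id path/to/sample.gtf' line per GTF under gtf_root.

    Assumption (not from prepDE.py itself): each sample has its own
    sub-directory, and the sub-directory name is used as the sample ID.
    """
    root = Path(gtf_root)
    # Look exactly one level down: <gtf_root>/<sample_id>/<name>.gtf
    gtfs = sorted(root.glob("*/*.gtf"))
    # Sample ID and path separated by a single space.
    lines = [f"{gtf.parent.name} {gtf}" for gtf in gtfs]
    text = "\n".join(lines) + ("\n" if lines else "")
    Path(out_path).write_text(text)
    return lines
```

Calling write_sample_list("ballgown", "sample_lst.txt") (where "ballgown" is only a placeholder directory name) would produce a list file that can then be passed to -i.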
An example run:
The input is the set of GTF files generated in the previous step.
python prepDE.py -i sample_lst.txt -g gene_count_matrix.csv -t transcript_count_matrix.csv
These count matrices (CSV files) can then be imported into R for use by DESeq2 and edgeR (using the DESeqDataSetFromMatrix and DGEList functions, respectively).
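Before importing a matrix into R, it can be worth checking that it looks like a count matrix at all. The following is a minimal sketch using only the Python standard library. It assumes the first column holds the feature ID (gene or transcript) and every remaining column is one sample of integer counts; the function name summarize_count_matrix is hypothetical.

```python
import csv


def summarize_count_matrix(csv_path):
    """Return (sample_names, n_features, totals) for a count matrix CSV.

    Assumption: column 1 is the feature ID and all other columns are
    per-sample integer read counts.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        samples = header[1:]
        totals = [0] * len(samples)
        n_features = 0
        for row in reader:
            if not row:
                continue  # skip blank lines
            # int() raises ValueError if a cell is not an integer count.
            counts = [int(value) for value in row[1:]]
            if len(counts) != len(samples):
                raise ValueError(f"row {row[0]!r} has {len(counts)} counts, "
                                 f"expected {len(samples)}")
            totals = [t + c for t, c in zip(totals, counts)]
            n_features += 1
    return samples, n_features, dict(zip(samples, totals))
```

For example, summarize_count_matrix("gene_count_matrix.csv") returns the sample names, the number of genes, and the total count per sample. A sample whose total is zero, or a ValueError raised on a non-integer cell, suggests a problem in the earlier steps rather than something to fix in DESeq2 or edgeR.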