Bioinformatics course_F6

2017-08-30  Ternq8

F6 wiki

F6_Bioinformatics.png

BOOK: shell scripting


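cd ..        # move up to the parent directory
options:
ls -l -t     # long listing, sorted by modification time (newest first)
ls -lt       # the same; short options can be combined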


https://fantom6-collaboration.gsc.riken.jp/files/mongoDB/

ls -lt | head    # show only the first 10 lines
bunzip2 RunAll_sample_summary.20170830.tsv.bz2    # decompress the .bz2 file

less -S RunAll_sample_summary.20170830.tsv    # -S: do not wrap long lines, scroll sideways instead
less RunAll_sample_summary.20170830.tsv
bash-3.2$ wc RunAll_sample_summary.20170830.tsv
7191 212813 1857590 RunAll_sample_summary.20170830.tsv
# columns: lines, words, bytes
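
To check which number in the wc output means what, each count can be requested on its own. This is a minimal sketch on a made-up three-line table (the file content is an assumption, not course data):

```shell
# Build a tiny tab-separated file (hypothetical content) to count.
tmp=$(mktemp)
printf 'id\tvalue\nS1\t10\nS2\t20\n' > "$tmp"

# Reading from stdin keeps the file name out of the output;
# tr strips the padding some wc implementations add.
n_lines=$(wc -l < "$tmp" | tr -d ' ')   # newline count
n_words=$(wc -w < "$tmp" | tr -d ' ')   # whitespace-separated words
n_bytes=$(wc -c < "$tmp" | tr -d ' ')   # bytes

echo "lines=$n_lines words=$n_words bytes=$n_bytes"
rm -f "$tmp"
```

For plain ASCII text the byte count equals the number of characters, which is why the third column is often read as "letters".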


Are the sample IDs unique or not?
cut -f1 RunAll_sample_summary.20170830.tsv | head
Result: the sample IDs are unique.
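
Looking at the first ten IDs with head does not prove uniqueness for the whole file. One common way to test it is to sort the first column and ask uniq for repeated values. The sketch below uses a small made-up table with one deliberate duplicate; on the real file the same pipeline would be run on RunAll_sample_summary.20170830.tsv.

```shell
# Hypothetical sample table: the ID "S2" appears twice on purpose.
tmp=$(mktemp)
printf 'S1\ta\nS2\tb\nS2\tc\nS3\td\n' > "$tmp"

# cut -f1 keeps the first tab-separated column;
# uniq -d prints only values that occur more than once (input must be sorted).
dupes=$(cut -f1 "$tmp" | sort | uniq -d)

if [ -z "$dupes" ]; then
  echo "all IDs are unique"
else
  echo "duplicated IDs: $dupes"
fi
rm -f "$tmp"
```

An empty result from uniq -d means every ID occurs exactly once.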

About the options of cut: http://man.linuxde.net/cut


Introduction to SAM/BAM
mapping
format for reads

FASTQ: FASTA + quality

BAM is the same as SAM, but compressed (binary)

Each header line of a SAM file starts with @
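
Because only header lines begin with @, header and alignment records can be separated with grep. This is a rough sketch on a hand-written three-line SAM fragment (the read and reference names are made up):

```shell
# Toy SAM content: two header lines (@HD, @SQ) and one alignment record.
sam=$(mktemp)
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\nread1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' > "$sam"

n_header=$(grep -c '^@' "$sam")    # lines starting with @
n_align=$(grep -vc '^@' "$sam")    # all other lines, i.e. alignments

echo "header lines: $n_header, alignment lines: $n_align"
rm -f "$sam"
```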


Bedtools

BED format

chr start end | optional: name, score, strand (+ / - / .)

start < end
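
The start < end rule can be checked with awk before a file is handed to bedtools. A minimal sketch, assuming tab-separated columns and a made-up file in which the second line is deliberately wrong:

```shell
# Hypothetical BED file: line 2 has start (300) greater than end (250).
bed=$(mktemp)
printf 'chr1\t100\t200\tpeakA\t0\t+\nchr1\t300\t250\tpeakB\t0\t-\nchr2\t50\t80\n' > "$bed"

# Print the line number of every record that violates start < end.
bad=$(awk -F'\t' '($2 + 0) >= ($3 + 0) { print NR }' "$bed")

echo "lines violating start < end: ${bad:-none}"
rm -f "$bed"
```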

mysql

set theory for genomics

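bedtools intersect --help
bedtools intersect -u -a XX.bed -b XXX.bed | head  # -u: report each A entry once if it overlaps anything in B
# -wa: write the original A entry for each overlap
# -v: report only the A entries with no overlap in B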
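
To make the set logic behind -u and -v concrete, the overlap test can be imitated in plain awk. This is not how bedtools is implemented, only a brute-force sketch on two made-up files: two intervals on the same chromosome overlap when each one starts before the other ends.

```shell
a=$(mktemp); b=$(mktemp)
printf 'chr1\t100\t200\nchr1\t500\t600\nchr2\t10\t20\n' > "$a"   # toy A file
printf 'chr1\t150\t250\nchr2\t30\t40\n' > "$b"                   # toy B file

# overlaps hit  -> A entries with at least one overlap in B (like -u)
# overlaps miss -> A entries with no overlap in B (like -v)
overlaps() {
  awk -F'\t' -v want="$1" '
    NR == FNR { chr[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; n = NR; next }  # load B
    {
      found = 0
      for (i = 1; i <= n; i++)
        if ($1 == chr[i] && ($2 + 0) < e[i] && s[i] < ($3 + 0)) { found = 1; break }
      if ((want == "hit") == found) print
    }' "$b" "$a"
}

with_hit=$(overlaps hit)
without_hit=$(overlaps miss)
printf 'overlapping:\n%s\nnot overlapping:\n%s\n' "$with_hit" "$without_hit"
rm -f "$a" "$b"
```

The nested loop compares every A entry with every B entry, so it only makes sense for tiny files; for real data bedtools intersect is the tool to use.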

Is this overlap significant?

bedtools fisher -m -a XXX.bed -b XXX.bed -g hg38.genome
# -m: merge overlapping intervals before testing
# the output includes a two-tail p-value

bedtools shuffle

bedtools shuffle -i hg38_gwas.bed
# takes only one BED file as input; the genome file (-g) is still required
bedtools shuffle -i hg38_gwas.bed -g hg38.genome | bedtools sort > hg38_gwas_shuffle.bed
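
The file passed with -g is a two-column, tab-separated table of chromosome names and sizes. The notes do not say how hg38.genome was made; one common approach, shown here as an assumption, is to take the first two columns of a FASTA index (.fai). The sketch uses a toy index with made-up chromosomes:

```shell
fai=$(mktemp); genome=$(mktemp)

# Toy stand-in for a FASTA index: name, length, offset, bases per line, bytes per line.
printf 'chrA\t1000\t6\t60\t61\nchrB\t500\t1030\t60\t61\n' > "$fai"

# Keep only name and length: this is the layout bedtools expects for -g.
cut -f1,2 "$fai" > "$genome"

genome_content=$(cat "$genome")
echo "$genome_content"
rm -f "$fai" "$genome"
```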

Format conversion
Coverage Plots


R shiny
R markdown
knitr
R: a platform for publishing results
R calls variables "objects"

df = data.frame(
a = c(),
b = c()
)
summary(df)
# combine by columns

help("[")

sum(1, 2, NA, na.rm=TRUE)
# 3
sum(1, NULL)
# 1
R
barplot(c(1,2,3))
q() # get out of R

First steps in R with RStudio

Online: http://try.jupyter.org/ (choose Welcome R-demo)

How to generate random numbers in R:
http://blog.csdn.net/lilanfeng1991/article/details/18505723

hist(c(1,2,4))
hist(runif(100))
hist(rnorm(100))

GitHub:
https://www.r-bloggers.com/rstudio-and-github/
