Bioinformatics course_F6

2017-08-30 Ternq8

F6 wiki

F6_Bioinformatics.png

Intro:
ls
grep
uniq

BOOK: shell scripting

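cd ..     # go up to the parent directory
Options:
ls -l -t  # long listing, sorted by modification time (newest first)
ls -lt    # the same; short options can be combined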

https://fantom6-collaboration.gsc.riken.jp/files/mongoDB/

ls -lt | head  # show only the first 10 lines
bunzip2 RunAll_sample_summary.20170830.tsv.bz2  # decompress the bz2 file

less -S RunAll_sample_summary.20170830.tsv  # -S: chop long lines instead of wrapping them
less RunAll_sample_summary.20170830.tsv
bash-3.2$ wc RunAll_sample_summary.20170830.tsv
7191 212813 1857590 RunAll_sample_summary.20170830.tsv
# columns: lines, words, bytes

Are the sample IDs unique or not?
cut -f1 RunAll_sample_summary.20170830.tsv | head  # column 1 (sample ID), first 10 lines
Output: the IDs are unique.
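
Looking at the first ten IDs with head cannot by itself show that every ID is unique. A minimal sketch of one way to check, assuming a tab-separated file with the sample ID in column 1; the function name count_duplicate_ids and the demo table are made up for illustration:

```shell
# Hypothetical helper: count how many column-1 values occur more than once.
count_duplicate_ids() {
    # cut -f1 : keep the first tab-separated column
    # sort    : uniq only detects adjacent repeats, so sort first
    # uniq -d : print one copy of each repeated value
    # tr      : strip the padding some wc implementations add
    cut -f1 "$1" | sort | uniq -d | wc -l | tr -d ' '
}

# Demo on a small made-up table (not the real summary file).
demo=$(mktemp)
printf 'S1\tA\nS2\tB\nS1\tC\n' > "$demo"
count_duplicate_ids "$demo"   # S1 appears twice, so one ID is duplicated
rm -f "$demo"
```

A result of 0 means every ID is unique. On the course file the call would be count_duplicate_ids RunAll_sample_summary.20170830.tsv.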

cut options: http://man.linuxde.net/cut

Introduction to SAM/BAM
mapping
formats for reads

FASTQ: FASTA + quality

BAM is the same as SAM, but compressed (binary)

Each header line of a SAM file starts with @

QNAME: the name of the read
FLAG: what happened with the reads
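RNAME: the name of the reference sequence
POS: 1-based leftmost mapping position
CIGAR: geometry of the alignment (matches, insertions, deletions, clipping)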
drunken sailor？
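
The fields listed above can be pulled out of a SAM file with plain awk. This is only a sketch, assuming an uncompressed, tab-separated SAM file; the function name sam_core_fields and the demo read are invented for illustration. Note that CIGAR is column 6, because MAPQ sits in column 5.

```shell
# Hypothetical helper: print QNAME, FLAG, RNAME, POS and CIGAR for each alignment.
sam_core_fields() {
    # Header lines start with '@' and are skipped.
    # Columns 1-4 are QNAME, FLAG, RNAME, POS; column 5 is MAPQ; column 6 is CIGAR.
    awk -F'\t' 'BEGIN { OFS = "\t" } !/^@/ { print $1, $2, $3, $4, $6 }' "$1"
}

# Demo on a made-up file with one header line and one alignment.
demo=$(mktemp)
printf '@HD\tVN:1.6\tSO:coordinate\n' > "$demo"
printf 'read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$demo"
sam_core_fields "$demo"
rm -f "$demo"
```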

Bedtools

BED format

chr start end | optional: name score strand (+/-/.) |

start < end
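
The start < end rule is easy to check before handing a file to bedtools. A minimal sketch, assuming a tab-separated BED file without track or browser header lines; bed_bad_intervals is a made-up name:

```shell
# Hypothetical helper: print BED lines whose start is not smaller than their end.
bed_bad_intervals() {
    # Column 2 is start, column 3 is end; "+ 0" forces a numeric comparison.
    awk -F'\t' '($2 + 0) >= ($3 + 0)' "$1"
}

# Demo on two made-up intervals; only the second one breaks the rule.
demo=$(mktemp)
printf 'chr1\t10\t20\tpeakA\t0\t+\nchr1\t50\t40\tpeakB\t0\t-\n' > "$demo"
bed_bad_intervals "$demo"   # prints only the peakB line
rm -f "$demo"
```

No output means every interval satisfies start < end.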

mysql

set theory for genomics

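bedtools intersect --help
bedtools intersect -u -a XX.bed -b XXX.bed | head  # -u: report each A entry once if it overlaps anything in B
# -wa: write the original A entry for each overlap
# -v: report only the A entries with no overlap in B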
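
To make the set logic behind -u and -v concrete, here is a naive re-implementation in plain awk. It is not bedtools and not how bedtools works internally: it compares every A interval with every B interval, so it only suits tiny files. The function name naive_intersect and the demo intervals are invented for illustration.

```shell
# Hypothetical helper (plain awk, not bedtools): naive all-against-all overlap test.
# usage: naive_intersect MODE A.bed B.bed
#   MODE u: print each A interval once if it overlaps at least one B interval
#   MODE v: print each A interval that overlaps no B interval
naive_intersect() {
    awk -F'\t' -v mode="$1" -v bfile="$3" '
        BEGIN {
            # Load every B interval into memory.
            while ((getline line < bfile) > 0) {
                split(line, f, "\t")
                n++; chr[n] = f[1]; s[n] = f[2] + 0; e[n] = f[3] + 0
            }
        }
        {
            hit = 0
            # BED intervals are half-open, so touching ends do not overlap.
            for (i = 1; i <= n; i++)
                if ($1 == chr[i] && $2 + 0 < e[i] && s[i] < $3 + 0) { hit = 1; break }
            if ((mode == "u" && hit) || (mode == "v" && !hit)) print
        }
    ' "$2"
}

# Demo on made-up intervals.
a=$(mktemp); b=$(mktemp)
printf 'chr1\t10\t20\nchr1\t30\t40\nchr2\t10\t20\n' > "$a"
printf 'chr1\t15\t25\nchr2\t20\t30\n' > "$b"
naive_intersect u "$a" "$b"   # only chr1 10 20 overlaps something in B
naive_intersect v "$a" "$b"   # chr1 30 40 and chr2 10 20 overlap nothing
rm -f "$a" "$b"
```

Together, the u and v outputs split A into two disjoint sets, which is the "set theory" view of the two flags.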

is this overlap significant?

bedtools fisher -m -a XXX.bed -b XXX.bed -g hg38.genome
# -m: merge overlapping intervals before counting overlaps
## the output includes a two-tail p-value

bedtools shuffle

bedtools shuffle -i hg38_gwas.bed
# only one BED file as input (a genome file, -g, is also required)

bedtools shuffle -i hg38_gwas.bed -g hg38.genome | bedtools sort > hg38_gwas_shuffle.bed

Format conversion
Coverage Plots

R shiny
R markdown
knitr
R: a platform for releasing results
R calls variables "objects"

vectors
c()
paste0()
Lists: a second cornerstone class of R
Any object can be put into a list.
data frame

df = data.frame(
a = c(),
b = c()
)
summary(df)
# the vectors are combined as columns

subsetting elements of objects
by coordinate
or by name

help("[")

“nothing”
NA: missing values, not available
NULL: nothing
NaN: 0/0 # result is not a number

sum(1, 2, NA, na.rm=TRUE)
3
sum(1, NULL)
1

In the terminal:

R
barplot(c(1,2,3))
q() # get out of R

First steps in R with RStudio

Online: http://try.jupyter.org/ (choose Welcome R-demo)

How to generate random numbers in R:
http://blog.csdn.net/lilanfeng1991/article/details/18505723

hist(c(1,2,4))
hist(runif(100))  # 100 uniform random numbers between 0 and 1
hist(rnorm(100))  # 100 draws from the standard normal distribution

GitHub:
https://www.r-bloggers.com/rstudio-and-github/
