Bioinformatics course_F6
- Intro:
ls
grep
uniq
BOOK: shell scripting
cd
..
options
ls -l -t
long listing, sorted by modification time (newest first).
ls -lt
the same
https://fantom6-collaboration.gsc.riken.jp/files/mongoDB/
ls -lt | head
first 10 lines
bunzip2 RunAll_sample_summary.20170830.tsv.bz2
decompress the .bz2 file
less -S RunAll_sample_summary.20170830.tsv
less RunAll_sample_summary.20170830.tsv
bash-3.2$ wc RunAll_sample_summary.20170830.tsv
7191 212813 1857590 RunAll_sample_summary.20170830.tsv
lines words bytes
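Each of the three wc counts can also be asked for on its own. A small sketch on a throwaway file (the file and its contents are made up for illustration, not the course data):

```shell
# hypothetical demo file, not the course data
tmp=$(mktemp)
printf 'id\tname\nS1\tliver\nS2\tbrain\n' > "$tmp"

n_lines=$(wc -l < "$tmp")   # lines only
n_words=$(wc -w < "$tmp")   # words only
n_bytes=$(wc -c < "$tmp")   # bytes only
echo "lines=$n_lines words=$n_words bytes=$n_bytes"
rm -f "$tmp"
```

Reading from stdin (`< file`) keeps the file name out of the output, so the count can be stored in a variable.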
Are the sample IDs unique or not?
cut -f1 RunAll_sample_summary.20170830.tsv | head
>
redirects the output to a file
unique
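One way to answer the uniqueness question is to look for repeated values in the first column. A minimal sketch, assuming a tab-separated table with the ID in column 1 (the demo table below is made up):

```shell
# hypothetical demo table; the first column plays the role of the sample ID
tmp=$(mktemp)
printf 'S1\tliver\nS2\tbrain\nS1\tlung\n' > "$tmp"

# uniq -d prints one copy of each value that occurs more than once;
# uniq only compares adjacent lines, so the input is sorted first
dups=$(cut -f1 "$tmp" | sort | uniq -d)

if [ -z "$dups" ]; then
    echo "all IDs are unique"
else
    echo "duplicated IDs: $dups"
fi
rm -f "$tmp"
```

On the real file the same pipeline would start with `cut -f1 RunAll_sample_summary.20170830.tsv`; empty output means no ID is repeated.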
Options for cut: http://man.linuxde.net/cut
Introduction to SAM/BAM
mapping
format for reads
FASTQ: FASTA + quality
BAM is the same as SAM, but compressed (binary)
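Header lines of SAM start with @; each alignment line has these fields (among others):
- QNAME: the name of the read
- FLAG: what happened with the read
- RNAME: name of the reference sequence
- POS: leftmost mapping position (1-based)
- CIGAR: geometry of the alignment (matches, insertions, deletions)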
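In a SAM file the header lines start with @ and the alignment lines are tab-separated, so the fields above can be pulled out with awk. A minimal sketch on a made-up two-line SAM snippet (read name, position and sequence are invented):

```shell
# hypothetical two-line SAM snippet: one header line and one alignment line
sam=$(printf '@SQ\tSN:chr1\tLN:1000\nread1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n')

# skip header lines (they start with @), then print selected fields
# columns: 1=QNAME 2=FLAG 3=RNAME 4=POS 6=CIGAR
fields=$(printf '%s\n' "$sam" | awk -F'\t' '!/^@/ {print $1, $2, $3, $4, $6}')
echo "$fields"
```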
drunken sailor?
Bedtools
BED format
chr start end | optional (name score strand (+/-/.)) |
start < end
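The start < end rule is easy to check with awk. A minimal sketch on made-up BED lines, the second of which is deliberately wrong:

```shell
# hypothetical BED lines; the second one is deliberately invalid (start >= end)
bed=$(printf 'chr1\t100\t200\tpeakA\t0\t+\nchr1\t500\t400\tpeakB\t0\t-\n')

# print the name of every line where start is not smaller than end
# (adding 0 forces a numeric comparison)
bad=$(printf '%s\n' "$bed" | awk -F'\t' '($2 + 0) >= ($3 + 0) {print $4}')
echo "invalid intervals: $bad"
```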
mysql
set theory for genomics
bedtools intersect --help
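bedtools intersect -u -a XX.bed -b XXX.bed | head # -u: report each entry of A once if it overlaps anything in B
# -wa: write the original entry of A for each overlap
# -v: report only entries of A with no overlap in B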
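What -u and -v report can be illustrated without bedtools. The sketch below is a plain awk stand-in, not how bedtools is implemented, and the intervals are made up. Two intervals on the same chromosome overlap when each one starts before the other ends.

```shell
# hypothetical intervals (chr, start, end, name); not course data
a=$(mktemp); b=$(mktemp)
printf 'chr1\t100\t200\ta1\nchr1\t300\t400\ta2\nchr2\t100\t200\ta3\n' > "$a"
printf 'chr1\t150\t250\tb1\nchr1\t180\t220\tb2\n' > "$b"

# first pass (NR == FNR) stores B; second pass tests each A interval against B
# hit == 1: A overlaps at least one B interval (like -u, reported once)
# hit == 0: A overlaps nothing in B (like -v)
classify='
NR == FNR { chr[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; n = NR; next }
{
    hit = 0
    for (i = 1; i <= n; i++)
        if ($1 == chr[i] && ($2 + 0) < e[i] && s[i] < ($3 + 0)) { hit = 1; break }
    print $4, hit
}'
result=$(awk -F'\t' "$classify" "$b" "$a")
echo "$result"
rm -f "$a" "$b"
```

With bedtools itself the corresponding calls would be `bedtools intersect -u -a A.bed -b B.bed` and `bedtools intersect -v -a A.bed -b B.bed`. The awk loop compares every pair, so it is only meant for tiny examples.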
is this overlap significant?
bedtools fisher -m -a XXX.bed -b XXX.bed -g hg38.genome
# -m: merge overlapping intervals before looking at overlap
## two-tailed p-value
bedtools shuffle
bedtools shuffle -i hg38_gwas.bed -g hg38.genome
# only one BED file as input, plus the genome file (-g)
bedtools shuffle -i hg38_gwas.bed -g hg38.genome | bedtools sort > hg38_gwas_shuffle.bed
Format conversion
Coverage Plots
R shiny
R markdown
knitR
R: a platform for releasing results
R calls variables “objects”
- vectors
c()
paste0()
- Lists: a second cornerstone class of R
Any object can be put into a list.
- dataframe
df = data.frame(
a = c(),  # values of column a go here
b = c()   # values of column b go here
)
summary(df)
# combine by columns
- subsetting elements of objects
by coordinate
or by name
help("[")
- “nothing”
NA: missing values, not available
NULL: nothing
NaN: 0/0 # result is not a number
sum(1,2,NA, na.rm=T)
3
sum(1, NULL)
1
- in terminal
R
barplot(c(1,2,3))
q() # get out of R
First steps in R with RStudio
online: http://try.jupyter.org/ (choose Welcome R-demo)
How to generate random numbers in R:
http://blog.csdn.net/lilanfeng1991/article/details/18505723
hist(c(1,2,4))
hist(runif(100))
hist(rnorm(100))