Building a FASTA from SNPs
# SNP download:
ftp://ftp-mouse.sanger.ac.uk/current_snps/strain_specific_vcfs/

# VCF format:
ref:http://www.internationalgenome.org/wiki/Analysis/vcf4.0

# Filtering:
ref:https://biopet.github.io/vcffilter/0.2/index.html (includes download link)
java -jar /home/pc/biosoft/vcffilter-assembly-0.1.jar --help
This tool can also filter, but it was not used here.
# Filter command:
ref:https://github.com/vcflib/vcflib
/home/pc/biosoft/Vcflib/vcflib/bin/vcffilter -f "FILTER = PASS" BALB_cJ.mgp.v5.snps.dbSNP142.vcf > BALB.vcf
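For reference, the same PASS-only selection can be sketched with plain awk, which helps when vcflib is not installed. This is a minimal sketch on a hypothetical two-record toy file (`in.vcf` and its contents are made up for illustration); it assumes a tab-separated VCF whose 7th column is FILTER, as the VCF format defines.

```shell
# Hypothetical toy input; the real file would be BALB_cJ.mgp.v5.snps.dbSNP142.vcf
tmp=$(mktemp -d)
printf '##fileformat=VCFv4.1\n' > "$tmp/in.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$tmp/in.vcf"
printf '1\t100\t.\tA\tG\t50\tPASS\t.\n' >> "$tmp/in.vcf"
printf '1\t200\t.\tC\tT\t10\tLowQual\t.\n' >> "$tmp/in.vcf"

# Keep header lines (starting with #) and records whose FILTER column is PASS
awk -F'\t' '/^#/ || $7 == "PASS"' "$tmp/in.vcf" > "$tmp/pass.vcf"

# Count the records that survived (header lines excluded)
kept=$(grep -vc '^#' "$tmp/pass.vcf")
echo "$kept"
```

Keeping the header lines matters because downstream tools such as bgzip/tabix and vcf-consensus expect a complete VCF, not just the records.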
Proportion of records that passed:
4576884/5203549 ≈ 0.8796 (about 88%)
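The ratio can be recomputed from the two record counts. A small sketch using the counts recorded in these notes; with the real files, each count would presumably come from something like `grep -vc '^#' file.vcf` (the variable names here are only illustrative).

```shell
total=5203549    # records before filtering (from the note above)
passed=4576884   # records with FILTER = PASS (from the note above)

# awk does the floating-point division; LC_ALL=C keeps the decimal point a "."
ratio=$(LC_ALL=C awk -v p="$passed" -v t="$total" 'BEGIN { printf "%.4f", p / t }')
echo "$ratio"
```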

# Software: vcftools
ref:https://sourceforge.net/projects/vcftools/
ref2:http://vcftools.sourceforge.net/index.html
# Command:

cat ref.fa | vcf-consensus file.vcf.gz > out.fa
# Reference genome:
ftp-mouse.sanger.ac.uk/ref/GRCm38_68.fa
Inspect after download:
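One quick way to inspect a reference FASTA is to list its header lines, which shows how the sequences are named. A minimal sketch on a hypothetical toy file (the header text below is invented and is not the actual content of GRCm38_68.fa):

```shell
# Hypothetical toy FASTA standing in for the downloaded reference
tmp=$(mktemp -d)
printf '>1 toy description\nACGT\n>2 toy description\nGGCC\n' > "$tmp/ref.fa"

# Header lines start with ">"; the first word after ">" is the sequence name
headers=$(grep '^>' "$tmp/ref.fa")
echo "$headers"
```

On a full genome, piping the same `grep` through `head` keeps the output short.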








Conclusion: building against the UCSC mm10 reference genome works.
Steps:
1. Download the .tbi index for the VCF:
axel -n 10 ftp://ftp-mouse.sanger.ac.uk/current_snps/strain_specific_vcfs/BALB_cJ.mgp.v5.snps.dbSNP142.vcf.gz.tbi
Or build the index yourself:
gunzip BALB.vcf.gz
bgzip -c BALB.vcf > BALB.vcf.gz
tabix -p vcf BALB.vcf.gz
2. vcftools:
cat ../mm10.chr.fa | vcf-consensus BALB.vcf.gz > BALB.fa
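This runs into a problem with the chr prefix in the FASTA headers, so strip it, build the consensus, then add it back: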
sed 's/>chr/>/g' ../mm10.chr.fa > mm10.fa
cat mm10.fa | vcf-consensus BALB.vcf.gz > BALB.fa
sed -i 's/>/>chr/g' BALB.fa
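A naming mismatch of this kind can be confirmed before running vcf-consensus by comparing the contig names used in the VCF records with the names in the FASTA headers. This is a hedged sketch on hypothetical toy files (`ref.fa`, `in.vcf`, and their contents are made up); it assumes the VCF uses bare names such as `1` while the FASTA uses `chr1`, which is what the fix above implies.

```shell
# Hypothetical toy inputs standing in for mm10.chr.fa and BALB.vcf
tmp=$(mktemp -d)
printf '>chr1\nACGT\n>chr2\nGGCC\n' > "$tmp/ref.fa"
printf '##fileformat=VCFv4.1\n' > "$tmp/in.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$tmp/in.vcf"
printf '1\t2\t.\tC\tT\t50\tPASS\t.\n' >> "$tmp/in.vcf"

# Contig names used by the VCF records (column 1)
grep -v '^#' "$tmp/in.vcf" | cut -f1 | sort -u > "$tmp/vcf.names"

# Contig names in the FASTA headers (first word after ">")
grep '^>' "$tmp/ref.fa" | sed 's/^>//; s/[[:space:]].*//' | sort -u > "$tmp/fa.names"

# Names present in the VCF but absent from the FASTA
missing=$(comm -23 "$tmp/vcf.names" "$tmp/fa.names")
echo "$missing"

# After stripping the chr prefix from the FASTA headers, nothing should be missing
sed 's/^>chr/>/' "$tmp/ref.fa" | grep '^>' | sed 's/^>//' | sort -u > "$tmp/fa.nochr.names"
missing_after=$(comm -23 "$tmp/vcf.names" "$tmp/fa.nochr.names")
```

An empty `missing_after` means every contig named in the VCF has a matching FASTA sequence.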
Use samtools faidx to check whether the two FASTA files match:
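Because SNP substitutions do not change sequence lengths, the reference and the strain FASTA should list the same sequence names and lengths. The sketch below computes those two values with awk as a stand-in for the first two columns of a `.fai` index; the helper `fa_lengths` and the toy files are hypothetical. With the real files, `samtools faidx` followed by `cut -f1,2` on each `.fai` should give the same kind of table.

```shell
# Print "name<TAB>length" for each sequence in a FASTA file
fa_lengths() {
  awk '/^>/ { if (name != "") print name "\t" len
              name = substr($1, 2); len = 0; next }
            { len += length($0) }
       END  { if (name != "") print name "\t" len }' "$1"
}

# Hypothetical toy files: strain.fa differs from ref.fa by one substituted base
tmp=$(mktemp -d)
printf '>chr1\nACGT\nAC\n>chr2\nGGCC\n' > "$tmp/ref.fa"
printf '>chr1\nATGT\nAC\n>chr2\nGGCC\n' > "$tmp/strain.fa"

fa_lengths "$tmp/ref.fa" > "$tmp/ref.len"
fa_lengths "$tmp/strain.fa" > "$tmp/strain.len"

# Identical name/length tables mean the consensus kept the reference layout
if diff -q "$tmp/ref.len" "$tmp/strain.len" > /dev/null; then
  same=yes
else
  same=no
fi
echo "$same"
```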


