2021-04-05 近交系数的摸索
计算我第二次过滤后的数据中的近交系数
[lyc@200server ~]$ ./plink --bfile clean2 --allow-extra-chr --het
Calculating allele frequencies... done.
Total genotyping rate is 0.977479.
3010393 variants and 141 people pass filters and QC.
Note: No phenotypes present.
--het: 3009972 variants scanned, report written to plink.het .
[lyc@200server ~]$ head -n 5 plink.het
FID IID O(HOM) E(HOM) N(NM) F
PB-362 PB-362 2513372 2.026e+06 2973035 0.5144
PB-363 PB-363 2508234 2.012e+06 2952631 0.5274
PB-364 PB-364 2493856 2.019e+06 2962142 0.5036
PB-365 PB-365 2477780 2.005e+06 2942252 0.5044
我有种不详的预感,0.8以下去掉后会没剩多少
又到了官网的解释
Inbreeding
--het ['small-sample'] ['gz']
--ibc
--het computes observed and expected autosomal homozygous genotype counts for each sample, and reports method-of-moments F coefficient estimates (i.e. (<observed hom. count> - <expected count>) / (<total observations> - <expected count>)) to plink.het. (The 'gz' modifier has the usual effect.) Expected counts are based on loaded (via --read-freq) or imputed MAFs; if there are very few samples in your immediate fileset, --read-freq is practically mandatory since imputed MAFs are wildly inaccurate in that case.
By default, the n/(n-1) multiplier in Nei's expected homozygosity formula is now omitted, since n may be unknown when using --read-freq. The 'small-sample' modifier causes the multiplier to be included, while forcing --het to use imputed MAFs (and known ns) from founders in the immediate dataset. (--maf-succ is not applied here.)
--ibc (ported from GCTA) calculates three inbreeding coefficients for each sample, and writes a report to plink.ibc. Briefly, Fhat1 is the usual variance-standardized relationship minus 1, Fhat2 is approximately equal to the --het estimate, and Fhat3 is based on the correlation between uniting gametes.
These calculations do not take LD into account. It is usually a good idea to perform some form of LD-based pruning before invoking them.
讲道理还是没告诉我怎么删除那些小于阈值的数据
看样子必须去写R的脚本了
然后我就在linux中的R安装叫dplyr的包,试图设定一个阈值去删除那些行数,然后只是安装这个包就陷入了深渊
> install.packages("dplyr")
Warning in install.packages("dplyr") :
'lib="/usr/local/lib64/R/library"'不可写
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
‘~/R/x86_64-pc-linux-gnu-library/3.5’
to install packages into? (yes/No/cancel) yes
--- 在此連線階段时请选用CRAN的鏡子 ---
Warning: failed to download mirrors file (无法打开URL'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/usr/local/lib64/R/doc/CRAN_mirrors.csv'
Secure CRAN mirrors
1: 0-Cloud [https] 2: Algeria [https]
3: Australia (Canberra) [https] 4: Australia (Melbourne 1) [https]
5: Australia (Melbourne 2) [https] 6: Australia (Perth) [https]
7: Austria [https] 8: Belgium (Ghent) [https]
9: Brazil (PR) [https] 10: Brazil (RJ) [https]
11: Brazil (SP 1) [https] 12: Brazil (SP 2) [https]
13: Bulgaria [https] 14: Chile 1 [https]
15: Chile 2 [https] 16: China (Guangzhou) [https]
17: China (Lanzhou) [https] 18: China (Shanghai) [https]
19: Colombia (Cali) [https] 20: Czech Republic [https]
21: Denmark [https] 22: East Asia [https]
23: Ecuador (Cuenca) [https] 24: Ecuador (Quito) [https]
25: Estonia [https] 26: France (Lyon 1) [https]
27: France (Lyon 2) [https] 28: France (Marseille) [https]
29: France (Montpellier) [https] 30: France (Paris 2) [https]
31: Germany (Erlangen) [https] 32: Germany (Göttingen) [https]
33: Germany (Münster) [https] 34: Greece [https]
35: Iceland [https] 36: Indonesia (Jakarta) [https]
37: Ireland [https] 38: Italy (Padua) [https]
39: Japan (Tokyo) [https] 40: Japan (Yonezawa) [https]
41: Malaysia [https] 42: Mexico (Mexico City) [https]
43: Norway [https] 44: Philippines [https]
45: Serbia [https] 46: Spain (A Coruña) [https]
47: Spain (Madrid) [https] 48: Sweden [https]
49: Switzerland [https] 50: Turkey (Denizli) [https]
51: Turkey (Mersin) [https] 52: UK (Bristol) [https]
53: UK (Cambridge) [https] 54: UK (London 1) [https]
55: USA (CA 1) [https] 56: USA (IA) [https]
57: USA (KS) [https] 58: USA (MI 1) [https]
59: USA (NY) [https] 60: USA (OR) [https]
61: USA (TN) [https] 62: USA (TX 1) [https]
63: Vietnam [https] 64: (other mirrors)
Selection: 17
Warning: 无法在貯藏處https://mirror.lzu.edu.cn/CRAN/src/contrib中读写索引:
无法打开URL'https://mirror.lzu.edu.cn/CRAN/src/contrib/PACKAGES'
Warning messages:
1: In download.file(url, destfile = f, quiet = TRUE) :
URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'
2: package ‘dplyr’ is not available (for R version 3.5.1)
我看到好几个回答都是选了兰州的镜像才这样的,看到有回答说换上海的可以
> chooseCRANmirror()
Warning: failed to download mirrors file (无法打开URL'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/usr/local/lib64/R/doc/CRAN_mirrors.csv'
Secure CRAN mirrors
1: 0-Cloud [https] 2: Algeria [https]
3: Australia (Canberra) [https] 4: Australia (Melbourne 1) [https]
5: Australia (Melbourne 2) [https] 6: Australia (Perth) [https]
7: Austria [https] 8: Belgium (Ghent) [https]
9: Brazil (PR) [https] 10: Brazil (RJ) [https]
11: Brazil (SP 1) [https] 12: Brazil (SP 2) [https]
13: Bulgaria [https] 14: Chile 1 [https]
15: Chile 2 [https] 16: China (Guangzhou) [https]
17: China (Lanzhou) [https] 18: China (Shanghai) [https]
19: Colombia (Cali) [https] 20: Czech Republic [https]
21: Denmark [https] 22: East Asia [https]
23: Ecuador (Cuenca) [https] 24: Ecuador (Quito) [https]
25: Estonia [https] 26: France (Lyon 1) [https]
27: France (Lyon 2) [https] 28: France (Marseille) [https]
29: France (Montpellier) [https] 30: France (Paris 2) [https]
31: Germany (Erlangen) [https] 32: Germany (Göttingen) [https]
33: Germany (Münster) [https] 34: Greece [https]
35: Iceland [https] 36: Indonesia (Jakarta) [https]
37: Ireland [https] 38: Italy (Padua) [https]
39: Japan (Tokyo) [https] 40: Japan (Yonezawa) [https]
41: Malaysia [https] 42: Mexico (Mexico City) [https]
43: Norway [https] 44: Philippines [https]
45: Serbia [https] 46: Spain (A Coruña) [https]
47: Spain (Madrid) [https] 48: Sweden [https]
49: Switzerland [https] 50: Turkey (Denizli) [https]
51: Turkey (Mersin) [https] 52: UK (Bristol) [https]
53: UK (Cambridge) [https] 54: UK (London 1) [https]
55: USA (CA 1) [https] 56: USA (IA) [https]
57: USA (KS) [https] 58: USA (MI 1) [https]
59: USA (NY) [https] 60: USA (OR) [https]
61: USA (TN) [https] 62: USA (TX 1) [https]
63: Vietnam [https] 64: (other mirrors)
Selection: 18
Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'
换了上海的也没用
image.png
换美国的也没用
image.png
> install.packages('dplyr',dependencies=TRUE,repos='http://cran.rstudio.com/')
错误: 句法分析器1行里不能有多字节字符
image.png
处处是坑,我决定换一个思路,不调包能不能直接用,或者直接用linux去做,忽然发现全部搞错了,那个不是真正的近交系数,还需要自己去算一算,利用maf和het,还要解决镜像的问题