2021-04-05 近交系数的摸索

2021-04-06  本文已影响0人  L6511

计算我第二次过滤后的数据中的近交系数

[lyc@200server ~]$ ./plink --bfile clean2 --allow-extra-chr --het
Calculating allele frequencies... done.
Total genotyping rate is 0.977479.
3010393 variants and 141 people pass filters and QC.
Note: No phenotypes present.
--het: 3009972 variants scanned, report written to plink.het .
[lyc@200server ~]$ head -n 5 plink.het
     FID      IID       O(HOM)       E(HOM)        N(NM)            F
  PB-362   PB-362      2513372    2.026e+06      2973035       0.5144
  PB-363   PB-363      2508234    2.012e+06      2952631       0.5274
  PB-364   PB-364      2493856    2.019e+06      2962142       0.5036
  PB-365   PB-365      2477780    2.005e+06      2942252       0.5044

我有种不详的预感,0.8以下去掉后会没剩多少
又到了官网的解释

Inbreeding

--het ['small-sample'] ['gz']
--ibc
--het computes observed and expected autosomal homozygous genotype counts for each sample, and reports method-of-moments F coefficient estimates (i.e. (<observed hom. count> - <expected count>) / (<total observations> - <expected count>)) to plink.het. (The 'gz' modifier has the usual effect.) Expected counts are based on loaded (via --read-freq) or imputed MAFs; if there are very few samples in your immediate fileset, --read-freq is practically mandatory since imputed MAFs are wildly inaccurate in that case.

By default, the n/(n-1) multiplier in Nei's expected homozygosity formula is now omitted, since n may be unknown when using --read-freq. The 'small-sample' modifier causes the multiplier to be included, while forcing --het to use imputed MAFs (and known ns) from founders in the immediate dataset. (--maf-succ is not applied here.)

--ibc (ported from GCTA) calculates three inbreeding coefficients for each sample, and writes a report to plink.ibc. Briefly, Fhat1 is the usual variance-standardized relationship minus 1, Fhat2 is approximately equal to the --het estimate, and Fhat3 is based on the correlation between uniting gametes.

These calculations do not take LD into account. It is usually a good idea to perform some form of LD-based pruning before invoking them.

讲道理还是没告诉我怎么删除那些小于阈值的数据
看样子必须去写R的脚本了
然后我就在linux中的R安装叫dplyr的包,试图设定一个阈值去删除那些行数,然后只是安装这个包就陷入了深渊

> install.packages("dplyr")
Warning in install.packages("dplyr") :
  'lib="/usr/local/lib64/R/library"'不可写
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
‘~/R/x86_64-pc-linux-gnu-library/3.5’
to install packages into? (yes/No/cancel) yes
--- 在此連線階段时请选用CRAN的鏡子 ---
Warning: failed to download mirrors file (无法打开URL'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/usr/local/lib64/R/doc/CRAN_mirrors.csv'
Secure CRAN mirrors 

 1: 0-Cloud [https]                   2: Algeria [https]                
 3: Australia (Canberra) [https]      4: Australia (Melbourne 1) [https]
 5: Australia (Melbourne 2) [https]   6: Australia (Perth) [https]      
 7: Austria [https]                   8: Belgium (Ghent) [https]        
 9: Brazil (PR) [https]              10: Brazil (RJ) [https]            
11: Brazil (SP 1) [https]            12: Brazil (SP 2) [https]          
13: Bulgaria [https]                 14: Chile 1 [https]                
15: Chile 2 [https]                  16: China (Guangzhou) [https]      
17: China (Lanzhou) [https]          18: China (Shanghai) [https]       
19: Colombia (Cali) [https]          20: Czech Republic [https]         
21: Denmark [https]                  22: East Asia [https]              
23: Ecuador (Cuenca) [https]         24: Ecuador (Quito) [https]        
25: Estonia [https]                  26: France (Lyon 1) [https]        
27: France (Lyon 2) [https]          28: France (Marseille) [https]     
29: France (Montpellier) [https]     30: France (Paris 2) [https]       
31: Germany (Erlangen) [https]       32: Germany (Göttingen) [https]    
33: Germany (Münster) [https]       34: Greece [https]                 
35: Iceland [https]                  36: Indonesia (Jakarta) [https]    
37: Ireland [https]                  38: Italy (Padua) [https]          
39: Japan (Tokyo) [https]            40: Japan (Yonezawa) [https]       
41: Malaysia [https]                 42: Mexico (Mexico City) [https]   
43: Norway [https]                   44: Philippines [https]            
45: Serbia [https]                   46: Spain (A Coruña) [https]       
47: Spain (Madrid) [https]           48: Sweden [https]                 
49: Switzerland [https]              50: Turkey (Denizli) [https]       
51: Turkey (Mersin) [https]          52: UK (Bristol) [https]           
53: UK (Cambridge) [https]           54: UK (London 1) [https]          
55: USA (CA 1) [https]               56: USA (IA) [https]               
57: USA (KS) [https]                 58: USA (MI 1) [https]             
59: USA (NY) [https]                 60: USA (OR) [https]               
61: USA (TN) [https]                 62: USA (TX 1) [https]             
63: Vietnam [https]                  64: (other mirrors)                


Selection: 17
Warning: 无法在貯藏處https://mirror.lzu.edu.cn/CRAN/src/contrib中读写索引:
  无法打开URL'https://mirror.lzu.edu.cn/CRAN/src/contrib/PACKAGES'
Warning messages:
1: In download.file(url, destfile = f, quiet = TRUE) :
  URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'
2: package ‘dplyr’ is not available (for R version 3.5.1) 

我看到好几个回答都是选了兰州的镜像才这样的,看到有回答说换上海的可以

> chooseCRANmirror()
Warning: failed to download mirrors file (无法打开URL'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/usr/local/lib64/R/doc/CRAN_mirrors.csv'
Secure CRAN mirrors 

 1: 0-Cloud [https]                   2: Algeria [https]                
 3: Australia (Canberra) [https]      4: Australia (Melbourne 1) [https]
 5: Australia (Melbourne 2) [https]   6: Australia (Perth) [https]      
 7: Austria [https]                   8: Belgium (Ghent) [https]        
 9: Brazil (PR) [https]              10: Brazil (RJ) [https]            
11: Brazil (SP 1) [https]            12: Brazil (SP 2) [https]          
13: Bulgaria [https]                 14: Chile 1 [https]                
15: Chile 2 [https]                  16: China (Guangzhou) [https]      
17: China (Lanzhou) [https]          18: China (Shanghai) [https]       
19: Colombia (Cali) [https]          20: Czech Republic [https]         
21: Denmark [https]                  22: East Asia [https]              
23: Ecuador (Cuenca) [https]         24: Ecuador (Quito) [https]        
25: Estonia [https]                  26: France (Lyon 1) [https]        
27: France (Lyon 2) [https]          28: France (Marseille) [https]     
29: France (Montpellier) [https]     30: France (Paris 2) [https]       
31: Germany (Erlangen) [https]       32: Germany (Göttingen) [https]    
33: Germany (Münster) [https]       34: Greece [https]                 
35: Iceland [https]                  36: Indonesia (Jakarta) [https]    
37: Ireland [https]                  38: Italy (Padua) [https]          
39: Japan (Tokyo) [https]            40: Japan (Yonezawa) [https]       
41: Malaysia [https]                 42: Mexico (Mexico City) [https]   
43: Norway [https]                   44: Philippines [https]            
45: Serbia [https]                   46: Spain (A Coruña) [https]       
47: Spain (Madrid) [https]           48: Sweden [https]                 
49: Switzerland [https]              50: Turkey (Denizli) [https]       
51: Turkey (Mersin) [https]          52: UK (Bristol) [https]           
53: UK (Cambridge) [https]           54: UK (London 1) [https]          
55: USA (CA 1) [https]               56: USA (IA) [https]               
57: USA (KS) [https]                 58: USA (MI 1) [https]             
59: USA (NY) [https]                 60: USA (OR) [https]               
61: USA (TN) [https]                 62: USA (TX 1) [https]             
63: Vietnam [https]                  64: (other mirrors)                


Selection: 18
Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
  URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'

换了上海的也没用


image.png

换美国的也没用


image.png
> install.packages('dplyr',dependencies=TRUE,repos='http://cran.rstudio.com/')             
错误: 句法分析器1行里不能有多字节字符
image.png

处处是坑,我决定换一个思路,不调包能不能直接用,或者直接用linux去做,忽然发现全部搞错了,那个不是真正的近交系数,还需要自己去算一算,利用maf和het,还要解决镜像的问题

上一篇 下一篇

猜你喜欢

热点阅读