Hands-on | RRBLUP with the rrBLUP package

Date: 2021-6-19  Author: qvyue

Data processing

Converting VCF to the rrBLUP {-1,0,1} format

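rrBLUP reads genotypes coded as {-1,0,1} (in the layout used here, row headers are markers and columns are samples), so the raw data has to be converted first. Note that `A.mat()` and the `Z` argument of `mixed.solve()` expect the transposed layout, with one row per individual and one column per marker.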

When computing the G matrix, genotypes can be coded in different ways:

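  • 0, 1, 2: 0 is AA (homozygous for the major allele), 1 is heterozygous, and 2 is aa (homozygous for the minor allele).
  • -1, 0, 1: -1 is AA (homozygous for the major allele), 0 is heterozygous, and 1 is aa (homozygous for the minor allele).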
## Generate the {0,1,2} matrix with vcftools
vcftools --vcf test.genotypes_no_missing_IDs.vcf --012 --out snp_matrix 
  • --012
    This option outputs the genotypes as a large matrix. Three files are produced. The first, with suffix ".012", contains the genotypes of each individual on a separate line. Genotypes are represented as 0, 1 and 2, where the number represents the number of non-reference alleles. Missing genotypes are represented by -1. The second file, with suffix ".012.indv", details the individuals included in the main file. The third file, with suffix ".012.pos", details the site locations included in the main file.
##R    
data snp.txt
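
The conversion from the vcftools output to the {-1,0,1} coding can be sketched in base R as follows. This is a minimal sketch, not the author's original code: the helper names `recode_012()` and `read_012()` are made up for illustration, and it assumes the `.012` file carries the individual index in its first column. Because vcftools counts non-reference alleles, -1 here means homozygous reference, which is not necessarily the major allele.

```r
# Recode a vcftools --012 matrix (0/1/2, missing = -1) to the {-1,0,1} coding.
# recode_012() is a hypothetical helper, not part of rrBLUP or vcftools.
recode_012 <- function(geno012) {
  geno <- as.matrix(geno012)
  storage.mode(geno) <- "numeric"
  geno[geno == -1] <- NA   # vcftools marks missing genotypes with -1
  geno - 1                 # 0/1/2 -> -1/0/1
}

# Read the three files written by `vcftools --012 --out <prefix>`.
# Assumption: the first column of the .012 file is the individual index.
read_012 <- function(prefix) {
  geno <- read.table(paste0(prefix, ".012"), header = FALSE, row.names = 1)
  indv <- readLines(paste0(prefix, ".012.indv"))
  pos  <- read.table(paste0(prefix, ".012.pos"), header = FALSE)
  geno <- recode_012(geno)
  # Rows are individuals, columns are markers named "<chrom>_<position>"
  dimnames(geno) <- list(indv, paste(pos[[1]], pos[[2]], sep = "_"))
  geno
}

# Tiny in-memory example: 2 individuals x 3 markers
toy <- matrix(c(0,  1, 2,
                2, -1, 0), nrow = 2, byrow = TRUE)
recoded <- recode_012(toy)
print(recoded)
```

With the command above, `read_012("snp_matrix")` would return an individuals-by-markers matrix; use `t()` on the result if the markers-in-rows layout is needed.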

File input

Example files:
traits.txt: https://pbgworks.org/sites/pbgworks.org/files/traits.txt
snp.txt: https://pbgworks.org/sites/pbgworks.org/files/snp.txt

Pheno 
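
Loading the two files might look like the sketch below. The read step did not survive in the text above, so this is an assumption rather than the author's code: it supposes both files are whitespace-delimited, that `traits.txt` has a header line and `snp.txt` does not, and that both have one row per individual. `read_inputs()` is a hypothetical helper, and the toy files only stand in for the real downloads.

```r
# Hypothetical helper: read a marker file and a phenotype file into matrices.
# Assumptions: whitespace-delimited, header only in the trait file,
# one row per individual in both files.
read_inputs <- function(snp_file, trait_file) {
  Markers <- as.matrix(read.table(snp_file, header = FALSE))
  Pheno   <- as.matrix(read.table(trait_file, header = TRUE))
  stopifnot(nrow(Markers) == nrow(Pheno))  # same individuals in both files
  list(Markers = Markers, Pheno = Pheno)
}

# Toy files standing in for snp.txt and traits.txt
snp_file   <- tempfile(fileext = ".txt")
trait_file <- tempfile(fileext = ".txt")
writeLines(c("-1 0 1", "1 1 -1", "0 NA 1"), snp_file)
writeLines(c("yield height", "5.1 80", "4.7 76", "6.0 90"), trait_file)

inputs <- read_inputs(snp_file, trait_file)
dim(inputs$Markers)
dim(inputs$Pheno)
```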

Data filtering and imputation

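library(rrBLUP)
# Drop markers with more than 50% missing calls, then fill the remaining gaps with the marker mean
impute = A.mat(Markers, max.missing=0.5, impute.method="mean", return.imputed=TRUE)
Markers_impute2 = impute$imputed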
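
To make the effect of `max.missing` and `impute.method="mean"` concrete, here is a base-R sketch of the same two steps on a toy matrix. It illustrates the idea only and is not rrBLUP's internal code; `filter_and_impute()` is a hypothetical helper, and keeping markers whose missing fraction equals the threshold exactly is an assumption.

```r
# Hypothetical helper: drop markers with too many missing calls,
# then replace each remaining NA with the mean of its marker (column).
filter_and_impute <- function(M, max_missing = 0.5) {
  miss_frac <- colMeans(is.na(M))                 # missing fraction per marker
  kept <- M[, miss_frac <= max_missing, drop = FALSE]
  col_means <- colMeans(kept, na.rm = TRUE)
  idx <- which(is.na(kept), arr.ind = TRUE)       # (row, col) of each NA
  kept[idx] <- col_means[idx[, "col"]]
  kept
}

# 4 individuals x 3 markers; marker 2 is 75% missing and gets dropped
toy <- matrix(c(-1, NA,  1,
                 1, NA,  0,
                NA,  1,  1,
                 1, NA, -1), nrow = 4, byrow = TRUE)
imputed <- filter_and_impute(toy)
print(imputed)
```

In the toy data, the single missing call in marker 1 is replaced by the mean of its three observed values, so imputed genotypes are generally not integers.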

Simple cross-validation

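traits=1      # number of traits to evaluate
cycles=300    # number of random train/test splits
accuracy = matrix(nrow=cycles, ncol=traits)   # one row per split, one column per trait
for(r in 1:cycles){
  train= as.matrix(sample(1:207, 180))   # draw 180 of the 207 row indices as the training set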
  test
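
The loop above is cut off in the source. As a self-contained illustration of the same idea, the sketch below runs a random-split cross-validation with `mixed.solve()` on simulated data. It is not the author's original code: the toy sizes, the simulated marker effects, and all variable names ending in `_toy` are assumptions made for the example.

```r
library(rrBLUP)

set.seed(1)
n <- 60; m <- 200                       # toy sizes, not the 207 lines used above
Markers_toy <- matrix(sample(c(-1, 0, 1), n * m, replace = TRUE), nrow = n)
effects     <- rnorm(m, sd = 0.1)       # simulated marker effects
Pheno_toy   <- matrix(Markers_toy %*% effects + rnorm(n), ncol = 1)

traits  <- 1
cycles  <- 20
n_train <- 48
accuracy <- matrix(nrow = cycles, ncol = traits)

for (r in 1:cycles) {
  train <- sample(1:n, n_train)         # random training individuals
  test  <- setdiff(1:n, train)          # the rest form the validation set

  # RR-BLUP: markers enter as random effects through Z
  fit <- mixed.solve(Pheno_toy[train, 1], Z = Markers_toy[train, ])

  # Predicted phenotype = intercept + sum of marker effects
  pred <- Markers_toy[test, ] %*% fit$u + as.numeric(fit$beta)

  # Accuracy = correlation between predicted and observed values
  accuracy[r, 1] <- as.numeric(cor(pred, Pheno_toy[test, 1], use = "complete.obs"))
}
mean(accuracy)
```

Averaging over many random splits smooths out the variation that comes from any single partition of a small population.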
Automated multi-trait computation
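
The body of this section is missing from the source. One way the calculation could be automated over several traits is to loop over the columns of the phenotype matrix, as in the sketch below. This is an illustration under assumptions, not the author's method: `cv_all_traits()` is a hypothetical helper, and the data are simulated.

```r
library(rrBLUP)

# Hypothetical helper: mean prediction accuracy per trait over random splits.
# Pheno: individuals x traits matrix; Markers: individuals x markers, coded {-1,0,1}.
cv_all_traits <- function(Pheno, Markers, cycles = 10,
                          n_train = floor(0.8 * nrow(Pheno))) {
  n <- nrow(Pheno)
  accuracy <- matrix(nrow = cycles, ncol = ncol(Pheno),
                     dimnames = list(NULL, colnames(Pheno)))
  for (r in 1:cycles) {
    train <- sample(1:n, n_train)
    test  <- setdiff(1:n, train)
    for (j in 1:ncol(Pheno)) {          # one RR-BLUP fit per trait
      fit  <- mixed.solve(Pheno[train, j], Z = Markers[train, ])
      pred <- Markers[test, ] %*% fit$u + as.numeric(fit$beta)
      accuracy[r, j] <- as.numeric(cor(pred, Pheno[test, j], use = "complete.obs"))
    }
  }
  colMeans(accuracy)                    # average accuracy per trait
}

# Simulated example with two traits
set.seed(2)
n <- 60; m <- 150
Markers_toy <- matrix(sample(c(-1, 0, 1), n * m, replace = TRUE), nrow = n)
Pheno_toy <- cbind(
  yield  = as.numeric(Markers_toy %*% rnorm(m, sd = 0.1) + rnorm(n)),
  height = as.numeric(Markers_toy %*% rnorm(m, sd = 0.1) + rnorm(n))
)
result <- cv_all_traits(Pheno_toy, Markers_toy, cycles = 5)
print(result)
```

Using the same train/test split for every trait within a cycle keeps the per-trait accuracies comparable.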

References:

Introduction to Genomic Selection in R using the rrBLUP Package
[GS Column] 8: Genomic selection in practice with RRBLUP
