A Corteva data scientist's views and recommendations on applying genomic information to plant breeding

2023-10-12  生物信息与育种

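This post is compiled from a talk given a few years ago by Dr. Alencar Xavier, a quantitative geneticist at Corteva Agriscience. His work in statistical genetics centers on genome-assisted breeding, with a focus on the theoretical and computational aspects of data-driven plant breeding, such as modeling, prediction, and selection using multiple sources of information. His research covers the development and implementation of new quantitative-genetics methods that use mixed models, Bayesian methods, machine learning, and high-performance computing. For more on his background and work, see: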

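Corteva, formed a few years ago through the merger and restructuring of DuPont Pioneer and Dow AgroSciences, is now one of the world's two largest seed companies and needs little introduction for anyone working in breeding. Let's see how a scientist at a front-line company approaches breeding.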

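Introduction: opportunities and challenges of using genomic information in plant breeding

Data volumes are surging while sequencing costs keep falling.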


Breeding pipeline


Modeling genetic merit

Single-step modeling in animal GS versus single-stage modeling in plant GS

Applications

To avoid translation errors, the original English text is reproduced here.


Germplasm classification (PCA, Clustering, Unsupervised ML, FST)

Characterization

Characterize diversity using unsupervised learning methods.
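
As one illustration of the unsupervised step, here is a minimal PCA sketch on a marker matrix. It assumes genotypes are coded 0/1/2 (individuals in rows, markers in columns); the function name `marker_pca` and the simulated two-group data are hypothetical and only serve to show the mechanics.

```python
import numpy as np

def marker_pca(genotypes, n_components=2):
    """PCA of an individuals x markers matrix coded 0/1/2 (assumed coding)."""
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)                      # center each marker
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)             # share of variance per component
    return scores, explained[:n_components]

# Synthetic data: two groups of 30 lines with different allele frequencies.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.05, 0.95, size=200)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.3, size=200), 0.05, 0.95)
geno = np.vstack([rng.binomial(2, freq_a, size=(30, 200)),
                  rng.binomial(2, freq_b, size=(30, 200))])

scores, explained = marker_pca(geno)
```

Plotting the first two columns of `scores` is the usual way to look for group structure; clustering can then be run on the same scores.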

Heterotic group

Classify (if known) or infer (if unknown) heterotic groups on individuals and populations.

Signatures of selection

Use FST (or related methods) to identify signatures of selection, adaptation and domestication.
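
For readers who want to see what a per-marker FST scan involves, the sketch below computes Wright's FST as (HT - HS) / HT for two populations. It assumes 0/1/2 genotype coding and weights both populations equally without sample-size correction, which is a simplification; estimators such as Weir and Cockerham's handle unequal and small samples. The function name `fst_per_marker` is hypothetical.

```python
import numpy as np

def fst_per_marker(geno_a, geno_b):
    """Wright's FST per marker for two populations (equal weights, no correction)."""
    p_a = np.asarray(geno_a, dtype=float).mean(axis=0) / 2.0
    p_b = np.asarray(geno_b, dtype=float).mean(axis=0) / 2.0
    p_bar = (p_a + p_b) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                             # total heterozygosity
    h_s = (2.0 * p_a * (1.0 - p_a) + 2.0 * p_b * (1.0 - p_b)) / 2.0  # mean within-pop
    with np.errstate(divide="ignore", invalid="ignore"):
        # Markers monomorphic across both populations are undefined (NaN).
        return np.where(h_t > 0, (h_t - h_s) / h_t, np.nan)

# Marker 0: same frequency in both populations; marker 1: fixed difference.
pop_a = np.array([[0, 0], [2, 0]])
pop_b = np.array([[2, 2], [0, 2]])
fst = fst_per_marker(pop_a, pop_b)
```

Markers with unusually high values along the genome are the candidates for signatures of selection.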

Incorporation (GWAS, haplotype analysis)

Trait discovery

Finding new QTLs via association analysis on breeding data and designed populations.
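
The simplest form of association analysis is a single-marker regression scan, sketched below. This is an assumption-laden toy: it ignores population structure and kinship, which real analyses on breeding data usually account for with a mixed model. The function name `single_marker_scan` and the simulated QTL at marker 7 are hypothetical.

```python
import numpy as np

def single_marker_scan(genotypes, phenotype):
    """Regress the phenotype on each marker separately; return slopes and t statistics."""
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    sxx = (Xc**2).sum(axis=0)
    sxy = Xc.T @ yc
    with np.errstate(divide="ignore", invalid="ignore"):
        beta = sxy / sxx                          # NaN for monomorphic markers
        rss = (yc**2).sum() - beta * sxy          # residual sum of squares
        t_stat = beta / np.sqrt(rss / (n - 2) / sxx)
    return beta, t_stat

# Synthetic data: 200 individuals, 50 markers, one simulated QTL at marker 7.
rng = np.random.default_rng(2)
geno = rng.binomial(2, 0.5, size=(200, 50)).astype(float)
pheno = 1.5 * geno[:, 7] + rng.normal(0.0, 1.0, size=200)

beta, t_stat = single_marker_scan(geno, pheno)
top_marker = int(np.nanargmax(np.abs(t_stat)))
```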

Introduction of diversity

Screening non-elite (or elite from elsewhere) germplasm for pre-breeding.

Haplotype enrichment

Assess genome of non-elite material to add diversity to regions where elite germplasm is fixed.

Genomic selection (BayesABC, Supervised ML, etc.)

F2 enrichment (WF)

Entire population is genotyped with a few markers and selected for specific QTLs (e.g. disease resistance).

Pre-selection (WF/AF)

Entire population is genotyped and 0% is phenotyped. Selection is based on the genomic merit
estimated from a predefined estimation set that is either made by design or built from breeding data.

Test-and-shelf (WF/AF)

Entire population is genotyped and X% is phenotyped. Within-season selection is based on the
genomic merit estimated with a genomic model from phenotyped individuals.
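
The test-and-shelf idea (train on the phenotyped fraction, predict the rest) can be sketched with ridge regression on markers. The slide lists BayesABC and supervised ML as the model families; ridge regression is swapped in here only because it fits in a few lines. The penalty `lam=10.0`, the 30% phenotyped fraction, and the function names are hypothetical choices, and all data are simulated.

```python
import numpy as np

def fit_ridge(Z_train, y_train, lam):
    """Ridge regression of phenotypes on centered marker scores."""
    mu = y_train.mean()
    means = Z_train.mean(axis=0)
    Zc = Z_train - means
    A = Zc.T @ Zc + lam * np.eye(Zc.shape[1])
    effects = np.linalg.solve(A, Zc.T @ (y_train - mu))
    return mu, means, effects

def predict_merit(Z, mu, means, effects):
    """Genomic merit of (possibly unphenotyped) individuals."""
    return mu + (Z - means) @ effects

# Simulated population: everyone genotyped, 30% phenotyped.
rng = np.random.default_rng(1)
n, m = 300, 100
Z = rng.binomial(2, 0.5, size=(n, m)).astype(float)
true_effects = rng.normal(0.0, 0.3, size=m)
genetic = Z @ true_effects
pheno = genetic + rng.normal(0.0, 1.0, size=n)

phenotyped = np.zeros(n, dtype=bool)
phenotyped[rng.choice(n, size=90, replace=False)] = True

mu, means, effects = fit_ridge(Z[phenotyped], pheno[phenotyped], lam=10.0)
merit = predict_merit(Z[~phenotyped], mu, means, effects)
accuracy = np.corrcoef(merit, genetic[~phenotyped])[0, 1]
```

Setting the phenotyped fraction to zero and fitting on an external estimation set gives the pre-selection scenario; phenotyping everyone gives advancement.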

Advancement (WF/AF)

Entire population is genotyped and phenotyped. Selection is based on the genetic merit of the
individuals using one or more seasons of data from those individuals.

Product placement (AF)

Similar to advancement but GxE takes the spotlight from G.

Recycling (Simulation and optimization)

Selection of parents

Selection of high BV individuals with complementary polygene or traits.

Select combinations

Given a set of candidate parents (100% genotyped), combinations are chosen based on clustering,
simulated crosses, or a predefined criterion (OHV or OPV).
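
To make the "predefined criterion" idea concrete, below is a simplified OHV-like score for pairs of inbred parents: for each genome segment, take the better of the two parental haplotype values, sum over segments, and double. This is an adaptation made for illustration, not the exact criterion used in the talk. It assumes fully inbred lines coded 0/1 per marker, known marker effects, and equal-sized segments; `segment_values` and `rank_crosses` are hypothetical names.

```python
from itertools import combinations
import numpy as np

def segment_values(haplotypes, effects, n_segments):
    """Value of each line's haplotype in each segment (lines x segments)."""
    seg_idx = np.array_split(np.arange(haplotypes.shape[1]), n_segments)
    return np.column_stack([haplotypes[:, idx] @ effects[idx] for idx in seg_idx])

def rank_crosses(haplotypes, effects, n_segments):
    """Rank parent pairs by the best segment-wise combination they could produce."""
    seg = segment_values(haplotypes, effects, n_segments)
    scores = {}
    for i, j in combinations(range(seg.shape[0]), 2):
        scores[(i, j)] = 2.0 * float(np.maximum(seg[i], seg[j]).sum())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Three inbred lines with equal total merit but different favorable segments.
haps = np.array([[1, 1, 0, 0],
                 [0, 0, 1, 1],
                 [1, 0, 1, 0]], dtype=float)
marker_effects = np.ones(4)
ranking = rank_crosses(haps, marker_effects, n_segments=2)
```

In this toy case all three lines have the same breeding value, yet the pair (0, 1) ranks first because their favorable segments are complementary, which is the point of combination-based criteria.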

Quantitative assessment (Variance component analysis)

Heritability

Narrow-sense and GxE (e.g. compound symmetry)
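
For reference, the standard textbook expressions behind this line can be written as follows. The symbols are assumptions introduced here, not taken from the slides: $\sigma^2_a$ is additive variance, $\sigma^2_{a\times e}$ is additive-by-environment variance, $\sigma^2_\varepsilon$ is residual variance, $e$ is the number of environments and $r$ the number of replicates per environment.

```latex
% Narrow-sense heritability on a single-plot basis
h^2 = \frac{\sigma^2_a}{\sigma^2_a + \sigma^2_{a\times e} + \sigma^2_\varepsilon}

% Narrow-sense heritability on an entry-mean basis
h^2_{\text{mean}} = \frac{\sigma^2_a}
    {\sigma^2_a + \dfrac{\sigma^2_{a\times e}}{e} + \dfrac{\sigma^2_\varepsilon}{e\,r}}

% Compound symmetry: same genetic variance in every environment and
% the same covariance between any pair of environments j \neq j'
\operatorname{Var}(g_{ij}) = \sigma^2_a + \sigma^2_{a\times e}, \qquad
\operatorname{Cov}(g_{ij}, g_{ij'}) = \sigma^2_a, \qquad
\rho_{jj'} = \frac{\sigma^2_a}{\sigma^2_a + \sigma^2_{a\times e}}
```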

Genetic variance decomposition

Classic (Vg = Va + Vd + Vi) and hybrid (Vg = VGCA1 + VGCA2 + VSCA)

Genetic correlations

Across traits or within-trait across environments

Effective population size

Eigen analysis of the G matrix
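
The slide does not say which eigenvalue-based estimator is meant. As a hedged sketch of the general idea, the code below builds a VanRaden-style genomic relationship matrix and counts how many of its largest eigenvalues are needed to explain a given fraction of the variance, a dimensionality measure that the literature relates to effective population size. The 98% threshold, the 0/1/2 coding, and the function names are assumptions.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix G = WW' / (2 * sum p(1-p)), genotypes coded 0/1/2."""
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0
    W = M - 2.0 * p
    return W @ W.T / (2.0 * np.sum(p * (1.0 - p)))

def n_eigen_for_fraction(G, fraction=0.98):
    """Number of largest eigenvalues of G needed to reach `fraction` of the total."""
    eig = np.linalg.eigvalsh(G)[::-1]            # descending order
    eig = np.clip(eig, 0.0, None)                # drop tiny negative round-off
    cum = np.cumsum(eig) / eig.sum()
    return int(min(np.searchsorted(cum, fraction) + 1, len(eig)))

# Toy check: two genotypes, each duplicated, span a single dimension.
toy = np.array([[0, 2, 0, 2],
                [0, 2, 0, 2],
                [2, 0, 2, 0],
                [2, 0, 2, 0]])
G_toy = vanraden_g(toy)
k_toy = n_eigen_for_fraction(G_toy)

# Simulated unrelated individuals need many more dimensions.
rng = np.random.default_rng(3)
G_sim = vanraden_g(rng.binomial(2, 0.5, size=(40, 500)))
k_sim = n_eigen_for_fraction(G_sim)
```

Turning such a count into an actual Ne value requires further assumptions (for example about genome length) that are not covered here.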

Genetic progress and rate of genetic gains

Assess multiple years

Evaluate breeding strategies

Simulations and retrospective studies to ask what-if questions

Challenges

Key challenges

Iteratively adjusting the breeding design

For breeders

Summary

References on optimizing breeding programs:

Rincent et al. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals. Genetics, 192(2), 715-728.

Isidro et al. (2015). Training set optimization under population structure in genomic selection. TAG 128(1), 145-158.

Habier (2016). Improved molecular breeding methods. US20160321396A1.

Ou and Liao (2019). Training set determination for genomic selection. TAG 132(10), 2781-2792.

Brauner et al. (2019). Genomic prediction with multiple biparental families. TAG
