Literature Reading 2.5: Epistatic selection on a selfish Segregation Distorter supergene – drive, recombination, and…

2022-07-30  龙star180

Journal

eLife 8.713/Q1

Epistatic selection on a selfish Segregation Distorter supergene – drive, recombination, and genetic load


Abstract

Meiotic drive supergenes are complexes of alleles at linked loci that together subvert Mendelian segregation resulting in preferential transmission. In males, the most common mechanism of drive involves the disruption of sperm bearing one of a pair of alternative alleles. While at least two loci are important for male drive—the driver and the target—linked modifiers can enhance drive, creating selection pressure to suppress recombination. In this work, we investigate the evolution and genomic consequences of an autosomal, multilocus, male meiotic drive system, Segregation Distorter (SD) in the fruit fly, Drosophila melanogaster. In African populations, the predominant SD chromosome variant, SD-Mal, is characterized by two overlapping, paracentric inversions on chromosome arm 2R and nearly perfect (~100%) transmission. We study the SD-Mal system in detail, exploring its components, chromosomal structure, and evolutionary history. Our findings reveal a recent chromosome-scale selective sweep mediated by strong epistatic selection for haplotypes carrying Sd, the main driving allele, and one or more factors within the double inversion. While most SD-Mal chromosomes are homozygous lethal, SD-Mal haplotypes can recombine with other, complementing haplotypes via crossing over, and with wildtype chromosomes via gene conversion. SD-Mal chromosomes have nevertheless accumulated lethal mutations, excess non-synonymous mutations, and excess transposable element insertions. Therefore, SD-Mal haplotypes evolve as a small, semi-isolated subpopulation with a history of strong selection. These results may explain the evolutionary turnover of SD haplotypes in different populations around the world and have implications for supergene evolution broadly.


Editor's evaluation

The work advances our understanding of the Segregation Distorter (SD) complex in Drosophila melanogaster. SD, the classic example of a selfish chromosome, consists of two tightly linked genetic elements and thus qualifies as a supergene. The work also excels through particularly careful analyses.


Introduction

Supergenes are clusters of linked loci that control variation in complex phenotypes. Some supergenes mediate adaptive polymorphisms that are maintained by some form of frequency- or density-dependent natural selection, as in, for example, mimicry in butterflies, self-incompatibility in plants, plumage polymorphisms in birds, and heteromorphic sex chromosomes. Other supergenes are maintained by selfish social behaviors that enhance the fitness of carriers at the expense of non-carriers, as in some ant species. Still other supergenes are maintained by their ability to achieve selfish, better-than-Mendelian transmission during gametogenesis, as in the so-called meiotic drive complexes found in fungi, insects, and mammals.


Meiotic drive complexes gain transmission advantages at the expense of other loci and their hosts. In heterozygous carriers of male drive complexes in animals, the driver disables spermatids that bear drive-sensitive target alleles. To spread in the population, the driver must be linked in a cis-arrangement to a drive-resistant (insensitive) target allele. Recombination between the driver and target can result in a ‘suicide’ haplotype that distorts against itself. These epistatic interactions between driver and target lead to selection for modifiers of recombination that tighten linkage, such as chromosomal inversions. Like most supergenes, meiotic drive complexes originate from two or more loci with some degree of initial linkage. Successful drivers thus tend to be located in regions of low recombination, such as non-recombining sex chromosomes, centromeric regions, or in chromosomal inversions of autosomes.


The short-term benefits of reduced recombination can entail long-term costs. Chromosomal inversions that lock supergene loci together can also incidentally capture linked loci, which causes large chromosomal regions to segregate as blocks. Due to reduced recombination, the efficacy of natural selection in these regions is compromised: deleterious mutations can accumulate, and beneficial ones are more readily lost. Many meiotic drive complexes are thus homozygous lethal or sterile. The degeneration of drive haplotypes is not inevitable, however. Different drive haplotypes that complement one another may be able to recombine, if only among themselves. Gene conversion from wildtype chromosomes may also ameliorate the genetic load of supergenes. Male meiotic drive complexes thus represent a class of selfish supergenes that evolve and persist via the interaction of drive, recombination, and natural selection.


Here, we focus on the evolutionary genetics of Segregation Distorter (SD), a well-known autosomal meiotic drive complex in Drosophila melanogaster. In heterozygous males, SD disables sperm bearing drive-sensitive wildtype chromosomes via a chromatin condensation defect. SD has two main components: the driver, Segregation Distorter (Sd), is a truncated duplication of the gene RanGAP located in chromosome arm 2L; and the target of drive, Responder (Rsp), is a block of satellite DNA in the pericentromeric heterochromatin of 2R. Previous studies of SD chromosomes have detected linked upward modifiers of drive, including Enhancer of SD (E[SD]) on 2L and several others on 2R, but their molecular identities are unknown. Sd-RanGAP and Rsp straddle the centromere, a region of reduced recombination, and some SD chromosomes bear pericentric inversions that presumably further tighten linkage among these loci. In heterozygotes with a pericentric inversion, recombination in the inverted region generates aneuploids and therefore reduced fertility, although this effect might be mitigated by strong suppression of recombination. Many SD chromosomes also bear paracentric inversions on 2R. Although recombination between paracentric inversions and the main components of SD is possible, their strong association implies a role for epistatic selection in the evolution of these supergenes.


While SD is present at low population frequencies (<5%) around the world, Sd-RanGAP appears to have originated in sub-Saharan Africa, the ancestral geographic range of D. melanogaster, survived the out-of-Africa bottleneck, and spread to the rest of the world. Multiple factors likely contribute to the low frequency of SD in populations: negative selection, insensitive Rsp alleles, and unlinked suppressors. Two independent longitudinal studies suggest that SD haplotypes can replace each other in populations over short time scales (<30 years) without major changes in the overall population frequency of SD. The predominant SD variant in Africa is SD-Mal, which recently swept across the entire continent. SD-Mal has a pair of rare, African-endemic, overlapping paracentric inversions spanning ~40% of 2R: In(2R)51B6–11;55E3–12 and In(2R)44F3–12;54E3–10, hereafter collectively referred to as In(2R)Mal. SD-Mal chromosomes are particularly strong drivers, with ~100% transmission. Notably, recombinant chromosomes bearing the Sd-RanGAP duplication from this haplotype but lacking the inversions do not drive, suggesting that In(2R)Mal is essential for SD-Mal drive. We therefore expect strong epistatic selection to enforce the association of Sd-RanGAP and In(2R)Mal. The functional role of In(2R)Mal for drive is still unclear: do these inversions function to suppress recombination between Sd-RanGAP and a major distal enhancer on 2R, or do they contain a major enhancer?


Here, we combine genetic and population genomic approaches to study SD-Mal haplotypes sampled from a single population in Zambia, the putative ancestral range of D. melanogaster. We address four issues. First, we reveal the structural features of the SD-Mal haplotype, including the organization of the insensitive Rsp allele and the In(2R)Mal rearrangements. Second, we characterize the genetic function of In(2R)Mal and its role in drive. Third, we infer the population genetic history of the rapid rise in frequency of SD-Mal in Zambia. And fourth, we explore the evolutionary consequences of reduced recombination on SD-Mal haplotypes. Our results show that SD-Mal experienced a recent chromosome-scale selective sweep mediated by epistatic selection and has, as a consequence of its reduced population recombination rate, accumulated excess non-synonymous mutations and transposable element (TE) insertions. The SD-Mal haplotype is a supergene that evolves as a small, semi-isolated subpopulation in which complementing SD-Mal chromosomes can recombine inter se via crossing over and with wildtype chromosomes via gene conversion. These results have implications for supergene evolution and may explain the enigmatic evolutionary turnover of SD haplotypes in different populations around the world.


Results and discussion

To investigate the evolutionary genomics of SD-Mal, we sequenced haploid embryos from nine driving SD-Mal haplotypes sampled from a single population in Zambia, the putative ancestral range of D. melanogaster. Illumina read depth among samples ranged between ~46 and 67× (Supplementary file 1; BioProject PRJNA649752 in NCBI). Additionally, we obtained ~12× coverage with long-read Nanopore sequencing of one homozygous viable line, SD-ZI125, to create a de novo assembly of a representative SD-Mal haplotype (BioProject PRJNA649752 in NCBI; assembly in Navarro-Dominguez et al., 2022a). We use these data to study the evolution of SD-Mal structure, diversity, and recombination.


Chromosomal features of the SD-Mal supergene

The SD-Mal haplotype has at least three key features: the main drive locus, the Sd-RanGAP duplication on 2L; an insensitive Responder (Rspⁱ) in 2R heterochromatin; and the paracentric In(2R)Mal arrangement on chromosome 2R (Figure 1). We used our long-read and short-read sequence data for SD-ZI125 to confirm the structure of the duplication (Figure 1A) and then validated features in the other SD-Mal haplotypes. All SD-Mal chromosomes have the Sd-RanGAP duplication at the same location as the parent gene on chromosome 2L. The Rsp locus, the target of SD, corresponds to a block of ~120 bp satellite repeats in 2R heterochromatin. The reference genome, Iso-1, has an Rspˢ allele corresponding to a primary Rsp locus containing two blocks of tandem Rsp repeats (Rsp-proximal and Rsp-major) with ~1000 copies of the Rsp satellite repeat interrupted by TEs. A small number of Rsp repeats exist outside of the primary Rsp locus, although they are not known to be targeted by SD. There are three of these additional Rsp loci in Iso-1: ~10 copies in 2R, distal to the major Rsp locus (Rsp-minor); a single copy at the distal end of 2R (60A); and ~12 copies in 3L. The genomes of SD flies carry ~20 copies of Rsp, but the organization of the primary Rsp locus on SD chromosomes is unknown. To characterize the Rsp locus of the SD-Mal haplotype, we mapped SD-Mal reads to an Iso-1 reference genome. As expected, reads from the Iso-1 reference are distributed across the whole Rsp-major region. For SD-Mal chromosomes, however, very few reads map to the Rsp repeats at Rsp-major (Figure 1B). This suggests that all SD-Mal chromosomes have a complete deletion of the primary Rsp locus containing Rsp-proximal and Rsp-major and that the only Rsp copies in the SD-Mal genomes are the minor Rsp loci in chromosomes 2R and 3L.



Figure 1

Figure 1. Map depicting the chromosomal features of the SD-Mal chromosome. The schematic shows the cytogenetic map of chromosomes 2L and 2R (redrawn based on images in Lefevre, 1976) and the major features of the chromosome. (A) Dotplot showing that the Sd locus is a partial duplication of the gene RanGAP (in black), located at band 37D2-6. The gene Hs2st occurs in the first intron of RanGAP, and it is also duplicated in the Sd locus (Hs2st-2). (B) The Rsp-major locus is an array of tandem repeats located in the pericentric heterochromatin (band h39). Read mapping to a reference genome containing 2R pericentric heterochromatin (Iso-1 strain, see Chang and Larracuente, 2019) shows that SD-Mal chromosomes do not have any Rsp repeats in the Rsp-major locus, consistent with being insensitive to distortion by Sd (Rspⁱ) (orange; regions of high relative coverage correspond to interspersed transposable elements), in contrast with Iso-1, which is sensitive (Rspˢ). The tracks below indicate the presence of types of repetitive elements found at this locus. Black lines indicate the presence of a repeat type in the reference genome. Gray shading shows where Rsp repeats are in the reference genome. (C) Two paracentric, overlapping inversions constitute the In(2R)Mal arrangement shown on the schematic of polytene chromosomes: In(2R)51BC;55E (In(2R)Mal-p) in orange brackets and In(2R)44F;54E (In(2R)Mal-d) in red parentheses. Pericentromeric heterochromatin and the centromere are represented by a gray rectangle and black circle, respectively. (D) Our assembly based on long-read sequencing data provides the exact breakpoints of In(2R)Mal and confirms that the distal inversion (Dmel.r6, 2R:14,591,034–18,774,475) occurred first, and the proximal inversion (Dmel.r6, 2R:8,855,601–15,616,195) followed, overlapping ~1 Mb with the distal inversion. The colored rectangles correspond to locally collinear blocks of sequence with the height of lines within the block corresponding to average sequence conservation in the aligned region (Darling et al., 2010). Blocks below the center black line indicate regions that align in the reverse complement orientation. Vertical red lines indicate the end of the assembled chromosomes. Visible marker locations used for generating recombinants (b (34D1), c (52D1), and px (58E4-58E8)) are indicated on the cytogenetic map (Lefevre, 1976).


The complex In(2R)Mal inversion is distal to the Rsp locus on chromosome 2R (Figure 1C). We used our SD-ZI125 assembly to determine the precise breakpoints of these inversions. Relative to the standard D. melanogaster 2R scaffold (BDGP6), SD-ZI125 has three large, rearranged blocks of sequence corresponding to In(2R)Mal (Figure 1C): a 1.03 Mb block collinear with the reference but shifted proximally; a second inverted 5.74 Mb block; and a third inverted 3.16 Mb block. From this organization, we infer that the distal inversion, which we refer to as In(2R)Mal-d, occurred first and spanned 4.18 Mb (approx. 2R:14,591,003–18,774,475). The proximal inversion, which we refer to as In(2R)Mal-p, occurred second and spanned 6.76 Mb, with 1.02 Mb overlapping with the proximal region of In(2R)Mal-d (approx. 2R:8,855,602–17,749,310). Note that any order of rearrangement other than distal first, proximal second leads to a different outcome (Figure 1—figure supplement 2). All four breakpoints of the In(2R)Mal rearrangement involve simple joins of unique sequence. Three of these four breakpoints span genes (Figure 1—figure supplement 3): sns (2R:8,798,489–8,856,091), CG10931 (2R:17,748,935–17,750,136), and Mctp (2R:18,761,758–18,774,824). The CDSs of both sns and Mctp remain intact in the In(2R)Mal arrangement, with the inversion disrupting their 3’ UTRs. Neither of these two genes is expressed in testes (https://flybase.org/reports/FBgn0024189; https://flybase.org/reports/FBgn0034389; Chintapalli et al., 2007; FB2021_06; Larkin et al., 2021), making it unlikely that they affect drive. In(2R)Mal-p disrupts the CDS of CG10931, which is a histone methyltransferase with high expression levels in testis (https://flybase.org/reports/FBgn0034274; Chintapalli et al., 2007; FB2021_06; Larkin et al., 2021). Even for genes that are not directly interrupted by the inversion breakpoints, the chromosomal rearrangements may disrupt the regulation of nearby genes if, for example, they affect the organization of topologically associating domains (TADs; reviewed in Spielmann et al., 2018). The In(2R)Mal inversion breakpoints disrupt physical domains reported in Hou et al., 2012; however, inversion-mediated disruptions of TAD boundaries do not necessarily affect gene expression (Ghavi-Helm et al., 2019). Future work is required to determine if the inversions affect gene expression near the breakpoints and if CG10931 has a role in the SD-Mal drive phenotype.
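The block structure described above can be checked with a small sketch. It treats the inverted region as three reference segments bounded by the four approximate breakpoints quoted in the text, then applies the distal inversion followed by the proximal one. The segment labels `A`, `B`, `C` and the helper names are illustrative assumptions, not anything used by the authors.

```python
# Approximate Dmel r6 breakpoints on 2R, taken from the text above.
BREAKS = (8_855_602, 14_591_003, 17_749_310, 18_774_475)


def segment_lengths(breaks):
    """Lengths (bp) of the reference segments A, B, C between the four breakpoints."""
    a, b, c, d = breaks
    return {"A": b - a, "B": c - b, "C": d - c}


def invert(arrangement, start, stop):
    """Invert arrangement[start:stop]: reverse segment order and flip each orientation."""
    flipped = [(name, -strand) for name, strand in reversed(arrangement[start:stop])]
    return arrangement[:start] + flipped + arrangement[stop:]


# +1 = reference orientation, -1 = inverted relative to the reference.
reference = [("A", 1), ("B", 1), ("C", 1)]
after_distal = invert(reference, 1, 3)   # In(2R)Mal-d inverts B and C
sd_mal = invert(after_distal, 0, 2)      # In(2R)Mal-p inverts A plus the relocated C

lengths = segment_lengths(BREAKS)
distal_size = lengths["B"] + lengths["C"]     # span of the first (distal) inversion
proximal_size = lengths["A"] + lengths["C"]   # span of the second (proximal) inversion

print(sd_mal)
print({name: round(bp / 1e6, 2) for name, bp in lengths.items()})
print(round(distal_size / 1e6, 2), round(proximal_size / 1e6, 2))
```

Under these assumptions the final arrangement is C in reference orientation but moved proximally, followed by inverted A and inverted B, which matches the 1.03 Mb collinear block and the inverted 5.74 Mb and 3.16 Mb blocks in the text.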


In African populations, chromosomes bearing Sd but lacking In(2R)Mal do not drive. The functional role of In(2R)Mal in drive is, however, unclear. As expected, In(2R)Mal suppresses recombination: in crosses between a multiply marked chromosome 2, b c px, and SD-Mal, we find that In(2R)Mal reduces the b–c genetic distance by 54.6% and the c–px genetic distance by 92.4%, compared with control crosses between b c px and Oregon-R (Table 1). Our crosses confirm that In(2R)Mal is indeed required for drive: if we generate recombinants along an SD-Mal chromosome, all recombinants with both Sd and In(2R)Mal show strong drive (Table 2, rows 1 and 2), whereas none of the recombinants that separate Sd and In(2R)Mal drive (Table 2, rows 3 and 4). We conclude that SD-Mal drive requires both Sd and In(2R)Mal, which implies that one or more essential enhancers, or co-drivers, is located within or distal to In(2R)Mal.


Table 1 Table 2

The temporal order of inversions (first In(2R)Mal-d, then In(2R)Mal-p) suggests two possible scenarios. In(2R)Mal-d, occurring first, may have captured the essential enhancer, with the subsequent In(2R)Mal-p serving to further reduce recombination between Sd and the enhancer. Alternatively, an essential enhancer may be located distal to In(2R)Mal-d, and the role of both In(2R)Mal inversions is to reduce recombination with Sd. To distinguish these possibilities, we measured drive in b+ Sd c+ In(2R)Mal px recombinants, which bear Sd and In(2R)Mal but have recombined between the distal breakpoint of In(2R)Mal (2R:18,774,475) and px (2R:22,494,297). All of these recombinants show strong drive (n = 71; Table 2, row 2). Assuming that recombination is uniformly distributed throughout the 3.72 Mb interval between the In(2R)Mal-d distal breakpoint and px, the probability of failing to separate an essential co-driver or distal enhancer among any of our 71 recombinants is <0.014. Furthermore, using molecular markers (see Materials and methods), we detected two recombinants within 100 kb of the distal breakpoint of In(2R)Mal, both with strong drive (k > 0.99; Supplementary file 2). We therefore infer that the co-driver resides inside or within 100 kb of the In(2R)Mal arrangement. More specifically, we speculate that the In(2R)Mal-d inversion both captured the co-driver and reduced recombination with Sd, whereas In(2R)Mal-p tightened linkage between centromere-proximal components of SD-Mal and In(2R)Mal-d.
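The text does not spell out how the <0.014 bound was obtained. One reading that reproduces it, offered here only as an assumption, is the following: place a hypothetical distal enhancer at a uniformly random fraction u of the breakpoint-to-px interval, let each crossover also fall uniformly in that interval, and note that a recombinant keeps the enhancer with Sd only if its crossover lies distal to it, with probability 1 − u. Averaging (1 − u)^n over u gives 1/(n + 1). The function names below are illustrative.

```python
from fractions import Fraction


def prob_never_separated(n_recombinants):
    """Closed form of the integral of (1 - u)**n over u in [0, 1], i.e. 1 / (n + 1)."""
    return Fraction(1, n_recombinants + 1)


def prob_never_separated_numeric(n_recombinants, steps=100_000):
    """Midpoint-rule check of the same integral, with no closed form assumed."""
    h = 1.0 / steps
    return sum((1.0 - (i + 0.5) * h) ** n_recombinants for i in range(steps)) * h


# Interval between the In(2R)Mal distal breakpoint and px (coordinates from the text).
interval_bp = 22_494_297 - 18_774_475

p_exact = prob_never_separated(71)
p_numeric = prob_never_separated_numeric(71)

print(round(interval_bp / 1e6, 2))   # interval length in Mb
print(float(p_exact), p_numeric)
```

With 71 recombinants the closed form is 1/72, which is just under 0.014. If the authors used a different model, the number would be derived differently, so this should be read as a consistency check and not as their method.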


Despite the recruitment of these inversions, recombination occurs readily between Sd and the proximal break of In(2R)Mal (Table 2). Nevertheless, we observe long-range linkage disequilibrium between Sd and In(2R)Mal. Among 204 haploid genomes from Zambia (see Materials and methods), we identified 198 wildtype haplotypes (Sd+ In(2R)Mal+), 3 SD-Mal haplotypes (Sd In(2R)Mal), and 3 recombinant haplotypes (three Sd In(2R)Mal+, zero Sd+ In(2R)Mal). While Sd and In(2R)Mal each have individually low sample frequencies (0.0294 and 0.0147, respectively), they tend to co-occur on the same chromosome (r² = 0.493; Fisher’s exact p = 1.4 × 10⁻⁵). We calculated the expected decay of linkage disequilibrium between Sd and In(2R)Mal in the absence of any natural selection, assuming a conservative sex-averaged recombination frequency corresponding to a map distance between Sd and In(2R)Mal of ~2.5 cM (FlyBase; FB2021_06) and an effective population size of 10⁶. Under these assumptions, the observed estimated coefficient of linkage disequilibrium, D = 0.0143, has an expected half-life of just ~28 generations (2.8 years) and decays to negligible levels (i.e., expected D and r² both ~10⁻³) in <100 generations (<10 years). We therefore conclude that the SD-Mal supergene haplotype is maintained by strong epistatic selection.
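The linkage disequilibrium figures in this paragraph follow directly from the four haplotype counts. The sketch below recomputes them with the standard definitions. It assumes deterministic decay D_t = D_0(1 − r)^t with r = 0.025 and ignores drift, which seems reasonable at the stated effective population size; the variable names are illustrative.

```python
from math import comb, log

# Haplotype counts from the 204 Zambian haploid genomes described in the text.
n_sd_mal = 3      # Sd   In(2R)Mal
n_sd_only = 3     # Sd   In(2R)Mal+
n_mal_only = 0    # Sd+  In(2R)Mal
n_wildtype = 198  # Sd+  In(2R)Mal+
n_total = n_sd_mal + n_sd_only + n_mal_only + n_wildtype

p_sd = (n_sd_mal + n_sd_only) / n_total
p_mal = (n_sd_mal + n_mal_only) / n_total
p_both = n_sd_mal / n_total

# Coefficient of linkage disequilibrium and its squared correlation.
D = p_both - p_sd * p_mal
r2 = D**2 / (p_sd * (1 - p_sd) * p_mal * (1 - p_mal))

# Fisher's exact test: the observed table is the most extreme one possible
# (every In(2R)Mal chromosome carries Sd), so the p-value is the
# hypergeometric probability of this single table.
n_sd = n_sd_mal + n_sd_only
n_mal = n_sd_mal + n_mal_only
fisher_p = comb(n_sd, n_sd_mal) * comb(n_total - n_sd, n_mal - n_sd_mal) / comb(n_total, n_mal)

# Deterministic decay of D under recombination only (assumed r = 2.5 cM = 0.025).
r = 0.025
half_life = log(0.5) / log(1 - r)
D_after_100 = D * (1 - r) ** 100

print(round(p_sd, 4), round(p_mal, 4), round(D, 4), round(r2, 3))
print(fisher_p, half_life, D_after_100)
```

The half-life comes out a little over 27 generations, in line with the ~28 generations quoted above, and D falls to roughly 10⁻³ within 100 generations.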


Rapid increase in frequency of the SD-Mal supergene

We used population genomics to infer the evolutionary history and dynamics of SD-Mal chromosomes. We called SNPs in our Illumina reads from nine complete SD-Mal haplotypes from Zambia (see Materials and methods). For comparison, we also analyzed wildtype (SD+) chromosomes from the same population in Zambia, including those with chromosome 2 inversions: 10 with the In(2L)t inversion and 10 with the In(2R)NS inversion (see Materials and methods). Nucleotide diversity (π) is significantly lower on SD-Mal haplotypes than on uninverted SD+ chromosome arms (Table 3; Figure 2A). The relative reduction in diversity on SD-Mal haplotypes is distributed heterogeneously: π is sharply reduced for a large region that spans ~25.8 Mb, representing 53% of chromosome 2 and extending from Sd-RanGAP on 2L (2L:19,441,959; Figure 2—figure supplement 1), across the centromere, and to ~2.9 Mb beyond the distal breakpoint of In(2R)Mal (2R:18,774,475; Table 3, rows 3, 5, and 6; Figure 2A). Thus, the region of reduced nucleotide diversity on SD-Mal chromosomes covers all the known essential loci for the drive phenotype: Sd-RanGAP, Rspⁱ, and In(2R)Mal.
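For readers unfamiliar with π, here is a minimal sketch of average pairwise nucleotide diversity per site, computed over non-overlapping windows. It assumes equal-length, already-aligned haplotype strings with no missing data, which real call sets rarely satisfy. The toy sequences and function names are hypothetical and are not the authors' pipeline.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site among aligned haplotypes."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    if n < 2 or any(len(h) != length for h in haplotypes):
        raise ValueError("need at least two haplotypes of equal length")
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total_diffs / (len(pairs) * length)


def windowed_pi(haplotypes, window):
    """π in consecutive non-overlapping windows; a trailing partial window is dropped."""
    length = len(haplotypes[0])
    return [
        nucleotide_diversity([h[start:start + window] for h in haplotypes])
        for start in range(0, length - window + 1, window)
    ]


# Hypothetical toy alignment of four haplotypes, eight sites each.
toy = ["AAAAAAAA", "AAAATAAA", "AAAATAAC", "AAAAAAAA"]
pi_all = nucleotide_diversity(toy)
pi_windows = windowed_pi(toy, window=4)
print(pi_all, pi_windows)
```

In the paper the windows are 10 kb wide; the window size here is shrunk only so the toy example stays readable.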


Figure 2

Figure 2. Diversity on SD-Mal chromosomes. (A) Average pairwise nucleotide diversity per site (π) and (B) Tajima’s D estimates in non-overlapping 10-kb windows along chromosome 2 in Zambian SD-Mal chromosomes (n = 9, orange) and SD+ chromosomes from the same population, bearing the cosmopolitan inversions In(2L)t (n = 10, dark blue) and In(2R)NS (n = 10, light blue). Regions corresponding to pericentric heterochromatin are shaded in gray and the centromere location is marked with a black circle. SD-Mal chromosomes show a sharp decrease in nucleotide diversity and a skewed frequency spectrum from the Sd locus (Sd-RanGAP, 2L:19.4 Mb) to ~2.9 Mb beyond the distal breakpoint of In(2R)Mal.


The reduced nucleotide diversity among SD-Mal chromosomes might be expected given their low frequency in natural populations (see below). SD persists at low frequencies in populations worldwide, presumably reflecting the balance between drive, negative selection, and genetic suppression and/or resistance. If the SD-Mal supergene has been maintained at a stable drive-selection-suppression equilibrium frequency for a long period of time, then its nucleotide diversity may reflect a mutation-drift equilibrium appropriate for its effective population size. Under this scenario, we expect diversity at the supergene to be similar to wildtype (SD+) diversity scaled by the long-term equilibrium frequency of SD. We estimated SD-Mal frequency to be 1.47% by identifying the Sd duplication and In(2R)Mal breakpoints in 204 haploid genomes from Zambia. To approximate our expectation under mutation-drift equilibrium, we scaled average π from the SD+ sample by 1.47% in 10-kb windows across the region corresponding to the SD-Mal supergene, defined as the region from Sd-RanGAP to the distal breakpoint of In(2R)Mal. While nucleotide diversity outside of the SD-Mal supergene region is comparable to SD+ (Table 3, row 1), diversity in the supergene region is significantly lower than expected even when scaled by its frequency (Table 3, row 4), suggesting that the low population frequency of SD-Mal cannot fully explain its reduced diversity. This observation suggests two possibilities: the SD-Mal supergene historically had an equilibrium frequency less than 1.47% in Zambia; or the SD-Mal supergene, having reduced recombination, has experienced hitchhiking effects due to background selection and/or a recent selective sweep.
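The frequency-scaled expectation described above amounts to a one-line calculation per window. The sketch below makes it explicit. The per-window π values are invented placeholders, not data from Table 3, and the helper names are illustrative.

```python
def scaled_expectation(pi_wildtype_windows, frequency):
    """Expected π per window if diversity simply scales with haplotype frequency."""
    return [frequency * pi for pi in pi_wildtype_windows]


def fraction_below_expectation(observed, expected):
    """Share of windows in which observed π falls below the scaled expectation."""
    below = sum(obs < exp for obs, exp in zip(observed, expected))
    return below / len(observed)


# 3 SD-Mal haplotypes among 204 haploid genomes (counts from the text).
sd_mal_frequency = 3 / 204

# Hypothetical placeholder values for three 10-kb windows.
pi_sd_plus = [0.008, 0.010, 0.006]
pi_sd_mal = [0.00005, 0.00020, 0.00003]

expected = scaled_expectation(pi_sd_plus, sd_mal_frequency)
print(round(sd_mal_frequency * 100, 2))
print(fraction_below_expectation(pi_sd_mal, expected))
```

The authors report a significance test on this comparison (Table 3, row 4); the simple fraction computed here is only a stand-in to show the direction of the comparison.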

To distinguish between these possibilities, we analyzed summaries of the site frequency spectrum. We find strongly negative Tajima’s D mirroring the distribution of reduced diversity, indicating an excess of rare alleles (Figure 2B). Such a skew in the site frequency spectrum suggests a recent increase in frequency of the SD-Mal supergene in Zambia. Given the low recombination frequency between SD-Mal and SD+ chromosomes, we treat them as two subpopulations and estimate their differentiation using Wright’s fixation index, FST. The high differentiation of SD-Mal from SD+ chromosomes from the same population similarly suggests a large shift in allele frequencies. FST in the SD-Mal supergene region is unusually high for chromosomes from the same population (Figure 3A). Neither set of SD+ chromosomes with cosmopolitan inversions shows such high differentiation, and mean nucleotide differences (dXY) between SD-Mal and SD+ are comparable to the other inversions, implying that the differentiation of the SD-Mal supergene is recent. Our results (low diversity, strongly negative Tajima’s D, high FST, and relatively low dXY) are thus consistent with a rapid increase in frequency of the SD-Mal haplotype that reduced nucleotide diversity within SD-Mal and generated large differences in allele frequencies with SD+ chromosomes.
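
To make the "excess of rare alleles" reading concrete, here is a minimal Python sketch of Tajima's D using the standard Tajima (1989) formula. The haplotype strings are toy inputs, not data from the paper, and the function name is an illustrative choice.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length haplotype strings (one string per chromosome)."""
    n = len(haplotypes)
    # Segregating sites: columns with more than one allele.
    seg = [j for j in range(len(haplotypes[0])) if len({h[j] for h in haplotypes}) > 1]
    s = len(seg)
    if s == 0:
        return float("nan")
    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(sum(a[j] != b[j] for j in seg) for a, b in combinations(haplotypes, 2)) / n_pairs
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Toy sample where every variant is a singleton (rare alleles only): D is negative.
rare_only = ["1000", "0100", "0010", "0001", "0000"]
# Toy sample with intermediate-frequency variants: D is positive.
intermediate = ["11", "11", "00", "00", "00"]
print(tajimas_d(rare_only), tajimas_d(intermediate))
```

A sample dominated by singletons, as expected shortly after a sweep, pulls π below S/a1 and drives D negative.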

Figure 3

Figure 3. Differentiation between SD-Mal and wildtype chromosomes. (A) Pairwise FST and (B) dXY per base pair in non-overlapping 10-kb windows along chromosome 2, between Zambian SD-Mal haplotypes (n = 9) and wildtype chromosomes from the same population, bearing the cosmopolitan inversions In(2L)t (n = 10) and In(2R)NS (n = 10). Regions corresponding to pericentric heterochromatin are shaded in gray and the centromere location is marked with a black circle.
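
The two statistics plotted in Figure 3 can be sketched as follows. This example assumes Hudson's estimator (FST = 1 − mean within-group diversity / between-group diversity); the source does not say which FST estimator was used, so treat that choice as an assumption. The haplotypes are toy strings, and values are per window rather than per base pair.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of differing sites between two haplotype strings."""
    return sum(x != y for x, y in zip(a, b))

def pairwise_pi(pop):
    """Mean pairwise differences within one group."""
    pairs = list(combinations(pop, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

def dxy(pop1, pop2):
    """Mean pairwise differences between two groups."""
    pairs = list(product(pop1, pop2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

def hudson_fst(pop1, pop2):
    """Hudson-style FST: 1 - (mean within-group diversity) / (between-group diversity)."""
    between = dxy(pop1, pop2)
    if between == 0:
        return float("nan")
    within = (pairwise_pi(pop1) + pairwise_pi(pop2)) / 2
    return 1 - within / between

# Toy groups standing in for SD-Mal and SD+ haplotypes in one window.
sd_mal = ["0000", "0000", "0001"]
sd_plus = ["1110", "1110", "1100"]
print(dxy(sd_mal, sd_plus), hudson_fst(sd_mal, sd_plus))
```

Dividing `dxy` by the window length in base pairs would give the per-base-pair scale used in the figure. High FST with unremarkable dXY, as reported, points to a recent shift in allele frequencies rather than long divergence.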

To estimate the timing of the recent expansion of the SD-Mal supergene, we used an approximate Bayesian computation (ABC) method with rejection sampling in neutral coalescent simulations. We do not know if SD chromosomes acquired In(2R)Mal in Zambia or if the inversions occurred de novo on an SD background. For our simulations, we assume that the acquisition of the second inversion (or the double inversion by crossover) was a unique event that enhanced drive strength and/or efficiency and that the onset of the selective sweep occurred following this event. Under this scenario, extant SD-Mal chromosomes have a single origin. We therefore simulated this history in a coalescent framework as an absolute bottleneck to a single chromosome. We performed simulations considering a sample size of n = 9 and assumed no recombination in the ~9.92 Mb region of In(2R)Mal. We simulated with values of S drawn from a uniform distribution ±5% of the observed number of segregating sites in non-coding regions of In(2R)Mal. We considered a prior uniform distribution of the time of the expansion (t) ranging from 0 to 4Ne generations (0–185,836 generations, or ~0–18,584 years ago), assuming that D. melanogaster Ne in Zambia is 3,160,475 (Kapopoulou et al., 2018), an In(2R)Mal frequency of 1.47%, and 10 generations per year (Li and Stephan, 2006; Thornton and Andolfatto, 2006; Laurent et al., 2011; Kapopoulou et al., 2018). Using the ABC with rejection sampling conditional on our observed estimates of π and Tajima’s D for In(2R)Mal (πIn(2R)Mal = 584.60, D = –1.33; note that πIn(2R)Mal is an overall, unscaled estimate of nucleotide diversity for the whole In(2R)Mal region and that only non-coding regions were considered), we infer that the SD-Mal expansion began ~0.0884 (95% CIs 0.0837–0.1067) × 4Ne generations ago or, equivalently, ~1644 years ago (1.11% rejection sampling acceptance rate; Figure 4). To account for possible effects of gene conversion between SD and SD+ chromosomes (see below), we discarded SNPs shared with SD+ chromosomes and recalculated π and Tajima’s D using only private SNPs (πIn(2R)Mal = 427.72, D = –1.45). Based on these parameters, the estimated SD-Mal expansion occurred ~0.0679 (95% CIs 0.0647–0.0868) × 4Ne generations ago, or ~1261 years ago (1.02% rejection sampling acceptance rate; Figure 4). To calculate the posterior probability of the model, we performed 100,000 simulations under three models: a model assuming a stable frequency of SD-Mal; a model assuming an exponential growth of SD-Mal, based on parameters estimated for Zambia (Kapopoulou et al., 2018); and a selective sweep model (assuming t_all = 0.0884 and t_shared_excl = 0.0679) (Figure 4—figure supplement 1). The simulated data are inconsistent with a long-term stable frequency of SD-Mal (all SNPs, pπ = 0.0522, pD = 0.096; private, pπ = 0.0266, pD = 0.0668) or long-term exponential growth (all SNPs, pπ = 0.0465, pD = 0.0907; private, pπ = 0.0215, pD = 0.0605). Instead, our simulations suggest that a recent selective sweep is more consistent with the data (all SNPs, pπ = 0.3554, pD = 0.5952; private, pπ = 0.3480, pD = 0.6142). Taken together, evidence from nucleotide diversity, the site frequency spectrum, population differentiation, and coalescent simulations suggests a rapid non-neutral increase in frequency of the SD-Mal supergene that began <2000 years ago.
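
The conversion from scaled time to years can be checked with a short Python sketch. It assumes, as the stated parameters imply, that one time unit equals 4 × Ne × (SD-Mal frequency) generations; the function and constant names are illustrative.

```python
NE_ZAMBIA = 3_160_475   # Ne for Zambia (Kapopoulou et al., 2018), as quoted in the text
SD_MAL_FREQ = 0.0147    # SD-Mal frequency in Zambia
GENS_PER_YEAR = 10

def sweep_age(t_scaled, ne=NE_ZAMBIA, freq=SD_MAL_FREQ, gens_per_year=GENS_PER_YEAR):
    """Convert time in units of 4*Ne*freq generations into (generations, years)."""
    generations = t_scaled * 4 * ne * freq
    return generations, generations / gens_per_year

for label, t in [("all SNPs", 0.0884), ("private SNPs only", 0.0679)]:
    gens, years = sweep_age(t)
    print(f"{label}: ~{gens:,.0f} generations, ~{years:,.0f} years")
```

Under these assumptions one full time unit is about 185,836 generations, and the two point estimates land within a few years of the reported ~1644 and ~1261 years. The small gaps presumably reflect rounding of the posterior estimates.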

Figure 4

Figure 4. Estimating the time since the SD-Mal selective sweep. Approximate Bayesian computation (ABC) estimates based on 10,000 posterior samples place the onset of the selective sweep between 0.0884 (95% CI 0.0837–0.1067) and 0.0679 (0.0647–0.0868) × 4Ne generations, that is, ~1261–1644 years ago, considering recent estimates of Ne in Zambia from Kapopoulou et al., 2018, frequency of SD-Mal in Zambia 1.47% and 10 generations per year. Estimates were done considering only In(2R)Mal, where crossing over is rare and only occurs between SD-Mal chromosomes, using all SNPs and excluding shared SNPs in order to account for gene conversion from SD+ chromosomes.
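
The logic of ABC with rejection sampling can be illustrated with a toy, pure-Python simulator. This is a rough sketch under loud assumptions, not the authors' pipeline:

- Time is in coalescent units in which a pair of lineages coalesces at rate 1, which differs from the 4Ne scaling in the text.
- S = 2131 is a placeholder back-calculated from the reported π and D, not a published value.
- The tolerances are arbitrary.
- All function names are hypothetical.

```python
import math
import random

def summaries(sfs, n):
    """Return (pi, Tajima's D) from an unfolded SFS; sfs[i] counts sites with i derived copies."""
    s = sum(sfs)
    pi = sum(c * i * (n - i) for i, c in enumerate(sfs)) / (n * (n - 1) / 2)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    return pi, (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

def branch_length_by_size(n, t_bottleneck, rng):
    """Total branch length subtending i samples (index i), with all lineages merging at t_bottleneck."""
    lengths = [0.0] * n
    lineages = [1] * n  # number of sampled descendants of each surviving lineage
    now = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        wait = rng.expovariate(k * (k - 1) / 2)
        remaining = t_bottleneck - now
        hit_bottleneck = wait >= remaining
        step = remaining if hit_bottleneck else wait
        for size in lineages:
            lengths[size] += step
        if hit_bottleneck:
            break  # absolute bottleneck: every remaining lineage joins the single founder
        now += step
        a, b = rng.sample(range(k), 2)
        merged = lineages[a] + lineages[b]
        lineages = [s for i, s in enumerate(lineages) if i not in (a, b)] + [merged]
    return lengths

def simulate_sfs(n, s, t_bottleneck, rng):
    """Drop s mutations on the genealogy in proportion to branch length (fixed-S simulation)."""
    lengths = branch_length_by_size(n, t_bottleneck, rng)
    sfs = [0] * n
    for size in rng.choices(range(1, n), weights=lengths[1:], k=s):
        sfs[size] += 1
    return sfs

def abc_rejection(obs_pi, obs_d, n, s_obs, t_max, n_sims, tol_pi=0.05, tol_d=0.1, seed=1):
    """Keep prior draws of t whose simulated pi and D fall close to the observed values."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        t = rng.uniform(1e-6, t_max)                                   # uniform prior on sweep time
        s = rng.randint(round(0.95 * s_obs), round(1.05 * s_obs))      # S within +/-5% of observed
        pi, d = summaries(simulate_sfs(n, s, t, rng), n)
        if abs(pi - obs_pi) <= tol_pi * obs_pi and abs(d - obs_d) <= tol_d:
            accepted.append(t)
    return accepted

# Reported pi and D for In(2R)Mal; s_obs is an assumed placeholder (see lead-in).
accepted = abc_rejection(obs_pi=584.60, obs_d=-1.33, n=9, s_obs=2131, t_max=1.0, n_sims=2000)
if accepted:
    print(f"accepted {len(accepted)} of 2000 draws; mean t = {sum(accepted) / len(accepted):.3f}")
```

The accepted draws approximate the posterior for the sweep time. A production analysis would use an established coalescent simulator and a far larger number of simulations.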

The sweep signal on the SD-Mal haplotypes begins immediately distal to Sd-RanGAP on 2L and extends ~3 Mb beyond the distal boundary of In(2R)Mal on 2R. To understand why the sweep extends so far beyond the In(2R)Mal-d distal breakpoint, we consider three possibilities that are not mutually exclusive. First, chromosomal inversions can suppress recombination ~1–3 Mb beyond their breakpoints (in both multiply inverted balancer chromosomes and natural inversions), extending the size of the sweep signal. To determine the extent of recombination suppression caused by In(2R)Mal, we estimated recombination rates in the region distal to the inversion. The expected genetic distance between the distal breakpoint of In(2R)Mal (2R:18.77 Mb) and px (2R:22.49 Mb) is ~13.87 cM. Measuring recombination between SD-Mal and standard arrangement chromosomes for the same (collinear) interval, we estimate a genetic distance of ~1.76 cM (Table 1), an 87.3% reduction. In(2R)Mal thus strongly reduces recombination beyond its distal boundary. Second, although we have inferred that the essential enhancer(s) reside(s) within the In(2R)Mal inversion (see above), we have not excluded the possibility of weak enhancers distal to the inversion which might contribute to the sweep signal. We find that SD-Mal chromosomes with In(2R)Mal-distal material recombined away (b+ Sd c+ In(2R)Mal px) have modestly but significantly lower drive strength (k = 0.96 vs. 0.98; Table 2, lines 1–2), suggestive of one or more weak distal enhancers. Third, there may be mutations distal to In(2R)Mal that contribute to the fitness of SD-Mal haplotypes but without increasing the strength of drive, for example, compensatory mutations that ameliorate the effects of SD-Mal-linked deleterious mutations.
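
The 87.3% figure follows directly from the two genetic distances quoted above, as this small Python check shows (the function name is illustrative).

```python
def recombination_reduction(observed_cm, expected_cm):
    """Fractional reduction in map distance relative to the expected distance."""
    return 1 - observed_cm / expected_cm

# Distal In(2R)Mal breakpoint to px: expected ~13.87 cM, observed ~1.76 cM.
reduction = recombination_reduction(observed_cm=1.76, expected_cm=13.87)
print(f"{reduction:.1%}")  # 87.3%
```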

Most supergenes show long-range LD, reduced nucleotide diversity, and differentiation when compared with their wildtype counterparts. While some meiotic drive supergenes show evidence of recurrent selective sweeps or a signature of epistatic selection without strong selective sweeps, others show no signatures of recent or ongoing positive selection. The relatively recent origin (~38.5 kya; Brand et al., 2015) of SD might explain the constant turnover, as there may not have been enough time to reach a stable equilibrium compared to older drive systems like the t-haplotype, whose first inversion arose 3 mya.

Recombination on SD-Mal supergenes

While nearly all SD-Mal haplotypes are individually homozygous lethal and do not recombine with wildtype chromosomes in and around In(2R)Mal, ~90% of pairwise combinations of different SD-Mal chromosomes (SDi/SDj) are viable and fertile in complementation tests. Therefore, recombination via crossing over may occur between SD-Mal chromosomes in SDi/SDj heterozygous females. To determine if SD-Mal chromosomes recombine, we estimated mean pairwise linkage disequilibrium (r2) between SNPs located within the In(2R)Mal arrangement. We found that mean r2 between pairs of SNPs declines as a function of the physical distance separating them (Figure 5A), a hallmark of recombination via crossing over. Pairwise LD is higher and extends further in In(2R)Mal than in the equivalent region of SD+ chromosomes or in either of the two cosmopolitan inversions, In(2L)t and In(2R)NS (Figure 5A). This pattern is not surprising: the low frequency of SD-Mal makes SDi/SDj genotypes, and hence the opportunity for recombination, rare. (The smaller sample size of SD (n = 9) vs. SD+ (n = 10) may also contribute weakly to its higher estimated LD.) To further characterize the history of recombination between SD-Mal haplotypes, we used 338 non-singleton, biallelic SNPs in In(2R)Mal to trace historical crossover events. From these SNPs, we estimate that Rm, the minimum number of recombination events, in this sample of SD-Mal haplotypes is 15 (Figure 5C). Thus, assuming that these SD-Mal haplotypes are ~16,436 generations old (Figure 4), we estimate that recombination events between SD-Mal haplotypes occur a minimum of once every ~1096 generations. We can thus confirm that crossover events are relatively rare, likely due to the low population frequency of SD-Mal and the possibly reduced fitness of SDi/SDj genotypes.
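
Both quantities in this paragraph, pairwise r2 and the minimum number of recombination events, can be sketched in Python. The Rm function below follows the standard Hudson–Kaplan four-gamete lower bound; the source does not state which software the authors used, so this is an assumed method, and the SNP data are toy values.

```python
from itertools import combinations

def r_squared(site_a, site_b):
    """LD (r^2) between two biallelic sites, each a list of 0/1 alleles per haplotype."""
    n = len(site_a)
    p_a, p_b = sum(site_a) / n, sum(site_b) / n
    p_ab = sum(a and b for a, b in zip(site_a, site_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return (p_ab - p_a * p_b) ** 2 / denom if denom > 0 else float("nan")

def four_gametes(site_a, site_b):
    """True if all four haplotype combinations occur, implying at least one crossover."""
    return len(set(zip(site_a, site_b))) == 4

def hudson_kaplan_rm(sites):
    """Lower bound on recombination events: maximum number of disjoint incompatible intervals."""
    intervals = [(i, j) for i, j in combinations(range(len(sites)), 2)
                 if four_gametes(sites[i], sites[j])]
    intervals.sort(key=lambda ij: ij[1])  # greedy by earliest right endpoint
    rm, last_end = 0, -1
    for start, end in intervals:
        if start >= last_end:  # intervals sharing only an endpoint do not overlap
            rm += 1
            last_end = end
    return rm

# Toy data: four SNPs (in positional order) scored in six haplotypes.
sites = [
    [0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 0],
]
print("Rm =", hudson_kaplan_rm(sites))
print("r^2 (sites 1 and 2) =", r_squared(sites[1], sites[2]))

# Spacing of events implied by the text: ~16,436 generations and Rm = 15.
print("generations per event:", round(16_436 / 15))
```

Because Rm is a lower bound, one event per ~1096 generations is a minimum rate rather than a point estimate.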

Figure 5

Figure 5. Recombination on SD-Mal haplotypes. (A) Linkage disequilibrium (r2) as a function of distance in 10-kb windows, measured in In(2R)Mal (n = 9), In(2L)t (n = 10), In(2R)NS (n = 10), and the corresponding region of In(2R)Mal in a standard, uninverted 2R chromosome (n = 10). (B) Histogram of length of runs of SNPs in In(2R)Mal shows that a high proportion of shared SNPs concentrate in runs shorter than 1 kb. (C) Chromosomal configuration of the 338 non-singleton SNPs in nine different SD-Mal lines. Color coded for two states (same in light orange or different in dark orange) using SD-ZI125 as reference. Locations of minimal number of recombination events are labeled as triangles at the bottom. Maximum likelihood tree is displayed on the left.

While crossing over is suppressed in SD-Mal/SD+ heterozygotes, gene conversion and/or double crossover events may still occur, accounting for the shared SNPs between SD-Mal and SD+ chromosomes within In(2R)Mal. As both events exchange tracts of sequence, we expect shared SNPs to occur in runs of sites at higher densities than private SNPs, which should be distributed randomly. Accordingly, in In(2R)Mal, SNP density is five times higher for runs of shared SNPs (0.63 SNPs/kb) than for runs of SD-private SNPs (0.12 SNPs/kb), as expected if SD+ chromosomes, which have higher SNP densities, were donors of conversion tract sequences. Although we cannot exclude the contribution of double crossovers, we note that 62.2% (89 out of 143) of the shared SNP runs are <1 kb, 80.4% (115 out of 143) are <10 kb (Figure 5B), and the longest run is ~50.2 kb. These sizes are more consistent with current estimates of gene conversion tract lengths in D. melanogaster than with double crossovers. Surprisingly, these inferred gene conversion events are unevenly distributed across In(2R)Mal, being more frequent in In(2R)Mal-p than in In(2R)Mal-d. Our discovery that SD-Mal haplotypes can recombine with each other distinguishes the SD-Mal supergene from supergenes that are completely genetically isolated. The lack of crossing over with SD+ chromosomes, however, means that SD-Mal haplotypes evolve as a semi-isolated subpopulation, with a nearly 100-fold smaller Ne and limited gene flow from SD+ via gene conversion events. The reduced recombination, low Ne, and history of epistatic selection may nevertheless lead to a higher genetic load on SD-Mal than SD+ chromosomes. We therefore examined the accumulation of deleterious mutations, including non-synonymous mutations and TEs, on the SD-Mal supergene.
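
One way to find runs of shared SNPs and summarize their lengths is sketched below in Python. The run definition used here (a maximal stretch of consecutive shared SNPs, with length measured from first to last position) is an assumption, since the source does not spell out its exact definition. The SNP positions are toy values.

```python
from itertools import groupby

def shared_runs(snps):
    """snps: list of (position, is_shared) sorted by position.
    Returns (start, end, n_snps) for each maximal run of consecutive shared SNPs."""
    runs = []
    for is_shared, group in groupby(snps, key=lambda snp: snp[1]):
        if is_shared:
            positions = [pos for pos, _ in group]
            runs.append((positions[0], positions[-1], len(positions)))
    return runs

# Toy SNP table: True marks a SNP shared with SD+ chromosomes, False an SD-private SNP.
snps = [(100, True), (350, True), (900, False), (5000, True),
        (5200, True), (5900, True), (20000, False), (21000, True)]
runs = shared_runs(snps)
short = [r for r in runs if r[1] - r[0] + 1 < 1000]
print(runs, f"{len(short)}/{len(runs)} runs shorter than 1 kb")

# Figures quoted in the text.
print(f"{89 / 143:.1%} of runs < 1 kb, {115 / 143:.1%} < 10 kb")
print(f"shared vs. private SNP density: {0.63 / 0.12:.2f}-fold")
```

Short, dense runs of shared SNPs are the pattern expected from gene conversion tracts, whereas double crossovers would tend to produce much longer exchanged segments.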

Consequences of reduced recombination, small effective size, and epistatic selection

We first studied the effects of a reduced efficacy of selection on SNPs in In(2R)Mal. As many or most non-synonymous polymorphisms are slightly deleterious, relatively elevated ratios of non-synonymous to synonymous polymorphisms (N/S ratio) can indicate a reduced efficacy of negative selection. For the SNPs in In(2R)Mal, the overall N/S ratio is 2.3-fold higher than that for the same region of SD+ chromosomes (Table 4). Notably, the N/S ratio for private SNPs is 3.1-fold higher (Table 4), whereas the N/S ratios for shared SNPs do not significantly differ from SD+ chromosomes (Table 4, Figure 6—figure supplement 1). These findings suggest that gene conversion from SD+ ameliorates the accumulation of potentially deleterious non-synonymous mutations on SD-Mal chromosomes.

Figure 6. Transposable elements (TEs) on SD-Mal haplotypes. (A) Number of TE insertions per 100-kb windows along chromosome 2 in Zambian SD chromosomes (n = 9, orange) and wildtype chromosomes from the same population, bearing the cosmopolitan inversions In(2L)t (n = 10, dark blue) and In(2R)NS (n = 10, light blue). (B) Ratio of the number of insertions in the euchromatin of 2R to 2L per library. The relative enrichment in TEs in 2R of SD-Mal haplotypes is mostly due to an increase of TE insertions in non-recombining regions of the chromosome. Asterisks denote significance, p-values estimated by a Kruskal-Wallis test (threshold for significance p = 0.05).

Gene conversion may not, however, rescue SD-Mal from deleterious TE insertions, as average TE length exceeds the average gene conversion tract length (Kaminker et al., 2002). TEs accumulate in regions of reduced recombination, such as centromeres (Charlesworth et al., 1994) and inversions, especially those at low frequency (Eanes et al., 2009; Sniegowski and Charlesworth, 1994). Indeed, TE densities for the whole euchromatic region of chromosome 2R are significantly higher for SD-Mal compared to SD+ chromosomes (Figure 6A). This increased TE density on SD-Mal is driven by the low recombination regions of the haplotype: In(2R)Mal has significantly higher TE density than SD+ whereas the distal region of 2R outside of the sweep region does not (Figure 6B). The most over-represented families in In(2R)Mal relative to standard chromosomes are M4DM, MDG1, ROO_I, and LINE elements (Figure 6—figure supplement 2), all TEs that are currently or recently active, which is consistent with the recent origin of the SD-Mal haplotype. Thus, the differences in shared vs. private SNPs suggest that gene conversion from SD+ chromosomes may slow the accumulation of deleterious point mutations but not the accumulation of TEs. Despite occasional recombination, the small Ne of SD-Mal haplotypes has incurred a higher genetic load.

Conclusions

Supergenes are balanced, multigenic polymorphisms. Under the classic model of supergene evolution, epistatic selection among component loci favors the recruitment of recombination modifiers that reinforce the linkage of beneficial allelic combinations. The advantages of reduced recombination among strongly selected loci can, however, compromise the efficacy of selection at linked sites. Supergenes thus provide opportunities to study the interaction of recombination and natural selection. We have studied a population of selfish supergenes, the SD-Mal haplotypes of Zambia, to investigate the interplay of recombination, selection, and meiotic drive. Our findings demonstrate, first, that the SD-Mal supergene extends across ~25.8 Mb of D. melanogaster chromosome 2, a region that comprises the driving Sd-RanGAP, a drive-insensitive deletion at the major Rsp locus, and the In(2R)Mal double inversion. Second, using genetic manipulation, we show that SD-Mal requires Sd-RanGAP and an essential co-driver that localizes almost certainly within the In(2R)Mal rearrangement, and probably within the distal inversion. These data provide experimental evidence for epistasis between Sd-RanGAP and In(2R)Mal: neither allele can drive without the other. Third, we provide population genomics evidence that epistatic selection on loci spanning the SD-Mal supergene region drove a very recent, chromosome-scale selective sweep. These patterns are consistent with recurrent episodes of replacement of one SD haplotype by others. Fourth, despite rare crossovers among complementing SD-Mal haplotypes and gene conversion from wildtype chromosomes, the relative genetic isolation and low frequency of SD-Mal result in the accumulation of deleterious mutations including, especially, TE insertions. From these findings, we conclude that the SD-Mal supergene population is of small effective size, semi-isolated from the greater population of wildtype chromosomes, and subject to bouts of very strong selection.

Non-recombining supergenes that exist exclusively in heterozygous state tend to degenerate, as in the case of Y chromosomes (reviewed in Charlesworth and Charlesworth, 2000) and some autosomal supergenes which, for different reasons, lack any opportunity for recombination. But not all supergenes are necessarily expected to degenerate. In SD-Mal, for instance, complementing SD-Mal haplotypes can recombine via crossing over, if rarely, and gene flow from wildtype SD+ to SD-Mal chromosomes can occur via gene conversion. In the mouse t-haplotype, there is similar evidence for occasional recombination between complementing t-haplotypes and with standard chromosomes, probably via gene conversion. Despite the many parallels characterizing supergenes, their ultimate evolutionary fates depend on the particulars of the system.
