University of Rochester
Long read sequencing reveals the dynamic evolution of Drosophila Y chromosomes
Drosophila Y chromosomes tend to be gene-poor, repeat-rich and heterochromatic, yet most are essential for male fertility. Genes traffic on and off Y chromosomes over evolutionary time, so Y-linked gene content can differ between Drosophila species. However, our understanding of Y chromosome dynamics is based primarily on movements of known genes from a single reference species, D. melanogaster. We know little about the organization or detailed evolution of Y chromosome structure and content. Because they are dense in repeats, Y chromosomes are particularly challenging to assemble with short-read sequencing technology. Only ~10% of the D. melanogaster Y chromosome (~4 Mb of ~40 Mb) was assembled in the latest version of the reference genome. We used deep-coverage, long single-molecule real-time (SMRT) sequencing reads to assemble up to 30 Mb from the Y chromosomes of three sister species in the Drosophila simulans clade (D. simulans, D. mauritiana and D. sechellia) and compared these Y chromosomes to that of D. melanogaster. The three simulans clade species diverged from a common ancestor just 240 Kya and from D. melanogaster just 2 Mya. Despite this recent divergence, we find differences in Y-linked gene copy number, intron size, repeat content and gene order. We discovered a novel family of serine-arginine protein kinase (SRPK) genes that duplicated to the Y chromosome from an autosome in the simulans clade. SRPK is ubiquitously expressed in Drosophila species and has roles in both oogenesis and spermatogenesis. In mammals, SRPK proteins phosphorylate protamines, an important component of sperm chromatin. Intralocus sexual conflict may have played a role in the evolution of this gene family in the simulans clade. The Y-linked SRPK copies show evidence of subfunctionalization following gene duplication: they retained an exon from a testis-specific isoform, whereas the autosomal copies have lost that exon.
These data suggest that Y chromosome content is dynamic over short evolutionary time scales and may be, at least in part, driven by conflict.