All animals, plants, fungi and protists — which collectively make up the domain of life called eukaryotes — have genomes with a peculiar feature that has puzzled researchers for almost half a century: Their genes are fragmented.
In their DNA, the information about how to make proteins isn’t laid out in long coherent strings of bases. Instead, genes are split into segments, with intervening sequences, or “introns,” spacing out the exons that encode bits of the protein. When eukaryotes express their genes, their cells have to splice out RNA from the introns and stitch together RNA from the exons to reconstruct the recipes for their proteins.
The mystery of why eukaryotes rely on this baroque system deepened with the discovery that the different branches of the eukaryotic family tree varied widely in the abundance of their introns. The genes of yeast, for instance, have very few introns, but those of land plants have many. Introns make up almost 25% of human DNA. How this tremendous, enigmatic variation in intron frequency evolved has stirred debate among scientists for decades.
Answers may finally be emerging, however, from recent studies of genetic elements called introners that some scientists regard as a kind of genomic parasite. These pieces of DNA can slip into genomes and multiply there, leaving profusions of introns behind them. Last November, researchers presented evidence that introners have been doing this in diverse eukaryotes throughout evolution. Moreover, they showed that introners could explain why explosive gains in introns seem to have been particularly common in aquatic forms of life.
Their findings “might explain the vast majority of intron gain,” said Russ Corbett-Detig, senior author of the new paper and an evolutionary genomics researcher at the University of California, Santa Cruz.
The Puzzle of Eukaryotic Genomes
Because of the introns polka-dotting their DNA, if the genes of eukaryotes were translated directly into proteins, the resulting molecules would typically be nonfunctional garbage. For that reason, all eukaryotic cells are equipped with special genetic shears called spliceosomes. These protein complexes recognize the distinctive sequences that flank intron RNA and remove it from the preliminary RNA transcripts of active genes. Then they splice together the coding segments from exons to produce messenger RNA that can be translated into a working protein.
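The cut-and-stitch step can be pictured with a minimal Python sketch. It assumes the intron positions are already known as index pairs, which is a simplification: a real spliceosome finds them by recognizing sequence signals. The function name `splice` and the toy transcript are hypothetical, chosen only for illustration.

```python
def splice(pre_mrna, introns):
    """Remove introns, given as half-open (start, end) index pairs, and join the exons."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor or end < start or end > len(pre_mrna):
            raise ValueError("introns must be in range and non-overlapping")
        exons.append(pre_mrna[cursor:start])  # exon lying before this intron
        cursor = end                          # skip over the intron itself
    exons.append(pre_mrna[cursor:])           # final exon after the last intron
    return "".join(exons)


# Toy transcript (illustrative only): uppercase = exon RNA, lowercase = intron RNA.
transcript = "AUGGCC" + "guaaguuuuag" + "GAUUAA"
mature = splice(transcript, [(6, 17)])
```

Here `mature` holds only the two exons joined end to end, which is the form of the message that the cell would go on to translate.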
(A few prokaryotes also have introns, but they have ways of working around them that don’t involve spliceosomes. For example, some of their introns are “self-splicing” and automatically remove themselves from RNA.)
Why natural selection in eukaryotes favored introns that needed to be removed by spliceosomes is unknown. But the key might be that such introns allow for alternative splicing, a phenomenon that dramatically increases the diversity of products that can arise from a single gene. When the intron RNA is clipped out, the exon RNA sequences can be strung together in a new order to make slightly different proteins, Corbett-Detig explained.
Despite the influence of introns on the biology and genetic complexity of eukaryotic organisms, their evolutionary origins have remained murky. Since the discovery of introns in 1977, researchers have developed numerous theories about where these intrusive sequences came from. Several mechanisms that could create introns have been identified, and all of them may have contributed some introns to eukaryotes. But it’s been hard to say which if any of them might explain where the majority of introns came from.
Moreover, the mystery around the origins of introns only deepens in light of the extreme variation in where introns tend to show up throughout the eukaryote tree of life. Some lineages are particularly heavy with them in ways that point to sudden inundations with introns during their evolutionary history. When you examine the tree of life and how many introns are found on each tip of the tree, Corbett-Detig said, “you can figure out pretty quickly that there must be certain branches where an absolute ton of introns evolved all at once.”
One possible explanation for those explosive infusions of introns involves an unusual kind of genetic element known as an introner. First described in 2009 in the unicellular green algae Micromonas, introners have subsequently turned up in the genomes of some other algae, some species of fungi, tiny marine organisms called dinoflagellates and simple invertebrates called tunicates.
The distinctive feature of introners is that they create introns. Introners copy and paste themselves into stretches of coding DNA that offer an appropriate splicing site. Then they move on, leaving behind a specific intron sequence flanked by splicing sites, which splits the coding DNA into two exons. This process can be repeated on a massive scale throughout a genome. In fungi, for example, introners appear to account for most of the intron gain during at least the last 100,000 years.
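As a rough illustration of that process, the sketch below inserts an intron into a stretch of coding DNA and checks that removing it again restores the original sequence. The sequences, the insertion position and the helper name `insert_intron` are all hypothetical; the only biology assumed is the well-known convention that most spliceosomal introns begin with GT and end with AG in the DNA.

```python
def insert_intron(coding_dna, position, intron):
    """Insert `intron` at `position`; return the new gene and the intron's (start, end) span."""
    if not 0 < position < len(coding_dna):
        raise ValueError("position must fall strictly inside the coding sequence")
    gene = coding_dna[:position] + intron + coding_dna[position:]
    return gene, (position, position + len(intron))


coding = "ATGGCCGATTAA"
# Hypothetical introner-derived sequence with canonical GT...AG ends.
intron = "GTAAGTTTTCAG"

gene, (start, end) = insert_intron(coding, 6, intron)

# The insertion has split the coding DNA into two exons.
exon1, exon2 = gene[:start], gene[end:]

# Splicing out the intron recovers the uninterrupted coding sequence.
restored = exon1 + exon2
```

Because `restored` equals the original `coding` string, the host's protein recipe survives the insertion as long as the intron is spliced out cleanly.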
How introners accomplish this became clearer in 2016, when researchers found that introners in two species of algae had strong similarities to DNA transposons, members of a larger family of genetic elements called transposable elements or “jumping genes.” Transposons also insert huge numbers of copies of themselves into genomes.
The parallels between introners and transposons strongly suggested a possible answer to the mystery of where most introns came from. Introners could cause introns to burst forth in genomes in great numbers, which might explain the punctuated pattern of their emergence in various eukaryotes. The catch was that introners were only known to exist in a few organisms.
“Did anyone look anywhere else?” asked Landen Gozashti, who was doing research on evolutionary genomics at Santa Cruz when he read the 2016 algae study. A look at the scientific literature showed that no groups had published any data about introners elsewhere among the eukaryotes. Gozashti (now at Harvard University), Corbett-Detig and their colleagues set out to remedy that.
Stealthy, Abundant Invaders
The team systematically scanned more than 3,300 genomes from across the full breadth of eukaryotic diversity — everything from sheep to sequoias to ciliate protists. They used a series of computational filters to identify potential introners, looking for introns with very similar sequences and whittling away false positives. In the end, they found thousands of introns derived from introners in 175 of those genomes, about 5% of the total, from 48 different species.
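The core idea of such a filter, grouping introns that look nearly identical, can be sketched in a few lines of Python. This is not the team's pipeline: the greedy grouping strategy, the thresholds `min_similarity` and `min_copies`, and the toy sequences are assumptions made for illustration, and the similarity score comes from the standard library's `difflib.SequenceMatcher` rather than a real alignment tool.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity score in [0, 1]; 1.0 means the two sequences are identical."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def candidate_families(introns, min_similarity=0.9, min_copies=3):
    """Greedily group introns by similarity to the first member of each group.

    Groups with fewer than `min_copies` members are discarded, since a
    sequence that appears only once or twice is weak evidence of a
    self-copying element.
    """
    families = []
    for seq in introns:
        for family in families:
            if similarity(seq, family[0]) >= min_similarity:
                family.append(seq)
                break
        else:
            families.append([seq])  # no close match: start a new group
    return [family for family in families if len(family) >= min_copies]


# Toy data (illustrative only): three near-identical introns and two unrelated ones.
introns = [
    "GTAAGTCCATGCTAGGACTTACAG",
    "GTAAGTCCATGCTAGGACTTGCAG",
    "GTAAGTCCTTGCTAGGACTTACAG",
    "GTATCGGATTACCGTTAAGGCTAG",
    "GTCCCTAGAGATTTACGCGGAAAG",
]
families = candidate_families(introns)
```

In this toy run the three near-identical introns end up in one candidate family, while the two unrelated sequences are dropped as singletons.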
Five percent may seem like a small sliver of the eukaryotic pie. But as mutations accumulate in introners over time, sequence similarities between the copies deteriorate until it’s no longer possible to tell that they came from the same source. The evolutionary lineages of many species alive today may have experienced floods of introns, but any influx that occurred more than a few million years ago would be undetectable. The 5% result therefore hints that introners may be far more widespread than that figure alone suggests.
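One way to see why old copies become unrecognizable is a textbook substitution model. The sketch below uses the Jukes-Cantor model, swapped in here purely for illustration and not taken from the study. It assumes that all four bases are equally common, that every substitution is equally likely, and that there are no insertions or deletions. The divergence values in the loop are arbitrary.

```python
import math


def expected_identity(subs_per_site):
    """Expected fraction of matching sites between two copies under Jukes-Cantor.

    `subs_per_site` is the expected number of substitutions per site
    separating the two copies, summed over both lineages.
    """
    return 0.25 + 0.75 * math.exp(-4.0 * subs_per_site / 3.0)


# Arbitrary divergence levels, chosen only to show the shape of the decay.
for d in (0.0, 0.1, 0.5, 1.0, 3.0):
    print(f"{d:.1f} substitutions per site: identity {expected_identity(d):.2f}")
```

Identity falls toward 0.25, the level at which two unrelated sequences of four bases match by chance, so a similarity-based search loses the ability to tell old copies from background.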
As genomic parasites, introners may have achieved their success through stealth. A good parasite can’t draw too much attention to itself. If an introner disrupts the activity of the gene in which it has embedded itself, it could harm the host organism, and natural selection could remove the genomic parasite altogether. So these elements are continually evolving to be “as neutral as possible” in their influence, said Valentina Peona, a comparative genomicist at Uppsala University.
Gozashti, Corbett-Detig and their colleagues found out how adept introners are at slipping under the radar when they estimated the splicing efficiency of introners, which reflects their ability to avoid disrupting the function of host genes. “Introners actually are spliced better than other introns,” Gozashti said. “These things have gotten really good at it.”
An Aquatic Connection
The work by Gozashti and his colleagues revealed that introners are not distributed equally among eukaryotes. For example, introners are more than six times as likely to appear in the genomes of aquatic organisms as in those of terrestrial organisms. Moreover, nearly three-quarters of the genomes from aquatic species that contain introners host multiple introner families.
Corbett-Detig, Gozashti and their colleagues think this pattern can be explained by horizontal gene transfer, the transfer of a genetic sequence from one species to another. These unorthodox gene transfers tend to happen in aquatic environments or in instances of close interspecies association, such as between hosts and parasites, explained Saima Shahid, a plant biologist at Oklahoma State University.
Aquatic environments may encourage horizontal gene transfer because the aqueous medium can become a soup of the nucleic acids shed by countless species. Single-celled organisms paddle around in this stew, so it’s easy for them to take up foreign DNA that might be incorporated into their own. But even much more complex multicellular species lay their eggs or fertilize them in the water, creating opportunities for DNA to be transferred into their lineages.
Clément Gilbert, an evolutionary genomicist at Paris-Saclay University, thinks the aquatic bias in introners is an echo of what his group found in horizontal gene transfer events. In 2020, their work uncovered nearly 1,000 distinct horizontal transfers involving transposons that had occurred in over 300 vertebrate genomes. The vast majority of these transfers happened in teleost fish, Gilbert said.
If introners find their way into hosts primarily through horizontal gene transfers in aquatic environments, that could explain the irregular patterns of big intron gains in eukaryotes. Terrestrial organisms aren’t likely to have the same bursts of introns, Corbett-Detig said, since horizontal transfer occurs far less often among them. The transferred introns could persist in genomes for many millions of years as permanent souvenirs from an ancestral life in the sea and a fateful brush with a deft genomic parasite.
The fact that introners act as foreign, invasive elements in genomes could also explain why they insert introns so suddenly and explosively. Defense mechanisms that a genome might use to suppress its inherited burden of transposons might not work on an unfamiliar genetic element arriving by horizontal transfer.
“Now that element can go crazy all over the genome,” Gozashti said. Even if the introners are initially harmful, the researchers hypothesize that selective pressures could soon tame them by cutting them out of RNA.
Although horizontal gene transfer and introners share a connection to the aquatic environment, the findings don’t yet show definitively that this is where introners come from. But the discovery of introners’ widespread influence does challenge some theories about how genomes — particularly eukaryotic genomes — have evolved.
Reverberations in the Host
The pervasiveness of recent intron gain may act as a counterweight to some ideas about the evolution of genomic complexity. One example involves a theory of intron evolution developed by Michael Lynch of Arizona State University in 2002. Models suggest that in species with small breeding populations, natural selection can be less efficient at removing unhelpful genes. Lynch proposed that those species will therefore tend to build up heaps of nonfunctional genetic junk in their genomes. In contrast, species with very large breeding populations should not be gaining many introns at all.
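The link between population size and the efficiency of selection can be made concrete with a standard population-genetics formula, Kimura's diffusion approximation for the chance that a new mutation eventually spreads to the whole population. The sketch below is not drawn from Lynch's paper or from the introner study. It assumes a diploid population whose effective size equals its head count, and the selection coefficient used in the examples is an arbitrary illustrative value.

```python
import math


def fixation_probability(pop_size, s):
    """Kimura's approximation for a single new mutation in a diploid population.

    `s` is the selection coefficient: negative means mildly harmful,
    zero means neutral, positive means beneficial.
    """
    if s == 0:
        return 1.0 / (2 * pop_size)
    if s > 0:
        return math.expm1(-2 * s) / math.expm1(-4 * pop_size * s)
    # For s < 0, use an algebraically equivalent form that cannot overflow.
    x = math.exp(4 * pop_size * s)
    return x * math.expm1(-2 * s) / (1 - x)


def relative_to_neutral(pop_size, s):
    """Fixation probability divided by that of a neutral mutation, 1 / (2N)."""
    return fixation_probability(pop_size, s) * 2 * pop_size


# Arbitrary, mildly harmful insertion in a small and in a large population.
small = relative_to_neutral(1_000, -1e-5)
large = relative_to_neutral(1_000_000, -1e-5)
```

With these illustrative numbers, `small` stays close to 1, meaning the harmful insertion drifts to fixation almost as easily as a neutral one, while `large` is vanishingly small because selection purges it.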
But Gozashti, Corbett-Detig and their coauthors found the opposite. Some marine protists with gargantuan breeding populations had hundreds or thousands of introners. In contrast, introners were rare in animals and absent in land plants — both groups with much smaller breeding populations.
The evolutionary arms race between invading genetic elements and the host may have a hand in generating a more complicated genome. The parasitic elements are in “constant conflict” with genetic elements that belong to the host, Gozashti explained, because they compete for genomic space. “All these moving pieces are constantly driving each other to evolve,” he said.
That raises the question of what the intron gains meant for the functional biology of the organisms in which they occurred.
Cedric Feschotte, a molecular biologist at Cornell University, suggests it would be interesting to compare two closely related species, only one of which has experienced an intron swarm in recent evolutionary history. The comparison might help to reveal how influxes of introns could promote the appearance of new genes. “Because we know that bringing in introns can also facilitate the capture of additional exons, so completely new stuff,” he said.
Similarly, Feschotte thinks that profusions of introns might help drive the evolution of families of genes that can change rapidly. Stuffed with new introns, those genes could co-opt the new variability enabled by alternative splicing.
Such rapidly evolving genes are widespread in nature. Venomous species, for instance, often need to remix the complex cocktails of peptides in their venoms at the genetic level to adapt to different prey or predators. The ability of the immune system to generate endlessly diverse molecular receptors also depends on genes that can rearrange and recombine quickly.
Peona warns, however, that although introners could provide benefits to an organism, they might also be totally neutral. They should be considered “innocent until proven guilty of function or anything else.”
“One of the things that’s next is looking at metagenomic data to try to find a case that really is a clear horizontal transfer with the exact same introners in two different species,” Corbett-Detig said. Finding this piece of the puzzle would help flesh out the full story of where most of eukaryotes’ introns have come from.
Irina Arkhipova, a molecular evolutionary geneticist at the University of Chicago Marine Biological Laboratory, is interested in knowing more about how introners are spreading through the genome at such large scales. “It just leaves no trace of the enzyme that was responsible for this massive burst of mobility — that’s a mystery,” she said. “You basically have to catch it in the act while it’s still moving.”
For Gozashti, the discovery of introners in such a wide range of eukaryotes holds a lesson about how to approach fundamental questions about the nature of eukaryotic life: Think broadly. Studies often focus on the sliver of biodiversity represented by animals and land plants. But to understand the important patterns of genomic information underlying all life, “we need to sequence more eukaryotic diversity, more of these protist lineages where we don’t know anything about how they evolve,” he said. “Had we just studied land plants and animals, we never would have found introners.”
Editor’s note: Gozashti is a graduate student in the laboratory of Hopi Hoekstra, who serves on the advisory board for Quanta.