At Tiny Scales, a Giant Burst on Tree of Life

A new technique for finding and characterizing microbes has boosted the number of known bacteria by almost 50 percent, revealing a hidden world all around us.

Travis Bedel for Quanta Magazine

It used to be that to find new forms of life, all you had to do was take a walk in the woods. Now it’s not so simple. The most conspicuous organisms have long since been cataloged and fixed on the tree of life, and the ones that remain undiscovered don’t give themselves up easily. You could spend all day by the same watering hole with the best scientific instruments and come up with nothing.

Maybe it’s not surprising, then, that when discoveries do occur, they sometimes come in torrents. Find a different way of looking, and novel forms of life appear everywhere.

A team of microbiologists based at the University of California, Berkeley, recently figured out one such new way of detecting life. At a stroke, their work expanded the number of known types — or phyla — of bacteria by nearly 50 percent, a dramatic change that indicates just how many forms of life on earth have escaped our notice so far.

“Some of the branches in the tree of life had been noted before,” said Chris Brown, a student in the lab of Jill Banfield and lead author of the paper. “With this study we were able to fill in many gaps.”

Life’s Finest Net

As an organizational tool, the tree of life has been around for a long time. Lamarck had his version. Darwin had another. The basic structure of the current tree goes back 40 years to the microbiologist Carl Woese, who divided life into three domains: eukaryotes, which include all plants and animals; bacteria; and archaea, single-celled microorganisms with their own distinct features. After a point, discovery came to hinge on finding new ways of searching.

“We used to think there were just plants and animals,” said Edward Rubin, director of the U.S. Department of Energy’s Joint Genome Institute. “Then we got microscopes, and got microbes. Then we got small levels of DNA sequencing.”

Courtesy of Jill Banfield

Jill Banfield and collaborators at the University of California, Berkeley, have discovered new groups of very small bacteria, expanding the tree of life.

DNA sequencing is at the heart of this current study, though the researchers’ success also owes a debt to more basic technology. The team gathered water samples from a research site on the Colorado River near the town of Rifle, Colo. Before doing any sequencing, they passed the water through a pair of increasingly fine filters — with pores 0.2 and 0.1 microns wide — and then analyzed the cells captured by the filters. At this point they already had undiscovered life on their hands, for the simple reason that scientists had not thought to look on such a tiny scale before.

“Most people assumed that bacteria were bigger, and most bacteria are bigger,” Rubin said. “[Banfield] has shown that there are whole populations that are very small.”

The researchers extracted the DNA from the cellular material and sent it to the Joint Genome Institute for sequencing. What they got back was a mess. Imagine being handed a box of pieces from thousands of different jigsaw puzzles and having to assemble them without knowing what any of the final images look like. That’s the challenge researchers face when performing metagenomic analysis — sequencing scrambled genetic material from many organisms at once.

The Berkeley team began the reassembly process with algorithms that assembled bits of the sequenced genetic code into slightly longer strings called contigs.
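To make the idea of a contig concrete, here is a minimal sketch of one textbook approach, greedy overlap merging. It is not the software the Berkeley team used, which the article does not name. The function names, the `min_len` cutoff and the toy reads are all assumptions made for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap.

    Stops when no pair overlaps by at least `min_len` bases. The
    sequences that remain play the role of contigs.
    """
    seqs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return seqs
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)


# Three made-up reads that tile one short stretch of sequence.
contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"])
```

Real assemblers must also cope with sequencing errors, repeated regions and millions of reads, none of which this toy handles.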

“You no longer have tiny pieces of DNA, you have bigger pieces,” Brown said. “Then you figure out which of these larger pieces are part of a single genome.”

This part of the process, in which contigs are combined to reconstruct the genome sequence, is called genome binning. To execute it, the researchers relied on another set of algorithms, customized for the task by Itai Sharon, a co-author of the study. They also assembled some of the genomes manually, making decisions about what goes where based on the fact that some characteristics are consistent for a given genome. For example, the percentage of Gs and Cs will be similar on any part of an organism’s DNA.
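The G-and-C clue can be illustrated with a short sketch. It assumes, purely for illustration, that contigs are grouped on GC fraction alone and that a gap of more than `max_gap` between neighboring values marks a different genome. Both the function names and that cutoff are hypothetical, and this is not Sharon's customized algorithm.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else 0.0


def bin_by_gc(contigs: dict[str, str], max_gap: float = 0.15) -> list[list[str]]:
    """Sort contigs by GC fraction and start a new bin at each large gap."""
    ranked = sorted(contigs, key=lambda name: gc_fraction(contigs[name]))
    bins: list[list[str]] = []
    previous = None
    for name in ranked:
        gc = gc_fraction(contigs[name])
        if previous is None or gc - previous > max_gap:
            bins.append([])  # GC jumped: treat as a different genome
        bins[-1].append(name)
        previous = gc
    return bins
```

GC content alone would not separate two genomes that happen to share a similar composition, so this is best read as one signal among several.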

When the assembly was complete, the researchers had eight full bacterial genomes and 789 draft genomes that were roughly 90 percent complete. Some of the organisms had been glimpsed before; many others were completely new.

The reason no one had found these organisms before is that the traditional method used to search for small forms of life doesn’t work for everything. That method involves the 16S rRNA gene, which is often compared to a fingerprint because the genetic code it contains is unique for every organism. When confronted with a DNA stew, like the one from the water samples in Rifle, scientists use substances called primers to draw out and amplify all the 16S rRNA genes. The problem is, not all 16S rRNA genes react with the primers, rendering some organisms effectively invisible.
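A rough sketch shows why a primer can miss a gene. It assumes a simplified model in which a primer "works" only if some stretch of the target differs from it by at most `max_mismatches` bases. The sequences, function names and cutoff are invented for illustration, and real primer binding depends on chemistry this ignores.

```python
def best_site(primer: str, template: str) -> tuple[int, int]:
    """Return (mismatches, position) for the best-matching window."""
    n = len(primer)
    if len(template) < n:
        raise ValueError("template is shorter than the primer")
    return min(
        (sum(p != t for p, t in zip(primer, template[i:i + n])), i)
        for i in range(len(template) - n + 1)
    )


def is_amplified(primer: str, template: str, max_mismatches: int = 1) -> bool:
    """True if the primer matches some window closely enough to bind."""
    mismatches, _ = best_site(primer, template)
    return mismatches <= max_mismatches


primer = "GATTACA"                 # made-up primer
typical_gene = "CCCGATTACAGGG"     # contains an exact match
divergent_gene = "CCCGTTAACTGGG"   # same region, three bases changed
```

Under this model `typical_gene` is picked up and `divergent_gene` is not, even though both carry the gene the primer was designed to find.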

“The primers don’t work as well as people would like them to,” Brown said. “We showed that many of the sequences we reconstructed would have been missed by the traditional 16S amplification-type method.”

By reconstructing complete or nearly complete genomes, Brown and his collaborators were able to locate 16S rRNA genes and identify organisms without relying on primers. The group published their results in the July 9 issue of Nature.


Some of the newly discovered bacteria have hairlike structures on their surface.

The fuller genomic picture they created also allowed them to tease out traits of the life forms they’d discovered. All the organisms they found have very short genomes, about one million base pairs (compare that to E. coli, which has about five million), and they all have minimal metabolic function, requiring them to use fermentation to generate energy. They are also missing many basic biosynthetic pathways and need help making nucleotides and amino acids.

“They must be dependent on other organisms in some capacity to survive. This also explains why no one’s been able to grow them in the lab,” Brown said.

A New Domain?

The discovery of new organisms is fairly cut and dried: Either you’ve found one or you haven’t. Cataloging organisms, fitting them into the tree of life, involves more judgment calls.

The researchers divided the 789 organisms into 35 phyla — 28 of which were newly discovered — within the domain bacteria. They based the sorting on the organisms’ evolutionary history and on similarities in the code on the organisms’ 16S rRNA genes — those with at least 75 percent of their code in common went into the same phylum.
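The 75 percent rule can be sketched as a simple grouping pass. The sketch assumes the 16S sequences are already aligned to equal length and compares each one only with the first member of each existing group. It leaves out the evolutionary-history analysis the researchers also relied on, and the function names and toy sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions at which two aligned sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_by_identity(seqs: dict[str, str], threshold: float = 0.75) -> list[list[str]]:
    """Place each sequence in the first group whose founder it resembles."""
    groups: list[list[str]] = []
    for name, seq in seqs.items():
        for group in groups:
            founder = seqs[group[0]]
            if identity(seq, founder) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # nothing close enough: new group
    return groups
```

The result of a greedy pass like this can depend on the order in which sequences arrive, which is one reason real classification leans on phylogenetic trees as well.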

With these new additions, there are now roughly 90 identified bacterial phyla. This is a lot more than there were a year ago, but also far fewer than the 1,300 to 1,500 phyla that microbiologists estimate we’ll have once a complete accounting is finished. Recent advances in genetic sequencing and genome binning make Brown and Banfield optimistic, though, that it won’t be long before we’ve mapped them all.

“I think that much of the tree of life will come into view in the next few years,” Banfield wrote in an email.

Of course, no sooner do we think we’ve seen everything than we come up with a new way to see. Rubin thinks that the development of tools like the ones used in the new study makes the search for life “a growth industry,” and he thinks it’s likely that growth will occur in surprising ways.

“Looking at things from a different angle may offer that possibility of a fourth domain,” he said — an equal partner to bacteria, archaea and eukaryotes. “There will always be novel stuff that will teach us foundational info about how life operates.”


Reader Comments

  • It seems that, given enough time in conditions that support life, evolution will exploit every ecological niche.

  • "90 identified bacterial phyla […] far fewer than the 1,300 to 1,500 phyla that microbiologists estimate we’ll have once a complete accounting is finished"

    Would a knowledgeable reader please explain to me how biologists can extrapolate by more than one order of magnitude how many phyla really exist? Is it pure guess, or based on genetic analysis, or something else? Is it like oil reserve estimates: the number keeps growing with time 🙂


  • Hi Christoper-

    I wrote the story and I'm glad you asked this question. Microbiologists analyze what's called a rarefaction curve to estimate the total number of phyla out there based on the pace at which new phyla are being discovered. As the pace of discovery slows, the curve levels off, and you gain statistical confidence that you're closing in on the actual number.


  • Hi Kevin, wouldn't they have to reconsider the total number of phyla estimated to be in existence due to this newly discovered method of detecting bacteria species? I would think that the pace of discovery is about to change quite a bit as they refine this new discovery method.

  • Hi Kevin. Good article, but I believe you may have misstated the method of capture. The organisms were isolated not because they were captured or retained by 0.2 um filters, but because they passed through them. However, I did not access the full article and only read the abstract. But if you could correct OR elaborate on that statement, it would be helpful, as it's in conflict with what's stated in the abstract.

  • Hmm… they found things that were smaller than they expected? This reminds me of ALH84001

  • Hi Tara-

    Thanks for your question about the methods the researchers used to collect the cellular material. The researchers examined cells that remained on the 0.2 and 0.1 micron filters. They concentrated their analysis, however, on the DNA from the cells that made it through the 0.2 micron filter and then collected on the 0.1 micron filter. They did this because they knew that the lifeforms they were looking for were very small and likely to have passed through the first filter. This is the relevant section from the paper:

    "We sampled microbial communities from an aquifer adjacent to the Colorado River near the town of Rifle, Colorado, USA in 2011. Groundwater was filtered through a 1.2 μm pre-filter and cells were collected on serial 0.2 and 0.1 μm filters (Extended Data Fig. 1). Post-0.2 μm filtrates were targeted because CPR bacteria were predicted to have ultra-small cells on the basis of their small genomes."


  • Great article left me, a non-biologist, with two immediate questions:
    1) Is there a reason to believe that there are no smaller bacteria?
    2) How large are viruses?

  • The "binning" of the different species was possible based on knowledge that each organism's genome exists in a certain context – its base composition 'accent.' Of the various combinations of the 4 individual bases, Chargaff (1950) showed that (G+C)% tended to be constant for a species. Later, it was found that combinations of certain contiguous bases (dinucleotides, trinucleotides, tetranucleotides, etc. ) also provided a context for a species. While those interested in classification have elegantly employed context differences to expand our appreciation of the diversity of living forms (phylogenetics), it is of much greater interest to ask whether these context differences were secondary to the more obvious functional differences that distinguish species, or whether the context differences (genotype differences) were themselves primary in the speciation process. In this way they would have provided opportunities for the obvious functional differences (phenotype differences) to emerge. Perhaps Kevin Hartnett will address this in a future essay.

  • Bravo to the team. Oftentimes discoveries are made when people look at things from a different perspective and dare to question the accepted status.

  • Len B says: …2)How large are viruses?
    Read this: Giant Viruses
    The recent discovery of really, really big viruses is changing views about the nature of viruses and the history of life

  • Is it possible that viruses are not as separate from "real" organisms as is commonly thought, that there is a continuum of "organisms" that bridge the gaps between bacteria and viruses and/or between archaea and viruses?

  • This technique may be new to you, but it's certainly not new to anyone who has worked with Norm Pace or Carl Woese. Universal and domain-specific 16S rRNA primers have been in use for this very purpose since the early 1990s and the advent of thermal cycle sequencing.

  • I've read the Nature paper twice and it's still not clear to me whether CPR are monophyletic to the exclusion of other bacteria, or not. I would think not, but then, for example, the Fig. 1 caption reads "..showing the CPR, a monophyletic radiation of candidate phyla". What does that mean?
