The syn3.0 cells contain the minimum number of genes needed for life.

Tom Deerinck and Mark Ellisman

Peel away the layers of a house — the plastered walls, the slate roof, the hardwood floors — and you’re left with a frame, the skeletal form that makes up the core of any structure. Can we do the same with life? Can scientists pare down the layers of complexity to reveal the essence of life, the foundation on which biology is built?

That’s what Craig Venter and his collaborators have attempted to do in a new study published today in the journal Science. Venter’s team painstakingly whittled down the genome of Mycoplasma mycoides, a bacterium that lives in cattle, to reveal a bare-bones set of genetic instructions capable of making life. The result is a tiny organism named syn3.0 that contains just 473 genes. (By comparison, E. coli has about 4,000 to 5,000 genes, and humans have roughly 20,000.)

Yet within those 473 genes lies a gaping hole. Scientists have little idea what roughly a third of them do. Rather than illuminating the essential components of life, syn3.0 has revealed how much we have left to learn about the very basics of biology.

“To me, the most interesting thing is what it tells us about what we don’t know,” said Jack Szostak, a biochemist at Harvard University who was not involved in the study. “So many genes of unknown function seem to be essential.”

“We were totally surprised and shocked,” said Venter, a biologist who heads the J. Craig Venter Institute in La Jolla, Calif., and Rockville, Md., and is most famous for his role in mapping the human genome. The researchers had expected some number of unknown genes in the mix, perhaps totaling five to 10 percent of the genome. “But this is truly a stunning number,” he said.

The seed for Venter’s quest was planted in 1995, when his team deciphered the genome of Mycoplasma genitalium, a microbe that lives in the human urinary tract. When Venter’s researchers started work on this new project, they chose M. genitalium — the second complete bacterial genome to be sequenced — expressly for its diminutive genome size. With 517 genes and 580,000 DNA letters, it has one of the smallest known genomes in a self-replicating organism. (Some symbiotic microbes can survive with just 100-odd genes, but they rely on resources from their host to survive.)

M. genitalium’s trim package of DNA raised the question: What is the smallest number of genes a cell could possess? “We wanted to know the basic gene components of life,” Venter said. “It seemed like a great idea 20 years ago — we had no idea it would be a 20-year process to get here.”


Clyde A. Hutchison, a biologist at JCVI who led the new study, has been researching mycoplasma bacteria as models for the minimal cell since 1990.

Minimal Design

Venter and his collaborators originally set out to design a stripped-down genome based on what scientists knew about biology. They would start with genes involved in the most critical processes of the cell, such as copying and translating DNA, and build from there.

But before they could create this streamlined version of life, the researchers had to figure out how to design and build genomes from scratch. Rather than editing DNA in a living organism, as most researchers did, they wanted to exert greater control — to plan their genome on a computer and then synthesize the DNA in test tubes.

In 2008, Venter and his collaborator Hamilton Smith created the first synthetic bacterial genome by building a modified version of M. genitalium’s DNA. Then in 2010 they made the first self-replicating synthetic organism, manufacturing a version of M. mycoides’ genome and then transplanting it into a different Mycoplasma species. The synthetic genome took over the cell, replacing the native operating system with a human-made version. The synthetic M. mycoides genome was mostly identical to the natural version, save for a few genetic watermarks — researchers added their names and a few famous quotes, including a slightly garbled version of Richard Feynman’s assertion, “What I cannot create, I do not understand.”

With the right tools finally in hand, the researchers designed a set of genetic blueprints for their minimal cell and then tried to build them. Yet “not one design worked,” Venter said. He saw their repeated failures as a rebuke for their hubris. Does modern science have sufficient knowledge of basic biological principles to build a cell? “The answer was a resounding no,” he said.

So the team took a different and more labor-intensive tack, replacing the design approach with trial and error. They disrupted M. mycoides’ genes, determining which were essential for the bacteria to survive. They erased the extraneous genes to create syn3.0, which has a smaller genome than any independently replicating organism discovered on Earth to date.
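The trial-and-error idea described above can be pictured with a toy sketch. This is not the team's actual protocol: the gene names, the `toy_viable` rule and the one-at-a-time deletion order are all hypothetical, chosen only to show how a genome can be whittled down by testing whether the cell still survives without each gene.

```python
from typing import Callable, FrozenSet, Iterable


def minimize_genome(
    genes: Iterable[str],
    is_viable: Callable[[FrozenSet[str]], bool],
) -> FrozenSet[str]:
    """Greedily drop genes one at a time, keeping a deletion only if the cell stays viable."""
    current = frozenset(genes)
    if not is_viable(current):
        raise ValueError("starting genome must be viable")
    # Try each gene in a fixed (alphabetical) order so the result is reproducible.
    for gene in sorted(current):
        candidate = current - {gene}
        if is_viable(candidate):
            current = candidate  # the gene was dispensable, so leave it out
    return current


# Hypothetical viability rule: the toy cell needs "rep" and "mem", plus at
# least one gene from the redundant pair "pathA"/"pathB".
def toy_viable(genome: FrozenSet[str]) -> bool:
    return {"rep", "mem"} <= genome and bool({"pathA", "pathB"} & genome)


toy_genome = ["rep", "mem", "pathA", "pathB", "junk1", "junk2"]
minimal = minimize_genome(toy_genome, toy_viable)
print(sorted(minimal))  # ['mem', 'pathB', 'rep']
```

In this toy, either gene of the redundant pair can be deleted but not both, so which one survives depends on the order in which deletions are tried. A minimal set found this way is one workable answer, not necessarily the only one.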

What’s left after trimming the genetic fat? The majority of the remaining genes are involved in one of three functions: producing RNA and proteins, preserving the fidelity of genetic information, or creating the cell membrane. Genes for editing DNA were largely expendable.

But it is unclear what the remaining 149 genes do. Scientists can broadly classify 70 of them based on the genes’ structure, but the researchers have little idea of what precise role the genes play in the cell. The function of 79 genes is a complete mystery. “We don’t know what they provide or why they are essential for life — maybe they are doing something more subtle, something obviously not appreciated yet in biology,” Venter said. “It’s a very humbling set of experiments.”
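As a quick check on the figures quoted above, the short calculation below adds the two groups of poorly understood genes and expresses them as a share of the 473-gene genome. The variable names are illustrative; the numbers come from the article.

```python
total_genes = 473          # genes in syn3.0
broadly_classified = 70    # classifiable by structure, precise role unknown
complete_mystery = 79      # function entirely unknown

unknown = broadly_classified + complete_mystery
unknown_percent = round(100 * unknown / total_genes, 1)

print(unknown)          # 149
print(unknown_percent)  # 31.5
```

The result, about 31.5 percent, is the "roughly a third" cited earlier in the article.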

Synthetic Biology

Venter envisions syn3.0 as a cellular chassis that scientists can build on. Researchers can embellish the genome to create new organisms, which could help them to better understand stages of evolution lost to time. “In theory, we should be able to add genes back to [syn3.0] to recapitulate key parts of evolution,” Venter said. For example, they might try to create more advanced bacteria, or even to convert the basic chassis into different biological classes altogether. “We could reduce billions of years of evolution to maybe years or months or weeks,” he said.

Venter and his collaborators also plan to use the cells for industrial purposes, designing cells that can produce pharmaceuticals or other chemicals. “We have one cell in production to make omega-3s more efficiently than it can be isolated from fish,” Venter said.

One of the challenges in synthetic biology — the quest to engineer cells for specific purposes — has been that living organisms behave unpredictably. Theoretically, a minimal cell would provide an engineering advantage because it has fewer unpredictable components. It’s not yet clear whether this will prove true. Most efforts in synthetic biology employ existing microbes, such as E. coli, and scientists may not yet see a good reason to switch.

Venter’s team is eager to figure out what the mystery genes do, but the challenge is multiplied by the fact that these genes don’t resemble any other known genes. One way to investigate their function is to engineer versions of the cell in which each of these genes can be turned on and off. When they’re off, “what’s the first thing to get messed up?” Szostak said. “You can try to pin it to general class, like metabolism or DNA replication.”

Dwindling to Zero

Venter is careful to avoid calling syn3.0 a universal minimal cell. If he had done the same set of experiments with a different microbe, he points out, he would have ended up with a different set of genes.

In fact, there’s no single set of genes that all living things need in order to exist. When scientists first began searching for such a thing 20 years ago, they hoped that simply comparing the genome sequences from a bunch of different species would reveal an essential core shared by all species. But as the number of genome sequences blossomed, that essential core disappeared. In 2010, David Ussery, a biologist at Oak Ridge National Laboratory in Tennessee, and his collaborators compared 1,000 genomes. They found that not a single gene is shared across all of life. “There are different ways to have a core set of instructions,” Szostak said.

Moreover, what’s essential in biology depends largely on an organism’s environment. For example, imagine a microbe that lives in the presence of a toxin, such as an antibiotic. A gene that can break down the toxin would be essential for a microbe in that environment. But remove the toxin, and that gene is no longer essential.

Venter’s minimal cell is a product not just of its environment, but of the entirety of the history of life on Earth. Sometime in biology’s 4-billion-year record, cells much simpler than this one must have existed. “We didn’t go from nothing to a cell with 400 genes,” Szostak said. He and others are trying to make more basic life-forms that are representative of these earlier stages of evolution.

Some scientists say that this type of bottom-up approach is necessary in order to truly understand life’s essence. “If we are ever to understand even the simplest living organism, we have to be able to design and synthesize one from scratch,” said Anthony Forster, a biologist at Uppsala University in Sweden. “We are still far from this goal.”

This article was reprinted on

Reader Comments

  • The article indicates that a microbe needs certain genes to survive in a given environment. So the apparently redundant genes may have had a function at one time in a historically different world.
    To decode this it seems necessary to identify the whole sequence of environmental challenges that an organism has survived, in order to recognize what each remaining gene did and when it was used.
    An assumption of this mechanism is that genes can react to new external challenges and adapt to meet them. If this is a chance process of evolution there must be many failed cells and organisms in the tree of life.
    Conversely, if genes in a cell can be aged and dated, it might reveal a chronology of the earth's environment since life began. Logically this sequence should look the same for any genome used as a starting point.

  • Rather than thinking in terms of the function of each gene, per se, we should view it in terms of the entire gene regulatory and protein networks; i.e., it's the network, stupid.

  • One hypothesis that could account for the fact that Venter et al. failed to both synthesise the genome, and identify any common defining characteristic of the genome, is that the natural law/s governing the behaviour of any such genome may obey an algorithmically verifiable, but not algorithmically computable, logic such as that which is considered in Thesis 1 on p.42 of the following paper just published in 'Cognitive Systems Research':

    'The Truth Assignments That Differentiate Human Reasoning From Mechanistic Reasoning: The Evidence-Based Argument for Lucas' Goedelian Thesis'

    The paper is freely downloadable till 8th May 2016 from:

  • Venter et al. are at the forefront of synthetic biology. This latest work is an absolutely impressive achievement. We are getting closer to the goal of understanding the origin of life by creating it from scratch; however, as Anthony Forster is quoted as saying in the article, “We are still far from this goal.”

  • Reading this makes you realize how far science still has to go in understanding life. It is truly humbling. It is very interesting to think about the possibility of common genes in all life but as complicated as life is I hope they did not expect to find too much information.

  • I wonder if the cell would still be viable if the microbiologists had shuffled all of the genes randomly. Is the order of the genes important? From contemporary microbiology one would guess no, but what if the gene structure itself is somehow important?

    Then begin deleting those mystery genes one by one, and see if it still grows.

  • They should be using other small microorganisms as well, whittling each cell down to the smallest genome that permits reproduction, and then comparing the genes of the different reproduction-capable microbes.
    A mix-and-match experiment using known genes from all the microbes should be attempted to create a new microbe capable of reproduction. Failures will then require the addition of unknown genes, which would help determine their function. The variables are far too many for a design experiment.

  • We should be careful, as scientists could also create a Frankenstein bacterium by mistake! Such a synthetic organism could create chaos, like they show in Hollywood movies.

  • Excerpt: "They found that not a single gene is shared across all of life. “There are different ways to have a core set of instructions,” Szostak said."

    Gene expression is nutrient-dependent and controlled by the physiology of reproduction via transgenerational epigenetic inheritance of the innate immune system and the conserved molecular mechanisms of RNA-mediated cell type differentiation that link atoms to ecosystems in all living genera.

    Schrodinger put this into perspective in 1944.

    Excerpt: "…in the case of higher animals we know the kind of orderliness they feed upon well enough, viz. the extremely well-ordered state of matter in more or less complicated organic compounds, which serve them as foodstuffs. After utilizing it they return it in a very much degraded form - not entirely degraded, however, for plants can still make use of it. (These, of course, have their most powerful supply of ‘negative entropy’ in the sunlight.)"

    Szostak's group came close in 2015 with publication of "Thermodynamic insights into 2-thiouridine-enhanced RNA hybridization"

    Unfortunately, we cannot expect anyone who cannot link hydrogen-atom transfer in DNA base pairs in solution from physics and chemistry to the conserved molecular mechanisms that link biophysically constrained nutrient-dependent RNA-mediated amino acid substitutions to supercoiled DNA and cell type differentiation in all living genera. Specialists don't know that the supercoiled DNA is protecting organized genomes from virus-driven entropy.

    Venter's group has already alluded to that fact with claims that their synthetic organism cannot adapt to any change in the epigenetic landscape, which is what all living organisms must do. The de novo creation of nucleic acids linked the anti-entropic energy of the sun to the creation of all biodiversity, in the context of Einstein's claims, Schrodinger's claims, and Dobzhansky's claims. But after they bastardized Darwin's claims, the neo-Darwinists proceeded to remove his "conditions of life" and substitute de Vries's definition of mutation for the virucidal anti-entropic energy force of the sun.

  • —–"“We don’t know what they provide or why they are essential for life — maybe they are doing something more subtle, something obviously not appreciated yet in biology,” Venter said. “It’s a very humbling set of experiments.”———

    In order to figure out what these "mystery genes" do, it might be necessary to consider different methods. Turning a gene off and on might not be the only option, I think.

    The problem is that the object of study, the cell, is very small. Volumes are small as well. Experimenting with the volumes of specific things in the cell might also be necessary in order to figure things out.

    Maybe a gene's function first becomes visible when different gene expression products reach a specific volume.

    Imagine, for example, a specific gene whose function first becomes visible when the expression of another gene reaches a specific volume, or when several genes in combination reach a specific volume!

  • Thoughtful and informative research, and so a great article.
    Without a reference, I'll refer to work done on identifying chemical 'miracles' in which it is shown hydrogen sulfide may indeed arise out of a pure hydrogen environment.
    Given proton tunnelling (as a separate but connective idea) and so called 'green life', the spontaneous chemical critical mass of origin, perhaps not only the complexity of the network or algorithm defining its coordination is the only factor. Perhaps folding expression directs proton tunnelling given various substituent alchemies which mirror known ignitions. The cycle of such chemistries, for this layman, could be innumerable in the context of those amino acids any given life coordinates expression by, and the environment in which leavings are also found. Evolutionary paths, consequently, as discussed would be dense making historical atrophies of core sequences (unused portions) binding sites for diseases (bacteria, viruses, symbionts, incomplete sequences (or garbage), etc.). Fun to riff on. Enjoyed the comments, too.

  • Re: Anand's mathematical logic reference: I seriously doubt that issues relating to decidability, limits of what can be proved etc. have any direct bearing here. Now, having said that, I would need to read the paper to see how and why they deduced that those extra genes, while unknown in function, were required to make a minimal, complete genome.

  • "Peel away the layers of a house — the plastered walls, the slate roof, the hardwood floors — and you’re left with a frame, the skeletal form that makes up the core of any structure. Can we do the same with life? Can scientists pare down the layers of complexity to reveal the essence of life, the foundation on which biology is built?"

    First para in, and already the author has mixed up building foundations with frames.

    "Venter’s team painstakingly whittled down the genome of Mycoplasma mycoides, a bacterium that lives in cattle, to reveal a bare-bones set of genetic instructions capable of making life. Yet within those 473 genes lies a gaping hole. Scientists have little idea what roughly a third of them do."

    So they whittled it down to its "bare bones foundation/frame", but also included 33.3% extra genes that they have no idea what they do. That's like Diet Pepsi with 33% extra sugar.

    I gave up after that, how this writer got a job is the greatest mystery I've contemplated all day.

  • Contrary to inflated claims to “synthetic life” and “created life”, the team has not created life out of nonliving components (aka abiogenesis). The experiment is just a very sophisticated version of breeding – a process that humans have used for thousands of years to develop useful organisms. The understanding of life is closer, but still far, far away.

  • —–"“We don’t know what they provide or why they are essential for life — maybe they are doing something more subtle, something obviously not appreciated yet in biology,” Venter said. “It’s a very humbling set of experiments.”———

    The mystery genes might provide a function/characteristic in the cell which is not that visible in the first place, like logic and other rules are in language! Logic is most of the time only indirectly visible! (To some it is never visible at all.)

    So, therefore in order to view the mystery genes maybe it has to be searched in all directions.

    – communication with other living cells
    – communication with dying other cells
    – communication with a specific amount of other cells which surround the cell you are observing.
    – a theoretical hexagonal cell might activate some mystery genes first if all 6 faces are in contact with another cell
    – speaking of some complex multi-cell function
    – functions/characteristics which are visible first if a specific volume of another gene result appears in nearby cells
    – functions/characteristics which are visible first if specific contact cells reach specific volumes
    – genes might also not just be responsible for communication but also for receiving information
    – going into distance and physics here, I am thinking of entanglement (genes might respond to things far, far away from the cell, or the cell might affect something far away)
    – another bit of physics, gravitational waves: what if specific genes are activated when exposed to certain high frequencies, etc.
    – genes activated when the cell is integrated into a more complex structure (e.g. a theoretical hexagonal cell surrounded by 6 cells, and those 6 cells surrounded by more)
    – see if genes are activated when they are integrated into a complex structure (see line above) and then subjected to physical quantities like pressure, etc.

    So my idea goes basically into the direction to see if the mystery genes might fulfill a function/characteristic if the cell acts with other cells in a unit, the cell units in a higher order and so on…

  • But is this organism sustainable? Life is a difficult moving thing to define.

    Genes themselves do not have a direct interpretation. With introns, exons, and sheer circumstance the entire system moves. Transcription and translation tile away, but interpretation becomes an interaction between the organism, the environment, and other organisms. A small part of the beauty is that there may not exist definitive interpretations as the whole system moves. There may exist statements about specific intervals, but holonic folding and unfolding does not have to appeal to human computation, merely finding a way.

    Divergences of vast swaths of data are not so easily caught, let alone grasped. Falsifiable theses must be set forth. Perhaps these extra genes were evolutionary driving forces with protective experiential feedbacks on previous failed pinchings of base pairs. Maybe it was just junk from the momentum to get to this moment. Maybe both and neither. It will be alright.

  • I agree with one comment made by Saurabh Jain that this is DANGEROUS RESEARCH.
    It should be performed by specially qualified individuals in a secure biohazard facility like the CDC. These scientists admit to creating a "life form" about which they know little, especially whether it could cause a pandemic.

  • Are JCVI-syn 3.0 cells senescent? The journal article claims they replicate slower. Do they also live longer?

  • @J Vanderwerff

    "So they whittled it down to its "bare bones foundation/frame", but also included 33.3% extra genes that they have no idea what they do."

    Yes, because the organism ceased to function if they removed any of them. These genes clearly serve an essential function, we just don't know what it is. If we knew what they did, it might be possible to create a smaller set of synthetic genes that serve the same purpose, thereby creating an even more bare bones frame – but for now this is the best we can do.

    All of that was obvious from the article. So instead of insulting the author, maybe you should work on your reading comprehension?

  • I had the same question as J. Vanderwerff: if the function of those 149 genes is unknown, a complete mystery, then how can it be known for sure that they're essential to this bacterium?

    My best guess, as a layman, is that this genome with 473 genes is the result of experimentation with various genomes, both larger and smaller sets of genes. Thus, this Venter group would have arrived at this current minimum genome after trying to use genomes that didn't include some or all of those 149 mysterious genes in various combinations, and finding that the bacterium doesn't survive or reproduce without them. Thus, they don't have to know the exact function of the gene, because they know the bacterium can't function without it.

    What happens in every movie where a non-expert needs to disable a car to prevent the bad guy's escape? They pull out the distributor cap or the plug wires, or they put sugar in the gas tank. Surely many in the audience would struggle to explain what the distributor cap does (especially now that most street cars don't have one anymore.) But that doesn't prevent them from knowing that the car starts when the cap is installed and sits dead when it's removed.

    Now, if I'm way off . . . somebody tell me so, OK? I'd do it for you as far as you know.

  • I feel like I should be designing a virus-proof bunker right now. This seems like the perfect organism for a mutation to turn it into an epidemic. Bit worried, but still very impressed and excited for the possibilities.

  • As Kenneth Rubenstein says: "Rather than thinking in terms of the function of each gene, per se, we should view it (in) terms of the entire gene regulatory and protein networks; i.e., it's the network, stupid."

    Yes, it's the foundation and structure that enable the function — and are where the universal models for life are, as 'tweens (in-betweens). We're still looking at the "effect" rather than the "cause". All creation is a self-similarity of human meiosis and mitosis, and of oogenesis, spermatogenesis, conception, gestation and birth. It's as simple and as complicated as that. Skipping steps will only create Frankensteins. (Ditto the attempts at creating AI.)

  • So, now you don't understand Life?

    It's the ability to transform and use energy for the unit's own purpose.

  • “What I cannot create, I do not understand.”

    And it would seem what they can create, they also do not understand. I believe that implies they're batting 1000, which leaves faint praise coupled with an all consuming fear as the sole alternative. What may we expect they'll create next, that they don't understand?

  • The 149 genes of unknown function comprise a "to do" list for researchers completing the project started by Watson & Crick over half a century ago – surely this endeavor was contemplated by thinkers around campfires spanning thousands of generations. What a time to be alive.

    "Some insider information: Most of the mystery essential genes are vague cell wall proteins. Probably what is going on is that if you knock out too many of these genes, you lose cell wall turgidity and the cell becomes nonviable. So what is important is not so much which of these genes you have, but how many of them you have."

    "That hints that you could just duplicate the same cell wall protein gene instead, right?"

    "presumably. Or just put in fewer of them, behind stronger promoters."

  • Martin: It is also likely that the genes are like the stationery, envelopes and stamps – mere containers – for messages that anyone creates/uses to send to their own friends, relatives or business associates; depending on the circumstances, languages and cultural needs of the various situations. They are a convenient infrastructure like roads, telegraphs or internet. Trying to understand what the infrastructure does without the society that created it is unlikely to solve that puzzle. Genes may evolve like the infrastructure evolves from early day empires to the present day digital economy.
