“Many scientists study multiple strains of an organism,” says [The Institute for Genomic Research] TIGR President Claire Fraser. “But at TIGR, we’re now going a step further, to actually quantify how many genes are associated with a given species. How many genomes do you need to fully describe a bacterial species?”
In pursuit of that question, TIGR scientist Herve Tettelin and colleagues published a study in this week’s (September 19-23) early online edition of the Proceedings of the National Academy of Sciences (PNAS). In the study, TIGR scientists, with collaborators at Chiron Corporation, Harvard Medical School and Seattle Children’s Hospital, compared the genomic sequences of eight isolates of the same bacterial species: Streptococcus agalactiae, also known as Group B Strep (GBS), which can cause infection in newborns and immuno-compromised individuals.
Analyzing the eight GBS genomes, the researchers discovered a surprisingly continual stream of diversity. Each GBS strain contained an average of 1806 genes present in every strain (thus constituting the GBS core genome) plus 439 genes absent in one or more strains. Moreover, mathematical modeling showed that unique genes will continue to emerge, even after thousands of genomes are sequenced. The GBS pan-genome is expected to grow by an average of 33 new genes every time a new strain is sequenced.
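For readers who think in code, the core genome and pan-genome described above can be pictured as a set intersection and a set union over each strain's gene list. What follows is only a toy sketch: the strain names, gene labels, and helper functions are made up for illustration, and this is not the statistical model used in the PNAS study.

```python
# Toy gene sets for three hypothetical strains (names and contents are invented).
strains = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g6"},
}


def core_genome(strain_genes):
    """Genes present in every strain (set intersection)."""
    return set.intersection(*strain_genes.values())


def pan_genome(strain_genes):
    """Genes present in at least one strain (set union)."""
    return set.union(*strain_genes.values())


def new_genes_per_strain(gene_sets):
    """Count genes not seen before as each strain is added, in the given order."""
    seen = set()
    counts = []
    for genes in gene_sets:
        counts.append(len(genes - seen))  # genes this strain adds to the pan-genome
        seen |= genes
    return counts


core = core_genome(strains)
pan = pan_genome(strains)
dispensable = pan - core  # genes absent from one or more strains

print(sorted(core))
print(len(pan), len(dispensable))
print(new_genes_per_strain(list(strains.values())))
```

In this picture, a pan-genome that keeps growing simply means the last list never settles at zero, however many strains are added.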
Yes, John. Another really esoteric science post. But, it’s fascinating stuff. We can’t let Ray Kurzweil have all the fun.
The pan-genome is more than mere syntax. The concept has real implications for molecular biology. Many important pathogens–including those responsible for influenza, Chlamydia, and gastrointestinal infections, all under study at TIGR–contain multiple strains with specific genomes. By bringing a pan-genome perspective to the study of these organisms, scientists may better learn how new pathogens emerge and better target therapies to specific conditions. One approach is to spotlight a species’ core genome. On the flip side, scientists may eliminate a core genome, hunting instead for fringe genes that explain a specific strain’s unique activity.
TIGR researchers say the pan-genome concept also underscores the limits of traditional known genomes. Researchers often refer to a type genome to describe a given species. That singular, representative genome is often simply the strain easiest to acquire from nature or grow in the lab. Yet scientists worldwide routinely tap these known genomes in public databases to hunt for drug targets, explain ecological niches, and chart evolution. How well do these microbial genomes reflect reality?
As comparative genomics itself evolves, Fraser expects TIGR to increasingly focus on pan-genomes. Many questions remain. Although some microbial species, such as GBS, have infinite pan-genomes, others are more limited. Comparing eight independent isolates of Bacillus anthracis (the bacterium that causes anthrax), for instance, Tettelin and colleagues found that just four genomes were sufficient to characterize its pan-genome. “That raises interesting questions about rates of evolution,” notes Fraser. “We’re intrigued to learn more about the diversity within a given species, and how it happens,” she says.
My neck of the woods does have a small crowd of mathematicians and geneticists. I presume they’re discussing the heck out of this, right now.
The pan-genome idea is pretty cool, but I have to object to the 3 main points in this post that are contradictory or misleading:
1) Moreover, mathematical modeling showed that unique genes will continue to emerge, even after thousands of genomes are sequenced. The GBS pan-genome is expected to grow by an average of 33 new genes every time a new strain is sequenced.
All this is saying is that bacteria mutate A LOT, and that as more and more strains are sequenced, more and more of those mutations will be discovered.
Given that there is a new flu virus, and hence vaccine, every year, we knew this already. (The viruses behind the common cold are even worse, hence no cure for it.)
2) Researchers Predict Infinite Genomes
No, they don’t. It’s clear that any individual genome is finite in length, and even the pan-genome is finite in length at any given time. All they say is that the _pan-genome_ will continue to grow as the bacteria mutate. It’s never going to be infinite.
3) How many genomes do you need to fully describe a bacterial species?
This question is really misleading. I thought they were referring to all the junk genes that we know are in any particular genome, and they were going to start knocking these out until you get a genome that has the minimum number of genes necessary to still behave like the original species does.
The minimal set of genes that currently exist in nature for a particular species may be much larger than the minimum number of genes needed to define that species.
Anyway, it’s still a cool article and soothes some of my guilt at not reading my Scientific American.