Thursday, June 28, 2012

What is a bacterial population?

One of the most important parameters within population genetics is effective population size (Ne). Ne determines how strong genetic drift is within populations, and therefore how effectively selection can sort among genotypes. I'm not going to go through a bunch of examples of why it's such an important parameter (maybe later); suffice it to say that Ne calculations can be used to explain results such as the origin of genome complexity as well as to establish divergence times between bacteria (such as here). Slight digression: I absolutely hate when divergence times are put on bacterial lineages. Unlike many eukaryotic taxa, bacterial populations don't leave easily datable fossils, so divergence times are completely based on population genetic estimates.

Which brings me back to Ne. Effective population size is just that, a measurement of population size. The problem I have with using this parameter in microbial populations is that I (and I suspect everyone else) don't have a clue as to what constitutes a microbial population. Huge numbers always get thrown around about how many bacterial cells exist in nature (10 times as many bacterial cells as human cells in the body!). Are all these cells one population? Are all E. coli cells within your colon one population? Any estimation of Ne is just a guess, because it's hard to define what a population is and population sizes likely vary over many orders of magnitude. I could be wrong about that, but that's why I'm putting my thoughts out there.

The best place to start tackling this question is in obligate parasites/symbionts. We have good evidence that vertically inherited symbionts such as Buchnera aphidicola have small Ne values because genetic drift has transformed their genomes. Likewise, it might be possible to define what populations are for certain obligate parasites (such as my good friend Helicobacter pylori, which can really only survive inside the human stomach) that are only found in closed environments. Even in cases such as H. pylori, transmission across hosts from stomach to stomach starts to blur the lines between populations. As far as I can tell, the situation gets much more difficult to model as you step from generalist pathogens/symbionts to environmental bacteria to spore formers (or persisters) that can survive in environments without growing. What about population subdivision, such as in biofilms? What about dramatic lifestyle changes and population bottlenecks, such as in Vibrio fischeri? V. fischeri can be found living freely as saprophytes in the open ocean (see link within here), but all it takes is one single bacterium to infect a juvenile squid and be amplified into billions of cells. Bottlenecking from millions of cells to 1 dramatically alters Ne, since calculations of this parameter place extra weight on low numbers.
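To make the "extra weight on low numbers" point concrete, here is a minimal sketch of the standard textbook approximation in which Ne over several generations is the harmonic mean of the per-generation population sizes. The census numbers below are hypothetical and chosen only for illustration; they are not measurements from V. fischeri or any other real system.

```python
def harmonic_mean_ne(sizes):
    """Approximate long-term Ne as the harmonic mean of per-generation sizes."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


# Hypothetical history: nine generations at one million cells,
# plus a single generation bottlenecked down to one cell.
sizes = [1_000_000] * 9 + [1]

ne = harmonic_mean_ne(sizes)                # dominated by the bottleneck
arithmetic_mean = sum(sizes) / len(sizes)   # barely notices the bottleneck

print(f"harmonic mean (Ne approximation): {ne:.2f}")
print(f"arithmetic mean of census sizes:  {arithmetic_mean:.1f}")
```

Under these assumed numbers the single-cell generation pulls the harmonic mean down to just under 10, even though the arithmetic mean of the same census sizes stays around 900,000.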

Unlike what is possible with more visible megafauna, we can't simply go out and perform ecological catch and release studies to estimate population sizes. The best way to think about microbial populations may be, somewhat abstractly, in population genetic terms. If a new mutation arises, what other genotypes are going to affect its rate of fixation or persistence? If genetic variants arise within competing evolutionary backgrounds and directly affect each other's frequencies, I take that as good evidence that those competing backgrounds are in the same population. Likewise, if the rates of migration/transmission between hosts or subpopulations significantly affect selection, then these subpopulations could be considered parts of a larger whole.

As, hopefully, illustrated in the figure above, I think of microbial populations as a continuum. On the left side is one closed population where three genotypes (red, blue, green) all directly compete for resources and affect the frequencies of one another. On the right side these three genotypes are completely separate from one another and don't directly interact. My guess is that most microbial populations lie somewhere in the middle, where there is some subdivision but migration and transmission between subpopulations alters selection and genetic drift.

So how do we begin to measure microbial populations? I don't think it's been done yet (please feel free to correct me!), but the rise of metagenomics at least makes the necessary experiments possible. With these technologies we can measure genotypes over time (all genotypes, not just a culturable subsample). We can measure how genotypes affect the frequencies of other genotypes, and such interactions could be evidence that they exist within the same population. We could measure parameters such as the strength of genetic drift and selection over time and extrapolate back to get Ne. Of course we then get into the issue of species level interactions, but I'll definitely save that for another time.
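As a rough illustration of what "extrapolating back" from drift could look like, the sketch below simulates neutral variants in an idealized haploid Wright-Fisher population and recovers Ne from how much their frequencies move in one generation, using Var(Δp) = p(1 - p)/Ne. Everything here is an assumption made for illustration: the population size, the number of variants, the starting frequencies, and the absence of selection, linkage, and sequencing noise. It is not a description of an existing metagenomic method.

```python
import random


def next_generation_freq(p, n, rng):
    """One Wright-Fisher generation: each of n haploid offspring
    carries the variant with probability p."""
    carriers = sum(1 for _ in range(n) if rng.random() < p)
    return carriers / n


def estimate_ne(freq_pairs):
    """Estimate Ne from (before, after) frequencies one generation apart,
    assuming neutral haploid drift: E[(p1 - p0)^2] = p0 * (1 - p0) / Ne."""
    expected = sum(p0 * (1 - p0) for p0, _ in freq_pairs)
    observed = sum((p1 - p0) ** 2 for p0, p1 in freq_pairs)
    if observed == 0:
        raise ValueError("no frequency change observed; cannot estimate Ne")
    return expected / observed


rng = random.Random(42)   # fixed seed so the run is repeatable
true_n = 500              # hypothetical population size
pairs = []
for _ in range(2000):     # 2000 hypothetical independent neutral variants
    p0 = rng.uniform(0.2, 0.8)
    pairs.append((p0, next_generation_freq(p0, true_n, rng)))

ne_hat = estimate_ne(pairs)
print(f"true N = {true_n}, estimated Ne = {ne_hat:.0f}")
```

With real data the observed frequency changes would also include sampling error from sequencing depth and the effects of selection and migration, so an estimate like this would only be a starting point.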



  1. Hi David,

    Interesting post, walks the line between accessible and detailed very well.

    I'm interested in a project, one aspect of which will involve looking at the evolution of a spore former. We have a type strain which was isolated in the 1920s and a colleague suggested we use this to root the tree (probably whole genome SNP based phylogeny). However, we would be making the assumption that contemporary cases of the disease have been caused by an evolutionary descendant of the 1920s strain (or of the population the strain came from). In your opinion, is this a valid assumption? Do you know of any literature which tackles this problem?


  2. Hey flashton,

    I try to stay away from making chronological assumptions with strains. There are a bunch of instances (e.g. Vibrio cholerae, Yersinia pestis) where there have been multiple pandemics in which one dominant strain overtakes another. If you isolated V. cholerae now, it would not be directly descended from an isolate from 1920. The spore forming aspect complicates this even more.

    Do you even have to root the tree? If you do, why not root with something that you are sure is an outgroup? I'm sure this would calm critics' nerves about the phylogeny. If you don't have a close genome available, you should consider including one in the sequencing effort.

  3. By calculating the number of microbes in a loopful of culture and multiplying it by the growth rate over time.

