My Chrobial Romance: February 2016

Friday, February 26, 2016

Trying to build the cheapest computer that can run an Oxford Nanopore MinION Part I

One of the most difficult things to do in life is to admit that you don't know something. Along those lines, it's also very difficult to stumble through learning things (and to make mistakes) in full view of the public. I'm about to do both, but hopefully this process is useful to someone out there. Also, feel free to chime in and tell me where I'm making huge mistakes.

Like many of you I've been following the sequencing developments from Oxford Nanopore pretty closely, and the results have been awesome so far. I'm excited to get a MinION and start playing around with collecting data. However, the first step in the process is finding a computer that actually meets the minimum specs to run a MinION (found here). The most important parts of this (at the moment) are that the machine must run Windows 7 or above, have a solid state drive that's big enough to handle the data coming off the machine, have 8GB of RAM, have an i7 chip, and have a USB 3.0 port. Well, the only Windows machine that my family owns is my wife's work computer (it's got the specs but there's no way in hell I'm messing with anything on that machine, out of fear). There are also a bunch of other folks who have reconfigured their non-Windows machines to run Windows, and while I could do that I wanted to take a different tack. So, my goal was to try and build a machine that meets the MinION specs while spending the least amount of money possible. Here goes....

I have hardly ever opened up the case on any of my computers before, and I would equate (at the start) my knowledge of what's inside a computer to "6 year old that is curious about how to make their gameplay better". Luckily, there are a ton of different forums, videos, and websites to guide you through every step of the way. Just know that no matter how much you learn, there are many 8+ year olds out there that can run circles around your computer building skills. I'm cool with that, I just want to build something that can sequence DNA using a USB stick.

First step was to survey the field and figure out what I needed to do. It turns out that Dell and HP build a lot of machines with decent processing power for business purposes that don't really have all the bells and whistles that other computers have. Since (I'm guessing) these are all bought by businesses and then sold after a few years, it turns out that there are A LOT of these machines that you can buy refurbished. I didn't want to build a computer from scratch (too much of a leap), so I thought it might be good to find a refurbished tower that I could upgrade in other ways to meet the minimum specs. Things I looked for:

1) PCI express slot so that I could add in a USB 3.0 port
2) i7 chip
3) Running Windows 7

Turns out I found a machine that had all of this, and that also has a lot of space for expansion, for 279$ (HP Elite 8200 SFF). There are various flavors of "HP Elite 8200" machines, but I stayed away from the Ultra Slim model just because there wasn't a lot of capability for expansion. Likewise, I stayed away from the regular desktop because it was more $$$.

Here's what the inside looks like:

Next step was to buy a USB 3.0 upgrade, and there are a variety of different versions. If you go with the machine above, you want to be able to fit it in an SFF case, and so you want one that has a bracket for "Low profile" machines. Found one on Newegg for 10$. This comes with the driver you need to load onto the machine (and which apparently worked the first time), and fits right into the PCI express 1x slot. One problem that I didn't anticipate is that you need to provide power to the USB 3.0 ports outside of the PCI express slot (on the version I bought this is supplied through a 15 pin SATA connection). There apparently are various ways to wire this, but the refurbished computer already had a splitter going to the DVD drive so that all I needed to buy was a 15 pin Male to 15 pin Female SATA connector (here) for 5$, and Voila! Not sure if this works yet, still waiting for delivery.

Last couple of steps are upgrading RAM and the solid state drive. The RAM is easy: the HP machine I bought has four slots for RAM, and you can buy this on Crucial.com (Crucial will even scan your machine beforehand to make sure the RAM fits). I didn't want to go overboard, so I bought 2x8GB of RAM from them for 83$. I *think* that RAM works better when it's paired, so that's why I bought 16GB total, but you might be able to get away with 8GB if you're going minimum specs. Those basically just snap into the RAM slots on the machine. Easy Cheesy.

Last thing was to buy a solid state drive. Again, I went to Crucial.com and ordered a 500GB drive (you can supposedly get away with 250GB with the MinION, but I was advised that bigger is better in this case). Same deal, you can scan your system and make sure you're getting the right part and then order. That drive was 150$ total. When the drive came, I detached the SATA wires from the DVD drive and connected them to the new drive. Then I used this handy step by step guide to clone the original hard drive (running Windows 7) to the new solid state drive, then unplugged both drives and swapped connections (and took the old hard drive out). Last step was to start the machine up again and, DAMN, it felt good when that machine booted itself up after I swapped hard drives. The SSD doesn't quite fit well in the case, although as long as I don't move it it should be OK for now. Looking for solutions for that in the future.

So there you go, a dedicated machine that can run an Oxford Nanopore MinION (in theory) for about 500$. I'll let you know in the next post whether it passes the basic spec tests before I actually burn in my MinION.
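For anyone who wants to check the "about 500$" figure, here is a quick tally sketched in Python. The prices are the ones quoted in this post; the part labels are just shorthand for illustration (not official product names), and shipping and tax are left out because they aren't listed above.

```python
# Prices in US dollars, as quoted in the post.
# The labels are informal shorthand, not official part names.
parts = {
    "HP Elite 8200 SFF (refurbished)": 279,
    "USB 3.0 PCI express card (low profile)": 10,
    "15 pin SATA power connector": 5,
    "RAM upgrade from Crucial": 83,
    "Solid state drive from Crucial": 150,
}

# Sum every line item to get the cost of the whole build.
total = sum(parts.values())

for name, price in parts.items():
    print(f"{name:<40} {price:>4}$")
print(f"{'Total':<40} {total:>4}$")
```

So the build lands a little over the round number, with the refurbished tower accounting for roughly half of the cost.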

Sunday, February 21, 2016

Coevolution and the Microbiome

Coevolution is defined as cases where two (or more) species RECIPROCALLY affect each other's evolutionary dynamics. Reciprocally is highlighted in every way that I possibly could in the preceding sentence because this aspect really is key. It's a difficult concept to get a grasp on and an even more difficult concept to actually test in nature. I don't want to spend too much space going over the ins and outs of coevolution because numerous people that are smarter than I am have done that in accessible ways...My favorite being Dan Janzen's "When is it Coevolution?". (For other good lists that are slightly longer and more intense, see Scott Nuismer's or John Thompson's Google Scholar pages). I'm missing a bunch of other stuff out there, but it's late so please forgive me and add any other great links in the comments.

I mention this because there is a lot of work being published on microbiomes right now, and I'm sensing a pretty strong tendency across manuscripts and presentations to state that microbiomes and their hosts have "coevolved". In some cases this is certainly true (best examples I can think of off the top of my head are nutritional symbionts in insects, some nodulating Rhizobia in legumes, Plasmodium in humans, and Vibrio-Bobtail Squid). Here's where it gets fuzzy, and I'm largely focusing on human microbiome studies here because they often get all the press...in many cases where researchers claim that microbiomes and hosts have "coevolved" there is absolutely no evidence that this has happened. Sure, there are trends and correlations that make it seem likely that coevolution has taken place. However, read the Janzen piece above again and find me a study where researchers have reciprocally tracked genetic changes in the microbiome and have seen direct evolutionary (read: heritable) changes in human host populations. It's nearly impossible (let alone ethically challenging) to track fitness in human populations over time and cleanly and directly relate those to changes in the microbiome. Codiversification != coevolution, so any story that mentions Helicobacter pylori and humans is pretty much right out (h/t to Jonathan Klassen for that one). Moreover, health != fitness. By definition, obese people are unhealthy yet they can still have kids at a decent clip (note, I'm unaware of studies actually measuring the fitness effects of obesity in humans but would love to hear about them if you know of any). Likewise, much has been made about H. pylori affecting human health negatively through gastritis/gastric cancers and positively through asthma/GERD prevention. I'd love to hear how these things directly affect fitness in human populations, but there's absolutely no data to these points. Proto-humans might well have had gastric problems from H. pylori, but I'm betting that there were many other things that directly impacted their lifespans and fitness with greater magnitude. Please don't get me wrong, I'm not saying that it's impossible that humans and their microbiomes have coevolved, I'm saying that there is no evidence that directly addresses the hypothesis of coevolution outside of a handful of pathogens.

All of this is a roundabout way of saying that studying (and demonstrating) coevolution is really really really difficult. You need to actually measure how species 1 directly influences evolution in species 2 AND how species 2 then directly affects evolution of species 1. Reciprocality is key. There are many examples where species 1 influences some trait in species 2, but that doesn't mean that evolutionary dynamics in species 2 will be affected. Likewise, it doesn't mean that species 1 will then be reciprocally affected by this trait change. Moreover, we don't have good models of host-microbiome coevolution because the math is difficult/complex, so we don't have very clear ideas about what parameters matter most to drive coevolutionary dynamics. One of my near to mid term goals in science is to try and change this, so as a first step I'm trying (fingers crossed) to organize a workshop at NIMBioS (probably in Fall 2016) to bring 35ish smart people from across the globe together to talk about how we begin to frame questions about host-microbiome coevolution. If you're interested please fill out the form at the following link: http://goo.gl/forms/G79CwnhYc8

I'll be submitting a workshop proposal by March 1st, and will keep you up to date on what's happening. If you're curious about NIMBioS and workshops, an overview can be found here: http://www.nimbios.org/workshops/

Thursday, February 4, 2016

We tend to be harshest on those we love...

Last year a paper was published investigating links between natural transformation and type VI secretion mediated killing, and I cobbled together a blog post with some select thoughts about this paper (here). It's recently come to my attention (should have written this earlier, it's my fault that I dropped the ball) that that particular post could be viewed in a couple of lights depending on the context. I want to make it completely 100% crystal clear that it is never my intention in writing in this space to include ad hominems in posts (I'm not Dan Graur...)*. I write quickly and don't ever really sit on posts, but really I'm writing here because I love what I do and I care for certain topics in science. I'd like to see these topics understood as clearly as possible. For those that "know me", I really enjoy critical discussions about science and sometimes my words get ahead of my inner filter. Suffice it to say that I've made a conscious decision to try and not use this space to trash people and papers personally, but to be critical about science involving topics that are near and dear to my heart. Sometimes when I do this my inner New Yorker comes out, but know that my intention is not to critique the person but to focus on the science. If I slip and get personal, let me know and I'll directly tackle/amend/try to fix the issue in public.

F'ing Bulls**t MT“@WvSchaik: 'Host Demise as a Beneficial Function of Indigenous Microbiota in Human Hosts' paper http://t.co/aHifo7Rx2y”
— David Baltrus (@surt_lab) December 16, 2014

Figure 1. An example of me responding a little quickly in public to a recently published paper

This being said, I'm writing this post because another paper was published recently that linked together natural transformation and microbe-microbe killing. Honestly, I haven't had the chance to read the paper yet because I was sitting in airplanes all day yesterday. I'm guessing that I would probably have similar thoughts as I did about the type VI secretion paper last year. These thoughts come out in public sometimes:

And the hand waving evolutionary explanations for linked regulation of competence and "trait X" continue. https://t.co/hCd1hXgehe
— David Baltrus (@surt_lab) February 3, 2016

Figure 2. An example of me responding to a recently published paper after sitting on airplanes all day and without reading the paper

All right, so what's my beef with these studies? Let me be completely clear, I respect the authors and inevitably I have no problem with the experimental design or the actual reported science. In the type VI secretion case, and probably with this new paper, I have absolutely no problem with the experimental design or with the genetics. I don't think the papers are wrong science-wise. I'm writing about these papers because I spent the better part of 5 years in graduate school huddled in the fetal position thinking about the evolutionary effects of natural transformation in bacterial populations. My problems with a lot of these papers are usually directed solely at the evolutionary interpretations and spin within discussion sections and press releases.

There's a historical legacy that surrounds researchers of natural transformation in bacteria, where there are a couple of entrenched camps that tend to argue past each other. These fights usually flare up around disagreements that conflate questions about original evolutionary benefits of natural transformation and benefits of natural transformation that are measurable today (after these systems originally evolved). After many years of thinking about this, I'm actually agnostic when it comes to the original evolutionary scenarios for natural transformation. Rosie Redfield is one hell of a thinker and I defer to her about such things (so I guess that firmly places me within one camp). I tend to be more interested in wondering about how strong the selective forces on natural transformation are within present day bacterial populations and on trying to figure out realistic parameters that could affect our evolutionary interpretation of gene exchange in bacteria.

Like I said above, I think the genetics and molecular biology within these papers are tip top and have absolutely no quarrel with those. I get caught up when the discussions start to extrapolate from the controlled conditions of the lab environment into natural populations. Natural transformation certainly leads to gene exchange in natural populations of bacteria. However, suggesting that pathways are linked in regulation because of evolutionary benefits of natural transformation is a leap of faith that no paper out there has been able to tackle as of yet. What leads to the disconnect? There is a long standing tradition across many, many, many papers that describe evolutionary "just so" stories whereby we witness results in the lab or under certain conditions and think/assume that natural selection must act that way across many different conditions or environments. My comments about "hand waving" are usually directed at such extrapolations.

This is getting long, so I'll save the nitty gritty for another time. However, long story short, these extrapolations hinge on critical parameters of these experiments being similar in the lab and under natural conditions. There are no natural populations of bacteria for which we have realistic estimates of things like A) the DNA pool available for natural transformation B) natural selection pressures over space and time within and between bacterial populations that are exchanging DNA C) the repeatability and direction of these selection pressures D) how often cells encounter other cells that they can kill in nature E) having killed these cells in nature, how often these cells take up DNA F) evolutionary costs of natural transformation in nature G) I'm missing something because it's early in the morning but there are other parameters. When one sits down to write mathematical models that account for all of these above parameters, it ends up being REALLY difficult to find parameter space whereby natural transformation is GENERALLY beneficial. That's not to say that gene exchange doesn't matter within natural populations (as it certainly does) but it's hard to find situations where there are clear results where natural transformation is beneficial even a majority of the time. Under laboratory experiments like the ones in the type VI secretion paper (and probably the bacteriocin paper, again, haven't read yet) all of these parameters are actually controlled for pretty cleanly:

A) the DNA pool is controlled by the experimenters so that there is no contaminating DNA from other strains/species that could compete with genes of interest for uptake

B) Natural selection is really strong because these experiments are typically selecting for antibiotic resistance, where cells that pick up the relevant DNA survive and those that don't die. The same would be true if experiments were set up to investigate phage predation, etc...

C) Typically in these lab experiments, there is only one direction for selection to act and the environment doesn't change over time (i.e. there is only one antibiotic that the cells need to become resistant to, and therefore one locus that they need to pick up through natural transformation)

D) lab experiments are usually biased so that cells are encountering cells that they can kill at pretty much optimal frequencies (50/50).

E) lab experiments are done under conditions whereby cells are highly competent for natural transformation

F) there are few costs for natural transformation systems in these lab experiments because the experiments themselves usually occur in relatively cush situations for bacteria (media containing abundant nutrients, controlled temperatures, etc...) and only take place under limited amounts of time. In fact, for most bacteria, if you passage them under lab conditions for extended periods of time they usually lower competence levels (which suggests an evolutionary cost). Like I said though, the lab experiments are only performed over limited amounts of passage time.

So to sum this all together, I apologize for any perceived slights. That's not my intention (yeah, I know, get out the bingo cards). If you feel I've been too personal, please let me know and I'll try and fix it any way I can. There are many great groups focused on understanding natural transformation in bacteria and I respect much of their work. I've just spent way too much time worrying about evolutionary scenarios that usually pop up in discussion sections of these papers without what I perceive as firm grounding within evolutionary biology. These papers usually aren't set up to be direct tests of evolutionary theory, but it's very easy to write about how we think evolution should work. These papers usually end up being very good at describing whether natural transformation works for gene exchange under certain scenarios rather than how it's actually happening in nature. That's completely OK, just be careful about extrapolating.

*c'mon, that one was way too easy

Tuesday, February 2, 2016

How to (not) write a microbiome grant, part II. A deeper dive on preliminary data

As in Part I, a few quick notes that come to mind as I'm reviewing microbiome grants....

One of the biggest challenges and frustrations with grant writing is knowing just how much and what type of preliminary data to include and how much detail on methods to provide. Within the context of the grant, preliminary data has a couple of different jobs. First, it's there to convince the reviewers that you can actually perform the type of experiments and analyses that you are proposing. Second, it's there to justify why the proposed experiments are interesting or necessary. There's certainly no magical formula, but I think there are a few things to keep in mind when struggling over these two variables. (DISCLAIMER: just one person's opinion)

1. The amount of preliminary data required changes throughout the course of your career.

Don't kill the messenger, but track records matter. It's just the way it is. Early career researchers need to include more detail and need to justify their proposed experiments more so than established researchers. If I'm reading a grant and I see that the PI has published (even as a preprint, because I can go and read the methods if there is a question in my mind) these kinds of analyses before, it's much easier to believe they'll be successful performing the proposed analyses. All else equal, that inherently gives established researchers a leg up given page limits.

2. You must include enough detail to convince me that you know what you're talking about with the analyses.

If I'm reading a grant that proposes types of experiments that I'm familiar with, I probably have a decent idea of the associated pitfalls and critical variables. If you've done the experiments or understand how to do them well enough to carry them out, you should also have an idea of the critical points to include in your methods and analyses. It's very likely that, even if you don't have experience with specific protocols, you'll know someone that does...do whatever you can to understand the ins and outs of the proposed experiments and write enough detail to cover the critical points. Assume that at least one reviewer is going to be familiar with the experimental protocols and include enough information to convince this reviewer you know what you're talking about. Assume that other reviewers may not understand the protocols and include enough basic information to give them an idea of what you're talking about.

3. The type of preliminary data required changes throughout the course of your career.

If you have a proven track record in the field, or if you've hit both of the above points in your grant, the preliminary data within your proposal should provide just enough smoke to convince the reviewer that there's a fire somewhere (metaphorical of course). It's very easy to propose "fishing expedition" experiments, ones where you are going to make a lot of observations and some magical result is going to come from combining together all this data.

When I started as a PI, I kept proposing a few different RNAseq experiments that I thought would be very interesting and insightful. Inevitably, I'd get the reviews back and I'd get dinged for not having a hypothesis. "Pssssshhhhh" I'd say to myself as I gripped my stress relief ball, "The hypothesis is that gene expression WILL change!". With a bit more perspective gained from grant panel experience, I now understand exactly what the "fishing expedition" critique means. It's a combination of the psychology of having to review a bunch of related grants at a single time coupled with the reality that you have to bin grants into different piles as a reviewer.

Here's an example with microbiomes (in the style of Law and Order, this very example may be based on real events that are happening at this exact moment). Say a researcher has 10, 15-page microbiome grants to review before the panel meeting. A large percentage of these grants are interested in measuring microbiome dynamics over time, space, and across individuals. The methods and proposed analyses are usually very similar across a large swath of these grants. The only ways left to bin as a reviewer are by host species and whether there's enough smoke to think there may be a fire. If you can't make the case that your study system is different than other ones, chances are that your grant is going to get lumped in with those proposing similar methods and placed in the "others" pile unless someone else on the panel makes a good case during discussion.

Preliminary data is your way to make the grant stand out. If you are proposing that individual to individual variation matters, or that changes in microbiome dynamics over time matter, it's easy enough to get a few samples and sequence 16S. It doesn't have to be a full study, it just has to be enough to show that there is some signal that's interesting enough to follow up on. There's a surprising lack of pilot experiments in a lot of these microbiome grants (IMO), and the only thing I can think of is that it's hard to find sequencing centers that can process a handful of 16S samples relatively cheaply. I again assume that this is because you typically need a certain threshold number of samples for a MiSeq run (vs. Sanger sequencing where you can perform just one reaction). One way to get around this is to find others that are interested in generating the same kind of data and pool resources to put together a whole MiSeq run. There seem to be a couple of places that could facilitate finding others to pool with (like GenoHub).

My null hypothesis as a reviewer is that microbiome dynamics are going to be the same in your system as they are in well studied systems. Use this preliminary data and pilot experiments to disprove my reviewer null hypothesis. Are there differences in abundances or frequencies for taxa that are important in other systems? For instance, if you're proposing a phyllosphere microbiome study, off the top of my head I can imagine a top five for the taxa you should find in high abundance. Is your system different? (If you can't answer that, there's some reading you should do). If you sequence a couple of plants, do the larger plants have differences in microbiomes that you can follow up on? Is there something unique about your microbiome of interest compared to others (e.g. the rice rhizosphere apparently has some archaea)? If there are differences between these experiments and previous ones from other systems, think about hypotheses to explain why these differences exist and build your grant off of that. "Sequence everything and sort out the important trends later" doesn't work when every grant is proposing to do the same thing.

4. "But I don't have any preliminary data to include"

Yes you do. It may not be your own, but there are enough public datasets for you to reanalyze others' work (to at least show that you can do the analyses and give yourself some sort of track record). It doesn't even have to focus on the system you are proposing to work in so long as it moves your narrative forward (see Points 1 and 2 above).

<slight update to point 4> There's a flipside to this. If you're proposing experiments similar to others that have been published before in the same system, don't just cite the previous papers. Give your reviewers a context for why your proposed study is going to be different than previously published studies.