By Michael White | August 21st 2009 11:58 AM | 20 comments

About Michael White

Welcome to Adaptive Complexity, where I write about genomics, systems biology, evolution, and the connection between science and literature, government, and society.

I'm a biochemist



I'm not the only one who hates computational biologists:

Zhang et al. (11), Braunewell and Bornholdt (12), Ge et al. (13), and Okabe and Sasai (14) have presented stochastic models of the yeast cell cycle based on a deterministic Boolean model from Li et al. (15). The main concern of all of these authors was the robustness of cell cycle progression in the presence of intrinsic and extrinsic sources of noise. None of them compared their models to observed statistics of cell cycle properties in wild-type or mutant cells.

In sciencese, that's a major dig against computational biologists. You have these clever computer guys, and they don't give a damn whether their models are actually right. They've come up with some clever algorithm or theoretical analysis, and that's good enough.

I'm not saying that computational guys have to test everything themselves, but many of them are not even proactive about getting someone else to test them. They're just happy to move right along to the next project that will have zero impact on biology.

Deep down the problem is that these guys aren't biologists at heart. They don't care about, and cannot properly formulate, genuinely interesting biological questions. They've been trained as computer scientists, and they care primarily about computer science. But then they're stuck in the worst of both worlds - they're not really contributing to fundamental computer science, and they're not contributing to biology either.




Comments

adaptivecomplexity's picture
OK, just to be clear: I don't really hate computational biologists. There are good ones who are serious about learning something about biology; they just publish the minority of papers in the field.

Hank's picture
Cop out.  You didn't even give us a chance to wonder if you had given yourself over to hate before qualifying that you don't hate them even though you said you did.

I'm still using this quote: "You want this, don't you? The hate is swelling in you now. Take your Jedi weapon. Use it."

Andrea Kuszewski's picture
....or, "Your eyes are full of hate, 41. Good. Hate keeps a man alive. It gives him strength."

I am not even going to say what movie that is from, because as a learned man, you should know.

adaptivecomplexity's picture
I have friends who are computational biologists, so I can't be too mean! And there are good computational biologists, so I need to be careful who I hate.
But someone does need to take a light saber to some of those papers.


Andrea Kuszewski's picture
I share your frustration, only mine is with a sub-section of Computational Neuroscientists who are attempting to reverse-engineer the brain, yet failing to take into account humanity, emotion and free will as factors in consciousness. (not to open a can of worms or anything)  ;)

Actually, I run into that sort of thing frequently in Psychology research. They look at correlations and numbers, and they get so involved in their little charts and tiny dots forming their pretty little patterns in the matrices of their numerical microcosm, that they lose sight of the big picture and what the research is really trying to accomplish in the first place.

Without having read the PNAS paper or any of the cited papers, I could venture a guess that perhaps some papers were published on work still in the stage of getting the models to work. Of course the idea of such modeling should be to compare (calibrate?) to data on organic life, but they might not be that far just yet.

But of course, without any comparison to, or prediction of, organic life ever, the point of doing simulations is moot.

adaptivecomplexity's picture
It's ok to publish a paper with just the model, but the people who built the model need to be seriously invested in getting it tested. Which is frequently not the case - these guys tend to simply move on to the next project, without caring whether their model is right. 
Every week in my field someone comes along with a new model to explain why the cell division cycle is robust, but it never moves beyond the computational stage.


If the models truly never move beyond the computational stage, then I find that appalling. Why on Earth would they not? (Okay, I know the answer, so don't bother.)

adaptivecomplexity's picture
I know you know the answer but I'll say it anyway! There are too many researchers who don't care enough about testing. It's a cultural problem in the field.

"Deep down the problem is that these guys aren't biologists at heart."

The problem is that these guys are not doing science. Period. Of course, the problem is not restricted to computational biologists. There are plenty of other flavors of biologists who don't do science. For example the entire field of genomics.

A more serious problem with modeling is that we don't have a proper framework to interpret the results. Suppose a serious scientist comes up with a model that's consistent with existing data and shows robustness. It may even make correct predictions. But is it the only such model? Maybe there are millions of models that explain the data equally well. How can you be sure? Exhaustive enumeration is impossible for all but the simplest models, so we are stuck. This is a general problem, but it is especially severe in computational biology. Even for a problem as well-studied as sequence alignment, we have no good way of dealing with it. We simply retreat to the old standby of comparing our model against random chance and try to impress other people with our p-value.
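
The non-uniqueness worry in the comment above can be made concrete with a toy sketch. It assumes a hypothetical two-gene Boolean network and a made-up "observed" time course (neither comes from any of the papers discussed), enumerates every possible network, and counts how many reproduce the data exactly.

```python
from itertools import product

# All four possible states (a, b) of a hypothetical two-gene network.
STATES = list(product((0, 1), repeat=2))


def consistent_models(trajectory):
    """Return every two-gene Boolean network that reproduces the trajectory.

    A network is a pair of truth tables (one per gene), each mapping the
    current state (a, b) to that gene's next value.
    """
    transitions = list(zip(trajectory, trajectory[1:]))
    models = []
    # 2**4 = 16 truth tables per gene, so 16 * 16 = 256 candidate networks.
    for table_a, table_b in product(product((0, 1), repeat=4), repeat=2):
        rule_a = dict(zip(STATES, table_a))
        rule_b = dict(zip(STATES, table_b))
        if all((rule_a[s], rule_b[s]) == nxt for s, nxt in transitions):
            models.append((rule_a, rule_b))
    return models


# Made-up time course, for illustration only (an assumption, not real data).
observed = [(1, 0), (1, 1), (0, 1), (0, 0)]
matches = consistent_models(observed)
print(len(matches), "of 256 networks fit the data")
```

Here the data never show what happens from the state (0, 0), so several different networks fit equally well and disagree only in that untested condition. Brute force works in this sketch only because the network is tiny: with n genes there are 2^(n·2^n) candidate networks, which is already about 16.7 million for n = 3.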

adaptivecomplexity's picture
"For example the entire field of genomics."

Since I work at a place called the Center for Genome Sciences, I feel a little obligation to defend genomics! I see your point, and agree that there is a lot of non-hypothesis-driven data generation in genomics.


But to take a positive example: genomics has made some significant contributions to understanding gene regulation. In my own field, cell cycle transcription in yeast, genomics completely changed the game - particularly this paper (which has been subsequently much abused). There is also the issue of definition in genomics - is it really a field by itself? Everyone uses genomic tools now, but it's hard to say what is genuinely genomics and what is not.



You bring up an important point about modeling. There can be two good purposes to modeling. One is utilitarian: take for example models that look for cis-regulatory elements in genome sequence. Nobody cares if it's the only way to do it, as long as it works, giving you enough confidence in the predictions to go and do the experimental work.



The other class of models are those that are supposed to explain something. I think models tied to a physical mechanism, which you can test experimentally, can be useful. Since we can't generally measure all of the important parameters necessary for a completely mechanistic model, sometimes you make simplifications that can be somewhat arbitrary (and hence your model is not necessarily the only way to interpret experiments).

But you can still get predictions that shed light on the physical mechanism, which is the case in the paper I've linked to in the post.


I have two responses to the assertion that the entire field of genomics is biologists who don't do science. First, not all fields of science are experimental. Some are mostly observational. Would you say that paleontologists and astronomers, for example, are not doing science? Secondly, it seems evident to me that, over the past decade, the study of genomes has transformed many entire fields of study. Much of this comes from high-throughput techniques generating the resources that people can use for more detailed study, but there are also journals overflowing with papers full of biological insight possible only by comparing and interpreting these data. I'm bewildered that there would be such an indictment of so transformative a field of study, even if those who produce the data are not always doing experimental science themselves.

"genomics has made some significant contributions to understanding gene regulation."

Has it? Do we really understand cell cycle regulation better because of Spellman's paper? Genomics has generated plenty of data but very little knowledge. We now know a lot of genes that are under cell-cycle control, but we don't know how the regulation actually happens. Most of the exciting "new" models of gene regulation (by folks like Terry Hwa, Eric Siggia, Eran Segal, etc.) are just simple variants of the Shea and Ackers model from the early 80's. Sure, we now have high-throughput methods that can parametrize such models easily, but conceptually we have not advanced much at all.

Maybe I'm too cynical. Are there examples of genomics studies that have actually led to some real biological (or mathematical, chemical, whatever) insight? Where measuring thousands of things sloppily at once, with very little control, actually beats out measuring a few things carefully in a well-controlled way?

adaptivecomplexity's picture
In the case of the Spellman paper, we learned that cell cycle-dependent transcription is a wide-spread phenomenon. Prior to that paper, best estimates were that only 100-200 genes were cell cycle regulated. I see that as a genuinely new discovery, which has spawned further studies of how that transcription is regulated.
I basically agree with you though about the issue of insight.  Most questions aren't answered just with high-throughput data.  I prefer to think of genomics as a set of tools, rather than a scientific discipline. The tools have been extremely useful - in quantitative genetics, in 'metagenomics' (by which we can study unculturable microbes in the gut), and in molecular biology, where high-throughput data can give you some good clues about the function of an uncharacterized gene of interest.


Wouldn't that make them bioinformaticians, as opposed to computational biologists? It's my understanding that computational biologists are experimentally based biologists who use computers to answer questions. It's the bioinformaticians who build the software and programs.

I work in a primarily computational biology group (called The Battelle Center for Mathematical Medicine) and I would say many of my colleagues would agree with the general idea that many papers would be substantially better and have a much greater impact if biologists were more involved in the work. I don't think a compsci should try to be a biologist any more than I want to purchase an operating system designed by a PhD level biologist. I do think working as a team would be much better for both fields. There is, however, a real problem with silos here. Compsci's are usually in compsci departments that have no biologists and have extremely different goals for their faculty. Biostats are in biostat departments. Stats are in stat departments. They teach classes for tenure. Writing a book isn't career suicide. Quick publications aren't necessarily how they want to spend their time. While there are computational biology centers, such as where I work and of course many of the large institutions have them too, there aren't nearly enough. There also aren't nearly enough biologists interested in learning how to communicate/collaborate with computational people. I suspect in time more biologists will see the benefits of collaborating with computational people and the fields will slowly merge together in a more cooperative manner. Having good computational analysis of your real data can be helpful. I've published in good places and my colleague down the hall recently had a Cell paper - we both include lots of real data in our work. If you have any interest and don't mind learning how to communicate your ideas to someone outside of biology you should drop by a few local departments and see if a computational person sees something cool in your work. It may pay off better than some comments above would lead you to believe.

adaptivecomplexity's picture
Where I work, there is good integration between the computational and experimental people, often within the same lab. I know these collaborations can work well.  I don't call myself a computational biologist because I don't design algorithms or write anything more complicated than Perl or Matlab scripts, but I regularly build ODE models of genetic circuits and search sequence data for cis-regulatory sites. There is no doubt in my mind that cooperation between computation and experiment can be extremely useful.

But I can't understand the point behind a comp bio research program that has no connection to experiment. If you're not actually demonstrably learning anything about biology, then what's the point? Serious computational biologists cannot just be computer scientists - they have to be interested in contributing to biology. They don't have to necessarily do the experimental work themselves, but they need to care about having their ideas tested, about knowing whether their model results are right.


It's equally easy to complain about many of the small scale "careful" biological studies done where someone discovers that if you knock out component X, you find that protein Y appears to play a role in pathway Z. Who cares if that function actually matters in wt, or that you only saw that in 1 out of the 10 westerns that were done, while the other 9 were discarded because they were clearly wrong? There is both good and bad science done whatever the methods used (small scale, high-throughput, computational). These methods are useful in different contexts and have all made outstanding contributions to science.

Actually most "computational biologists" I know are from a biological background, and if they have one thing in common it is that they all want their predictions to be tested. In most cases their papers only have a chance of being published once the predictions are tested anyway.

Anyway, let's not start talking about those 100% wet guys who tend to focus all their attention on that 1 experiment out of 10 that succeeded, and their complete lack of understanding of even basic statistical tests. It goes both ways, you know.

adaptivecomplexity's picture
But one problem is that it's a lot easier to crank out poor computational results than it is to crank out bad experiments.

It's true that a lot of molecular biologists live in ignorance of anything mathematical, but, having lived in both worlds, I find that the experimental people are a much more self-critical community. There is something about struggling to get your experiments to work that makes you realize how hard it is to genuinely learn something about nature.


