By T. Ryan Gregory | January 19th 2008 07:49 AM
About T. Ryan

I am an evolutionary biologist specializing in genome size evolution at the University of Guelph in Guelph, Ontario, Canada.

I first became interested in genome size because of its tie-ins with important evolutionary questions in which I was (and still am) interested, such as punctuated vs. gradual patterns, levels of selection, and adaptive vs. non-adaptive processes.

What I didn't realize was that one component of the question, the quantity of DNA that is non-functional (but not necessarily inconsequential) with regard to the phenotype of the organism, is such a hot-button issue. I had vague inklings at first that young-earth creationists would object to the idea of non-functional DNA -- because God, as they say, don't make no junk. (Why intelligent design proponents, who purport to take a strictly scientific view of the question, also assume that non-coding DNA cannot be non-functional remains unstated.)

And of course there has always been a persistent undertone in biology that non-coding DNA must be doing something or it would have been deleted. This latter view, which derives directly from a hardcore adaptationist approach, destroys the argument by creationists that "Darwinism" has prevented researchers from considering functions for non-coding DNA.

Indeed, the main motivation for the early papers on "selfish DNA" was to counter this adaptationist assumption (Doolittle and Sapienza 1980).

Creationist nonsense about DNA does not surprise me. What has intrigued me much more is the debate among biologists about this, and the rather questionable claims, suppositions, and extrapolations that get made not just by the media but by various scientists themselves.

Take Francis Collins. He's a major player in genome biology and led the charge for the public Human Genome Project. And yet, he makes claims that non-coding DNA may be present in the genome "just in case" it needs to be put to use in the future. This makes no sense from an evolutionary perspective. It would be tempting to attribute this to Collins's adherence to the notion of theistic evolution, but in fact one can find this sort of fuzzy foresight argument being brought up by lots of authors. I suppose it's just disappointing that there is not better communication between genome biology and evolutionary biology.

The case that frustrates me most is that of John Mattick. He of the worst figure ever is one of the primary promulgators of the view that scientists have overlooked possible function for non-coding DNA and that this is "one of the biggest mistakes in the history of molecular biology" that can only be corrected by a "new paradigm", and so on. Basically, the argument seems to be that much of the non-coding portion of a given genome is involved in regulation and such. In the past, Mattick has refrained from pinning down an estimate of how much non-coding DNA he believes is functional, but his presentation of (extremely selective) data left little doubt that he considers more non-coding DNA to be correlated with greater complexity. But now we're starting to get some more explicit and increasingly bold claims.

As Check (2007) pointed out in a news article in Nature,


Mattick thinks scientists are vastly underestimating how much of the genome is functional. He and Birney have placed a bet on the question. Mattick thinks at least 20% of possible functional elements in our genome will eventually be proven useful. Birney thinks fewer are functional.

Now consider this quote by Comings (1972), who was the first person to use the term "junk DNA" extensively (even before Ohno's (1972) coinage appeared in print):

These considerations suggest that up to 20% of the genome is actively used and the remaining 80+% is junk. But being junk doesn't mean it is entirely useless. Common sense suggests that anything that is completely useless would be discarded. There are several possible functions for junk DNA.

So, even if Mattick is right about 20% of the human genome being functional, which is considered a rather high estimate on the basis of available data, he still would be merely agreeing with the author of the first major discussion about junk DNA.

Now, I should point out that I do not have a vested interest in how much of the human genome is functional. 5%? Fine. 20%? Fine. 50%? Ok. I will go where the data indicate. My reason for rejecting the notion of "more complexity means more DNA" is comparative: I refer you to the "onion test" for a simple illustration. However, as readers of my Genomicron blog already know, I find it rather irksome when people take any new finding about (potential) function in some part of the human genome and extrapolate this to mean that all DNA in every genome must be serving some role.

Anyway, back to what Mattick suggests. As noted, for the most part he has gone about arguing for large-scale function more by hint than by direct claim. However, finally he says the following (Pheasant and Mattick 2007).


Thus, although admittedly on the basis of as yet limited evidence, it is quite plausible that many, if not the majority, of the expressed transcripts are functional and that a major component of genomic information is rapidly evolving regulatory DNA and RNA. Consequently, it is possible that much if not most of the human genome may be functional. This possibility cannot be ruled out on the available evidence, either from conservation analysis or from genetic studies, but does challenge current conceptions of the extent of functionality of the human genome and the nature of the genetic programming of humans and other complex organisms. [Emphasis added]

It seems to me that "we can't rule this out" is not a reason to think that something is plausible, let alone true. In fact, the existence of mechanisms such as transposable element spread and the pseudogenization of duplicate genes suggests that there is good reason to expect much (probably most) of the genome to be non-functional unless data show otherwise. Some TEs have taken on a function, some cause disease, some are merely benign or only slightly detrimental. The proportions of non-coding elements in each of these categories remain to be determined, but they are not all equally likely by default.

The question of which sequences are functional, and in what way, is one of the more contentious and therefore interesting ones in genome biology. On the one hand, new information from various sources including the ENCODE project indicates that much non-coding DNA is transcribed, though it remains an open question whether this has to do with function or noise. On the other hand, a recent analysis has suggested that as many as 4,000 sequences within the human genome initially thought to be genes are not really genes after all (Clamp et al. 2007), bringing the total count down to around 20,000.

Some people, mostly creationists and strict adaptationists (strange bedfellows, I agree), desperately want the vast non-coding majority of eukaryote DNA to have a function. They latch onto any new discovery of function in some segment of the genome or another (or indeed, any mere restatement of what many authors have been saying since the 1970s) and consider their position supported. The rest of us will just have to wait and see.

________________

References

Check, E. (2007). Genome project turns up evolutionary surprises. Nature 447: 760-761.

Clamp, M., B. Fry, M. Kamal, X. Xie, J. Cuff, M.F. Lin, M. Kellis, K. Lindblad-Toh, and E.S. Lander (2007). Distinguishing protein-coding and noncoding genes in the human genome. Proceedings of the National Academy of Sciences USA 104: 19428-19433.

Comings, D.E. (1972). The structure and function of chromatin. Advances in Human Genetics 3: 237-431.

Doolittle, W.F. and C. Sapienza (1980). Selfish genes, the phenotype paradigm and genome evolution. Nature 284: 601-603.

Ohno, S. (1972). So much "junk" DNA in our genome. In Evolution of Genetic Systems (ed. H.H. Smith), pp. 366-370. Gordon and Breach, New York.

Pheasant, M. and J.S. Mattick (2007). Raising the estimate of functional human sequences. Genome Research 17: 1245-1253.

 


Comments

adaptivecomplexity
I don't understand the whole air of mystery that surrounds this subject (the term 'dark matter' of the genome, for example). If about 35% of the genome is found within introns of protein-coding genes (which seems to be the current estimate based on the RefSeq genes in the UCSC Genome Browser), and if non-coding RNA genes are also processed from long transcripts (possibly also containing long introns), you've already got a lot of the genome covered. Add that to the tons of repeat elements hanging around, and there isn't a whole lot left to explain. The extensive transcription found in recent projects is being hyped as more surprising than it really is.

This disconnect between 'molecular' genome biologists and real evolutionary biologists has caused problems before - most notoriously with the initial claims of widespread horizontal transfer of genetic material from bacteria, back when the human genome draft sequence was published. If Francis Collins' team had vetted their ideas with evolutionary biologists, they would have saved themselves some embarrassment.

The "air of mystery" is due to people staking out intellectual positions and claims in an area where no one knows, and to their unwillingness to admit their positions and claims rest not upon adequate evidence but upon the mere presence of an embarrassing elephant in the room:  design, which is widely derided, on the most fundamental and dogmatic of grounds, to be beyond scientific verification, hence discussion.  All this article is about, really, is the author's desire to "educate" readers upon the complete worthlessness of creationism and intelligent design.  It is a polemic.  All the minutiae of jargon cannot hide -- and the author doesn't want it to hide, and takes pains that it should not hide -- his prejudices against design in the natural world.  I think he is caught up in the election season spirit of partisan rhetoric, that's all, along with much of the scientific world on the subject of design.  Why put the only reasonable phrase in the whole argument at the end:  wait and see.

If you want new evidence, read my work.  Professionally speaking, you need to.  It is not in genetics or biology, but it is, or should be, easy for any competent and honest physical scientist to learn.  It is of unprecedented scale and breadth, so takes serious study, lest you ignore its coherence and enlightening power over a vast field and let current prejudices turn you off to confronting new knowledge.  There is a huge test going on, and science is by and large flunking its own responsibility to investigate dispassionately and without prejudice.  You can lead a horse to water but you can't make him drink, especially when he, like this author and too many others, thinks he is leading.  I offer my services for paid talks, to help.

www.lulu.com/hdhsciences

