Sunday, September 16, 2012

Read What Mike White Has to Say About ENCODE and Junk DNA

One of the good things to come out of this ENCODE/junk DNA fiasco is that I've discovered a number of excellent scientists who aren't afraid to speak out on behalf of science. One of them is Mike White, a systems biologist at the Center for Genome Sciences and Systems Biology, Washington Univ. School of Medicine, St. Louis (USA). He blogs at The Finch & Pea.

Mike published an impressive article on the Huffington Post a few days ago. This is a must-read for anyone interested in the controversy over junk DNA: A Genome-Sized Media Failure. Here's part of what he says ...
If you read anything that emerged from the ENCODE media blitz, you were probably told some version of the "junk DNA is debunked" story. It goes like this: When scientists realized that classical, protein-encoding genes make up less than 2% of the human genome, they simply assumed, in a fit of hubris, that the rest of our DNA was useless junk. (You might have also heard this from your high school or college teacher. Your teacher was wrong.) Along came the ENCODE consortium, which found that, far from being useless, junk DNA is packed with functionality. And so everything scientists thought they knew about the genome was wrong, wrong wrong.

The Washington Post headline read, "'Junk DNA' concept debunked by new analysis of human genome." The New York Times wrote that "The human genome is packed with at least four million gene switches that reside in bits of DNA that once were dismissed as 'junk' but that turn out to play critical roles in controlling how cells, organs and other tissues behave." Influenced by misleading press releases and statements by scientists, story after story suggested that debunking junk DNA was the main result of the ENCODE studies. These stories failed us all in three major ways: they distorted the science done before ENCODE, they obscured the real significance of the ENCODE project, and most crucially, they misled the public on how science really works.

What you should really know about the concept of junk DNA is that, first, it was not based on what scientists didn't know, but rather on what they did know about the genome; and second, that concept has held up quite well, even in light of the ENCODE results.
Way to go, Mike!

In the past week, lots of scientists have demonstrated that they don't know what they're talking about when they make statements about junk DNA. I don't expect any of those scientists to apologize for misleading the public. After all, their statements were born of ignorance and that same ignorance prevents them from learning the truth, even now.

However, I do expect lots of science journalists to write follow-up articles correcting the misinformation that they have propagated. That's their job.


  1. I am just writing to thank you, Larry, for your coverage of the ENCODE debacle. I have learnt a lot about the science behind it, both from your blog and from following up the links.

    I guess I learnt a lot about the failings of science journalism but then I was pretty cynical about that already. "God particle" etc.

  2. DH Kaye weighs in to discuss the legal consequences if ENCODE's 80% were true.

    1. Oh, boy, the legal consequences! What a mess!

      Consider the following scenario.

      A guy accused of rape says you can't test my DNA, none of it is junk, it's all functional and ENCODE says so (thus DNA testing is an invasion of my privacy.) So the accused calls as an expert witness the ENCODE scientists who lied to the NY Times and the muggle press, Ewan Birney and John Stamatoyannopoulos, and they testify under oath.

      Suppose (not likely) they say under oath there's no junk and it's all functional. Either they're challenged or they're not. If they're not challenged, the rapist walks free. If they are challenged, they could be found guilty of perjury, which is a felony.

      If they tell what they know damn well is true, that about 80-90% is junk, they're laughingstocks in the science community and have to issue an erratum on their ENCODE paper.

      Most likely scenario: they equivocate, using terminology that means one thing to scientists and another thing to muggles. The accused's lawyer can exploit this ambiguous language to confuse a jury made of muggles, because that's what Birney's language was intended to do: deceive muggles. The accused gets off.

  3. Larry, when in a hole stop digging. There is an ocean of evidence that inactive transposons are nonetheless contributing to gene regulation even if they are no longer moving around the genome (which would probably do more harm than good). ENCODE has found that they don't actually have to bind with transcription factors to influence gene expression but can do so at a distance. In any case, transposons make up half the genome. What about the rest? Do you suppose that it is still mostly junk as well?

    1. There is evidence that a few dozen or so inactive transposons have been co-opted to perform a biological function (usually regulation). The evidence comes from specific examples in many different species.

      Exceptions do not make a rule. There are more than one million defective transposons in our genome. Just because a handful have evolved a function does not mean that all of them have. They are still junk.

      I'm not in a hole but some of you are.

      If you want to engage in this debate you'll have to demonstrate that you have done your homework. So far it's not looking good.

    2. A new paper out in PLoS ONE finds, on the basis of sequence conservation, that less than 1/10 of the transposons in the genome (280,000 out of over 4 million) have evidence of exaptation of sequence for regulatory function, a total of 7 Mb of sequence actually conserved (~0.2% of the genome). There may be more that is lineage specific and thus not detectable above the low level of background divergence, but the total can't be all that large.

    3. Isn't this directly contradicted by what John Stamatoyannopoulos has been telling the press? He says it's the rule, not the exception, that transposons are now regulatory DNA, involved in regulating genes. They can't both be right! Who's wrong here?

      John Stamatoyannopoulos: “What the ENCODE papers (not the main paper in Nature, but the other length papers that accompanied it) have to say about transposons is incredibly interesting. Essentially, large numbers of these elements come alive in an incredibly cell-specific fashion, and this activity is closely synchronized with cohorts of nearby regulatory DNA regions that are not in transposons, and with the activity of the genes that those regulatory elements control. All of which points squarely to the conclusion that such transposons have been co-opted for the regulation of human genes -- that they have become regulatory DNA. This is the rule, not the exception....”
      [Faye Flam]

      Larry, do you think John Stam has gone bonkers?

  4. Five reasons why my theory on the function of ‘junk DNA’ is better than theirs (part I)

    I intend to submit the paper below for publication in a peer-reviewed journal. Before submitting it and having it reviewed by a handful (if that) of peers, I decided to post it here on the Blogosphere Preprint Server, which is rapidly becoming the front-line platform for transparent and comprehensive evaluation of scientific contributions (note: because of size limitations, this comment is posted here as two parts).

    The ENCODE project has produced high quality and valuable data. There is no question about that. And the micro-interpretation of the data has been of equal status. The problem is with the macro-interpretation of the results, which some consider to be the most important part of the scientific process. Apparently, the leaders of the ENCODE project agreed with this criterion, as they came out with one of the most startling biological paradigms since, well, since the Human Genome Project showed that the DNA sequences coding for proteins and functional RNA, including those having well defined regulatory functions (e.g. promoters, enhancers), comprise less than 2% of the human genome.

    According to ENCODE’s ‘big science’ conclusion, at least 80% of the human genome is functional. This includes much of the DNA that has been previously classified as ‘junk DNA’ (jDNA). As metaphorically presented in both scientific and lay media, ENCODE’s results mean the death of jDNA.

    However, the eulogy of jDNA (all of it) was written more than two decades ago, when I proposed (and conceptually proved) that ‘jDNA’ functions as a sink for the integration of proviruses, transposons and other inserting elements, thereby protecting functional DNA (fDNA) from inactivation or alteration of its expression (see a copy of my paper posted here; also, see a recent comment in Science that I posted at Sandwalk).

    So, how does ENCODE theory stack ‘mano-a-mano’ with my theory? Here are five reasons why mine is superior:

  5. Five reasons why my theory on the function of ‘junk DNA’ is better than theirs (part II)

    #5. In order to label 80% of the human genome functional, ENCODE changed the definition of ‘functional’; apparently, 80% of the human genome is ‘biochemically’ functional, which from a biological perspective might be meaningless. My model on the function of jDNA is founded on the fact that DNA can serve not only as an information molecule, a function that is based on its sequence, but also as a ‘structural’ molecule, a function that is not (necessarily) based on its sequence, but on its bare or bulk presence in the genome.

    #4. Surprisingly, ENCODE theory is not explicitly immersed in one of the fundamental tenets of modern biology: Nothing in biology makes sense except in the light of evolution. Indeed, there is no talk about how jDNA (which contains approximately 50% transposon and viral sequences) originated and survived evolutionarily. On the contrary, my model is totally embedded and built on evolutionary principles.

    #3. One of the major objectives of the ENCODE project was to help connect the human genome with health and diseases. Labeling 80% of these sequences ‘biochemically functional’ might create the aura that these sequences contain genetic elements that have not yet been mapped out by the myriad of genome wide studies; well, that remains to be seen. In the context of my model, the protective function of jDNA, particularly in somatic cells, is vital for preventing neoplastic transformations, or cancer; therefore, a better understanding of this function might have significant biomedical applications. Interestingly, this major tenet of my model can be experimentally addressed: e.g. transgenic mice carrying DNA sequences homologous to infectious retro-viruses, such as murine leukemia viruses (MuLV), might be more resistant to cancer induced by experimental MuLV infections as compared to controls.

    #2. The ENCODE theory is a culmination of a 250 million US dollars project. Mine, zilch; well, that’s not true, my model is based on decades of remarkable scientific work by thousands and thousands of scientists who paved the road for it.

    #1. The ENCODE theory has not yet passed the famous Onion Test, which asks: why do onions have a genome much larger than ours? Do we live in an undercover onion world? The Onion Test is so formidable and inconvenient that, to my knowledge, it has yet to make it through peer review into the conventional scientific literature or textbooks. So, does my model pass the Onion Test? I think it does, but for a while, I’m going to let you try to figure out how! And, maybe, when I submit my paper for publication, I’ll use your ideas, if the reviewers ever ask me for an answer. Isn’t that smart?

  6. Wow, Moo Moo, you're dumb. Do you not understand that a tiny fraction of a tiny fraction of functionality does not offset a genome that has 45% transposons?

    Do you think you're going to trick us with qualitative, rather than quantitative, statements? Patently dishonest.

    "ENCODE has found that they don't actually have to bind with transcription factors to influence gene expression but can do so at a distance"

    Do you even know what that means!? You're copying from a source that copied from a source that... and the errors just accumulate.

  7. Larry, here's a delayed thanks for the vote of support from a long-time fan of your blog.

    On the subject of functional transposons, I've got a colleague upstairs who has spent time looking at a particular class of LTRs that picked up good p53 binding sites, and has spread them around the genome. The big lesson here is that LTRs help themselves by picking up transcription factor binding sites.

    And the null hypothesis is that these LTRs, by hopping around at random, produce a lot of spurious p53-regulated transcription.