Just five days ago, I penned a post about what the Human Genome Project (HGP) had and had not accomplished. I wish I could say that I wrote it knowing full well that it would be a great primer for a piece about the genome discoveries released today by ENCODE, the NIH follow-up effort to the HGP. I would be lying if I did.
Anyway, front and center at Nature.com is a PDF of the publication by the ENCODE Consortium outlining the highlights of their efforts to pass a fine-toothed comb through approximately 1% of the human genome.
The publication is fascinating in both its breadth and detail. Before I expound on its virtues, let me first comment on my one reservation about the project. From my own somewhat limited experience in biomedical research, I am not a big fan of large consortium efforts. While I love the concept of open-source sharing of data and collaboration, I have usually found that huge efforts across many labs breed data inconsistencies as a result of methodological and analytical differences. Differences in variables as small as the humidity in the lab can yield differences in datasets that obscure the real story. All of that said, it would be very hard to argue with the key points coming out of this publication, because they make a lot more sense than the conventional wisdom that has been coming out of college biology textbooks for years (at least when I was in college).
Most of us have been taught at some point that DNA leads to RNA, which leads to protein. Well, all of that is still true, but as time goes on, we continue to discover that there are more and more options for the RNA besides producing protein. Without further ado, here are the take-home notes on the ENCODE project:
- While it was once thought that a large proportion of DNA was "junk" that did nothing, it is becoming clearer that the vast majority of DNA is transcribed into RNA. Many new non-protein-coding RNAs have been discovered in the ENCODE effort.
- Chromatin accessibility, basically how tightly the DNA is wound, has a huge effect on how readily it is transcribed to RNA. In turn, many RNAs can affect how tightly the DNA is wound.
- About 5% of our DNA is under evolutionary constraint in mammals, meaning it has been conserved over the course of evolution.
- Some regions of our DNA are wildly variable from person to person, while other regions barely change (this isn't really news, but they've been able to pinpoint some of the specific variable regions).
- RNA can do many things besides encode protein. Some RNAs are used by the cell to suppress other RNAs, thereby regulating the genome (this isn't really news either).
- There is way too much RNA in cells for us to know what all of it does at this point in time.