We should have this whole biological ball of wax figured out in about 100 years or so (that bit is almost always implied in these sorts of assertions). At least that was a brief point that Professor Adrian Bird made at the start of his lecture at the Royal Society before accepting the RSGSK Prize last Tuesday night. It’s an intriguing prospect, but one that largely fell outside the scope of his talk. That said, if biologists do find themselves out of work a century from now, the blame can be pointed squarely at the likes of Professor Bird. His research has contributed a significant amount to our understanding of one of the mechanisms involved in silencing the activity of genes across the genome. In the process, he worked out the cause of a severe autism-spectrum disorder affecting one in 10,000 female births and developed a mouse model to help understand it, and potentially reverse it.

Twelve years ago the first draft sequence of the human genome was published, and we are only just scratching the surface of understanding all of the information encoded within it. The genome’s publication and the tools born of the genomic era changed the way we approached molecular biology; we turned to more targeted approaches. However, it quickly became apparent that there was a lot more information there, and a lot fewer genes, than we anticipated. Our 21,000 (or so) protein-coding genes proved to be about as numerous as those of animals we used to stare down our supposedly evolved noses at (read: flatworms). These genes, while central as the blueprints for the various parts that make up the cell, are only 1% of the story.

The other 99% is composed of swathes of genetic information strewn throughout the genome that carry instructions on how and when to use the 1%. It’s a shame economics couldn’t take a page from this arrangement, right? Consortium projects like ENCODE have significantly furthered our understanding of how some of the rest of this genomic DNA works, but large gaps remain in our understanding of how all of this information is organised and utilised inside cells.

If you could take all of the DNA that is inside a single human cell, strip it of its protein packaging and arrange the bits from the 23 pairs of chromosomes end-to-end, you would have something in the neighbourhood of two metres in length. All of that fits into a cell nucleus that has a diameter in the order of tens of microns (ten microns is one hundred-thousandth of a metre). Needless to say, one incredible folding act needs to take place to get all of it in there. DNA wraps around a spindle of proteins called histones, forming the nucleosome, the repeating unit of DNA architecture that packages those two-ish metres of double helix into chromatin. This nucleosomal structure coils back on itself almost the same way a phone cord will if you start to twist it, packing all of that chromatin into the nucleus rather tightly. The big question, then, is how the cell accesses the right gene at the right time, because the process of turning on genes (a.k.a. transcription) requires that they are physically read, and therefore accessible amongst all of that coiling.
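For anyone who wants to check the two-metre figure, here is a rough back-of-envelope sketch. The inputs are my own assumptions rather than numbers from the lecture: about 3.2 billion base pairs per single copy of the genome, about 0.34 nanometres of helix per base pair, and a nucleus taken as a round 10 microns across. The function names are made up for illustration.

```python
# Back-of-envelope check of the "two metres of DNA" figure.
# Assumed values (not from the lecture): ~3.2 billion base pairs per
# haploid genome and ~0.34 nm of helix length per base pair.
BASE_PAIRS_HAPLOID = 3.2e9
RISE_PER_BASE_PAIR_M = 0.34e-9
NUCLEUS_DIAMETER_M = 10e-6  # "tens of microns"; 10 microns as a round figure


def dna_length_m(base_pairs=BASE_PAIRS_HAPLOID, copies=2):
    """Stretched-out length of all the DNA in one diploid cell, in metres."""
    return base_pairs * copies * RISE_PER_BASE_PAIR_M


def length_to_diameter_ratio(length_m, diameter_m=NUCLEUS_DIAMETER_M):
    """How many nucleus diameters the stretched-out DNA would span."""
    return length_m / diameter_m


length = dna_length_m()
print(f"DNA length: {length:.2f} m")
print(f"Nucleus diameters spanned: {length_to_diameter_ratio(length):,.0f}")
```

Under those assumptions the length comes out a little over two metres, which is a couple of hundred thousand times the width of the nucleus it has to fit into. Change the assumed nucleus size and the ratio shifts, but the folding problem stays just as absurd.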

Part of this is explained by a field called epigenetics that really took off in the post-genome era. Its origins go back at least as far as the 1940s and 50s, when scientists like Barbara McClintock were discovering that the story of how genes worked didn’t stop at the DNA sequence itself. She went on to win a Nobel Prize in 1983 for her work on mobile DNA elements that regulate gene expression, even though at the time her work was regarded by some as an eccentricity of maize genetics. But that’s a whole other post!

Colour variation in maize is regulated by mobile DNA elements in its genome called transposons.

The part of the DNA sequence that immediately precedes that of a gene is called the promoter. Think of this as a staging ground for the molecular machines that read the genes. In short, the promoter promotes gene activity, so some or all of the following information can be found there:

  • The When – During what developmental stage should gene-X be active; embryo? child? adult?
  • The Where – In what ‘type’ of cell should gene-X be active; muscle? bone? brain?
  • and The How Much – Is a lot or only a little of gene-product-X needed?

This is nowhere near the whole story at the sequence level, but for now let’s assume it is, as I’m about to add a layer of complexity to it. The prefix epi- in epigenetics comes from Greek, meaning “over” or “on top of”, as it represents the genetic regulation that supersedes that which is imparted by the DNA sequence itself. Put another way, epigenetics is like a very elaborate filing system for genetic information in a cell.

Think of it this way: if all cells contain essentially the same set of genetic blueprints, then there have to be ways to mark the genes pertinent to a neuron’s function, for example, as active, while leaving the others, which could be either irrelevant or potentially detrimental to neuronal function, marked as silent. Epigenetics can be thought of as the collective set of molecular signposts that index an epigenome, ensuring that only those genes are available to that cell for use when and where they are promoted to do so.

There are principally two ways that these marks work. One set involves marking the histones, the proteins that package DNA into chromatin. These are chemically modified in a number of ways, facilitating a rich code of information that marks genes, where their promoters start, where their coding sequences start and end, and what degree of readiness for transcription genes should adopt for that particular cell type. How this code is read by the cell, and how the cell organises this information in a functional way, are still very actively researched topics.

The other type involves the direct chemical modification of DNA, called DNA methylation, which is generally associated with gene silencing. It is this modification, and the mechanisms the cell uses to detect it, that Professor Bird’s group works on. Certain DNA sequences can be chemically modified by the addition of methyl groups, and these tend to accumulate near the start of gene promoters. As is often the case with biology, when molecules get chemically modified, there’s usually another molecule that is able to read or recognise this modification. This, in essence, is how biological information flows in the cell. One molecule is modified, another binds to it, and that binding effects an outcome. In this case, methylating the DNA in promoters makes them a target for a protein called MeCP2, whose binding leads to a cascade of events that results in the silencing of that gene. Mutations in MeCP2, as it turns out, are the major cause of Rett Syndrome. During the lecture, Professor Bird showed encouraging results that Rett Syndrome symptoms could be reversed with gene therapy in the animal models within four weeks of treatment (and yes, that’s the same mouse).