
Department News



Erin O’Shea and Felix Lam

If stretched end to end, the DNA inside a single human cell would measure approximately 2-3 meters. A fiber of this length is physically compacted a million-fold by a hierarchy of packing proteins to fit into the confines of a cell’s nucleus. At the gene level, 150 base pair segments of DNA are compacted by being spooled around protein particles known as nucleosomes. However, not unlike computer-based compression methods, the gain in space efficiency comes at a cost: the information contained within is less accessible. Indeed, at any given moment, large swaths of the genome lie inert in a tightly bundled form known as heterochromatin. Yet even transcriptionally available genes do not have much of their DNA completely accessible. Studies probing the regulatory promoter region of genes in budding yeast found that small 100-150 base pair stretches of DNA are always maintained in an un-spooled state. To activate a particular gene, transcription factors typically bind recognition sequences in the un-spooled segment. This binding then triggers the surrounding DNA to un-spool and become accessible to additional transcription components. These permanently exposed binding sequences thus provide a means for the cell to locally “decompress” DNA on demand to selectively access genetic information.

What our lab has uncovered, however, is that this local decompaction is far more sophisticated than was previously thought. Continuing the work in budding yeast, we found that binding sequences in cells are strategically blocked or exposed in order to achieve quantitatively precise gene responses. For example, by varying the binding affinity of recognition sequences in the un-spooled segment, cells can fine-tune when a gene will activate. In fact, by analyzing the promoter architecture of a collection of genes in the phosphate starvation pathway, we were able to accurately predict the existence of two broad gene subclasses, each of which activates at a distinct level of external stress.

While this clarifies the role of exposed sequences, what about the numerous binding sequences that normally remain inaccessible within spooled segments? What we’ve found is that these hidden sites serve primarily to scale up the total gene expression strength. Interestingly, because these sites collectively come into play only after the gene has first been triggered through the exposed sequences, a gene’s threshold can be programmed largely independently from its expression strength.
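The decoupling described above can be illustrated with a toy calculation. The sketch below is purely illustrative and is not the model used in the published work: it assumes that opening of the promoter follows a Hill-type function of transcription factor level, governed only by the affinity of the exposed site, and that each hidden site adds a fixed increment to the maximum output once the promoter is open. The function names, the Hill coefficient, and the per-site gain are all hypothetical choices made for illustration.

```python
# Toy model (illustrative assumptions only, not the authors' published model).
# - The exposed site sets WHEN the gene turns on (threshold).
# - Hidden sites set HOW STRONGLY it is expressed once it is on (capacity).

def max_expression(n_hidden, per_site_gain=1.0):
    """Hypothetical maximum output: a baseline of 1 plus a fixed gain per hidden site."""
    return 1.0 + per_site_gain * n_hidden


def expression(tf, kd_exposed, n_hidden, hill=4.0, per_site_gain=1.0):
    """Expression at transcription factor level `tf`.

    `kd_exposed` stands in for the affinity of the exposed site
    (a lower value means tighter binding and earlier activation).
    """
    opening = tf**hill / (tf**hill + kd_exposed**hill)  # fraction of promoters opened
    return opening * max_expression(n_hidden, per_site_gain)


def threshold(kd_exposed, n_hidden, iterations=200):
    """TF level giving half-maximal expression, located by bisection."""
    target = 0.5 * max_expression(n_hidden)
    lo, hi = 0.0, 1000.0 * kd_exposed
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expression(mid, kd_exposed, n_hidden) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for kd in (1.0, 2.0):
        for n in (0, 5):
            print(f"kd_exposed={kd}, n_hidden={n}: "
                  f"threshold={threshold(kd, n):.3f}, "
                  f"max expression={max_expression(n):.1f}")
```

In this sketch, changing the number of hidden sites alters the maximum output but leaves the threshold where it was, while changing the exposed-site affinity moves the threshold without affecting the maximum. That independence is built into the assumptions of the toy model; it mirrors the behavior reported here rather than demonstrating it.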

The fact that these two gene performance attributes are decoupled from one another has a number of fundamental implications. From an evolutionary standpoint, it suggests that cells are able to modify a gene’s point of activation or its expression capacity simply by mutating DNA binding sequences in either un-spooled or spooled segments. From a design standpoint, it suggests that nature has solved a principal engineering problem of how to modularize gene properties. Were it not for nucleosomes, changing a gene’s threshold would simultaneously affect its expression strength. However, with nucleosomes and the ability to strategically expose particular binding sequences, cells have the power to individually tune each property, giving them more flexibility to adapt to a greater spectrum of selective pressures at the DNA level.

April 16, 2008: Advance Online Publication in Nature
