BIOINFORMATIC DISSECTION OF HIGH-LEVEL CELLULAR BEHAVIOR
January 19th, 2005
Although microarray technology is wildly popular among biologists, even its biggest fans are sometimes overwhelmed by the sense of having too much information. This is especially frustrating for systems biologists trying to spot patterns in data sets that can seem as vast as the Milky Way. MCB graduate student Benjamin de Bivort became interested in this problem in early 2003, during a short course on complex systems taught by MCB associate Yaneer Bar-Yam, president of the New England Complex Systems Institute, an independent educational and research institution located in Cambridge, MA.
If de Bivort could sort microarray data into a number of "bins" that was small enough to be mathematically tractable, yet big enough to yield robust biological results, he thought it would be possible to observe interactions among a cell’s functional components. In the December 21, 2004, issue of Proceedings of the National Academy of Sciences, he and two colleagues report using 12 gene groups to identify the functional and regulatory dynamics of a cell’s overall response to assorted small molecules, cytokines, and signaling proteins. (See de Bivort et al., "Dynamics of cellular level function and regulation derived from murine expression array data.")
In addition, these findings could be used to predict how major functional units within the cell will respond to new mutations, experimental drugs, or other perturbations, de Bivort says. His coinvestigators in this work were Bar-Yam and Sui Huang, a postdoctoral fellow in vascular biology at Children’s Hospital Boston and Harvard Medical School.
The raw microarray data came from the Alliance for Cellular Signaling (AfCS), a multi-institutional consortium designed to answer global questions about cell-signaling networks. AfCS researchers exposed murine B lymphocytes to 33 different conditions and measured expression levels for approximately 16,000 cDNAs at four time points (0.5 hour, 1 hour, 2 hours, 4 hours). De Bivort used the resulting expression profiles, consisting of 132 values for each cDNA (33 conditions at four time points each), to put the genes into 12 functional groups. Gene ontology annotations enabled the researchers to identify categories of functional processes for each of these 12 "megamodules," such as ATP synthesis or chromatin unpacking, and time-step data made it possible to chart how influences exerted by each group changed over time.
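To make the arithmetic concrete, here is a minimal sketch of how a 132-value profile arises (33 conditions times four time points) and how profiles might be averaged within groups. It is an illustration only, not the method used in the paper: the expression values are random, the gene count is scaled down, and the group labels and the helper names `make_profile` and `group_means` are hypothetical.

```python
import random

N_CONDITIONS = 33   # experimental conditions, from the article
N_TIMEPOINTS = 4    # 0.5, 1, 2 and 4 hours
N_GROUPS = 12       # number of "megamodules"
N_GENES = 240       # toy size (assumption); the real data had roughly 16,000 cDNAs

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def make_profile():
    """Return one synthetic profile: conditions x time points, flattened."""
    return [rng.gauss(0.0, 1.0) for _ in range(N_CONDITIONS * N_TIMEPOINTS)]


profiles = [make_profile() for _ in range(N_GENES)]

# Hypothetical group labels (round-robin). In the study, the grouping
# came from the authors' own analysis of the expression profiles.
group_of = [gene % N_GROUPS for gene in range(N_GENES)]


def group_means(profiles, group_of, n_groups):
    """Average the profiles of all genes assigned to each group."""
    length = len(profiles[0])
    sums = [[0.0] * length for _ in range(n_groups)]
    counts = [0] * n_groups
    for profile, group in zip(profiles, group_of):
        counts[group] += 1
        for i, value in enumerate(profile):
            sums[group][i] += value
    return [[s / counts[g] for s in sums[g]] for g in range(n_groups)]


means = group_means(profiles, group_of, N_GROUPS)
print(len(profiles[0]))            # 132 values per cDNA
print(len(means), len(means[0]))   # 12 group profiles of 132 values each
```

Averaging over many genes in a group damps gene-to-gene noise, which is the intuition behind working with a dozen large groups rather than thousands of individual profiles.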
"By using large groups and averaging data, we were able to see what’s really going on," says de Bivort. He compares this strategy to letting your eyes drift out of focus so you can see a hidden image of Mickey Mouse emerge from what looks like a uniform geometric pattern.
For the most part, the megamodules turned one another on and off in predictable ways. For example, a group rich in aerobic respiration genes—essential for generating ATP needed for transcription—exerted a strong activating influence on other gene groups. In contrast, megamodules packed with genes for ATP-consuming behaviors, such as DNA replication and chromatin remodeling, inhibited transcription.
Two unexpected insights, however, did surface. One sheds light on how global transcription is regulated to maintain homeostasis, a phenomenon easy to see in a single endocrine circuit with a negative feedback loop but poorly understood at the systems level. De Bivort’s analysis showed that pairs of megamodules that are co-regulated respond in opposite ways to the same experimental condition, balancing one another and keeping the cell on an even keel.
The second surprise involves short-term activation/long-term repression, which de Bivort describes as "a fairly simple idea—a component in a lattice or time series will encourage nearby components that are similar but discourage or repress dissimilar ones that are further away. It’s like kids on a playground: the ones with Pokemon cards will stick together and will push away kids with Beanie Babies." The analysis showed that the only megamodule over-endowed with cell-cycle genes oscillated over time, rather than space, first activating other groups and then later inhibiting them.
What is most likely to interest biomedical researchers, of course, is the predictive value of this analytic method. Using expression profiles from 27 of the experimental exposures, de Bivort and his team were able to correctly predict how the cell would respond to the other 6 ligands. "This means we could predict the response of the cell to any perturbation, not just the 33 in the data set," he says. The interactions among the 12 gene groups are "universal," de Bivort says, and they should hold true for other cell types and ligands.
Although some will no doubt use this model to test experimental drugs, de Bivort won’t be among them. He is now a third-year graduate student in the laboratory of Professor Sam Kunes, pursuing thesis research that won the Ernie Peralta Award last October. Turning away from bioinformatics for the time being, de Bivort is using traditional developmental and behavioral genetics methods to study visual learning in Drosophila.