Graduate student Kate Shulgina has spent six years in the Eddy Lab developing a computational tool called Codetta that predicts variations in the genetic code that cells use to translate RNA sequences into proteins. Shulgina is a student in Harvard’s Systems, Synthetic, and Quantitative Biology (SSQBio) program and one of many graduate students from university-wide programs who conduct research in MCB labs.
SSQBio students can be found in biology labs in the medical school, at the Broad, and in Harvard FAS departments like MCB. “There’s almost no restriction on which labs you can join,” Shulgina says. “I came to grad school not really sure what I wanted to work on, so it was really important to me to have the most options available.”
Shulgina was born in Belarus and immigrated to the United States as a toddler. The family settled first in Buffalo, NY, and later in Ohio. Shulgina grew up birding and hiking around the Great Lakes. Alongside her enthusiasm for the outdoors, she developed an interest in life’s ancient history and curiosity about the processes that allow life to exist.
As an undergraduate at Princeton, Shulgina majored in molecular biology and minored in computer science and neuroscience. As a freshman, she did a stint in Laura Landweber’s lab investigating the early origins of the standard genetic code. Shulgina also spent a semester conducting research in Hugh Dickinson’s plant biology lab in Oxford, UK, which she describes as a “trial run for grad school.” Her senior thesis research on the genomics of neuropsychiatric disorders took place in Sam Wang’s lab at Princeton. Knowing that she wanted to pursue a career in research, Shulgina began at SSQBio in the fall after her graduation.
In their first year, SSQBio students independently develop a program or set of equations that can address a biological question and present their project as part of their preliminary qualifying exam. The second part of their preliminary qualifying exam consists of preparing and defending a dissertation research proposal. Shulgina’s “PQE1” project focused on finding alternative genetic codes, and her thesis work on Codetta grew out of this unique requirement of the SSQBio program.
Shulgina enjoyed working on her PQE1 project but initially wasn’t sure if she wanted to pursue it as a thesis project. “I was like, ‘Whatever idea I thought of when I was 21 is probably not a great idea. I’d be better off doing something some professor thought of.’ And it took me a while to realize: No, I gotta trust my gut. I really like this! It’s not a bad idea,” she says. “And I think my whole PhD has been like that, just trying to do what I thought was the right thing to do and being true to myself and my science.” MCB faculty member Sean Eddy had also been thinking about alternative genetic codes, so Shulgina rotated into his lab and expanded upon the Python code she had written to build Codetta.
Codetta analyzes DNA sequence data and predicts how the codons in the data will be translated into proteins. Most organisms that have been studied translate codons in the same way, with each three-nucleotide codon corresponding to a particular amino acid or to a start or stop signal. But scientists have identified a growing number of organisms that translate some codons into different amino acids. For instance, a bacterium that uses an alternative genetic code might translate a codon that typically encodes arginine into methionine instead.
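To make the idea concrete, here is a minimal Python sketch of how the same DNA sequence yields different proteins under two genetic codes. It is an illustration only and does not reproduce how Codetta itself works. The standard codon table is built from the conventional TCAG ordering; the reassignment of the arginine codon AGG to methionine, the example sequence, and the `translate` helper are all hypothetical choices made for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, code=STANDARD_CODE):
    """Translate a DNA coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = code.get(dna[i:i + 3], "X")  # "X" for unrecognized codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical alternative code: AGG (normally arginine, R) read as methionine (M).
ALTERNATIVE_CODE = {**STANDARD_CODE, "AGG": "M"}

# Made-up example sequence: ATG AGG GCT AGG TAA
sequence = "ATGAGGGCTAGGTAA"

standard_protein = translate(sequence)
alternative_protein = translate(sequence, ALTERNATIVE_CODE)

print(standard_protein)     # MRAR
print(alternative_protein)  # MMAM
```

A single reassigned codon changes every position where that codon occurs, which is why a protein sequence inferred under the wrong code can differ from the protein the cell actually makes.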
Though most of her Ph.D. work has been computational, Shulgina spent some time in her third year doing benchwork in the Murray Lab. Her goal was to verify an unexpected result from Codetta. Normally, the program yields one amino acid per codon, but on one yeast genome sequence, the program stalled on a particular codon and wouldn’t produce a result. It turned out that the codon was likely being translated as two different amino acids.
Shulgina tested Codetta by running a large-scale screen of genomes from GenBank. She published the results of this screen in the journal eLife last year.
Codetta is available on GitHub for use by the scientific community. Shulgina says several researchers have reached out to her with questions about how to use it. Although Shulgina will most likely move on to other scientific questions in the next phase of her career, she hopes that using Codetta or a similar program to verify which genetic code an organism uses will catch on more widely. Existing protein databases are mostly extrapolated from DNA sequencing data on the assumption that the organism uses the standard code.
“If people are rarely checking whether you’re using the right genetic code, then it’s possible that some portion of the sequences in these databases are not accurate,” she says. “So it’s important to check the genetic code to ensure the accuracy of these widely used protein sequence databases that are behind much research in biology.”
“My dream is that even people who don’t care about looking for new genetic codes will use it,” she adds.
“Kate is an inspiration,” says MCB faculty member and Shulgina’s adviser Sean Eddy. “She is driven by pure scientific curiosity, and she has a fierce intellect and rigorous standards. She came to my lab with her own original idea and she chased it through to fruition. What she’s done is an ideal and beautiful example of what a PhD project can and should be like.”
Shulgina is still an avid birder and, now, a mushroom hunter. Spending time in the woods is often a “meditative” experience, she says. “Mushroom foraging and birding are kind of similar. You go out in the forest, and you have to really pay attention. Take hen-of-the-woods. It grows at the base of oak trees, and it looks like a pile of leaves. So you really have to be very observant.”
Shulgina says conducting her thesis research on a topic she came up with independently has given her a sense of ownership over her research that will carry over into future steps of her scientific career. At present, she is in the early stage of applying to postdoctoral positions.
“My perspective is, ‘Standing where I am right now, which option seems most appealing? What seems most exciting?’” she says. “And if that takes me to become a PI, then it will, and if it takes me somewhere else, then that’s what I’ll do.”