Redefining what cells understand when they interpret the genetic code.
By William Herkewitz
Aug 18, 2016
Marc Lajoie is working on a new type of genetically engineered life: organisms that read a new language of DNA. That is, they have the same letters in their DNA (A, C, G, and T), but they read and interpret them in a totally different way.
Today, Lajoie, a chemical biologist at the University of Washington, and a team of his colleagues published a paper in the journal Science detailing their pursuit. Lajoie explains how you’d build a “genomically recoded organism” and why such organisms would be immune to every single virus on Earth.
PM: What is a genomically recoded organism, or GRO?
ML: Well, it’s actually not too complicated, but we need to start with a bit of basic information first. Almost all life shares a common genetic code. This code is basically the language that cells use to read their DNA in order to translate their proteins, which are the molecular machines that perform most of life’s functions. For example, the genetic sequence A – G – G means the same thing for almost all organisms, from your cells to a plant cell to a yeast cell. In this case, it’s the instruction to add a specific molecule called arginine.
A genomically recoded organism is an organism that we have re-engineered to use a new language: One that sees the genetic sequence AGG as an entirely different instruction. Right now we’re working toward building such a GRO based on bacterial cells.
What’s the benefit of changing that genetic language?
We’re currently focusing on three major applications. The first is virus resistance. When viruses infect a host cell, they essentially inject their genome and hijack the cell to create more viruses. But this only works if both the virus and the cell are speaking the same genetic language. Since GROs speak a different language, the virus’s genetic instructions to replicate itself would be misread, and the virus couldn’t complete its life cycle.
The second is to introduce new biochemical capabilities that are not available in natural organisms. Almost all life shares a common genetic code, which explains how to translate genetic information into proteins. These proteins are composed of amino acids, and there are only 20 amino acids that are routinely used to make proteins. But there are plenty of unnatural amino acids that have useful chemical properties distinct from those 20. Thanks to great work done in several other laboratories, we know of over 150 unnatural amino acids that we could use to expand protein function. People are already using these unnatural amino acids to make better drugs for treating disease.
Finally, the third application is bio-containment. Since these modified organisms may exhibit broad viral resistance, we want to make sure they can’t escape into the world and mix with natural life. In addition to continuing to use our physical firewalls (keeping the organism inside of a laboratory, for example) we can also build genetic firewalls for GROs. To put it simply, we can redesign essential proteins so that the GRO can only survive if it has access to a certain unnatural amino acid that it won’t find in the wild.
How do you go about recoding life’s genetic language in the first place?
To understand how you’d recode an organism, you have to understand a bit about how life’s shared genetic code works. Cells basically read their genetic information in three-letter chunks, called codons. For example, the sequences ATG and TAG are codons. These three-letter codons are recognized by cellular machines called ribosomes, which basically match each codon with an amino acid building block. These codons can also tell a cell when to start or stop adding amino acids to a protein.
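As a rough illustration of the codon reading described above, here is a minimal Python sketch. It is a toy model, not the lab's software: the `translate` function and the tiny `CODON_TABLE` are hypothetical names introduced here, and the table lists only a handful of entries from the standard genetic code.

```python
# A few entries from the standard genetic code, written in DNA letters.
# The real table has 64 entries; this partial version is for illustration only.
CODON_TABLE = {
    "ATG": "Met",   # methionine, also the usual start signal
    "AGG": "Arg",   # arginine, the example used in the interview
    "CGT": "Arg",   # a second codon for the same amino acid
    "GGC": "Gly",   # glycine
    "TAG": "STOP",
    "TAA": "STOP",
    "TGA": "STOP",
}

def translate(dna):
    """Read dna three letters at a time; return amino acids up to the first stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in the toy table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("ATGAGGGGCTAG"))  # ['Met', 'Arg', 'Gly']
```

The sequence is split into ATG, AGG, GGC, and TAG, so the sketch returns three amino acids and halts at the stop codon.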
Now with just three spots and only four different types of nucleotides (the genetic letters A, T, G, and C), there are 64 possible codons. That’s interesting, because this is far more codons than the 20 amino acids that they code for. Because of this, many codons actually code for the exact same amino acid, making them redundant.
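The count of 64 follows from four choices at each of three positions (4 × 4 × 4). A short Python sketch makes the redundancy concrete; the variable names are illustrative, and the three stop codons used here are the ones in the standard genetic code.

```python
from itertools import product

NUCLEOTIDES = "ATGC"
STOP_CODONS = {"TAG", "TAA", "TGA"}  # stop signals in the standard genetic code

# Every possible three-letter combination of the four nucleotides.
codons = ["".join(letters) for letters in product(NUCLEOTIDES, repeat=3)]
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))                       # 64
print(len(sense_codons))                 # 61
print(f"{len(sense_codons) / 20:.2f}")   # 3.05 codons per amino acid on average
```

With 61 codons available for 20 amino acids, most amino acids are necessarily spelled by more than one codon.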
To recode an organism, the first thing you have to do is to free up one of these redundant three-letter codons for some new task. Here’s an example of how this works from one of our previous projects: There are three different codons that tell a ribosome to ‘stop’ translation and to release the completed protein. They’re TAG, TAA, and TGA. In 2013, my colleagues and I went through the DNA of a bacterial cell and identified and replaced every instance of TAG in the genetic code with TAA. When we finished, the TAG codon was no longer used in the bacteria, so we were able to delete the cellular machinery responsible for the TAG stop function. This resulted in an unused TAG codon that was free to be reassigned to a new function.
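To see why swapping TAG for TAA has to respect codon boundaries, consider this small Python sketch. It is only an illustration on a made-up sequence, not the procedure the team used in the lab; `recode_tag_stops` is a hypothetical helper, and the sketch assumes a single gene whose reading frame starts at the first letter.

```python
def recode_tag_stops(coding_sequence):
    """Replace in-frame TAG codons with TAA, leaving all other codons untouched."""
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    recoded = ["TAA" if codon == "TAG" else codon for codon in codons]
    return "".join(recoded)

# Made-up gene: ATG CTA GGC TAG. The letters "TAG" also appear out of frame,
# straddling the second and third codons (c-TAG-g).
gene = "ATGCTAGGCTAG"

print(recode_tag_stops(gene))      # ATGCTAGGCTAA  (only the stop codon changes)
print(gene.replace("TAG", "TAA"))  # ATGCTAAGCTAA  (naive text swap also alters GGC to AGC)
```

The naive text replacement changes a glycine codon (GGC) into a serine codon (AGC), which is why the edit has to be made codon by codon. A real genome adds further complications this sketch ignores, such as genes on both DNA strands and genes that overlap.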
So how do you convince that old codon to do a new trick?
You need to introduce two pieces of biological machinery that, together, redefine the function of a specific codon. One piece of machinery is tRNA, which tells the ribosome which amino acid to incorporate. The other is a protein that tells the tRNA which amino acid to choose. Other labs have shown that you can borrow these pieces of machinery from distantly related organisms and modify them to incorporate new, unnatural amino acids.
What steps did you take toward a full GRO in the new research you published today?
Well, we took an E. coli bacterium and chose seven different codons that we wanted to eliminate from the organism’s genetic code, essentially leaving them blank. Our strategy was the same as I described earlier. We had to make 321 changes to reassign the TAG codon back in 2013. To reassign all seven codons for this project will require 62,214 changes in the genome, many of them in essential genes, where just a single error in one letter could kill our cell. In total, we had to change about one in every 65 genetic letters, or nucleotides.
Because these changes were so prevalent, it was most efficient to synthesize DNA and to build the entire 4 million base pair genome from scratch in the lab. We’re now in the process of testing it in living bacteria, by swapping it in chunk by chunk and testing whether we’ve created any unintended problems by getting rid of these seven redundant codons. Out of a total of 87 chunks, we’ve gone through 55 so far, and we’ve been encouraged that the majority of them support relatively healthy cells.
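The figures quoted above can be checked with a little arithmetic. The Python sketch below uses only numbers from the interview; the genome size is the rounded "4 million" figure, which is an assumption here, so the result lands near, rather than exactly on, the quoted one in 65.

```python
changes_2013 = 321        # edits needed to free the TAG codon
changes_now = 62_214      # edits needed to free all seven codons
genome_size = 4_000_000   # rounded figure from the interview (assumed, not exact)

letters_per_change = genome_size / changes_now
scale_up = changes_now / changes_2013

chunks_total = 87
chunks_tested = 55
fraction_tested = chunks_tested / chunks_total

print(f"One change per {letters_per_change:.1f} letters")        # 64.3
print(f"About {scale_up:.0f} times as many edits as in 2013")    # 194
print(f"{fraction_tested:.0%} of chunks tested so far")          # 63%
```

A genome slightly larger than the rounded 4 million letters would bring the first figure up to the quoted one in 65.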
Do you expect any major problems before you start teaching your cells new tricks with old codons?
Oh man, there are certainly a bunch of challenges left, and we’re sure there are ones lurking out there that we don’t even know about yet.
Here’s a huge example. Although multiple codons can code for the very same amino acids, that doesn’t mean that they are equal. We already know that these codons can have different effects in other (and frankly, not very well understood) ways. For example, two seemingly redundant codons may have different effects on how much of a certain protein is produced, or exactly when it is produced.
We really don’t know yet if radically changing the genetic code is going to be deleterious to our cell. There has been plenty of theoretical discussion about whether the normal genetic code is optimal, and whether other genetic codes would impair fitness. We may be able to test that hypothesis.