Scientists Added Two New Letters to DNA's Code

Image: Vincent P/Flickr

If you’ve taken a science class, you’re likely aware that DNA is the body’s instruction manual. But its language is written in only four letters: A, T, C, and G. Those who paid extra-close attention will remember that RNA, the photocopy of the instructions that the cell actually uses, replaces the Ts with the letter U.

Back in 2014, scientists at the Scripps Research Institute in California reported that they’d engineered bacteria whose DNA used a whole new pair of letters, nicknamed X and Y. That same team now reports that they’ve gotten the bacteria to actually use these new letters. The biological possibilities, as a result, now seem endless.

“The resulting semi-synthetic organism both encodes and retrieves increased information,” the authors report this week in Nature, “and should serve as a platform for the creation of new life forms and functions,” such as new kinds of bacteria with specialized purposes (cleaning the environment, storing gifs... who knows).


The original five letters stand for molecules that pair up in fixed ways. G, or guanine, always partners with C, or cytosine. A, or adenine, partners with T, or thymine, in DNA, but in RNA, the copy of the DNA the body actually uses, it partners with U, or uracil. The DNA double helix is sort of like a zipper whose teeth are composed of these “nucleobases.”

These bases cryptically code for amino acids, the building blocks of proteins. The RNA copies the DNA, then heads over to the ribosome, the worker that actually uses the instructions to build a protein from these amino acids.
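The pairing rules above are simple enough to write down as a lookup table. The following is a minimal Python sketch, not anything from the paper: the function names `complement` and `transcribe` are made up for illustration, and the sketch ignores strand direction and everything else a real cell has to deal with.

```python
# Pairing rules from the text: G with C, and A with T in DNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

# In the RNA copy, A pairs with U instead of T.
RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner DNA strand, base by base (hypothetical helper)."""
    return "".join(DNA_PAIRS[base] for base in strand)


def transcribe(template: str) -> str:
    """Return the RNA copy that pairs with a DNA template (hypothetical helper)."""
    return "".join(RNA_PAIRS[base] for base in template)


if __name__ == "__main__":
    print(complement("GATTACA"))
    print(transcribe("GATTACA"))
```

Supporting a new pair like X and Y in this toy model would mean adding two entries to each table, which is roughly the zipper analogy in code form: the new teeth only fit each other.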

Researchers previously engineered E. coli bacteria that could incorporate a new pair, d5SICSTP and dNaMTP, nicknamed X and Y. But the new paper in Nature reports E. coli actually constructing special fluorescent proteins that incorporate new amino acids, based on instructions containing these new letters.

We’re not in the world where the existing five (and now seven) letters become a whole slew of new ones just yet. The researchers only introduced two new ones, and only tested them on two “codons,” or three-letter phrases that usually code for single amino acids. It’s like introducing a new letter to the English language but only using it to make a couple of new words.
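For a sense of scale, here is a rough Python sketch that counts how many three-letter codons are possible on paper with four DNA letters versus six. It is only combinatorics: the helper name `count_codons` is made up for this example, and the count says nothing about which of these codons a cell could actually read. The researchers tested just two.

```python
from itertools import product


def count_codons(alphabet: str, length: int = 3) -> int:
    """Count every possible 'word' of the given length (hypothetical helper)."""
    return sum(1 for _ in product(alphabet, repeat=length))


natural = "ATCG"           # the four natural DNA letters
expanded = natural + "XY"  # plus the two new letters

natural_count = count_codons(natural)    # 4 * 4 * 4
expanded_count = count_codons(expanded)  # 6 * 6 * 6

if __name__ == "__main__":
    print(natural_count, expanded_count, expanded_count - natural_count)
```

The difference between the two counts is the number of three-letter words that contain at least one X or Y, which is the theoretical room the new letters open up.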


Still, the sky is the limit if the paper is to be taken at its word. “Thus, the reported semi-synthetic organism is likely to be just the first of a new form of semi-synthetic life that is able to access a broad range of forms and functions not available to natural organisms,” the authors write.




Former Gizmodo physics writer and founder of Birdmodo, now a science communicator specializing in quantum computing and birds




While the science is undoubtedly cool, I’d love to see someone talk about the practical uses for this sort of thing. What can be built from X and Y that can’t be done with ATCG?

The only thing that explicitly comes to mind is perhaps the ability to make synthetic bacteria/yeast that simply cannot survive in the wild because there is no natural source for the components of these nucleobases. Without some feedstock, they simply run out of the raw materials and die out. I don’t know enough about these particular nucleobases to know if that’s a possibility.

Potentially you could also create proteins that fold in new ways, but I don’t think we have a real good grasp on how to make proteins consistently fold in the selected ways we want with the base pairs we already have.