Rebecca Dikow

Research Data Scientist, Office of Research Information Services, Office of the CIO

For this Smithsonian data scientist, genomics research and the development of computational and analytic tools are key to unlocking the mysteries of evolution.


“Where did we come from?” It is a question we’ve all probably asked at some point in our lives, but geneticists and biologists ask it every day. Now they are able to probe even more deeply into the mysteries of evolution, thanks to the science of genomic sequencing.

“Traditionally, scientists studied relationships among organisms using morphology, which examines physical similarities between structure and form,” says evolutionary biologist and Smithsonian postdoctoral fellow Rebecca Dikow. As DNA sequencing became possible, however, scientists were also able to sequence single genes, then groups of genes, to discover hereditary commonalities.

“But we still weren’t satisfied with what gene sequencing told us about an organism’s evolutionary history,” she continues. “Genomes can give us those answers.” Contained within every living cell, genomes are basically the blueprints of all life—including genes, chromosomes, and DNA—and the comparative analysis of genomes has the potential to revolutionize evolutionary biology. The Smithsonian stands at the cusp of this revolution.

Dr. Dikow’s interest in evolutionary biology was sparked at Cornell University, where she studied ecology and evolutionary biology as an undergraduate. “I did a summer internship at the American Museum of Natural History in New York,” she says, “and I became interested in studying patterns of relationships between organisms after seeing the power that collections-based research provides.”

She honed that interest during graduate school at the University of Chicago. “For my PhD, I really wanted to start asking phylogenetic questions with whole genomes, because it hadn’t been done.” Phylogenetics is the study of the evolutionary relationships among a group of organisms. Questions that genomics now makes it possible to explore include how organisms are related, how functional genes have arisen multiple times, and how those genes have come to do the same thing in different groups of organisms.

“The field is still in its infancy, especially for organisms with large genomes. I got started by looking at bacterial genomes because they are relatively small, and my interest has expanded to include theoretical questions in comparative genomics. I have also become interested in the emerging computational and analytical methods that we will use to answer these questions.”

[Image: Tasmanian devil]

Her first post-graduate project, co-advised by Rob Fleischer of the Smithsonian National Zoological Park and Kris Helgen of the National Museum of Natural History, has given her the opportunity to ask those broader biological questions. Dikow is now studying microbes found on 100-year-old Tasmanian devil specimens housed at the museum. These animals were part of a massive population decline caused by an unknown bacterial infection. By using metagenomic approaches to sequence all of the organisms found on the devil’s skin or in its intestines, Dr. Dikow hopes to identify the bacteria that contributed to the epidemic. Tasmanian devil populations recovered from that epidemic, but they are now in the midst of another, perhaps more serious, disease-driven decline. Although the two diseases are very different (a bacterial infection 100 years ago and a contagious cancer today), learning more about the bacteria that were present on the skin 100 years ago and throughout the 20th century has the potential to uncover new information about the disease susceptibility and historical immune profile of Tasmanian devils in general.

Just like Dr. Dikow, Smithsonian scientists across the institution are working on genetic sequencing for a variety of organisms, and this activity is generating massive amounts of data. Bioinformatics is the science of collecting and analyzing complex biological data, such as genomes. Robust bioinformatics makes it possible to analyze all of these new data, and Dikow is helping set up the hardware, software, and workflows to perform powerful analyses that take full advantage of the rapid changes in computational power and efficiency.

[Image: Example of a genome diagram]

“It’s still in the early stages, but our goal is to help develop software that can be used across the Smithsonian, so that each time a researcher starts a project they don’t have to learn a whole new set of computational tools. That way they can focus on what they do best, which is going out into the field, finding their specimens and studying ecology, behavior, and morphological differences. The bioinformatics will already be in place so that they can more readily answer their evolutionary questions and run their analyses.”

Given the appropriate computational and analytic tools, genomics research will allow Smithsonian scientists to explore biological processes more readily, including the accurate reconstruction of trees depicting the evolutionary relationships among all kinds of organisms. It will also provide answers to other important biological questions, including those that deal with conservation and ecosystem functionality. “They may be complicated questions,” says Dikow, “but once answered, they could help explain the incredible diversity of life on planet Earth.”

And there is no place better to do it than the Smithsonian, she says. “I hear people say, ‘Of course the Smithsonian has to sequence all those genomes, because who else will?’ I think they’re right. It’s just a matter of getting a good team together, because we’ve just begun to scratch the surface of what is possible with genomes. It’s unlimited at this point.”