Project Description: A major goal of astrobiology is understanding how life first arose and evolved. By comparing sequences from across the tree of life, we can make inferences about their ancestors going back to the Last Universal Common Ancestor (LUCA). We recently inferred the order in which amino acids were recruited into the genetic code, based on the relative usage of amino acids in protein domains that were present in LUCA. We propose a new approach to inferring the amino acid usage of LUCA by improving the amino acid substitution models used in likelihood calculations on phylogenetic trees. These models normally assume that amino acid frequencies are at equilibrium, an assumption that often does not hold. If we remove this assumption, we can co-infer amino acid frequencies for separate clades and for the root of a tree. Applying this approach to LUCA's protein domains, we can infer their ancestral amino acid usage and improve our understanding of LUCA and the genetic code.
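To illustrate what relaxing the equilibrium assumption means, here is a minimal sketch, not the project's actual model. It assumes a toy four-state alphabet and an F81-style substitution process chosen only because its transition matrix has a simple closed form. The function names and the example frequencies are hypothetical. The point is that when the root frequencies differ from a clade's equilibrium frequencies, the expected composition drifts along the branch instead of staying fixed.

```python
import numpy as np

def f81_transition_matrix(pi, t):
    """Transition probabilities P(t) for an F81-style model.

    pi: equilibrium frequencies of the process (assumed, toy values).
    t:  branch length in expected substitutions per site at equilibrium.
    """
    pi = np.asarray(pi, dtype=float)
    n = pi.size
    # Scale the rate so one unit of time is one expected substitution
    # per site when the sequence is at equilibrium.
    mu = 1.0 / (1.0 - np.sum(pi ** 2))
    decay = np.exp(-mu * t)
    # Closed form: stay put with weight `decay`, otherwise redraw from pi.
    return decay * np.eye(n) + (1.0 - decay) * np.tile(pi, (n, 1))

def propagate_frequencies(root_freqs, pi, t):
    """Expected state frequencies after time t, starting from root_freqs."""
    root_freqs = np.asarray(root_freqs, dtype=float)
    return root_freqs @ f81_transition_matrix(pi, t)

# Hypothetical values: the root composition is NOT at the clade's equilibrium.
root_freqs = np.array([0.4, 0.3, 0.2, 0.1])
clade_pi = np.array([0.25, 0.25, 0.25, 0.25])

for t in (0.0, 0.5, 5.0):
    print(t, propagate_frequencies(root_freqs, clade_pi, t))
```

A stationary model corresponds to the special case where the root frequencies equal the equilibrium frequencies, so the composition never changes along the tree. Treating the root frequencies, and per-clade equilibrium frequencies, as free parameters is one way the equilibrium assumption could be removed.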
NASA Relevance: Understanding the origins and early evolution of life on Earth is critical to informing our search for life elsewhere in the cosmos. This research will help us understand the nature of the Last Universal Common Ancestor of all living organisms and even earlier steps in the origin of life, such as the evolution of the genetic code.
Work Description:
- Design and test new models of amino acid substitution
- Write code using Python, R, and C++
- Analyze computational results using statistics
- Document methods and results in an accessible format
- Prepare publication-quality figures summarizing results
- Meet remotely with international collaborators
- Present results at group meetings and other forums
Open or Reserved Project: 1 position reserved, but will consider a new student if the requested student is not awarded.