The plastid caseinolytic protease complex as a model for cytonuclear coevolution
Date
2021
Authors
Williams, Alissa Marie, author
Sloan, Daniel, advisor
Bedinger, Patricia, committee member
Mueller, Rachel, committee member
Pilon, Marinus, committee member
Stenglein, Mark, committee member
Abstract
Coevolution, or evolution in response to reciprocal selective pressures, is important to biological function and the persistence of populations. Competition or mutualisms between organisms can drive coevolution, as can predatory or parasitic relationships. However, coevolution also occurs within cells, where it can result from interactions between proteins within complexes as well as between the multiple genomes of eukaryotic cells. In protein complexes, subunits must bind tightly and specifically to one another. Changes in one protein subunit are often correlated with changes in the other subunits to preserve the functionality of the complex. Thus, in many protein complexes, correlated rates of evolution are found between the sequences of component subunits. This covariation is strong enough to be used to predict which proteins are connected physically and/or functionally. The coevolution between multiple genomes in eukaryotic cells is known as cytonuclear coevolution. Plants, for example, have a nuclear genome and two cytoplasmic genomes, found in the plastid (chloroplast) and the mitochondrion. Many protein complexes within these organelles consist of subunits deriving from both the nucleus and the organelle itself. Since the nuclear genome and organelle genomes differ in modes of transmission, mutation rates, and selective pressures, partnerships between proteins originating from two cellular compartments are valuable models for understanding protein complex evolution. Protein complexes are frequently shaped via gene duplication. Many protein complexes contain paralogous proteins at their cores; the duplication of a self-binding protein leads to dimerization of the paralogous proteins and subsequent recruitment of additional subunits. Gene duplication after establishment of a heteromeric complex allows subunits to specialize.
The plastid caseinolytic protease (Clp) complex provides a model system for studying protein complex evolution in the context of cytonuclear interactions, gene duplication, and evolutionary rate variation. This complex is highly conserved across bacteria and consists of adaptors, chaperones, and a proteolytic core. It is present in both plastids and mitochondria because these organelles are derived from ancient bacterial endosymbionts. The Clp core contains 14 subunits; in mitochondria and most bacteria, all 14 subunits are encoded by the same gene. However, in the cyanobacterial and plastid lineage, multiple rounds of gene duplication have led to a core encoded by nine different genes in the model plant species Arabidopsis thaliana. Further, only one of these plastid Clp core subunit genes is encoded by the plastid itself; the remaining eight are encoded by the nucleus, the result of gene transfers from the organelle to the nucleus early in the history of green plants. In addition to representing multiple rounds of gene duplication, the plastid Clp core also demonstrates extreme rate variation across green plants. The plastid-encoded subunit (ClpP1) is typically highly conserved across species. However, in some species, ClpP1 is one of the most rapidly evolving genes across all three genomes. In this dissertation, I use these features of the plastid Clp complex to shed light on protein complex evolution in various contexts. After a general introduction to the field in Chapter 1, Chapter 2 focuses on the evolutionary history of ClpP1, looking at rate variation and the loss of introns, RNA editing sites, and catalytic sites across green plants. Through mass spectrometry, I determine that ClpP1 is still a functional protein in Silene noctiflora, which has one of the most divergent plastid Clp complexes known. This work also includes an evolutionary rate covariation analysis between ClpP1 and the nuclear-encoded Clp core genes.
Chapter 3 provides genomic resources, including a high-quality, long-read transcriptome, for S. noctiflora, which is a species of interest because of its highly divergent plastid Clp complex. Analysis of the transcriptome revealed a triplication of one of the nuclear-encoded Clp core genes in this species. Chapter 4 discusses the recent duplication history of the nuclear-encoded Clp core genes across a broad range of flowering plants. I use these data to examine and characterize post-duplication evolutionary fates of paralogs. These analyses are extended to another plastid complex, acetyl-CoA carboxylase (ACCase). Taken together, these chapters elucidate various features of plastid Clp complex evolution and provide insight into the possible causes and consequences of rate variation and gene duplication in the coevolution of protein complex subunits.