
Data analysis and predictive modeling for synthetic and naturally occurring biological switches

Date

2016

Authors

Schaumberg, Katherine A., author
Prasad, Ashok, advisor
Medford, June, advisor
Shipman, Patrick, committee member
Antunes, Mauricio, committee member
Krapf, Diego, committee member

Abstract

Biological switches are biochemical network motifs that determine the chemical state of cells and are a key part of every biological system. Their impact on cell behavior is broad. For example, many diseases, such as cancer, are thought to be caused by misregulation of the biochemical state of a cell or group of cells, and cell fates in differentiating stem cells are controlled by biological switches. Because of their general importance, the synthetic biology community has also constructed synthetic biological switches in living organisms. While different kinds of switches are possible, in my thesis I study switches capable of stably generating two distinct molecular states, also called bi-stable switches. These switches are studied here from two perspectives. In Chapters 1-4 I present theoretical and experimental work analyzing specific circuits that act as biological switches. In Chapter 5 I take a data mining perspective to identify gene expression signatures of switches that are sensitive to cytotoxic cancer drugs.
This dissertation starts with a computational analysis of the effect of leaky promoter expression on bi-stable biological switches. In several biological and synthetic systems, gene transcription is never completely off, even when repressed. This residual expression is referred to here as leaky expression. Bi-stable systems would be expected to have some amount of leaky expression in their off state. However, the impact of leaky expression on the functioning and properties of biological switches has not been well studied. To help fill this gap, we conducted a theoretical analysis of the effect of leaky expression on biological switches. Two switches were studied: one based on positive feedback and one based on negative inhibition. We found that the two circuit topologies showed different advantages in their ability to handle leaky expression.
Next, this dissertation describes work done in collaboration with the Medford lab at Colorado State University to construct and characterize a library of genetic plant parts. These parts would later be used to construct what is perhaps the first synthetic bi-stable toggle switch in a plant. As part of this study, experiments were designed and conducted to determine the nature of the experimental noise associated with the assays used to test these plant parts. A mathematical normalization was developed to estimate quantitative information on the performance of each part. Validation experiments were done to assess the usefulness of this method for predicting the behavior of stably transformed plants from higher-throughput transient assays. In the end, a library of over one hundred quantitatively characterized plant parts in both Arabidopsis and Sorghum was constructed. The quantitative parameters of this library were then used in combination with a probabilistic bootstrap method we developed to predict optimal part combinations for construction of a bi-stable switch in Arabidopsis.
The dissertation concludes with a study of biological networks in cancer cells from a data mining perspective. A large amount of data exists in the public domain on the sensitivity of cancer cell lines to cytotoxic drugs. Some cancers appear to be in a "sensitive state" while others are in a "resistant state". We would like to know the gene expression signatures of these two states in order to predict cancer drug sensitivity from gene expression data. As a first step towards this goal, we assessed the repeatability of predictions between two standard databases of cancer cell lines, the NCI60 and the GDSC. This led to the identification of a preprocessing method needed to combine data from multiple databases. This was followed by the development of a comparative analysis platform, which was used to test the accuracy of models designed to predict drug sensitivity when different model construction methods were used.

Subject

cancer cell lines
data analysis
synthetic biology
computational biology
big data
plant genetic circuits
