
The development of a high-throughput plasmid assembly pipeline

dc.contributor.author: Hernandez, Sarah, author
dc.contributor.author: Peccoud, Jean, advisor
dc.contributor.author: Heaslip, Darragh, committee member
dc.contributor.author: Peebles, Christie, committee member
dc.contributor.author: Troxell, Wade, committee member
dc.date.accessioned: 2026-01-12T11:29:22Z
dc.date.issued: 2025
dc.description.abstract: Plasmids are a key tool in every biologist's toolbox, used in research, therapeutic development, and industrial biotechnology. With advances in computational design and machine learning, researchers need to design thousands of plasmids to potentially fulfill a single experimental goal. However, such expansive libraries challenge current assembly methods. There is a need for a scalable, high-throughput process for whole-plasmid construction and verification that can overcome the current limitations of commercial gene synthesis services. Recent progress in molecular technologies and laboratory automation has increased the volume of samples that can be produced, shifting the bottleneck of plasmid assembly from physical operations to data management. This dissertation addresses some of the challenges that must be overcome to integrate complex physical operations into a system that can execute plasmid construction workflows predictably and at scale. Controlling the quality of the process is essential to benchmarking future iterations of the plasmid manufacturing process. A bioinformatics tool was developed to streamline the analysis of sequencing data, and the reproducibility of this quality control process was characterized extensively. Algorithms from computer science and electrical engineering were adapted to DNA to develop self-documenting plasmids that are easier to track during construction and afterward, when the plasmids are used to develop and manufacture a biotechnology project. While the development of an industrial-scale plasmid construction infrastructure is beyond the scope of a doctoral project, this work identifies 10 key data management principles that provide a roadmap for developing such a system. Similarly, the risks associated with developing this infrastructure were analyzed to guide its safe and resilient development.
A key focus was placed on sample and data management for the constructs produced. Plasmid-centric hybrid sequencing and assembly pipelines streamline data assembly by combining next-generation and long-read sequencing data, leveraging the strengths of each platform to circumvent the weaknesses of the individual technologies. This hybrid strategy enables accurate plasmid assemblies, even within complex libraries. In parallel, verification of plasmid sequences by embedded digital documentation is explored, using DNA storage techniques to store information directly in a plasmid sample. Directly linking the physical and digital documentation of samples makes verification of a construct easier and more reliable for larger libraries. Such use of digital signatures improves reproducibility across the different construction stages: samples can directly inform the user of potential mutations within the sequence, even when the sequence is not known to the user. Combined with standardized lab techniques, these methods make it easier and more efficient to produce, track, and verify larger plasmid libraries than before. These advances support a shift towards scalable, automation-friendly plasmid assembly workflows. By integrating plasmid-specific, robust sequencing workflows, embedded sequence documentation, and high-throughput strategies, the framework presented here lays the foundation for improved plasmid library construction. This work demonstrates that, as the design space for plasmids expands, proper workflows and strong management infrastructures are essential to ensure the fidelity of data, the integrity of samples, and reproducible science overall.
dc.format.medium: born digital
dc.format.medium: doctoral dissertations
dc.identifier: Hernandez_colostate_0053A_19263.pdf
dc.identifier.uri: https://hdl.handle.net/10217/242736
dc.identifier.uri: https://doi.org/10.25675/3.025628
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado State University. Libraries
dc.relation.ispartof: 2020-
dc.rights: Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
dc.rights.access: Embargo expires: 01/07/2027.
dc.subject: molecular biology
dc.subject: sequencing
dc.subject: systems biology
dc.subject: plasmids
dc.subject: high-throughput
dc.subject: synthetic biology
dc.title: The development of a high-throughput plasmid assembly pipeline
dc.type: Text
dc.type: Image
dcterms.embargo.expires: 2027-01-07
dcterms.embargo.terms: 2027-01-07
dcterms.rights.dpla: This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
thesis.degree.discipline: Systems Engineering
thesis.degree.grantor: Colorado State University
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy (Ph.D.)

Files

Original bundle

Name: Hernandez_colostate_0053A_19263.pdf
Size: 2.71 MB
Format: Adobe Portable Document Format
Access status: Embargo until 2027-01-07