Enzyme selection for optical mapping is hard
Date
2015
Authors
Adams, Laura, author
Boucher, Christina, advisor
Howe, Adele, committee member
Ingram, Patrick, committee member
Abstract
Assembling a genome without access to a reference genome is prone to a type of error called a misassembly error. These errors are difficult to detect and can mimic true biological variation. Optical mapping data has been shown to have the potential to reduce misassembly errors in draft genomes. Optical mapping data is generated by applying digestion enzymes to a genome. In this paper, we formulate the problem of selecting the optimal digestion enzymes to create the most informative optical map. We show that this problem is NP-hard and W[1]-hard. We also propose and evaluate a machine learning method that uses a support vector machine and feature reduction to estimate the optimal enzymes. Using this method, we were able to predict two optimal enzymes exactly and estimate three more within reasonable similarity.
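To illustrate how the choice of digestion enzyme shapes an optical map, the following is a minimal sketch of an in silico digestion. It is not the method evaluated in this work. The function name `digest`, the toy sequence, and the simplification that the enzyme cuts at the start of its recognition site are assumptions made for illustration only; real enzymes cut at an enzyme-specific offset within the site.

```python
import re
from typing import List


def digest(genome: str, site: str) -> List[int]:
    """Return the ordered fragment lengths obtained by cutting `genome`
    at every occurrence of the recognition sequence `site`.

    Assumption: the cut falls at the first base of the site.
    """
    # A zero-width lookahead also finds overlapping occurrences of the site.
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", genome)]
    bounds = [0] + cuts + [len(genome)]
    # Drop the empty fragment that arises when a site sits at position 0.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Toy sequence (hypothetical) containing two GAATTC sites and one GGATCC site.
genome = "AAGAATTCTTGGATCCAAGAATTCAA"
ecori_map = digest(genome, "GAATTC")  # [2, 16, 8]
bamhi_map = digest(genome, "GGATCC")  # [10, 16]
```

The two recognition sequences yield different ordered fragment-length lists for the same sequence, which is why some enzymes produce a more informative map than others for a given genome.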
Subject
genome assembly
optical mapping
misassembly error
enzyme selection