Enzyme selection for optical mapping is hard

Date

2015

Authors

Adams, Laura, author
Boucher, Christina, advisor
Howe, Adele, committee member
Ingram, Patrick, committee member

Abstract

Assembling a genome without access to a reference genome is prone to a type of error called a misassembly error. These errors are difficult to detect and can mimic true biological variation. Optical mapping data has been shown to have the potential to reduce misassembly errors in draft genomes. Optical mapping data is generated by applying digestion enzymes to a genome. In this paper, we formulate the problem of selecting optimal digestion enzymes to create the most informative optical map. We show that this problem is NP-hard and W[1]-hard. We also propose and evaluate a machine learning method that uses a support vector machine and feature reduction to estimate the optimal enzymes. Using this method, we were able to predict two optimal enzymes exactly and estimate three more within reasonable similarity.
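
To illustrate how the choice of digestion enzyme shapes an optical map, the following is a minimal sketch of in silico digestion. It is not the method from this work. The function name `digest`, the toy sequence, and the rule of cutting at the first base of each recognition site are all assumptions made for illustration (real enzymes cut at an enzyme-specific offset within the site).

```python
def digest(genome: str, site: str) -> list[int]:
    """Return the ordered fragment lengths produced by cutting `genome`
    at every occurrence of the recognition sequence `site`.

    Simplifying assumption: the cut falls at the first base of the site.
    """
    cuts = []
    start = 0
    while True:
        pos = genome.find(site, start)
        if pos == -1:
            break
        cuts.append(pos)
        start = pos + 1  # advance by one so overlapping sites are also found

    # Fragment boundaries are the two sequence ends plus every cut position.
    bounds = [0] + cuts + [len(genome)]
    # Drop zero-length fragments (for example, a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    toy_genome = "AAGAATTCCCGAATTCT"  # hypothetical toy sequence
    print(digest(toy_genome, "GAATTC"))  # [2, 8, 7]
    print(digest(toy_genome, "CCC"))     # [7, 10]
```

The two recognition sequences produce different fragment-length patterns from the same sequence, which is why the choice of enzyme affects how informative the resulting map is.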

Subject

genome assembly
optical mapping
misassembly error
enzyme selection
