Deep learning for bioinformatics sequences: RNA basecalling and protein interactions
dc.contributor.author | Neumann, Don, author | |
dc.contributor.author | Ben-Hur, Asa, advisor | |
dc.contributor.author | Beveridge, Ross, committee member | |
dc.contributor.author | Blanchard, Nathaniel, committee member | |
dc.contributor.author | Reddy, Anireddy, committee member | |
dc.date.accessioned | 2024-05-27T10:32:48Z | |
dc.date.available | 2024-05-27T10:32:48Z | |
dc.date.issued | 2024 | |
dc.description.abstract | In the interdisciplinary field of bioinformatics, sequence data for biological problems comes in many different forms, ranging from proteins, to RNA, to the ionic current for a strand of nucleotides from an Oxford Nanopore Technologies sequencing device. This data can be used to elucidate the fundamentals of biological processes on many levels, which can help humanity with everything from drug design to curing disease. All of our research focuses on biological problems encoded as sequences. The main focus of our research involves Oxford Nanopore Technologies sequencing devices, which are capable of directly sequencing long-read RNA strands as is. We first concentrate on improving the basecalling accuracy for RNA, and have published a paper with a novel architecture achieving state-of-the-art performance. The basecalling architecture uses convolutional blocks with progressively larger kernel sizes, which improves accuracy given the noisy nature of the data. We then describe ongoing research into the detection of post-transcriptional RNA modifications in nanopore sequencing data. Building on our basecalling research, we are able to discern modifications with read-level resolution. Our work will facilitate research into the detection of N6-methyladenosine (m6A) while also furthering progress in the detection of other post-transcriptional modifications. Finally, we recount our recently accepted paper regarding protein-protein and host-pathogen interaction prediction. We performed experiments demonstrating that faulty experimental design has plagued interaction prediction, giving the false impression that the problem has been solved. We then provide reasoning and recommendations for future work. | |
dc.format.medium | born digital | |
dc.format.medium | doctoral dissertations | |
dc.identifier | Neumann_colostate_0053A_18230.pdf | |
dc.identifier.uri | https://hdl.handle.net/10217/238479 | |
dc.language | English | |
dc.language.iso | eng | |
dc.publisher | Colorado State University. Libraries | |
dc.relation.ispartof | 2020- | |
dc.rights | Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright. | |
dc.subject | post transcriptional modifications | |
dc.subject | RNA basecalling | |
dc.subject | protein protein interactions | |
dc.subject | deep learning | |
dc.title | Deep learning for bioinformatics sequences: RNA basecalling and protein interactions | |
dc.type | Text | |
dcterms.rights.dpla | This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | Colorado State University | |
thesis.degree.level | Doctoral | |
thesis.degree.name | Doctor of Philosophy (Ph.D.) |
Files
Original bundle
- Name: Neumann_colostate_0053A_18230.pdf
- Size: 1.69 MB
- Format: Adobe Portable Document Format