Machine learning for computer aided programming: from stochastic program repair to verifiable program equivalence
dc.contributor.author | Kommrusch, Steve, author | |
dc.contributor.author | Pouchet, Louis-Noël, advisor | |
dc.contributor.author | Anderson, Charles, advisor | |
dc.contributor.author | Beveridge, Ross, committee member | |
dc.contributor.author | Azimi-Sadjadi, Mahmood, committee member | |
dc.date.accessioned | 2022-05-30T10:22:33Z | |
dc.date.available | 2022-05-30T10:22:33Z | |
dc.date.issued | 2022 | |
dc.description.abstract | Computer programming has benefited from a virtuous cycle of innovation as improvements in computer hardware and software make higher levels of program abstraction and complexity possible. Recent advances in the field of machine learning, including neural network models for translating and answering questions about human language, can also be applied to computer programming itself. This thesis aims to make progress on the problem of using machine learning to improve the quality and robustness of computer programs by contributing new techniques for representation of programming problems, applying neural network models to code, and training procedures to create systems useful for computer aided programming. We first present background and preliminary studies of machine learning concepts. We then present a system that directly produces source code for automatic program repair which advances the state of the art by using a learned copy mechanism during generation. We extend a similar system to tune its learning for security vulnerability repair. We then develop a system for program equivalence which generates deterministically checkable output for equivalent programs. For this work we detail our contribution to the popular OpenNMT-py GitHub project used broadly for neural machine translation. Finally, we show how the deterministically checkable output can provide self-supervised sample selection which improves the performance and generalizability of the system. We develop breadth metrics to demonstrate that the range of problems addressed is representative of the problem space, while demonstrating that our deep neural networks generate proposed solutions which can be verified in linear time. Ultimately, our work provides promising results in multiple areas of computer aided programming which allow human developers to produce quality software more effectively. | |
dc.format.medium | born digital | |
dc.format.medium | doctoral dissertations | |
dc.identifier | Kommrusch_colostate_0053A_17065.pdf | |
dc.identifier.uri | https://hdl.handle.net/10217/235289 | |
dc.language | English | |
dc.language.iso | eng | |
dc.publisher | Colorado State University. Libraries | |
dc.relation.ispartof | 2020- | |
dc.rights | Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright. | |
dc.subject | machine learning | |
dc.subject | program repair | |
dc.subject | program equivalence | |
dc.subject | computer aided programming | |
dc.title | Machine learning for computer aided programming: from stochastic program repair to verifiable program equivalence | |
dc.type | Text | |
dcterms.rights.dpla | This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | Colorado State University | |
thesis.degree.level | Doctoral | |
thesis.degree.name | Doctor of Philosophy (Ph.D.) |
Files
Original bundle
- Name: Kommrusch_colostate_0053A_17065.pdf
- Size: 4.76 MB
- Format: Adobe Portable Document Format