Automatic detection of constraints in software documentation
dc.contributor.author | Ghosh, Joy, author | |
dc.contributor.author | Moreno Cubillos, Laura, advisor | |
dc.contributor.author | Ghosh, Sudipto, committee member | |
dc.contributor.author | Vijayasarathy, Leo, committee member | |
dc.date.accessioned | 2022-01-07T11:29:15Z | |
dc.date.available | 2022-01-07T11:29:15Z | |
dc.date.issued | 2021 | |
dc.description.abstract | Software documentation is an important resource when maintaining and evolving software, as it supports developers in program understanding. To keep it up to date, developers need to verify that all the constraints affected by a change in source code are consistently described in the documentation. Detecting all the constraints in the documentation and cross-checking them against the source code is time-consuming. An approach capable of automatically identifying software constraints in documentation could facilitate this detection, which is a prerequisite for cross-checking documentation and source code. In this thesis, we explore different machine learning algorithms to build binary classification models that assign sentences extracted from software documentation to one of two categories: constraints and non-constraints. The models are trained on a data set that consists of 368 manually tagged sentences from four open-source software systems. We evaluate the performance of the different models (decision tree, Naive Bayes, support vector machine, fine-tuned BERT) based on precision, recall, and F1-score. Our best model (i.e., a decision tree featuring bigrams) achieved 74.0% precision, 83.8% recall, and an F1-score of 0.79. These results are promising and suggest that it is possible to build machine-learning-based models for the automatic detection of constraints in software documentation. | |
dc.format.medium | born digital | |
dc.format.medium | masters theses | |
dc.identifier | Ghosh_colostate_0053N_16975.pdf | |
dc.identifier.uri | https://hdl.handle.net/10217/234207 | |
dc.language | English | |
dc.language.iso | eng | |
dc.publisher | Colorado State University. Libraries | |
dc.relation.ispartof | 2020- | |
dc.rights | Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright. | |
dc.title | Automatic detection of constraints in software documentation | |
dc.type | Text | |
dcterms.rights.dpla | This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | Colorado State University | |
thesis.degree.level | Masters | |
thesis.degree.name | Master of Science (M.S.) |
Files
Original bundle
- Name: Ghosh_colostate_0053N_16975.pdf
- Size: 3.9 MB
- Format: Adobe Portable Document Format