Browsing by Author "Bieman, James M., advisor"
Now showing 1 - 7 of 7
Item Open Access
A methodology for automated lookup table optimization of scientific applications (Colorado State University. Libraries, 2012)
Wilcox, Chris, author; Strout, Michelle Mills, advisor; Bieman, James M., advisor; Bohm, Anton P. W., committee member; Turk, Daniel, committee member
Tuning the performance of scientific codes is challenging because of their math-intensive nature. Applications such as climate modeling and molecular biology simulate the behavior of natural systems based on scientific equations. Translating these equations into code often yields expressions that are expensive to evaluate. Trigonometric, logarithmic, and exponential elementary functions are especially problematic because of their high cost relative to ordinary arithmetic. Lookup table (LUT) transformation can expedite function evaluation by precomputing and storing function results, allowing inexpensive memory lookups to replace costly function calls. Practical concerns limit the size and accuracy of LUT data, so the technique represents a tradeoff between error and performance. Current practice has the programmer apply each LUT transform manually, which reduces productivity, obfuscates code, and limits programmer control over accuracy and performance. The goal of our research is to help scientific programmers use LUT techniques more effectively and productively. Our approach substantially automates the process of applying LUT transformation through a methodology and its implementation in the Mesa tool. Mesa incorporates algorithms that make adding a LUT transform easier for the programmer, including expression enumeration, domain profiling, error analysis, performance modeling, and code generation. In addition, we introduce a novel algorithm for LUT optimization that analyzes application code and helps the programmer identify the most beneficial set of expressions to transform. We demonstrate the effectiveness of our algorithms by using Mesa to perform case studies of scientific applications. Mesa achieves performance speedups of 1.4X to 6.9X without significantly compromising application accuracy. Finally, we present evidence that our methodology allows the scientific programmer to be more productive and effective.
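The core transformation this abstract describes can be illustrated in a few lines. The sketch below is a minimal, hypothetical lookup table for sin(x), not Mesa's generated code; the domain, table size, and nearest-sample indexing are illustrative assumptions.

```python
import math

# A minimal lookup-table (LUT) transformation: precompute an expensive
# elementary function over a bounded input domain, then replace calls with an
# indexed memory lookup. Domain and table size here are illustrative choices.
DOMAIN_MIN, DOMAIN_MAX = 0.0, 2.0 * math.pi
TABLE_SIZE = 4096
STEP = (DOMAIN_MAX - DOMAIN_MIN) / (TABLE_SIZE - 1)

SIN_TABLE = [math.sin(DOMAIN_MIN + i * STEP) for i in range(TABLE_SIZE)]  # one-time cost

def sin_lut(x: float) -> float:
    """Approximate sin(x) for x in [DOMAIN_MIN, DOMAIN_MAX] via table lookup."""
    index = int((x - DOMAIN_MIN) / STEP)   # nearest precomputed sample at or below x
    return SIN_TABLE[index]

# The error/performance tradeoff the abstract describes: error shrinks as
# TABLE_SIZE grows, while memory use (and cache pressure) increases.
x = 1.234
print(math.sin(x), sin_lut(x), abs(math.sin(x) - sin_lut(x)))
```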
Item Open Access
Application of social networking algorithms in program analysis: understanding execution frequencies (Colorado State University. Libraries, 2011)
Rahman, Minhazur, author; Bieman, James M., advisor; France, Robert B., committee member; Turk, Daniel E., committee member
Some parts of a program are used frequently at runtime, while other parts are used rarely or not at all. In this exploratory study, we propose an approach to predict how frequently or rarely different parts of a program will be used at runtime without actually running the program. Knowledge of the most frequently executed parts can help identify the most critical and the most testable parts of a program. The portions predicted to be less commonly executed tend to be the hard-to-test parts of a program, and knowing the hard-to-test parts can aid the early development of test cases. In our approach we statically analyze code or static models of code (such as UML class diagrams) using quantified social networking measures and web structure mining measures. These measures assign ranks to different portions of code for use in predicting the relative frequency with which a section of code will be used. We validated these predicted rank orderings by running the program with a common set of use cases and identifying the actual rank ordering. We compared the predictions with other measures based on direct coupling or lines of code, and found that our predictions fared better: they were statistically more strongly correlated with the actual rank ordering than the other measures. We present a prototype tool, written as an Eclipse plug-in, that implements and validates our approach. Given the source code of a Java program, our tool computes the values of the metrics required by our approach and presents ranks of all classes in order of how frequently they are expected to be used. Our tool can also instrument the source code to log the information needed at runtime to validate our predictions.
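To make the idea concrete, the sketch below ranks classes in a small, hypothetical dependency graph using PageRank, one well-known web structure mining measure. The abstract does not name the specific measures used, so both the measure and the example graph are assumptions for illustration.

```python
# Treat the static class-dependency graph as a network and rank classes with a
# web-structure measure; PageRank is used here only as a plausible example.
def pagerank(graph, damping=0.85, iterations=50):
    """graph maps each class to the classes it depends on (outgoing edges)."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src, targets in graph.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:  # dangling node: spread its rank evenly
                for n in nodes:
                    new_rank[n] += damping * rank[src] / len(nodes)
        rank = new_rank
    return rank

# Hypothetical dependency graph, e.g. extracted from a UML class diagram.
deps = {"Main": ["Parser", "Engine"], "Parser": ["Engine"],
        "Engine": ["Logger"], "Logger": []}
for cls, score in sorted(pagerank(deps).items(), key=lambda kv: -kv[1]):
    print(f"{cls}: {score:.3f}")
```

Classes with higher scores would be predicted to be executed more often, and so would be prioritized for testing.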
Item Open Access
Assessment and improvement of automated program repair mechanisms and components (Colorado State University. Libraries, 2015)
Assiri, Fatmah Yousef, author; Bieman, James M., advisor; Ghosh, Sudipto, committee member; France, Robert B., committee member; Callahan, Gerald, committee member
Automated program repair (APR) refers to techniques that locate and fix software faults automatically. An APR technique locates potentially faulty locations and then searches the space of possible changes to select a program modification operator (PMO). The selected PMO is applied to a potentially faulty location, thereby creating a new version of the faulty program, called a variant. The variant is validated by executing it against a set of test cases, called repair tests, which is used to identify a repair. When all of the repair tests are successful, the variant is considered a potential repair. Potential repairs that have also passed a set of regression tests beyond those included in the repair tests are deemed validated repairs. Different mechanisms and components can be applied to repair faults, and these mechanisms and components have a major impact on APR effectiveness, repair quality, and performance. APR effectiveness is the ability to find potential repairs. Repair quality is defined in terms of repair correctness and maintainability, where repair correctness indicates how well a potentially repaired program retains required functionality, and repair maintainability indicates how easy it is to understand and maintain the generated potential repair. APR performance is the time and number of steps required to find a potential repair. Existing APR techniques can successfully fix faults, but the changes inserted to fix faults can have negative consequences for the quality of potential repairs. When a potential repair is executed against tests that were not included in the repair tests, the "repair" can fail. Such failures indicate that the generated repair is not a validated repair, either because it introduces other faults or because it does not actually fix the real fault. In addition, some existing techniques add extraneous changes to the code that obfuscate the program logic and thus reduce its maintainability. APR effectiveness and performance can be dramatically degraded when an APR technique applies many PMOs, uses a large number of repair tests, locates many statements as potentially faulty locations, or applies a random search algorithm. This dissertation develops improved APR techniques and a tool set to help optimize APR effectiveness, the quality of generated potential repairs, and APR performance, based on a comprehensive evaluation of APR mechanisms and components. The evaluation involves the following: (1) the PMOs used to produce repairs, (2) the properties of the repair tests used in APR, (3) the fault localization techniques employed to identify potentially faulty statements, and (4) the search algorithms involved in the repair process. We also propose a set of guided search algorithms that steer the APR technique toward PMOs that fix faults, thereby improving APR effectiveness, repair quality, and performance. We performed a set of evaluations to investigate potential improvements in APR effectiveness, repair quality, and performance. APR effectiveness of different program modification operators is measured by the percentage of fixed faults and the success rate. The success rate is the percentage of trials that result in potential repairs; one trial is equivalent to one execution of the search algorithm. APR effectiveness of different fault localization techniques is measured by the ability of a technique to identify actual faulty statements, and APR effectiveness of various repair test suites and search algorithms is also measured by the success rate. Repair correctness is measured by the percentage of failed potential repairs over 100 trials on a faulty program, and by the average percentage of failed regression tests for the N potential repairs generated for a faulty program, where N is the number of potential repairs generated in 100 trials. Repair maintainability is measured by the average size of a potential repair and the distribution of modifications throughout a potentially repaired program. APR performance is measured by the average number of generated variants and the average total time required to find potential repairs. We built an evaluation framework by creating a configurable mutation-based APR tool (MUT-APR), which allows us to vary the APR mechanisms and components. Our key findings are the following: (1) simple PMOs successfully fix faulty expression operators and improve the quality of potential repairs compared to other APR techniques that use existing code to repair faults; (2) branch coverage repair test suites significantly improve APR effectiveness and repair quality compared to repair test suites that satisfy statement coverage or random testing, although they lower APR performance; (3) small branch coverage repair test suites significantly improve APR effectiveness, repair quality, and performance compared to large branch coverage repair test suites; (4) the Ochiai fault localization technique always identifies seeded faulty statements with acceptable performance; and (5) the guided random search algorithm improves APR effectiveness, repair quality, and performance compared to all other search algorithms, although the exhaustive search algorithm is guaranteed to find a potential repair that fails fewer regression tests, at the cost of significant performance degradation as program size increases. These improvements are incorporated into the MUT-APR tool for use in program repairs.
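The generate-and-validate loop described here can be sketched at toy scale. The snippet below is a hypothetical, simplified model of the process (statements as strings, one operator-flipping PMO, repair tests as callables); it is not MUT-APR, which repairs real programs.

```python
import random

# A simplified sketch of the generate-and-validate APR loop: pick a suspect
# statement, apply a program modification operator (PMO), and keep the variant
# if it passes every repair test. All names and structures are illustrative.
PMO_SWAPS = [("<=", "<"), (">=", ">"), ("==", "!="), ("!=", "=="), ("<", "<="), (">", ">=")]

def apply_pmo(statement: str) -> str:
    """PMO for expression operators: flip the first relational operator found."""
    for op, replacement in PMO_SWAPS:
        if op in statement:
            return statement.replace(op, replacement, 1)
    return statement

def repair(program, suspicious_lines, repair_tests, trials=100):
    """Random search: mutate one suspect statement per trial, return a variant
    that passes all repair tests (a potential repair), or None."""
    for _ in range(trials):
        variant = list(program)
        location = random.choice(suspicious_lines)   # from a fault localization step
        variant[location] = apply_pmo(variant[location])
        if all(test(variant) for test in repair_tests):
            return variant
    return None
```

A guided search of the kind the dissertation proposes would bias the choice of location and operator rather than sampling uniformly at random, which is what distinguishes it from the plain random loop above.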
Item Open Access
Decay and grime buildup in evolving object oriented design patterns (Colorado State University. Libraries, 2009)
Izurieta, Clemente, author; Bieman, James M., advisor
Software designs decay as systems, uses, and operational environments evolve. As software ages, the original realizations of design patterns may remain in place, while participants in design pattern realizations accumulate grime: non-pattern-related code. This research examines the extent to which software designs actually decay, rot, and accumulate grime by studying the aging of design patterns in successful object oriented systems. By focusing on design patterns, we can identify code constructs that conflict with well-formed pattern structures. Design pattern rot is the deterioration of the structural integrity of a design pattern realization. Grime buildup in design patterns is a form of decay that does not break the structural integrity of a pattern but can reduce system testability and adaptability. Grime is measured using various types of indices developed and adapted for this research. Grime indices track the internal structural changes in a design pattern realization and in the code that surrounds the realization. In general, we find that the original pattern functionality remains and that pattern decay is primarily due to grime rather than rot. We characterize the nature of grime buildup in design patterns, provide quantifiable evidence of such grime buildup, and find that grime can be classified at the organizational, modular, and class levels. Organizational-level grime refers to namespace and physical file constitution and structure; metrics at this level help us understand whether rot and grime buildup play a role in fomenting disorganization of design patterns. Measures of modular-level grime help us understand how the coupling of classes belonging to a design pattern develops: as dependencies between design pattern components increase without regard for pattern intent, the modularity of a pattern deteriorates. Class-level grime focuses on understanding how classes that participate in design patterns are modified as systems evolve. For each level we use different measurements and surrogate indicators to analyze the consequences that grime buildup has on the testability and adaptability of design patterns. Test cases put in place during the design phase and initial implementation of a project can become ineffective as the system matures, and the evolution of a design due to added functionality or defect fixing increases the coupling and dependencies between classes that must be tested. We show that as systems age, the growth of grime and the appearance of anti-patterns (a form of decay) increase testing requirements. Additionally, evidence suggests that, as pattern realizations evolve, the levels of efferent and afferent coupling of the classifiers that participate in patterns increase. Increases in coupling measurements indicate dependencies to and from other software artifacts, thus reducing the adaptability and comprehensibility of the pattern. In general we find that grime buildup is most serious at the modular level, with little evidence of class or organizational grime, and that modular grime appears to have a greater impact on testability than on adaptability of design patterns. Identifying grime helps developers direct refactoring efforts early in the evolution of software, keeping costs in check by minimizing the effects of software aging. The long-term goals of this research are to curtail the effects of decay by providing the understanding and means necessary to diminish grime buildup.
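One family of surrogate measures mentioned above, efferent and afferent coupling of pattern participants, can be computed directly from a static dependency graph. The sketch below is illustrative only: the class names and graph are hypothetical, and the dissertation's grime indices are richer than this single measure.

```python
from collections import defaultdict

def coupling(deps, participants):
    """Efferent coupling (Ce, outgoing dependencies) and afferent coupling
    (Ca, incoming dependencies) for classes participating in a pattern.
    deps maps each class to the set of classes it depends on."""
    ce = {c: len(deps.get(c, set())) for c in participants}
    ca = defaultdict(int)
    for src, targets in deps.items():
        for dst in targets:
            if dst in participants:
                ca[dst] += 1
    return {c: {"Ce": ce[c], "Ca": ca[c]} for c in participants}

# Hypothetical dependency graph; tracking these values across releases is one
# way to observe modular grime: coupling of pattern participants creeping
# upward without regard for pattern intent.
deps = {"Observer": {"Subject"}, "Subject": {"Logger"},
        "Logger": set(), "Client": {"Subject", "Observer"}}
print(coupling(deps, {"Observer", "Subject"}))
```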
Item Open Access
Essential competencies of exceptional professional software engineers (Colorado State University. Libraries, 1991)
Turley, Richard T., author; Johnson, Gerry, advisor; Bieman, James M., advisor; Olender, Kurt, committee member; Neidt, Charles O., committee member; Walicki, Jack, committee member
This dissertation presents a differential study of exceptional and non-exceptional professional software engineers in the work environment. The first phase of the study reports an in-depth review of 20 engineers, covering biographical data, Myers-Briggs Type Indicator test results, and Critical Incident Interview data for 10 exceptional and 10 non-exceptional subjects. Phase 1 concludes with a description of 38 essential competencies of software engineers. Phase 2 of the study surveys 129 engineers. It reports biographical data for the sample and concludes that the only simple demographic predictor of performance is years of experience in software; this variable correctly classifies 63% of the cases studied. Phase 2 also has the participants complete a Q-Sort of the 38 competencies identified in Phase 1. Nine of these competencies are differentially related to engineer performance. A 10-variable canonical discriminant function is derived that correctly classifies 81% of the cases studied; this function consists of three biographical variables and seven competencies. The competencies related to Personal Attributes and Interpersonal Skills are identified as the most significant factors contributing to performance differences.
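For readers unfamiliar with discriminant analysis, the snippet below shows the general kind of classification described above, fitting a linear discriminant function over ten predictors. The data are synthetic stand-ins, not the study's biographical and competency measures, and the accuracy printed has no relation to the 81% reported.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic example only: 129 "engineers", 10 predictor variables, and a binary
# label standing in for exceptional vs. non-exceptional performance.
rng = np.random.default_rng(0)
n, n_features = 129, 10
X = rng.normal(size=(n, n_features))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.8, size=n) > 0).astype(int)

# Fit a linear discriminant function and report in-sample classification accuracy.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print("classification accuracy:", lda.score(X, y))
```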
Item Open Access
Improving software maintainability through aspectualization (Colorado State University. Libraries, 2009)
Mortensen, Michael, author; Ghosh, Sudipto, advisor; Bieman, James M., advisor
The primary claimed benefits of aspect-oriented programming (AOP) are that it improves the understandability and maintainability of software applications by modularizing cross-cutting concerns. Before there is widespread adoption of AOP, developers need further evidence of the actual benefits as well as the costs. Applying AOP techniques to refactor legacy applications is one way to evaluate those costs and benefits. Aspect-based refactoring, called aspectualization, involves moving program code that implements cross-cutting concerns into aspects, and such refactoring can potentially improve the maintainability of legacy systems. Long compilation and weave times, and the lack of an appropriate testing methodology, are two challenges to the aspectualization of large legacy systems. We propose an iterative, test-driven approach for creating and introducing aspects. The approach uses mock systems that enable aspect developers to quickly experiment with different pointcuts and advice while reducing compile and weave times. It also uses weave analysis, regression testing, and code coverage analysis to test the aspects. We developed several tools for unit and integration testing. We demonstrate the test-driven approach in the context of large industrial C++ systems, and we provide guidelines for mock system creation. This research examines the effects on maintainability of replacing cross-cutting concerns with aspects in three industrial applications. We study several revisions of each application, identifying cross-cutting concerns in the initial revision as well as cross-cutting concerns added in later revisions. Aspectualization improved maintainability by reducing code size and improving both change locality and concern diffusion. Costs include the effort required for application refactoring and aspect creation, as well as a small decrease in performance.
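As a small illustration of what moving a cross-cutting concern out of business logic looks like, the sketch below uses a Python decorator as a stand-in for pointcut-and-advice. It is only an analogy: the systems studied in the dissertation were C++ applications refactored with a real aspect weaver, and the class and method names here are invented.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)

def logged(func):
    """Advice analogue: log entry and exit of the wrapped function, so the
    logging concern is written once instead of being scattered through methods."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logging.info("entering %s", func.__name__)
        result = func(*args, **kwargs)
        logging.info("exiting %s", func.__name__)
        return result
    return wrapper

class Account:
    def __init__(self, balance: float) -> None:
        self.balance = balance

    @logged                      # the concern is applied here, not hand-inlined
    def withdraw(self, amount: float) -> float:
        self.balance -= amount
        return self.balance

Account(100.0).withdraw(25.0)
```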
Item Open Access
Testing scientific software: techniques for automatic detection of metamorphic relations (Colorado State University. Libraries, 2015)
Kanewala, Upulee G., author; Bieman, James M., advisor; Ghosh, Sudipto, committee member; Anderson, Chuck, committee member; Breidt, Jay, committee member
Scientific software plays an important role in critical decision making in fields such as the nuclear industry, medicine, and the military. Systematic testing of such software can help to ensure that it works as expected. Comprehensive, automated software testing requires an oracle to check whether the output produced by a test case matches the expected behavior of the program, but the challenges in creating suitable oracles limit the ability to perform automated testing of scientific software. For some programs, creating an oracle may not be possible since the correct output is not known a priori. Further, it may be impractical to implement an oracle for an arbitrary input because of the complexity of the program. The software testing community refers to such programs as non-testable. Many scientific programs fall into this category, since they are either written to find answers that are previously unknown or they perform complex calculations. In this work, we developed techniques to automatically predict metamorphic relations by analyzing the program structure; these metamorphic relations can serve as automated partial test oracles in scientific software. Metamorphic testing is a method for automating the testing of programs that lack test oracles. The technique operates by checking whether a program behaves according to a certain set of properties called metamorphic relations. A metamorphic relation is a relationship between multiple input and output pairs of the program: it specifies how the output should change following a specific change made to the input, and a change in the output that differs from what the metamorphic relation specifies indicates a fault in the program. Metamorphic testing can be effective in testing machine learning applications, bioinformatics programs, health-care simulations, partial differential equation solvers, and other programs. Unfortunately, finding appropriate metamorphic relations for use in metamorphic testing remains a labor-intensive task that is generally performed by a domain expert or a programmer. In this work we applied novel machine-learning-based approaches to automatically derive metamorphic relations. We first evaluated the effectiveness of modeling metamorphic relation prediction as a binary classification problem. We found that support vector machines are the most effective binary classifiers for predicting metamorphic relations, and that using walk-based graph kernels for feature extraction from graph-based program representations further improves prediction accuracy. In addition, incorporating mathematical properties of operations into the graph kernel computation improves prediction accuracy. Further, we found that control flow information of a function is more effective than data dependency information for predicting metamorphic relations. Finally, we investigated the possibility of creating multi-label classifiers that can predict multiple metamorphic relations using a single classifier. Our empirical studies show that multi-label classifiers are not as effective as binary classifiers for predicting metamorphic relations. Automated testing makes the testing process faster, reduces testing cost, and makes testing more reliable, but it requires automated test oracles. Automatically discovering metamorphic relations is an important step toward automating oracle creation. The work presented here is the first attempt at developing automated techniques for deriving metamorphic relations, and it contributes toward automating the testing of programs that face the oracle problem.
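To illustrate how a metamorphic relation acts as a partial oracle, the sketch below checks a simple, hypothetical mean() function against two relations: permutation invariance and additive shift. Neither check needs the "true" expected output, which is the point of metamorphic testing; the function and relations are illustrative choices, not ones derived by the dissertation's classifiers.

```python
import math
import random

def mean(values):
    """The (hypothetical) program under test."""
    return sum(values) / len(values)

def check_metamorphic_relations(values, k=3.0):
    original = mean(values)

    # Relation 1: permuting the input must not change the output.
    shuffled = values[:]
    random.shuffle(shuffled)
    assert math.isclose(mean(shuffled), original), "permutation relation violated"

    # Relation 2: adding a constant k to every element must shift the output by k.
    shifted = [v + k for v in values]
    assert math.isclose(mean(shifted), original + k), "additive-shift relation violated"

check_metamorphic_relations([2.0, 5.0, 7.5, -1.0])
print("metamorphic relations hold on this input")
```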