An implementation research on software defect prediction using machine learning techniques

Title: An implementation research on software defect prediction using machine learning techniques
Author: Pulliainen, Laur
Contributor: University of Helsinki, Faculty of Science, Department of Computer Science
Publisher: Helsingin yliopisto
Date: 2018
Language: eng
URI: http://urn.fi/URN:NBN:fi-fe201804208666
http://hdl.handle.net/10138/273586
Thesis level: master's thesis
Discipline: Computer science
Abstract: Software defect prediction improves the software testing process by identifying defects in the software. It is accomplished with supervised machine learning, using software metrics as features and defect data as labels. While the theory behind software defect prediction has been validated in previous studies, it has not been widely implemented in practice. In this thesis, a software defect prediction framework is implemented at RELEX Solutions to improve testing-process resource allocation and to optimize software release timing. For this purpose, code and change metrics are collected from RELEX software. The metrics are selected based on how frequently they are used in other software defect prediction studies and on their availability in metric collection tools. In addition to metric data, defect data is collected from the issue tracker. A framework for classifying the collected data is then implemented and experimented on. The framework leverages existing machine learning libraries for classification, using classifiers that have been found to perform well in similar software defect prediction experiments. The classification results are validated with commonly used classifier performance metrics, and the suitability of the predictions is additionally verified from a use-case point of view. Software defect prediction is found to work in practice: the implementation achieves results comparable to those of other similar studies when measured by classifier performance metrics. When validated against the defined use cases, the performance is found acceptable, although it varies between data sets. It is thus concluded that while the results are tentatively positive, further monitoring with future software versions is needed to verify the performance and reliability of the framework.
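The general workflow the abstract describes (software metrics as features, defect labels mined from an issue tracker, an off-the-shelf classifier, evaluation with standard performance metrics) can be sketched as follows. This is a minimal illustration using scikit-learn with synthetic stand-in data; it is not the thesis's actual framework, and the chosen classifier, features, and thresholds are assumptions for the sketch only.

```python
# Hypothetical sketch of the defect-prediction workflow from the abstract:
# per-module metrics as features, binary defect labels, a random forest
# classifier, and commonly used classifier performance metrics.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

rng = np.random.default_rng(0)

# Stand-in for code/change metrics per module (e.g. size, complexity, churn).
X = rng.random((200, 5))
# Stand-in for defect labels mined from an issue tracker (1 = defective).
y = (X[:, 0] + 0.5 * X[:, 1] + 0.2 * rng.standard_normal(200) > 0.9).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

# Validate the classification with commonly used performance metrics.
print(f"precision={precision_score(y_test, pred):.2f}",
      f"recall={recall_score(y_test, pred):.2f}",
      f"F1={f1_score(y_test, pred):.2f}",
      f"AUC={roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]):.2f}")
```

In a real setting the features would come from metric collection tools run over the code base and version history, and the labels from linking issue-tracker defects back to the modules they touched, as the thesis describes.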


Files in this item

implementation-research-software.pdf (1.490 MB, PDF)
