Abstract
Today’s software complexity makes developing defect-free software almost impossible. Consequently, developing classifiers that sort software modules into defective and non-defective before a software release has attracted great interest in academia and the software industry alike. Although many classifiers have been proposed, none has been proven superior to the others. The major reason is that while one study shows that classifier A is better than classifier B, another study shows the opposite. These conflicts usually arise when researchers report results using their preferred performance evaluation measures, such as recall and precision. Although this approach is valid, it does not examine all facets of a classifier’s performance. Thus, the performance evaluation might improve or deteriorate if researchers choose other performance measures. As a result, software developers usually struggle to select the most suitable classifier for their projects. The goal of this paper is to apply the fuzzy analytical hierarchy process (FAHP), a popular multicriteria decision-making technique, to reliably evaluate classifiers’ performance. This evaluation framework incorporates a wider spectrum of performance measures rather than relying on a few preferred ones. The results show that this approach increases software developers’ confidence in research outcomes and helps them avoid false conclusions and infer reasonable boundaries for them. We exploited 22 popular performance measures and 11 software defect classifiers. The analysis was carried out using the KNIME data mining platform and 12 software defect data sets provided by the NASA Metrics Data Program (MDP) repository.
Document Type
Article
Source Publication
IEEE Access Journal
Version
Published Version
DOI
10.1109/ACCESS.2019.2915964
Publication Date
5-19-2019
Volume
7
First Page
62794
Last Page
62804
Rights
“© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.”
Recommended Citation
H. Ghunaim and J. Dichter, “Applying the FAHP to Improve the Performance Evaluation Reliability of Software Defect Classifiers,” IEEE Access, vol. 7, pp. 62794–62804, May 2019. DOI: 10.1109/ACCESS.2019.2915964
Comments
For questions contact ScholarsRepository@fhsu.edu