
Department
Computer Science
Degree Name
Master of Science (MS)
Abstract
Breast cancer remains one of the most common cancers among women globally, and early detection is crucial for improving survival rates. While mammography is a widely used and effective imaging technique, it can yield false positives or false negatives. Mammogram interpretation is highly operator-dependent, introducing variability and the potential for diagnostic errors. Additionally, mammographic images have limitations, such as low contrast in breast tissue and overlapping structures that can obscure lesions or mimic abnormalities. These limitations can lead to unnecessary biopsies or delayed diagnosis, and they highlight the need for advanced, data-driven diagnostic tools to support and enhance current screening practices.
In recent years, machine learning has emerged as a transformative force in the healthcare industry, offering innovative solutions to complex medical challenges. One of the most promising applications of this technology is the early and accurate detection of breast cancer, particularly through mammography image analysis. Machine learning algorithms, especially deep learning models, have shown significant potential in identifying patterns in mammograms that may indicate abnormal cases. This study uses deep learning, particularly Convolutional Neural Networks (CNNs), to enhance the classification of breast ultrasound images.
This research proposes a novel two-level classification approach. At the first level, the system distinguishes normal from abnormal cases. At the second level, it further classifies abnormal cases as benign or malignant tumors. To address challenges such as limited and imbalanced datasets, the study incorporates preprocessing techniques including contrast enhancement via Contrast Limited Adaptive Histogram Equalization (CLAHE) and region of interest (ROI) extraction. The preprocessing steps were evaluated separately and in combination to assess their individual contributions. Additionally, data augmentation is used to improve model generalization. To further enhance performance and robustness, multiple CNN architectures are combined using an ensemble mechanism. Soft voting and hard voting leverage the strengths of diverse CNNs, reducing individual model biases and improving overall accuracy.
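The difference between the two voting schemes can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the three example models, and their probability values are assumptions made purely for illustration. Soft voting averages the class probabilities produced by each model, while hard voting counts each model's top-class vote.

```python
from collections import Counter


def soft_vote(prob_lists):
    """Average per-class probabilities across models; return the top class index."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    mean_probs = [
        sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: mean_probs[c])


def hard_vote(prob_lists):
    """Let each model vote for its top class; return the most common vote."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in prob_lists]
    # Ties (if any) go to the class that was voted for first.
    return Counter(votes).most_common(1)[0][0]


# Hypothetical outputs of three models for one image: [P(benign), P(malignant)].
probs = [
    [0.55, 0.45],  # model A leans benign, weakly
    [0.60, 0.40],  # model B leans benign, weakly
    [0.05, 0.95],  # model C is confident the case is malignant
]

soft_label = soft_vote(probs)  # mean probabilities are about [0.40, 0.60]
hard_label = hard_vote(probs)  # votes are [0, 0, 1]
```

In this made-up case the two schemes disagree: hard voting returns class 0 (benign) because two of three models vote for it, whereas soft voting returns class 1 (malignant) because the single confident model dominates the averaged probabilities.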
The main contributions of this research are: (1) the design of a two-level classification system, (2) the use of multiple preprocessing techniques to improve image quality, (3) the application of voting ensemble learning to improve predictive stability and accuracy, and (4) an extensive evaluation on the BUSI dataset. The two-level design breaks the complex diagnostic task into smaller, manageable steps, allowing more accurate comparison, analysis, and improvement at each stage.
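The two-level decision flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 0.5 threshold, and the toy score dictionaries standing in for trained models are all hypothetical and are not taken from the thesis.

```python
from typing import Any, Callable


def two_level_classify(
    image: Any,
    level1_abnormal_prob: Callable[[Any], float],
    level2_malignant_prob: Callable[[Any], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> str:
    """Return 'normal', 'benign', or 'malignant' using two chained classifiers."""
    # Level 1: screen out normal cases first.
    if level1_abnormal_prob(image) < threshold:
        return "normal"
    # Level 2: only abnormal cases reach the benign-vs-malignant model.
    if level2_malignant_prob(image) >= threshold:
        return "malignant"
    return "benign"


# Toy stand-ins for trained models: each "image" is a dict of made-up scores.
def level1(img):
    return img["abnormal_score"]


def level2(img):
    return img["malignant_score"]


cases = [
    {"abnormal_score": 0.10, "malignant_score": 0.90},  # level 2 is never consulted
    {"abnormal_score": 0.80, "malignant_score": 0.20},
    {"abnormal_score": 0.95, "malignant_score": 0.70},
]
labels = [two_level_classify(c, level1, level2) for c in cases]
```

Because each level is a separate binary decision, each can be evaluated and improved independently, which is the property the two-level design relies on.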
The first preprocessing step was the application of CLAHE. In the first-level classification (normal vs. abnormal), the models achieved accuracies of 82.05% for AlexNet, 84.62% for DenseNet, 85.26% for Inception, 82.69% for MobileNet, 82.05% for ResNet, and 82.69% for VGG16. Soft voting achieved 96.20% and hard voting 92.09%. In the second-level classification (benign vs. malignant), the models achieved accuracies of 75.00% for AlexNet, 80.47% for DenseNet, 35.94% for Inception, 41.41% for MobileNet, 35.94% for ResNet, and 82.81% for VGG16. Soft voting achieved 90.84% and hard voting 85.76%.
The second preprocessing step was ROI extraction, which isolates the regions containing potential abnormalities, such as tumors, while filtering out irrelevant areas. In the first-level classification (normal vs. abnormal), the models achieved accuracies of 82.69% for AlexNet, 82.69% for DenseNet, 83.97% for Inception, 87.18% for MobileNet, 82.69% for ResNet, and 82.05% for VGG16. Soft voting achieved 100.00% and hard voting 98.08%. In the second-level classification (benign vs. malignant), the models achieved accuracies of 75.00% for AlexNet, 80.47% for DenseNet, 35.94% for Inception, 41.41% for MobileNet, 35.94% for ResNet, and 82.81% for VGG16. Soft voting achieved 89.15% and hard voting 80.95%.
The final preprocessing configuration combined CLAHE and ROI. The combination led to an improvement in classification performance. In the first-level classification (normal vs. abnormal), the models achieved accuracies of 91.77% for AlexNet, 92.72% for DenseNet, 95.57% for Inception, 96.52% for MobileNet, 76.90% for ResNet, and 91.14% for VGG16. Soft voting achieved 90.88% and hard voting 93.99%. In the second-level classification (benign vs. malignant), the models achieved accuracies of 87.69% for AlexNet, 99.23% for DenseNet, 96.92% for Inception, 93.08% for MobileNet, 94.62% for ResNet, and 97.69% for VGG16. Soft voting achieved 96.92% and hard voting 97.69%.
These results indicate that every preprocessing step contributed to an improvement in classification accuracy across all models. This study demonstrates that a two-level, ensemble-based deep learning system can significantly improve the diagnostic accuracy of breast ultrasound images, offering a promising tool to assist radiologists and support early detection efforts, particularly in resource-constrained healthcare environments.
Keywords
carcinogenesis, machine learning, supervised learning, deep learning, classification
Advisor
Dr. Anas Hourani
Date of Award
Spring 2025
Document Type
Thesis
Recommended Citation
Alorico, Noreliz, "Deep Learning-Based Ensemble Two-Step Classification of Medical Images Using CNN Architectures and Ensemble Methods" (2025). Master's Theses. 3258.
Available at:
https://scholars.fhsu.edu/theses/3258
Rights
© The Author