Abstract:
In nearly 95% of countries worldwide, breast cancer is the leading cause of death among women. The impact of the disease on the human body depends on the stage at
which it is diagnosed, and it can be fatal if not diagnosed in time. This thesis
analyzes both traditional and revolutionary methods used for breast cancer
detection and classification, and proposes the best model for different scenarios, based
on the availability of data, human expertise, and time limitations. Available datasets that
contain samples of breast cancer cells are also analyzed, and all sources are collected and provided. The methods analyzed are classified into three main categories: Supervised, Unsupervised, and CNN methods. From the first category, four methods are analyzed and tested with the Breast Cancer Wisconsin Diagnostic (WDBC) dataset: Random Forest, K-Nearest Neighbors, Naive Bayes, and Support Vector Machine. From the Unsupervised Learning methods, Autoencoders and Self-Organizing Maps (SOM) are analyzed and tested with the same dataset. Two CNN models, UNet and ResNet, are also built and tested with the Breast Ultrasound Images Dataset. Each method is tested several times with different parameter values, with the aim of finding the combination of parameters that yields the best results for the available datasets. Among the Supervised methods, Support Vector Machine achieved the highest accuracy, 99%. Among the Unsupervised methods, the Autoencoder outperformed the SOM with an accuracy of 98%, and among the CNN methods, UNet performed better, with an accuracy of 97.44%.