Classification of Recurrence of Breast Cancer Cells Using Machine Learning
Keywords:
Neural Network, Recurrence cells, Breast Cancer, Random Forest
Abstract
Breast cancer is the most common cancer among women in both developed and developing
countries. Statistics show that breast cancer causes many deaths every year. Its symptoms
include a lump in the breast and thickening of breast tissue. Many techniques, including
supervised and unsupervised learning methods, are used in medical science for the prediction
of breast cancer. Supervised learning methods are the more popular and are also used to
identify the type of cancer cells, to predict the recurrence rate of cancer cells, and to predict
the survival rate of women diagnosed with breast cancer. This study presents a comparison of
machine learning (ML) classifiers, namely Random Forest (RF), Support Vector Machine (SVM),
Naïve Bayes (NB) and Artificial Neural Network (ANN), combined with popular feature selection
techniques: Information Gain (IG), Gain Ratio (GR), Relief-F and Gini index. The experimental
results show that ANN outperforms all other classifiers, with 99.6% accuracy.
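
Three of the feature selection scores named above (Information Gain, Gain Ratio and Gini index) can be illustrated with a minimal sketch using their standard textbook definitions. This is not the study's implementation: the function names, the toy labels and the two toy features are all hypothetical and made up for illustration, and Relief-F is not covered.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gini(values):
    """Gini impurity of a list of discrete values."""
    total = len(values)
    return 1.0 - sum((n / total) ** 2 for n in Counter(values).values())


def split_by_value(feature, labels):
    """Group the class labels by the value the feature takes."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(feature, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in split_by_value(feature, labels))
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


def gini_decrease(feature, labels):
    """Label Gini impurity minus the weighted impurity after splitting on the feature."""
    total = len(labels)
    remainder = sum(len(g) / total * gini(g) for g in split_by_value(feature, labels))
    return gini(labels) - remainder


# Hypothetical toy data (not from the study): four cases, two classes.
labels = ["recurrence", "recurrence", "no-recurrence", "no-recurrence"]
informative = ["yes", "yes", "no", "no"]    # separates the two classes perfectly
uninformative = ["yes", "no", "yes", "no"]  # carries no information about the class

for name, feature in [("informative", informative), ("uninformative", uninformative)]:
    print(name,
          information_gain(feature, labels),
          gain_ratio(feature, labels),
          gini_decrease(feature, labels))
```

A filter-style selection step would rank the candidate features by one of these scores and pass only the top-ranked ones to a classifier such as RF, SVM, NB or ANN.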