Subclass Separation of White Blood Cell Images Using Convolutional Neural Network Models

Abstract: White blood cells (leukocytes), which are produced in the bone marrow and lymphoid tissue, are an important part of the immune system and protect the body against foreign invaders and infectious disease. These colorless cells live from a few days to several weeks. Considerable clinical experience is required for a doctor to count the white blood cells in human blood and to classify them. Accurate counting and classification allow an early and accurate diagnosis of various disease types, including conditions that affect the immune system, such as anemia and leukemia, when a patient is evaluated. White blood cells can be separated into four subclasses: eosinophil, lymphocyte, monocyte, and neutrophil. This study focuses on separating white blood cell images into these subclasses using convolutional neural network models, which are a type of deep learning model. A deep learning network, which is slow in the training step because of its complex architecture but fast in the test step, is used for feature extraction instead of more intricate methods. For the subclass separation of white blood cells, the experimental results show that the AlexNet architecture gives the highest recognition rate among the convolutional neural network architectures tested in the study. Various classifiers are applied to the features derived from the AlexNet architecture to evaluate the classification performance. The best performance in the classification of white blood cells is given by the quadratic discriminant analysis classifier, with an accuracy of 97.78 %.

Nowadays, many studies are conducted on white blood cell (WBC) classification. In this study, a Convolutional Neural Network (CNN) model, one of the deep learning architectures, was used to classify the WBC images. The LeNet, VGG-16, and AlexNet architectures were used for feature extraction. The best success was provided by the AlexNet architecture. The results obtained by using different classifiers on the AlexNet architecture were compared. The best success rate was obtained by the Quadratic Discriminant Analysis (QDA) classification method. Using various classifiers with the extracted attributes increased the success rates. The WBC microscopic images used in the study are accessible on the Internet. This study was also compared with other studies in the literature that use the same data set.
This article is organized as follows. Section II provides information on the data set and the method used. Section III presents the classification results obtained from the processing steps. The discussion and the conclusions are given in Section IV and Section V, respectively.

II. DATA SET AND METHOD
The data set used in this study consists of four categories: eosinophil, lymphocyte, monocyte, and neutrophil. The data set is publicly accessible [7]. The microscopic WBC data set consists of 3,120 eosinophil, 3,102 lymphocyte, 3,091 monocyte, and 3,122 neutrophil images, for a total of 12,435 images. Each image has a color depth of 24 bits and a resolution of 320×240 pixels. The images are stored in JPEG format.
In the feature extraction and classification stage, the RMSprop, Adam, and Stochastic Gradient Descent (SGD) [10], [11] optimization methods were used.

A. Use of Optimization Methods
The main purpose of these methods is to update the weight values at every step until the CNN architecture learns as well as possible. Each method performs the update with its own algorithm.
In the SGD method [10], [11], the weights are updated after each training example (or small batch of examples) rather than after a full pass over the training set. Because the updates are so frequent, the method can approach the goal quickly.
The RMSProp method [10], [18] maintains a learning rate per parameter and adapts it using a moving average of the magnitudes of recent gradients. This method works well in online and nonstationary settings and performs the parameter update on the rescaled gradient, optionally with momentum.
The Adam method [19] also adapts the learning rate of each parameter in each iteration. In addition to the average of the second moments of the gradients, which RMSProp uses, it uses the average of the first moments. This method is designed to retain the advantages of the RMSProp method.
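The three update rules described above can be compared on a toy problem. The following is a minimal sketch, not the training code used in this study: the one-parameter loss f(w) = (w - 3)^2, the learning rates, and the decay constants (`rho`, `beta1`, `beta2`) are illustrative assumptions chosen to match commonly used defaults.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, which is minimal at w = 3.
    return 2.0 * (w - 3.0)

def sgd_step(w, g, lr=0.1):
    # Plain gradient step with a fixed learning rate.
    return w - lr * g

def rmsprop_step(w, g, state, lr=0.01, rho=0.9, eps=1e-8):
    # A moving average of squared gradients rescales the step.
    state["v"] = rho * state["v"] + (1.0 - rho) * g * g
    return w - lr * g / (math.sqrt(state["v"]) + eps)

def adam_step(w, g, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    # Moving averages of the gradient (first moment) and of the squared
    # gradient (second moment), each corrected for its zero initialisation.
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * g
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * g * g
    m_hat = state["m"] / (1.0 - beta1 ** state["t"])
    v_hat = state["v"] / (1.0 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def minimise(method, steps=2000):
    # Start from w = 0 and apply the chosen update rule repeatedly.
    w = 0.0
    state = {"t": 0, "m": 0.0, "v": 0.0}
    for _ in range(steps):
        g = grad(w)
        if method == "sgd":
            w = sgd_step(w, g)
        elif method == "rmsprop":
            w = rmsprop_step(w, g, state)
        else:
            w = adam_step(w, g, state)
    return w

results = {name: minimise(name) for name in ("sgd", "rmsprop", "adam")}
for name, w in results.items():
    print(f"{name:8s} final w = {w:.4f}")
```

All three rules move the weight towards the minimum at w = 3; they differ only in how the raw gradient is rescaled before it is applied.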

B. Feature Extraction
The AlexNet architecture, which won the ImageNet competition in 2012, achieved the best performance during the feature extraction process. It can be trained on approximately one million images [20]. In the AlexNet architecture, images are given to the input as 227×227 pixels. The first layer consists of 11×11 filters applied with a stride of four [21]. In this study, the default parameter values of the AlexNet architecture are preserved.
The 320×240 pixel images used in this study are resized to 227×227 pixels for the AlexNet architecture. This architecture consists of convolutional layers, pooling layers, and fully connected layers. A convolutional layer slides a particular filter over the entire image. Filters can be of different sizes, such as 3×3 or 5×5 elements. Filters form the output data by applying the convolution operation to the images from the previous layer. This convolution results in an activation map, which consists of attributes specific to each filter. The pooling layer used in the AlexNet architecture reduces the image size while preserving attributes. The pooling layer thus reduces the computational cost while retaining the image information [22]. It also reduces the number of parameters while protecting the information obtained from the image [23].
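The effect of the filter size and stride on the activation map size can be checked with a short calculation. The sketch below is illustrative only: it uses plain Python lists, square inputs, and no padding, and the 3×3 pooling window with a stride of two is the commonly cited AlexNet setting rather than a value stated in this article.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    # Number of filter positions along one dimension.
    return (input_size - filter_size + 2 * padding) // stride + 1

def conv2d(image, kernel, stride=1):
    # Slide a square kernel over a square image (no padding) and return
    # the activation map as a list of rows.
    k = len(kernel)
    out = conv_output_size(len(image), k, stride)
    return [[sum(image[i * stride + a][j * stride + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out)]
            for i in range(out)]

def max_pool2d(fmap, size=2, stride=2):
    # Keep the largest value in each window, which shrinks the map.
    out = conv_output_size(len(fmap), size, stride)
    return [[max(fmap[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(out)]
            for i in range(out)]

# First AlexNet layer: 227x227 input, 11x11 filter, stride 4.
print(conv_output_size(227, 11, stride=4))  # 55
# Assumed 3x3 pooling with stride 2 applied to the 55x55 map.
print(conv_output_size(55, 3, stride=2))    # 27

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]
activation_map = conv2d(image, kernel)
pooled = max_pool2d(image)
```

In the toy example, the 2×2 filter turns the 4×4 image into a 3×3 activation map, and 2×2 max pooling halves each side of the image while keeping the largest value of each window.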
The LeNet architecture, the first CNN network, was proposed by Yann LeCun in 1988 and continued to be improved until 1998. Its internal structure consists of convolutional and average pooling layers. These are followed by a flattening convolutional layer, then two fully connected layers, and finally a softmax classifier. LeNet uses 5×5 filters. The data size changes from 32×32×1 at the input to 28×28×6 after the first convolutional layer [24].
The VGG-16 architecture consists of a 16-layer network structure. In VGG-16, the input size is 224×224 pixels and the filter size is 3×3 elements. The structure of the VGG-16 architecture consists of five convolutional blocks (13 convolutional layers in total), each followed by a pooling layer, and three fully connected layers. The final layer is a Softmax layer used in the classification process [9].

C. Classifiers
The k-nearest neighbors (KNN) classifier determines which class an object belongs to by examining the properties of the objects [25]. A new sample is classified according to the classes of the samples closest to it in the feature space. To calculate the distance between the new data and the other data, measures such as the cosine, Euclidean, or Manhattan distance are used [16], [26].
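The role of the distance measure in KNN can be illustrated with a small sketch. The two-dimensional feature vectors and the value k = 3 are hypothetical; in this study the classifier operates on the attributes extracted by AlexNet, not on the toy points used here.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # One minus the cosine of the angle between the two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    # Sort the training samples by distance to the query and let the
    # k nearest ones vote on the class label.
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional feature vectors for two WBC classes.
train_x = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
           [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]]
train_y = ["lymphocyte"] * 3 + ["neutrophil"] * 3

print(knn_predict(train_x, train_y, [1.1, 1.0]))
print(knn_predict(train_x, train_y, [4.7, 5.1], distance=manhattan))
```

The distance function is passed as a parameter, so the same voting procedure can be reused with any of the three measures.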
The decision tree (DT) is one of the methods used in data mining classification and is frequently used to solve classification problems. A decision tree is created before the classification. Then, the rules produced from the decision tree are combined with the features extracted from the data set to build the classification process [12].
The QDA method is used in many pattern classification and machine learning applications. It models each class with its own mean and covariance, which yields quadratic decision boundaries between the classes. Discriminant analysis is also used in these applications as a dimensionality reduction technique in the preprocessing stage, where the aim is to avoid overfitting and to reduce the computing costs [27].
The purpose of the linear discriminant analysis (LDA) method is to transform the features into a lower dimensional space that maximizes the ratio of the between-class variance to the within-class variance. Logistic regression is a classification algorithm traditionally limited to two-class classification problems. When there are more than two classes, LDA is the preferred linear classification technique [13], [14].
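The difference between QDA and LDA can be shown on a single feature. The sketch below is a simplified, one-dimensional illustration with made-up data, not the classifier configuration used in this study: QDA gives each class its own variance, whereas LDA shares one pooled variance across classes.

```python
import math

def fit_gaussians(xs, ys):
    # Per-class mean, sample variance, and sample count for one feature.
    params = {}
    for label in sorted(set(ys)):
        values = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
        params[label] = {"mean": mean, "var": var, "n": len(values)}
    return params

def qda_predict(x, params):
    # Each class keeps its own variance, so the score is quadratic in x.
    total = sum(p["n"] for p in params.values())
    def score(p):
        return (math.log(p["n"] / total) - 0.5 * math.log(p["var"])
                - (x - p["mean"]) ** 2 / (2.0 * p["var"]))
    return max(params, key=lambda label: score(params[label]))

def lda_predict(x, params):
    # All classes share one pooled variance, so the boundary is linear.
    total = sum(p["n"] for p in params.values())
    pooled = (sum((p["n"] - 1) * p["var"] for p in params.values())
              / (total - len(params)))
    def score(p):
        return math.log(p["n"] / total) - (x - p["mean"]) ** 2 / (2.0 * pooled)
    return max(params, key=lambda label: score(params[label]))

# Made-up data: one tightly clustered class and one widely spread class.
xs = [0.9, 1.0, 1.1, 1.0, -4.0, 0.0, 4.0, 8.0]
ys = ["narrow"] * 4 + ["wide"] * 4
params = fit_gaussians(xs, ys)

print(qda_predict(-3.0, params), lda_predict(-3.0, params))
```

For the query x = -3, LDA picks the class with the nearer mean, while QDA recognises that such a value is very unlikely under the tightly clustered class and assigns it to the widely spread one.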
The support vector machine (SVM) method is a supervised machine learning method that can be used for classification. This method places the attributes obtained from each image as a point in the feature space. Then, the classification is performed by finding the hyperplane that best separates the two classes [15].
Finally, the Softmax method used in this study is a generalized form of the logistic regression (LR) method. In other words, it is used in classification processes where the class label can take more than two values. For example, on the MNIST data set it distinguishes and classifies the 10 different digits [17], [28].
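The relation between Softmax and logistic regression can be verified numerically. The following sketch is illustrative; the score values are arbitrary and are not outputs of the networks in this study.

```python
import math

def softmax(scores):
    # Subtracting the maximum keeps the exponentials numerically stable
    # without changing the resulting probabilities.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # Logistic function used by two-class logistic regression.
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary scores for the four WBC classes.
probabilities = softmax([2.0, 0.5, -1.0, 0.1])
predicted_class = probabilities.index(max(probabilities))

# With two classes and one score fixed at zero, Softmax reduces to the
# logistic function.
two_class = softmax([1.5, 0.0])[0]
print(probabilities, predicted_class, two_class, sigmoid(1.5))
```

The outputs sum to one and can be read as class probabilities; the class with the largest score receives the largest probability.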
In the final stage of the study, attribute reduction was performed on the attributes extracted by the AlexNet architecture using Principal Component Analysis (PCA) [29], [30]. The reduced attributes were then reclassified with QDA. Figure 4 shows the design of the architecture.
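The idea behind attribute reduction with PCA can be sketched on toy data. The example below is an assumption-laden illustration: it uses five made-up two-dimensional samples and finds only the first principal component by power iteration, a technique swapped in here for brevity; a full PCA would compute several components of the high-dimensional AlexNet attributes.

```python
import math

def covariance_matrix(rows):
    # Centre the data and compute the sample covariance matrix.
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered

def first_principal_component(cov, iterations=200):
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the direction of largest variance.
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Made-up samples lying close to the line y = 2x.
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.0], [4.0, 8.1], [5.0, 9.9]]
cov, centered = covariance_matrix(rows)
component = first_principal_component(cov)

# Each two-dimensional sample is reduced to a single attribute.
projected = [sum(c * v for c, v in zip(row, component)) for row in centered]
projected_variance = sum(p * p for p in projected) / (len(projected) - 1)
print(component, projected)
```

The reduced attribute retains almost all of the variance of the two original attributes, which is the property PCA exploits when the number of attributes is reduced before classification.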

III. RESULTS
The AlexNet architecture was implemented using the Matlab R2017b image processing software and the Python programming language. The models were trained and tested without transfer learning, i.e., the networks were not pretrained. The parameter values of the architectures used in this study are given in Table I. The parameter values of the architectures are the default values. However, the LeNet input size was taken as 227×227 pixels instead of 32×32 pixels, because reducing the image size adversely affects the attribute extraction. A mini-batch size of 32 was selected. Mini-batch training contributes to learning by processing a subset of the data set at each step. A larger batch processes more of the data at the same time, but since this is costly in terms of time consumption and memory usage, the size was not increased. The models were trained with GPU support. The software was installed on a 64-bit Windows 10 operating system. The computer used also had an NVIDIA GeForce graphics card with 2 GB of memory, an Intel Core i5 2.5 GHz processor, and 8 GB of RAM.
The validity of this study is assessed by sensitivity, specificity, and accuracy. The true positive (TP), true negative (TN), false positive (FP), and false negative (FN) counts are used to calculate these measures [31].
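The three measures follow directly from the four counts. The sketch below derives the counts for each class from a four-class confusion matrix in a one-vs-rest manner; the matrix values are hypothetical and are not results of this study.

```python
CLASSES = ["eosinophil", "lymphocyte", "monocyte", "neutrophil"]

def one_vs_rest_counts(cm, k):
    # Rows are true classes, columns are predicted classes.
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(row[k] for row in cm) - tp
    tn = total - tp - fn - fp
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

def overall_accuracy(cm):
    # Share of all samples that lie on the diagonal.
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)

# Hypothetical confusion matrix with 100 samples per true class.
cm = [[90, 2, 3, 5],
      [1, 97, 1, 1],
      [4, 1, 93, 2],
      [6, 0, 2, 92]]

for k, name in enumerate(CLASSES):
    print(name, metrics(*one_vs_rest_counts(cm, k)))
print("overall accuracy", overall_accuracy(cm))
```

For each class, the remaining three classes are treated together as the negative class, so every class obtains its own sensitivity, specificity, and accuracy.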
In the first stage (Table II), an attempt was made to select the best of the CNN models. Normally, the leading CNN architectures, such as AlexNet, LeNet, and VGG-16, perform the classification with the classic Softmax classifier. According to Fig. 5, the highest accuracy, a success rate of 84.47 %, is obtained with the AlexNet architecture. Therefore, the AlexNet architecture is selected as the feature extractor to be combined with different classifiers. One of the most important novelties of this article is the use of the AlexNet architecture with classifiers such as QDA, DT, KNN, and SVM. The experimental results of the proposed method are given in Table III.
In the second stage, the AlexNet architecture was used in the Matlab R2017b software. For the Softmax classifier, 30 % of the data set was used as test data and 70 % as training data. For the other classifiers, 10-fold cross-validation was applied to the data set. The AlexNet architecture was run through the Matlab interface, and the attributes of the image set were extracted. The DT, QDA, LDA, SVM, KNN, and Softmax methods were used as classifiers. The best success rate, 97.78 %, was obtained with the QDA classifier. The classification results are given in Table IV, which shows the confusion matrix values obtained from the classifiers. In this table, the success rate of each WBC class was calculated from the metric values in the confusion matrix. The overall success rate was calculated from the confusion matrix metric values of the four classes. The ROC curve and the confusion matrix obtained from the QDA classifier are given in Fig. 6.
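The two evaluation protocols used in this stage can be sketched as index splits. This is a minimal illustration, assuming a random shuffle with a fixed seed; how the Matlab tools actually partitioned the 12,435 images is not stated in this article.

```python
import random

def holdout_split(n_samples, test_fraction=0.3, seed=0):
    # Shuffle the sample indices once and cut off the test share.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(n_samples, k=10, seed=0):
    # Every sample is used for testing exactly once and for training
    # in the remaining k - 1 folds.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

N_IMAGES = 12435  # total number of images in the data set

train_idx, test_idx = holdout_split(N_IMAGES)
print(len(train_idx), len(test_idx))  # 8705 3730

splits = list(k_fold_indices(N_IMAGES, k=10))
print([len(test) for _, test in splits])
```

In 10-fold cross-validation the reported success rate is typically the average over the ten test folds, which makes it less dependent on a single random split than the 70/30 hold-out estimate.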
In the third stage, the attributes obtained in the second stage and the QDA classifier were used. The classification was repeated after applying the PCA method. A success rate of approximately 83.9 % was achieved with PCA. Table V shows that the number of attributes decreased with PCA and that the success rate of the classification decreased from 97.78 % to 83.9 %. With the PCA method, fewer attributes and less time were required.
The results of this experimental study were compared with the other studies using the same data set in the literature.The comparison results are given in Table VI.

IV. DISCUSSION
One of the methods for increasing the success rate on the existing images is the application of the super pixel method. Sudhir Sornapudi et al. applied super pixels to the data set using the simple iterative clustering algorithm [35]. In their study, they reported that applying the super pixel method homogeneously to the data set increases the success rate. The super pixel method may therefore be used in future studies to increase the success rate.
In this study, the use of the PCA method within the CNN model did not contribute positively to the performance results obtained. However, Table V shows that the classification is not time consuming with PCA even when the data set is large. This study was also compared with other studies in the literature that use the same data set. According to the comparison, the best success rate was obtained by the method proposed in this study.
V. CONCLUSIONS
WBCs fight infections in our body. Therefore, over time, the WBC count may increase or decrease. If the WBC count exceeds the normal range, leukocyte elevation occurs. If the WBC count is below the normal range, a deficiency occurs. The WBC test provides information on the amount of white blood cells in the blood. A white blood cell count outside the normal range is associated with various diseases. In this study, we classified the WBC images into subcategories.
With the proposed method, feature extraction and classification were performed on the existing data set using the AlexNet, LeNet, and VGG-16 architectures. In all processing steps, the AlexNet architecture gave the best attribute extraction and classification results. The classification results of the AlexNet architecture with the DT, QDA, LDA, SVM, and KNN classifiers were compared with those of the Softmax classifier. The best performance, approximately 97.78 %, was achieved by using the QDA classifier.

Fig. 2. Stage 1 of the proposed model: selection of the best model using various models.

Fig. 3. Stage 2 of the proposed model: classification of the best model using various classifiers.

Fig. 4. Application of PCA to best performing model and classifier.

V3, Xception from CNN models. In the next stage, they classify by combining the CNN and RNN models. The authors achieve the best success with the combination of the two models; the success rate is 90.79 %. Ioannis E. Livieris et al. [33] perform the classification using an SSL model. With the SSL model, they label the data set at ratios of 10 %, 20 %, 30 %, and 40 %, and observe the contribution of the labeling rate to the success rate. The authors achieve 93.29 % success with the KNN classifier. Dana Bani-Hani et al. [34] use GA, one of the optimization methods, in their study. They obtain a limited number of populations from the data set they use with the CNN model. Despite this, the authors achieve 91 % success in the classification.

TABLE I. PARAMETER VALUES OF THE ARCHITECTURES USED IN THE PROPOSED STUDY.

TABLE II. TRAINING AND TEST STATISTICS FOR THE DATA SET USED IN THE FIRST STAGE.

TABLE III. CLASSIFICATION RESULTS OF CNN ARCHITECTURES COMPILED IN PYTHON.

TABLE IV. THE SUCCESS RATE OF ALEXNET ARCHITECTURE USING DIFFERENT CLASSIFIERS IN MATLAB.