
Assessment of combined textural and morphological features for diagnosis of breast masses in ultrasound

Abstract

The objective of this study is to assess the combined performance of textural and morphological features for the detection and diagnosis of breast masses in ultrasound images. We extracted a total of forty-four features using textural and morphological techniques. A support vector machine (SVM) classifier is used to classify the tumors as benign or malignant. The performance of individual as well as combined features is assessed using accuracy (Ac), sensitivity (Se), specificity (Sp), Matthews correlation coefficient (MCC) and the area \( A_Z \) under the receiver operating characteristic curve. The individual feature sets produced classification accuracies ranging from 61.66% to 90.83%; when features from each category were combined, the accuracy improved to between 79.16% and 95.83%. Moreover, the combination of gray level co-occurrence matrix (GLCM) features and the ratio of perimeters (\( P_{ratio} \)) gave the highest performance among all feature combinations (Ac 95.83%, Se 96%, Sp 95.71%, MCC 0.9146 and \( A_Z \) 0.9444). The results indicate that the discrimination performance of a computer-aided breast cancer diagnosis system increases when textural and morphological features are combined.


Introduction

Breast cancer is the most frequently diagnosed cancer worldwide, accounting for 23% (1.38 million) of total cancer cases. It was responsible for about 14% (458,400) of total cancer deaths in 2008 and is the leading cause of cancer mortality in females. Almost half of breast cancer cases and 60% of the deaths occur in economically developing countries such as India. The availability of early detection facilities in developed countries contributes to the variation in incidence rates [1].

Mammography has been the primary tool for breast cancer screening. However, its ionizing radiation increases the health risk to patients and radiologists and, depending on the age and breast density of the patient, mammography screening is associated with a false-negative rate of 10–20% [2]. Mammography also has difficulty detecting breast cancer in adolescent women with dense breasts. Ultrasound (US) imaging is attracting increasing interest in breast cancer detection and diagnosis as an effective alternative to mammography. Ultrasound can be used to characterize a breast lesion as solid or cystic. It stages breast cancer more precisely and assists the physician in guided biopsy [3]. Ultrasound is an effective, convenient, inexpensive, real-time imaging tool, free of ionizing radiation, for the diagnosis of breast tumors in clinics [4].

Employing computer algorithms on ultrasound images improves radiologists’ accuracy in distinguishing malignant from benign breast masses [5]. Computer-aided diagnosis (CAD) systems have been introduced to improve the capability of radiologists in interpreting and recognizing breast masses [6,7]. A CAD system increases the efficiency of radiologists, and their interpretation can also be improved in terms of accuracy, sensitivity and consistency in the discrimination of masses. Overall productivity increases because less time is required for radiologists to read the ultrasonograms manually [8].

It is essential to quantify the characteristics of breast tumors for detection as well as discrimination, but these characteristics are often difficult to grasp due to intrinsic limitations of the ultrasound imaging process, such as low contrast, speckle noise, heterogeneity or artifacts. It is therefore important to explore the feature, or set of features, that best quantifies the characteristics of tumors [9]. General guidelines [10] for identifying significant features that lead to accurate diagnosis are discrimination, reliability, independence and optimality. However, simply combining a number of the best performing features does not make the system effective; the objective is to pass a set of effective features to the classification stage while reducing redundancy [11].

Textural features extracted from ultrasound images are efficient for classifying breast tumors [12]. Gomes et al. [13] extracted twenty-two textural features through the gray level co-occurrence matrix (GLCM) using 436 breast ultrasound images and obtained a good classification rate, with an area \( A_Z \) of 0.87. The histogram, GLCM and gray level run-length matrix (GLRLM) were used to extract textural features from 5500 prostate cancer images, and an accuracy of 92.83% was achieved in differentiating tumors by combining all three feature types [14]. However, textural features are effective with a specific ultrasound system, and their precision decreases with images acquired from different US systems or with different US settings. An alternative is to use morphological features of the tumor, which are almost independent of the sonographic gain setting and of the US machine [15]. Huang et al. [15] extracted nineteen morphological features from 118 breast ultrasound images and, using a support vector machine (SVM) classifier, achieved an \( A_Z \) of 0.909. Alvarenga et al. [16] investigated seven morphological parameters for distinguishing malignant from benign breast tumors in ultrasound images. The morphological features were derived through the convex polygon technique and the normalized radial length (nrl), achieving an \( A_Z \) of 0.865.

Very few works in the literature have concentrated on combining textural and morphological features in the diagnosis of breast tumors in ultrasound. Wu et al. [17] combined auto-covariance texture features and morphological features extracted from 210 breast ultrasound images to discriminate breast tumors, achieving an accuracy of 92.86% using SVM classification. In a later work [18] using the same database, they achieved an accuracy of 96.14% using an SVM-genetic algorithm based classifier. Alvarenga et al. [19] evaluated the combined performance of twenty textural and seven morphological features in distinguishing breast tumors. An accuracy of 85.37% was achieved on a database of 246 ultrasound images using Fisher linear discriminant analysis (FLDA) classification.

In this work, our objective is to assess the individual and combined performance of textural and morphological features for discriminating breast masses in ultrasound images. Figure 1 shows the four stages of an automated computer-aided diagnosis system. We extracted thirty-nine textural and five morphological features from each region of interest (ROI) of the breast ultrasound images in the entire database using an automated segmentation method. A support vector machine classifier with a 10-fold cross-validation scheme is used to assess the individual and combined features using statistical parameters and ROC analysis.

Figure 1

The different stages of the breast ultrasound diagnosis system used for current study.

Methods

Image database

The breast ultrasound database consists of 120 images: 70 benign and 50 malignant. The images used in this study were collected through [20], which complies with the HONcode (Health On the Net Foundation) standard for trustworthy health information. The study protocols were approved by the institutional ethics committee of Gelderse Vallei Hospital, Ede, the Netherlands, with the consent of the patients.

Pre processing and segmentation

Speckle noise, a characteristic artifact in ultrasound images, significantly degrades the image quality and obscures the finer details that are essential for discrimination. Image segmentation divides an image into non-overlapping regions and is essential for detecting breast lesions and making correct diagnoses in CAD systems [21]. Accurate segmentation requires removal of speckle noise [22] and enhancement of lesion edges [23]. The Non-Local Means (NLM) filter proposed by Buades et al. [24] is based on self-similarity, or photometric closeness, between two pixels. It measures the similarity between two pixels by evaluating the distance between small patches centered on these two pixels [24,25]. The NLM has proved to be an effective and suitable filter for removing speckle noise without affecting the fine details present in the tumor region in medical ultrasound images [26]. In block-wise NLM [24], for each overlapping block \( B_{ik} \) centered on pixel \( ik \), the NLM restoration is computed as

$$ NL(u)\left({B}_{ik}\right)={\displaystyle \sum_{j\epsilon I}}\omega \left({B}_{ik},{B}_j\right)u\left({B}_j\right) $$
(1)

where \( u(B_j) \) is the intensity of block \( B_j \) and \( \omega(B_{ik}, B_j) \) is the weight assigned to \( u(B_j) \) in the restoration of block \( B_{ik} \). For a pixel \( i \) included in several blocks \( B_{ik} \), several estimates of the restored intensity \( NL(u)(i) \) are obtained from the different \( NL(u)(B_{ik}) \). The weights [24] are defined as

$$ \omega \left({B}_{ik},{B}_j\right)=\frac{1}{Z_{ik}}{e}^{-\frac{{\left\Vert u\left({B}_{ik}\right)-u\left({B}_j\right)\right\Vert}_2^2}{h^2}} $$
(2)

where \( Z_{ik} \) is the normalization constant that ensures \( {\displaystyle \sum_{B_j}}\omega \left({B}_{ik},{B}_j\right)=1 \). The weight is a decreasing function of the similarity term \( {\left\Vert u\left({B}_{ik}\right)-u\left({B}_j\right)\right\Vert}_2^2 \), the squared Euclidean distance between the two blocks, so more similar blocks contribute more to the restoration; h is the filtering parameter. We used a small search window of 13 × 13, a patch size of 5 × 5 and a filtering parameter h = 15, as suggested in [27]. Figure 2 shows a benign and Figure 3 a malignant breast ultrasound image. The original images are shown in (A) and the images after speckle removal with the NLM method [24] in (B).
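The weighting in Eqs. (1) and (2) can be illustrated with a minimal sketch in plain Python. It is not the authors' MATLAB implementation: the function names `nlm_weights` and `nlm_restore` and the flattened toy patches are assumptions made for illustration, and the search-window handling of a full block-wise filter is omitted.

```python
import math

def nlm_weights(ref_patch, candidate_patches, h):
    """Eq. (2): weights exp(-||u(B_ik) - u(B_j)||^2 / h^2), normalised to sum to 1."""
    raw = []
    for patch in candidate_patches:
        dist2 = sum((a - b) ** 2 for a, b in zip(ref_patch, patch))
        raw.append(math.exp(-dist2 / h ** 2))
    z = sum(raw)  # normalisation constant Z_ik
    return [r / z for r in raw]

def nlm_restore(ref_patch, candidate_patches, h):
    """Eq. (1): restored block as the weighted average of the candidate blocks."""
    w = nlm_weights(ref_patch, candidate_patches, h)
    return [sum(w[j] * candidate_patches[j][p] for j in range(len(w)))
            for p in range(len(ref_patch))]

# Hypothetical flattened 2x2 patches (toy values, not taken from the database)
ref = [10.0, 12.0, 11.0, 13.0]
cands = [[10.0, 12.0, 11.0, 13.0],   # identical patch
         [11.0, 13.0, 12.0, 14.0],   # similar patch
         [90.0, 95.0, 92.0, 99.0]]   # very different patch
weights = nlm_weights(ref, cands, h=15.0)
restored = nlm_restore(ref, cands, h=15.0)
```

The very different patch receives a weight close to zero, which is why the filter can smooth speckle while leaving dissimilar structures such as lesion edges largely untouched.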

Figure 2

An image of a benign cyst. (A) Original image (B) Preprocessed image using NLM filter (C) Segmented image.

Figure 3

An image of a malignant tumor (A) Original image (B) Preprocessed image using NLM filter (C) Segmented image.

An automatic clustering-based segmentation method [28] is employed to obtain the contour of the lesion. Fuzzy C-Means (FCM) clustering [28,29], a complex non-linear model, is used for segmentation. The objective function of FCM is given by

$$ J={\displaystyle \sum_{i=1}^c}{\displaystyle \sum_{j=1}^n}{\mu}_{ij}^m{\left\Vert {s}_j-{a}_i\right\Vert}^2 $$
(3)

where \( a_1, a_2, \dots, a_c \) are the c cluster centres, \( \mu_{ij} \) represents the membership of pixel \( s_j \) in the i-th cluster and \( a_i \) is the centre of that cluster. The constant m controls the fuzziness. The membership functions and cluster centres are updated as follows:

$$ {\mu}_{ij}=\frac{1}{{\displaystyle {\sum}_{k=1}^c}\ {\left(\frac{\left\Vert {s}_j-{a}_i\right\Vert }{\left\Vert {s}_j-{a}_k\right\Vert}\right)}^{2/\left(m-1\right)}} $$
(4)
$$ {a}_i=\frac{{\displaystyle {\sum}_{j=1}^n}\ {\mu}_{ij}^m\ {s}_j}{{\displaystyle {\sum}_{j=1}^n}\ {\mu}_{ij}^m},i=1,2,\dots, c $$
(5)

Iterating Eqs. (4) and (5) moves \( a_i \) and \( \mu_{ij} \) gradually in the direction that minimizes the objective function; the iteration stops when the change in \( a_i \) or \( \mu_{ij} \) falls within a given tolerance. However, the algorithm initializes the c cluster centres randomly and seeks the solution by iterating the cluster centres and the partition matrix through a local search strategy based on the gradient method. FCM is therefore sensitive to its initial values, and different initial cluster centres lead to different clustering results. Because it often gets stuck at local minima and the result depends largely on the choice of the initial cluster centres [29], the particle swarm optimization (PSO) algorithm is employed to improve the search capability of FCM.
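The alternation between the membership and centre updates can be sketched as follows for scalar pixel intensities. This is a minimal illustration rather than the segmentation code used in the study: the function name `fcm_1d`, the fixed starting centres and the toy intensities are assumptions, and the special case of a sample that coincides with a centre is handled by an ad hoc rule.

```python
def fcm_1d(samples, centres, m=2.0, tol=1e-6, max_iter=100):
    """Alternate Eq. (4) (memberships) and Eq. (5) (centres) on scalar intensities."""
    centres = list(centres)
    c, n = len(centres), len(samples)
    u = [[0.0] * n for _ in range(c)]
    for _ in range(max_iter):
        # Eq. (4): membership of sample j in cluster i
        for j, s in enumerate(samples):
            d = [abs(s - a) for a in centres]
            if 0.0 in d:  # sample sits exactly on a centre (ad hoc handling)
                hits = [i for i in range(c) if d[i] == 0.0]
                for i in range(c):
                    u[i][j] = 1.0 / len(hits) if i in hits else 0.0
            else:
                for i in range(c):
                    u[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                        for k in range(c))
        # Eq. (5): centres as membership-weighted means
        new = []
        for i in range(c):
            num = sum(u[i][j] ** m * samples[j] for j in range(n))
            den = sum(u[i][j] ** m for j in range(n))
            new.append(num / den)
        shift = max(abs(a - b) for a, b in zip(new, centres))
        centres = new
        if shift < tol:  # stop when the centres no longer move
            break
    return centres, u

pixels = [10, 12, 11, 9, 200, 205, 198, 202]  # toy dark and bright intensities
centres, memberships = fcm_1d(pixels, centres=[0.0, 255.0])
```

Starting the same routine from different `centres` is a simple way to observe the sensitivity to initialization described above.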

PSO is a population-based heuristic method, inspired by the group behavior of flocks of birds, for finding optimal solutions to non-linear numeric problems. In PSO, a particle is an individual, and a number of particles are grouped as a swarm [29]. The velocity and position of a particle at the next iteration are calculated using the following equations:

$$ {V}_i\left(t+1\right)=w\cdot {V}_i(t)+{c}_1\cdot {r}_1\cdot \left({P}_i-{X}_i\right)+{c}_2\cdot {r}_2\cdot \left({P}_g-{X}_i\right) $$
(6)
$$ {X}_i\left(t+1\right)={X}_i(t)+{V}_i\left(t+1\right) $$
(7)

where \( V_i = [v_{i,1}, v_{i,2}, \dots, v_{i,n}] \) and \( X_i = [x_{i,1}, x_{i,2}, \dots, x_{i,n}] \) represent the velocity and position of particle i, and \( P_i \) and \( P_g \) represent the best position found by the particle itself (local best) and by the whole swarm (global best). The inertia weight w controls the impact of the particle's previous velocity on its current one; \( c_1 \) and \( c_2 \) are acceleration coefficients; \( r_1 \) and \( r_2 \) are two independent random variables uniformly distributed between 0 and 1. PSO, with its efficient and adaptive search process, provides near-optimal solutions of an evaluation (fitness) function in an optimization problem and keeps FCM from becoming trapped in local minima. A set of morphological operations [28] is performed as post-processing in order to obtain the exact contour of the tumor. The parameters used for PSO-FCM clustering are: number of particles n = 50, maximum inertia weight \( w_{max} = 0.9 \), minimum inertia weight \( w_{min} = 0.4 \), \( c_1 = 2 \) and \( c_2 = 2 \) [28,29]. The segmented images after morphological operations are shown in Figure 2(C) and Figure 3(C).
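The update rules in Eqs. (6) and (7) can be sketched as below, using the swarm size and coefficients quoted above. Several details are assumptions made for illustration and are not stated in the text: the inertia weight is decreased linearly from \( w_{max} \) to \( w_{min} \), the random numbers are drawn per dimension, the iteration count is 100, and the toy `sphere` function stands in for the FCM objective.

```python
import random

def inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight (an assumed schedule)."""
    return w_max - (w_max - w_min) * t / max(t_max - 1, 1)

def pso_minimise(fitness, dim, n_particles=50, t_max=100, c1=2.0, c2=2.0, seed=0):
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                      # best position of each particle, P_i
    p_fit = [fitness(x) for x in X]
    g_idx = min(range(n_particles), key=p_fit.__getitem__)
    G, g_fit = P[g_idx][:], p_fit[g_idx]       # best position of the swarm, P_g
    history = [g_fit]
    for t in range(t_max):
        w = inertia(t, t_max)
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (6): velocity update
                V[i][k] = (w * V[i][k] + c1 * r1 * (P[i][k] - X[i][k])
                           + c2 * r2 * (G[k] - X[i][k]))
                # Eq. (7): position update
                X[i][k] += V[i][k]
            f = fitness(X[i])
            if f < p_fit[i]:
                P[i], p_fit[i] = X[i][:], f
                if f < g_fit:
                    G, g_fit = X[i][:], f
        history.append(g_fit)
    return G, g_fit, history

def sphere(x):
    """Toy fitness function; in PSO-FCM the fitness would be the FCM objective J."""
    return sum(v * v for v in x)

best, best_fit, history = pso_minimise(sphere, dim=2)
```

In a PSO-FCM setting each particle position would encode a candidate set of cluster centres, so the global best supplies a starting point that is less dependent on a single random initialization.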

Feature extraction

The features in breast ultrasound images can be divided into four categories: texture, morphology, model-based and descriptor features [30]. Relevant features from a specific category, or a set of features from two or more categories, need to be extracted and selected for discriminating tumors in the classification stage [11].

Textural features

Texture is an important characteristic for identifying an object or region of interest in an image. Although texture and tone bear an inextricable relationship to one another in an image patch, texture is the dominant property when the patch has a wide variation of discrete gray-tone features [31]. The textural features used in this work include six histogram features, one Markov Random Field (MRF) based feature, three Tamura features, seven features of the Grey Level Run-Length Matrix (GLRLM) and twenty-two features of the Gray Level Co-occurrence Matrix (GLCM).

Histogram features

The intensity histogram of an image is closely related to image characteristics such as brightness and contrast [14]. These features are computed from the histogram distribution of the image. The six histogram features, namely mean, variance, skewness, kurtosis, entropy and energy (F01-F06), are listed in Additional file 1 along with their equations.
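As a rough illustration of the six first-order features, the sketch below computes them from the normalized gray-level histogram. The formulas are the usual textbook definitions and are assumed here; the exact expressions in Additional file 1 may differ (for example in the logarithm base or in whether excess kurtosis is used), and the function name `histogram_features` is hypothetical.

```python
import math
from collections import Counter

def histogram_features(pixels):
    """First-order statistics from the normalised grey-level histogram p(g)."""
    n = len(pixels)
    p = {g: c / n for g, c in Counter(pixels).items()}
    mean = sum(g * pg for g, pg in p.items())
    var = sum((g - mean) ** 2 * pg for g, pg in p.items())
    sd = math.sqrt(var)
    # Skewness and kurtosis are undefined for a flat region; report 0 there.
    skew = sum((g - mean) ** 3 * pg for g, pg in p.items()) / sd ** 3 if sd else 0.0
    kurt = sum((g - mean) ** 4 * pg for g, pg in p.items()) / sd ** 4 if sd else 0.0
    entropy = -sum(pg * math.log2(pg) for pg in p.values())
    energy = sum(pg ** 2 for pg in p.values())
    return {"mean": mean, "variance": var, "skewness": skew,
            "kurtosis": kurt, "entropy": entropy, "energy": energy}

flat = histogram_features([7, 7, 7, 7])    # uniform region
mixed = histogram_features([0, 0, 2, 2])   # two equally likely grey levels
```

A uniform region gives zero variance and entropy with an energy of 1, while spreading the pixels over more gray levels raises the entropy and lowers the energy.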

MRF features

A Markov Random Field (MRF) represents the distribution of conditional probabilities over the elements of a lattice, where the probability assumed by an element depends only on the values of its neighbors [32]. In this context, each pixel in the image is considered as a random variable \( X_r \) which takes a value \( x_r \in \{0, 1, 2, \dots, G-1\} \), where G is the number of gray levels. If \( \eta_r \) is the neighbor set of \( X_r \), the conditional probability is written as \( P(X_r = x_r \mid \eta_r) \).

The auto-logistic probability distribution model of MRF (F07) is presented as [32]

$$ P\left(\ {X}_r = {x}_r\Big|{\eta}_r\right)=\frac{e^{x_rT}}{1+{e}^T} $$
(8)

where T depends on the neighborhood. The features used by the classifier are the free parameters contained in T, and the number of parameters depends on the neighborhood order [32].

Tamura features

Six textural features were defined by Tamura et al. [33]: coarseness, contrast, directionality, line-likeness, regularity and roughness. Coarseness is a fundamental texture feature related to scale and repetition rates in an image; it identifies the largest size at which a texture exists. Contrast represents the dynamic range of grey levels in an image. Directionality, a global property, describes the total degree of directionality over a region. The values of these three features are calculated for each pixel to form a Tamura CND image [34]. The descriptions of the three features (F08-F10) are given in Additional file 1.

GLRLM

Texture can be characterized by patterns of grey-intensity pixels along a particular direction from a reference pixel, and the grey level run-length matrix (GLRLM) captures these patterns in a matrix from which features can be extracted [35]. The number of adjacent pixels with the same grey intensity in a specific direction is known as the run length; such a set of consecutive, collinear pixels with the same gray level is a gray level run. The GLRLM is a two-dimensional matrix in which each element \( g(i, j) \) gives the number of runs of length j with intensity i in the direction θ. The GLRLM [35] is computed as

$$ R\left(\theta \right)=\left(g\left(i,j\right)\Big|\theta \right),0\le i\le {N}_g,0\le j\le {R}_{max} $$
(9)

where \( N_g \) is the maximum gray level and \( R_{max} \) is the maximum run length. The names and descriptions of the seven GLRLM features (F11-F17) are listed in Additional file 1.
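A minimal sketch of how a run-length matrix can be accumulated for the horizontal direction (θ = 0°) is given below. The function names and the toy image are assumptions; short-run emphasis is shown as one commonly used GLRLM feature, and whether it matches one of F11-F17 depends on Additional file 1.

```python
def glrlm_horizontal(image, levels):
    """Run-length matrix for theta = 0: R[i][j-1] counts runs of grey level i with length j."""
    max_run = max(len(row) for row in image)
    R = [[0] * max_run for _ in range(levels)]
    for row in image:
        start = 0
        for x in range(1, len(row) + 1):
            # A run ends at the row border or when the grey level changes.
            if x == len(row) or row[x] != row[start]:
                R[row[start]][x - start - 1] += 1
                start = x
    return R

def short_run_emphasis(R):
    """Runs weighted by 1/j^2 and normalised by the total number of runs."""
    total = sum(sum(row) for row in R)
    weighted = sum(R[i][j] / (j + 1) ** 2
                   for i in range(len(R)) for j in range(len(R[i])))
    return weighted / total

img = [[0, 0, 1, 1],
       [0, 2, 2, 2],
       [3, 3, 3, 3]]
R = glrlm_horizontal(img, levels=4)
sre = short_run_emphasis(R)
```

The other three orientations (45°, 90°, 135°) follow the same pattern, with the scan running along diagonals or columns instead of rows.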

GLCM

The gray level co-occurrence matrix (GLCM) is a second-order method for generating texture features. The GLCM comprises the joint frequencies of all pairwise gray level combinations (i, j) with a separation of d along direction θ. By using a distance of one pixel and angles quantized to 45° intervals, four matrices (horizontal, first diagonal, vertical and second diagonal) are obtained [31,36]. For the four principal directions, the unnormalized frequency is defined as follows:

$$ \begin{array}{l}P\left(i,j,d,\theta \right)=\#\Big\{\left(\left(k,l\right),\left(m,n\right)\right)\in \left({L}_x\times {L}_y\right)\times \left({L}_x\times {L}_y\right)\ \Big|\\ {}\quad \left(k-m=0,\ \left|l-n\right|=d\right)\ \mathrm{or}\ \left(k-m=d,\ l-n=-d\right)\ \mathrm{or}\ \left(k-m=-d,\ l-n=d\right)\\ {}\quad \mathrm{or}\ \left(\left|k-m\right|=d,\ l-n=0\right)\ \mathrm{or}\ \left(k-m=d,\ l-n=d\right)\ \mathrm{or}\ \left(k-m=-d,\ l-n=-d\right),\\ {}\quad I\left(k,l\right)=i,\ I\left(m,n\right)=j\Big\}\end{array} $$
(10)

where # denotes the number of elements in the set, (k, l) and (m, n) are the coordinates of pixels with gray levels i and j, \( L_x \) and \( L_y \) are the horizontal and vertical spatial domains, and \( L_x \times L_y \) is the set of resolution cells [36]. In total, twenty-two features (F18-F39) are extracted through the GLCM. The features and their descriptions are given in Additional file 1.
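The counting in Eq. (10) can be sketched for a single offset as follows. The helper names `glcm` and `glcm_features` are hypothetical, the 4 × 4 image is a toy example with four gray levels, and only three of the usual co-occurrence features are shown under their standard definitions; the full set of twenty-two features is defined in Additional file 1.

```python
def glcm(image, levels, dr, dc, symmetric=True):
    """Count co-occurring grey-level pairs at the pixel offset (dr, dc)."""
    P = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                P[i][j] += 1
                if symmetric:  # also count the pair in the opposite direction
                    P[j][i] += 1
    return P

def glcm_features(P):
    """Contrast, energy and homogeneity from the normalised matrix."""
    total = sum(sum(row) for row in P)
    p = [[v / total for v in row] for row in P]
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {"contrast": sum((i - j) ** 2 * p[i][j] for i, j in cells),
            "energy": sum(p[i][j] ** 2 for i, j in cells),
            "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)}

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
P0 = glcm(img, levels=4, dr=0, dc=1)     # theta = 0 degrees, d = 1
P45 = glcm(img, levels=4, dr=-1, dc=1)   # theta = 45 degrees, d = 1 (up and right)
feats = glcm_features(P0)
```

Changing the offset to `dr=-2, dc=2` would give the θ = 45°, d = 2 setting, and quantizing the image to 32 levels beforehand keeps the matrix at 32 × 32.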

Morphological features

The morphological features are obtained from local characteristics such as the shape and margin of the breast lesion. Because the borders of benign tumors are smoother than those of malignant ones, breast tumors can be evaluated based on their morphological information [16]. Five morphological features (F40-F44) are derived using the convex polygon technique, which is based on the determination of the convex hull [15-19,36,37].

The overlap ratio (RS) is the ratio of the actual tumor area to the area of its convex hull (F40), which is given as [16]

$$ RS=\frac{A(S)}{A\left({S}_0\right)} $$
(11)

where \( S_0 \) is the convex hull established for region S and \( A(\cdot) \) denotes area, so RS is the ratio of the number of pixels in S to the number of pixels in \( S_0 \).

The aspect ratio (F41) is the ratio of a tumor’s depth to its width. If the depth exceeds the width (that is, the ratio is greater than 1), the tumor has a higher probability of being malignant [15].

$$ Aspect\ ratio = \frac{D_{depth}}{D_{width}} $$
(12)

The circularity (C) is an important parameter in breast tumor classification [16]. It is the ratio of the square of the tumor perimeter P to the tumor area (F42), given as

$$ C = \frac{P^2}{A(S)} $$
(13)

The normalized residual value (NRV) [16] is given as (F43)

$$ NRV = \frac{A\left({S}_0\right)-A(S)}{P_0} $$
(14)

where \( P_0 \) is the perimeter of the convex hull \( S_0 \). The NRV is the ratio between the residual area and this perimeter.

The length of the tumor perimeter is an important indicator of malignancy, as malignant tumors usually have irregular shapes with a large tumor perimeter [15]. The ratio of the tumor perimeter (P) to the perimeter of the convex hull (\( P_0 \)), denoted \( P_{ratio} \) (F44), is given as [18]

$$ {P}_{ratio}=\frac{P}{P_0} $$
(15)

It is important to note that, unlike textural features, morphological features have the advantage of being independent of the US system and its settings in the diagnosis of breast tumors [18].
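The five measures in Eqs. (11) to (15) can be sketched for a lesion contour given as a polygon. This is an illustration under stated assumptions: the contour is a list of (x, y) vertices rather than a pixel mask, depth is taken as the vertical extent and width as the horizontal extent of the contour, and the helper names are hypothetical. The convex hull is computed with Andrew's monotone chain algorithm, which the paper does not specify.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out[:-1]
    return half(pts) + half(reversed(pts))

def area(poly):
    """Polygon area by the shoelace formula."""
    return abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))) / 2

def perimeter(poly):
    return sum(math.dist(a, b) for a, b in zip(poly, poly[1:] + poly[:1]))

def morphological_features(contour):
    hull = convex_hull(contour)
    A, A0 = area(contour), area(hull)
    P, P0 = perimeter(contour), perimeter(hull)
    xs, ys = [p[0] for p in contour], [p[1] for p in contour]
    depth, width = max(ys) - min(ys), max(xs) - min(xs)  # assumed orientation
    return {"RS": A / A0,                   # Eq. (11)
            "aspect_ratio": depth / width,  # Eq. (12)
            "circularity": P ** 2 / A,      # Eq. (13)
            "NRV": (A0 - A) / P0,           # Eq. (14)
            "P_ratio": P / P0}              # Eq. (15)

square = [(0, 0), (4, 0), (4, 4), (0, 4)]            # smooth, convex outline
notched = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]   # outline with a deep notch
f_square = morphological_features(square)
f_notched = morphological_features(notched)
```

For the convex square the overlap ratio and \( P_{ratio} \) are both 1 and the NRV is 0; the notched outline lowers RS and raises NRV, circularity and \( P_{ratio} \), which is the behavior the features rely on to flag irregular contours.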

Classification

The support vector machine (SVM) is an effective learning technique that constructs an optimal separating hyperplane in a high-dimensional feature space. The SVM maps the input vectors into this feature space through a non-linear mapping, and an optimal separating hyperplane is then chosen. The classification process involves training and testing on data consisting of a number of instances [38,39]. The SVM uses the training data to create the optimal separating hyperplane between the classes.

For a given set of points \( \{(x_i, y_i)\}_{1\le i\le N} \), where \( x_i \in R^n \) is the ith input vector and \( y_i \in \{-1, 1\} \) is the desired output (class label), the SVM finds a hyperplane that separates the training data with maximal margin. This optimal separating hyperplane (OSH), \( w\cdot x + b = 0 \), maximizes the margin to the closest data points. The data points on the margin border are called support vectors. The classification solution [38] is given by the function:

$$ f(x)= sign\left({\displaystyle \sum_{i=1}^N}{\alpha}_i{y}_i\ K\left({x}_i,x\right)+b\ \right) $$
(16)

where \( \alpha_i \) is the positive Lagrange multiplier, \( x_i \) is a support vector (N in total) and \( K(x_i, x) \) is the kernel function. Among the kernel functions used in SVMs, such as the linear, polynomial and radial basis function (RBF) kernels, the RBF is used most often since it is suitable for classifying multidimensional data. The RBF [39] is given as

$$ K\left(x,y\right)= \exp \left(-\gamma {\left\Vert x-y\right\Vert}^2\right) $$
(17)

Moreover, an SVM with the RBF kernel has few parameters: only the penalty parameter C and the kernel parameter γ need to be set to appropriate values.
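To make the roles of the kernel and the decision function concrete, the sketch below evaluates the standard RBF kernel, exp(−γ‖x − y‖²), and the sign rule of Eq. (16) in plain Python. The support vectors, multipliers and bias are hand-picked toy values rather than a trained model, and the function names are hypothetical; training, which is where C enters, is not shown.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def svm_decide(x, support_vectors, labels, alphas, b, gamma):
    """Eq. (16): sign of the kernel expansion over the support vectors."""
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if s >= 0 else -1

# Hand-picked toy values in a two-feature space scaled to [-1, 1]
svs = [(-0.8, -0.6), (0.7, 0.9)]
ys = [-1, 1]
alphas = [1.0, 1.0]
pred_a = svm_decide((-0.7, -0.5), svs, ys, alphas, b=0.0, gamma=0.2)
pred_b = svm_decide((0.6, 0.8), svs, ys, alphas, b=0.0, gamma=0.2)
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a smaller neighborhood; this is why the choice of γ affects accuracy.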

Results

The ROIs are generated for all 120 US images in the database through image segmentation. For each ROI, a total of forty-four features are extracted: six histogram, one MRF, three Tamura, seven GLRLM and twenty-two GLCM features, together with five morphological features. We used 10-fold cross-validation [39] for classification (k = 10), where the 120 images are divided into 10 partitions. A total of k runs are required to complete the overall classification task with a set of features. Among the k parts of the data, (k-1) parts are used to train the classifier while one part is held out for testing. This process is repeated 10 times so that all parts of the data are tested. The experiments were run using MATLAB 7.0 (Mathworks Inc, USA) on a computer with an Intel Core i3 processor (Intel Corp, USA), 8 GB RAM and the Windows 7 operating system (Microsoft Corp, USA).
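The partitioning scheme described above can be sketched as a small index generator. The shuffling with a fixed seed and the function name `k_fold_indices` are assumptions; the text does not say whether the folds were shuffled or stratified by class.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices once and deal them into k disjoint test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Training set: every fold except the one held out for testing
        train = [j for f_no, fold in enumerate(folds) if f_no != i for j in fold]
        yield train, test

splits = list(k_fold_indices(120, k=10))
```

With 120 images and k = 10, each run trains on 108 images and tests on the remaining 12, and every image is tested exactly once.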

Statistical parameters

The effectiveness of the features is evaluated in terms of four statistical parameters, accuracy, sensitivity, specificity and the Matthews correlation coefficient (MCC) [30], as shown in the contingency table (Figure 4). The MCC is a powerful accuracy evaluation criterion for machine learning methods, especially when the numbers of positive and negative samples are unbalanced [30].

Figure 4

Contingency Table along with statistical parameters: Accuracy(Ac), Sensitivity(Se), Specificity(Sp) and Matthews correlation coefficient(MCC).
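The four parameters follow directly from the contingency table, as the short sketch below shows using their standard definitions. The counts in the example are not reported in the paper: they are inferred here from the percentages quoted for the GLCM and \( P_{ratio} \) combination, assuming malignant is the positive class with 50 malignant and 70 benign images.

```python
import math

def diagnostic_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from contingency-table counts."""
    ac = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return ac, se, sp, mcc

# Inferred counts (assumption): 48 of 50 malignant and 67 of 70 benign correct
ac, se, sp, mcc = diagnostic_metrics(tp=48, tn=67, fp=3, fn=2)
```

These counts give Ac = 115/120, Se = 48/50 and Sp = 67/70, which agree with the reported 95.83%, 96% and 95.71%, and an MCC of about 0.9147, close to the reported 0.9146.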

Evaluation of individual features

The individual feature vectors derived from the textural and morphological methods are used as inputs to the SVM classifier with the 10-fold cross-validation scheme. The individual features, derived from all images in the database, are arranged as separate datasets. The diagnosed outcome is compared with the pathologically proven findings. The values of all input features are normalized to the range [−1, 1]. The output class of the SVM is either 0 or 1, depending on whether the tumor is classified as benign or malignant: if the value calculated from the suspicious tumor region is nearer to 1, the image is classified as malignant; if the value is low enough to be considered 0, the image region is diagnosed as benign. The performance of individual features is shown in Table 1. The highest classification rates are achieved with the GLCM features (Ac 90.83%, Se 90%, Sp 91.42% and MCC 0.8120), whereas among the morphological features, \( P_{ratio} \) yielded the highest classification values (Ac 85.83%, Se 90%, Sp 82.85% and MCC 0.7193).
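The normalization step can be sketched as a linear min-max mapping applied to each feature column. The text does not state which scaling rule was used, so min-max scaling and the function name `scale_to_unit_range` are assumptions.

```python
def scale_to_unit_range(values):
    """Linearly map a feature column onto [-1, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

scaled = scale_to_unit_range([12.0, 15.0, 18.0, 30.0])  # toy feature values
```

Scaling every feature to a common range keeps features with large numeric values from dominating the distance term inside the RBF kernel.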

Table 1 Individual performance of the textural features (thirty nine features in five categories) and five morphological features sorted by accuracy value

Evaluating combined features

In order to evaluate the performance of combined textural and morphological features, we combined the top performing feature sets from three textural categories (Tamura, F08-F10; GLRLM, F11-F17; GLCM, F18-F39) with the three top performing morphological features (circularity, F42; NRV, F43; \( P_{ratio} \), F44). The nine resulting combinations (Table 2) are used for the second set of experiments, and the improvement observed in the statistical parameters is also shown in Table 2. The results suggest that the highest values in all parameters are achieved using the combination of GLCM and \( P_{ratio} \) (Ac 95.83%, Se 96%, Sp 95.71% and MCC 0.9146).

Table 2 Performances of the combined best sets of textural and morphological features for the entire database, using accuracy, sensitivity, specificity, and MCC as figures of merit

ROC analysis

The overall value of a diagnostic test can be evaluated through the use of a receiver operating characteristic (ROC) curve [40]. The curve plots the true positive rate (sensitivity) against the false positive rate (1 − specificity) of a classification task. The curve of an ideal classifier passes through the point (0, 1) on the unit grid, and a curve closer to this ideal point indicates better discriminating ability. The area \( A_Z \) under the ROC curve (AUC) is a quantitative index of the overall performance of a diagnosis system [38]. AUC values can be used to compare the performance of different methods in distinguishing positive and negative findings of breast tumors, as well as the overall performance of a diagnostic system. The AUC values are calculated using the software package SPSS (SPSS Inc., USA). The ROC curves in Figure 5 show the AUC values of the GLCM features (\( A_Z = 0.9380 \)), the ratio of perimeters \( P_{ratio} \) (\( A_Z = 0.8890 \)) and the combination of GLCM and \( P_{ratio} \) (\( A_Z = 0.9444 \)).
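As a rough illustration of how an ROC curve and its area can be obtained from classifier scores, the sketch below sweeps a decision threshold and computes the AUC through its rank interpretation (the probability that a positive case scores higher than a negative one). It is not the SPSS procedure used in the study, and the labels and scores are made-up toy values.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_points(labels, scores):
    """(FPR, TPR) pairs obtained by lowering the decision threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pts = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        pts.append((fp / n_neg, tp / n_pos))
    return pts

labels = [1, 1, 1, 0, 1, 0, 0, 0]                       # 1 = malignant, 0 = benign
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]      # hypothetical classifier outputs
auc = roc_auc(labels, scores)
curve = roc_points(labels, scores)
```

In this toy example only one of the sixteen positive-negative pairs is ranked in the wrong order, so the area is 15/16.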

Figure 5

ROC curves comparing the area under the curve (\( A_Z \)) for GLCM (\( A_Z = 0.9380 \)), \( P_{ratio} \) (\( A_Z = 0.8890 \)) and the combination of GLCM and \( P_{ratio} \) (\( A_Z = 0.9444 \)).

Discussion

We have combined textural and morphological features to improve the classification accuracy of diagnosing masses in breast ultrasound images. First, we extracted features from the ROIs of all images in the database using textural and morphological methods separately, as shown in Table 1. The accuracy values observed with the SVM classifier vary from 66.66% to 90.83% for textural features and from 61.66% to 85.83% for morphological features. Texture analysis allows the detection of mathematical patterns in the gray-level distribution of the pixels, and textural features have been found to be efficient in classifying breast tumors in ultrasound images. Since the borders of benign tumors are smoother than those of malignant ones, breast tumors may also be evaluated based on their morphological information [18]. The infiltrative nature of malignant tumors generates an irregular pattern of impedance discontinuities, which results in an irregular, spiculated or ill-defined boundary in breast ultrasound images. Benign tumors, in contrast, show more uniform growth with smooth, round and well-defined boundaries [9].

Building on these observations, Wu et al. [17,18] combined textural and morphological features and achieved an \( A_Z \) value of 0.9614. Furthermore, a sensitivity of 97.78% was reported, which means the system can detect malignant tumors with high probability, in contrast with mammograms, where false-negative rates are up to 20% [2]. Following this approach, we selected the three best performing feature sets from each of the textural and morphological categories to form nine combined feature sets.

As shown in Figure 6, the combination of GLCM and \( P_{ratio} \) produced the highest values among all combinations under comparison, with accuracy, sensitivity, specificity and MCC of 95.83%, 96%, 95.71% and 0.9146 respectively (Table 2). Moreover, we achieved an \( A_Z \) value of 0.9444 with this combination. The texture-based features (F01-F39) are extracted through the histogram, MRF, Tamura, GLRLM and GLCM methods from each segmented ROI of the 120 breast ultrasound images. The GLRLM is computed in four orientations (0°, 45°, 90°, 135°) as suggested in [14]. The GLCM is used to extract a total of twenty-two features with direction θ = 45° and distance d = 2, the most appropriate values as suggested in [14,39]. In addition, 32 quantization levels are used in our experiment: it has been shown experimentally that the number of gray-level quantization levels neither improves nor worsens the discrimination power of texture features, whereas the computation time increases with the number of levels [13].

Figure 6

Comparison of classification performance of the combined best combinations of textural and morphological features, using accuracy, sensitivity, specificity, and MCC.

The shape variation between benign and malignant tumors in an ultrasound image is an effective feature for classifying breast tumors [15,41]. Nineteen morphological features, used in [15] with 118 breast ultrasound images, yielded an \( A_Z \) value of 0.9087. The most common parameter for quantifying tumor shape is the depth-to-width ratio [42], since benign cysts tend to be wider than they are deep. An aspect ratio [15] greater than 1 increases the probability of malignancy. The convex polygon technique is based on the determination of the convex hull, the smallest convex region that contains all points belonging to a given region shape [37]. Accordingly, the amount of irregularity in the contour increases the area difference between the convex hull region and the tumor region, which corresponds to the level of malignancy. This characteristic can be quantified using two parameters, RS and NRV, with which we obtained accuracy values of 61.66% and 83.33% respectively. The length of the tumor perimeter is an important indicator for diagnosis [15]. As malignant tumors usually have irregular shapes, a large tumor perimeter suggests that a tumor is malignant. The circularity C, the ratio of the square of the perimeter to the area, reflects the complexity of contours by producing higher values for irregular shapes. The ratio between the tumor perimeter and the convex hull perimeter (\( P_{ratio} \)) [18] increases when the tumor shape is highly irregular. In our work, the accuracy values obtained with C and \( P_{ratio} \) are 74.16% and 85.83% respectively.

The change in classification accuracy for different values of the parameter γ is shown in Figure 7. The features for the SVM are obtained through GLCM (θ = 45°, d = 2 and L = 32). The RBF kernel used in the SVM classifier requires appropriate values of C and γ for optimum performance. We set C = 100 as suggested in [18] and chose γ = 0.2 by varying the value from 0.1 to 1 and observing the accuracy of discrimination, as shown in Figure 7. The curves in Figure 5 compare the area \( A_Z \) produced by GLCM (\( A_Z = 0.9380 \)), \( P_{ratio} \) (\( A_Z = 0.8890 \)) and the combination of the two (\( A_Z = 0.9444 \)), which suggests that combining textural and morphological features increases the accuracy of diagnosis. Our experiment aims at assessing the individual and combined classification performance of textural and morphological features in breast ultrasound. However, an automated breast CAD system cannot use all extracted features at the same time, so an effective feature selection stage should be added. The classification performance depends on the selected feature set and also on the size of the feature vector. An inadequate number of training samples relative to the number of features leads to the “curse of dimensionality” problem, which degrades classification performance [43,44]. Introducing an efficient feature selection algorithm at this stage removes the irrelevant and redundant features and yields a new, low-dimensional feature set for effective classification [39,43]. This study suggests that an effective combination of textural and morphological features can increase the performance of CAD systems using breast ultrasound images.

Figure 7

The variation in classification accuracy as the parameter γ is varied from 0.1 to 1.
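The γ sweep summarized in Figure 7 can be sketched as follows. This is only an illustration under stated assumptions: it uses scikit-learn, a synthetic feature matrix from `make_classification` in place of the study's GLCM features, feature standardization, and 5-fold cross-validation, none of which are specified in the text. Only the RBF kernel, C = 100 and the γ range 0.1 to 1 come from the discussion above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a feature matrix (NOT the study's data);
# the sample and feature counts are arbitrary choices for the sketch.
X, y = make_classification(n_samples=120, n_features=22,
                           n_informative=8, random_state=0)

def sweep_gamma(X, y, gammas, C=100.0, folds=5):
    """Mean cross-validated accuracy of an RBF-kernel SVM for each gamma."""
    results = {}
    for g in gammas:
        # Standardization is an assumption; the text does not describe scaling
        model = make_pipeline(StandardScaler(),
                              SVC(kernel="rbf", C=C, gamma=float(g)))
        scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        results[round(float(g), 1)] = float(scores.mean())
    return results

gammas = np.linspace(0.1, 1.0, 10)   # 0.1, 0.2, ..., 1.0
accuracy_by_gamma = sweep_gamma(X, y, gammas)
best_gamma = max(accuracy_by_gamma, key=accuracy_by_gamma.get)
```

Plotting `accuracy_by_gamma` against γ would give a curve of the same kind as Figure 7, although the values obtained on synthetic data say nothing about the accuracies reported in this study.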

Conclusion

In this work, we have evaluated the classification performance of combined textural and morphological features for the discrimination of breast masses in ultrasound images. The individual and combined results produced by textural and morphological features are analyzed using the statistical parameters accuracy, sensitivity, specificity, MCC and area under the ROC curve. The results suggest that the classification accuracy of a breast ultrasound CAD system increases when textural and morphological features are combined.

References

  1. Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D (2011) Global cancer statistics. CA Cancer J Clin 61(2):69–90

  2. Boyd NF, Guo H, Martin LJ, Sun L, Stone J, Fishell E, Yaffe MJ (2007) Mammographic density and the risk and detection of breast cancer. N Engl J Med 356(3):227–236

  3. Weigel S, Biesheuvel C, Berkemeyer S, Kugel H, Heindel W (2007) Digital mammography screening: how many breast cancers are additionally detected by bilateral ultrasound examination during assessment? Eur Radiol 23(3):684–691

  4. Zhou S, Shi J, Zhu J, Cai Y, Wang R (2013) Shearlet-based texture feature extraction for classification of breast tumor in ultrasound image. Biomed Signal Process Contr 8(6):688–696

  5. Sahiner B, Chan HP, Roubidoux MA, Hadjiiski LM, Helvie MA, Paramagul C, Blane C (2007) Malignant and Benign Breast Masses on 3D US Volumetric Images: Effect of Computer-aided Diagnosis on Radiologist Accuracy. Radiology 242(3):716–724

  6. Jalalian A, Mashohor SB, Mahmud HR, Saripan MIB, Ramli ARB, Karasfi B (2013) Computer-aided detection/diagnosis of breast cancer in mammography and ultrasound: a review. Clin Imaging 37(3):420–426

  7. Ikedo Y, Fukuoka D, Hara T, Fujita H, Takada E, Endo T, Morita T (2007) Development of a fully automatic scheme for detection of masses in whole breast ultrasound images. Med Phys 34(11):4378–4388

  8. Doi K (2009) Computer-aided diagnosis in medical imaging: achievements and challenges. In: World Congress on Medical Physics and Biomedical Engineering, September 7–12, 2009, Munich, Germany (96–96). Springer Berlin Heidelberg

  9. Pereira WCA, Alvarenga AV, Infantosi AFC, Macrini L, Pedreira CE (2010) A non-linear morphometric feature selection approach for breast tumor contour from ultrasonic images. Comput Biol Med 40(11):912–918

  10. Li H, Wang Y, Liu KR, Lo SC, Freedman MT (2001) Computerized radiographic mass detection. I. Lesion site selection by morphological enhancement and contextual segmentation. IEEE Trans Med Imaging 20(4):289–301

  11. Prabusankarlal KM, Thirumoorthy P, Manavalan R (2014) Computer Aided Breast Cancer Diagnosis Techniques in Ultrasound: A Survey. J Med Imaging Health Inform 4(3):331–349

  12. Huang YL, Chen DR (2005) Support vector machines in sonography: application to decision making in the diagnosis of breast cancer. Clin Imaging 29(3):179–184

  13. Gómez W, Pereira WCA, Infantosi AFC (2012) Analysis of co-occurrence texture statistics as a function of gray-level quantization for classifying breast ultrasound. IEEE Trans Med Imaging 31(10):1889–1899

  14. Radhakrishnan M, Kuttiannan T (2012) Comparative Analysis of Feature Extraction Methods for the Classification of Prostate Cancer from TRUS Medical Images. IJCSI Int J Comput Sci Issues 9(1):1694–0814

  15. Huang YL, Chen DR, Jiang YR, Kuo SJ, Wu HK, Moon WK (2008) Computer‐aided diagnosis using morphological features for classifying breast lesions on ultrasound. Ultrasound Obstet Gynecol 32(4):565–572

  16. Alvarenga AV, Infantosi AFC, Pereira WCA, Azevedo CM (2010) Assessing the performance of morphological parameters in distinguishing breast tumors on ultrasound images. Med Eng Phys 32(1):49–56

  17. Wu WJ, Moon WK (2008) Ultrasound breast tumor image computer-aided diagnosis with texture and morphological features. Acad Radiol 15(7):873–880

  18. Wu WJ, Lin SW, Moon WK (2012) Combining support vector machine with genetic algorithm to classify ultrasound breast tumor images. Comput Med Imaging Graph 36(8):627–633

  19. Alvarenga AV, Infantosi AFC, Pereira WC, Azevedo CM (2012) Assessing the combined performance of texture and morphological parameters in distinguishing breast tumors in ultrasound images. Med Phys 39(12):7350–7358

  20. Ultrasoundcases [http://ultrasoundcases.info/category.aspx?cat=67] (Accessed May 2014)

  21. Jia-Wei T, Chun-Ping N, Yan-Hui G, Heng-Da C, Xiang-Long T (2012) Effect of a Novel Segmentation Algorithm on Radiologists’ Diagnosis of Breast Masses Using Ultrasound Imaging. Ultrasound Med Biol 38(1):119–127

  22. Zhang J, Wang C, Cheng Y (2015) Comparison of Despeckle Filters for Breast Ultrasound Images. Circuits Syst Signal Process 34(1):185–208

  23. Lu K, Hall CS (2014) Automatic ultrasound image enhancement for 2D semi-automatic breast-lesion segmentation. In: SPIE Medical Imaging (90351M–90351M). International Society for Optics and Photonics

  24. Buades A, Coll B, Morel JM (2005) A review of image denoising algorithms, with a new one. Multiscale Modeling Simulation 4(2):490–530

  25. Wu J, Tang C (2014) Random-valued impulse noise removal using fuzzy weighted non-local means. SIViP 8(2):349–355

  26. Zhan Y, Ding M, Wu L, Zhang X (2014) Nonlocal means method using weight refining for despeckling of ultrasound images. Signal Process 103:201–213

  27. Salmon J (2010) On two parameters for denoising with non-local means. IEEE Signal Process Lett 17(3):269–272

  28. Prabusankarlal KM, Thirumoorthy P, Manavalan R (2014) Combining Clustering, Morphology and Metaheuristic Optimization Technique for Segmentation of Breast Ultrasound Images to Detect Tumors. Int J Computer Appl 86:28–34

  29. Li C, Zhou J, Kou P, Xiao J (2012) A novel chaotic particle swarm optimization based fuzzy clustering algorithm. Neurocomputing 83:98–109

  30. Cheng HD, Shan J, Ju W, Guo Y, Zhang L (2010) Automated breast cancer detection and classification using ultrasound images: A survey. Pattern Recogn 43(1):299–317

  31. Haralick RM, Shanmugam K, Dinstein IH (1973) Textural features for image classification. IEEE Trans Syst Man Cybernetics SMC-3(6):610–621

  32. Schwartz WR, Pedrini H (2004) Texture classification based on spatial dependence features using co-occurrence matrices and Markov random fields. In: IEEE Int. Conf. Image Processing, 2004. (1) 239–242

  33. Tamura H, Mori S, Yamawaki T (1978) Textural features corresponding to visual perception. IEEE Trans Syst Man Cybernetics 8(6):460–473

  34. Howarth P, Rüger S (2004) Evaluation of texture features for content-based image retrieval. In: Image and Video Retrieval (326–334). Springer Berlin Heidelberg

  35. Krishnan KR, Sudhakar R (2013) Automatic Classification of Liver Diseases from Ultrasound Images Using GLRLM Texture Features. In: Soft Computing Applications (611–624). Springer Berlin Heidelberg

  36. Al-Janobi A (2001) Performance evaluation of cross-diagonal texture matrix method of texture analysis. Pattern Recogn 34(1):171–180

  37. Alvarenga AV, Pereira WC, Infantosi AFC, de Azevedo CM (2005) Classification of breast tumours on ultrasound images using morphometric parameters. In: IEEE Int. Workshop on Intelligent Signal Processing (206–210)

  38. Huang YL, Wang KL, Chen DR (2006) Diagnosis of breast tumors with ultrasonic texture analysis using support vector machines. Neural Comput Appl 15(2):164–169

  39. Thangavel K, Manavalan R (2014) Soft computing models based feature selection for TRUS prostate cancer image classification. Soft Comput 18(6):1165–1176

  40. DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44(3):837–845

  41. Chang RF, Wu WJ, Moon WK, Chen DR (2005) Automatic ultrasound segmentation and morphology based diagnosis of solid breast tumors. Breast Cancer Res Treat 89(2):179–185

  42. Horsch K, Giger ML, Venta LA, Vyborny CJ (2002) Computerized diagnosis of breast lesions on ultrasound. Med Phys 29(2):157–164

  43. James A, Dimitrijev S (2012) Ranked selection of nearest discriminating features. Human-Centric Comput Inform Sci 2(1):1–14

  44. Ganesan K, Acharya U, Chua CK, Min LC, Abraham K, Ng K (2013) Computer-aided breast cancer detection using mammograms: a review. IEEE Rev Biomed Eng 6:77–98

Acknowledgements

We would like to acknowledge Dr T.S.A. Geertsma, MD, Head, Department of Radiology, Gelderse Vallei Hospital, Ede, the Netherlands, for providing breast ultrasound images in various categories.

Author information

Corresponding author

Correspondence to Kadayanallur Mahadevan Prabusankarlal.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

Author KMP drafted this manuscript, conducted the experiments using the datasets and analyzed the results. Author Dr PT suggested the methods used in this study and provided guidelines for drafting the manuscript. Author Dr RM offered technical support in constructing the toolbox for the experimental studies. All authors corrected and approved the manuscript.

Authors’ information

Author KMP holds Master of Science and Master of Philosophy degrees in Electronics. He also holds a Master's degree in Computer Science and Engineering. He currently works as Assistant Professor and Head of Electronics and Communication, K.S.Rangasamy College of Arts and Science, India.

Author PT holds a PhD in Electronics and works as Assistant Professor in Electronics at Government Arts College, Dharmapuri, India. He is also a research supervisor at Bharathiar University, Coimbatore, India.

Author RM holds a PhD in Computer Science. He works as Associate Professor and Head of Computer Applications, K.S.Rangasamy College of Arts and Science, India. He is also a research supervisor at Periyar University, Salem, India.

Additional file

Additional file 1:

Detailed descriptions of features: Histogram (F01-F06), Tamura (F08-F10), GLRLM (F11-F17) and GLCM (F18-F39).

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Prabusankarlal, K., Thirumoorthy, P. & Manavalan, R. Assessment of combined textural and morphological features for diagnosis of breast masses in ultrasound. Hum. Cent. Comput. Inf. Sci. 5, 12 (2015). https://doi.org/10.1186/s13673-015-0029-y

  • DOI: https://doi.org/10.1186/s13673-015-0029-y

Keywords