key: cord-0057681-ipxci9vg
authors: Bhatt, Hardik H.; Mankodia, Anand P.
title: Deep Recurrent Neural Network with Tanimoto Similarity and MKSIFT Features for Medical Image Search and Retrieval
date: 2020-06-08
journal: Computing Science, Communication and Security
DOI: 10.1007/978-981-15-6648-6_4
sha: 87e89c49039e7c9f8cad95ecd4c94eb69ca44b10
doc_id: 57681
cord_uid: ipxci9vg

The growth of digital medical imaging has created a need for rich descriptors and efficient retrieval tools. Content-Based Image Retrieval (CBIR) is therefore essential in the domain of image retrieval. As the volume of medical image data grows, searching for and retrieving a relevant image from a dataset becomes a major problem. To address this problem, this paper proposes a new medical image retrieval technique based on image content, namely the Multiple Kernel Scale Invariant Feature Transform-based Deep Recurrent Neural Network (MKSIFT-Deep RNN). The goal is to present a tool for effective retrieval of images from large medical image databases. Here, MKSIFT is adapted to extract the relevant features from the acquired input image. MKSIFT evaluates the key point descriptor using kernel functions, wherein weights are allocated to the kernels. In the training phase, the feature vectors are used to train the Deep RNN to classify the images. In the testing phase, a set of query images is given to the classifier, which uses Tanimoto similarity to retrieve the images. The proposed MKSIFT-Deep RNN outperformed other methods with a maximal precision of 93.723%, maximal recall of 93.652% and maximal F-measure of 93.687%.

visual semantic relation using the query provided by the user [6]. There is also a need to search images from the dataset, which is handled by a model for effective recovery of medical images [7].

CBIR builds on existing techniques and has gained interest in the retrieval of medical images. Different feature extraction methods have been adopted on the basis of boundary contour, spatial layout, color, and texture. In [8], a CBIR technique is devised using features such as histograms of oriented gradients (HOG), SIFT, and local binary patterns (LBP) to improve image retrieval results. In [9], an improved CBIR technique is developed using wavelet-based histogram approaches that utilize relevance feedback for retrieving the images. An optimized technique is devised in [10] for pattern retrieval on the basis of quantized histograms. In [11], a technique is devised for training a deep convolutional neural network (DCNN) in order to enhance CBIR.

This research presents a novel method, namely MKSIFT-Deep RNN, for medical image retrieval using a dataset of medical images. The purpose is to retrieve images from a large database. Here, MKSIFT is adapted to generate the feature vector by extracting significant features from the acquired input images of the medical image database. MKSIFT evaluates the key point descriptor using kernel functions, wherein weights are allocated to the kernels. In the training phase, the feature vectors are used to train the Deep RNN to classify the images. In the testing phase, a set of query images is given to the classifier, which uses Tanimoto similarity to retrieve the images.

The major contribution of the research is:

• Proposed MKSIFT-Deep RNN for Medical Image Retrieval: A novel medical image retrieval model, namely the Multiple Kernel Scale Invariant Feature Transform-based Deep Recurrent Neural Network (MKSIFT-Deep RNN), is developed for effective medical image retrieval.

In the literature, CBIR models have been devised and modified using deep learning approaches. However, some issues remain unaddressed. Firstly, the semantic gap between the low-level feature representation of medical images and their high-level meaning remains unsolved. The issues confronted by the existing methods are the motivation for devising a novel medical image retrieval method.

Eight existing techniques for medical image retrieval are discussed below. The authors of [4] devised an outsourced CBIR method using a bag-of-encrypted-words (BOEW) model for retrieving images. The method utilized permutation and color value substitution. Moreover, the BOEW model was designed to represent each image by a feature vector. The similarity between the images was computed using the Manhattan distance. However, the method failed to utilize local descriptors for the BOEW model.

The challenges confronted by the existing methodologies are discussed below:

• Even though different methods have been devised for medical image retrieval [20], the semantic gap remains a challenging issue in current CBIR methods. The semantic gap exists between the low-level image pixels captured by machines and the high-level semantic concepts perceived by humans [3].

• In [1], an outsourced CBIR scheme is devised using a bag-of-encrypted-words (BOEW) model for retrieving images from massive datasets. The method is effective for faster retrieval, but confronts issues such as complex computations and heavy storage.

• The proficient recovery of images from massive image datasets is a major issue. In recent days, images are retrieved using visual information and CBIR techniques.

Figure 1 illustrates the schematic view of the medical image retrieval model using the proposed MKSIFT-based Deep RNN. Initially, the medical images are fed to the feature extraction module. The extraction of features is done using MKSIFT [1]. Once the significant features are obtained, group retrieval is performed using the generated features and the Deep RNN [12]. The Deep RNN is trained on the MKSIFT features to tune the optimal weights for group retrieval. A query image is then given as input and matched against the classified images using the Tanimoto similarity measure [13]. By computing the similarity between the query image and the classified image set, the relevant instances are retrieved. Each step is briefly described in the sections below.

where $I_k$ represents the $k$th input image, and $l$ indicates the total number of images. Each image $I_k$ is processed to extract the significant features using the MKSIFT approach, which is elaborated in the subsection below.

The significance of feature extraction is to obtain noteworthy, highly relevant features from the input image, which facilitate improved retrieval of medical images. Meanwhile, the complexity of analyzing the medical image is reduced, as the image is modelled as a reduced set of features. In addition, efficient feature extraction supports the precision of the classification, for which the MKSIFT [1] approach is employed. MKSIFT is a feature extraction technique devised by modifying the SIFT feature with a different weighting method in the key point descriptor for extracting features from the input medical image. The MKSIFT-based feature extraction is divided into steps, namely extrema detection, key point removal, orientation assignment, and descriptor calculation. Moreover, the Gaussian function present in the key point descriptor is replaced with exponential kernel and tangential kernel functions.

SIFT [14] is a method that transforms an image into scale-invariant coordinates on the basis of local features. The method generates a large number of features that cover the full range of the image. For matching images, SIFT features are extracted from the images. The four steps for generating the feature set are described below:

(i) Discovery of Scale-Space Extrema: The first phase of feature extraction is to determine the locations and image scales by detecting features that are stable across scales using a Gaussian function, which can be modelled as,

where $I(m, n)$ represents the input medical image, $*$ indicates the convolution operator, and $G(m, n, \alpha)$ denotes the Gaussian function.

The Gaussian function is expressed as,

In [15], scale-space extrema of the difference-of-Gaussian (DoG) function are used to locate key points. The DoG function $X(m, n, \alpha)$ is generated from the difference between two scales that are separated by a constant factor, as,

where $v$ indicates the constant factor.
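The scale-space construction and the DoG function described above can be sketched in plain Python. The sketch assumes Lowe's standard formulation [14]: the smoothed image $S$ is the convolution of $I(m, n)$ with $G(m, n, \alpha)$, and the DoG response is the difference between the images smoothed at scales $v\alpha$ and $\alpha$. The function names, the truncation radius of three standard deviations, the replicate-edge border handling and the default $v = \sqrt{2}$ are assumptions made for illustration, not details given in this paper.

```python
import math


def gaussian(m, n, alpha):
    """2-D Gaussian G(m, n, alpha) with standard deviation alpha."""
    return math.exp(-(m * m + n * n) / (2.0 * alpha * alpha)) / (
        2.0 * math.pi * alpha * alpha
    )


def smooth(image, alpha):
    """Smooth a 2-D list `image` with a truncated, normalised Gaussian kernel."""
    radius = max(1, int(math.ceil(3 * alpha)))  # assumed truncation radius
    offsets = range(-radius, radius + 1)
    kernel = {(dm, dn): gaussian(dm, dn, alpha) for dm in offsets for dn in offsets}
    total = sum(kernel.values())
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            acc = 0.0
            for (dm, dn), weight in kernel.items():
                # Clamp indices at the border (replicate-edge padding).
                mm = min(max(m + dm, 0), rows - 1)
                nn = min(max(n + dn, 0), cols - 1)
                acc += weight * image[mm][nn]
            out[m][n] = acc / total
    return out


def difference_of_gaussians(image, alpha, v=math.sqrt(2.0)):
    """DoG response: image smoothed at scale v*alpha minus image smoothed at alpha."""
    coarse = smooth(image, v * alpha)
    fine = smooth(image, alpha)
    return [[c - f for c, f in zip(row_c, row_f)] for row_c, row_f in zip(coarse, fine)]
```

Because both kernels are normalised, a uniform image produces a zero DoG response everywhere, while an isolated bright pixel produces a negative response at its centre.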

In addition, the DoG function provides an approximation to the scale-normalized Laplacian of Gaussian, $\alpha^2 \nabla^2 G$. The relation between the DoG function and $\alpha^2 \nabla^2 G$ is,

The extrema are determined by comparing each sample point with its eight neighbours in the current image and with the nine neighbours in each of the adjacent scales. The point is chosen only if it is larger or smaller than all of these neighbours.
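A minimal sketch of this neighbour comparison follows. It assumes the 26-neighbour test of Lowe's SIFT [14] (eight neighbours in the same DoG image and nine in each of the two adjacent scales). The name `is_extremum` and the layout `dog_stack[scale][row][column]` are hypothetical choices made for illustration.

```python
def is_extremum(dog_stack, s, m, n):
    """Return True if the sample at scale s, position (m, n) is strictly larger
    or strictly smaller than all 26 neighbours in scale and space.

    The caller must ensure that s, m and n are not on the border of the stack.
    """
    centre = dog_stack[s][m][n]
    neighbours = [
        dog_stack[s + ds][m + dm][n + dn]
        for ds in (-1, 0, 1)
        for dm in (-1, 0, 1)
        for dn in (-1, 0, 1)
        if (ds, dm, dn) != (0, 0, 0)
    ]
    return centre > max(neighbours) or centre < min(neighbours)
```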

(ii) Localization of Key Point: After candidate key points are detected, those with low contrast are eliminated by fitting the nearby data to determine the scale and location. This is computed from the Taylor series expansion of the scale-space function,

where $g$ indicates the offset, given by $g = (m, n, \alpha)^T$.

The location is expressed as,

If the offset is found to be larger than a threshold value, the extremum lies at a different sample point. By this assessment, low-contrast key points are eliminated. Unstable extrema with low contrast are eliminated using the function $K(d)$, expressed as:

In the DoG function, a poorly defined peak has a large principal curvature and is removed. The principal curvatures are evaluated using the Hessian matrix,

$$K = \begin{bmatrix} X_{mm} & X_{mn} \\ X_{mn} & X_{nn} \end{bmatrix} \tag{10}$$

where $K$ represents the Hessian matrix.
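One way to apply the Hessian matrix is the trace-determinant ratio test from Lowe's SIFT paper [14], sketched below. The finite-difference estimates of $X_{mm}$, $X_{nn}$ and $X_{mn}$, the function name `passes_edge_test` and the curvature ratio `r = 10` (Lowe's suggested value) are assumptions, since this section does not state how the curvature threshold is set.

```python
def passes_edge_test(dog, m, n, r=10.0):
    """Return True if the key point at (m, n) is not edge-like.

    `dog` is a 2-D list holding one DoG image; (m, n) must not lie on its border.
    """
    # Second derivatives estimated from neighbouring samples.
    x_mm = dog[m + 1][n] - 2.0 * dog[m][n] + dog[m - 1][n]
    x_nn = dog[m][n + 1] - 2.0 * dog[m][n] + dog[m][n - 1]
    x_mn = (
        dog[m + 1][n + 1] - dog[m + 1][n - 1] - dog[m - 1][n + 1] + dog[m - 1][n - 1]
    ) / 4.0

    trace = x_mm + x_nn
    det = x_mm * x_nn - x_mn * x_mn
    if det <= 0.0:
        # Curvatures of opposite sign (or a flat direction): reject the point.
        return False
    # Keep the point only if the ratio of principal curvatures is below r.
    return trace * trace / det < (r + 1.0) ** 2 / r
```

A symmetric blob has two similar principal curvatures and passes the test, whereas a straight ridge has one near-zero curvature and is rejected.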

(iii) Assignment of Orientation: Each key point is assigned an orientation based on local image properties, so that the key point descriptor can be expressed relative to this orientation. For scale invariance, the Gaussian-smoothed image is chosen according to the scale of the key point. The magnitude and orientation are computed using pixel differences as follows:

where $M(m, n)$ indicates the magnitude, $w(m, n)$ represents the orientation, and $S$ indicates the scale space. Based on the magnitudes and orientations around the key point, an orientation histogram is formed.
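The pixel-difference computation can be illustrated as follows. The sketch assumes the central-difference form used in Lowe's SIFT [14], in which the magnitude is the Euclidean norm of the two differences and the orientation is their arctangent. The function name `gradient` and the 2-D list representation of the smoothed image $S$ are hypothetical.

```python
import math


def gradient(S, m, n):
    """Return (magnitude, orientation in radians) at pixel (m, n) of the
    smoothed image S, using central pixel differences.

    (m, n) must not lie on the border of S.
    """
    d_m = S[m + 1][n] - S[m - 1][n]  # difference along the first axis
    d_n = S[m][n + 1] - S[m][n - 1]  # difference along the second axis
    magnitude = math.hypot(d_m, d_n)
    orientation = math.atan2(d_n, d_m)
    return magnitude, orientation
```

Using `math.atan2` rather than a plain arctangent of the ratio keeps the orientation well defined over the full circle, including the case where the first difference is zero.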

(iv) Calculation of Descriptors: The last phase is the computation of the key point descriptor using the image gradients in the region around the key point. The scale of the key point is used to choose the Gaussian blur level of the image, at which the orientation and magnitude are computed. The coordinates of the descriptor are rotated according to the orientation of the key point to achieve orientation invariance. However, the Gaussian function does not preserve image brightness and gives less emphasis to gradients. Thus, the MKSIFT method uses two kernel functions, namely tangential and exponential kernels, which help to increase the variance and thereby minimize image artifacts. The weight function of MKSIFT is expressed as,

The kernel function is represented by,

where $\eta$ and $\rho$ represent weight coefficients that range over $[0, 1]$. The obtained features are accumulated in the feature vector denoted as $F$; $(1 \le a \le e)$. The feature vector $F$ is fed to the Deep RNN for classifying the images into groups.

The Deep RNN is employed to retrieve the groups considering the MKSIFT features. The architecture of Deep RNN is portrayed below.

The features $F$ extracted from the input images are given as input to the Deep RNN classifier. The Deep RNN [12] is a network architecture that contains multiple recurrent hidden layers in the layer hierarchy. In the Deep RNN, the recurrent connections exist at the hidden layers. The Deep RNN classifier operates effectively on input features of varying length, based on the sequence of information. It uses the knowledge of the previous state as input to the current prediction and processes each iteration using the hidden state information. This recurrent behaviour makes the Deep RNN highly effective in working with the features. Owing to the sequential pattern of the information, the Deep RNN is considered the best classifier among traditional deep learning approaches. The architecture of the Deep RNN is represented in Fig. 2. , respectively. The pair of corresponding elements of the input and output vectors is termed a unit. Here, $i$ denotes the arbitrary unit number of the $b$th layer, and $y$ represents the total number of units of the $b$th layer. In addition, the arbitrary unit number and the total number of units of the $(b-1)$th layer are denoted as $j$ and $E$, respectively. The input propagation weight from the $(b-1)$th layer to the $b$th layer is expressed as $W^{(b)} \in H^{y \times E}$, and the recurrent weight of the $b$th layer is modelled as $w^{(b)} \in H^{y \times y}$. Here, $H$ denotes the set of weights. The components of the input vector are expressed as,

ii are the elements of $W^{(b)}$ and $w^{(b)}$. $i$ denotes the arbitrary unit number of the $b$th layer. The elements of the output vector of the $b$th layer are represented as,

where $\beta^{(b)}$ denotes the activation function. The frequently used activation functions are the hyperbolic tangent (a sigmoid-shaped function) $\beta(F) = \tanh(F)$, the rectified linear unit (ReLU) $\beta(F) = \max(F, 0)$, and the logistic sigmoid function $\beta(F) = 1/(1 + e^{-F})$.
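The three activation functions named here can be written directly with the Python standard library. The function names are illustrative, and the two-branch form of the logistic sigmoid is an implementation choice made only to avoid floating-point overflow for large negative inputs.

```python
import math


def tanh_activation(f):
    """Hyperbolic tangent, with output in (-1, 1)."""
    return math.tanh(f)


def relu(f):
    """Rectified linear unit: max(f, 0)."""
    return max(f, 0.0)


def logistic_sigmoid(f):
    """Logistic sigmoid 1 / (1 + exp(-f)), evaluated in a numerically stable way."""
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)  # f < 0, so exp(f) cannot overflow
    return e / (1.0 + e)
```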

To simplify the process, the 0th weight $p_{i0}$ and the 0th unit $J$ are introduced, and hence the bias is represented as,

Here, $J^{(b,r)}$ denotes the output of the classifier.
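A forward pass through stacked recurrent layers can be sketched as below. The sketch assumes the common recurrent update in which the output of layer $b$ at step $r$ is $\beta(W^{(b)} J^{(b-1,r)} + w^{(b)} J^{(b,r-1)})$, with the hidden state starting at zero and the bias omitted for brevity. Reading $r$ as the step index, the function names and the list-of-lists weight layout are assumptions, not details confirmed by this section.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def recurrent_layer(inputs, W, w, activation=math.tanh):
    """Run one recurrent layer over a sequence of input vectors.

    W is the input weight matrix (y x E) and w the recurrent weight matrix (y x y).
    """
    state = [0.0] * len(W)  # assumed zero initial hidden state
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(W, x), matvec(w, state))]
        state = [activation(p) for p in pre]
        outputs.append(state)
    return outputs


def deep_rnn(inputs, layers):
    """Feed the output sequence of each layer into the next one.

    `layers` is a list of (W, w) pairs, ordered from the first hidden layer up.
    """
    for W, w in layers:
        inputs = recurrent_layer(inputs, W, w)
    return inputs
```

For example, `deep_rnn([[1.0], [0.0]], [([[1.0]], [[1.0]])])` runs a single one-unit layer over a two-step sequence. The second output depends on the first through the recurrent weight.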

For effective image retrieval, a query image $Q$ is fed to the feature extractor to produce its new feature vector $F'$; $(1 \le a \le e)$. The search is done by matching the training feature vector $F$ against the new feature vector $F'$ using the Tanimoto similarity measure. The search returns the matching images as its output. The Tanimoto metric is represented as,

where $h_a$ represents the $a$th feature in the feature vector $F$, and $c_a$ indicates the $a$th feature in the feature vector $F'$. Thus, the Tanimoto similarity is employed for retrieving the relevant images from the classified database.
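As an illustration, the sketch below uses the standard Tanimoto coefficient for real-valued vectors, $\sum_a h_a c_a / (\sum_a h_a^2 + \sum_a c_a^2 - \sum_a h_a c_a)$, and ranks stored feature vectors against a query. The function names, the dictionary used as the image database and the `top_k` parameter are hypothetical. Returning 0.0 when both vectors are all zeros is also an assumed convention.

```python
def tanimoto(h, c):
    """Tanimoto similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(h, c))
    denom = sum(x * x for x in h) + sum(y * y for y in c) - dot
    # The denominator is zero only when both vectors are all zeros.
    return dot / denom if denom else 0.0


def retrieve(query_features, database, top_k=3):
    """Return the names of the top_k database entries most similar to the query.

    `database` maps an image name to its stored feature vector.
    """
    ranked = sorted(
        database.items(),
        key=lambda item: tanimoto(query_features, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]
```

Identical vectors score 1 and orthogonal vectors score 0, so sorting in descending order places the closest matches first.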

This section presents the comparison of the proposed method with conventional methods using precision, recall and F-measure. In addition, the effectiveness of the proposed MKSIFT-Deep RNN method is analyzed by varying the number of queries.

The proposed MKSIFT-Deep RNN is implemented in Python on a PC with Windows 10 OS, 4 GB RAM, and an Intel Core i5 processor.

The medical image dataset used to analyze the performance of each medical image retrieval method is described below. The database is built from prostate cancer images, retinal images, iris images, breast cancer images, skin cancer images, bacilli images, and BRATS dataset images [17] [18] [19].

The metrics used to analyze the effectiveness of the proposed MKSIFT-Deep RNN and the comparative methods are precision, recall and F-measure.

Precision is the ratio of the relevant images among the retrieved images for a query and is given as,

where $rel$ denotes the relevant images and $ret$ represents the retrieved images.

Recall is the ratio of the relevant images that are actually retrieved to the total number of relevant images and is given as,

The harmonic mean of recall and precision is termed as F-measure and is represented as,
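The three measures can be computed from the sets of relevant and retrieved images as in the sketch below, which follows the standard set-based definitions: precision is $|rel \cap ret| / |ret|$, recall is $|rel \cap ret| / |rel|$, and F-measure is their harmonic mean. The function name and the convention of returning 0.0 for an empty set are assumptions.

```python
def retrieval_metrics(relevant, retrieved):
    """Return (precision, recall, f_measure) for one query.

    `relevant` and `retrieved` are iterables of image identifiers.
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)  # relevant images that were retrieved

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For instance, with four relevant images and three retrieved images of which two are relevant, precision is 2/3, recall is 1/2 and the F-measure is 4/7.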

The methods employed for the analysis include: SIFT [14], HOG + MKSIFT (HOG [16] applied in MKSIFT), MKSIFT [1], and the proposed MKSIFT-Deep RNN algorithm.

Figure 3 portrays the analysis of the methods on query set-1 using precision, recall and F-measure. Each query set contains 15 images from the acquired database. The analysis based on precision is described in Fig. 3a

Figure 4 portrays the analysis of the methods on query set-2 using precision, F-measure and recall. The analysis using precision is described in Fig. 3a Fig. 4c. When the number of queries is 4, the recall values computed

This research proposes an image retrieval model, namely MKSIFT-Deep RNN, for retrieving relevant images from a medical image database. Here, the MKSIFT approach is employed to select the relevant features from the database. MKSIFT builds on SIFT, with the key point descriptor computed using different kernel functions. In MKSIFT, the weight is assumed to be stable for classifying the input images. In addition, the Deep RNN is employed to classify the images into groups using the generated feature vector. Whenever a query set is given to the proposed MKSIFT-Deep RNN, each medical image is processed to extract the features that describe its content. These features are compared with the images of the classified database using the Tanimoto similarity measure for effective image retrieval. The proposed MKSIFT-Deep RNN outperformed other methods with a maximal precision of 93.723%, maximal recall of 93.652% and maximal F-measure of 93.687%. In future work, advanced optimization techniques can be employed to train the deep classifier in order to further improve retrieval performance.

Multiple kernel scale invariant feature transform and cross indexing for image search and retrieval

Scene analysis and search using local features and support vector machine for effective content-based image retrieval

Content based image retrieval using deep learning process

BOEW: A content-based image retrieval scheme using bag-of-encrypted-words in cloud computing

BSIFT: toward data-independent codebook for large scale image search

Content-based image retrieval and feature extraction: a comprehensive review

A novel technique for effective image gallery search using content based image retrieval system

Feature integration analysis of bag-of-features model for image retrieval

Content bases image search and retrieval using indexing by k means clustering technique

DCT histogram optimization for image database retrieval

Deep convolutional learning for content based image retrieval

Deep recurrent neural network for mobile human activity recognition with high throughput

Color histogram features based image classification in content-based image retrieval systems

Distinctive image features from scale-invariant keypoints

Object recognition from local scale-invariant features

A new content based image retrieval system by HOG of wavelet sub bands

The multimodal brain tumor image segmentation benchmark (BRATS)

Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features

Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge

Disease diagnosis and treatment using deep learning algorithms for the healthcare system