title: A Neural Ordinary Differential Equation Model for Visualizing Deep Neural Network Behaviors in Multi-Parametric MRI based Glioma Segmentation
authors: Yang, Zhenyu; Hu, Zongsheng; Ji, Hangjie; Lafata, Kyle; Floyd, Scott; Yin, Fang-Fang; Wang, Chunhao
date: 2022-03-01

Purpose: To develop a neural ordinary differential equation (ODE) model for visualizing deep neural network (DNN) behavior during multi-parametric MRI (mp-MRI) based glioma segmentation, as a method to enhance deep learning explainability.

Methods: Hypothesizing that deep feature extraction can be modeled as a spatiotemporally continuous process, we designed a novel deep learning model, a neural ODE, in which deep feature extraction is governed by an ODE without explicit expression. The dynamics of (1) MR images after interactions with the DNN and (2) segmentation formation can be visualized after solving the ODE. An accumulated contribution curve (ACC) was designed to quantitatively evaluate each MR modality's utilization by the DNN towards the final segmentation results. The proposed neural ODE model was demonstrated using 369 glioma patients with a 4-modality mp-MRI protocol: T1, contrast-enhanced T1 (T1-Ce), T2, and FLAIR. Three neural ODE models were trained to segment enhancing tumor (ET), tumor core (TC), and whole tumor (WT). The key MR modalities with significant utilization by the DNN were identified based on ACC analysis, and segmentation results from a DNN using only the key MR modalities were compared to those using all 4 MR modalities.

Results: All neural ODE models successfully illustrated image dynamics as expected. ACC analysis identified T1-Ce as the only key modality in ET and TC segmentation, while both FLAIR and T2 were key modalities in WT segmentation. Compared to the U-Net results using all 4 MR modalities, the Dice coefficients of ET (0.784 → 0.775), TC (0.760 → 0.758), and WT (0.841 → 0.837) using only the key modalities showed minimal differences without statistical significance.

Conclusion: The neural ODE model offers a new tool for optimizing deep learning model inputs with enhanced explainability. The presented methodology can be generalized to other medical image-related deep learning applications.

Gliomas, including low-grade glioma (LGG) and high-grade glioblastoma (HGG), are among the most severe brain tumor types, with extremely high mortality and morbidity [1-5]. Radiological analysis, particularly magnetic resonance imaging (MRI), has become standard-of-care in glioma management [2, 4, 6, 7]. Accurate delineation of glioma boundaries, with potential sub-region segmentation, is important in personalized glioma treatment regimens for improved survival [8, 9]. Multi-parametric MRI protocols that involve multiple anatomical and functional sequences can provide comprehensive and potentially complementary information on spatial disease distribution [7, 10-13]; multi-parametric MRI has therefore been extensively studied for glioma segmentation. Currently, glioma segmentation is mainly done through manual delineation by experienced neuroradiologists [8, 14]. Such manual segmentation can be very time-consuming, and results can be subject to large intra- and inter-rater variability [15, 16]. Many efforts have thus been made towards semi- or fully-automatic glioma segmentation for both efficiency and potential accuracy improvements [17].
Earlier efforts include 1st-order intensity thresholding [18] and region-based approaches [19] for regional segmentation, but the reported performance was limited [20]. As a high-throughput computational approach [21], radiomics analysis combined with classical machine learning methods has been studied for glioma segmentation [17, 22]. The extracted radiomics features serve as computational biomarkers describing image intensity, texture, and morphology, which are subsequently used by classifiers to distinguish glioma from normal tissue [23-27]. Recently, deep learning methods have been considered as a new computational approach for glioma segmentation [12, 28-30]. Deep neural networks can directly learn image features at multiple scales for segmentation tasks [31, 32], and state-of-the-art deep learning models have achieved outstanding performance in glioma segmentation [12, 29, 30].

However, despite promising research results, potential clinical applications of deep learning models in glioma segmentation remain challenging [22, 33]. One major issue of currently available deep learning models is the lack of model explainability, i.e., the extent to which the internal mechanisms of a deep neural network can be explained in human terms from a clinical perspective. Explainability ensures that networks are driven by (1) deep features that are appropriate for clinical practice and (2) decisions that are clinically defensible [34-36]. Without such model explainability, deep learning algorithms remain a 'black box' in implementation. The massive computations inside deep neural networks are beyond human logical and symbolic reasoning about causality [37], which raises technical issues in deep learning model development for medical imaging applications. These issues include, but are not limited to, the utilization of model inputs ('Do we need this as a part of the model?'), the confidence in model results ('Can I trust this run with some clues?'), and the feasibility of model generalization ('How do I know if it works for this case?'). Ultimately, the absence of model explainability leads to a lack of accountability and confidence in clinical application.

In this paper, we develop a deep neural network visualization method for multi-parametric MRI based glioma segmentation to enhance model explainability. Driven by the hypothesis that deep feature extraction from multi-source inputs can be modeled as a spatiotemporally continuous process, we model a deep neural network's feature extraction process by an ordinary differential equation (ODE) [38]. Such a model (i.e., a neural ODE) can visualize the dynamics of MRI images following their interactions with the deep neural network. A quantitative evaluator, the accumulated contribution curve (ACC), is designed to evaluate these image dynamics as a measure of each MRI modality's utilization in glioma segmentation. Several comparison studies are included to verify the key MR modalities identified by the ACC in glioma segmentation.

This work employed the Brain Tumor Segmentation Challenge (BraTS) 2020 dataset, a publicly available MRI dataset of 369 subjects with glioblastoma (GBM/HGG) and low-grade glioma (LGG) [7, 12, 39]. Each subject has 4 standard MRI exams as a multi-parametric protocol: native T1-weighted (T1), contrast-enhanced T1-weighted (T1-Ce), T2-weighted (T2), and Fluid Attenuated Inversion Recovery (FLAIR).
The ground-truth tumor segmentation from experienced neuroradiologists comprises three non-overlapping subregions: the enhancing tumor (ET), the peritumoral edema (ED), and the necrotic and non-enhancing tumor core (NCR/NET). Following previous studies related to the BraTS dataset [7, 10], another three overlapping tumor regions (from small to large) are adopted in this work, namely enhancing tumor (ET), tumor core (TC), and whole tumor (WT). Figure 1 illustrates the relationship between the ground-truth segmentation masks. All MRI images in the BraTS dataset have been skull-stripped and resampled to an isotropic 1 mm spatial resolution.

We hypothesize that deep feature extraction by a deep neural network during image segmentation can be described mathematically as a spatiotemporally continuous process. As visualized in Figure 2, an image source before neural network input gradually evolves into a deep feature map towards the binarized segmentation. Here, we refer to this differentiable evolution as an image flow Φ [40]. When multiple image sources are available, i.e., multi-modality images for segmentation, each individual image source possesses its own image flow during the deep feature extraction process.

Mathematically, let K be the collection of all available image sources, i.e., K = {1, 2, …, m}. The hidden state z_t = {x_k(t)}_{k∈K} (i.e., the collection of intermediate image frames within Φ at stage t) can be considered as the solution to the ODE initial value problem:

    dz(t)/dt = f(z(t), t),  z(0) = z_0,  t ∈ [0, 1],        (1)

where z_0 is the stack of input image sources; equivalently,

    z_t = z_0 + ∫_0^t f(z(τ), τ) dτ.                        (2)

As such, z_1 = z(1) denotes the final deep feature map, and Equation (1) formulates deep feature extraction as a continuous process over the stage dimension t.

The key task of our deep learning model design is to find f in Equation (1). Based on the universal approximation theorem, f can be approximated by a deep neural network without requiring an explicit expression as in physical models [44, 45]. While f does not require a specific neural network architecture, in this work we focus on U-Net, which is a suitable and popular architecture for medical image segmentation [29]. The core design of the U-Net architecture in this work is shown in Figure 3(A). Specifically, the structure consists of an encoding part and a decoding part. The encoding part is a convolutional network that consists of five convolutional blocks; each block contains two convolutional layers followed by a rectified linear unit (ReLU) and a max pooling operation. In this process, the spatial dimension decreases while feature information increases. The decoding part combines the feature and spatial information through the corresponding five up-convolutions and concatenations with high-resolution features from the encoding part; in this process, the spatial dimension increases while feature information decreases. Such a U-shaped structure benefits feature extraction and preserves the structural integrity of the image [46]. In this work, we parametrize the function f by the U-Net shown in Figure 3(A).

To reach the final segmentation result as a binarized mask, a convolutional layer follows the final deep feature maps at t = 1. This convolutional layer assigns a binary label to each pixel by performing a weighted summation over the final deep feature maps z_1:

    S = σ( Σ_{k∈K} w_k · x_k(1) ),                          (3)

where σ is the sigmoid function and w_k is the learnable weight of image modality k in the convolutional layer. Thus, the m (= 4 in this study) deep feature maps are combined into one binarized segmentation mask, as in Figure 3(B).
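To make this formulation concrete, below is a minimal, self-contained sketch of the forward pass: a toy convolutional block stands in for the U-Net-parametrized f in Equation (1), a fixed-step 4th-order Runge-Kutta loop integrates the hidden state from t = 0 to t = 1, and a 1×1 convolution plus sigmoid implements the weighted summation of Equation (3). All names, layer sizes, and the random input are illustrative assumptions; the actual implementation in this work uses the TorchDyn library with the full U-Net of Figure 3(A).

```python
# Minimal sketch of the neural ODE segmentation model, not the authors'
# implementation: the toy VectorField below stands in for the U-Net f.
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Toy stand-in for f(z, t) in Equation (1)."""
    def __init__(self, channels: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, z: torch.Tensor, t: float) -> torch.Tensor:
        return self.net(z)  # dz/dt; this toy f happens to ignore t

def rk4_trajectory(f, z0, steps=100):
    """Integrate dz/dt = f(z, t) from t=0 to t=1, keeping every frame z_t."""
    h, z, frames = 1.0 / steps, z0, [z0]
    for i in range(steps):
        t = i * h
        k1 = f(z, t)
        k2 = f(z + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(z + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(z + h * k3, t + h)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        frames.append(z)
    return frames  # frames[-1] is the final deep feature map z_1

f = VectorField(channels=4)
# 1x1 conv head: per-pixel weighted sum over the m=4 feature maps, Equation (3)
head = nn.Conv2d(4, 1, kernel_size=1, bias=False)
z0 = torch.randn(1, 4, 64, 64)  # stacked T1 / T1-Ce / T2 / FLAIR slice (random here)
with torch.no_grad():
    frames = rk4_trajectory(f, z0)
    seg_prob = torch.sigmoid(head(frames[-1]))  # final segmentation probabilities
```

Keeping every intermediate frame is what enables the visualizations discussed next: each entry of `frames` corresponds to one hidden state z_t.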
For a trained model, the weight is known, so can be used to reconstruct intermediate segmentation frame along with the hidden state : Evidently, when t = 1, we reach the final segmentation result = . The collection of can also be visualized to illustrate the segmentation formation. By providing the data pair ( 0 , ), we train the model in Figure 3 (B) as a supervised learning problem. The forward problem in Equation (2) can be solved by standard ODE solvers. During the training phase, the gradients of the output 1 with respect to the input 0 and the parameter can be obtained using the adjoint sensitivity method; this includes solving an additional ODE backward in stage dimension and thus invoking another ODE solver in the backward pass. Once the gradient is obtained, the standard gradient-based optimization can be applied. For technical details we refer to the mathematical works in [38, 40, [47] [48] [49] . While the hidden state and segmentation frame provide visual clues of U-Net behavior in segmentation, the image-based illustration may be insufficient to quantify the importance of individual input (i.e., utilization) towards segmentation results. Specifically, micro-regional where ∈ , ∈ . As defined, BCF remains in the same dimension as a single MRI image source, and it can be visualized along with and . To quantify the spatiotemporal heterogeneity of , for each MRI modality k, we compare and final segmentation results and define accumulated contribution factor as: where ∩ and ∪ are intersection and union operations of the Boolean image, respectively, and ranges from 0 to 1. The collection of , namely accumulated contribution curve (ACC), describes the dynamic properties of an image modality's role in multi-modality image segmentation: when ACC has a positive slope (as a function of t), the deep learning model tends to extract more useful deep features from the studied MRI modality towards a positive contribution of final segmentation results. An ACC with a steeper slope and higher value when approaching t = 1 suggests that the studied MRI modality is more important within the multi-parametric imaging protocol for the segmentation task. All calculations in this work were carried out in Python 3.7 with 8 Core Intel Xeon Silver 4112 CPU @ 2.6 GHz, 32 GB RAM, and Nvidia Quadro P5000 Graphic Card. We implemented the neural ODE deep learning model in Figure 3 (B) with TorchDyn Library [50, 51] . A total of three independent models were trained to achieve segmentation results of ET, TC, and WT regions, respectively. A total of 24422 samples as 4-channel 2D MRI images were randomly divided into the training set and test set in the ratio of 8:2 in the patient assignment. The Adam optimizer with the initial learning rate of 10 -3 was applied. The standard 4 th order Runge-Kutta ODE solver was adopted as an ODE solving engine with 10 -3 tolerance, and the adjoint sensitivity method was employed to optimize parameters in the U-shape structure. The parameters in the last 1x1 convolutional layer were also updated in the backpropagation. In the model trained for each segment region, , , and results were saved at a step size of 0.01 from t = 0 to t = 1, and ACCs were calculated based on final segmentation results . Based on the qualitative inspection of , and , as well as quantitative analysis of ACCs, we identified key MRI modalities of each segmentation region. We then evaluated the following two studies: 1) Baseline segmentation results (ET/TC/WT) from the basic U-Net model. 
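As a concrete companion to Equations (4)-(6), the sketch below computes one modality's ACC from the saved frames, reusing `frames` and `head` from the previous sketch. The binarization rule for B_{k,t} follows the reconstruction in Equation (5); the function name and the modality index are illustrative assumptions.

```python
# Sketch of the ACC computation; `torch`, `frames`, and `head` come from the
# neural ODE sketch above.
import torch

def accumulated_contribution_curve(frames, head, k: int, final_seg: torch.Tensor):
    """Return [ACF_{k,t}] over t: Jaccard overlap of B_{k,t} with the final mask S."""
    w = head.weight[0, :, 0, 0]  # learnable weights w_k of the 1x1 conv head
    curve = []
    for z_t in frames:                                     # one saved frame per stage t
        b_kt = torch.sigmoid(w[k] * z_t[:, k]) > 0.5       # BCF, Equation (5)
        inter = (b_kt & final_seg).sum().item()            # |B_{k,t} ∩ S|
        union = (b_kt | final_seg).sum().item()            # |B_{k,t} ∪ S|
        curve.append(inter / union if union > 0 else 0.0)  # ACF_{k,t}, Equation (6)
    return curve

with torch.no_grad():
    final_seg = torch.sigmoid(head(frames[-1]))[:, 0] > 0.5  # Boolean final mask S
    acc_t1ce = accumulated_contribution_curve(frames, head, k=1, final_seg=final_seg)
```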
In both studies, the same training and test sets were used as in the neural ODE model. The loss function was binary cross-entropy, and the initial learning rate of the Adam optimizer was 10^-3. Ten-fold cross-validation within the training set was employed to evaluate the segmentation results objectively. Following previous BraTS-related studies [50], we included the Dice similarity index, accuracy, sensitivity, and specificity as evaluation metrics. For a successful neural ODE design and analysis, we expect minimal differences in segmentation results between these two studies. A Wilcoxon signed-rank test on the Dice index with a significance level of 0.05 was adopted in the comparison studies.

The results from all four MRI modalities serve as the baseline. As the main numeric evaluator, the Dice coefficients from segmentation results using the neural-ODE-selected modalities are very close to the baseline results, without statistically significant differences. The other three evaluators also show minimal numeric differences. In contrast, when using the complementary MRI modalities, i.e., the ones not selected by the neural ODE model, the Dice coefficients are much lower than the baseline results, with significant differences. In addition, sensitivity and accuracy results are compromised when using the complementary modalities, while specificity seems insensitive to MRI modality selection. These comparison results are summarized in Tables I and II.

In this work, we demonstrated a deep learning neural ODE design that models multi-parametric MRI-based glioma segmentation as a continuous process. One of the key innovations in this work is the spatiotemporally continuous segmentation modeling, which was conceptualized as an ODE that governs the deep feature extraction process; here, this ODE was explicitly parametrized by a U-Net.

In addition to the input utilization issue, the developed model also helps to address the issue of confidence in model results. Based on the results in Figures 4-6, the formation process of the binary segmentation mask can be visualized. Our results successfully illustrated how each segmentation mask was learned by U-Net through transitions from the whole brain to the localized tumor region. The transitions from global to local content and from coarse to fine structure are consistent with human intuition. In general, T1 scans highlight fat tissues within the body, T1-Ce indicates active lesions by enhancing regions of blood-brain barrier breakdown, and T2 and FLAIR scans highlight the tissue water of the edema [7, 54].

Compared with the current 2D-based design, a 3D-based calculation is estimated to cost more than one terabyte of RAM and GPU memory, which is infeasible using state-of-the-art hardware. Future collaboration with a high-performance computing (HPC) core is desired to investigate such a huge computation task. As a feasibility study, the current ACC analysis for key MRI modality identification is based on curve shape comparison. The adopted curve feature criteria may be sufficient for 4 input candidates. In future works, when more input channels are modeled in the neural ODE, more quantitative ACC features with sophisticated statistical modeling, such as clustering methods, may be required to identify useful input channels [58, 59], as illustrated by the sketch below.
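As a purely hypothetical illustration of that future direction, the sketch below summarizes each modality's ACC by two toy curve features (endpoint value and net rise) and clusters the channels with k-means. The synthetic curves, the feature choices, and the two-cluster setting are all assumptions for demonstration, not criteria used in this work.

```python
# Hypothetical sketch: cluster input channels by simple ACC curve features.
import numpy as np
from sklearn.cluster import KMeans

def acc_features(acc: np.ndarray) -> np.ndarray:
    """Two toy curve features: ACC value near t = 1 and net rise over [0, 1]."""
    return np.array([acc[-1], acc[-1] - acc[0]])

# Synthetic ACCs standing in for curves computed as in the previous sketch.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
accs = {
    "T1":    0.10 * t + 0.02 * rng.standard_normal(t.size),
    "T1-Ce": 0.70 * t + 0.02 * rng.standard_normal(t.size),
    "T2":    0.15 * t + 0.02 * rng.standard_normal(t.size),
    "FLAIR": 0.20 * t + 0.02 * rng.standard_normal(t.size),
}

X = np.stack([acc_features(np.asarray(a)) for a in accs.values()])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
key_cluster = labels[int(np.argmax(X[:, 0]))]  # cluster with the highest endpoint
key_modalities = [m for m, l in zip(accs, labels) if l == key_cluster]
print(key_modalities)  # e.g., ['T1-Ce'] under this synthetic data
```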
Another future research direction is asynchronous deep feature extraction. When an ACC shows a nonmonotonic shape (such as the FLAIR ACC in Figure 7), the modality's utilization by the network varies across extraction stages; extracting deep features from different modalities asynchronously may better accommodate such dynamics.

In this work, we developed a neural ODE model to realize image evolution visualization for deep learning-based image segmentation. In the multi-parametric MRI-based glioma segmentation study, the developed model successfully identified key MR modalities via both qualitative visual clues and quantitative image dynamics analysis. The developed neural ODE model enhances the explainability of U-Net deep learning models, and the presented methodology can be generalized to other deep learning models for improved clinical applications.

Figure 7 caption: The collection of ACF_{k,t} from t = 0 to t = 1 is named the accumulated contribution curve (ACC). The solid blue lines represent the mean ACC across all test images, and the shaded areas indicate the corresponding standard deviation.

References:
[1] Survival and low-grade glioma: the emergence of genetic information.
[2] Glioblastoma Multiforme: A Review of its Epidemiology and Pathogenesis through Clinical Presentation and Treatment.
[3] MRI based medical image analysis: Survey on brain tumor grade classification.
[4] The epidemiology of glioma in adults: a "state of the science" review.
[5] Quality of life in adults with brain tumors: Current knowledge and future directions.
[6] Multiparametric MRI: practical approach and pictorial review of a useful tool in the evaluation of brain tumours and tumour-like lesions. Insights into Imaging.
[7] The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS).
[8] 3D brain glioma segmentation in MRI through integrating multiple densely connected 2D convolutional neural networks.
[9] Brain Tumor Segmentation and Survival Prediction Using Multimodal MRI Scans With Deep Learning.
[10] Brain tumor segmentation with self-ensembled, deeply-supervised 3D U-net neural networks: a BraTS 2020 challenge solution.
[11] Generalized Wasserstein Dice Score, Distributionally Robust Deep Learning, and Ranger for Brain Tumor Segmentation: BraTS 2020 Challenge. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries.
[12] Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge.
[13] Multi-parametric MRI (mpMRI) for treatment response assessment of radiation therapy.
[14] Automatic glioma segmentation based on adaptive superpixel.
[15] Comparison of manual and automatic segmentation methods for brain structures in the presence of space-occupying lesions: a multi-expert study.
[16] Brain tumor target volume determination for radiation treatment planning through automated MRI segmentation.
[17] Review of MRI-based brain tumor image segmentation using deep learning methods.
[18] MRI brain image segmentation based on thresholding.
[19] Localized active contour model with background intensity compensation applied on automatic MR brain tumor segmentation.
[20] Brain image segmentation in recent years: A narrative review.
[21] Radiomics: a primer on high-throughput image phenotyping.
[22] Glioma dynamics and computational models: a review of segmentation, registration, and in silico growth algorithms and their clinical applications.
[23] Classification of high-grade glioma into tumor and nontumor components using support vector machine.
[24] Tumor segmentation from a multispectral MRI images by using support vector machine classification.
[25] Brain tumor segmentation with optimized random forest. In International Workshop on Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries.
[26] Glioma brain tumor detection and segmentation using weighting random forest classifier with optimized ant colony features.
[27] Comparison of five cluster validity indices performance in brain [18F]FET-PET image segmentation using k-means. Medical Physics.
[28] Machine learning application in glioma classification: review and comparison analysis.
[29] U-Net: Convolutional Networks for Biomedical Image Segmentation.
[30] BU-Net: Brain Tumor Segmentation Using Modified U-Net Architecture. Electronics.
[31] Low-grade glioma segmentation based on CNN with fully connected CRF.
[32] CNN-based encoder-decoder networks for salient object detection: A comprehensive review and recent advances.
[33] Robust Deep Learning-based Segmentation of Glioblastoma on Routine Clinical MRI Scans Using Sparsified Training.
[34] Explainable deep learning: A field guide for the uninitiated.
[35] A Radiomics-Boosted Deep-Learning Model for COVID-19 and Non-COVID-19 Pneumonia Classification Using Chest X-ray Image.
[36] Quantification of lung function on CT images based on pulmonary radiomic filtering.
[37] Recursive Division of Image for Explanation of Shallow CNN Models.
[38] Neural Ordinary Differential Equations.
[39] Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data.
[40] Neural ODEs for Image Segmentation with Level Sets.
[41] Explaining deep neural networks with a polynomial time algorithm for Shapley value approximation.
[42] Explanations for attributing deep neural network predictions.
[43] Interpretable & explorable approximations of black box models.
[44] Multilayer feedforward networks are universal approximators.
[45] Stiff neural ordinary differential equations.
[46] Medical image segmentation based on U-net: A review.
[47] Neural Ordinary Differential Equations for Semantic Segmentation of Individual Colon Glands.
[48] Augmented Neural ODEs.
[49] Second-Order Neural ODE Optimizer.
[50] TorchDyn: A neural differential equations library.
[51] TorchDyn: Implicit Models and Neural Numerical Methods in PyTorch.
[52] A survey on image data augmentation for deep learning.
[53] Dose-Distribution-Driven PET Image-based Outcome Prediction (DDD-PIOP): A Framework Design for Oropharyngeal Cancer IMRT Application.
[54] Analyzing magnetic resonance imaging data from glioma patients using deep learning. Computerized Medical Imaging and Graphics.
[55] Data clustering based on Langevin annealing with a self-consistent potential.
[56] An exploratory radiomics approach to quantifying pulmonary function in CT images.
[57] Post-Radiotherapy PET Image Outcome Prediction by Deep Learning under Biological Model Guidance: A Feasibility Study of Oropharyngeal Cancer Application.
[58] Automatic detection of pulmonary nodules on CT images with YOLOv3: development and evaluation using simulated and patient data. Quantitative Imaging in Medicine and Surgery.
[59] The global k-means clustering algorithm. Pattern Recognition.