Parallel Medical Imaging for Intelligent Medical Image Analysis: Concepts, Methods, and Applications

Chao Gou; Tianyu Shen; Wenbo Zheng; Huadan Xue; Hui Yu; Qiang Ji; Zhengyu Jin; Fei-Yue Wang

arXiv:1903.04855·cs.CV·June 30, 2021

TL;DR

This paper introduces Parallel Medical Imaging (PMI), a novel framework that integrates data-driven learning and medical domain knowledge to improve diagnostic accuracy and interpretability in medical image analysis.

Contribution

The paper proposes a new PMI framework that combines interactive ACP-based parallel intelligence with artificial imaging systems for enhanced medical diagnosis.

Findings

01

Effective in mammogram and skin lesion analysis

02

Improves generalization and interpretability of diagnostic models

03

Demonstrated on multiple public datasets

Tables

Table I: Clinical description for mammography: breast composition, mass shape and margin, density.

Breast composition	a. The breasts are almost entirely fatty;
	b. There are scattered areas of fibroglandular density;
	c. The breasts are heterogeneously dense, which may obscure small masses;
	d. The breasts are extremely dense, which lowers the sensitivity of mammography.
Masses	Shape	Oval; Round; Irregular.
	Margin	Circumscribed; Obscured; Microlobulated; Indistinct; Spiculated.
	Density	High density; Equal density; Low density; Fat-containing.

Table II: Breast Imaging Reporting and Data System (BI-RADS) assessment categories.

Category	Description
0	Needs additional imaging evaluation and/or prior mammograms for comparison.
1	Negative.
2	Benign finding(s).
3	Probably benign finding(s). Short-interval follow-up is suggested.
4	Suspicious anomaly. Biopsy should be considered.
5	Highly suggestive of malignancy. Appropriate action should be taken.
6	Biopsy proven malignancy.

Table III: Experimental results on the INbreast dataset.

Method	AUC	Acc.	Sensitivity	Specificity
Pred(CNN)	0.859	0.850	0.828	0.885
Pred+Desc	0.892	0.885	0.863	0.922
Pred+Desc+Pres	0.901	0.902	0.906	0.896

Table IV: Experimental results on ISIC Skin 2017.

Method	AUC	Acc.	Sensitivity	Specificity
#2	0.856	0.824	0.103	0.998
#1	0.868	0.828	0.735	0.851
SDL [53]	0.868	0.872	-	-
ARLCNN [9]	0.875	0.850	0.658	0.896
Pred	0.667	0.732	-	-
Pred+Pres	0.883	0.890	0.732	0.901
Pred+Pres+Desc	0.912	0.906	0.743	0.907

Equations

$$e(x_{k}^{i}) = -\sum_{c=1}^{N} p_{k}^{i,c} \log p_{k}^{i,c},$$

$$d(x_{k}^{i}, x_{k}^{j}) = \sum_{c=1}^{N} \left(p_{k}^{i,c} - p_{k}^{j,c}\right) \log \frac{p_{k}^{i,c}}{p_{k}^{j,c}}.$$

$$\min_{G}\max_{D}\Big\{ f(D,G) = \mathbb{E}_{x\sim p_{\text{mass}}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_{z}(z)}[\log(1-D(G(z)))] \Big\},$$

$$f_{S} = \mathbb{E}[\log P(S=\text{real}\mid X_{\text{real}})] + \mathbb{E}[\log P(S=\text{fake}\mid X_{\text{fake}})],$$

$$L = -\frac{1}{n}\sum_{\substack{i=1\\(x,y)}}^{n}\left[l^{i}\log(p^{i}) + (1-l^{i})\log(1-p^{i})\right],$$

$$G^{*} = \arg\min_{G}\max_{D}\ \mathbb{E}_{x,y}[\log D(x,y)] + \mathbb{E}_{x,z}[\log(1-D(x,G(x,z)))] + \mathbb{E}_{x,y,z}\left[\lVert y-G(x,z)\rVert_{1}\right].$$

$$\text{accuracy (Acc.)} = \frac{TP+TN}{TP+FN+TN+FP},\quad \text{specificity} = \frac{TN}{TN+FP},\quad \text{sensitivity} = \frac{TP}{TP+FN},$$

$$L = \sum_{i=1}^{n}\sum_{j=1}^{m}\left[\left(J_{i,j} - (y_{i} == y_{j})\right)^{2} + \lambda\,\lVert H - \mathrm{sgn}(H)\rVert_{p}^{p}\right],$$
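As a small worked illustration of the accuracy, specificity, and sensitivity definitions listed in this section, the following sketch computes all three from confusion-matrix counts. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity and specificity from confusion counts.

    tp/fn count malignant (positive) cases, tn/fp count benign (negative) cases.
    Assumes at least one positive and one negative case, so no division by zero.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of positives that are detected
        "specificity": tn / (tn + fp),   # fraction of negatives correctly rejected
    }

# Hypothetical counts, for illustration only.
m = diagnostic_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these made-up counts, accuracy is 85/100, sensitivity is 40/50, and specificity is 45/50.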

Chao Gou*, Tianyu Shen*, Wenbo Zheng, Huadan Xue, Hui Yu, Qiang Ji, Zhengyu Jin, and Fei-Yue Wang

This work was supported in part by the National Natural Science Foundation of China under Grant 61806198.

*Equal contribution. C. Gou (Corresponding Author) is with School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510275, China. T. Shen, W. Zheng and F.-Y. Wang are with the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China. T. Shen is also with University of Chinese Academy of Sciences, Beijing 100049, China. W. Zheng is also with School of Software Engineering, Xi’an Jiaotong University, Xi’an 710049, China. H. Xue and Z. Jin are with Department of Radiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China. H. Yu is with School of Creative Technologies, University of Portsmouth, Portsmouth, PO1 2DJ. Q. Ji is with Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute. F.-Y. Wang is also with Qingdao Academy of Intelligent Industries, Qingdao 266000, China.

Abstract

Data-driven artificial intelligence technologies have made much progress in medical image analysis over the last decades. However, the task remains challenging because of the distinctive complexity of acquiring and annotating image data, extracting medical domain knowledge, and explaining diagnostic decisions. In this paper, we propose a data-knowledge-driven framework termed Parallel Medical Imaging (PMI) for intelligent medical image analysis, based on the methodology of interactive ACP-based parallel intelligence. In the PMI framework, computational experiments with predictive learning are conducted in a data-driven way to extract medical knowledge for diagnostic decision support. Artificial imaging systems are introduced to select and prescriptively generate medical image data in a knowledge-driven way, so that medical domain knowledge is utilized. Through closed-loop optimization based on parallel execution, the proposed PMI framework can boost generalization ability and alleviate the limited medical interpretability of diagnostic decisions. Furthermore, we illustrate a preliminary implementation of the PMI method through case studies of mammogram analysis and skin lesion image analysis. Experimental results on several public medical image datasets demonstrate the effectiveness of the proposed PMI.

Index Terms:

Parallel intelligence, parallel medical imaging, ACP, medical image analysis, domain knowledge.

I Introduction

Medical image analysis aims at extracting clinically useful information from computed tomography (CT), positron emission tomography (PET), magnetic resonance (MR), ultrasound, X-ray, and other imaging modalities with the assistance of computers for diagnostic decision support [1, 2]. With the urgent demand for medical imaging, medical societies have entered a new era in which medical equipment, image data, domain knowledge, and humans, including physicians and patients, are coupled in large-scale cyber-physical-social spaces (CPSS) [3, 4]. Hence, vision-based medical image analysis is playing an increasingly prominent role at many stages of the clinical workflow, from screening and diagnosis to treatment delivery, especially in remote medical consultation. Recently, vision-based medical image analysis has achieved promising results for skin cancer diagnosis [5, 6, 7, 8, 9], red lesion detection in fundus images [10], mammography analysis [11, 12, 13, 14] and pulmonary nodule detection [15, 16]. However, challenges remain. Firstly, these data-driven techniques require a large number of medical images annotated by domain experts or radiologists. Secondly, conventional methods for medical imaging are built in a data-to-knowledge way, where algorithms are learned from existing training samples in a bottom-up manner without any feedback or interaction that would exploit medical domain knowledge. Last but not least, the final diagnostic decisions made by learned "black-box" models, especially non-linear deep neural networks, have limited interpretability.

The ACP methodology was first proposed in [17] for modeling, managing and controlling complex systems. It consists of Artificial societies, Computational experiments and Parallel execution. ACP-based parallel intelligence is one form of intelligence generated from the interactions and executions between physical and artificial systems [18]. As part of parallel intelligence, the parallel learning framework was presented in [19] to address issues of data collection and policy exploration in the current machine learning framework. Parallel learning combines descriptive learning, predictive learning, and prescriptive learning into a uniform evolutionary framework to optimize the learning system by self-boosting [20].

Inspired by interactive ACP-based parallel intelligence, we propose a data-knowledge-driven framework termed Parallel Medical Imaging (PMI) to address the aforementioned challenges in medical image analysis. Firstly, as pointed out by Wang et al. [21], evaluations of the objectives can be performed only based on data collected from the physical world and the virtual world. We therefore propose to conduct computational experiments that predictively extract, in a data-driven way, medical knowledge for diagnostic decision support that is explainable to humans. Unlike conventional medical image analysis frameworks that solely perform data-to-knowledge extraction, we further introduce artificial imaging systems to select and generate specific medical image data in a knowledge-driven way. Specifically, interactive parallel learning with a descriptive and prescriptive scheme based on the explainable knowledge is performed to achieve knowledge-to-data generation in a top-down manner, which boosts the performance of the decision model. In addition, the data-knowledge-driven parallel evolution enables effective large-scale data collection and enhances the interpretability of diagnosis.

II Preliminary

II-A ACP methodology

The ACP methodology was initially proposed by Wang et al. [17] for effective modeling and control of complex systems. It consists of Artificial societies (A), Computational experiments (C) and Parallel execution (P). The key idea of ACP is to combine these three components so that the virtual artificial space becomes another space for solving complex problems [22]. It has been further extended to ACP-based parallel intelligence, defined as one form of intelligence generated from the interactions and executions between physical and artificial systems [18]. ACP-based parallel intelligence is becoming an increasingly important research topic and is widely applied in areas such as social computing, traffic management and control, ethylene production management, and autonomous driving [18, 23, 3, 24].

II-B Parallel vision framework

The ACP methodology was further extended to the computer vision community in [25] as the Parallel Vision (PV) framework for better perceiving and understanding complex scenes. Inspired by the ACP methodology, PV contains three parts: artificial systems, computational experiments and parallel execution. In PV, the vision models are optimized on-line through parallel execution with a virtual/real interactive policy. From this perspective, existing work on learning-by-synthesis [26, 27, 28] covers the artificial systems and computational experiments parts of PV. Parallel execution aims to construct a closed loop driven by large-scale "big data" to boost the performance of vision systems. As a result, on-line learning through parallel execution allows the perception model to be continuously optimized.

II-C Parallel learning

By taking data, knowledge, and action into a closed loop, a parallel learning framework was introduced in [19] to alleviate the limitation in data collecting and policy exploring of existing machine learning frameworks. Based on ACP methodology, the parallel learning framework can capture the mutual dependency between data and action in an artificial system parallel to the physical system from observations. Parallel learning combines descriptive learning, predictive learning and prescriptive learning to effectively collect/generate data and guide the implementation of complex learning systems [20].

In particular, predictive learning urges the decision system to extract informative knowledge from the collected data. Descriptive learning forms a self-consistent artificial system that generates new labeled data following the distribution of the data observed in the real system, with minimum human intervention. Moreover, prescriptive learning guides the system to collect specific data in a supervised manner using descriptive or prior knowledge [29]. Refer to [19, 20] for more details.

III Parallel medical imaging

Conventional medical image analysis frameworks extract clinical knowledge from image data in a bottom-up manner where the model learning is driven by data ignoring the prior medical knowledge. However, in the field of medical imaging, domain knowledge plays a critical role for data collection and diagnosis decision support. Properly utilizing medical knowledge in a top-down manner can not only improve the diagnosis but also enhance the interpretability of diagnostic decision. Inspired by the ACP-based parallel intelligence frameworks and systems, we propose a data-knowledge-driven framework termed as parallel medical imaging (PMI) for medical image analysis.

The overall framework of the proposed PMI is shown in Fig. 1. Two major parts, medical images and domain knowledge, are coupled in PMI through the ACP methodology and parallel learning. The key point is to select and generate image data that are representative enough to extract the desired medical knowledge for the final diagnostic decision.

Firstly, in the stage of artificial systems, raw images are collected and then passed through variation operators such as augmentation, selection and reproduction with generation to build a large-scale image collection. In this work, inspired by the key idea of evolutionary optimization through the interactions and executions between physical and artificial systems, we introduce artificial imaging systems (AIS) parallel to the physical ones. In particular, a self-optimizing AIS can be constructed through descriptive learning. According to the relevant knowledge and the distribution $p_{\text{data}}(x)$ of the acquired real small data, abundant artificial data $\tilde{x}\sim p_{\text{data}}(x)$ are generated. Secondly, computational experiments with predictive learning are conducted for data-to-knowledge extraction: $y=f(\tilde{x},x;w)$, where $(\tilde{x},x)$ is the combination of real small data and artificial big data in the AIS, and $y=f(\cdot;w)$ is the mapping represented by the visual models in the computational experiments. Finally, in the stage of parallel execution, prescriptive learning is adopted to guide the data generation in the AIS based on the predictively extracted or prior medical knowledge, so that knowledge-to-data generation is achieved. This step can also enhance the interpretability of decisions. In addition, descriptive learning is adopted in the AIS to guide data selection and generation based on the captured data distribution and knowledge. As a result, effective diagnosis and prognosis can be achieved through the extracted knowledge with enhanced interpretability. Hence, PMI can jointly employ the image data distribution and medical knowledge through bottom-up and top-down learning and inference for the final clinical decision. It can reduce the dependence on annotated images and alleviate the limited medical interpretability of diagnostic decisions. More details are given in the following subsections.

III-A Image data collection

For medical imaging, large-scale image data with accurate annotations are critical for the performance of learning-based methods. A parallel imaging framework was introduced in [30] to generate images for PV [25] and tackle the problems of complex vision systems. However, compared with natural image analysis, medical image analysis requires a higher level of expertise for interpretation and labeling. In addition, it is not easy to collect image data from medical institutions or imaging communities, since data sharing must comply with specific security and privacy policies. Moreover, some lesion types and abnormalities have a very low rate of occurrence in the general population [26, 31]. Collecting effective training data is therefore more time-consuming and costly, which keeps medical imaging a challenging task.

In this work, as shown in Fig. 1, from the data perspective of parallel intelligence [18, 30], real medical images with annotations are a kind of ’small data’. Through effective reproduction and variation operations such as conventional augmentation, active selection, and generation by the introduced artificial imaging systems, a set of ’big data’ containing real and synthetic images is formed for conducting computational experiments for medical knowledge extraction.

III-A1 Augmentation and selection of real images

In the step of image data collection, small and/or imbalanced real images for training can be augmented. Similar to conventional methods, rotation, scaling, flipping, translation and noise addition can be applied for medical image augmentation [6, 27, 32, 8].
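The conventional augmentations mentioned above can be sketched in a few lines. This is a minimal illustration on a tiny grayscale patch stored as nested lists, covering flipping, translation and noise addition only (rotation and scaling are omitted for brevity). The helper names, the one-pixel shift and the noise level are assumptions made for the example and are not settings reported in the paper.

```python
import random

def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]

def translate_right(img, dx, fill=0.0):
    """Shift the image dx pixels to the right (0 <= dx <= width), padding with `fill`."""
    width = len(img[0])
    return [[fill] * dx + row[:width - dx] for row in img]

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in img]

def augment(img, rng):
    """Return the original image plus three simple variants."""
    return [
        img,
        flip_horizontal(img),
        translate_right(img, dx=1),
        add_gaussian_noise(img, sigma=0.01, rng=rng),
    ]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
patch = [[0.0, 0.2, 0.9],
         [0.1, 0.8, 0.3]]  # hypothetical 2x3 patch, not real image data
augmented = augment(patch, rng)
```

In practice an array library would replace the nested lists; the list form is used here only to keep the sketch dependency-free.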

The performance of learning-based methods for medical image analysis depends not only on the number but also on the representativeness of the labeled images. However, due to a lack of standardization in medical image acquisition, selecting representative training samples for computational experiments remains a challenging task. In this framework, suitable selection of real images is performed to address this challenge. To this end, simple unsupervised or semi-supervised methods can be applied for data selection. In addition, active learning, which aims at using limited medical images for disease classification, can be developed. Active learning iteratively selects the most informative samples through the interaction between experts and the computer. The key is to develop an uncertainty criterion for the sample selection process. In [33], entropy and diversity are adopted to indicate the power of candidate patches in elevating the performance of the current CNN model. For the $i$-th patch of the $k$-th candidate, denoted by $x_{k}^{i}$, the prediction is $p_{k}^{i}$ and its entropy is formulated as below:

$$e(x_{k}^{i}) = -\sum_{c=1}^{N} p_{k}^{i,c} \log p_{k}^{i,c},$$

where $c=1,2,...,N$ indexes the possible classes. The diversity between sample patches $x_{k}^{i}$ and $x_{k}^{j}$ is defined as below:

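$$d(x_{k}^{i}, x_{k}^{j}) = \sum_{c=1}^{N} \left(p_{k}^{i,c} - p_{k}^{j,c}\right) \log \frac{p_{k}^{i,c}}{p_{k}^{j,c}}.$$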

According to [33], sample patches with higher entropy and higher diversity are expected to be selected to elevate the model performance.
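The entropy and diversity criteria described above can be illustrated with a short sketch. It assumes each patch prediction is a list of $N$ class probabilities; the constant `EPS`, the function names and the example probabilities are hypothetical additions for illustration and do not come from [33].

```python
import math

EPS = 1e-12  # assumption: small constant to avoid log(0); not specified in the source

def entropy(p):
    """Entropy e(x) = -sum_c p_c * log(p_c) of one patch's class probabilities."""
    return -sum(pc * math.log(pc + EPS) for pc in p)

def diversity(p_i, p_j):
    """Diversity d = sum_c (p_i,c - p_j,c) * log(p_i,c / p_j,c) between two patches."""
    return sum((a - b) * math.log((a + EPS) / (b + EPS)) for a, b in zip(p_i, p_j))

# Hypothetical predictions for three patches of one candidate (N = 2 classes).
patches = [[0.5, 0.5], [0.9, 0.1], [0.1, 0.9]]
entropies = [entropy(p) for p in patches]

# The most uncertain patch is the one with the highest entropy.
most_uncertain = max(range(len(patches)), key=lambda i: entropies[i])

# Patches that disagree strongly have high diversity; identical patches have zero.
d_disagree = diversity(patches[1], patches[2])
d_same = diversity(patches[1], patches[1])
```

Note that the diversity term is symmetric in its two arguments, so the order of the patches does not matter.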

III-A2 Generation of synthetic images

To utilize the medical domain knowledge, we propose to apply descriptive learning and design artificial imaging systems parallel to real imaging systems that can generate synthetic and specific medical images following the distribution of real ones. Many techniques for generating new synthetic medical images in our proposed framework of artificial imaging systems can be applied. They typically fall into three categories.

In the first category, new lesions are mathematically simulated based on various deformations and then inserted into the raw projection data or reconstructed clinical images, such as mammograms [34] and lung nodule images [35]. An example from [34] is illustrated in Fig. 2. To ensure that the artificial samples have realistic characteristics, real lesions can be extracted and inserted into the same or different images [36].

In the second category, virtual images are simulated through computer graphics based on an abstraction of the prior medical knowledge. In particular, synthetic images are generated by selecting the simulation parameters of models under controlled hypothetical imaging conditions [37]. In [38, 39], a computerized phantom (eXtended CArdiac-Torso, XCAT) serves as a virtual patient and is fed into an artificial imaging system with an accurate computerized model, which can generate photorealistic, patient-quality CT image data as shown in Fig. 3.

In the third category, generative models for image synthesis can be learned in the artificial imaging systems. In [40], the authors propose a fully convolutional neural network model for MRI synthesis. This model learns to map input modalities into a shared modality-invariant latent space, which allows it to benefit from additional input modalities and to be robust to missing data. Recently, adversarial learning of generative models has been widely used for medical image synthesis [31, 41, 7]. Some sample skin lesions synthesized with GANs [7] are shown in Fig. 4. In this work, an effective GAN generator can be used in the step of artificial imaging systems.

III-B Medical knowledge extraction

Conventional methods of turning data into medical knowledge rely on visual analysis and interpretation by a domain expert or radiologist in order to find useful patterns in data for decision support [42]. As pointed out in radiomics studies [43, 44], effective conversion of images to mineable data supports the diagnostic decision. In this work, after effective image collection, computational experiments with predictive learning are conducted to extract medical knowledge in PMI. Hence, medical knowledge extraction from images is also a part of radiomics.

For this research topic in PMI, any information (e.g. mass shape, margin, density, location) about the patient’s ultrasonic signs, X-ray findings and other related image-based medical descriptions is termed a ’symptom’. Computational experiments with predictive learning try to perform effective diagnosis. To achieve this goal, medical knowledge needs to be extracted by studying, in books and in practical experience, which symptoms necessarily prove or exclude a diagnosis. Such information about the relationships between symptoms and diagnoses, symptoms and symptoms, diagnoses and diagnoses, and more complex relationships between combinations of symptoms and diagnoses and a symptom or diagnosis, is a formalization of what is called medical knowledge [45].

Predictive learning was originally inspired by cognitive psychology studies of how children construct knowledge of the world by interacting with it [24]. In the step of computational experiments, we perform predictive learning of the diagnosis model from the collected image data for decision support. It can be viewed as part of medical knowledge extraction from image data. Conventional data-driven machine learning techniques, especially deep learning models, can be learned to address knowledge extraction in PMI. Moreover, hand-crafted features (e.g. LBP, HOG, SIFT) designed by humans using statistical formulations can be applied as prior domain knowledge for approximating visual content [8, 46].

In general, computational experiments in PMI include detection, segmentation, classification, or relationship captioning for decision support in clinical applications. The detection model extracts knowledge of the rough location and size of the lesion area. Subsequently, the segmentation model extracts the detailed shape and margin information of the lesion. Finally, knowledge of pathological types and assessment categories is obtained through the classification task.

III-C Closed-loop optimization with parallel learning

As shown in Fig. 5, we introduce parallel learning to take advantage of bidirectional optimization between medical image data and the clinical description/representation of medical knowledge. In PMI, predictive learning achieves effective data-to-knowledge extraction in a bottom-up manner. Traditional diagnosis treats medical images as pictures intended solely for visual interpretation. Conversely, through top-down inference, the extracted medical knowledge can be used to guide image generation and to increase the interpretability of future diagnoses. As described in subsection II-C, we employ the descriptive and prescriptive learning components of parallel learning to improve the generalization ability of the model and enhance the interpretability of medical diagnostic decisions.

III-C1 Descriptive learning

Descriptive learning aims to devise models to explain and predict learning results [29]. In this work, it urges the introduced artificial imaging system to generate new images that follow the distribution of the observed data. The descriptive learning process allows features to be learned from unlabeled data in a semi-supervised or unsupervised manner. Adversarial learning of a GAN for image generation can be seen as a special case of descriptive learning in which the objective is to minimize the difference between the distributions of real and generated images [19, 20].

Taking mass image generation as an example [31], we introduce adversarial learning as part of descriptive learning to implicitly learn the mass image distribution $p_{\text{mass}}$ from real image samples. A GAN contains a generator $G$ and a discriminator $D$, and the output of the discriminator $D$ can be descriptively used for the optimization of the generator $G$. For the input $x$, $D(x)$ represents the probability of being a real mass image. For the input $z$ from a simple distribution $p_{z}$, $G(z)$ represents the generated synthetic image. The basic loss function for adversarial learning is a two-player minimax game formulated as below:

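$$\min_{G}\max_{D}\Big\{ f(D,G) = \mathbb{E}_{x\sim p_{\text{mass}}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_{z}(z)}[\log(1-D(G(z)))] \Big\},$$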

where $\mathbb{E}$ represents the expectation. We can rewrite this loss function as below:

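$$f_{S} = \mathbb{E}[\log P(S=\text{real}\mid X_{\text{real}})] + \mathbb{E}[\log P(S=\text{fake}\mid X_{\text{fake}})],$$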

where $P(S=\text{real}\mid X)=D(X)$ and $X_{\text{fake}}=G(z)$. $D$ is trained to maximize $f_{S}$, the log-likelihood that it assigns to the correct source of a mass image, while $G$ is trained to minimize the second term of $f_{S}$. It is worth noting that the key idea of descriptive learning is to model the medical image distribution, perception and reasoning based on observations in the real world. Hence, the generator descriptively learned through adversarial learning can be generalized to artificial imaging systems.
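To make the roles of the two players concrete, the sketch below evaluates the adversarial objective on batches of discriminator outputs. It assumes, consistently with the minimax form, that the probability assigned to the "fake" source is $1-D(G(z))$. The function names and the example scores are hypothetical, and no network is actually trained here.

```python
import math

def discriminator_objective(d_real, d_fake):
    """f_S = E[log D(x_real)] + E[log(1 - D(x_fake))]; the discriminator maximizes this."""
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_objective(d_fake):
    """Second term of f_S only; the generator minimizes E[log(1 - D(G(z)))]."""
    return sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability of "real") on small batches.
undecided = discriminator_objective(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
confident = discriminator_objective(d_real=[0.9, 0.9], d_fake=[0.1, 0.1])

# A generator that fools D (high D(G(z))) obtains a lower, i.e. better, objective.
fooling = generator_objective(d_fake=[0.9])
failing = generator_objective(d_fake=[0.1])
```

An undecided discriminator that outputs 0.5 everywhere yields $f_{S} = -2\log 2$, and any discriminator that separates real from generated images better than chance scores higher.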

III-C2 Prescriptive learning

According to the definition in [29, 19, 20], prescriptive learning is concerned with guidelines that describe what to do in order to generate specific outcomes. They are often based on descriptive theories or derived from prior knowledge. In this work, we achieve knowledge-to-data generation and enhance interpretability through prescriptive learning of parallel learning. According to the ACP methodology, we perform parallel execution with prescriptive learning to guide the artificial medical imaging systems to collect specific representative image data based on the extracted or prior medical descriptions and knowledge.

For instance, based on the prior medical knowledge that mammograms with spiculated and irregular masses are mostly malignant, we can prescriptively generate various irregular and spiculated mass images with associated pleomorphic calcifications for malignant breast cancer analysis in mammograms [47]. As a result, the visual interpretation of the diagnostic results is enhanced through prescriptive learning, which effectively captures the relationship between malignancy and the clinical descriptions.

IV Case Studies of PMI

IV-A Analysis of mass in mammograms

To validate the effectiveness of the proposed PMI framework, we perform a case study of mammogram analysis in this subsection. The clinical descriptions from the standard Breast Imaging Reporting and Data System (BI-RADS) [48] are given in Table I and Table II, which explicitly state the domain knowledge. Similar to the work in [47], once the relationship between malignancy and the clinical descriptions listed in Table II is captured, the interpretability of the diagnosis can be enhanced. For visual results and diagnosis, the visual diagnosis models are trained for visual information extraction such as detection, segmentation and classification. Built upon PMI, we present an implementation based on GANs with image data collection, medical description of knowledge and parallel learning. The overall framework is illustrated in Fig. 6. Due to the page limit, we only study the problem of local X-ray breast mass classification (benign/malignant) for diagnosis. Another case study, on X-ray breast mass segmentation based on the ACP methodology, can be found in [31].

IV-B Implementation

Firstly, we achieve malignancy extraction by predictive learning. In particular, we apply a simple CNN to perform predictive learning to classify the mass image with corresponding masks as malignant or benign in the step of computational experiments. The CNN architecture with details is exhibited in Fig. 7. Cross Entropy is used as a loss function which is computed by:

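$$L = -\frac{1}{n}\sum_{\substack{i=1\\(x,y)}}^{n}\left[l^{i}\log(p^{i}) + (1-l^{i})\log(1-p^{i})\right],$$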

where $(x,y)$ denotes the input pair, $n$ is the number of training samples, $l$ represents the actual label, with $1$ denoting malignant and $0$ denoting benign, and $p$ is the predicted value.
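The cross-entropy loss above can be checked numerically with a short sketch. It assumes labels are encoded as 1 for malignant and 0 for benign, which the $(1-l)$ term implies, and that $p$ is the predicted probability of malignancy. The clipping constant `eps` and the example values are hypothetical additions for numerical safety and illustration.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """L = -(1/n) * sum_i [ l_i*log(p_i) + (1 - l_i)*log(1 - p_i) ]."""
    n = len(labels)
    total = 0.0
    for l, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # assumption: clip to avoid log(0)
        total += l * math.log(p) + (1 - l) * math.log(1.0 - p)
    return -total / n

# Hypothetical batch: one malignant (1) and one benign (0) mass.
uninformative = binary_cross_entropy([1, 0], [0.5, 0.5])
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
```

A classifier that always predicts 0.5 incurs a loss of $\log 2$, while confident correct predictions drive the loss toward zero and confident wrong ones are penalized heavily.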

For the step of descriptive learning, inspired by the idea that adversarial learning is a special case of parallel learning [19], we introduce a generative adversarial network structure for descriptive mass image generation in the artificial imaging system. Specifically, a conditional GAN (cGAN) structure is introduced to generate images from given binary masks $x$, which already incorporate the descriptive shape and margin information. The generator $G$ and discriminator $D$ of the GAN are trained to learn the distribution of mass images as well as a mapping $G:\{x,z\}\rightarrow y$ from masks $x$ and random noise $z$ to real mass images $y$. Inspired by [49], a U-Net structure is used as the generator and a PatchGAN architecture is used as the discriminator. To learn an effective generator $G$ in artificial imaging systems based on adversarial learning, we set the objective function as below:

[TABLE]

The discriminator $D$ is learned as described in [50] and aims to distinguish whether its input is real or synthetic. $G$ and $D$ are optimized alternately until convergence. We then obtain an effective generator, which is the part of the artificial imaging system that performs synthetic data generation. The combination of the ‘small’ set of real images and the ‘big’ set of synthetic ones forms a large-scale training set that can be fed into predictive learning for ‘small’ knowledge extraction.
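For orientation, a U-Net generator paired with a PatchGAN discriminator is commonly trained with a conditional adversarial loss plus an L1 reconstruction term. Assuming the objective here takes that common form (an assumption: the exact terms and the weight $\lambda_{L1}$ are not restated in this section), it would read:

```latex
\begin{aligned}
\mathcal{L}_{cGAN}(G,D) &= \mathbb{E}_{x,y}\left[\log D(x,y)\right] + \mathbb{E}_{x,z}\left[\log\left(1-D(x,G(x,z))\right)\right],\\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\left[\lVert y-G(x,z)\rVert_{1}\right],\\
G^{*} &= \arg\min_{G}\max_{D}\ \mathcal{L}_{cGAN}(G,D)+\lambda_{L1}\,\mathcal{L}_{L1}(G).
\end{aligned}
```

Under this reading, $D$ is updated to increase $\mathcal{L}_{cGAN}$ while $G$ is updated to decrease the combined objective, which corresponds to the alternating optimization described above.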

As shown in Fig. 5, the data-to-knowledge workflow proceeds bottom-up, learning visual and descriptive medical knowledge from existing training samples. Conversely, through knowledge-to-data inference in a top-down manner, medical knowledge guides image augmentation, selection, and generation, and enhances the interpretability of diagnosis. In this case, prescriptive learning is adopted to generate specific malignant/benign mask images with the corresponding shape and margin based on the descriptively extracted knowledge or prior medical knowledge. To this end, a deep convolutional generative adversarial network (DCGAN) [51] is implemented to generate the specific binary masks. To utilize the medical knowledge that malignant masses typically present with an irregular shape and spiculated margin, whereas benign masses present with an oval shape and circumscribed margin [47, 48], we train two generators separately through adversarial learning in the prescriptive scheme, one for benign and one for malignant binary mask generation. In this work, 37 benign and 75 malignant masks are augmented into 296 benign and 600 malignant masks for training the DCGAN model. 262 benign and 409 malignant masks are then obtained and used to generate the corresponding mass images through the cGAN model previously trained in the step of descriptive learning. Some generated masks are shown in Fig. 8. Hence, the generative models obtained from the introduced DCGAN achieve knowledge-driven data generation.

By feeding the generated benign and/or malignant binary masks into the artificial imaging system through descriptive learning, more specific, realistic-looking lesion images can be collected under interpretable conditions such as the margin and shape of masses. We can then extract more suitable medical knowledge through predictive learning in a data-driven way for the final diagnosis. The overall framework jointly employs image data collection and medical knowledge extraction in a closed loop through data-to-knowledge predictive learning and knowledge-to-data prescriptive learning. In this way, parallel data-knowledge-driven optimization is achieved.

IV-B1 Dataset of Mammograms and Evaluation Criteria

Experiments are conducted on the publicly available INbreast dataset [52], one of the most widely used datasets for mammogram analysis. The INbreast dataset was created by the Breast Research Group, INESC Porto, Portugal, and consists of a total of 115 cases (410 images), including 107 images of cancer and 236 images of normal breast. In this work, local ROIs from the 107 mass images with cancers are cropped to $256\times 256$ pixels, and the same operation is applied to the corresponding masks. Because some of these cases contain more than one mass, a total of 112 square mass images is obtained. They are annotated (benign or malignant) according to the Breast Imaging Reporting and Data System (BI-RADS), a standard developed by the American College of Radiology (ACR)[52], as listed in Table II. In this work, 36 masses with BI-RADS Category $\in\{2,3\}$ are categorized as benign, and 76 masses with BI-RADS Category $\in\{4,5,6\}$ are categorized as malignant.
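The labeling rule above can be written as a small lookup. The sketch below is illustrative only: the function name is hypothetical and the example category list is made up rather than taken from INbreast.

```python
def birads_to_label(category):
    """Map a BI-RADS assessment category to the binary label used in this case study."""
    if category in (2, 3):
        return "benign"
    if category in (4, 5, 6):
        return "malignant"
    # Other categories are not assigned a mass label by the rule stated above.
    raise ValueError(f"BI-RADS category {category} has no benign/malignant label here")


# Hypothetical categories, for illustration only (not INbreast data).
example_categories = [2, 3, 4, 5, 6, 5]
example_labels = [birads_to_label(c) for c in example_categories]
```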

The performance is analyzed using standard metrics for binary classification, including overall accuracy, sensitivity, and specificity, which are defined as

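$$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN},\qquad \text{Sensitivity}=\frac{TP}{TP+FN},\qquad \text{Specificity}=\frac{TN}{TN+FP}$$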

where TP, TN, FP, and FN denote the numbers of true positive, true negative, false positive, and false negative detections, respectively. Moreover, ROC (Receiver Operating Characteristic) curves and their AUCs (Area Under the Curve) are also used to evaluate the performance of the classification model. The ROC curve plots the true positive rate (vertical axis) against the false positive rate (horizontal axis). A larger AUC indicates better performance.
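These metrics can be sketched in a few lines. The example below uses the standard definitions of accuracy, sensitivity, and specificity, and computes the AUC through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half). The toy labels, scores, and the 0.5 decision threshold are assumptions for illustration.

```python
def confusion_counts(labels, preds):
    """Return (TP, TN, FP, FN) with 1 = malignant (positive) and 0 = benign."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    tn = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 0)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not results from this work).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
acc, sen, spe = classification_metrics(*confusion_counts(labels, preds))
auc = roc_auc(labels, scores)
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the chosen threshold, which is why it is reported alongside them.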

IV-B2 Experimental Results

For the experiments, 4-fold cross-validation is carried out, which ensures that all samples are tested equally and prevents bias. The real samples are augmented into 512 samples and divided into four folds. Each fold contains 128 real samples, with 48 benign masses and 80 malignant masses. Three folds are used for training and the remaining one for testing. First, we perform predictive binary classification on real data using the CNN model, which we term Pred (baseline). Then we apply the descriptively trained cGAN to generate synthetic mass images and perform predictive learning using the CNN on the combination of real and synthetic images for binary classification, which we term Pred+Desc. Finally, to validate the effectiveness of utilizing domain knowledge by prescriptive learning, we apply the introduced DCGAN to generate benign and malignant binary masks, followed by synthetic mass image generation from these masks through the descriptively trained cGAN. As a result, a synthetic dataset with 1040 benign samples and 1636 malignant samples is formed and combined with three folds of real images to form a new training set for the CNN model. We term this procedure Pred+Desc+Pres. All tests are conducted on the same real data with the same training parameter settings. Experimental results are listed in Table III.
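The fold layout described above (512 augmented real samples, four folds of 48 benign and 80 malignant each) can be reproduced with a simple stratified split. This is a sketch under assumptions: the label encoding, the seed, and the helper name are hypothetical, and the actual assignment of samples to folds is not specified here.

```python
import random


def stratified_folds(labels, k=4, seed=0):
    """Split sample indices into k folds with equal class counts per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin across folds
    return folds


# 0 = benign, 1 = malignant: 4 x 48 = 192 benign and 4 x 80 = 320 malignant samples.
labels = [0] * 192 + [1] * 320
folds = stratified_folds(labels, k=4)

splits = []
for test_id in range(4):
    test_idx = folds[test_id]
    train_idx = [i for f, fold in enumerate(folds) if f != test_id for i in fold]
    splits.append((train_idx, test_idx))
```

Whether augmented copies of the same mass are kept within a single fold is not stated; grouping them is a common precaution against leakage between training and testing data.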

As shown in Table III, the conventional CNN trained on real images with augmentation achieves an average accuracy of 0.85 for malignancy classification. After adding synthetic mass images generated by descriptive adversarial learning to the training set, the average accuracy improves to 0.885. In addition, specific types of binary masks and the related mass images are prescriptively generated in a knowledge-driven way to enlarge the variation of the training data. In this step, medical knowledge, namely that benign masses typically present with an oval shape and circumscribed margin while malignant masses present with an irregular shape and spiculated margin, is utilized for data generation and collection. Through such closed-loop optimization on existing real data, the average accuracy further rises to 0.902. The CNN trained on the descriptively and prescriptively generated data achieves higher classification accuracy than the model trained on the real set alone, under the same testing set and training parameter settings. Besides, as shown in Table III, the AUC improvements also demonstrate the effectiveness of the proposed optimization framework. These results indicate that predictive learning from descriptively generated samples in the artificial imaging system boosts the performance of computational experiments for medical knowledge extraction. Generating the desired binary masks for image synthesis from prior medical knowledge through prescriptive learning in our GAN-based PMI framework further improves the accuracy and can enhance the interpretability of the diagnosis. This GAN-based PMI framework can be readily generalized to other medical imaging tasks. In summary, our proposed data-knowledge-driven PMI framework is capable of describing, predicting, and prescribing the correlation between image data and medical knowledge in real complex imaging systems.

IV-C Analysis of skin lesion images

In the previous case study, we mainly focused on the data-driven side of mass classification in mammograms, where our major concern was data generation. For the step of prescriptive learning, we only prescriptively generated specific malignant/benign masks based on the prior knowledge that malignant masses typically present with an irregular shape and spiculated margin, and benign masses with an oval shape and circumscribed margin. In this case study of skin image analysis, we focus more on the knowledge-driven side of skin image classification based on PMI.

In particular, based on the PMI framework, we introduce a deep relation model embedded with hand-crafted features [8] for skin lesion classification. The motivation for employing domain knowledge is twofold. First, dermatologists classify skin lesions based on symptoms such as color, margin, texture, and shape. Hence, we propose to extract hand-crafted features that capture this statistical information as prior domain knowledge for diagnosis support. Second, a deep meta-learning strategy is introduced in this work to extract transferable knowledge.

IV-C1 Implementation

For the step of descriptive learning for synthetic data generation, simple data augmentation with random rotation, horizontal and vertical flips, and scaling is used to enlarge the training dataset and alleviate the problem of overfitting.
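A minimal sketch of such an augmentation step is given below, operating on images stored as nested lists so that it has no dependencies. It makes several simplifying assumptions that are not taken from this work: rotations are restricted to multiples of 90 degrees, scaling is a nearest-neighbour zoom about the image centre, and the zoom range of 0.9 to 1.1 and the flip probability of 0.5 are arbitrary.

```python
import random


def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img, k=1):
    """Rotate counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        img = [list(col) for col in zip(*img)][::-1]  # transpose, then reverse rows
    return [list(row) for row in img]


def zoom(img, factor):
    """Nearest-neighbour zoom about the image centre, keeping the original size."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sy = min(max(round(cy + (y - cy) / factor), 0), h - 1)
            sx = min(max(round(cx + (x - cx) / factor), 0), w - 1)
            row.append(img[sy][sx])
        out.append(row)
    return out


def augment(img, rng):
    """Apply a random rotation, random flips, and a random zoom."""
    out = rot90(img, rng.randrange(4))
    if rng.random() < 0.5:
        out = hflip(out)
    if rng.random() < 0.5:
        out = vflip(out)
    return zoom(out, rng.uniform(0.9, 1.1))


rng = random.Random(0)
image = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
augmented = [augment(image, rng) for _ in range(8)]
```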

For predictive learning with prescriptive knowledge, inspired by the relation model [54], which employs transferable knowledge for classification, we introduce a deep hashing relation model to leverage prior experience learned from classifying skin lesion images with limited samples. According to dermatologists’ prior experience and existing medical knowledge, clinical dermoscopic symptoms such as the number of clinically significant colors, lesion shapes, sizes, and local textures are crucial for skin screening [55, 56, 57]. For instance, a single lesion with variegated tonalities of color, asymmetry in shape, or a prominent network is more likely to be melanoma[58, 5]. Since hand-crafted features approximate the visual symptoms through mathematical and statistical formulations designed by humans without any training images, they can serve to represent the prior domain knowledge in this work. Hence, we propose to embed LBP, HOG, SIFT, Gabor, Color-Names, GLCM, CIA-LVQ, and Canny as hand-crafted features to capture the prior knowledge of texture, edge, color, and so on[8, 59]. To further employ the domain knowledge prescriptively for skin cancer screening with very few labeled skin lesion images, a few-shot learning phase is applied in this work. During meta-learning, the model learns a deep distance metric for classification, through which transferable knowledge is learned [54]. In summary, we introduce a novel deep relation network, trained via few-shot learning and prescriptively embedded with hand-crafted features, for skin lesion classification. The framework is illustrated in Fig. 9. Our goal is to regress the relation score $J_{i,j}$ to the ground truth, where a mismatched pair has similarity 0 and a matched pair has similarity 1. The mean square error loss is used to train the model, and the objective function is

[TABLE]

where $\lambda||H-sgn(H)||_{p}^{p}$ is a penalty term introduced by [60], $\lambda=0.1\times\frac{1}{NK}$, $N$ is the number of inputs, and $K$ is the hash code length. In this work, $H$ is the output of the fusion layer.
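To make the two terms concrete, the sketch below evaluates a mean squared error between relation scores and their 0/1 match targets and adds the quantization penalty with $\lambda=0.1\times\frac{1}{NK}$. Several details are assumptions rather than facts from this work: the error is averaged over pairs, $p=2$, $sgn(0)$ is taken as $+1$, and the function and argument names are hypothetical.

```python
def sign(v):
    """sgn(v), with sgn(0) taken as +1 (an assumption)."""
    return 1.0 if v >= 0 else -1.0


def relation_hashing_loss(scores, targets, codes, p=2, weight=0.1):
    """MSE on relation scores plus the penalty lambda * ||H - sgn(H)||_p^p.

    scores:  relation scores J_{i,j}, flattened over pairs.
    targets: 1 for matched pairs, 0 for mismatched pairs.
    codes:   N rows of K real-valued fusion-layer outputs (the matrix H).
    """
    n, k = len(codes), len(codes[0])
    lam = weight / (n * k)  # lambda = 0.1 * 1 / (N K)
    mse = sum((s - t) ** 2 for s, t in zip(scores, targets)) / len(scores)
    penalty = sum(abs(h - sign(h)) ** p for row in codes for h in row)
    return mse + lam * penalty


# Toy values for illustration only.
scores = [0.9, 0.2]
targets = [1, 0]
codes = [[0.5, -1.0], [1.0, -0.5]]  # N = 2 inputs, K = 2 bits
loss = relation_hashing_loss(scores, targets, codes)
```

The penalty vanishes when every entry of $H$ is exactly $\pm 1$, so it pushes the real-valued fusion output toward binary hash codes without requiring a non-differentiable sign operation during training.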

IV-C2 Dataset of Skin Lesions and Evaluation Criteria

Experiments are conducted on the ISIC Skin 2017 dataset [61], which contains 2000, 150, and 600 skin lesion images for training, validation, and testing, respectively. Each lesion image is paired with a gold-standard diagnosis, i.e., melanoma, nevus, or seborrheic keratosis. In this work, we consider the sub-task of melanoma classification (melanoma vs. others). We collected 1320 additional dermoscopy images from the ISIC Archive to enlarge the training dataset[9]. Similar to mass classification, the evaluation criteria of AUC, accuracy, sensitivity, and specificity defined in IV-B1 are used as performance metrics.

IV-C3 Experimental Results

Comparison experiments with other work are conducted to validate the effectiveness of the approach. First, we perform predictive binary melanoma classification using a conventional relation model with only the deep feature $f_{net}$, which we term Pred. Then, to utilize the domain knowledge through prescriptive learning, we embed the prior hand-crafted feature $f_{hand}$ together with the deep feature in the meta-learning scheme. In this step, a transferable knowledge representation is also obtained through few-shot learning. This procedure is termed Pred+Pres. To validate the effectiveness of the introduced PMI, a simple augmentation of the training samples is performed to act as the step of descriptive learning. The whole PMI procedure is termed Pred+Pres+Desc.

Experimental results and comparisons with ARLCNN[9], SDL[53], and the top two ranking results on the ISIC 2017 challenge leaderboard are listed in Table IV. As shown in Table IV, the introduced PMI framework achieves better performance, with an AUC of 0.912, than the state-of-the-art methods. Besides, when we further combine prescriptive meta-learning with predictive learning, the AUC of Pred+Pres improves by 32.4% over the Pred baseline and reaches 0.883, the best AUC compared with other work. These results indicate that incorporating hand-crafted features carrying domain knowledge, together with the meta-learning scheme with prior knowledge through prescriptive learning in PMI, is critical for the task of melanoma classification. Simple augmentation in the step of descriptive learning in PMI further boosts the performance slightly. This further validates that the introduced data-knowledge-driven PMI framework is effective for medical image analysis.

V Conclusion

In this paper, we propose a data-knowledge-driven framework termed PMI for vision-based intelligent medical image analysis. Artificial imaging systems with descriptive learning make it possible to collect large-scale synthetic and real images for training and evaluating models in the computational experiments. Through top-down knowledge-to-data inference with prescriptive learning, we can select and generate specific image data based on prior or extracted medical domain knowledge. Through bottom-up data-to-knowledge inference with predictive learning, we can extract medical knowledge for clinical diagnostic support systems. Through parallel execution, a ‘large’ scale of medical image data is collected from a ‘small’ set of real images, followed by ‘small’ intelligence with interpretable medical knowledge extraction. Experimental results from the case studies also demonstrate that the data-knowledge-driven PMI scheme alleviates the limitation of the small quantity of available medical images and enhances the interpretability of the final diagnosis and prognosis with more descriptive information.

Future work will focus on expanding the proposed PMI framework beyond diagnostic decision support in medical imaging. For the foreseeable future, the field of parallel medical imaging has tremendous potential to supplement and verify the work of clinicians, train radiologists to be more skilled, support surgical planning, enable intra-operative navigation, give personalized medicine recommendations, and visualize medical images with interpretable masks, particularly in the complex field of imaging analytics for complicated diseases.

Bibliography

[1] D. Shen, G. Wu, and H.-I. Suk, “Deep learning in medical image analysis,” Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.
[2] Z. Hu, J. Tang, Z. Wang, K. Zhang, L. Zhang, and Q. Sun, “Deep learning for image-based cancer detection and diagnosis—a survey,” Pattern Recognition, 2018.
[3] F.-Y. Wang and P. K. Wong, “Intelligent systems and technology for integrative and predictive medicine: An ACP approach,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 4, no. 2, p. 32, 2013.
[4] F.-Y. Wang, “Parallel healthcare: Robotic medical and health process automation for secured and smart social healthcares,” IEEE Transactions on Computational Social Systems, vol. 7, no. 3, pp. 581–586, 2020.
[5] H. Haenssle, C. Fink, R. Schneiderbauer, F. Toberer, T. Buhl, A. Blum, A. Kalloo, A. Hassen, L. Thomas, A. Enk et al., “Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists,” Annals of Oncology, 2018.
[6] A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, “Dermatologist-level classification of skin cancer with deep neural networks,” Nature, vol. 542, no. 7639, p. 115, 2017.
[7] F. Wang, C. Gou, J. Wang, T. Shen, W. Zheng, and H. Yu, “Parallel skin: a vision based dermatological analysis framework,” Pattern Recognition and Artificial Intelligence, vol. 32, no. 7, pp. 577–588, 2019.
[8] W. Zheng, C. Gou, and L. Yan, “A relation hashing network embedded with prior features for skin lesion classification,” in International Workshop on Machine Learning in Medical Imaging. Springer, 2019, pp. 115–123.