key: cord-0616299-hujo176x authors: Li, Johann; Zhu, Guangming; Hua, Cong; Feng, Mingtao; Bennamoun, Basheer; Li, Ping; Lu, Xiaoyuan; Song, Juan; Shen, Peiyi; Xu, Xu; Mei, Lin; Zhang, Liang; Shah, Syed Afaq Ali; Bennamoun, Mohammed title: A Systematic Collection of Medical Image Datasets for Deep Learning date: 2021-06-24 journal: nan DOI: nan sha: e34bf6a6d8f8f54d10b8b670fe0f5991e3393788 doc_id: 616299 cord_uid: hujo176x The astounding success made by artificial intelligence (AI) in healthcare and other fields proves that AI can achieve human-like performance. However, success always comes with challenges. Deep learning algorithms are data-dependent and require large datasets for training. The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis. Medical image acquisition, annotation, and analysis are costly, and their usage is constrained by ethical restrictions. They also require many resources, such as human expertise and funding. This makes it difficult for non-medical researchers to access useful and large medical data. Thus, this paper provides, as comprehensively as possible, a collection of medical image datasets with their associated challenges for deep learning research. We have collected information on around three hundred datasets and challenges, mainly reported between 2013 and 2020, and categorized them into four categories: head&neck, chest&abdomen, pathology&blood, and ``others''. Our paper has three purposes: 1) to provide the most up-to-date and complete list that can be used as a universal reference to easily find datasets for clinical image analysis, 2) to guide researchers on the methodology to test and evaluate their methods' performance and robustness on relevant datasets, 3) to provide a ``route'' to the relevant algorithms for the relevant medical topics and challenge leaderboards. 
With the invention of medical imaging technology, the field of medicine entered a new era. Medical imaging began with the adoption of X-rays. With further technical advancements, many other imaging methods, including 3D computed tomography (CT), magnetic resonance imaging (MRI), nuclear medicine, ultrasound, endoscopy, and optical coherence tomography (OCT), were also exploited. Directly or indirectly, these imaging modalities have contributed to the diagnosis and treatment of various diseases, and to research on the human body's structure and intrinsic mechanisms. Medical images can provide critical insight into the diagnosis and treatment of many diseases. The human body's different reactions to imaging modalities are used to produce scans of the body. Reflection and transmission are commonly used in medical imaging because the reflection or transmission ratios of different body tissues and substances differ. Some other methods acquire images by changing the energy transferred to the body, e.g., through magnetic field changes or the rays radiated from a chemical agent. Before modern AI was applied to medical image analysis, radiologists and pathologists needed to manually look for critical "biomarkers" in the patient's scans. These "biomarkers", such as tumors and nodules, are the basis on which medics diagnose and devise treatment plans. Such a diagnostic process needs to be performed by medics with extensive medical knowledge and clinical experience. However, problems such as diagnostic bias and the lack of medical resources are prevalent and cannot be avoided. After the recent breakthroughs in AI (which achieves human-like performance, e.g., for image recognition [1, 2, 3], and can win games such as Go [4] and real-time strategy games [5]), the development of AI-based automatic medical image analysis algorithms has attracted significant attention. 
Recently, the application of AI in medical image analysis has become one of the major research focuses and has attained many significant achievements [6, 7, 8]. Many researchers have turned their focus to AI-based medical image analysis methods, viewing them as a potential solution to challenges such as medical resource scarcity, and taking advantage of technological progress [9, 10, 11, 12, 13]. Traditional medical image analysis focuses on detecting and identifying biomarkers for diagnosis and treatment. AI imitates the medic's diagnosis through classification, segmentation, detection, regression, and other AI tasks in an automated or semi-automated way. AI has achieved significant performance on many computer vision tasks, but this success is yet to be fully translated to the medical image analysis domain. Deep learning (DL), a branch of AI, is a data-dependent method, as it needs massive training data. However, when DL is applied to medical image analysis, the paucity of labeled data becomes a major challenge and a bottleneck. Data scarcity is a common problem when applying DL methods to a specific domain, and this problem becomes more severe in the case of medical image analysis. Researchers who apply DL methods to medical image analysis are commonly computer scientists and do not usually have a medical background. They cannot collect data independently because they lack access to medical equipment and patients, and they cannot annotate the acquired data either because they lack the relevant medical knowledge. Furthermore, medical data is owned by institutions which cannot easily make it public due to privacy and ethics restrictions. When researchers evaluate their algorithms on their private data, the results of their research become incomparable. To address some of these problems, MICCAI, ISBI, AAPM, and other conferences and institutions have launched many DL-related medical image analysis challenges. 
These aim to design and develop automatic or semi-automatic algorithms and promote medical image analysis research with computer-aided methods. Concurrently, some researchers and institutions also organize projects to collect medical datasets and publish them for research purposes. Despite all these developments, it is still challenging for novice medical image analysis researchers to find medical data. This paper addresses this challenge and presents a comprehensive survey of existing medical datasets. The paper also identifies and summarizes medical image analysis challenges. It also provides a pathway to identify the most relevant datasets for evaluation and the suitable methods in the respective challenge leaderboards. This paper refers to other research papers with a number between square brackets and refers to the datasets listed in the tables with numbers between parentheses. The following sections present the details of the key datasets and challenges. Section 2 summarizes the datasets and challenges, including the years, body parts, tasks, and other information relevant to dataset development. Section 3 discusses the datasets and challenges of the head and neck. Section 4 covers the datasets and challenges related to the chest and abdomen organs. Section 5 examines the datasets and challenges of pathology and blood-related tasks. Section 6 introduces other datasets and challenges related to bone, skin, phantoms, and animals. We have also created a website with a git repo 1 , which shows the list of these datasets and their respective challenges. In this section, we provide an overview of the image datasets and challenges. Our collection contains over three hundred medical image datasets and challenges organized between 2004 and 2020. This paper focuses mainly on those between 2013 and 2020. Subsections 2.1, 2.2, 2.3, and 2.4 provide information about the year, body parts, modalities, and tasks, respectively. 
In Subsection 2.5, we introduce the sources from which we collected these datasets and challenges. Details about the categorization of these image datasets and challenges into four groups are provided in the subsequent sections. We provide a taxonomy of our paper in Figure 1 to help the reader navigate through the different sections. The timeline of these medical image datasets can be split into two periods, with 2013 as the watershed, following Krizhevsky et al.'s success in the ILSVRC competition with AlexNet [14] in 2012. The continuous advancement of deep learning has, to some extent, driven more and more researchers to focus on medical image analysis and indirectly led to an increase in the number of datasets and competitions each year. According to our statistics, the number of datasets and challenges before 2013 is irregular. The main reason is that many datasets developed before 2012 were not aimed at computer-aided diagnosis, for example, ADNI-1 (52), although those data could be used for DL. Therefore, we only focus on the datasets and challenges released after 2013. Figure 2A shows the statistics of the datasets and challenges per year between 2013 and 2020. As shown in Figure 2A, the number of related datasets and challenges increased year by year because of the progress and success of DL in computer vision and medical image analysis. This led more and more researchers to focus on medical image analysis with DL-based methods, and more and more datasets and challenges with different body parts and tasks started to appear. As shown in Figures 2B and 2C, there was not only an increase in the number of datasets and challenges but also in their variety with respect to body parts and types of tasks. The research focus ranges from simple diagnosis or structural analysis (e. 
g., segmentation and classification) in the early stages to, as time progresses, more complex tasks or combinations of tasks that are closer to clinical needs, including classification, segmentation, detection, regression, generation, tracking, and registration. The focus of these datasets and challenges has also changed from cancer diagnosis to the entire healthcare system. Meanwhile, the organs focused on by researchers also range from single, simple, but important ones, such as the brain and lungs, to many other parts of the human body, with different sizes, shapes, and other characteristics. With the success of DL, the number of targeted body parts has increased, as shown in Figure 2B. We also show the most researched body parts in Figure 2E; the top-5 researched organs are the brain, lung, heart, eye, and liver. These organs have been the focus of research because they are among the most important parts of the human body. In the beginning, the main reason which motivated researchers to focus on these organs and parts was that a simple diagnosis and a structural study greatly helped in the diagnosis and treatment of cancer (a major threat to human life). Many datasets focus on the brain, lung, and other organs without considering DL, and many challenges focus on simple tasks, such as segmentation and classification. Subsequently, AI proved competent at tackling more complex tasks, and researchers therefore started to focus on several other organs. For example, eye-related diseases, which cause blindness, incited the collection of eye-related datasets and the release of challenges. Some other datasets and challenges focus on small organs, such as the prostate, which are challenging to analyze due to the low resolution of images. There are several types of medical image modalities. 
As shown in Figure 2F, the frequently used modalities to acquire medical datasets include Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Ultrasound (US), Endoscopy, Positron Emission Tomography (PET), Computed Radiography (CR), Electrocardiography, and Optical Coherence Tomography (OCT). We introduce these main modalities below and provide a summary at the end of this subsection. Radiography: Radiography is an imaging technique based on the difference in attenuation when X-rays pass through the different organs and tissues of the human body. The primary modalities include CR and CT. CR produces a 2D image, while CT produces a volumetric (3D) image. Radiography is the most commonly used method to image the human body. For example, CR is frequently used to diagnose chest-related diseases, such as pneumonia, tuberculosis, and COVID-19. Meanwhile, 3D CT plays an important role in diagnosis and treatment related to cancer and lesions. The advantages of radiography are 1) high resolution of hard tissues (e.g., bones), 2) lower cost, and 3) compatibility with contrast agents, but the disadvantages are 1) X-rays are harmful to human health, 2) X-rays are poor at distinguishing between healthy tissues and tumors without the help of contrast agents, and 3) their resolution is limited by the radiation intensity. Moreover, as the main component of human bone is calcium, CT plays an important role in many bone-related diagnoses. Magnetic resonance: MR images capture the body's structure based on the differences in signal released by the different substances of the imaged organ as the magnetic field changes. MR has many submodalities, such as T1 and T2. For essential organs and tissues, MR is a commonly used imaging method because it is considered non-invasive, effective, and safe. Due to the principle of MR imaging, MR plays an essential role in the diagnosis of the brain, heart, and soft tissues. 
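The attenuation principle behind radiography can be summarized by the Beer-Lambert law (a standard physics result, added here for clarity; it is not stated in this form in the original text). For a homogeneous material of thickness $x$ and linear attenuation coefficient $\mu$, the transmitted intensity is

\[
I = I_0 \, e^{-\mu x},
\]

and for a ray crossing a sequence of tissues with coefficients $\mu_i$ and thicknesses $x_i$,

\[
I = I_0 \exp\!\Big(-\sum_i \mu_i x_i\Big).
\]

Because $\mu$ differs strongly between bone and soft tissue, the transmitted intensity pattern forms the radiographic image; the small $\mu$ contrast between healthy tissue and tumors is precisely why contrast agents are often needed.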
Because higher-resolution MR images can be obtained by increasing the magnetic field strength, MR is also suitable for small organs or tissues. However, MR images do have disadvantages, such as high cost and incompatibility with metal (e.g., metallic orthopedic implants). Nuclear medicine: Nuclear medicine captures images through the absorption, by the targeted tissue, of specific chemical compounds marked with radioactive isotopes. Tumors and healthy tissues absorb different chemical compounds, so medics administer a specific chemical marked with a radioactive isotope and detect the rays it radiates. An example is Positron Emission Computed Tomography (PET), which performs imaging by capturing the radiation produced by fluorodeoxyglucose or similar contrast agents absorbed by the tissue or tumor. Nuclear medicine is good at imaging regions of interest, such as tumors, but its disadvantages are high cost and low resolution. Ultrasound: Ultrasound operates by capturing the differences in the absorption and reflection of ultrasound waves applied to tissues. It is widely used in imaging the heart and fetus because ultrasound causes no damage to these parts and provides real-time imaging. Nevertheless, its main disadvantage is the noise caused by reflections from the irregular shapes of organs and tissues, which interferes with imaging. Eye-related modalities: An OCT image is obtained by using low-coherence light to capture 2D and 3D micrometer-resolution images within optical scattering media to diagnose eye-related diseases. The fundus photo is also used for diagnostic purposes. These two methods are non-invasive, eye-specific imaging modalities. Pathology: Pathological data is the gold standard in diagnosing diseases. Pathology images are captured by photographing stained tissue slides under a microscope to show cell-level features. Pathology is used in cell-level diagnosis for cancers and tumors. 
Other modalities: Other imaging modalities, such as endoscopy, are common but specific to certain body parts, and provide medics with various biomarkers to make critical decisions in diagnosis, treatment, and research. Overall, MR, CT, and a few other modalities are the most commonly used imaging modalities. MR can provide sharp images of soft tissues without harmful radiation. It is therefore widely used in imaging the brain, heart, and many other small organs. CT is an economical and simple imaging approach, and it is widely used for the diagnosis of cancers of, e.g., the neck, chest, and abdomen. A pathology image is different from MR and CT because it is a cell-level imaging method. Pathology is widely used in cancer-related diagnosis. According to our analysis, our collected datasets and challenges have been used for the tasks of classification, prediction, detection, segmentation, localization, characterization, up-sampling, tracking, registration, regression, estimation, coding, automatic annotation, and other tasks. As Figure 2G shows, we grouped these tasks into seven categories: classification, segmentation, detection, regression, generation, registration, and tracking. The following subsections briefly describe each task. Classification: Classification is used for qualitative analysis. According to pre-defined rules, the classification task aims to group medical images or particular regions of an image into two or more distinct categories. The classification task can be used alone for medical image analysis or as a task subsequent to other lower-level tasks, such as segmentation and detection, in order to analyze the results and further extract features. Classification tasks are sometimes phrased in other ways, such as detection or prediction. Such classification-style detection tasks are different from the localization-based detection tasks introduced below, although the same word is sometimes used for both. 
Typical examples of classification tasks include AD prediction and the attribute classification of pulmonary nodules. AD prediction aims to group MR images into Alzheimer's disease (AD) and normal cognition (NC) classes. The attribute classification of pulmonary nodules aims to analyze the pathological attributes of pulmonary nodules. Classification performance measures mainly include accuracy, precision, specificity, sensitivity, F-score, ROC, and AUC. All these measures are based on four basic counts: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). Segmentation: The segmentation task can be regarded as a pixel-level or voxel-level classification task, with the difference that the segmentation task must take the spatial context into account. It aims to split an image into different areas or to contour specific regions. The regions can contain a tumor, tissue, or other specific targets. The results of the segmentation task consist of areas and boundaries. Since segmentation can be seen as pixel-level classification, the average precision (AP) can be used as a metric. Other performance metrics include intersection over union (IoU), the Dice index, the Jaccard index, the Hausdorff distance, and the average surface distance. Detection: The detection task aims to find an object of interest, and it also usually needs to classify such an object (a classification task). In this work, we categorize as detection the tasks which aim to determine the location of an object of interest with a bounding box or a point. The detection task is sometimes represented as a localization task. A typical example of detection is pulmonary nodule detection, which aims to find the pulmonary nodules in chest CT images and annotate them with bounding boxes. The performance measures used in detection tasks mainly include the intersection over union (IoU), mean average precision (mAP), precision and recall, the false positive rate, the receiver operating characteristic (ROC) curve, and other metrics. 
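As an illustration, the confusion-matrix-based classification measures and the box IoU used for detection can be computed as follows (a minimal pure-Python sketch; the function names are ours, not from any dataset toolkit):

```python
def classification_metrics(tp, fp, tn, fn):
    """Measures derived from the four basic counts TP, FP, TN, FN."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall / true-positive rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union
```

For instance, a hypothetical nodule classifier with TP=8, FP=2, TN=85, FN=5 has an accuracy of 0.93, and its F1-score reduces to 2TP/(2TP+FP+FN) = 16/23.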
For tasks that locate an object without a boundary, the Euclidean distance is the most commonly used measure. Regression: Classification is used for qualitative analysis, while regression is used for quantitative analysis. A typical example is the estimation of the volume of a lesion. For the regression task, the root mean square error (RMSE), the mean absolute error, and the correlation coefficient are the most commonly used metrics. Tracking: The tracking task aims to locate specific targets, but tracking is a dynamic process and is therefore different from the detection task. This means that tracking algorithms need to detect or localize targets across different frames. For medical image analysis, tracking tasks include the tracking of organs and tissues. Tracking is not limited to a single point; it can also cover an area, e.g., every part of an organ or tissue. An example is the tracking of the lung as the subject breathes. Generation: The image data generation task has many different aims, but for simplicity we categorize all of them under the "generation task" because they focus on generating image data from other image data. Typical generation tasks include 1) generating a T2-weighted image from T1-weighted images and 2) generating a pathology image stained with one stain from an image stained with another stain. Registration: The image registration task aims to align one image with another, i.e., to find a transformation (e.g., a rotation and translation) that aligns the two images. Registration is a necessary process for computer-aided diagnosis algorithms that use multiple modalities. During medical scanning, the movement of the human body cannot be avoided, and imaging cannot be performed instantly. As a result, images from different viewpoints, or from two or more modalities, cannot be aligned directly. Therefore, researchers rely on registration techniques to solve these alignment problems. 
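The three regression metrics named above can be sketched in a few lines (an illustrative pure-Python version; the function name is ours):

```python
import math


def regression_metrics(pred, truth):
    """RMSE, mean absolute error, and Pearson correlation coefficient
    between predicted and ground-truth quantities (e.g., lesion volumes)."""
    n = len(pred)
    errors = [p - t for p, t in zip(pred, truth)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Pearson correlation: covariance normalised by the two spreads.
    mean_p, mean_t = sum(pred) / n, sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    corr = cov / math.sqrt(var_p * var_t)
    return rmse, mae, corr
```

RMSE penalizes large errors more heavily than MAE, while the correlation coefficient captures agreement in trend even when the absolute scale is off, which is why the three are usually reported together.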
We collected the datasets and challenges mainly from The Cancer Imaging Archive [15], Grand Challenge, Kaggle, OpenNeuro, PhysioNet [16], and Codalab. We originally collected four to five hundred records of datasets and challenges, and removed those that are not suitable for DL and AI methods. We then categorized the remaining datasets and challenges into different groups. Categorizing the datasets and challenges is not easy because they are all derived from clinical research sources. Thus, we used an asymmetric categorization to group these datasets and challenges into four groups, as shown in Figure 1. This means that we did not use the same sub-taxonomy in each category or sub-category. First, we split the medical datasets and challenges into two groups, body-level and cell-level (Section 5), according to the imaged body part. The body-level datasets focus on specific tissues, while the cell-level ones focus on cells. Second, we grouped the datasets and challenges of the brain, eye, and neck into one group (Section 3), because these are parts of the head. Third, we organized the datasets and challenges related to the chest and abdomen into the same group (Section 4). These datasets and challenges relate to diagnosis, anatomical segmentation, and treatment. Finally, the datasets and challenges that cannot be categorized into the above groups are grouped under "other" (Section 6); these relate to the skin, bone, phantoms, and animals. The introduction of each group and sub-group mainly covers the type of modality, the task, the disease, and the body part. However, not all groups of datasets can be introduced in that way. For some groups, we introduce the datasets and challenges according to domain-specific problems. For example, we categorize the pathology datasets into microcosmic and macrocosmic tasks. 
The head and neck are significant parts of the human body because many essential organs, glands, and tissues are located there. Much image analysis research relates to the head and neck. To make effective use of computers for research, diagnosis, and treatment, many researchers have released datasets and challenges, for example: 1) the analysis of tissue structure and functions (2, 3, 4, 6) and 2) disease diagnosis (30, 39, 47). Because the brain controls emotions, actions, and the functions of other organs, brain-related research is particularly significant. First, we introduce the datasets and challenges related to the analysis of brain structure, function, imaging, and other basic tasks in Subsection 3.1. Second, we introduce the datasets and challenges related to brain disease diagnosis in Subsection 3.2. Moreover, since the eyes are crucial to our vision, the computer-aided diagnosis of eye-related diseases is also an important research focus. The eye-related datasets and challenges are covered in Subsection 3.3. We introduce other datasets and challenges of the neck, and the datasets related to the brain's behavior and cognition, in Subsection 3.4. The basic analysis and processing of brain medical images is clinically critical for diagnosis, treatment, and other brain-related analysis tasks. The datasets and challenges we discuss are mainly for segmentation tasks and center around the brain structure. In contrast, some datasets focus on imaging, including MR imaging acceleration, the non-linear registration of different resolutions, and tissue reconstruction. One of the most popular tasks is the segmentation of white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF), and their respective datasets and challenges are introduced in Subsection 3.1.1. Meanwhile, the segmentation of other tissues and functional areas is also a focus of research, and the related datasets and challenges are discussed in Subsection 3.1.2. 
Subsection 3.1.3 describes the other basic tasks. Table 1 shows the datasets and challenges of these basic tasks. The segmentation of WM, GM, and CSF is of great significance for brain structure research and computer-aided diagnosis, particularly using AI. Similarly, for AI algorithms, it is also of great significance to understand the human brain's structure. Therefore, MICCAI and others have held many challenges with this research focus, so that researchers can design automatic algorithms to segment magnetic resonance images into different parts. We introduce these datasets and challenges with respect to their modalities and tasks. Modality: The datasets and challenges which focus on WM, GM, and CSF segmentation usually provide MR images. Challenges (2, 3, 4, 5, 6, 7) mainly provide two modalities, T1 and T2, while datasets (1, 8) only provide T1 for the white matter hyperintensities segmentation task. Note that MR scans are sensitive to the hydrogen atom, and this feature can effectively help image analysts distinguish between different tissues and parts of the image. Moreover, due to the appearance of these tissues in MR scans, they are named "white matter" and "gray matter". Task: The main focus of these datasets and challenges is the segmentation of WM, GM, and CSF, but they do not only focus on that. Challenges (1, 4, 5, 6, 7) also provide the annotation of other parts of the brain, including the basal ganglia, white matter lesions, cerebellum, and infarction. One of the challenges for segmentation is the presence of lesions, because of their irregular characteristics. Well-annotated data can help AI overcome this problem and achieve more robust results. Challenges (5, 7) use MR images of the neonatal brain and consider tissue volumes as an indicator of long-term neurodevelopmental performance [22]. 
Performance metric: For the segmentation task, the Dice score is one of the most commonly used metrics, and all these datasets and challenges adopt it as a performance measure. Besides the Dice score, datasets (4, 6, 8) also use the Hausdorff distance and volumetric similarity as metrics; datasets (2, 3) use the average Hausdorff distance and the average surface distance among their metrics; moreover, dataset (8) also uses sensitivity and the F1-score as metrics for performance evaluation. The segmentation of functional areas and tissues is also essential for brain-related research and computer-aided diagnosis. In this subsection, we introduce the datasets and challenges related to the segmentation of functional areas and tissues. Tissue segmentation: While the segmentation of WM, GM, and CSF was introduced in Subsection 3.1.1, the segmentation of other brain tissues is also an active research area. Challenges (1, 4, 6, 7) aim to segment brain images into different tissues, including the ventricles, cerebellum, brainstem, and basal ganglia. These challenges provide MR images and voxel-level annotations of the regions of interest, with thirty or forty scans each. Because these regions are essential for brain health, researchers need to overcome the challenges related to their size and shape in order to segment them. Dataset (9) focuses on cerebellum segmentation from diffusion-weighted images (DWI), while dataset (12) focuses on the segmentation of the caudate from brain MR images. The segmentation of the human brain cortex into different functional areas is of great significance in education, clinical research, treatment, and other applications. Datasets (10, 11) provide images and annotations for the design of automatic algorithms to segment the brain cortex into different functional areas. 
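For concreteness, the Dice score and the average Hausdorff distance mentioned above can be sketched for binary masks represented as sets of voxel coordinates (an illustrative pure-Python version; real evaluation code typically operates on whole volumes):

```python
import math


def dice_score(pred, truth):
    """Dice overlap of two binary masks given as sets of voxel coordinates."""
    intersection = len(pred & truth)
    return 2.0 * intersection / (len(pred) + len(truth))


def average_hausdorff(a, b):
    """Symmetric average Hausdorff distance between two voxel sets: the
    mean of the two directed average nearest-neighbour distances."""
    def directed(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (directed(a, b) + directed(b, a))
```

The Dice score rewards overlap and is insensitive to where a mismatch occurs, whereas the (average) Hausdorff distance measures how far the two boundaries stray from each other, which is why segmentation challenges often report both.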
Dataset (10) uses the DTK protocol [24], which is modified from the DK protocol [31]; the DTK protocol includes 31 labels, details of which are listed at https://mindboggle.readthedocs.io/en/latest/labels.html. Dataset (11) is a commercial dataset for research on the segmentation of the functional areas of the brain cortex. In addition to the segmentation tasks of brain tissues and functional areas, some of the datasets and challenges also focus on generation, registration, and tractography. Generation: Datasets and challenges (14, 17, 18) aim to accelerate MR imaging or to generate high-resolution MR images from low-resolution ones. Usually, high-resolution imaging incurs a higher cost, while low-resolution imaging is cheaper but affects analytical judgment and may lead to an incorrect diagnosis. These challenges provide many low-resolution scans to allow researchers to design algorithms to convert or map low-resolution images onto higher-resolution ones. These datasets and challenges mainly focus on generation tasks. Another focus is cranioplasty (13): generating the missing part of a broken skull from CT images of the defective skull. Other datasets and challenges (15, 19, 20, 21) focus on the reconstruction of MR images. Registration: The registration between different modalities is another research focus. Challenges (22, 23) focus on the registration between ultrasound data and MR images of the brain. Cross-modality registration is difficult because the subject is not absolutely static. Moreover, MR is a 3D volumetric imaging modality, whereas ultrasound is a 2D imaging modality. Thus, these challenges focus on establishing the topological relation between preoperative MR images and intraoperative ultrasound. Challenge (19) also focuses on diffusion MR image registration to eliminate differences between different vendors' hardware devices and protocols. 
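The simplest building block of such registration, a rigid 2D alignment of paired landmark points, has a closed-form least-squares solution, sketched below (an illustrative example with hypothetical point data; it is not tied to any specific challenge):

```python
import math


def rigid_register_2d(moving, fixed):
    """Closed-form least-squares rigid registration of paired 2D points:
    finds the rotation angle theta and translation t minimising
    sum |R(theta) * m_i + t - f_i|^2 over all point pairs."""
    n = len(moving)
    mx = sum(p[0] for p in moving) / n
    my = sum(p[1] for p in moving) / n
    fx = sum(p[0] for p in fixed) / n
    fy = sum(p[1] for p in fixed) / n
    # Cross-covariance terms of the centred point sets.
    sxx = sxy = syx = syy = 0.0
    for (ax, ay), (bx, by) in zip(moving, fixed):
        ax, ay, bx, by = ax - mx, ay - my, bx - fx, by - fy
        sxx += ax * bx
        sxy += ax * by
        syx += ay * bx
        syy += ay * by
    theta = math.atan2(sxy - syx, sxx + syy)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated moving centroid onto the fixed centroid.
    tx = fx - (c * mx - s * my)
    ty = fy - (s * mx + c * my)
    return theta, (tx, ty)
```

Cross-modality registration in the challenges above is far harder than this: correspondences must first be found across modalities and the deformation is usually non-rigid, so this sketch covers only the final rigid-alignment step.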
Tractography: Tractography is another segmentation task, focusing on the segmentation and imaging of the fibers in the WM. Dataset (24) aims to segment fiber bundles from brain images, including phantom, squirrel monkey, and macaque data, while challenges (25, 26) focus on tractography with DTI, another type of MR imaging. Besides structural analysis and image processing tasks, computer-aided diagnosis is also a research focus in healthcare. Medical image analysis plays a critical role in clinical research, diagnosis, and treatment. The datasets and challenges we have included cover two tasks: 1) the segmentation of lesions and tumors and 2) the classification of diseases. For the segmentation task, the respective datasets and challenges focus on the tumor and lesion segmentation of the human brain and mark the lesion's contour for diagnosis and treatment; the relevant details are given in Subsection 3.2.1. For classification tasks, the datasets and challenges have been used for the development of automatic algorithms to classify or predict diseases from medical images; these datasets and challenges are presented in Subsection 3.2.2. Tumors and lesions in the brain threaten human health and life, and image analysis is an effective way to diagnose the relevant diseases. In this subsection, the related datasets and challenges are introduced; they are reported in Table 2. Glioma datasets and challenges: Gliomas are among the most common brain malignancies in adults. Therefore, many challenges and datasets focus on the segmentation of glioma for its diagnosis and treatment. The BraTS challenge series (30, 31, 32, 33, 34, 35, 36, 37, 38) has been held since 2012 for glioma segmentation. The difficulty of this segmentation task stems from the heterogeneous appearance and shape of gliomas. 
The heterogeneity of gliomas is reflected in their shape, their appearance across modalities, and their many histological sub-regions, such as the peritumoral edema, the necrotic core, and the enhancing and non-enhancing tumor core. Therefore, this challenge series provides multi-modal MR scans to help researchers design and train algorithms to segment tumors and their sub-regions. The tasks of this challenge series include 1) low- and high-grade glioma segmentation (37, 38), 2) survival prediction from pre-operative images (32, 33), and 3) the quantification of segmentation uncertainty (30, 31). Besides the BraTS challenge series, dataset (47) also targets the segmentation of low-grade gliomas and provides T1-weighted and T2-weighted MR images with each subject's biopsy-proven gene status determined by fluorescence in-situ hybridization (FISH) [46]. Dataset (46) focuses on brain tumor processing and aims to design and evaluate DL-based automatic algorithms for glioblastoma segmentation and further research. Similar to tumor segmentation, brain lesion segmentation also focuses on detecting brain abnormalities. The difference is that lesion segmentation deals with damaged tissues. Challenges (39, 40, 41, 42, 48) focus on stroke lesion segmentation, because stroke is life-threatening, can disable surviving patients, and is often associated with high socioeconomic costs. Automatic analysis algorithms help to diagnose and treat stroke, whose manifestation is triggered by local thrombosis, hemodynamic factors, or embolic causes. In MR images, the infarct core can be identified with diffusion MR images, while the penumbra (which can be treated) can be characterized by perfusion MR images. The challenge ISLES 2015 (42) focuses on sub-acute ischemic stroke lesion segmentation and acute stroke outcome/penumbra estimation and provides 50 and 60 multi-modality MR scans for training and validation, respectively, for two subtasks, i.
e., sub-acute ischemic stroke lesion segmentation and acute stroke outcome/penumbra estimation. The subsequent year's challenge, ISLES 2016 (41), focuses on the segmentation of lesions and the prediction of the degree of disability. This challenge provides about 70 scans, including clinical parameters and MR modalities such as DWI, ADC, and perfusion maps. The challenge ISLES 2017 (40) focuses on segmentation with acute MR images, while ISLES 2018 (39) focuses on the segmentation task based on acute CT perfusion data. Moreover, dataset (48) focuses on the segmentation of the brain after stroke for further treatment. Intracranial hemorrhage related datasets: Intracranial hemorrhage is another medical condition that affects our health. Dataset and challenge (43, 44) focus on the detection and segmentation of intracranial hemorrhage to help medics locate the hemorrhage regions and decide on a treatment plan. Dataset (45) also provides data for the classification of CT images as normal or hemorrhage. Multiple sclerosis lesion related datasets: Multiple sclerosis lesions are another kind of brain lesion; they are not deadly but can cause disabilities. Datasets and challenges (49, 50, 51) address multiple sclerosis lesion segmentation with multi-modality MR data (T1w, T2w, FLAIR, etc.). Besides tumor and lesion segmentation, brain disease classification also plays an essential role in healthcare. Brain-related diseases, e. g., Alzheimer's disease (AD) [63, 64, 65, 66] and Parkinson's disease (PD), have a severe effect on patients' health and lives. Therefore, effective diagnosis and early intervention can reduce the damage to patients' health, the burden on their families, and the economic impact on society.
Table 2: Summary of datasets and challenges for the brain lesion and tumor segmentation task.
In this section, we first introduce the datasets and challenges of AD (52, 53, 54, 55, 56, 62), and then we introduce other diseases (63, 64, 65). Table 3 shows the relevant challenges and datasets. Alzheimer's disease: AD affects a person's behavior, cognition, memory, and daily life activities. This progressive neurodegenerative disorder disrupts the normal daily life of patients: at first they may not know who they are or what they should do, and the disease progresses until they forget everything they know. The disease takes an unbearable toll on the patients and leads to a high cost to their loved ones and to society. For example, according to [67], AD was the sixth leading cause of death in the U.S. in 2018 and cost two to three hundred billion U.S. dollars. Therefore, researchers are doing everything they can to explore the causes of AD and its treatments. Diagnosis based on medical images has become a research focus because early diagnosis and intervention significantly affect the progression of this disease. Hence, many researchers work on the classification, i. e., the prediction, of AD using brain images. The datasets mainly include the "Alzheimer's Disease Neuroimaging Initiative (ADNI)" and the "Open Access Series of Imaging Studies (OASIS)". ADNI is a series of projects that aim to develop clinical, imaging, genetic, and biochemical biomarkers for the early detection and tracking of AD. It includes four stages: ADNI-1 (52), ADNI-GO (53), ADNI-2 (54), and ADNI-3 (55). These projects provide brain image data for researchers; the modalities include MR (T1 and T2) and PET (FDG, PIB, Florbetapir, and AV-1451). The four stages together consist of 1400 subjects.
The subjects can be categorized into normal cognition (NC), mild cognitive impairment (MCI), and AD, where MCI can be further split into early mild cognitive impairment (EMCI) and late mild cognitive impairment (LMCI). OASIS is a series of projects that aims to provide neuroimaging data of the brain, which researchers can access freely. OASIS has released three datasets, named OASIS-1, OASIS-2, and OASIS-3. All three datasets are related to AD, but they are also used for functional area segmentation and other tasks. OASIS-1 (56) contains 418 subjects aged from 18 to 96; of the subjects older than 60, 100 were diagnosed with AD. The dataset includes 434 MR sessions. OASIS-2 (57) contains 150 subjects aged between 60 and 96, with three or four MR sessions (T1) per subject. About 72 subjects were diagnosed as normal, while 51 subjects were diagnosed with AD. In addition, 14 subjects were diagnosed as normal but were characterized as AD at a later visit. OASIS-3 (58) includes more than 1000 subjects, more than 2000 MR sessions (T1w, T2w, FLAIR, etc.), and more than 1500 PET sessions (PIB, AV45, and FDG). The dataset includes 609 normal subjects and 489 AD subjects. Moreover, there are many other challenges based on ADNI, OASIS, or independent datasets. Challenge (60) is based on ADNI and aims at the prediction of the disease's longitudinal evolution. Dataset (61) is based on OASIS and was released on Kaggle for the classification of AD. Challenge (62) is an independent AD-related challenge to classify subjects into NC, MCI, and AD. Other diseases: Similar to AD, other brain diseases are also important from the diagnosis and treatment perspective. However, the number of datasets and challenges for these diseases is not as large as for AD. A few datasets focus on Parkinson's disease (PD) and spinocerebellar ataxia type II (SCA2). Datasets (63, 64) provide MR images of PD with classification labels.
Dataset (65) provides images and classification labels for SCA2. Dataset (66) provides images and annotations for the diagnosis of mild traumatic brain injury. As the human body's imaging sensors, the eyes are essential to health, and eye diseases may lead to blindness. We introduce the relevant challenges and datasets in this subsection and list them in Table 4. Datasets according to the modality: For the eye-related datasets and challenges, the main modalities used are the fundus photo (70, 72, 73, 74, 75, 76, 77, 82, 84) and OCT (71, 78, 79, 81, 83). The fundus photo can help medics evaluate the eye's health and locate retinal lesions because it clearly shows the important parts of the eye, such as the blood vessels and the optic disc. OCT is a newer imaging approach that is safe for the eye and shows the retinal tissues in detail. However, it also has disadvantages: it is not suitable for diagnosing microangioma or for planning retinal laser photocoagulation treatment. Datasets according to the analysis task: These datasets and challenges can be used for four tasks. 1) Classification tasks focus on classifying whether the subject has a specific disease or judging whether the subject is abnormal. Datasets and challenges (69, 70, 71, 74, 75, 76, 82) focus on predicting a single disease, while others (73, 77, 78, 79) focus on diagnosing multiple diseases. 2) Segmentation is another task, which provides more information than classification. Datasets and challenges (70, 72, 74, 77, 81, 83) focus on the segmentation of tissues and lesions for further diagnosis and disease analysis. 3) Datasets and challenges (70, 71, 74, 76, 77, 84) focus on the detection of lesions or other landmarks. These tasks help medics locate key targets, such as areas and tissues, for effective diagnosis, or provide feature details for other automated algorithms.
4) Unlike the other tasks, the last one focuses on the annotation of the tools used in eye-related surgery (80). Besides the brain's structural analysis, image processing, and computer-aided diagnosis tasks, another important research focus is the human neck, because it holds many essential glands and organs. This subsection discusses the datasets and challenges of the neck and teeth, covered in Subsection 3.4.1 and Subsection 3.4.2, respectively. Moreover, many researchers are working on the analysis of behavior and cognition with DL-based methods; we discuss the details in Subsection 3.4.3. The neck holds many glands and organs, and when these become abnormal, effective diagnosis and segmentation play an essential role in their treatment. The related image datasets and challenges are listed in Table 5. Datasets and challenges (85, 87, 88, 90, 94, 95) focus on the segmentation of glands and of the lesions and tumors within them. Dataset (89) focuses on the binary classification of tumor vs. normal. Challenge (86) targets thyroid nodule detection with ultrasound images and videos. Challenge (91) focuses on nerve segmentation in the neck, while challenge (96) focuses on the evaluation of the carotid bifurcation. Challenges (92, 93) focus on the diagnosis of dental X-Ray images. The main tasks of these two challenges include landmark localization and caries segmentation. Challenge (92) provides around 400 cephalometric X-ray images with landmark annotations by two experienced dentists. Challenge (93) provides about 120 bitewing images with experts' annotations of the different parts of the teeth. To understand what we see, hear, smell, and feel, our brain relies on its neurons to process and analyze stimuli and to make sense of the what, where, why, and when of a scenario.
Many researchers now use artificial neural networks as a research method to analyze the relationship between brain activity and stimulation. They use functional MR images to scan brain activity, analyze the hemodynamic feedback, and identify the areas of neurons that react. Therefore, the analysis of the brain's reactions in response to a specific stimulus is an important research focus. Researchers use DL to detect or decode the stimuli presented to subjects to work out the brain's functionality. The related datasets are listed in Table 6. Some datasets (98, 103, 107) focus on classifying the stimuli or the subject's attributes based on the subject's functional MR images. Dataset (98) aims to identify whether a subject is a beginner or an expert in programming via their brain's reaction to source code. Dataset (103) focuses on distinguishing subjects with depression from subjects without depression using audio stimulation and analysis of the subjects' brain activity. Dataset (107) examines the influence of cannabis on the brain. Datasets (97, 99, 101, 102, 104, 105, 106) focus on the decoding of stimuli, i. e., the decoding of brain activity. Datasets (101, 104, 105) aim to rebuild what subjects have seen with DL-based methods from the brain activity recorded in their functional MR images. Similarly, datasets (99, 106) work on decoding the faces that subjects have seen from functional MR images of similar modalities. There are many vital organs in the chest and abdomen. For example, the heart is responsible for the blood supply; the lungs are responsible for breathing; the kidneys are responsible for producing urine to eliminate toxins from the body. Therefore, the medical image analysis of organs in the chest and abdomen is an important research focus. Most of the tasks are computer-aided diagnosis, with the classification, detection, and segmentation of lesions being the most targeted tasks.
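The segmentation tasks that dominate this part of the survey are typically evaluated with overlap metrics, most often the Dice coefficient. As a hedged illustration (the masks and sizes here are synthetic, not drawn from any of the listed datasets):

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1.0 = perfect overlap)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 2D example: two overlapping square "organ" masks.
pred = np.zeros((8, 8), dtype=bool)
target = np.zeros((8, 8), dtype=bool)
pred[2:6, 2:6] = True       # 16 pixels
target[3:7, 3:7] = True     # 16 pixels, 9 of them overlapping
score = dice(pred, target)  # 2 * 9 / (16 + 16) = 0.5625
```

The small epsilon keeps the metric defined when both masks are empty, a convention many segmentation benchmarks adopt.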
Many datasets and challenges aim to segment one or more organs in the chest and abdomen for diagnosis or treatment planning. Subsection 4.1 discusses the datasets and challenges for segmentation. Subsection 4.2 introduces the datasets and challenges that focus on the diagnosis of organs in the chest and abdomen. Subsection 4.3 then describes the datasets and challenges of the chest and abdomen that are not categorized above, covering regression, tracking, registration, and other tasks related to the chest and abdomen organs. This subsection covers the datasets and challenges of the chest and abdomen organs that are used for anatomic segmentation tasks. The anatomic segmentation tasks include organ contour segmentation (Subsection 4.1.1) and organ segmentation (Subsection 4.1.2). Contour segmentation is different from organ segmentation: the former aims to separate an organ from the background or to mark the boundaries between multiple organs and the background, while the latter aims to segment the organ into different parts at the anatomical level. Table 7 presents the datasets and challenges that are used for the segmentation of the chest and abdomen organs. Organ contour segmentation provides necessary information for the planning of surgery and for diagnosis. A well-segmented contour of the organs provides a precise mask, which helps to produce accurate segmentation results for diagnosis, treatment, and operation. This subsection introduces datasets and challenges for the contour segmentation of a single organ and of multiple organs.
Generally, these datasets and challenges focus on the larger organs, such as the liver and the lungs, with the aim of diagnosing tumors and lesions, where contour segmentation is a pre-processing step. However, it is challenging to segment smaller organs in low-resolution images, particularly for radiotherapy, because an incorrect contour segmentation of these small organs can lead to severe consequences, e. g., organ damage during radiotherapy. The most commonly used image modalities for chest and abdomen organ segmentation are MR and CT. As Table 1 shows, many datasets and challenges use MR images. MR images have higher resolution under certain conditions and depict soft tissues and organs, such as the heart and prostate, better. Meanwhile, according to our survey, CT is the most widely used modality for organ segmentation and for other chest- and abdomen-related tasks and diagnoses, such as those of the lung and liver, because of its convenience, effectiveness, and low cost. Chest & abdomen datasets according to focus: The purposes of these datasets and challenges can be categorized into three groups: further analysis, benchmarking, and radiotherapy. Most datasets and challenges that provide annotated organ contours are intended for further analysis and treatment. One challenge of segmentation is to achieve a robust segmentation of the whole organ and separate it from the background without omitting the lesions and tumors; thus, some test benchmarks (116, 128) are provided for researchers to evaluate their algorithms. Another challenge, addressed by datasets and challenges (115, 120), is the imbalance between different organs' sizes and shapes; such an imbalance makes it difficult to segment small organs and provide valuable information for analysis and treatment.
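One common way to mitigate the organ-size imbalance described above is to weight the training loss per class so that small organs contribute more. A minimal numpy sketch of class-weighted cross-entropy follows; the probabilities, labels, and weights are toy values, and a real pipeline would use a DL framework's built-in weighted loss:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Pixel-wise cross-entropy with per-class weights.

    probs:         (N, C) predicted class probabilities per pixel
    labels:        (N,)   integer ground-truth class per pixel
    class_weights: (C,)   larger weights for rare (small-organ) classes
    """
    n = labels.shape[0]
    picked = probs[np.arange(n), labels]    # probability of the true class
    weights = class_weights[labels]         # weight of each pixel's true class
    return float(np.mean(-weights * np.log(picked + 1e-12)))

# Toy example: class 0 = background (common), class 1 = small organ (rare).
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.4, 0.6]])
labels = np.array([0, 0, 1])
uniform = weighted_cross_entropy(probs, labels, np.array([1.0, 1.0]))
boosted = weighted_cross_entropy(probs, labels, np.array([1.0, 5.0]))
# Up-weighting the rare class makes its errors dominate the loss.
```

Dice-based losses are another popular choice for the same problem, since they are insensitive to the absolute number of background pixels.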
Single-organ contour segmentation tasks usually focus on segmenting a region for subsequent tasks (110, 122, 123, 124, 129, 138, 144) or with an anatomical research purpose (118, 126, 133, 137, 142, 143). The difficulty of the former is that lesions and tumors may interfere with separating the organ from the background, while the latter requires more precise segmentation. Multi-organ contour segmentation of the chest and abdomen focuses on splitting the organs from each other. Some of these datasets and challenges (113, 114, 116) cover multiple organs, including relatively large organs, which are easier to segment, and relatively small organs, which are more challenging, especially when the model must handle the large and small organs at the same time. Similarly, some of these datasets and challenges (115, 120) focus on "organs at risk", i. e., organs that are healthy but might be at risk during radiation therapy. Dataset (127) focuses on multi-atlas-based methods, which are widely used in brain-related research. Dataset (128) aims to provide a benchmark for segmentation algorithms. Different from contour segmentation of the chest and abdomen organs, organ segmentation aims to segment the organ into different parts. Just as the hand has five fingers, organs are made up of multiple parts; a typical example is the Couinaud segmentation of the liver. This subsection introduces the datasets and challenges for organ segmentation, which are listed in Table 7. Heart related datasets and challenges: Most of these datasets and challenges (112, 119, 125, 130, 132, 134, 140) are related to heart segmentation. The most frequently used modalities are MR and ultrasound, and the aim is to segment the heart into the left atrium, chambers, valves, and other parts.
Though MR and ultrasound can effectively image the different tissues of the heart, the heartbeat results in blurred images, which makes the segmentation task more difficult; for ultrasound, the dynamic nature of the images is a further challenge for segmentation algorithms. Dataset (139) provides 55 CT scans and focuses on the segmentation of the lung, labeling its different parts: outside the lungs, the left lung, the upper lobe of the left lung, the lower lobe of the left lung, the upper lobe of the right lung, the middle lobe of the right lung, and the lower lobe of the right lung. The biggest challenge is the effect of lung lesions and diseases, such as tuberculosis and pulmonary emphysema, on segmentation performance. Moreover, challenges (135, 141) focus on the segmentation of the lung vessels. Diseases of the organs in the chest and abdomen have a significant impact on human health. Therefore, many researchers work on this problem by analyzing medical images. Several researchers have designed automatic or semi-automatic algorithms for classification, segmentation, detection, and characterization tasks to help medics diagnose these diseases. In this subsection, we describe the datasets and challenges related to the diagnosis of diseases of the chest and abdomen, which are reported in Tables 8, 9, and 10, respectively. Chest & abdomen datasets according to modality: Among the datasets and challenges collected, CT is the most commonly used imaging modality for the chest & abdomen because of its suitable imaging quality and its ability to clearly display tissues and lesions. Some datasets and challenges also provide contrast-enhanced CT images for clearer imaging. Besides CT, there are other modalities, including MR, X-Ray digital radiographs, PET, endoscopy, etc. MR images are used in breast-related diagnosis, cardiac-related tasks, soft tissue sarcoma detection, and ventilation imaging.
Because of the organs' size and CT's resolution, which is limited by the exposure time and radiation dose, MR is a more suitable imaging modality for small or specific organs. PET is usually used together with other modalities, such as CT and MR. The contrast agent's density is related to metabolism, which means the density of radiation from the agent will be high in a tumor, so PET is often used for tumor-related tasks. Endoscopy images are used for the medical inspection of the stomach, intestines, and other organs. The classification of diseases intends to determine whether a subject is healthy or not. It is sometimes called "detection" or "prediction", although it differs from the detection task presented below. The main focus of these datasets is to judge whether there is a cancer, lesion, or tumor, such as soft tissue sarcoma (192), prostate lesions (177, 184), lung cancer (161), and breast cancer (160). Classification is an effective task for diagnosis, particularly computer-aided diagnosis. A quick and early diagnosis allows effective interventions that increase the probability of patient recovery before the condition worsens. Another focus is the classification of diseases, mainly including pneumothorax (164), cardiac diseases (175), tuberculosis (178), pneumonia (179), and COVID-19, which is discussed at the end of this subsection. The endoscopy-related challenges provide RGB images and videos with the aim of classifying patients into "normal" vs. "abnormal". Dataset (169) focuses on classification based on diagnostic records. These datasets and challenges provide data for researchers to design AI-based algorithms to diagnose common diseases.
The characterization task for tumors and lesions is also called attribute classification; it focuses on the characterization analysis of tumors and lesions that follows the detection and segmentation performed by automatic analysis algorithms. A typical example is the attribute classification of pulmonary nodules and lung cancer (159, 162, 168, 186, 189, 193). These datasets and challenges usually provide CT scans with annotations of different attributes, such as lesion type, spiculation, lesion localization, margin, lobulation, calcification, cavity, etc. Each attribute includes two or more categories. Another focus is the characterization of breast-related lesions and tumors (187, 191). In most research and clinical situations, classification is not enough. Medics and researchers usually focus on the cause of a disease and the localization of the lesion or tumor. Further treatment evaluation, planning, and interpretability are specific focuses for medics and DL researchers. Thus, detection and segmentation are the tasks that are currently receiving a lot of attention. The detection task aims to find a region of interest and localize its position. The regions of interest usually include:
- Lung cancer and tumor (173, 180, 189, 195, 197)
- Pulmonary nodule (162, 174, 193, 197, 200)
- Celiac-related damage (202, 203, 204, 206)
- Other lung lesions (172, 183)
- Polyp (198, 204)
- Cervical cancer (182)
- Liver cancer (188)
Furthermore, challenges (203, 206) focus on the segmentation of artifacts (e. g., polyps) in endoscopic images. In 2020, COVID-19 became a research focus because it caused more than 100 million infections and two million deaths. Several datasets and challenges focus on this devastating disease and provide data to help researchers develop deep learning models to detect COVID-19 via various medical image modalities.
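Whether for nodules, polyps, or COVID-19 findings, detection outputs are usually matched against ground-truth boxes by intersection-over-union (IoU). A small, framework-free sketch (the box coordinates are illustrative, not from any listed dataset):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# Toy example: predicted vs. ground-truth nodule box.
predicted = (10, 10, 30, 30)     # area 400
ground_truth = (20, 20, 40, 40)  # area 400, 10 x 10 overlap
score = iou(predicted, ground_truth)  # 100 / 700, about 0.143
```

A prediction is commonly counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5, which in turn feeds metrics like average precision.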
In terms of modalities, most of these datasets and challenges use either CT or CR images, and some provide both. One exception is dataset (150), which uses ultrasound images. These datasets provide image annotations labeled by radiologists. Most of these datasets and challenges are related to classification tasks. Datasets (145, 146, 147, 157, 159) directly focus on distinguishing COVID-19 patients from normal subjects. In contrast, datasets and challenges (149, 150, 151, 153, 154) focus on distinguishing COVID-19 from a few other similar diseases that can also lead to lung opacity or related symptoms, such as Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS), and Acute Respiratory Distress Syndrome. Moreover, other datasets and challenges (148, 152) approach the diagnosis task with natural language processing, genomics, or clinical methods. Similarly, some other datasets (147, 155, 156) focus on the segmentation or detection of COVID-19 related lesions, such as ground-glass opacity, air-containing space, and pleural effusion. Besides the classification, detection, and segmentation tasks, there are several other tasks that are a current focus of research. In the following, we present the datasets and challenges related to these tasks and report them in Table 11. Chest & abdomen datasets for regression: Similar to attribute classification, regression is another task that aims to compute or measure target attributes from given images, the difference being that the outputs of regression are continuous. A typical example is fetal biometric measurement (217, 227). These challenges provide ultrasound images to help researchers design algorithms that measure such attributes to estimate the gestational age and monitor the fetus's growth. Another example is cardiac measurements (212, 213, 214, 218, 221, 226, 231).
These datasets and challenges provide MR or ultrasound images for analyzing the heart's attributes to detect heart diseases. Chest & abdomen datasets for tracking: Tracking is a critical task because our body and organs move during imaging; for organs such as the heart, the characteristics of their motion are informative. Challenges (222, 224) provide ultrasound data for tracking the liver in the follow-up of surgery and treatments. Datasets and challenges (215, 228, 229) focus on the tracking of the heart; they provide ultrasound images to track and analyze it. Chest & abdomen datasets for registration: Challenge (216) focuses on CT registration of the lungs and provides CT scans with and without contrast enhancement. Meanwhile, challenges (220, 225) focus on registration between different modalities of the heart and provide MR, CT, and other modalities to register images of beating hearts. Challenges (210, 223) focus on localizing specific landmarks, including the amniotic and the heart, using ultrasound and MR images. Challenge (211) focuses on the classification of surgery videos. Dataset (232) focuses on the reconstruction of the coronary artery. Though radiography, MR imaging, and other imaging modalities have long been used as a basis for diagnosis, pathology images are also used as a gold standard for diagnosis, particularly for tumors and lesions. Digital pathology images are generally obtained by collecting tissue samples, making slices, staining, and imaging. Therefore, pathology images are one of the mainstream image modalities used for diagnosis. The focuses of these datasets and challenges include 1) the identification and segmentation of basic elements (e. g., cells and nuclei) in pathology images, and 2) blood-based diagnosis from images.
In this section, we present the datasets and challenges of pathology images (Subsection 5.1) and cover the datasets and challenges of blood images in Subsection 5.2. Pathology images are used as the basis of cancer diagnosis. Pathologists and automatic algorithms analyze the images based on specific features, such as cancer cells and cells under mitosis. Many organizations and researchers provide datasets and challenges that address pathology both at the microcosmic level and at the whole slide image (WSI) level. The relevant datasets and challenges are listed in Table 12. In most situations, WSI is used in pathology diagnosis. Unlike CT or MR images, a pathology image is an optical image similar to a picture taken by a camera. One major difference, however, is that a pathology image is formed by transillumination, while an ordinary photo is formed by reflection. Another difficulty lies in the size of the image. A WSI is stored in a multi-resolution pyramid structure. A single multi-resolution WSI is generally obtained by capturing many small high-resolution image patches, and it might contain billions of pixels. Thus, a WSI serves as a virtual microscope in diagnosis and clinical research, and many challenges use WSI, such as (237, 245, 246, 254, 255, 256). However, in some situations, WSI is not suitable for analysis tasks, for example, cell segmentation. Therefore, pathology image patches are used in several other challenges, such as (233) for visual question answering, (259) for mitosis classification, and (238, 248) for multi-organ nucleus detection and segmentation. Datasets for stain: Slides made from human tissue are colorless and must be stained. The commonly used stains include hematoxylin, eosin, and diaminobenzidine.
Usually, two or more stains are used on a slide; the most commonly used combination is hematoxylin and eosin (H&E). Pathology datasets according to disease: Pathology slides are widely used in the diagnosis of many diseases, especially cancer. Cancer cells and tissues have different shapes compared to their normal counterparts; thus, diagnosis via pathology is the gold standard. Many datasets and challenges, such as (238, 248, 260), do not address any specific disease, while many others target specific diseases, such as breast cancer (239, 244), myeloma (262, 268), cancers of the digestive system (241), cervical cancer (257, 258), lung cancer (247), thyroid cancer (253), and osteosarcoma (240). Pathology datasets according to task: Generally speaking, the tasks associated with these datasets and challenges can be classified into two categories: microcosmic tasks and WSI-level tasks. The latter target the diagnosis of diseases, based on a classification task. Expanding on simple classification, many datasets and research methodologies focus on complex tasks, such as the segmentation of tumor cell areas (238, 248, 249) and the detection of pathological features (241, 255). The microcosmic tasks derive from clinical analysis: they identify cells and detect mitosis to extract key features from pathology images in support of further disease diagnosis. The following subsections expand on the microcosmic and WSI-level tasks, respectively. Microcosmic tasks focus on the extraction of microcosmic features (e. g., nucleus features) for further diagnosis and for WSI-level tasks. In this subsection, we introduce the microcosmic task related datasets and challenges. Data: Unlike the WSI-level datasets, the datasets and challenges that focus on microcosmic tasks usually provide small, high-resolution patch-level images. These patches are suitable for the annotation of microcosmic-level objects and for resource-limited algorithms.
The size of the images varies with the analysis task. For the segmentation and detection of cells and nuclei, images are usually about a thousand pixels square, so as to contain a suitable number of cells or nuclei. For individual cell analysis tasks (e. g., mitosis determination), the image is usually the size of a single cell. For other tasks (e. g., patch-level classification), the size varies from dataset to dataset. Cells are the essential elements of a pathology image, and analyzing them is one of the most effective ways to extract pathology image features for diagnosis. Pathologists analyze the size, shape, pattern, and stain color of the cells, drawing on their knowledge and expertise to judge these cells and classify them as normal or abnormal. Thus, many datasets and challenges focus on the segmentation and detection of cells. Ideally, cells and nuclei would be arranged neatly on the slide; in practice, during slide preparation they may overlap or end up randomly located. Targeting this problem, challenges (257, 258) focus on the segmentation and detection of overlapping cells and nuclei. The shape and size of cells from different organs can differ and pose different recognition and analysis challenges; therefore, challenges (238, 248) focus on multi-organ cell and nucleus segmentation. Generally, a WSI is too large for every cell and the relationships between cells to be analyzed. DL-based methods can readily find essential information in patch-level images to support diagnosis through feature learning, and many datasets and challenges focus on this problem. The datasets and challenges that provide patch-level images mainly focus on classification, segmentation, or detection tasks. Owing to the quality of its feature learning, DL has reached state-of-the-art performance in many areas of computer vision.
Therefore, some datasets and challenges focus on the patch itself rather than the cell. The tasks vary from the segmentation, detection, and classification of cells to the direct classification of patches. Challenges (242, 250, 252, 253) focus on patch-level image classification, e. g., to determine whether metastatic tissue is present. Datasets for other pathology tasks: Besides the detection and segmentation of cells and patch-level classification, there are other microscopic tasks. Challenge (259) focuses on mitosis detection for nuclear atypia scoring. The atypical shape, size, and internal organization of cells are related to the progression of cancer: the more advanced the cancer, the more atypical the cells look. Challenge (260) focuses on cell tracking, i. e., understanding how cells change shape and move as they interact with their surrounding environment. This is key to understanding the mechanobiology of cell migration and its multiple implications in normal tissue development and in many diseases. Challenge (233) focuses on visual question answering on pathology images, where the model is trained to pass a pathologist's examination. WSI-level pathology tasks focus on the diagnosis of cancer and on pathology image processing. A WSI contains the complete information about a specimen needed to establish an accurate diagnosis. Automatic diagnosis algorithms can analyze a slide quickly, which is especially useful in developing countries that lack experienced pathologists. However, directly analyzing a WSI is challenging for both pathologists and algorithms, because its size can reach 100,000 × 100,000 pixels. To address this, most current datasets and challenges focus on the classification and segmentation of biomarkers, cells, and other regions of interest.
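Because a single WSI can reach 100,000 × 100,000 pixels, it is rarely processed whole; instead, algorithms read fixed-size tiles from one level of the resolution pyramid. A minimal sketch of the tile-coordinate bookkeeping (pure Python; the function name and parameters are illustrative, and an actual reader such as OpenSlide would fetch the pixel data):

```python
def tile_grid(level_width, level_height, tile=512, overlap=0):
    """Yield (x, y, w, h) rectangles covering one WSI pyramid level with
    fixed-size tiles, clipping the last row/column at the slide border."""
    step = tile - overlap
    for y in range(0, level_height, step):
        for x in range(0, level_width, step):
            yield x, y, min(tile, level_width - x), min(tile, level_height - y)

# At full resolution, a 100,000 x 100,000 slide needs 196 x 196 = 38,416
# tiles of 512 x 512, each read on demand rather than loaded all at once.
n_tiles = sum(1 for _ in tile_grid(100_000, 100_000))
```

Tiling also explains why annotations for WSI challenges are often provided as patch-level labels or low-resolution masks rather than full-resolution pixel maps.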
At the end of this subsection, we introduce other datasets and challenges related to regression tasks and to the localization of tumors and biomarkers. The primary goal of examining pathology images, especially WSIs, is to diagnose cancer. Thus, how to classify a very large WSI with limited computing resources has become a research challenge. Datasets and challenges (234, 236, 237, 245) focus on predicting cancer or evaluating WSIs, e. g., Gleason grading or HER2 evaluation. At the same time, some datasets and challenges (244, 254, 255) focus on the classification of metastasized cancer. Datasets for segmentation and detection of WSI: DL-based methods that process pathology images are often seen as black boxes: they have achieved state-of-the-art performance, but their interpretability remains difficult. From the pathologists' point of view, datasets and challenges (235, 241, 246, 254, 255) focus on segmentation and detection tasks to determine the critical elements that lead to a particular diagnosis, such as cancer cell areas and signet ring cells. Datasets for other WSI tasks: Besides classification and detection, a few other tasks are based on WSI, including the registration of pathology images (243) for data pre-processing and the localization of lymphocytes (242). Blood image analysis is the basis of the diagnosis of many diseases. In contrast to pathology images, images of blood samples mainly contain blood cells, and the corresponding datasets and challenges are aimed at blood-related cancers and cell counting. Similar to pathology images, these datasets and challenges also focus on the segmentation, detection, and classification of cells. The relevant datasets and challenges are listed in Table 13. One of the primary tasks of these datasets is cell classification, which focuses on identifying the different types of cells.
Dataset (271) focuses on classifying red blood cells, white blood cells, platelets, and other cells, while dataset (264) focuses on the classification of malignant and non-malignant cells. Other datasets and challenges, i. e., (268) (multiple myeloma segmentation), (263) (mitochondria segmentation), and (266) (malaria detection), focus on the segmentation and detection of blood cells and biomarkers. Although we have categorized the datasets and challenges into three parts, "head and neck", "chest and abdomen", and "pathology and blood", several other datasets cannot be placed under these three areas. In this section, we introduce the datasets and challenges categorized as "other", meaning that they do not fit the above categories but are still relevant to DL methods. The topics of this section include bone (Subsection 6.1), skin (Subsection 6.2), phantom (Subsection 6.3), and animal (Subsection 6.4). Medical image analysis of bone is currently a major research focus. Radiography is the most effective way to image bone, because X-rays are sensitive to the calcium that makes up human bones. The segmentation of bone, the detection of abnormalities, and their characterization are meaningful clinical and research tasks. The following subsections therefore discuss the datasets and challenges for classification, segmentation, and other tasks; Table 14 reports these datasets and challenges. Bone datasets for classification: Classification tasks for bone-related computer-aided diagnosis are the focus of many researchers. Though classification cannot locate regions of interest, it can still help orthopedists judge whether a patient is healthy, as in dataset (283). The diagnosis of tears and abnormalities is also a research focus, e. g., meniscal tears (279), vertebral fractures (282), and knee abnormalities (279).
Bone datasets for segmentation: The segmentation of bone images plays a vital role in clinical diagnosis and treatment. Computer-aided segmentation algorithms and orthopedists need to segment the different parts of the bone in a given image and make a sound judgment in order to provide adequate treatment. The difficulty of such tasks lies in the low resolution of the images compared with other modalities. The focus of these datasets and challenges includes the spine (282, 284), vertebrae (275, 276, 281), and knee cartilage (285). Other bone-related tasks: Besides classification and segmentation, the bone datasets and challenges also cover imaging (274), registration (277), spinal curvature estimation (278), labeling (276), and abnormality detection (280). Skin cancer is one of the most common types of cancer, and melanoma is one of the most lethal types of skin cancer. To diagnose skin cancer, dermoscopy is used to image the skin, and classification, segmentation, and detection tasks are employed. The most relevant datasets and challenges are reported in Table 15. Aiming at the computer-aided diagnosis of melanoma, ISIC released datasets and a series of challenges for clinical training and for the development of automatic algorithms. The ISIC challenges include 2017 (290), 2018 (289), and 2019 (288). Challenges (289, 290) include three sub-challenges, lesion segmentation, lesion attribute detection, and lesion classification, with thousands of dermoscopic images. Challenge (288) focuses on the classification of melanoma, melanocytic nevus, basal cell carcinoma, actinic keratosis, benign keratosis, dermatofibroma, vascular lesion, squamous cell carcinoma, and others. Challenge (287), i. e., ISIC 2020, focuses on the classification of melanoma to better support dermatological clinical work, with 33,126 scans from more than 2,000 patients. Moreover, challenge (286) focuses on Diabetic Foot Ulcers (DFU).
The challenge provides more than 2,000 images of feet, photographed with regular cameras under a consistent light source and annotated by experts, for the training and testing of automatic detection and classification algorithms. A phantom is an object made of specific materials that is mainly used to evaluate medical imaging equipment. Phantoms can be used for the registration of different pieces of equipment and as data for the development of automatic algorithms. Registration is essential for clinical diagnosis; for instance, it reduces the differences between medical devices with or without the same modalities (293, 294, 295, 298). Phantoms also become useful when data is scarce. Some image analysis tasks or experiments require surgically inserted fiducial markers, which are costly and risky; a phantom, in contrast, has low cost and low risk and is easy to image and to annotate (291, 296, 297). The related datasets and challenges are reported in Table 15. Medical image analysis of animal material is a relatively small research area. However, it is not as limited by the privacy and strict ethics restrictions that apply to human medical images. The datasets and challenges we found focus on animal brain segmentation (299, 301), depth estimation from endoscopy (300), and multi-modality registration (292). The relevant datasets and challenges are reported in Table 15. The success of AI algorithms such as DL has led to their widespread use in several fields, including medical image analysis. Researchers with different knowledge and backgrounds tackle image-based clinical tasks using computer vision tools to design automatic algorithms for different applications [11, 12, 260, 261, 262, 263, 264]. Though AI algorithms can successfully handle many tasks, several unsolved problems and challenges hinder the development of AI-based medical image analysis. DL-based algorithms learn from input images of real data through gradient descent.
Large-scale annotated datasets and a powerful DL model are key to the development of successful DL systems. For example, the success of AlexNet [14], GoogLeNet [2], and ResNet [3] is based on powerful models with millions of parameters. At the same time, a large-scale dataset such as ImageNet [265] is necessary to tune such a large number of parameters when training the DL model. However, when these methods are applied to medical image analysis, many domain-specific problems and challenges appear. This subsection discusses some of these challenges. The biggest challenge in the development of DL models is data scarcity. Unlike in other areas, medical image datasets are usually smaller in scale due to many limitations, e. g., ethical restrictions. The commonly used datasets in traditional computer vision are much larger than medical image datasets. For example, the handwritten digit dataset MNIST [266] includes a training set of 60,000 examples and a testing set of 10,000 examples; the ImageNet dataset [265] includes millions of images for training and testing; and Microsoft COCO [267] includes hundreds of thousands of images with more than two million annotated instances. In contrast, many medical image datasets only include hundreds or at most thousands of images, as is the case for the challenge BraTS 2020 (30). There are multiple reasons for this lack of data. The main cause is the restricted access to medical images by non-medical researchers, i. e., barriers between disciplines. The root causes of these barriers relate to the cost and difficulty of annotation and to access restrictions due to ethics and privacy. Access to data: As mentioned in the introduction, the direct cause of data scarcity is that most non-medical researchers are not allowed to access medical data directly. Though a great deal of medical data is generated worldwide every day, most non-medical researchers have no authorization to access clinical data.
The easily accessible data are the publicly available datasets, but these are not large enough to properly train a DL model. Ethical reasons: The ethics of medical data usage is a major bottleneck and limitation for researchers, particularly computer scientists. Medical data stored in databases often contain sensitive or private information, such as name, age, gender, and ID number. In some cases, the records of medical images can be used to identify a patient; for example, if an MR scan includes the face, an intruder could identify the patient for malicious purposes. In most countries and regions, it is illegal to distribute such data with private information without the patients' permission, and hardly anyone would consent to such distribution. Therefore, it is practically impossible for deep learning researchers to gain authorization to access these raw datasets, and even to access desensitized data they still need to pass ethical review. Annotation: Another root cause is the difficulty of annotating medical images. Unlike other computer vision areas, the annotation of medical images requires specialized professional knowledge. In autonomous driving, for example, there are no specific requirements on annotators when labeling objects such as vehicles and pedestrians, because most of us can easily distinguish a car or a person. When annotating medical images, however, domain-specific knowledge is essential: a layperson may notice that a tissue looks unusual, but it is impossible for a non-specialist to delineate a lesion's contour or diagnose a disease. This difficulty is not easily solved even when professionals are employed to annotate the data. First, the cost of annotating medical data is huge: once researchers and their organization have obtained some data, they then need to spend more money to employ medics for its labeling.
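The desensitization step mentioned above can be illustrated as dropping direct identifiers and replacing the patient ID with a salted hash, so scans of one patient remain linkable without exposing identity. A hypothetical sketch (the field names are invented; real DICOM de-identification follows the standard's confidentiality profiles and dedicated tooling):

```python
import hashlib

# Hypothetical field names for illustration only.
IDENTIFIERS = {"name", "id_number", "birth_date", "address"}

def desensitize(record, salt="study-salt"):
    """Drop direct identifiers and replace the patient ID with a salted hash,
    so multiple scans of one patient stay linkable without exposing identity."""
    out = {k: v for k, v in record.items() if k not in IDENTIFIERS}
    token = (salt + str(record.get("id_number", ""))).encode()
    out["pseudo_id"] = hashlib.sha256(token).hexdigest()[:12]
    return out
```

Note that removing metadata alone is not sufficient when the pixel data itself is identifying (e.g., a face in an MR scan), which is why defacing tools are also applied to head imaging.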
Such annotation costs are enormous, particularly where medical resources are scarce or medical costs are high. For example, the challenge PALM (74) provides about 1,200 images with annotations, but its organizers involved only two clinical medics. Second, the physicians who annotate the data are required to have rich clinical and diagnostic experience, which further reduces the number of people suitable for the task. Third, to avoid subjectivity, each image needs to be annotated by two or more physicians, which raises another problem: what should be done when two annotators' labels disagree? In many challenges, the organizers employ several junior physicians to annotate and a senior physician to decide when the junior physicians' annotations differ. In the challenge AGE (71), for example, each annotation is determined as the mean of four independent ophthalmologists in a group and is then manually verified by a senior glaucoma expert. The characteristics of medical images themselves also pose difficulties for analysis tasks. Many types and modalities of images are used in medical image analysis. As in computer vision, the modalities include both 2D and 3D; however, medical images differ in several other respects. Though the average scale of a medical image dataset is smaller than that of a computer vision dataset, each data sample is larger on average. Among 2D images, CR, WSI, and other modalities show larger variance in resolution and color than images in other computer vision fields. Some modalities need more bits to encode a pixel, and some produce extremely large images. For example, CAMELYON 17 (254) includes only about a thousand pathology images, yet the whole dataset is about three terabytes.
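For categorical labels, the junior/senior adjudication workflow described above can be sketched as a majority vote with a senior expert breaking ties (a hypothetical helper, not the AGE procedure, which averages continuous annotations):

```python
from collections import Counter

def consensus_label(junior_labels, senior_label):
    """Return the majority label among junior annotators; defer to the
    senior expert when there is no unique majority."""
    counts = Counter(junior_labels).most_common()
    if len(counts) == 1 or counts[0][1] > counts[1][1]:
        return counts[0][0]
    return senior_label
```

The same pattern generalizes to segmentation masks, where per-pixel majority voting or label-fusion algorithms play the role of the vote.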
Such datasets, with few but large samples, pose a challenge for AI algorithms, and designing algorithms that can learn from limited resources (e. g., a small number of labeled samples) and still be useful for clinical diagnosis has become a research focus. 3D medical images such as CT and MRI are dense 3D data, in contrast with sparse data such as the point clouds used in autonomous driving. As in the BraTS serial challenges (30, 31, 32, 33, 34, 35, 36, 37, 38), many researchers face the challenge of designing algorithms that can effectively learn from multi-modal datasets. These characteristics of medical images require well-designed algorithms with a robust capability to fit the data well without overfitting. However, this further increases the need for data and resources, and it remains a challenge to learn suitable features from a small-sample dataset. The ideal scenario would be a method or algorithm that simultaneously solves all of these problems; however, there is no silver bullet. The problems and challenges related to the data and the adopted methods cannot be entirely resolved, and sometimes one problem arises as another is solved. Nevertheless, many ideas have been introduced to address the current problems, and we introduce them in this subsection. With respect to the problems and challenges mentioned above, researchers are working in two research directions: 1) more effective models that need less data, and 2) more practical approaches to access data. To learn from small datasets, researchers use approaches such as few-shot learning and transfer learning. To access more data, researchers adopt three main approaches, namely federated learning, lifelong learning, and active learning. Many medical image datasets have a small number of samples. For example, challenge MRBrains13 (6) includes only 20 subjects for training and testing, while challenge KiTS 19 (110) has about two hundred subjects.
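For BraTS-style multi-modal MRI, a common preparation step is to stack the co-registered modalities as input channels and normalize each modality independently. A minimal sketch, assuming the volumes are already co-registered and resampled to a common grid:

```python
import numpy as np

def stack_modalities(volumes):
    """Stack co-registered 3D volumes (e.g., T1, T1ce, T2, FLAIR) into a
    (C, D, H, W) array, z-scoring each modality channel independently so
    that differing intensity scales do not dominate training."""
    x = np.stack([np.asarray(v, dtype=np.float32) for v in volumes])
    for c in range(x.shape[0]):
        x[c] = (x[c] - x[c].mean()) / (x[c].std() + 1e-8)
    return x
```

Per-modality normalization matters because MRI intensities are not on an absolute scale, so the same tissue can have very different raw values across sequences and scanners.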
Therefore, many researchers strive to find practical approaches to learn from small samples. Few-shot and zero-shot learning: Few-shot learning addresses one of the critical issues of DL-based medical image analysis, i. e., the development of DL models with less data. Humans can learn effectively from a few samples: unlike standard deep learning-based methods, a human learns to diagnose a disease from images without needing to view tens of thousands of them (i. e., from only a few shots). Meta-learning, also called learning to learn, is one solution to the few-shot learning problem; it learns meta-features from a small amount of data. The number of images in most medical datasets and challenges is small compared with regular computer vision datasets and challenges. Mondal et al. [269] use few-shot learning and a GAN to segment medical images, modifying the GAN for semi-supervised learning in a few-shot setting. Similar to few-shot learning, zero-shot learning aims at recognizing novel classes for which no labeled samples are available. Rezaei et al. [270] provide a review of zero-shot learning from autonomous vehicles to COVID-19 diagnosis. However, zero-shot and few-shot learning also have their disadvantages, such as the domain gap, overfitting, and limited interpretability. Knowledge transfer: Transfer learning is another method, which recognizes and applies knowledge and skills learned from a previous task. For example, white and gray matter segmentation and multi-organ segmentation are both segmentation tasks. Their neural networks are usually trained independently, i. e., hardly anyone trains one network on both tasks at once, but this does not mean that the two tasks are unrelated. Transfer learning, or knowledge transfer, exploits exactly this kind of relatedness by transferring knowledge from a previously learned task.
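As a concrete illustration of the few-shot idea (not the method of [269]), a nearest-prototype classifier averages the embeddings of the few labeled support samples per class and assigns each query to the closest class prototype:

```python
import numpy as np

def prototype_classify(support_x, support_y, query_x):
    """Nearest-prototype few-shot classification: average the support
    embeddings of each class, then assign each query to the class whose
    prototype is closest in Euclidean distance."""
    support_y = np.asarray(support_y)
    classes = sorted(set(support_y.tolist()))
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    dists = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]
```

In practice, `support_x` and `query_x` would be embeddings produced by a feature extractor meta-trained on related tasks, which is what lets only a handful of labeled medical images define a new class.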
Transfer learning can be applied to two similar tasks and between different domains. Its most significant advantage is that a large-scale dataset can be used to pre-train the neural network, which is then fine-tuned and transferred to the main task on a small dataset. Besides finding practical approaches to learn from small samples, many researchers have been working on active learning and on federated learning (which aims to use data without accessing sensitive information). This also reduces the annotation cost of deep learning algorithms. Federated learning: Federated learning provides another way to access data. As discussed previously, access to data is limited by privacy and other problems. Instead of directly sharing data, federated learning shares the model, so that privacy is not leaked. Combined with other privacy protection methods, federated learning can effectively use the data from each independent data or medical center. However, federated learning has two disadvantages, concerning annotation and implementation. The annotation problem cannot be solved by sharing models and requires other methods. The main challenge is the implementation, as only a few institutions have attempted federated learning so far. For example, Intel and other institutions have attempted to apply federated learning to brain tumor-related tasks in their research [271]. The main challenges in their implementation include: 1) the implementation and proof of privacy protection; 2) the methodology for sharing and updating millions of model parameters; 3) preventing attacks on the DL algorithms and leaks of private data on the Internet or the computing nodes. Natural language processing: Natural language processing is also a potential tool to automatically or semi-automatically annotate medical image data. It is a standard procedure for a medic to provide a diagnostic report for the patient, particularly after a medical image has been taken.
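The parameter-sharing step at the heart of federated learning can be sketched as FedAvg-style aggregation, where a server averages client model parameters weighted by local dataset size (a simplification that ignores the systems and security challenges listed above):

```python
import numpy as np

def federated_average(client_params, client_sizes):
    """FedAvg-style aggregation: average each hospital's parameter vector,
    weighted by its local dataset size. Only parameters are exchanged;
    the images themselves never leave the institution."""
    w = np.asarray(client_sizes, dtype=np.float64)
    w = w / w.sum()
    return sum(wi * np.asarray(p, dtype=np.float64)
               for wi, p in zip(w, client_params))
```

In a full system, each round would broadcast the averaged parameters back to the clients for further local training, often combined with secure aggregation or differential privacy.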
Therefore, such large amounts of data (images and text) are useful for medical image analysis after desensitization, and natural language processing can be used for annotation. Several natural language processing-based methods, e. g., [272, 273, 274], have been applied in medical-related research fields. Active learning: Active learning aims to reduce the annotation cost by using the unlabeled data to select the "best" samples to annotate. Generally, data annotation for deep learning requires experts to label the data so that the neural network can learn from it. Active learning does not require many labeled samples at the beginning of training; in other words, it can "help" annotators label their data. It uses the knowledge learned from the labeled data to select which unlabeled samples to annotate, and the newly labeled data are then used to train the network over the following epochs. Active learning [275, 276] is used in medical image analysis in a loop: 1) the algorithm learns from the data annotated by humans; 2) humans annotate the unlabeled data selected by the algorithm; 3) the algorithm adds the newly labeled data to the training set. The advantage of active learning is obvious: annotators do not need to annotate all the data they have, and at the same time, the neural network learns from the data faster through this interactive process. In this work, we have provided a comprehensive survey of the datasets and challenges for medical image analysis collected between 2013 and 2020. The datasets and challenges were categorized into four themes: head and neck, chest and abdomen, pathology and blood, and others. We have provided a summary of all the details about these themes and data, and we have discussed the problems and challenges of medical image analysis along with possible solutions.
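Step 2 of the loop above hinges on how the algorithm selects samples; one common criterion (an illustrative choice, not prescribed by [275, 276]) is margin-based uncertainty sampling:

```python
import numpy as np

def select_most_uncertain(probs, k=1):
    """Margin-based uncertainty sampling: rank unlabeled samples by the gap
    between their two highest predicted class probabilities and return the
    indices of the k most ambiguous ones for human annotation."""
    sorted_p = np.sort(probs, axis=1)
    margin = sorted_p[:, -1] - sorted_p[:, -2]
    return np.argsort(margin)[:k]
```

A small margin means the model cannot decide between two classes, so labeling that sample is expected to be more informative than labeling one the model already classifies confidently.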
Very deep convolutional networks for large-scale image recognition
Going deeper with convolutions
Deep residual learning for image recognition
Mastering the game of Go with deep neural networks and tree search
Grandmaster level in StarCraft II using multi-agent reinforcement learning
U-Net: Convolutional networks for biomedical image segmentation
Predicting Isocitrate Dehydrogenase (IDH) Mutation Status in Gliomas Using Multiparameter MRI Radiomics Features
Deep feature loss to denoise OCT images using deep neural networks
AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system
A clinically applicable deep-learning model for detecting intracranial aneurysm in computed tomography angiography images
Altered resting-state functional connectivity and effective connectivity of the habenula in irritable bowel syndrome: A cross-sectional and machine learning study
Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network
A Guide to Convolutional Neural Networks for Computer Vision
ImageNet classification with deep convolutional neural networks
The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository
PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals
CEREBRUM: A fast and fully-volumetric Convolutional Encoder-decodeR for weakly-supervised sEgmentation of BRain strUctures from out-of-the-scanner MRI
Multi-Site Infant Brain Segmentation Algorithms: The iSeg-2019 Challenge
Benchmark on automatic six-month-old infant brain segmentation algorithms: The iSeg-2017 challenge
NEATBrainS - Image Sciences Institute
MRBrainS Challenge: Online Evaluation Framework for Brain Image Segmentation in 3T MRI Scans
Evaluation of automatic neonatal brain segmentation algorithms: The NeoBrainS12 challenge
Standardized Assessment of Automatic Segmentation of White Matter Hyperintensities and Results of the WMH Segmentation Challenge
101 Labeled Brain Images and a Consistent Human Cortical Labeling Protocol
3D segmentation in the clinic: A grand challenge
A Baseline Approach for AutoImplant: The MICCAI 2020 Cranial Implant Design Challenge. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and
AccelMR 2020 Prediction Challenge - AccelMR 2020 for ISBI 2020
MRI White Matter Reconstruction | ISBI 2019/2020 MEMENTO Challenge
An open, multi-vendor, multi-field-strength brain MR dataset and analysis of publicly available skull stripping methods agreement
An open dataset and benchmarks for accelerated MRI. arXiv
An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest
The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)
Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features
Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge
MICCAI BraTS 2017: Scope | Section for Biomedical Image Analysis (SBIA) | Perelman School of Medicine at the University of Pennsylvania
ISLES 2015 - A public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI
Intracranial hemorrhage segmentation using a deep convolutional model. Data
Predicting Deletion of Chromosomal Arms 1p/19q in Low-Grade Gliomas from MR Images Using Machine Intelligence
Data From LGG-1p19qDeletion
A large, open source dataset of stroke anatomical brain images and manual lesion segmentations
Objective Evaluation of Multiple Sclerosis Lesion Segmentation using a Data Management and Processing Infrastructure
MS lesion segmentation challenge: supplementary results
Longitudinal multiple sclerosis lesion segmentation: Resource and challenge
Longitudinal multiple sclerosis lesion segmentation data resource
Fluorescence in situ hybridization (FISH) on touch preparations: A reliable method for detecting loss of heterozygosity at 1p and 19q in oligodendroglial tumors
The Alzheimer's disease neuroimaging initiative
Alzheimer's Disease Neuroimaging Initiative 2 Clinical Core: Progress and plans. Alzheimer's and Dementia
The Alzheimer's Disease Neuroimaging Initiative 3: Continued innovation for clinical trial improvement
Open Access Series of Imaging Studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults
Open access series of imaging studies: Longitudinal MRI data in nondemented and demented older adults
OASIS-3: Longitudinal Neuroimaging, Clinical, and Cognitive Dataset for Normal Aging and Alzheimer Disease. medRxiv
MRI Hippocampus Segmentation using Deep Learning autoencoders
MRI Hippocampus Segmentation | Kaggle
The Alzheimer's disease prediction of longitudinal evolution (TADPOLE) challenge: Results after 1 year follow-up. arXiv
Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge
Executive attention networks show altered relationship with default mode network in PD. NeuroImage: Clinical
Dynamic connectivity at rest predicts attention task performance
PD De Novo: Resting State fMRI and Physiological Signals
Central modulation of parasympathetic outflow is impaired in de novo Parkinson's disease patients
Diffusion tensor imaging
Histogram analysis of DTI-derived indices reveals pontocerebellar degeneration and its progression in SCA2
U-net based analysis of MRI for Alzheimer's disease diagnosis. Neural Computing and Applications
Alzheimer's disease classification from brain MRI based on transfer learning from CNN
Identification of Alzheimer's disease using a convolutional neural network model based on T1-weighted magnetic resonance imaging
Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline
Alzheimer's disease facts and figures
Automatic detection of rare pathologies in fundus photographs using few-shot learning
REFUGE Challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs
AGE Challenge: Angle Closure Glaucoma Evaluation in Anterior Segment Optical Coherence Tomography
Automatic Detection challenge on Age-related Macular degeneration
IDRiD: Diabetic Retinopathy - Segmentation and Grading Challenge
Indian diabetic retinopathy image dataset (IDRiD): A database for diabetic retinopathy screening research
Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images
Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning
CATARACTS: Challenge on automatic tool annotation for cataRACT surgery
Cataract dataset for image segmentation. arXiv
The Retinal OCT Fluid Detection and Segmentation Benchmark and Challenge
Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema
Retinopathy online challenge: Automatic detection of microaneurysms in digital color fundus photographs
Thyroid Nodule Segmentation And Classification In Ultrasound Images
Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach
Data from Head-Neck-Radiomics-HN1
Data from AAPM RT-MAC Grand Challenge
Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer
Data from Head-Neck-PET-CT
Evaluation of segmentation methods on head and neck CT: Auto-segmentation challenge 2015
Automated segmentation of the parotid gland based on atlas registration and machine learning: A longitudinal MRI study in head-and-neck radiation therapy
Evaluation framework for carotid bifurcation lumen segmentation and stenosis grading
Cognitive control of sensory pain encoding in the pregenual anterior cingulate cortex. d1 - decoder construction in day 1, d2 - adaptive control in day 2
Decoding functional category of source code from the brain (fMRI on Java program comprehension)
Expert programmers have fine-tuned cortical representations of source code. bioRxiv
Reconstructing faces from fMRI patterns using deep generative neural networks
Modulation of fronto-striatal functional connectivity using transcranial magnetic stimulation
Resting State - TMS
Deep image reconstruction from human brain activity
Deep Image Reconstruction
BOLD5000: A public fMRI dataset of 5000 images. arXiv
Neural processing of emotional musical and nonmusical stimuli in depression
Development of a validated emotionally provocative musical stimulus set for research. Psychology of Music
Limbic hyperactivity in response to emotionally neutral stimuli in schizophrenia: A neuroimaging meta-analysis of the hypervigilant mind
Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders
Generic decoding of seen and imagined objects using hierarchical visual features
Adjudicating between face-coding models with individual-face fMRI responses
Grey matter changes associated with heavy cannabis use: A longitudinal sMRI study
Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation: The M&Ms Challenge
The KiTS19 challenge data: 300 kidney tumor cases with clinical context, CT semantic segmentations, and surgical outcomes
Data from C4KC-KiTS [Data set]
Multivariate Mixture Model for Myocardial Segmentation Combining Multi-Source Images
Multivariate mixture model for cardiac segmentation from multi-sequence MRI
Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography
CT-ORG: A Dataset of CT Volumes With Multiple Organ Segmentations
NCI-ISBI 2013 challenge: automated segmentation of prostate structures
The liver tumor segmentation benchmark (LiTS).
arXiv, abs CT-MR) healthy abdominal organ segmentation CT-MR) healthy abdominal organ segmentation Comparison of semi-automatic and deep learning-based automatic methods for liver segmentation in living liver transplant donors Multiorgan segmentation using distanceaware adversarial networks A large annotated medical image dataset for the development and evaluation of segmentation algorithms Automatic tuberculosis screening using chest radiographs Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration Chest X-Ray Analysis of Tuberculosis by Deep Learning with Segmentation and Augmentation A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging Autosegmentation for thoracic radiation treatment planning: A grand challenge at AAPM 2017 Data from Lung CT Segmentation Challenge Multi-level deep convolutional networks for automated pancreas segmentation Single site breast DCE-MRI data and segmentations from patients undergoing neoadjuvant chemotherapy Interactive whole-heart segmentation in congenital heart disease Cloud-Based Evaluation of Anatomical Structure Segmentation and Landmark Detection Algorithms: VISCERAL Anatomy Benchmarks Leveraging mid-level semantic boundary cues for automated lymph node detection 2D view aggregation for lymph node detection using a shallow hierarchy of linear classifiers A new 2.5D representation for lymph node detection using random sets of deep convolutional neural network observations A new 2.5 D representation for lymph node detection in CT Rule-based ventral cavity multi-organ automatic segmentation in CT scans Benchmark for Algorithms Segmenting the Left Atrium From 3D CT and MRI Datasets Evaluation of prostate segmentation algorithms for MRI: The PROMISE12 challenge Data From Prostate-3T Standardized evaluation methodology and reference database for evaluating IVUS image segmentation Extraction of airways from CT (EXACT'09) 
Comparison and evaluation of methods for liver segmentation from CT datasets A CT image dataset about COVID-19. arXiv CORD-19: The COVID-19 open research dataset A Critic Evaluation of Methods for COVID-19 Automatic Detection from X-Ray Images. arXiv Unveiling COVID-19 from chest x-ray with deep learning: A hurdles race with small data Accelerating COVID-19 differential diagnosis with explainable ultrasound image analysis. arXiv POCOVID-net: Automatic detection of COVID-19 from a new lung ultrasound imaging dataset (POCUS). arXiv COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images Can AI Help in Screening Viral and COVID-19 Pneumonia? Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets CT Images in Covid-19 [Data set Chest imaging representing a COVID-19 positive rural U.S. population Chest imaging representing a COVID-19 positive rural U.S. population. Scientific Data Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis A large chest radiograph dataset with uncertainty labels and expert comparison. 
33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence Data from NSCLC-Radiomics-Interobserver1 MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports The MIMIC-CXR Database Co-registration of pre-operative CT with ex vivo surgically excised ground glass nodules to define spatial extent of invasive adenocarcinoma on in vivo imaging: a proof-of-concept study Fused Radiology-Pathology Lung Dataset Deep Learning Techniques for Automatic MRI Cardiac Multi-Structures Segmentation and Diagnosis: Is the Problem Solved? Non-small cell lung cancer: Identifying prognostic imaging biomarkers by leveraging public gene expression microarray data -Methods and preliminary results A phase II study of 3-Deoxy-3-18F-fluorothymidine PET in the assessment of early response of breast cancer to neoadjuvant chemotherapy: Results from ACRIN 6688 Data from ACRIN-FLT-Breast ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases Computer-aided detection of prostate cancer in MRI ProstateX challenge data Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge Flying focal spot (FFS) in cone-beam CT Image reconstruction and image quality evaluation for a 64-slice CT scanner with z-flying focal spot Evaluating variability in tumor measurements from same-day repeat CT scans of patients with non-small cell lung cancer A resource for the assessment of lung nodule size estimation methods: database of thoracic CT scans of an anthropomorphic phantom A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities LUNGx Challenge for 
computerized lung nodule classification Guest Editorial: LUNGx Challenge for computerized lung nodule classification: reflections and lessons learned Initial Stanford Study of 26 Cases. The Cancer Imaging Archive The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans Data From LIDC-IDRI Accuracy of CT Colonography for Detection of Large Adenomas and Cancers Comparing and combining algorithms for computer-aided detection of pulmonary nodules in computed tomography scans: The ANODE09 study An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy Endoscopic surgeon action detection dataset. arXiv Endoscopy artifact detection (EAD 2019) challenge dataset. arXiv Automated measurement of fetal head circumference using 2D ultrasound images Algorithms for left atrial wall segmentation and thickness -Evaluation on an open-source CT and MRI image database Standardized evaluation framework for evaluating coronary artery stenosis detection, stenosis quantification and lumen segmentation algorithms in computed tomography angiography Evaluation and comparison of current fetal ultrasound image segmentation methods for biometric measurements: A grand challenge 30000+ questions for medical visual question answering. 
arXiv Multi-organ Nuclei Segmentation and Classification Challenge 2020 Automatic cellularity assessment from post-treated breast surgical specimens Assessment of Residual Breast Cancer Cellularity after Neoadjuvant Chemotherapy using Digital Pathology [Data set Osteosarcoma data from UT Southwestern/UT Dallas for Viable and Necrotic Tumor Assessment Histopathological diagnosis for viable and non-viable tumor prediction for osteosarcoma using convolutional neural network American Society of Pediatric Hematology/Oncology (ASPHO) Palais des congrés de Montréal Montréal Computer aided image segmentation and classification for viable and non-viable tumor identification in osteosarcoma Convolutional neural network for histopathological analysis of osteosarcoma Signet Ring Cell Detection with a Semisupervised Learning Framework Learning to detect lymphocytes in immunohistochemistry with deep learning Automatic Non-Rigid Histological Image Registration Challenge Birl: Benchmark on Image Registration Methods With Landmark Validation. 
arXiv Clinical-grade computational pathology using weakly supervised deep learning on whole slide images Breast Metastases to Axillary Lymph Nodes Title: Computer-aided diagnosis of lung carcinoma using deep learning -a pilot study A Multi-Organ Nucleus Segmentation Challenge Rotation equivariant CNNs for digital pathology Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer Grand challenge on breast cancer histology images H&E-stained sentinel lymph node sections of breast cancer patients: The CAMELYON dataset From Detection of Individual Metastases to Classification of Lymph Node Status at the Patient Level: The CAMELYON17 Challenge An objective comparison of cell-tracking algorithms MitoEM Dataset: Large-Scale 3D Mitochondria Instance Segmentation from EM Images Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks A Single-cell Morphological Dataset of Leukocytes from AML Patients and Non-malignant Controls Neighborhood-Correction Algorithm for Cells SDCT-AuxNetθ: DCT augmented stain deconvolutional CNN with auxiliary classifier for cancer diagnosis Heterogeneity loss to handle intersubject and intrasubject variability in cancer. 
arXiv Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma Stain Color Normalization and Segmentation of Plasma Cells in Microscopic Images as a Prelude to Development of Computer Assisted Automated Disease Diagnostic Tool in Multiple Myeloma MiMM_SBILab Dataset: Microscopic Images of Multiple Myeloma SD-Layer: Stain deconvolutional layer for CNNs in medical microscopic imaging Overlapping cell nuclei segmentation in microscopic images using deep belief networks White Blood cancer dataset of B-ALL and MM for stain normalization Deep-learning-assisted detection and segmentation of rib fractures from CT scans: Development and validation of FracNet A Vertebral Segmentation Dataset with Fracture Grading VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet Large dataset for abnormality detection in musculoskeletal radiographs. arXiv Bone texture characterization with fisher encoding of local descriptors Labeling Vertebrae with Twodimensional Reformations of Multidetector CT Images: An Adversarial Approach for Incorporating Prior Knowledge of Spine Anatomy Analysis towards diabetic foot ulcer detection. arXiv International Skin Imaging Collaboration Data descriptor: The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the international skin imaging collaboration (ISIC). 
arXiv Skin lesion analysis toward melanoma detection: A challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC) Synthetic head and neck and phantom images for determining deformable image registration accuracy in magnetic resonance imaging Data from Synthetic and Phantom MR Images for Determining Deformable Image Registration Accuracy (MRI-DIR) Data from CT Phantom Scans for Head, Chest, and Controlled Protocols on 100 Scanners (CC-Radiomics-Phantom-3) Comprehensive Investigation on Controlling for CT Imaging Variabilities in Radiomics Studies Credence Cartridge Radiomics Phantom CT Scans with Controlled Scanning Approach (CC-Radiomics-Phantom-2) Data From Credence Cartridge Radiomics Phantom CT Scans A 3D population-based brain atlas of the mouse lemur primate with examples of applications in aging studies and comparative anatomy Radiology Data from the Clinical Proteomic Tumor Analysis Consortium Glioblastoma Multiforme The anomalous diffusion challenge Learn2Reg -The Challenge MeDaS: An open-source platform as service to help break the walls between medicine and informatics. arXiv Detecting the pulmonary trunk in CT scout views using deep learning Prediction of ambulatory outcome in patients with corona radiata infarction using deep learning Predicting conversion to wet age-related macular degeneration using deep learning Block Level Skip Connections across Cascaded V-Net for Multi-Organ Segmentation ImageNet: A large-scale hierarchical image database Gradient-based learning applied to document recognition Microsoft COCO: Common objects in context A Dataset of Pulmonary Lesions With Multiple-Level Attributes and Fine Contours. Frontiers in Digital Health Few-shot 3D multi-modal medical image segmentation using generative adversarial learning Zero-Shot Learning and Its Applications From Autonomous Vehicles to Covid-19 Diagnosis: A Review. 
We thank the National Natural Science Foundation of China (Grant 62072358), the Zhejiang University special scientific research fund for COVID-19 prevention and control, and the National Key R&D Program of China (Grant No. 2019YFB1311600) for supporting this work.