Shengping Yang PhD, Gilbert Berdine MD
Corresponding author: Shengping Yang
Contact Information: Shengping.Yang@pbrc.edu
DOI: 10.12746/swrccc.v11i46.1139
I am evaluating the feasibility of making predictions on COVID-19 patient treatment outcomes using data from various measurements. My impression is that, compared to traditional statistical methods, artificial intelligence (AI) results in better predictions. I am wondering what the pros and cons of the AI methods are.
First introduced by Alan Turing, who is often referred to as the “father of computer science”, and then later defined by John McCarthy, AI is “the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.”1,2
AI is a rapidly developing field that encompasses a number of subfields and applications, including machine learning (ML), natural language processing, computer vision, robotics, and expert systems.3 Among them, ML is the subfield with the most applications in the biomedical field.
Compared to traditional statistical methods, ML methods often dominate accuracy benchmarks and achieve substantially better results. On the other hand, the improved predictive accuracy comes with increased model complexity, which makes the models more difficult to interpret.4
ML is devoted to understanding and building methods that “learn,” i.e., methods that leverage data to improve performance on some set of tasks.5 In general, ML starts by building a model based on sample data, known as training data, and then makes predictions or decisions without being explicitly programmed to do so.6 In other words, the decision-making process in ML is data driven rather than following explicitly programmed instructions. There are several types of ML, including:
Supervised learning involves the development of algorithms that are trained on a labeled dataset, i.e., the output is known for each example in the training data, and the goal is to build a model that can make predictions on new examples based on the patterns learned from the training data. Examples include the interpretation of radiographic images such as chest x-rays or mammography. The training data for these examples are images that have been interpreted by experts. The AI “looks” for patterns in the images that lead to the correct interpretations. Because of its computational speed, the AI can “try” many patterns, including patterns that humans have not considered. Once patterns are recognized based on the training data, the AI is presented with new images to test the pattern recognition algorithm. The AI “learns” by rejecting patterns that are not successful against new images and continuing to evaluate successful pattern algorithms against additional new test images. There are several AI-based supervised learning methods, and many software packages have been developed to facilitate their implementation. For example, neural network analysis can be performed using packages/libraries developed in R (“neuralnet,” “nnet,” “mxnet,” etc.), Python (“Keras,” “TensorFlow,” “PyTorch,” etc.), or Java (“Deeplearning4j,” “Neuroph,” etc.).
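As a brief illustration, a minimal supervised learning sketch using the Keras library mentioned above is shown below; the simulated data, layer sizes, and training settings are arbitrary choices for demonstration, not recommendations for a real clinical dataset.

```python
import numpy as np
from tensorflow import keras

# Simulated labeled training data: 500 "patients" with 10 measurements each,
# and a known binary outcome that the network will learn to predict.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)).astype("float32")
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype("float32")

# A small feed-forward network: one hidden layer, sigmoid output for a binary label.
model = keras.Sequential([
    keras.layers.Input(shape=(10,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# The held-out validation split mimics testing the learned patterns on new examples.
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))
```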
There are other supervised learning methods, such as decision trees, random forests, and logistic regression. Note that although most of these methods are not considered AI-based, some can be used as part of an AI system.
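For comparison, the sketch below fits a decision tree with scikit-learn (an assumed library choice; the methods above do not prescribe one) on a bundled example dataset and prints the learned rules in human-readable form.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# A bundled example dataset standing in for labeled clinical measurements.
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)

# A shallow tree keeps the learned decision rules short enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=list(data.feature_names)))
```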
Unsupervised learning is a type of ML for discovering underlying structure or patterns in the data. In contrast to supervised learning, the data do not have known labels, and the algorithm is expected to discover the structure of the data through techniques such as clustering. Examples include AI systems that play chess or Go. The AI does not initially “know” what moves are immediately “good” moves, but learns the value of moves by “looking” further and further ahead into the game. Completely new lines of strategy never considered by humans are discovered and adopted by master players. Other examples of neural network-based unsupervised learning algorithms include autoencoders and deep generative models, and they can be implemented using packages/libraries developed in R (“neuralnet,” “autoencoder,” “deepautoencoder,” etc.), Python (“Keras,” “TensorFlow,” “PyTorch,” etc.), or Java (“Deeplearning4j,” “Encog,” etc.).
There are also other unsupervised learning algorithms, e.g., k-means and hierarchical clustering.
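As a minimal sketch of unsupervised clustering, the example below runs k-means on simulated unlabeled data using scikit-learn (again an assumed library choice); the three simulated groups stand in for structure the algorithm must discover on its own.

```python
import numpy as np
from sklearn.cluster import KMeans

# Simulated unlabeled data containing three groups; no labels are provided.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(100, 2)),
    rng.normal(loc=(4, 4), scale=0.5, size=(100, 2)),
    rng.normal(loc=(0, 4), scale=0.5, size=(100, 2)),
])

# k-means assigns each observation to one of three clusters it finds on its own.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)
print(np.bincount(kmeans.labels_))
```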
Besides supervised learning and unsupervised learning, other ML methods include reinforcement learning and semi-supervised learning.
The choice of ML method for a project depends on various factors, including the type of data, the overall goal, and other considerations. If all the training data have known labels, for example, then supervised learning methods are preferable to unsupervised methods. If the data require nonlinear decision boundaries and contain complex relationships, then neural networks can be more appropriate than random forests. Likewise, if a large amount of unstructured data must be processed, then a neural network is preferable to decision trees, which are better suited to structured data. On the other hand, if computational resources for training are limited, then a decision tree might be more practical.
Compared to traditional statistical methods, AI can often extract high-quality predictive models by mining large amounts of raw data and thereby achieve better predictions. In a study of 247,960 de-identified patients infected with COVID-19, for example, a recurrent neural network model was shown to have higher prediction accuracy on a number of clinical outcomes than traditional statistical modeling.7 In another study of 1,500 COVID-19 patients, random forest analysis had the best performance in identifying patients at high risk of mortality after infection.8 This is because most AI methods are capable of handling non-linear relationships between predictors and outcomes and can handle a large number of predictors with stable output. In addition, they are less likely to overfit prediction models than traditional methods.
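The sketch below illustrates this kind of comparison in general terms only, using a bundled benchmark dataset rather than the COVID-19 cohorts from the cited studies: a logistic regression and a random forest are trained on the same data and compared by area under the ROC curve; results on real clinical data will differ.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A bundled binary-outcome dataset used only to illustrate the comparison.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# A traditional statistical model versus an ensemble machine learning model.
logistic = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

for name, model in [("logistic regression", logistic), ("random forest", forest)]:
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```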
AI can process and analyze large datasets more efficiently and identify potential targets for prioritizing research efforts. In particular, AI often outperforms traditional modeling in variant calling, genome annotation, variant classification, and phenotype-to-genotype correspondence.9 Artificial intelligence has also been successfully applied in radiology to automate tumor and organ segmentation, as well as tumor monitoring.10 In addition, AI often does not require explicit user input of features and can use multiple layers to progressively extract higher-level features from raw input.11 For example, a deep learning algorithm that uses a patient’s CT volumes to predict the risk of lung cancer achieved an area under the curve of 94.4% among 6,716 National Lung Screening Trial cases.12
Compared with traditional statistical modeling, AI algorithms can often be easily scaled and automated to perform tasks without human intervention, and many are designed to continually learn and improve over time, giving them better adaptability to changing circumstances.
Although AI is a powerful tool that can be used to solve a wide variety of problems, there are several limitations that have not been appropriately addressed.
The biggest concern in AI applications, especially in the biomedical and healthcare fields, is that most AI systems, particularly those based on ML techniques, can be difficult to understand or explain. This problem is often referred to as the “black box” problem, and it can be a barrier to adopting AI in some contexts. In the medical field, for example, in order for a patient to trust and accept a decision, it is important that the doctor and patient have a clear understanding of how a diagnosis was reached and how a prediction was made. Without knowing the underlying rationale, it is difficult for doctors to understand the limitations of the AI system and to what degree to rely on it.
Many efforts have been made to improve the transparency and interpretability of AI systems. For example, federal agencies and the White House have been working to define federal guidelines for developing and using understandable and explainable AI systems, which focus on developing methods that can be understood by humans. In December 2020, Executive Order 13960 required that AI be understandable, specifically that agencies shall “ensure that the operations and outcomes of their AI applications are sufficiently understandable by subject matter experts, users, and others.”13,14
The most straightforward way to develop explainable AI is to use more interpretable models; for example, decision trees are more interpretable than neural networks. For techniques that are difficult to interpret, such as neural networks, there are methods to make them more interpretable; for example, sensitivity analysis can help reveal how a model makes decisions. Specifically, in sensitivity analysis, the input data are systematically altered and the resulting output observed, to understand how the network uses the input to alter the output and which factors are most important in determining the output. However, it is worth noting that sensitivity analysis has its limitations and might not capture all the interactions between the inputs and outputs of a network.15
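A simplified sketch of such a perturbation-based sensitivity analysis is shown below: each input of a fitted neural network (here a small scikit-learn model trained on a bundled dataset, chosen purely for demonstration) is shifted in turn, and the average change in predicted probability is recorded; larger changes suggest the input matters more to the model. This is one simple perturbation scheme, not the only way sensitivity analysis is done.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def perturbation_sensitivity(model, X, delta=0.1):
    """Mean absolute change in predicted probability when each input is shifted
    by `delta` standard deviations; larger values suggest greater influence."""
    baseline = model.predict_proba(X)[:, 1]
    scores = []
    for j in range(X.shape[1]):
        X_shift = X.copy()
        X_shift[:, j] += delta * X[:, j].std()
        scores.append(np.mean(np.abs(model.predict_proba(X_shift)[:, 1] - baseline)))
    return np.array(scores)

# Fit a small neural network on a bundled dataset purely for demonstration.
X, y = load_breast_cancer(return_X_y=True)
net = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
).fit(X, y)

scores = perturbation_sensitivity(net, X)
print("most influential inputs (column indices):", scores.argsort()[::-1][:5])
```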
Traditional analysis tools have also been used to improve AI transparency, including validating the results of AI models to ensure that they are accurate and consistent, supplementing the results of AI to provide additional context and details, and identifying the factors that are most important in the AI system.
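One commonly used tool of this kind is permutation importance, sketched below under the assumption of a fitted scikit-learn model and held-out data: shuffling a predictor and measuring the drop in performance indicates how strongly the model relies on it.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Fit any predictive model, then shuffle each predictor on held-out data and
# measure how much the score drops; a large drop marks an important factor.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top = result.importances_mean.argsort()[::-1][:5]
print("most important predictors (column indices):", top)
```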
If feasible, a hybrid approach can be used to combine the strengths of human interpretation and AI systems to make more interpretable decisions. In addition, it is important to regularly test and validate the explanations of an AI system to ensure that it operates as intended and that the explanations are helpful. Otherwise, the AI might “discover” an algorithm that works very well with existing data but does not continue to accurately interpret additional new data; results are not always generalizable. Nevertheless, explainable AI is an active area of research, and there is still much work to be done in this area.
Artificial intelligence also has other limitations, such as being computationally expensive and lacking structure, which makes it difficult to identify biases that may be present in the data.
Overall, AI has the potential to transform biomedical research and practice by offering novel decision-making systems for automating disease diagnosis, improving patient outcomes, and reducing the burden of disease. The choice among the various types of AI algorithms depends on the data, the objectives, and the constraints of a project. Compared to traditional modeling, AI often performs better in making predictions and decisions. However, a serious concern in applying AI, especially in the biomedical field, is its interpretability. Interpretable AI is an area of active research. With the introduction of more powerful, efficient, and transparent AI, it is foreseeable that AI will be used in a wider range of fields and domains in the future. Meanwhile, the prevalence of AI in various fields is likely to stir more debate and discussion on the ethical and societal implications of this technology.
Keywords: artificial intelligence, machine learning, supervised learning, unsupervised learning
Article citation: Yang S, Berdine G. Artificial intelligence in biomedical research. The Southwest Respiratory and Critical Care Chronicles 2023;11(46):62–65
From: Department of Biostatistics (SY), Pennington Biomedical Research Center, Baton Rouge, LA; Department of Internal Medicine (GB), Texas Tech University Health Sciences Center, Lubbock, Texas
Submitted: 1/9/2023
Accepted: 1/11/2023
Conflicts of interest: none
This work is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International License.