Open access peer-reviewed chapter

Application of Machine Learning and Data Mining in Medicine: Opportunities and Considerations

Written By

Luwei Li

Submitted: 11 September 2023 Reviewed: 25 September 2023 Published: 19 October 2023

DOI: 10.5772/intechopen.113286

From the Annual Volume

Machine Learning and Data Mining Annual Volume 2023

Edited by Marco Antonio Aceves-Fernández


Abstract

With the continuous development of information technology, machine learning and data mining have gradually found widespread application across industries. These techniques apply computer science to uncover the intrinsic patterns hidden in data. This trend is especially evident in today’s era of advanced artificial intelligence, which marks the anticipated third industrial revolution. By harnessing cutting-edge techniques such as multimodal large-scale models, artificial intelligence is profoundly reshaping traditional scientific research methods. The use of machine learning and data mining in medical research has a long history. In addition to traditional methods such as logistic regression, decision trees, and Bayesian analysis, newer techniques such as neural networks, random forests, support vector machines, Histogram-based Gradient Boosting, XGBoost, LightGBM, and CatBoost have gradually gained widespread adoption. Each technique has its own advantages and disadvantages, and the choice must be made carefully according to the specific research objectives in clinical practice. Today, with the emergence of large language models such as ChatGPT 3.5, machine learning and data mining are taking on new meanings and application prospects. ChatGPT offers benefits such as optimized code and ease of use, saving time and enhancing efficiency for medical researchers, and its use in clinical research is worth promoting.

Keywords

  • machine learning
  • data mining
  • medicine
  • artificial intelligence
  • interaction

1. Introduction

Data mining techniques [1] are computer algorithms used to identify associations from vast amounts of data. They employ heuristic methods, which essentially involve searching for relevant features and patterns and then extracting more feature information to support better decision-making [2]. Based on the algorithms used, data mining can be categorized into several types [3], with the most commonly used including machine learning algorithms, decision tree algorithms, association rule algorithms, and neural network algorithms.

Machine learning algorithms [4] learn from data and typically build a model for prediction or classification. They are highly useful in the realm of big data, discovering patterns that support decision-makers. These algorithms can predict outcomes from the features in a dataset and can be applied in medical research to tasks such as identifying important factors and their interactions, predicting disease risk, and performing classification based on observed signals in the data, such as disease prediction analysis.

Decision tree algorithms [5] possess strong representational capabilities and are widely applied in data mining. The gradient boosting tree algorithm, built on decision trees, is even more powerful. The fundamental idea is to infer a final decision from a series of feature attributes through scenario analysis. Decision tree algorithms build a classification framework in a supervised learning manner, making a rational decision at each node to achieve a high degree of accuracy in prediction tasks.

Association rule algorithms [6] discover relationships among multiple items in large datasets and draw conclusions from them [7]. Built on association rules and pattern discovery, these algorithms partition data by degree of association to uncover patterns. Association patterns can be HL (high support, low confidence), LH (low support, high confidence), or LL (low support, low confidence). LL patterns may seem insignificant, but they are relevant in specialized fields such as medicine; for instance, rare, high-risk medical scenarios can give rise to LL patterns. Positive association-rule patterns also find application in e-commerce, financial analysis, insurance assessment, and other domains.

Neural network algorithms [8], a branch of traditional artificial intelligence, employ clusters of neurons to create an abstract model that generates outputs from inputs. They can model complex connections within extensive datasets, build predictive models around those connections, and make decisions based on them. They are commonly used in fields such as clinical diagnostic decision-making and automatic classification.

Artificial intelligence (AI) is a field within computer science [9] that aims to mimic human thought processes, learning abilities, and knowledge storage [10]. In the era of big data, AI technologies can utilize vast clinical datasets to support clinical decisions, unveil latent disease subtypes, associations, prognosis indicators, and generate testable hypotheses. AI is gradually transforming the way doctors make clinical decisions and diagnoses [11]. Machine learning is a crucial branch of artificial intelligence. Machine learning and deep learning techniques have shown superior capabilities in handling large, complex, nonlinear, and multidimensional data compared to traditional statistical methods. They have found widespread applications in the medical field [12]. In the following discussion, I will approach the topic from a medical perspective and provide an overview of the latest applications of AI, specifically using machine learning algorithms such as logistic regression, linear regression, random forest, support vector machines, decision tree algorithms, and neural network algorithms. I will also share insights into the algorithmic modeling process, especially considering recent advancements such as the application of the new AI technology, ChatGPT 3.5. In addition, I will emphasize the validation aspects of post-algorithmic modeling, such as ROC curves, DCA curves, CIC curves, calibration curves, K-fold cross-validation, and even the construction of confusion matrices. Another focal point will be the application of these techniques in the context of the global backdrop of COVID-19 infections, particularly in the realm of public health interaction.

Machine learning algorithms have consistently been a focus and hot topic within data mining [13], especially in medical big data research, which combines large study samples, sophisticated computer algorithms, and advanced statistical theory. Data mining techniques can be used for diagnosis, prediction, classification, constructing predictive models, and analyzing risk factors [14]. The goal is to produce more reliable and widely applicable models, leading to practical applications that enhance the speed and accuracy of medical diagnosis [15]. Ultimately, this contributes to the treatment of human disease and indirectly accelerates research in medical robotics.

Machine learning algorithms encompass logistic regression, Cox regression, linear regression, random forest, support vector machines, and Naive Bayes, in addition to newer developments such as KNN, GBDT, Histogram-based Gradient Boosting, XGBoost [16], LightGBM [17], and CatBoost [18]. These advanced algorithms, particularly CatBoost, XGBoost, and LightGBM, have demonstrated significant improvements over traditional modeling techniques and are competitive with state-of-the-art machine learning methods. However, determining the optimal algorithm for model construction still requires practical data modeling and validation in real scenarios.

Speaking from my own perspective, I have been involved in research on algorithmic modeling, including projects such as “Construction of Chronic Disease Prediction Models and Applications Based on Data Mining” and “Artificial Intelligence Learning of HUA Susceptible Gene Molecular Typing and Risk Prediction.” I have research experience in using computer algorithms for modeling and validation across tens of thousands of medical big-data cases. My software background extends from SPSS, MedCalc, and R to ChatGPT 3.5, Python, and other statistical tools, and my grasp of statistical theory has deepened in tandem with this hands-on work. I have also published several academic papers on computer-based modeling and validation of chronic diseases, so I have a certain level of applied experience. Here, I will analyze and share experience in data mining technique modeling and its current clinical applications.


2. Algorithm introduction

2.1 Logistic regression

Logistic regression (LR) is a generalized linear regression model [19] and one of the most widely used algorithms in clinical applications. Besides its ease of implementation in software such as SPSS, its clinical risk analysis based on odds ratios (OR) is easy to comprehend. Moreover, LR’s broad applicability stems from its compatibility with numerous real-world problems. LR algorithms [20] are frequently used in clinical data mining, automated disease diagnosis, economic forecasting, and other domains. They can explore the risk factors behind diseases and predict the probability of disease occurrence from these factors. For instance, LR analysis yields the weights of the independent variables, offering insight into which features serve as risk factors for the outcome variable; using these weights, one can predict the likelihood of an individual falling ill. Many of my research studies also use LR as a foundational model, and it indeed offers predictive capability. However, if LR is the primary research tool, I recommend considering whether a linear model is suitable, as many clinical problems are not purely linear. In such cases, constructing a nonlinear regression model, such as a curve equation, segmented regression, spline regression, locally weighted regression, or a generalized additive model, is advisable. Before building predictive models, I suggest plotting scatter plots to visually assess whether the relationship between the independent and dependent variables is linear or nonlinear, and selecting the regression method accordingly.
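
As a minimal sketch of this workflow, the following Python example (using scikit-learn on synthetic data; the feature interpretations are hypothetical, not from any real study) fits an LR model and converts the coefficients to odds ratios:

```python
# Minimal logistic-regression sketch on synthetic data (scikit-learn).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                 # e.g., age, BMI, glucose (standardized; hypothetical)
y = (X @ np.array([0.8, 0.5, 1.2]) + rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_[0])          # OR per one-unit increase in each feature
print("Odds ratios:", odds_ratios)
print("Predicted risk of first case:", model.predict_proba(X[:1])[0, 1])
```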

2.2 Cox regression

Cox regression, on the other hand, is a semiparametric regression model based on the relationship between time and outcome [21], also known as the proportional hazards model or Cox model. It uses the survival outcome and survival time as dependent variables and can simultaneously analyze the influence of numerous factors on survival time. It can analyze data with survival times without requiring an estimate of the survival distribution. Because of these advantages, survival curve analysis and Cox regression models have, since their inception, found extensive application in medical follow-up studies, particularly in research areas closely tied to survival time, such as malignant tumors and cardiovascular disease. Cox regression is currently one of the most widely used multivariate methods in survival analysis.
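
For readers who work in Python rather than SPSS or R, a minimal Cox model sketch using the lifelines package might look as follows; the follow-up data are synthetic and the column names are illustrative:

```python
# Minimal Cox proportional-hazards sketch with lifelines on synthetic data.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "treatment": rng.integers(0, 2, n),
    "time": rng.exponential(24, n),      # follow-up time in months
    "event": rng.integers(0, 2, n),      # 1 = event observed, 0 = censored
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()                      # hazard ratios (exp(coef)) with 95% CIs
```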

2.3 Nomogram

Another intuitive machine learning tool is the nomogram. Nomograms represent the relationships between multiple independent variables in a plane Cartesian coordinate system using a cluster of nonintersecting line segments [22]. While long used in meteorology, they have gained widespread application in the medical field in recent years. Nomograms offer a visual and convenient way to present results based on different equations; hence, they can be used to depict regression results, including those of LR and Cox regression. There is debate over whether nomograms are necessary for linear regression, since linear predictions are straightforward to calculate and the advantages of a nomogram are less pronounced. For LR and Cox regression, however, nomograms conveniently convey disease risk or proportional hazards. They are constructed from the regression results as multiple line segments that facilitate the calculation of risk for different individuals.

Nomograms are widely used in clinical research [23, 24]. After building an LR model, I have found constructing nomograms based on LR to be a logical next step for elaborating results. More specifically, nomograms are constructed through a multifactor regression model, wherein each influence factor’s contribution to the outcome variable (the magnitude of regression coefficients) is scored for each value level. These scores are then summed up to obtain a total score. By establishing a function between the total score and the probability of the outcome event, the predicted value for the individual’s outcome event can be calculated.

Nomograms are built on the foundation of multifactor regression analysis, integrating multiple predictive indicators and expressing the relationships between variables through scaled line segments drawn on the same plane. They provide a quantifiable and visual representation of results obtained from regression predictive models, which is crucial for guiding clinical research. They hold significant application value in analyzing disease prognosis, especially for diseases such as malignant tumors [25, 26, 27].
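
The point-scoring logic just described can be sketched in a few lines of Python. The coefficients, variable ranges, and intercept below are invented for illustration, not taken from any real model:

```python
# Sketch of nomogram scoring from a (hypothetical) fitted logistic model.
import numpy as np

coefs = {"age": 0.04, "sbp": 0.02, "smoker": 0.7}    # hypothetical log-odds coefficients
ranges = {"age": (30, 90), "sbp": (90, 200), "smoker": (0, 1)}
intercept = -8.0

# Scale each variable's maximum contribution so the largest equals 100 points.
contrib = {v: coefs[v] * (hi - lo) for v, (lo, hi) in ranges.items()}
per_point = max(contrib.values()) / 100.0

def points(var, value):
    lo, _ = ranges[var]
    return coefs[var] * (value - lo) / per_point

patient = {"age": 65, "sbp": 150, "smoker": 1}
total = sum(points(v, x) for v, x in patient.items())
# Map total points back to the linear predictor and a predicted probability.
lp = intercept + sum(coefs[v] * ranges[v][0] for v in coefs) + total * per_point
prob = 1 / (1 + np.exp(-lp))
print(f"Total points: {total:.0f}, predicted risk: {prob:.2%}")
```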

2.4 Random forest

Random forest (RF) [28] is a classifier that employs multiple decision trees for training and prediction. RF is essentially an ensemble learning system in machine learning [29]. As an emerging and highly flexible machine learning algorithm, RF has a broad range of applications, from healthcare insurance to medical marketing. It can be used to predict disease risks and the susceptibility of patient populations [30]. RF builds upon decision trees (DT), utilizing knowledge learned from the dataset to classify new data. By setting parameters such as the number of trees and branching conditions, multiple DT models are constructed. The final output is determined by the collective decisions of all the decision trees, achieving optimal classification accuracy that surpasses individual decision trees.

The RF algorithm also incorporates the bagging approach. Intuitively, each decision tree is a classifier, so an input sample receives N classification results from N trees. RF aggregates all the classification votes and designates the class with the highest vote count as the final output. RF boasts several advantages, such as handling missing values, not requiring dimension reduction for high-dimensional data, and introducing randomness to prevent overfitting. Essentially, RF acts as a versatile powerhouse in machine learning, accommodating a wide array of inputs. It excels at estimating inference mappings, and its versatility renders it almost universally applicable. RF is particularly practical: unlike support vector machines (SVM), it does not require extensive parameter tuning and validation, and it often yields higher accuracy [31].

I have used over 10,000 clinical cases to construct RF predictive models for chronic diseases such as diabetes, hypertension, and hyperlipidemia. In the confusion matrix comparison between the RF model and simultaneously constructed SVM and neural network models, there was a significant gap between the weighted average (weighted avg) and macro average (macro avg) values. This discrepancy arises because the macro avg is the arithmetic mean of the indicators for each class: it assigns equal weight to every class and disregards class imbalance, so each class’s performance receives equal attention even when the class distribution is skewed. In contrast, the weighted avg averages across classes using class sample sizes as weights; with imbalanced classes, it emphasizes the classes with larger sample sizes and attenuates the performance of the smaller ones. Overall, however, the RF algorithm, compared with SVM and neural networks, may perform better on large, high-dimensional datasets with complex decision boundaries. It is worthy of broader adoption in clinical research, as illustrated in the sketch below.
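
The macro versus weighted averaging effect can be reproduced on a deliberately imbalanced synthetic dataset; this scikit-learn sketch is illustrative only:

```python
# Macro vs. weighted metrics for a random forest on imbalanced synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)            # 9:1 class imbalance
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
# With imbalanced classes, macro avg weights both classes equally while
# weighted avg is dominated by the majority class, so the two can diverge.
print(classification_report(y_te, rf.predict(X_te)))
```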

2.5 Support vector machine

Support vector machine (SVM) is a type of generalized linear classifier that performs binary classification on data in a supervised learning manner. Its decision boundary is a maximum-margin hyperplane derived from learning samples, and it is a classifier known for its sparsity and robustness [32]. SVM can conduct nonlinear classification using kernel methods, implicitly mapping input content to a high-dimensional feature space. SVM is a binary classification algorithm that supports both linear and nonlinear classifications. Evolving over time, it can now handle multiclass classification and regression problems through extension.

SVM finds a hyperplane for classification on the training dataset and then uses this hyperplane to classify new data based on whether the result is greater or less than 0. For completely linearly separable problems, a hyperplane can be found directly. Even considering noisy data, introducing a soft margin can still lead to a solution. In cases of linear inseparability, data can be lifted to a higher dimension to achieve linear separability, and the issue of the unknown mapping function introduced by dimensionality augmentation can be resolved using kernel techniques. SVM’s greatest advantage is its use of kernel techniques to capture nonlinear relationships. In addition, SVM is highly effective in handling small amounts of high-dimensional data, particularly in classification problems. SVM has widespread applications, including medical diagnostics such as disease detection and cancer diagnosis [33]. It can be used in image recognition, video classification, and medical image classification [34] by extracting feature vectors from images and videos and using them as inputs for training models.

Similarly, in building predictive models for chronic diseases, I have also constructed SVM models. The primary SVM parameters are the regularization parameter (the penalty coefficient of the error term) and the kernel parameters. The choice of kernel function is crucial: by selecting an appropriate kernel, the SVM algorithm can transform a nonlinear problem into a linear one. The gamma value, the coefficient of the kernel function, influences the distance between data points in the high-dimensional space: a higher gamma produces a more complex decision boundary with greater influence from individual training samples, whereas a lower gamma yields a smoother boundary. Implementations commonly select gamma automatically from the training data by default (for example, the “auto” and “scale” settings in scikit-learn). Strengthening training and adjusting parameters on the fly are crucial for SVM, because the model’s predictions and probabilities depend on the input parameters. In my SVM modeling work, improperly set parameters caused the confusion matrix evaluation of the SVM model, particularly the precision for positive and negative outcomes, to show larger errors than the random forest (RF) and neural network models, and the parameters had to be adjusted. Through repeated training, the SVM model could achieve higher accuracy than RF or even XGBoost [35]. However, SVM has the disadvantage of slower computation than algorithms such as artificial neural networks (ANN) and RF. Its computational complexity is higher, especially for large-scale, high-dimensional datasets, leading to significant computation time and space requirements; in addition, training requires multiple iterations, further increasing the cost. Training an SVM model takes a relatively long time, a drawback evident in my chronic disease modeling, and for clinical research data that require repeated training this can be a significant limitation. A tuning sketch follows.
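
The parameter tuning described above (the penalty coefficient C, the kernel, and gamma) is commonly automated with a grid search. A minimal scikit-learn sketch follows; the parameter grids are illustrative and would be tuned per dataset:

```python
# SVM with an RBF kernel, tuning C and gamma via cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = make_pipeline(StandardScaler(),
                    SVC(kernel="rbf", probability=True))  # probability=True enables predict_proba
grid = GridSearchCV(svm, {"svc__C": [0.1, 1, 10],
                          "svc__gamma": ["scale", 0.01, 0.1]}, cv=5)
grid.fit(X_tr, y_tr)
print("Best parameters:", grid.best_params_)
print("Test accuracy:", grid.score(X_te, y_te))
```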

2.6 Decision tree

Decision tree (DT) is a machine learning method [36]. Its prominent feature is a tree-like structure, wherein each internal node represents a judgment on an attribute, each branch the output of a decision, and each leaf node a classification outcome. In general, DT is a decision-analysis method that, given the known probabilities of various scenarios, constructs a tree to evaluate risk and feasibility, for example, the probability that an expected value meets a given threshold. This graphical form of probability analysis is termed a decision tree because of its resemblance to the branches of a tree. In machine learning, DT is a predictive model [37] representing the mapping between object attributes and object values. DT uses algorithms such as ID3, C4.5, and C5.0 to generate trees, employing the concept of entropy from information theory.

DT used for classification is called a classification tree and is a commonly used classification method [38, 39]. It is a form of supervised learning: given a set of samples, each with a set of attributes and a predefined category, learning produces a classifier that can correctly categorize newly encountered objects. The Classification and Regression Tree (CART), in turn, is an extremely effective nonparametric method for both classification and regression [40]. It achieves its prediction goals by constructing binary trees. The CART model is widely employed in statistics and data mining. It constructs predictive criteria in an entirely different way from traditional statistics, making it easy to understand, use, and interpret [41]. Predictive trees generated by CART are often more accurate than algebraic criteria constructed with commonly used statistical methods, an advantage that grows as the data become more complex and the number of variables increases. In many of my studies, the DT model has been used extensively. When constructing a DT model, selecting the right algorithm is of utmost importance, and the CART algorithm is a highly practical choice. Its classification and regression capabilities are powerful and suit large-scale, high-dimensional data, and its greatest advantage is producing easily understandable graphical results, meeting the needs of most clinical studies. However, the disadvantages of CART should not be overlooked. For example, it generates only binary trees and cannot produce multiway splits. CART is prone to overfitting and typically requires prepruning, which sets a criterion during tree growth and stops growth once that criterion is reached; this can create a “horizon limitation,” wherein once a node becomes a leaf, favorable branching operations for its successors are cut off. Pruning parameters therefore need to be set carefully based on clinical statistical experience (see the sketch below). In addition, DT can indicate interaction effects and can further be used for multiplicative and additive interaction analysis, a focal point and challenge in clinical etiological research in public health that is discussed in detail in the final section.
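
A brief sketch of CART-style modeling with prepruning, using scikit-learn’s decision tree (an optimized CART implementation) on a built-in, nonclinical dataset; the pruning values are illustrative:

```python
# CART-style decision tree with prepruning (max_depth, min_samples_leaf).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(criterion="gini",        # CART splits on Gini impurity
                              max_depth=3,             # prepruning: stop growth early
                              min_samples_leaf=20,
                              random_state=0).fit(X_tr, y_tr)
print(export_text(tree))                               # readable if-then rules
print("Test accuracy:", tree.score(X_te, y_te))
```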

2.7 Artificial neural networks

Artificial neural networks (ANNs) [42] have been a research focus in artificial intelligence since the 1980s. They serve as a concrete manifestation of artificial intelligence; thus, this chapter places a strong emphasis on them. ANNs abstract the neural network of the human brain from an information processing perspective, creating a simplified model by assembling nodes in different connection patterns to form various networks. In the medical, engineering, and academic fields, ANNs are often referred to simply as neural networks. An ANN [43] is a computational model comprising numerous interconnected nodes (neurons). Each node represents a particular output function called an activation function, and the connection between every two nodes carries a weighted value for the signal passing through it, known as a weight; this is equivalent to the memory of the ANN. The network’s output varies with its connection pattern, weight values, and activation function. The network itself typically approximates some algorithm or function found in nature, or expresses a particular logical strategy. Over the past decade, research on ANNs has made substantial progress.

ANNs have successfully addressed many complex real-world problems in fields such as pattern recognition, intelligent robotics, automatic control, prediction and estimation, biology, medicine, and economics, displaying remarkable intelligence [44]. ANNs are widely applied in machine learning tasks such as constructing clinical disease diagnosis and prediction models, as well as in image recognition and speech recognition [45, 46], extending to automated disease detection and self-driving cars. An ANN is a highly parallel information processing system with strong adaptive learning capabilities [47]. It does not rely on a mathematical model of the research object, and it is robust to changes in system parameters and to external disturbances of the controlled object. ANNs can handle complex, multi-input, multi-output nonlinear systems; the fundamental problem they address is classification. There are various types of ANNs, including back propagation (BP) neural networks, random neural networks, convolutional neural networks (CNN), long short-term memory networks (LSTM), and multilayer perceptrons (MLP) [48]. The choice of ANN algorithm depends on the research objectives and data types, weighing the pros and cons of each algorithm.

ANN is a parallel distributed system that utilizes mechanisms distinct from traditional artificial intelligence and information processing techniques. It overcomes the limitations of traditional logic-based AI in dealing with intuition and unstructured information, making its application in artificial intelligence incredibly versatile. Training an ANN requires a significant amount of time and effort. The types of processing units within the network are divided into three categories: input units, output units, and hidden units. The appropriate ANN algorithm must be selected based on the objectives of clinical research. Four common characteristics of ANNs are nonlinearity, nonlocality, nondeterminacy, and nonconvexity. Their common major drawback, however, is the “black box” effect [49], wherein they exhibit similar functionality to the human brain, producing results that are not entirely explainable.

ANN finds extensive application in medical research, primarily due to its advantages [50, 51]. Specifically, ANN possesses self-learning capability. For instance, when implementing medical image recognition, feeding numerous diverse image samples and their corresponding recognition outcomes into an ANN enables the network to gradually learn to recognize similar images. Self-learning is particularly significant for predictive modeling. For example, convolutional neural networks (CNN) play a crucial role in research on color ultrasound, imaging, and electrocardiograms [52]. ANN models can also provide economic forecasts, market predictions, benefit projections, and more, making their potential applications extensive. Furthermore, ANN exhibits associative storage, which can be achieved using a feedback network. Connections between neurons are assigned weights, and training algorithms adjust these weights iteratively, minimizing prediction error and enhancing accuracy.

I have used BP neural networks for modeling, with rectified linear units (ReLU) as the activation function for the hidden layers and the sigmoid activation function for the output layer. The sigmoid function, being smooth and differentiable, is more precise than a linear function for classification and exhibits better fault tolerance; its differentiability allows it to be used in gradient descent, and using it in the output layer restricts outputs to a small range. As a result, the model’s connectivity is richer than in models such as decision trees (DT), and the hidden layers can reveal the interconnections between the independent variables and the dependent variable. Furthermore, ANN excels at rapidly finding optimized solutions: searching for an optimized solution to a complex problem often requires significant computation, and a feedback-type neural network designed for the problem lets computers exploit their high-speed processing to find one quickly. ANN can also output importance values for the independent-variable features, so the features with the greatest impact on the dependent variable can be clearly identified from the comprehensive interrelationships among the independent variables; its ability to find optimal solutions exceeds that of logistic regression (LR) and DT models. Regarding ANN parameter settings, various transformations can be attempted on the training and validation sets, and if a multilayer perceptron (MLP) neural network is used, adding a support set is recommended to enhance training and learning. In my experience with MLP modeling, MLP neural networks are more capable of discovering complex factor relationships. The MLP is a feed-forward supervised learning technique that, with parameters set according to the data type of the dependent variable, can yield more accurate predictive classifications and probabilities to guide clinical diagnosis and treatment. However, in my experience both BP and MLP neural networks have drawbacks, such as slow convergence, susceptibility to local minima, and failure to reach the global optimum. As for other models such as radial basis function networks, CNN, and random neural networks, which I have not yet attempted, I will refrain from discussing them here. A minimal MLP sketch follows.
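
A compact sketch of a comparable setup, ReLU hidden layers with a logistic (sigmoid) output, can be written with scikit-learn’s MLPClassifier; the data and layer sizes are illustrative (for binary targets, scikit-learn applies a logistic output automatically):

```python
# Small MLP: ReLU hidden layers, logistic output for binary classification.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16),  # two ReLU hidden layers
                  activation="relu",
                  max_iter=500, random_state=0),
)
mlp.fit(X_tr, y_tr)
print("Test accuracy:", mlp.score(X_te, y_te))
print("Predicted probabilities:", mlp.predict_proba(X_te[:3])[:, 1])
```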

In addition to the models above, other machine learning models are currently applied in clinical medical research, including Histogram-based Gradient Boosting, CatBoost, LightGBM, XGBoost, GBM, and GBDT, all modeling algorithms based on decision trees. CatBoost, LightGBM, and XGBoost are advanced machine learning algorithms refined from the GBM algorithm, which optimizes the loss function in the direction of steepest gradient through ordered iterations. These algorithms, including GBM and its derivatives such as Histogram-based Gradient Boosting, XGBoost, LightGBM, and CatBoost, are considered focal points in medical research, partly because they are well suited to the flat, tabular data common in medical studies. However, they remain relatively underutilized in clinical applications because of their complexity, which makes them challenging for ordinary medical practitioners to grasp. Each also has its own advantages and drawbacks, and I currently lack the experience to comment further; it is hoped that these advanced algorithms will gradually be applied to clinical modeling and validation, so that more suitable algorithms for medical studies can be identified (a minimal boosting sketch appears below). Many advanced algorithms build on decision trees, and combining my previous machine learning experience, I believe this is largely because decision trees have numerous advantages: they can be thought of as sets of if-then rules, making them easy to understand and interpret; they require minimal feature engineering and no prior assumptions; they handle missing values well and are robust, especially once overfitting countermeasures are in place; and tree construction is computationally inexpensive, so models can be built quickly even from large training sets.
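
For readers who wish to try these boosting methods, a minimal sketch with scikit-learn’s Histogram-based Gradient Boosting (the same family as LightGBM and XGBoost) follows; the data, missing-value rate, and hyperparameters are illustrative:

```python
# Histogram-based gradient boosting on synthetic data with missing values.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X[np.random.default_rng(0).random(X.shape) < 0.05] = np.nan  # boosting trees tolerate NaN natively

gbm = HistGradientBoostingClassifier(learning_rate=0.1, max_iter=200,
                                     random_state=0)
print("5-fold CV AUC:", cross_val_score(gbm, X, y, cv=5,
                                        scoring="roc_auc").mean())
```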

2.8 Naive Bayes algorithm

The Naive Bayes algorithm is also widely used in medical research, although I have not yet personally employed it. Naive Bayes is a classification algorithm [53] based on Bayes’ theorem and the assumption of feature independence. It classifies samples using probability theory, combining prior and posterior probabilities to avoid relying only on subjective prior probabilities or overfitting to the sample information alone. The algorithm is suitable for medium-sized data mining tasks [54]. Naive Bayes is extensively applied in the medical field: it can be used for disease diagnosis, modeling patient symptoms and test results to predict possible diseases and assist doctors in diagnostic decisions. It is also used in medical image classification; although its performance on image data is weaker than that of deep learning methods, it can still be applied to simple image classification tasks by modeling image features and automatically assigning images to categories.
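
A minimal Naive Bayes sketch, showing the estimated class priors and the posterior probabilities the text refers to, on a built-in (nonclinical) dataset:

```python
# Gaussian Naive Bayes: priors from training data, posteriors for new cases.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_tr, y_tr)
print("Class priors:", nb.class_prior_)            # P(class) estimated from training data
print("Posterior for first test case:", nb.predict_proba(X_te[:1]))
print("Test accuracy:", nb.score(X_te, y_te))
```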


3. Evaluating predictive model performance

Evaluating the quality of models involves certain criteria. Regression models are often assessed with metrics such as mean squared error, root mean squared error, and R-squared. For classification models in medical research, commonly used metrics include accuracy, precision, recall, and the F1 score, all derived from the confusion matrix. The ROC curve [55] yields metrics such as the AUC, specificity, sensitivity, and Youden’s J index. Furthermore, the clinical impact curve, the DCA curve [24], and the calibration curve are employed to assess clinical model performance. These standards are applied in postmodeling validation, as in the sketch below.
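
Several of the listed metrics can be computed in a few lines. The following sketch (synthetic data, scikit-learn) derives the confusion matrix, the AUC, sensitivity and specificity at the Youden-optimal cutoff, and a simple calibration summary (the Brier score):

```python
# Common classification validation metrics on a held-out test set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (brier_score_loss, confusion_matrix,
                             roc_auc_score, roc_curve)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
prob = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

print("Confusion matrix:\n", confusion_matrix(y_te, prob > 0.5))
print("AUC:", roc_auc_score(y_te, prob))
fpr, tpr, thresh = roc_curve(y_te, prob)
j = tpr - fpr                         # Youden's J = sensitivity + specificity - 1
best = np.argmax(j)
print(f"Best cutoff {thresh[best]:.2f}: sensitivity {tpr[best]:.2f}, "
      f"specificity {1 - fpr[best]:.2f}")
print("Brier score (calibration):", brier_score_loss(y_te, prob))
```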


4. Application of artificial intelligence

Building upon the previous explanation, let us delve into the application of artificial intelligence in medical research modeling. Artificial intelligence (AI) is a new technological science that involves the study, development, and application of theory, methods, techniques, and systems for simulating, extending, and expanding human intelligence. AI is a driving force in the new round of technological revolution and industrial transformation. It is an important component of the discipline of intelligence, aiming to understand intelligence and produce intelligent machines capable of reacting in ways similar to human intelligence. This field encompasses robotics, language recognition, image recognition, natural language processing, expert systems, and more.

AI can be implemented on computers in two ways. The first approach employs traditional programming techniques to achieve intelligent behavior without necessarily adhering to methods used by humans or animals. This is known as the engineering approach, which has yielded results in fields such as optical character recognition and computer chess. The second approach, known as the simulation approach, not only focuses on achieving outcomes but also aims for methods similar to those used by humans or biological organisms. Genetic algorithms and ANNs fall into this category.

The recent advancement of AI, exemplified by ChatGPT 3.5 [56], has sparked a new wave of AI learning. I have personally used ChatGPT 3.5 for modeling and validating random forest, support vector machine, and neural network models, comparing them with traditional R-based modeling (RF, SVM, and ANN). I have gained some practical experience in this regard. It is important to note that ChatGPT is a language model capable of editing text and computer code. In medical research, using ChatGPT 3.5 essentially means having it generate code in languages such as R or Python, which is then run to obtain results. Based on reported test results for ChatGPT 4.0, which achieved over 90% accuracy in domains such as the U.S. bar exam, the Biology Olympiad, and the CPA exam, it is evident that ChatGPT 4.0 possesses enhanced memory, logical analysis, and reasoning capabilities, and its algorithmic models are expected to be more logic- and reasoning-oriented. Ongoing research is also exploring ChatGPT’s performance on the U.S. physician licensing examinations [57].

ChatGPT boasts distinct advantages, such as enhancing scientific writing and augmenting research fairness and versatility. In healthcare research [58], its applications encompass effective dataset analysis, code generation, literature reviews, freeing time to focus on experimental design, and drug discovery and development. The benefits for healthcare practice involve streamlining workflows, cost saving, record keeping, personalized health care, and boosting health literacy. In healthcare education, the benefits include enhancing personalized learning and emphasizing critical thinking and problem-based learning [59]. ChatGPT possesses a “brain” reminiscent of a human’s and can recall previous interactions and user comments, establishing contextual connections, an area where earlier AI language models often lagged. On these strengths, ChatGPT is gaining traction for extensive use in health care [60], to the extent that it has even been listed as a coauthor on research papers, although the efficacy [61] and rationale [62] of this practice need evaluation. In addition, developing virtual assistants to help patients manage their health is another crucial application of ChatGPT in medicine. ChatGPT can also be used for clinical decision support and patient monitoring, suggesting consultations with healthcare professionals based on warning signs and symptoms.

In my research, I have run computations using code written by ChatGPT alongside traditional R modeling of RF, ANN, and other algorithms. Comparing the outcomes of ChatGPT 3.5-generated code against traditional R methods, I found that the ChatGPT-assisted modeling executed faster, achieved better accuracy, precision, and recall, and required simpler parameter settings. Its validation performance was indeed superior to that of my traditional R machine learning models, making it a valuable approach for clinical promotion and application. Most importantly, ChatGPT-assisted models can adjust model parameters automatically as needed, generating increasingly good predictive models without the laborious, resource-intensive cycle of repeated training and manual tuning required in traditional R modeling, and the generated code can be continuously optimized during training in a way that static R scripts are not. For healthcare practitioners without coding skills, such AI-assisted modeling has tremendous value. Used effectively, ChatGPT can save substantial time for more efficient and prioritized tasks [63]. However, it is important to disclose the shortcomings of using ChatGPT in the healthcare and wellness domains [63].

While ChatGPT is currently one of the best assistants for editing computer algorithm code, its training data have a cutoff, so it cannot synthesize the latest information to provide optimal algorithmic code. Furthermore, current applications of ChatGPT may raise concerns related to plagiarism, copyright infringement, privacy, and cybersecurity, necessitating a thorough assessment of its security. Lastly, while ChatGPT has certain ethical safeguards, it cannot achieve perfection and could potentially provide guidance on illegal activities if consulted. Thus, the use of ChatGPT must adhere to relevant laws and regulations and respect regional customs.


5. Introduction to interaction

Moving on to the topic of interaction [64], interactions are classified into multiplicative interaction (product model) and additive interaction (sum model). In the additive model, when there is no interaction, the combined effect of two or more factors acting on an event equals the sum of the effects when these factors act individually. In the multiplicative model, when there is no interaction, the combined effect of two or more factors acting on an event equals the product of the effects when these factors act individually. According to prior epidemiological literature [65], multiplicative interaction indicates statistical interaction effects, whereas additive interaction further suggests biological interaction effects. Given that features in clinical research typically do not have isolated effects, investigating the mutual effects of multiple features on the outcome variable is a direction in clinical research. This aspect becomes even more prominent in public health studies. Similarly, multiplicative and additive interactions do not necessarily coexist. However, in our research, the simultaneous statistical significance of both multiplicative and additive interactions between two features is used as a meaningful criterion for clinical research. In other words, both features must simultaneously fulfill the criteria for statistically significant multiplicative and additive interactions for the evaluation of their combined effect on the outcome variable to have both statistical and biological significance. This is crucial for etiological studies in clinical research. To assess the significance of multiplicative interaction, hypothesis testing criteria can be set, generally at 0.05. However, assessing additive interaction requires the simultaneous fulfillment of three indicators: the relative excess risk due to interaction (RERI), the attributable proportion due to interaction (AP), and the synergy index (S). When there is no additive interaction, the confidence intervals for RERI and AP should include 0, and the confidence interval for S should include 1.
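
The additive-interaction indicators just defined (RERI, AP, and S) can be computed directly from the coefficients of a logistic model containing two binary exposures and their product term. The sketch below uses synthetic data and the statsmodels package; the exposure labels echo the COVID-19 example in the next paragraphs and are purely illustrative:

```python
# RERI, AP, and S from a logistic model with two binary exposures.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
a = rng.integers(0, 2, n)                  # exposure A (e.g., underlying condition)
b = rng.integers(0, 2, n)                  # exposure B (e.g., gathering attendance)
logit = -2 + 0.6 * a + 0.8 * b + 0.5 * a * b
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = sm.add_constant(np.column_stack([a, b, a * b]))
fit = sm.Logit(y, X).fit(disp=0)
b1, b2, b3 = fit.params[1:]

or10, or01, or11 = np.exp(b1), np.exp(b2), np.exp(b1 + b2 + b3)
reri = or11 - or10 - or01 + 1              # relative excess risk due to interaction
ap = reri / or11                           # attributable proportion
s = (or11 - 1) / ((or10 - 1) + (or01 - 1)) # synergy index
print(f"RERI={reri:.2f}, AP={ap:.2f}, S={s:.2f}")
# Multiplicative interaction is tested on the product-term coefficient:
print("Multiplicative interaction p-value:", fit.pvalues[3])
```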

Evaluating the additive and multiplicative models of clinical impact factors holds great significance for disease prediction and diagnosis, particularly in epidemiological research and especially in the aftermath of the COVID-19 pandemic. Calculating additive interaction is important because it has broader implications for public health. I have personally conducted interaction analyses of factors in chronic diseases such as hyperuricemia and hypertension. Finally, drawing on that experience, let us illustrate the importance of interaction research with a simple example.

Suppose a population attends a gathering event, where some individuals have underlying health conditions and others do not. The gathering is a risk factor for contracting COVID-19, and we want to intervene to improve control of the disease. To make the best use of resources such as manpower, communication, incentives, and penalties, we need to consider both the multiplicative and the additive interaction models. Suppose we apply interaction theory to investigate the combined effects of having an underlying health condition and attending the gathering on the risk of COVID-19 infection. Say the multiplicative analysis indicates a negative (reverse) multiplicative interaction, implying that among individuals without underlying health conditions, the risk of infection is higher for those attending the gathering than for those not attending. Relying solely on this multiplicative result, however, overlooks the group with underlying health conditions and may identify the wrong high-risk group. Now suppose the additive analysis indicates a positive additive interaction, meaning that among individuals with underlying health conditions, intervening in the gathering event yields greater public health benefit. Exploring factor interactions can therefore guide clinical and preventive decisions, allowing resources to be allocated efficiently for optimal outcomes; both the multiplicative and additive models should be analyzed comprehensively to obtain the most reliable results. With the widespread occurrence of infectious diseases, statistical research on factor interactions is bound to increase, and this work will hopefully provide valuable insights for future infectious disease prevention and control. This serves as one of the most significant contributions of this paper.

References

  1. Yoo I, Alafaireet P, Marinov M, et al. Data mining in healthcare and biomedicine: A survey of the literature. Journal of Medical Systems. 2012;36(4):2431-2448
  2. Iavindrasana J, Cohen G, Depeursinge A, et al. Clinical data mining: A review. In: Yearbook of Medical Informatics. International Medical Informatics Association (IMIA); 2009. pp. 121-133
  3. Wu WT, Li YJ, Feng AZ, et al. Data mining in clinical big data: The frequently used databases, steps, and methodological models. Military Medical Research. 2021;8(1):44
  4. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930
  5. Hammann F, Drewe J. Decision tree models for data mining in hit discovery. Expert Opinion on Drug Discovery. 2012;7(4):341-352
  6. Tian H. Brand marketing leveraging the advantage of emoji pack relying on association rule algorithm in data mining technology. Computational Intelligence and Neuroscience. 2022;2022:3511211
  7. Hadavi S, Oliaei S, Saidi S, et al. Using data mining and association rules for early diagnosis of esophageal cancer. The Gulf Journal of Oncology. 2022;1(40):38-46
  8. Kriegeskorte N, Golan T. Neural network models and deep learning. Current Biology. 2019;29(7):R231-R236
  9. Holmes JH, Sacchi L, Bellazzi R, et al. Artificial intelligence in medicine AIME 2015. Artificial Intelligence in Medicine. 2017;81:1-2
  10. Mintz Y, Brodie R. Introduction to artificial intelligence in medicine. Minimally Invasive Therapy & Allied Technologies. 2019;28(2):73-81
  11. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism. 2017;69S:S36-S40
  12. Mentis AA, Garcia I, Jiménez J, Paparoupa M, Xirogianni A, Papandreou A, et al. Artificial intelligence in differential diagnostics of meningitis: A nationwide study. Diagnostics (Basel). 2021;11(4):602
  13. Zia A, Aziz M, Popa I, Khan SA, Hamedani AF, Asif AR. Artificial intelligence-based medical data mining. Journal of Personalized Medicine. 2022;12(9):1359
  14. Birjandi SM, Khasteh SH. A survey on data mining techniques used in medicine. Journal of Diabetes and Metabolic Disorders. 2021;20(2):2055-2071
  15. Wen X, Leng P, Wang J, et al. Clinlabomics: Leveraging clinical laboratory data by data mining strategies. BMC Bioinformatics. 2022;23(1):387
  16. Hou N, Li M, He L, et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: A machine learning approach using XGboost. Journal of Translational Medicine. 2020;18(1):462
  17. Zhu J, Su Y, Liu Z, et al. Real-time biomechanical modelling of the liver using LightGBM model. The International Journal of Medical Robotics. 2022;18(6):e2433
  18. Hancock JT, Khoshgoftaar TM. CatBoost for big data: An interdisciplinary review. Journal of Big Data. 2020;7(1):94
  19. Stoltzfus JC. Logistic regression: A brief primer. Academic Emergency Medicine. 2011;18(10):1099-1104
  20. Schober P, Vetter TR. Logistic regression in medical research. Anesthesia and Analgesia. 2021;132(2):365-366
  21. Zhang Z, Reinikainen J, Adeleke KA, et al. Time-varying covariates and coefficients in Cox regression models. Annals of Translational Medicine. 2018;6(7):121
  22. Park SY. Nomogram: An analogue tool to deliver digital knowledge. The Journal of Thoracic and Cardiovascular Surgery. 2018;155(4):1793
  23. Wang X, Lu J, Song Z, et al. From past to future: Bibliometric analysis of global research productivity on nomogram (2000-2021). Frontiers in Public Health. 2022;10:997713
  24. Zhang W, Ji L, Wang X, et al. Nomogram predicts risk and prognostic factors for bone metastasis of pancreatic cancer: A population-based analysis. Frontiers in Endocrinology (Lausanne). 2021;12:752176
  25. Hu C, Yang J, Huang Z, et al. Diagnostic and prognostic nomograms for bone metastasis in hepatocellular carcinoma. BMC Cancer. 2020;20(1):494
  26. Yu P, Wu X, Li J, et al. Extrathyroidal extension prediction of papillary thyroid cancer with computed tomography based radiomics nomogram: A multicenter study. Frontiers in Endocrinology (Lausanne). 2022;13:874396
  27. Zhang D, Hu J, Liu Z, et al. Prognostic nomogram in patients with epithelioid sarcoma: A SEER-based study. Cancer Medicine. 2023;12(3):3079-3088
  28. Rigatti SJ. Random forest. Journal of Insurance Medicine. 2017;47(1):31-39
  29. Doupe P, Faghmous J, Basu S. Machine learning for health services researchers. Value in Health. 2019;22(7):808-815
  30. Guo L, Wang Z, Du Y, et al. Random-forest algorithm based biomarkers in predicting prognosis in the patients with hepatocellular carcinoma. Cancer Cell International. 2020;20:251
  31. Uddin S, Khan A, Hossain ME, et al. Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making. 2019;19(1):281
  32. Lee EJ, Kim YH, Kim N, et al. Deep into the brain: Artificial intelligence in stroke imaging. Journal of Stroke. 2017;19(3):277-285
  33. Gaonkar B, Davatzikos C. Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification. NeuroImage. 2013;78:270-283
  34. Habehh H, Gohel S. Machine learning in healthcare. Current Genomics. 2021;22(4):291-300
  35. Silva G, Fagundes TP, Teixeira BC, et al. Machine learning for hypertension prediction: A systematic review. Current Hypertension Reports. 2022;24(11):523-533
  36. Al FL, Shomo MI, Alazzam MB, et al. Processing decision tree data using internet of things (IoT) and artificial intelligence technologies with special reference to medical application. BioMed Research International. 2022;2022:8626234
  37. DeGregory KW, Kuiper P, DeSilvio T, et al. A review of machine learning in obesity. Obesity Reviews. 2018;19(5):668-685
  38. Zhu Y, Fang J. Logistic regression-based trichotomous classification tree and its application in medical diagnosis. Medical Decision Making. 2016;36(8):973-989
  39. Tsien CL, Fraser HS, Long WJ, et al. Using classification tree and logistic regression methods to diagnose myocardial infarction. Studies in Health Technology and Informatics. 1998;52(Pt 1):493-497
  40. Schilling C, Mortimer D, Dalziel K, et al. Using classification and regression trees (CART) to identify prescribing thresholds for cardiovascular disease. PharmacoEconomics. 2016;34(2):195-205
  41. Henrard S, Speybroeck N, Hermans C. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia. Haemophilia. 2015;21(6):715-722
  42. Renganathan V. Overview of artificial neural network models in the biomedical domain. Bratislavské Lekárske Listy. 2019;120(7):536-540
  43. Harada T. (2) Neural network. No Shinkei Geka. 2020;48(2):173-188
  44. Clark JW. Neural network modelling. Physics in Medicine and Biology. 1991;36(10):1259-1317
  45. Currie G, Hawk KE, Rohren E, et al. Machine learning and deep learning in medical imaging: Intelligent imaging. Journal of Medical Imaging and Radiation Sciences. 2019;50(4):477-487
  46. Ha J, Kim S, Baik Y, et al. Artificial neural network enabling clinically meaningful biological image data generation. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2020. pp. 2404-2407
  47. Labdai S, Bounar N, Boulkroune A, et al. Artificial neural network-based adaptive control for a DFIG-based WECS. ISA Transactions. 2022;128(Pt B):171-180
  48. Zhang Y, Lin H, Yang Z, et al. Neural network-based approaches for biomedical relation classification: A review. Journal of Biomedical Informatics. 2019;99:103294
  49. Nair TM. Building and interpreting artificial neural network models for biological systems. Methods in Molecular Biology. 2021;2190:185-194
  50. Khan ZH, Mohapatra SK, Khodiar PK, et al. Artificial neural network and medicine. Indian Journal of Physiology and Pharmacology. 1998;42(3):321-342
  51. Cao B, Zhang KC, Wei B, et al. Status quo and future prospects of artificial neural network from the perspective of gastroenterologists. World Journal of Gastroenterology. 2021;27(21):2681-2709
  52. Gharehbaghi A, Babic A. Deep time growing neural network vs convolutional neural network for intelligent phonocardiography. Studies in Health Technology and Informatics. 2022;295:491-494
  53. Zhang Z. Naive Bayes classification in R. Annals of Translational Medicine. 2016;4(12):241
  54. Cao X, Xing L, Majd E, et al. A systematic evaluation of supervised machine learning algorithms for cell phenotype classification using single-cell RNA sequencing data. Frontiers in Genetics. 2022;13:836798
  55. Martinez PJ, Perez MP. ROC curve. Semergen. 2023;49(1):101821
  56. Gordijn B, Have HT. ChatGPT: Evolution or revolution? Medicine, Health Care, and Philosophy. 2023;26(1):1-2
  57. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States medical licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Medical Education. 2023;9:e45312
  58. Will ChatGPT transform healthcare? Nature Medicine. 2023;29(3):505-506
  59. Sallam M. ChatGPT utility in healthcare education, research, and practice: Systematic review on the promising perspectives and valid concerns. Healthcare (Basel). 2023;11(6):887
  60. Cascella M, Montomoli J, Bellini V, et al. Evaluating the feasibility of ChatGPT in healthcare: An analysis of multiple clinical and research scenarios. Journal of Medical Systems. 2023;47(1):33
  61. Teixeira DSJ. Is ChatGPT a valid author? Nurse Education in Practice. 2023;68:103600
  62. Krugel S, Ostermaier A, Uhl M. ChatGPT's inconsistent moral advice influences users' judgment. Scientific Reports. 2023;13(1):4569
  63. Dave T, Athaluri SA, Singh S. ChatGPT in medicine: An overview of its applications, advantages, limitations, future prospects, and ethical considerations. Frontiers in Artificial Intelligence. 2023;6:1169595
  64. Bohnke JR. Explanation in causal inference: Methods for mediation and interaction. Quarterly Journal of Experimental Psychology (Hove). 2016;69(6):1243-1244
  65. Rothman KJ. Epidemiology: An Introduction. New York: Oxford University Press; 2002. pp. 168-180
