Open access peer-reviewed chapter

Artificial Intelligence: Development and Applications in Neurosurgery

Written By

Raivat Shah, Vanessa Reese, Martin Oselkin and Stanislaw P. Stawicki

Submitted: 09 May 2023 Reviewed: 28 August 2023 Published: 30 September 2023

DOI: 10.5772/intechopen.113034

Abstract

The last decade has witnessed a significant increase in the relevance of artificial intelligence (AI) in neuroscience. Gaining prominence from its potential to revolutionize medical decision making, data analytics, and clinical workflows, AI is poised to be increasingly implemented into neurosurgical practice. However, certain considerations pose significant challenges to its immediate and widespread implementation. Hence, this chapter will explore current developments in AI as they pertain to the field of clinical neuroscience, with a primary focus on neurosurgery. Also included is a brief discussion of important economic and ethical considerations related to the feasibility and implementation of AI-based technologies in the neurosciences, including future horizons such as the operational integration of human and non-human capabilities.

Keywords

  • artificial intelligence
  • neurosurgery
  • machine learning
  • deep learning
  • neural networks
  • telemedicine
  • robotic neurosurgery

1. Introduction

Beginning with Harvey Cushing’s work in the early 1900s, modern neurosurgical advancements have often been entwined with parallel developments in both medical and non-medical technologies [1]. Just as the application of microscopy, endoscopy, computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound in neurosurgery has revolutionized and transformed the field, artificial intelligence (AI) is poised to do the same [2]. The past decade has witnessed exponential growth in research seeking to integrate AI and neurosurgery, with the primary goals of improving patient outcomes and enhancing quality of care. Academic interest in the intersection of the two fields is evident, with literature searches using permutations of the phrase “neurosurgery and AI” returning over 20,000 publications from the last 10 years in the PubMed database [3]. As AI grows in sophistication, ease of applicability, and prominence, it may become intrinsically tied to neurosurgical care in the future. This chapter will provide an overview of the current thoughts and applications of AI in neurosurgery within pre-, intra-, and postoperative contexts, evaluate the nuances of AI functionality in both developmental and use stages, and consider implementation costs, feasibility, and limitations. We will also discuss misconceptions related to the integration of AI within neurosurgery, with a focus on dispelling both exuberantly optimistic and overly negative views.

2. Methods

A literature search was performed using Google Scholar™ with the search keywords “artificial intelligence in medicine,” “robotic neurosurgery,” “artificial intelligence and neurosurgery,” and “cost of artificial intelligence in medicine.” This keyword search was mirrored in PubMed. The PubMed database and Google Scholar™ were also searched for basic information on and explanations of artificial intelligence technologies, using the keywords “machine learning and neurosurgery,” “neural networks in neurosurgery,” and “deep learning in neurosurgery.” There were no predefined inclusion criteria and no specific time frame for the articles used; rather, articles were included based on their relevance to artificial intelligence use in medicine and the neuroscience field.

3. Artificial intelligence development and use: woos and woes

Artificial intelligence is an emerging field broadly defined as a set of technologies capable of incorporating human behavior and intelligence into machines and systems [4]. Due to its potential scope in diagnostic efficacy and treatment recommendations, AI is poised to be increasingly implemented into healthcare and clinical practice. However, a better understanding of what AI entails is warranted.

3.1 Machine learning

A discussion of AI in neurosurgery would be incomplete without a basic understanding of machine learning (ML), a subfield of AI [5]. The accelerated computerization of patient data in healthcare has resulted in vast quantities of information beyond what can be reasonably digested by traditional methods of statistical analysis, commonly referred to as “big data” [6]. However, the emergence of ML has unlocked new possibilities not only for extracting and identifying potentially valuable patterns from past data, but also for predicting future data trends [7, 8, 9]. The predictive potential of ML can only be harnessed when the model is presented with large quantities of annotated data [10]. For instance, in radiographic imaging, ML is able to treat each computerized picture element, or pixel, as its own unique variable. Thus, when fed large quantities of data, the ML algorithm can learn at a degree of complexity (e.g., trace contours of fracture lines, parenchymal opacities, etc.) and a scale that is beyond natural human capabilities [10].

Machine learning subdomains have traditionally been grouped into two large categories: supervised and unsupervised learning. The former uses annotated datasets to train an algorithm to predict outcomes on unseen data; unsupervised learning, however, uses ML to cluster datasets without using labels, enabling the extraction of unknown features that may be useful for categorizing and predicting relevant clinical outputs without human intervention [11]. Nevertheless, many ML models in healthcare have been shown to demonstrate performance no better than conventional statistical methods [12, 13]. It should be repeatedly emphasized that the field of ML, in addition to being new, still possesses many fundamental weaknesses that limit its immediate widespread applicability.

Using diagnostic testing to determine the presence or absence of disease is an essential process in clinical medicine. In these scenarios, test results are oftentimes obtained as continuous values, which require conversion and interpretation into dichotomous groups to determine the presence or absence of a disease [14]. A key stage in this process involves defining a cut-off value, or reference value, to differentiate normal from abnormal conditions. The receiver operating characteristic (ROC) curve, the primary tool used for this determination, classifies a patient’s disease state as positive or negative based on test outcomes, simultaneously identifying the optimal cut-off value with the best diagnostic performance [14]. The area under the curve (AUC) serves as a single scalar value summarizing the overall performance of a binary classifier [15]. This measure provides an aggregate evaluation of performance across all potential classification thresholds. In essence, the AUC measures the two-dimensional area beneath the ROC curve from point (0,0) to point (1,1). An AUC of 1.0 signifies perfect, error-free classification, whereas an AUC of 0.5, comparable to a random classification method like a coin toss, holds no diagnostic value. Typically, an AUC exceeding 0.8 is deemed acceptable in non-medical contexts, and an AUC surpassing 0.9 is considered excellent [16].
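
The threshold sweep described above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated diagnostic tool: the function names and the toy scores are hypothetical, and the use of Youden's J statistic (sensitivity + specificity - 1) to pick the cut-off is an assumption, being one common criterion among several.

```python
def sweep(scores, labels):
    """Yield (cutoff, FPR, TPR), calling a case positive when score >= cutoff."""
    pos = sum(labels)
    neg = len(labels) - pos
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr)
        yield thr, fp / neg, tp / pos


def auc(scores, labels):
    """Area under the ROC curve from (0,0) to (1,1), by the trapezoidal rule."""
    points = [(0.0, 0.0)] + [(fpr, tpr) for _, fpr, tpr in sweep(scores, labels)]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def youden_cutoff(scores, labels):
    """Cut-off maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(sweep(scores, labels), key=lambda t: t[2] - t[1])[0]


# Hypothetical continuous test results; label 1 = disease present.
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
print(auc(scores, labels))  # 0.8125
```

A perfectly separating test (every diseased patient scoring above every healthy one) yields an AUC of 1.0 under the same functions.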

Nonetheless, it is crucial to underscore that strong performance as indicated by AUC values greater than 0.80 does not necessarily guarantee a robust model. If machine learning algorithms have not been cross-validated with novel datasets, they risk being overfit to past data, compromising their generalizability [14]. Thus, when attempting to leverage the model to predict performance on unseen data, the ML model may, at best, only offer slight gains compared to traditional statistical analysis [12, 13, 17, 18, 19]. Additionally, the robustness of any given ML model is directly dependent on the quality and quantity of data fed. If biases from differences in data collection methodologies are present in a dataset, both generalizability and performance of the model are negatively impacted [10]. Furthermore, the AUC is often presented with a 95% confidence interval because the data obtained from the sample are not fixed values but rather influenced by statistical errors. Finally, the use of real-world data inherently introduces corruptions in the dataset, also known as “noise.” Random noise in input datasets can confound ML tasks of classification, clustering, and association analysis in addition to increasing model complexity and time of learning, all of which can degrade the performance of the learning algorithm as noise cannot be easily distinguished from desired inputs unless appropriately pre-processed before introduction to the model [20, 21]. In other words, despite impressive AUC values, such models may lack reliability when applied to new, unseen data, underscoring the critical importance of rigorous validation processes in the development of diagnostic tools.
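
To illustrate why an AUC is reported with a 95% confidence interval, the sketch below resamples a small dataset with replacement and reads the interval off the percentiles of the resampled AUCs. The percentile bootstrap is an assumption chosen for simplicity, not a method named in the cited work, and the function names, seed, and data are hypothetical.

```python
import random


def auc_rank(scores, labels):
    """AUC as the chance that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (illustrative only)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc_rank([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
print(auc_rank(scores, labels), bootstrap_auc_ci(scores, labels))
```

With only eight cases the interval is wide, which is the point: a single headline AUC from a small sample says little about performance on unseen data.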

3.2 Neural networks

The basic functional unit of the nervous system is the neuron [22]. Neurons function by receiving an input, processing the signal, and generating an output signal [23, 24]. Anatomically speaking, a neuron is capable of consolidating up to thousands of neurotransmitter-driven synaptic inputs simultaneously via its dendritic extensions, processing a highly transformed version of the original inputs in the soma, and producing a single output through its axon in the form of an action potential [25]. Importantly, neuronal outputs are not generated at a fixed rate; rather, they depend on whether the signal summation (excitatory minus inhibitory inputs) exceeds a threshold value, thereby depolarizing the neuron and inducing an action potential [26, 27, 28]. After traveling through the axon, the action potential is transmitted to the many neurons that synapse at the axon terminal.

Broadly speaking, artificial neural networks (ANNs) model the biological principles of neuronal signaling in order to stratify and solve complex, nonlinear problems [29]. Considered a subfield of ML, ANNs are digital machine learning algorithms based upon the concept of a biological neuron. Where neurons rely on neurotransmitter signaling inputs, ANNs leverage binary, categorical, or numeric data sets [5]. The transformation of input signals at the soma into an action potential is akin to the arithmetic calculation by which an ANN converts inputs into an output [30].
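
The analogy can be made concrete with a single artificial neuron. The sketch below is illustrative only: the weights, threshold, and function names are hypothetical, with positive weights standing in for excitatory inputs and negative weights for inhibitory ones.

```python
import math


def step_neuron(inputs, weights, threshold):
    """Fire (1) only if the weighted sum of inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def sigmoid_neuron(inputs, weights, bias):
    """Smooth variant commonly used in trainable networks: output in (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


weights = [0.6, 0.6, -0.5]                     # two excitatory, one inhibitory
print(step_neuron([1, 1, 1], weights, 0.5))    # 1: summation exceeds threshold
print(step_neuron([1, 0, 1], weights, 0.5))    # 0: inhibition keeps it silent
```

A network is built by feeding the outputs of many such units into further units, which is where the "layers" discussed next come from.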

Although the theory underlying ANNs was first developed in the 1980s, subsequent advances in computational power and the acquisition of training data at scale have enabled its extensive application in recent years. In neurosurgery, ANNs have been increasingly utilized in diagnostics, prognostics, and management [31]. Deep learning (DL) is another class of algorithms increasingly studied in the literature. Although similar to neural networks in principle, the term “deep” refers to the greater number of layers present in the neural network, typically accepted to imply at least three layers [32].

The ability of ANNs to analyze non-linear data is ideal for assisting neurosurgeons in clinical decision-making [33]. In particular, ANNs have been widely demonstrated to be superior to traditional analytical methods, especially in clinical imaging tasks [34]. Even so, significant challenges still limit the widespread use of ANNs and DL in neurosurgery and medicine at large, including insufficient data, obscured interpretability, reliability of data, high processing power requirements, and data privacy [3].

3.3 Natural language processing

Natural language processing (NLP) is another subfield that falls under the scope of ML. As its name implies, the goal of NLP is to improve human-computer communication by leveraging natural human language to perform data abstraction processes [35]. In other words, the computer interprets human-generated text inputs by breaking down sentences into their constituent parts and applying algorithms to derive meaningful outputs. There are two primary divisions within the field of NLP: rules-based models and machine-based models. A rules-based model has minimal set-up costs but is burdensome to scale for large datasets and inflexible as language usage evolves over time; conversely, machine-based models are preferable for large datasets because they circumvent the rigidity of rules-based models while adapting to evolutions in the human lexicon over time [36]. Three methodological approaches dominate the application of NLP to neurosurgery: classification, annotation, and prediction [37]. Classification involves providing further diagnostic information and informing the surgeon’s decision making in the preoperative phase. Annotation entails automating the annotation of large amounts of data (e.g., radiological images) by identifying specific phenotypes related to a disease condition, enabling the NLP algorithm to train on much larger amounts of data and better extrapolate clinical outcomes. Prediction exploits previous data (e.g., free text notes) to predict patient surgical outcomes and enable the neurosurgeon to arrange the resources necessary for their care accordingly. Machine-based NLP as applied to neurosurgery and medicine at scale remains in its infancy, though its possibilities are expanding with the emergence of Large Language Models.
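
A rules-based model of the kind described above might look like the following sketch, which tags free-text notes with keyword rules and a crude negation check. The rule set, labels, and function name are hypothetical and far simpler than any production system; the example is meant only to show why such models are cheap to set up yet rigid, since every new term or phrasing requires a hand-written rule.

```python
import re

# Hypothetical keyword rules; each label fires on any of its listed terms.
RULES = {
    "hemorrhage": re.compile(r"\b(hemorrhage|haemorrhage|hematoma|bleed)\b", re.I),
    "aneurysm": re.compile(r"\baneurysms?\b", re.I),
    "tumor": re.compile(r"\b(tumou?r|glioma|meningioma)\b", re.I),
}
# Crude negation: a negating word anywhere earlier in the same sentence.
NEGATION = re.compile(r"\b(no|without|negative for)\b[^.]*$", re.I)


def tag_note(note):
    """Return the sorted labels whose keywords appear un-negated in the note."""
    labels = set()
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        for label, pattern in RULES.items():
            match = pattern.search(sentence)
            if match and not NEGATION.search(sentence[:match.start()]):
                labels.add(label)
    return sorted(labels)


note = "No evidence of hemorrhage. A 4 mm aneurysm is seen at the MCA bifurcation."
print(tag_note(note))  # ['aneurysm']
```

A machine-based model would instead learn such associations from annotated notes, trading the up-front labeling cost for the flexibility the text describes.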

3.4 Large language models

Large Language Models (LLMs) like ChatGPT, developed by OpenAI, are a new wave of AI technology that has profound implications for diverse fields, including healthcare. Trained on a colossal quantity of textual data, these models capture the intricacies and nuances of human language, equipping them to form pertinent and contextually relevant responses to a broad spectrum of prompts [38].

In March 2023, the performance of ChatGPT and GPT-4 was assessed on a 500-question mock neurosurgical written boards examination. Using Self-Assessment Exam 1 from the American Board of Neurological Surgery (ABNS), Ali et al. fed questions in single best answer, multiple-choice format. ChatGPT and GPT-4 achieved scores of 73.4 and 83.4%, respectively, relative to the question bank user average of 73.7% [39]. Both the question bank users and the LLMs exceeded the previous year’s passing threshold of 69%, demonstrating the models’ potential technical utility [39].

In a clinical context, including neurosurgery, LLMs could serve multiple purposes. Firstly, they could play a significant role in patient education, simplifying complex neurosurgical procedures, and providing insights into the recovery process in an accessible language [40]. Secondly, these models could help facilitate medical research, from identifying new hypotheses to aiding in clinical decision-making by providing summaries of recent research, medical literature, or guideline updates relevant to specific cases [41].

Another promising application lies in the realm of medical documentation. LLMs could help transcribe doctor-patient conversations, draft surgical reports, or summarize patient histories, thereby streamlining administrative tasks and allowing physicians to focus more on patient care [42]. Continuing Medical Education could also benefit from LLMs. By simulating complex clinical scenarios or generating case studies, these models could serve as an effective teaching tool for medical trainees [43].

4. Preoperative applications

The goal of the preoperative phase of care is to prepare both the neurosurgeon and the patient for a potential operation through means of diagnosis, surgical candidacy stratification, selection of treatment, and informed consent. AI is increasingly entering these realms as a potential adjunct to clinical practice.

4.1 Patient selection

A quantitative means of evaluating an individual patient outcome preoperatively is highly desirable in improving surgical decision-making. At the present moment, clinical outcome judgment is heavily reliant on the individual neurosurgeon. Prognostic indices in use today, though easily applicable, lack adequate predictive performance primarily due to the streamlining of numerical data to categorical data [44, 45]. Conversely, ML, by its very nature, could circumvent such a simplification.

Previous literature has compared neurosurgical patient outcome predictive performance between ML algorithms, classical logistic regressions, prognostic indices, and neurosurgeons, with differing results. Against classical logistic regressions, ML models have demonstrated superior performance in predictions of successful endoscopic third ventriculostomy, postoperative ventriculoperitoneal shunt infection, mortality after embolization of AVMs, patient satisfaction after laminectomy for lumbar spinal stenosis, in-hospital mortality in patients with traumatic brain injury, cerebral vasospasm after aneurysmal subarachnoid hemorrhage, and outcomes after a burr-hole procedure for a chronic subdural hematoma [45, 46, 47, 48, 49, 50, 51, 52]. Against current logistic regression prognostic indices for prediction of successful endoscopic third ventriculostomy (ETV) 6 months postoperatively, ANNs have demonstrated superior performance [45]. Masoudi et al. found that for ETV prediction 6 months postoperatively, their multi-layer perceptron ANN demonstrated an AUC of 0.913 compared to a logistic regression AUC of 0.819 [53]. Some ML models have shown better performance than prognostic indices in predicting outcome after stereotactic radiosurgery for cerebral arteriovenous malformation (AVM), with AUCs of 0.70–0.71 vs. 0.57–0.69 [44, 52]. A random forest classifier (RFC), a class of ML model, achieved an AUC of 0.80, with 0.34 sensitivity, 0.95 specificity, 0.73 positive predictive value, 0.80 negative predictive value, and 0.79 accuracy for the prediction of traumatic brain injury in children following cranial CT, demonstrating a substantial alternative to the currently used nomogram for the prediction of intracranial injury following CT in children with TBI [54].
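
The performance figures quoted above all derive from the four cells of a confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical and were chosen only to show how low sensitivity can coexist with high specificity. They are not the data of any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard summary metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # diseased patients correctly flagged
        "specificity": tn / (tn + fp),          # healthy patients correctly cleared
        "ppv": tp / (tp + fp),                  # flagged patients who are diseased
        "npv": tn / (tn + fn),                  # cleared patients who are healthy
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=34, fp=13, tn=247, fn=66)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these counts the classifier misses two thirds of true cases while rarely raising a false alarm, a profile that a single accuracy or AUC figure would not reveal.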

Some recent studies have investigated the differences between ML and clinician performance in predicting neurosurgical outcomes. Emblem et al. found that neuroradiologists performed similarly to fuzzy C-means, a class of ML model, in survival predictions for newly diagnosed glioma patients [55]. Emblem et al. also found that a support vector machine (SVM) model combined with perfusion-weighted magnetic resonance (MR) imaging predicted survival in glioblastoma patients better than neuroradiologists did [56]. Although experienced neurosurgeons have been shown to predict survival well on group-wide metrics in patients with high-grade glioma undergoing surgery, their predictions were often inaccurate at the individual level [57]. Hence, future AI tools could help bridge this gap by supporting neurosurgeons’ insights in the prediction of patient survival.

4.2 Diagnostics

Both LLMs and ML have applications in diagnostics. LLMs can serve as an adjunct to the patient evaluation process by suggesting rarer diagnoses and interventions that the physician may not have typically considered. These can be incorporated into the overall clinical picture as appropriate. The applications of ML to diagnostics fall largely into three categories: classification, detection, and segmentation. Classification involves algorithmic stratification of data inputs into categories (e.g., normal, abnormal). Detection entails visual localization of an area of interest (e.g., lesion). Segmentation implies outlining a target area using a precise, pixel-wise boundary [58]. The following sections describe the various areas in which general ML and deep learning (DL) models have been applied to neurodiagnostics.

4.3 Intracranial hemorrhage

Earlier efforts were able to determine important correlations between imaging characteristics, the presence of intracranial hemorrhage (ICH), and patient outcomes [59, 60, 61]. Today, approved commercial software for ICH detection exists on the market, with clinical uses including triage and early warning systems, double reading, and hemorrhage type classification. Boasting a validated sensitivity of 88.7 to 96.2% and a specificity of 92.3 to 99.0%, Aidoc for ICH, an FDA-approved DL tool, is one of the industry’s leading support systems for evaluating unenhanced head CT images for ICH and issuing warning notifications [62, 63, 64, 65, 66]. However, Aidoc for ICH and other DL models have been shown to perform inconsistently when applied at clinical sites other than those on which they were trained [64, 67]. Thus, further studies have sought to investigate alternatives, including competing commercial software and independently developed models. For instance, McLouth et al. and Rava et al. have validated the diagnostic capabilities of other DL ICH tools such as CINA v1.0 and Canon’s AUTOStroke Solution ICH across hospital sites in the United States, finding high accuracy and specificity with moderate sensitivity [68, 69]. Wang et al., winners of the 2019-RSNA Brain CT Hemorrhage Challenge, developed a convolutional neural network (CNN) using a diverse array of datasets sourced from three institutions that achieved accuracy levels similar to those of senior radiologists [67]. Despite the outstanding results of the algorithm, it is important to note that the CNN model’s applicability in clinical settings is currently limited by (1) the lack of patient clinical information in the datasets provided for the RSNA challenge, which obscures the confounding effects of scanner type, cause of bleeding, and patient demographics, (2) its inapplicability to MRI, which is oftentimes crucial for ICH screening and diagnosis, and (3) the lack of external validation data [67].

4.4 Stroke

In the past decade, deep learning applications in stroke imaging have dramatically risen, likely as a byproduct of higher stroke imaging volume with the arrival of endovascular thrombectomy in addition to the increasing acknowledgement of the emergent nature of the disease process [58]. DL applications to stroke imaging can be divided into three areas: (1) Alberta Stroke Program Early CT Score (ASPECTS) measurement, (2) large vessel occlusion (LVO) detection, and (3) infarct prognostication.

ASPECTS is a 10-point topographical quantitative grading scale widely used to guide acute stroke treatment by assessing 10 regions within the middle cerebral artery (MCA) territory for early signs of ischemia [70, 71]. Many commercial DL tools designed to perform automated ASPECTS evaluation have been tested in clinical settings, with variable results. One study found that three neuroradiologists showed a higher correlation with infarct core than e-ASPECTS (Brainomix) (r = 0.71, 0.76, 0.80, compared to 0.59), while another study found that RAPID ASPECTS (iSchemaView) displayed higher correlation than two neuroradiologists for intervals between symptom onset and imaging of up to 4 hours [72, 73]. The efficacy of ASPECTS analysis therefore depends on the software utilized and the established ground truth. These results suggest that automated ASPECTS evaluation may continue to be implemented as an adjunct to current neuroradiological diagnostics.
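
The scoring rule itself is simple arithmetic: one point is subtracted from 10 for each MCA-territory region showing early ischemic change. The sketch below encodes that rule; the region abbreviations follow common usage, and the function name and input format are hypothetical. The hard part that the DL tools automate, deciding which regions are affected on the CT image, is not shown.

```python
# The ten regions assessed: caudate (C), lentiform nucleus (L), internal
# capsule (IC), insular ribbon (I), and cortical regions M1 to M6.
ASPECTS_REGIONS = {"C", "L", "IC", "I", "M1", "M2", "M3", "M4", "M5", "M6"}


def aspects_score(affected_regions):
    """Return 10 minus the number of regions with early ischemic change."""
    affected = set(affected_regions)
    unknown = affected - ASPECTS_REGIONS
    if unknown:
        raise ValueError(f"Unknown ASPECTS regions: {sorted(unknown)}")
    return 10 - len(affected)


print(aspects_score([]))                   # 10: no early ischemic change
print(aspects_score(["I", "M1", "M2"]))    # 7
```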

Identification of large vessel occlusion (LVO) in the early stages of admission can reduce the probability that the patient suffers the long-term consequences of stroke and can save lives. A 2019 study developed a U-Net architecture DL tool designed to detect the hyperdense MCA sign in noncontrast head CT scans from a local Hong Kong population and achieved a high sensitivity (0.930), though relatively lower specificity, accuracy, and AUC [74]. Automated LVO detection on CT angiograms (CTA) has become integral to many stroke centers. Viz-AI, a commercial CNN-based solution, has demonstrated 82% sensitivity and 94% specificity for LVO detection [71].

The ability to accurately and reliably predict posttreatment stroke outcomes can aid the neurosurgeon in selecting patients for thrombectomy or other neuroendovascular procedures and developing a plan of care precisely tailored to the individual patient. Recent stroke thrombectomy trials utilizing automated perfusion CT and MR imaging have revolutionized the modern care of stroke patients. The now commercially available Rapid.AI perfusion product, which employs a threshold-based segmentation method, resulted in a 3-fold reduction in severe disability and death when used to select patients for thrombectomy [75]. However, CT perfusion (CTP) maps have historically been unreliable and threshold-based approaches may fail to fully capture the complexity of infarct evolution. Processing this data under a DL system, one can take into account other biomarkers and patient-specific factors for better prognostication. One study validated a CNN designed to identify and predict post-treatment MRI final lesion volume, achieving a modified ROC-AUC of 0.88 [76]. Nishi et al. used a U-Net DL tool to assess clinical post-treatment outcomes of LVO patients using pretreatment diffusion-weighted image data of patients who underwent mechanical thrombectomy, finding an ROC-AUC of 0.81 [77].

4.5 Intracranial aneurysms

Intracranial aneurysms (IAs) are common in the population, with an estimated global prevalence between 2 and 5% [78]. Although most of these aneurysms are asymptomatic, they carry a risk of rupture, which leads to subarachnoid hemorrhage, a condition with a case fatality of 50% [79]. Thus, there is great interest in the rapid and accurate identification of unruptured intracranial aneurysms on brain imaging.

At present, intra-arterial digital subtraction angiography (IADSA) is the gold standard for the diagnosis of intracranial aneurysms, with computed tomography angiography (CTA), magnetic resonance angiography (MRA), and transcranial Doppler sonography also shown to be effective diagnostic tests [80]. Time-of-flight MR angiography (TOF-MRA) is a non-invasive, non-contrast-enhanced technique that enables discrimination between vessels and stationary tissues by exploiting blood inflow effects [81]. Because it requires neither ionizing radiation nor intravenous contrast agents, TOF-MRA is typically the first modality of choice for aneurysm screening. Hence, many avenues for DL applications have been explored in this area.

Nakao et al. developed a computer-assisted detection (CAD) deep CNN architecture combined with a maximum intensity projection (MIP) algorithm, trained on TOF-MRA scans from 450 patients. The team achieved a high sensitivity of 94.2% (98/104) with only 2.9 false positives per case [82]. Faron et al. similarly developed a CNN model, finding an overall sensitivity of 90% with a false positive rate of 6.1%. More importantly, the Faron team found no significant difference in aneurysm detection performance between the CNN model and two blinded diagnostic neuroradiologists, with an overall increase in human detection sensitivity when the readers’ detection hits were combined with the CNN model’s hits (reader 1: 98% vs. 95%, P = 0.280; reader 2: 97% vs. 94%, P = 0.333) [83].

Ueda et al. developed a ResNet architecture algorithm fed with 683 TOF-MRA patient scans and achieved a sensitivity of 91% (592 of 649) and 93% (74 of 80) for their internal and external data sets, respectively [84]. More interestingly, the model improved aneurysm detection in their retrospectively collected TOF-MRA scans by 4.8% (31 of 649) and 13% (10 of 80), respectively, compared to the initial radiologist-interpreted assessments.

Until recently, machine-learning algorithms largely focused on MRA imaging. However, more recent efforts have expanded to include CT-based imaging approaches. In 2020, Shi et al. developed a 3D CNN trained on 1177 bone-removal CTA cases verified by digital subtraction angiography; when tested on a cohort of patients with suspected acute ischemic stroke (AIS), the model could exclude IA-negative cases with 99.0% confidence [85]. Limitations of their study include a relatively small sample of positive cases in the validation cohorts as well as the experimentally reasonable exclusion of CTA data with head trauma and arteriovenous malformation/fistula (AVM/AVF). In 2021, Yang et al. proposed a 3D CNN algorithm for detecting cerebral aneurysms using head CTA images, achieving a very high sensitivity of 97.5% (633 of 649) while revealing 8 intracranial aneurysms overlooked in initial reports [86]. When the model was paired with expert radiologists, their overall weighted alternative free-response receiver operating characteristic (wAFROC) figure of merit improved by 0.01 (P < .05), demonstrating the viability of using the model as an adjunct to the physician.

4.6 Neuro-oncology

For over a century, neurosurgeons have played an essential role in the management of cancers afflicting the central nervous system (CNS). Because these cancers are the tenth leading cause of death for both men and women, accurate clinical evaluation of disease progression and early detection of brain tumors using effective brain imaging techniques are paramount to improving patient outcomes. Historically, the preoperative phase involved manual segmentation of brain tumors and small related brain structures by the neurosurgeon, a laborious task [87]. Hence, many automated solutions have been explored, with the broadest categories for automated brain tumor segmentation of MR images being (i) intensity-based, (ii) ML-based, and (iii) hybrid-based approaches.

The intensity-based approaches are among the most conventional methods used in brain tumor segmentation, relying on a basic analysis of pixel values within the spatial domain. The thresholding technique, for instance, binarizes the MR image by comparing each pixel’s intensity to an intensity threshold [87]. This technique, however, suffers from many limitations, including sensitivity to noise and intensity non-homogeneity. Also classified as an intensity-based approach, the region-based method uses pre-defined pixel/voxel conditions to extract intensity information: after a seed point is selected, a region is grown by connecting neighboring pixels with similar intensity values. Many studies have recently improved upon this technique but still suffer from limitations such as an inability to remove noise, subjective manual setting of parameters, and annotation bias [88, 89, 90, 91, 92]. Most existing methods rely on such fully supervised approaches [93].
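
Both intensity-based techniques can be sketched on a toy image. The example below is a minimal illustration with hypothetical intensities, cut-off, tolerance, and function names; real MR segmentation operates on far larger images and typically on 3D voxel grids.

```python
from collections import deque


def threshold_mask(image, cutoff):
    """Binarize: 1 where pixel intensity >= cutoff, else 0."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region of pixels within `tolerance` of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_val) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Toy image: a bright 2x2 "lesion" plus two isolated bright noise pixels.
image = [
    [10, 12, 11, 90],
    [11, 80, 85, 13],
    [12, 82, 88, 10],
    [95, 11, 12, 11],
]
mask = threshold_mask(image, cutoff=50)
print(sum(map(sum, mask)))                             # 6: noise pixels included
print(sorted(region_grow(image, (1, 1), tolerance=10)))
# [(1, 1), (1, 2), (2, 1), (2, 2)]
```

Thresholding marks the two noise pixels as lesion, illustrating its sensitivity to noise, whereas region growing excludes them but depends on a manually chosen seed and tolerance.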

Largely due to the aforementioned constraints and inflexibility, ML-based approaches to brain tumor segmentation have increasingly been explored, both in traditional ML as well as DL forms. Many recent studies leveraging traditional ML models have shown equal or superior performance relative to the conventional intensity-based models, though observing limitations in some studies such as subjective user-directed pixel label refinement of segmentation results, sensitivity to noise and distortions, non-uniform intensity distribution, and extraction of redundant features [94, 95, 96].

In the past decade, interest in deep learning as applied to brain tumor segmentation has soared due to its anticipated superior performance compared to more conventional models of data abstraction. Many studies have relied on extracting 2D patches from 3D MR images to use as inputs for a 2D CNN [97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109]. Though CNNs have generally demonstrated improved performance compared to their intensity-based counterparts, model training is often time-consuming, as large amounts of training data, parameters, and processing power are required. Furthermore, 3D contextual information is often bypassed in 2D CNNs, which has spurred the development of 3D CNNs in recent years [110, 111, 112, 113, 114, 115]. Although 3D CNNs enable better exploitation of 3D features from MR image data, their high computational demands (i.e., high network intensiveness and memory consumption) limit their widespread applicability. Thus, 2.5D deep neural network (DNN) approaches have been explored; Wang et al. validated a cascaded 2.5D model which improved segmentation accuracy by striking a balance between memory consumption and model complexity, demonstrating superior inference compared to already established models such as DeepMedic and ScaleNet [116, 117, 118].
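
The difference between 2D and 2.5D inputs can be illustrated by how patches are cut from a volume. In the sketch below, the volume is indexed as volume[z][y][x], and "2.5D" is taken to mean stacking the same crop from a few adjacent slices, which is one common reading of the term and not necessarily the design used by Wang et al. All names and sizes are hypothetical.

```python
def extract_patch_2d(volume, z, y, x, size):
    """Crop a size-by-size patch from slice z with top-left corner (y, x)."""
    return [row[x:x + size] for row in volume[z][y:y + size]]


def extract_patch_25d(volume, z, y, x, size, context=1):
    """Stack the same crop from slices z-context..z+context (clamped at edges)."""
    depth = len(volume)
    slices = [min(max(z + dz, 0), depth - 1) for dz in range(-context, context + 1)]
    return [extract_patch_2d(volume, s, y, x, size) for s in slices]


# Toy 3x4x4 volume whose voxel value encodes its own position: 100z + 10y + x.
volume = [[[100 * z + 10 * y + x for x in range(4)] for y in range(4)]
          for z in range(3)]

print(extract_patch_2d(volume, z=1, y=1, x=2, size=2))
# [[112, 113], [122, 123]]
print(len(extract_patch_25d(volume, z=1, y=1, x=2, size=2)))  # 3 stacked slices
```

The 2D patch discards everything above and below slice 1, while the 2.5D stack retains a thin band of through-plane context at a fraction of the memory cost of a full 3D patch.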

Recently, Pham et al. introduced a hybrid metaheuristic-ML model to circumvent sensitivity to noise, intensity non-uniformity, trapping in local minima, and dependency on initial clustering centroids [119]. However, this model suffered from decreases in performance, though its introduction spurred the development of many hybrid models seeking an optimal balance between the efficiency metrics [96, 119, 120, 121, 122]. Other hybrid approaches, such as DL-traditional ML and ML-contour based models, though better than conventional methods, have not achieved overall efficiency greater than the metaheuristic-ML hybrid [87]. At present, the literature indicates that deep learning based and hybrid metaheuristic-based models are the most efficient and reliable methods available, though their widespread application requires further validation. Despite improvements in deep learning models as applied to brain tumor segmentation, it is imperative to note that limitations related to tumor morphological uncertainty, low contrast resolution, annotation biases during data labeling, and imbalanced voxel distribution persist. Thus, advances in AI can aid the neurosurgeon in various brain tumor segmentation contexts, though neurosurgeons should remain cautious when using DL models to inform their clinical judgment.

4.7 Spine

From the genesis of AI applications in surgery, the spine has been a site of significant innovation in ML and DL models, generating opportunities for applications in scoliosis quantification, vertebral fracture detection, and vertebral body segmentation.

The Cobb measuring method is the gold standard for quantification of the scoliotic curve [123]. With the digitalization of radiography, most surgeons opt to use built-in computer software such as the Picture Archiving and Communications System (PACS). Despite the proven efficiency of such software relative to the traditional “manual” method of Cobb angle measurement, systems like PACS rely on tools (e.g., Surgimap) that require users to manually select the upper and lower ends of vertebral bodies, which inherently introduces human error [123, 124, 125, 126, 127]. Hence, Cobb angle measurement has been an area of significant AI exploration.

Caesarendra et al. utilized a deep CNN to measure the Cobb angle of patients diagnosed with adolescent idiopathic scoliosis, producing accuracies up to 93.6%, which demonstrates high reliability compared to neurosurgeons’ measurement (intraclass correlation coefficient > 0.95) [123]. Sun et al. assessed DL models based upon CNNs designed to segment each vertebra and locate the vertebral corners, finding a very high intraclass correlation coefficient (ICC) of 0.994, with a Pearson correlation coefficient and mean absolute error between the model and orthopedic annotation of 0.990 and 2.2° ± 2.0°, respectively [128]. These results are especially promising in cases where the Cobb angle does not exceed 90°.
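Once the vertebral corners have been located, whether manually or by a model, the Cobb angle reduces to the angle between two endplate lines. The sketch below assumes each endplate is given as two hypothetical (x, y) landmarks and also shows how a mean absolute error between model and manual measurements could be computed. The function names and all numeric values are illustrative and are not data from the cited studies. Note that this simple line-angle formulation folds the result into the 0° to 90° range.

```python
import math

def endplate_tilt_deg(p1, p2):
    """Tilt, in degrees, of the line through two (x, y) landmarks."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle between the two endplate lines, folded into [0, 90] degrees
    so that the order of the landmarks on each line does not matter."""
    diff = abs(endplate_tilt_deg(*upper_endplate) - endplate_tilt_deg(*lower_endplate)) % 180.0
    return min(diff, 180.0 - diff)

def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired measurements."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Hypothetical landmarks: upper endplate tilted 45 degrees, lower endplate horizontal.
upper = ((0.0, 0.0), (1.0, 1.0))
lower = ((0.0, 5.0), (1.0, 5.0))
angle = cobb_angle_deg(upper, lower)

# Hypothetical paired measurements in degrees (model vs. manual annotation).
model_angles = [24.0, 41.5, 33.0]
manual_angles = [22.0, 43.0, 33.5]
mae = mean_absolute_error(model_angles, manual_angles)
```

Because the angle follows directly from the landmark coordinates, any error in selecting the endplate corners propagates into the measurement, which is the source of the human error that automated corner detection aims to reduce.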

AI applications in vertebral fracture detection have generated tremendous interest due to the relative ease of algorithm-driven image discrimination compared to other neurosurgical contexts. Many studies have evaluated both ML and DL models in the context of fracture detection. Tomita et al. utilized a deep neural network trained upon 1432 CT scans to detect osteoporotic vertebral fractures, finding an ROC-AUC between 0.909 and 0.918 with an F-score of 90.8% and accuracy of 89.2%, measures approximately equivalent to those of radiologists [129]. Small et al. tested C-spine, an FDA-approved CNN developed by Aidoc to detect cervical spine fractures, finding an accuracy, sensitivity, and specificity for the CNN versus radiologists of 92 vs. 95%, 76 vs. 93%, and 97 vs. 96%, respectively [130]. Derkatch et al. trained a CNN binary classifier fed with dual-energy x-ray absorptiometry data to detect vertebral compression fractures, which yielded an ROC-AUC of 0.94 with a sensitivity of 87.4% and a specificity of 88.4% [131]. Thus, these data suggest that ML and DL models can serve as an accessory to the radiologist and the neurosurgeon in vertebral fracture detection.
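The accuracy, sensitivity, specificity, and F-score figures quoted above all derive from the same four confusion-matrix counts. The following sketch shows the standard definitions; the counts are hypothetical and chosen only for illustration, so the resulting values are not meant to reproduce any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # fractures correctly flagged
    specificity = tn / (tn + fp)               # normal studies correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)                 # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts: 100 studies with a fracture, 100 without.
metrics = diagnostic_metrics(tp=76, fp=3, tn=97, fn=24)
```

Accuracy depends on how common fractures are in the test set, whereas sensitivity and specificity do not, which is one reason a model can report high accuracy alongside a markedly lower sensitivity.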

Currently, only a few semi-automatic methods for disc and vertebral labeling exist and are widely utilized. However, these methods are prone to subjectivity due to the presence of user-directed input. Hence, many studies have sought to develop alternative methods to enhance accuracy and efficiency in radiological evaluation. Lehnen et al. demonstrated the feasibility of using a single CNN to identify various degenerative changes of the lumbar spine from MR images, finding high diagnostic accuracy for intervertebral disc detection/labeling (100%), spinal canal stenosis (98%), and nerve root compressions (91%) as well as moderately high diagnostic accuracy for disc herniations (87%), extrusions (86%), bulges (76%), and spondylolisthesis (87.61%) [132]. However, the generalizability of their study is limited by a small sample size and the exclusion of patients over 70 years old. Furthermore, the use of CNNs for spine segmentation is not particularly novel; in 2018, Whitehead et al. trained a cascade of CNNs and achieved Dice scores of 0.832 and 0.865 for vertebrae and discs, respectively [133]. Huang et al. developed a DL tool named Spine Explorer which quickly and automatically segments and measures lumbar MR images, achieving high mean Intersection-over-Union (IoU) values of 94.7% and 92.6% for the vertebra and disc, respectively [134]. A year later, Shen et al. expanded the scope of Spine Explorer to include the paraspinal muscles and the spinal canal, finding IoU values of 83.3 to 88.4% and 82.1%, respectively [135]. However, both studies using Spine Explorer suffered from a small patient sample size. Recently, Cheng et al. developed a two-stage MultiResUNet DL model for the automatic segmentation of specific intervertebral discs, which yielded a segmentation accuracy of 94%, potentially indicating its superiority over other DL models, such as the U-Net, CNN-based, Attention U-Net, and standard MultiResUNet models [136].
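Dice scores and IoU values, as reported above, both measure the overlap between a predicted mask and a reference mask, but they are not numerically interchangeable. The sketch below computes both on flattened binary masks; the masks are hypothetical, and the convention of returning 1.0 when both masks are empty is an assumption made for this example.

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and Intersection-over-Union for two binary masks
    given as equal-length flat sequences of 0/1 values."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks empty: treated here as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_size + truth_size)
    iou = intersection / union
    return dice, iou

# Hypothetical flattened masks: 3 predicted voxels, 3 reference voxels, 2 shared.
pred_mask = [1, 1, 1, 0, 0, 0]
true_mask = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred_mask, true_mask)
```

For the same pair of masks, Dice is never lower than IoU, and the two are related by Dice = 2·IoU / (1 + IoU). A Dice score reported in one study therefore should not be compared directly with an IoU reported in another without converting.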

Spine imaging findings are often insufficient to determine the underlying cause of lower back pain (LBP) and are often not of clinical significance, given how frequently the same findings appear in asymptomatic patients. NLP algorithms, however, can bridge the gap in data abstraction in the relationship between spine imaging findings and LBP. Tan et al. developed an NLP model to identify lumbar spine imaging findings related to LBP on x-ray and MR radiology reports, demonstrating significantly greater sensitivity (0.94 vs. 0.83), a higher overall AUC (0.98 vs. 0.90), and comparable specificity (0.97 vs. 0.95) when compared to a rules-based model [36]. Miotto et al. developed a convolutional neural network which, after training on free-text clinical notes on LBP patients, was able to discriminate between acute and chronic LBP (AUC of 0.98 and F-score of 0.70), demonstrating the potential for systematization of patient symptomatology [137].
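The AUC values used above to compare models can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes the AUC directly from that pairwise definition; the labels and scores are hypothetical, and the quadratic-time loop is chosen for clarity rather than efficiency.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison: the fraction of
    (positive, negative) pairs in which the positive case scores higher,
    counting ties as half."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = finding present) and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

Unlike sensitivity and specificity, the AUC does not depend on a single decision threshold, which makes it convenient for comparing a learned model against a rules-based one across all possible operating points.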

5. Intraoperative applications

The intraoperative phase of patient care revolves around optimizing the neurosurgeon’s functionality and performance in the operating room (OR). AI’s intraoperative roles include augmented reality (AR), ML for pathology and neuro-oncologic applications, and algorithms that automate the identification of intraoperative injuries based on the operative note.

Augmented reality has a myriad of intraoperative uses in both cranial and spinal procedures. From the cerebrovascular standpoint, AR has been used to decrease the craniotomy size and delineate aneurysm architecture for safer aneurysm clipping [138]. AR has also been used to superimpose white matter tracts onto the surgical field as well as to identify eloquent brain regions during tumor resections [139]. The implementation of AR was shown to result in significantly greater rates of total resection with better preservation of critical functions such as vision, speech, and motor function [139]. Head-up AR microscope displays with navigation were found to be more accurate than traditional microscopy with navigation based on fiducial or automatic intraoperative CT registration in the setting of transsphenoidal surgeries [140]. Rychen et al. described the successful use of AR to fuse CTA, DSA, and TOF MRI imaging with neuronavigation for superficial temporal artery to middle cerebral artery (STA-MCA) bypass operations [141]. Perhaps one of the most impressive features of these applications is that augmented reality is designed to work with the microscopes and neuronavigation systems that are already commonly used for neurosurgical procedures, rather than requiring an entirely new device.

Resection margins are of the utmost importance in the resection of malignant tumors, as remnants of malignant tissue lead to the recurrence of disease and decreased survival. Real-time analysis of resection margins typically requires an experienced neuropathologist, as well as a processor well versed in chemistry [142]. ML was employed to process samples through the High Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) methodology, with high accuracy (median AUC of 85.6% and AUPR of 93.4%) [142]. Jabarkheel et al. established the use of Raman spectroscopy to accurately differentiate benign and malignant tissue intraoperatively in pediatric tumor resections [143].

Spinal procedures also utilize AR to aid in the precise placement of pedicle screws, superimposing trajectories onto the surgical field [144]. Computer-assisted navigation (CAN) has a wide range of uses, from tumor resection to deformity correction. When utilized for screw placement, CAN reduces the need for fluoroscopic guidance, thus decreasing radiation exposure. CAN also increases operative efficiency, which diminishes the operative time and patient exposure to anesthesia [145].

Another promising AI application in spinal surgery is robotics. The SpineAssist (MAZOR Robotics Inc., Caesarea, Israel), ROSA (Medtech, SA, Montpellier, France), the Excelsius GPS Robot (Globus Medical, Inc., Audubon, PA), and the Da Vinci Surgical System (Intuitive Surgical, Sunnyvale, CA) are the four most studied robotic systems available [145]. Each has its strengths and weaknesses, and it is worth mentioning that all of these systems are still ultimately controlled by the surgeon. Prospective trials on the SpineAssist system demonstrate up to 99% accuracy with pedicle screw placement, as opposed to the 92% accuracy rate achieved with navigation alone [145]. The robot mounts directly onto the spinous process or another bony landmark and easily interfaces with a CAN system. Retrospective trials and case reports for the ROSA and Excelsius machines show increased accuracy of pedicle screw placement; however, the difference was not statistically significant for the ROSA system [145]. Both systems are freestanding, which removes the issue of incorrect landmark fixation that can occur with the SpineAssist system, and the Excelsius decreases total radiation exposure. The ROSA, initially created for intracranial neurosurgery, uses a camera and a percutaneous pin system placed over bony landmarks that the robot arm follows. In terms of efficiency, the ROSA is less efficient than current methodologies, adding over 70 minutes to the operative time [145]. Lastly, the Da Vinci system is the most widely used surgical robot, though it is not typically used for, and not approved for, neurosurgical applications such as spinal instrumentation. Current thinking is that the most likely neurosurgical application of this device is anterior lumbar fusion [145]. Further randomized trials, and likely some adjustments to the systems, are needed in order to truly harness the advantages they offer.

6. Postoperative applications

The goals of the postoperative phase of care include predicting prognosis, identifying potential postoperative complications, and optimizing variables for enhanced aftercare and recovery. A study by Arvind et al. demonstrated that ANN and LR are superior to the American Society of Anesthesiologists (ASA) class in predicting the incidence of cardiac complications, wound complications, VTE, and mortality in patients undergoing anterior cervical discectomy and fusion (ACDF) [146]. Similarly, Kim et al. found ANN and LR to be more accurate than ASA classification for predicting the same complications in posterior lumbar fusion [147]. AI has also allowed for greater distinction between disease progression and tumor necrosis from radiation therapy in gliomas [144, 148].

Follow-up in the postoperative phase can be simplified using telemedicine with smartphone apps, video conferencing, or simple phone communication. A prospective trial by Reider-Demer et al. found that telemedicine postoperative follow-up for patients who underwent elective intracranial neurosurgery was a safe and effective alternative to in-office visits [149]. Moreover, the patients preferred the convenience of telemedicine visits.

It has been estimated that doctors spend up to 50% of their time on documentation, and nurses 20% [150]. Moreover, the enactment of the 21st Century Cures Act has created a great need for methods to quickly produce summaries and communications that are easily understood [151]. Once further refined, LLMs could be invaluable tools to help fill this gap by generating rudimentary plain language medical information that can be modified by clinicians. They can also be used to generate authorization letters and various other types of documentation based on keywords. This would drastically reduce the amount of time spent on documentation and allow physicians as well as other medical providers to devote more of their time to patient care.

7. Cost, feasibility, and limitations

7.1 Cost and feasibility

Successful integration of any new process or technology is dependent upon the ease of implementation, as well as the overall cost of the technology versus the revenue and benefit it generates. The United States leads in health care spending but has the worst outcomes when compared to nations such as Canada, Germany, the United Kingdom, Australia, Japan, Denmark, France, the Netherlands, Switzerland, and Sweden [152]. Health care spending is estimated to comprise nearly 20% of the US gross domestic product in 2025, which equates to $5.3 trillion [145]. Neurosurgery is among the most expensive medical specialties, with the average procedure and hospitalization costing $21,825 to $22,924 depending on the volume of the medical center [153]. The cost of a spinal fusion is 12 times greater than it was 30 years ago [145]. This, in combination with the emergence of value-based care and changing reimbursement patterns, has led to increased research into cost-saving methodologies. AI applications associated with this research include risk-adjusted reimbursement models, predictive models of hospital length of stay, and predictors of patients more suitable for outpatient procedures. Within the neurosurgical realm, these studies have focused on spinal surgeries, and there is a paucity of data on the intracranial surgical aspects of neurosurgery [154]. A meta-analysis of AI economic studies performed by Khanna et al. revealed that most of the research across medicine is focused on either diagnosis or treatment, and that the studies rarely consider the purchase and maintenance costs associated with AI and make few, if any, comparisons to alternative technologies [152].

Though investigation into the financial aspects of AI use in neurosurgery is on the rise, no study to date has produced a thorough net present value assessment within a large-scale experimental design [154]. Externally validated studies conducted on a larger scale with robust cost and net gain/loss calculations are necessary to accurately determine the feasibility and true value of the integration of AI into neurosurgery from a financial standpoint. This is particularly important given that the mean cost of an AI system ranges from $20,000 to $1 million, depending upon the system. The more complex the system, the greater the cost, although minimum viable products are available in the $8000 to $15,000 price range [155].
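For readers unfamiliar with the term, a net present value (NPV) assessment discounts each year’s net cash flow back to the present and sums the results. The sketch below shows the calculation under purely hypothetical inputs: a $100,000 upfront system cost (within the price range quoted above), $30,000 in assumed annual net savings for five years, and an assumed 5% discount rate. None of these figures comes from the cited literature.

```python
def net_present_value(rate, cash_flows):
    """Discount each cash flow to year 0 and sum.
    cash_flows[0] occurs now; cash_flows[t] occurs at the end of year t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical scenario: upfront purchase followed by five years of net savings.
upfront_cost = -100_000
annual_net_savings = 30_000
flows = [upfront_cost] + [annual_net_savings] * 5

npv = net_present_value(0.05, flows)
```

A positive NPV would indicate that the discounted savings exceed the upfront cost under the stated assumptions. A realistic assessment would also need to include the maintenance, staffing, and training costs, as well as the reimbursement limits, that apply to the system in question.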

Maintenance and continued operation represent a significant investment as well. AI systems require a staff of project managers, software engineers, data scientists, and software developers. A project manager will cost between $1200 and $4600 per month. Software engineers and data scientists cost $600 to $1500 per day and $500 to $1100 per day, respectively. The annual salary of an in-house data scientist averages $94,000, while a software developer has an annual salary of $80,000 [156]. Additionally, health networks incur an average cost of $15,000 to recruit candidates to fill these positions, as well as the cost to train the staff [156]. Outsourcing the maintenance and operation of the system offers a more frugal alternative to in-house staffing; however, there can be a lack of continuity and immediate availability with remote staff.

Reimbursement for AI is still in its relative infancy, as payers only began to approve coverage of AI use in late 2020 [157]. Currently, eight image-based assistive or autonomous AI devices are approved by the Centers for Medicare and Medicaid Services (CMS) for repayment, with two of the technologies holding surgical utility (Table 1). The criteria for repayment are very specific and quite complex, with payments ultimately covering a maximum of only 65% of the actual expense [158]. Compensation is based on Current Procedural Terminology (CPT) codes or New Technology Add-on Payments (NTAPs), which have a reimbursement limit of 3 years [157, 158]. In Europe, AI is not routinely covered and is not recognized as a separately reimbursable expense. Several alternative payment models, including gainsharing models, outcome incentivization, and advance market commitments, have been proposed, because per-use payments carry a recognized potential for abuse/fraud and for underutilization in underserved areas [157].

| Manufacturer | System | Description/Use | Payment mechanism |
| --- | --- | --- | --- |
| Digital Diagnostics | IDx-DR | Deep learning algorithm to diagnose diabetic retinopathy from fundoscopic images in the outpatient setting | CPT |
| Viz.ai | Viz LVO | Radiological computer-assisted triage and notification software that analyzes CT images of the brain and notifies hospital staff when a suspected large-vessel occlusion (LVO) is identified | NTAP |
| Rapid AI | Rapid LVO | | NTAP |
| Caption Health | Caption Guidance | AI-guided medical imaging acquisition system intended to assist medical professionals in the acquisition of cardiac ultrasound images | NTAP |
| Viz.ai | Viz SDH | Radiological computer-assisted triage and notification software that analyzes CT images of the brain and notifies hospital staff when a suspected subdural hematoma is identified | NTAP |
| Rapid AI | Rapid ASPECTS | Computer-aided diagnostic device characterizing brain tissue abnormalities on brain CT images | NTAP |
| Aidoc | BriefCase for PE | Radiological computer-assisted triage and notification software that analyzes CT images of the chest and notifies hospital staff when a suspected pulmonary embolism is identified | NTAP |
| PROCEPT BioRobotics Corporation | The AQUABEAM system | Autonomous tissue removal robot for the treatment of lower urinary tract symptoms due to benign prostatic hyperplasia (BPH) | NTAP |

Table 1. Modified from “Paying for artificial intelligence in medicine,” Parikh and Helmchen [157].

Ultimately, the future integration of AI into the field of neurosurgery will depend heavily upon whether the increases in efficiency and performance result in a tangible improvement in patient outcomes while providing a net cost savings to health networks. If AI proves to be a substantial solution, reassessment of reimbursements and insurance coverage is likely to follow.

7.2 Limitations

The remarkable growth and promise of AI in neurosurgery are not without limitations and concerns that must be taken into account. Firstly, it is imperative to consider that potentially substantial ML-driven improvements in performance are distinct from clinically significant improvements. Although ML models may offer drastic improvements in big data prediction problems, many medical prediction scenarios tend to be intrinsically linear and binary; in such cases, it is unlikely ML models will offer substantial improvements in discrimination and be of clinical value to the neurosurgeon [12, 23]. In short, the efficacy of ML algorithms boils down to the ability to predict future outcomes based on past data.

A primary concern with LLMs is their current inability to fully comprehend context or exercise judgment, which can cause significant misinterpretations and creates the potential to disseminate incorrect and potentially harmful information [159]. LLMs lack a mechanism for screening out biased or false information and cannot inform the end user that the information provided is incorrect. This concern is further compounded by the lack of transparency in the decision-making processes of LLMs like GPT-4. These models can offer explanations as to how and why they make certain decisions upon request, but these justifications are formed post hoc [160]. This makes it impossible to verify whether the explanations accurately represent the model’s actual decision-making process. Even more problematic is that when probed for an explanation, GPT-4 may provide information that contradicts its previous statements [159, 160]. The lack of reliability and reproducibility necessitates constant human oversight to ensure accuracy. Specific to medicine, clinicians would be required to fact-check these tools, which could easily negate any time savings LLMs may offer. Intellectual property matters are another issue with LLMs. These tools not only pull data and property from creators without consent, but some have also created and cited false references [150].

Furthermore, there is a tendency for bias, violations of privacy, and inherent logistical difficulties with the global utilization of AI. Datasets used to train algorithms are predominantly composed of information representing the majority and common conditions. This model bias can negatively impact racial and ethnic minority groups, genders, and socioeconomically disadvantaged peoples, in addition to diminishing the ability to recognize difficult anatomy [161, 162]. A study by Kamulageya et al. found that the AI dermatologic algorithm Skin Image Search was woefully inaccurate when presented with images of pathology in Ugandan patients with dark (Fitzpatrick 6) skin types [163]. The company website boasts an accuracy of 80%, but the algorithm was found to be only 17% accurate when presented with darker skin tones [163]. Facial recognition algorithms have also been found to have diminished capabilities with respect to both gender and race, performing the worst with females of darker skin tones [164]. These very groups already suffer from diminished access to care and undertreatment of disease in comparison to non-disadvantaged people. Model variance, which stems from insufficient data from minority groups, also furthers the bias of AI algorithms. Differences in practices, equipment, and coding also decrease the generalizability of AI algorithms. Designing algorithms with the global population in mind, analyzing performance on a subgroup basis, and externally validating the algorithms are ways to combat this [162].
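Analyzing performance on a subgroup basis, as recommended above, can be as simple as computing the same metric separately for each group rather than only over the pooled test set. The sketch below does this for accuracy; the group labels, ground-truth values, and predictions are entirely hypothetical and serve only to show how an acceptable pooled figure can mask poor performance in one subgroup.

```python
from collections import defaultdict

def accuracy_by_subgroup(records):
    """records: iterable of (group, true_label, predicted_label) tuples.
    Returns a dict mapping each group to its accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in records:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical evaluation records for two patient subgroups.
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 1),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0),
]

per_group = accuracy_by_subgroup(records)
overall = sum(int(t == p) for _, t, p in records) / len(records)
```

In this toy example the pooled accuracy sits between the two subgroup values, so reporting it alone would hide the weaker performance in the second group.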

Obtaining large quantities of patient data to train AI systems is difficult due to the necessary privacy protections added to patient data [161]. Inappropriate access to data sets and algorithms poses significant ethical, security, and privacy concerns. Algorithms can be manipulated by the addition of noise or altered data to produce harmful or deleterious effects on the system. Ensuring data privacy and security while allowing users and developers to learn and improve upon the technology is key to moving AI forward.

On a global scale, challenges to telesurgery include lags in connection speeds and the potential for delays and disconnections. The introduction of 5G technology has been touted as a possible remedy; however, this remains to be seen [165]. Another consideration is the cost of these systems and their maintenance [165, 166]. Will low- to middle-income countries, which are in the greatest need of assistance, share in the cost, or will the burden fall on the higher-income nations? While telesurgery may reduce medical tourism to a point, medical tourism will persist unless the infrastructure for preoperative and postoperative care is created within the countries in need. A likely solution for remote regions would involve smartphone apps for pre- and postoperative care, with medical tourism over a shorter distance for operative and immediate aftercare until the patient is sufficiently recovered. For any AI solution to be implemented in a low- to middle-income country, the obstacles of infrastructure (electricity, wifi, phone lines, etc.) and governance for AI will need to be overcome on a broad scale.

Frequently stated worries are overreliance on technology, the loss of jobs, and physician disapproval. Most technologies being created are intended to assist and prevent fatigue, and skills must be maintained in order to properly utilize the technology. While there are solutions that involve autonomous actions to be handled solely by AI technology, patients themselves are not in favor of operations or procedures in which a surgeon is not involved. A cross-sectional study conducted by Palmisciano et al. found that while the majority of patient respondents thought AI use was appropriate for image interpretation/preoperative planning or indicating potential complications (76.7 and 82.2% respectively), only 17.7% of these patients approved of AI performing an entire operation [167]. Physicians themselves are also quite welcoming of AI integration into neurosurgery. A survey of neurosurgeons, anesthetists, nurses, and operating room practitioners conducted by Horsfall et al. revealed that the majority of respondents viewed the use of AI in various aspects of neurosurgery favorably [167].

Among respondents, 62% favored AI use for imaging interpretation, 82% for operative planning, 70% for coordinating the surgical team, 85% for AI-generated real-time alerts to complications or hazards, and 66% for autonomous surgery by AI. Members of the Congress of Neurological Surgeons and European Association of the Neurosurgical Societies were polled by Staartjes and colleagues regarding the use of ML in neurosurgery. The results demonstrated that 28.8% of respondents used ML in clinical practice and 31.1% used ML for research [168].

8. The future of AI in neurosurgery

Future directions of AI integration into the field of neurosurgery involve both simple and complex solutions, some with global implications. The rise of telemedicine during the COVID-19 pandemic resulted in expanded applications which can be further built upon to partially address the global shortage of neurosurgeons [165]. Approximately 39 countries do not have access to neurosurgical care [3]. Smartphone apps can be used for postoperative follow up, obviating the need to travel prolonged distances to receive continued evaluation. Telesurgery has garnered significant interest, as the potential to decrease transportation costs, improve logistics, and reduce the carbon footprint associated with medical tourism is great. Conceptualized iterations involve an operative suite with robotic equipment that will be controlled by surgeons in a control room. Given the paucity of neurosurgeons relative to the population in need globally, it has been proposed that a general surgeon be at the control room adjacent to the patient, while a neurosurgeon is at the helm in a remote control room [165]. This also has implications for military use as surgeons would be able to care for patients in war zones remotely rather than risking their lives in the field [165].

Within the operating room, the push toward improved logistics and ergonomics as well as minimal to no contact procedures continues. Technologies to merge the microscope view, navigation imaging, and virtual or augmented reality screen into a single device such as surgical glasses are being developed [165]. There are a few augmented reality glasses (HoloLens, xvision Spine System) designed for surgical planning that are already commercially available [169]. The glasses project 3-D models of the patient’s anatomy (based on preoperative CT scans) directly into the surgical field, and can be controlled in a contactless manner with hand gestures and voice commands [169]. Magnetic navigation systems are being piloted for contactless endovascular operations [3].

9. Conclusion

This chapter broadly elucidated the scope of artificial intelligence in the field of neurosurgery. At present, AI has successfully been introduced in some clinical settings, especially in the realm of diagnostics. With the increasing capacity of ML and ANNs to abstract patient information and produce clinically relevant results, it appears likely that AI will continue to be increasingly integrated within neurosurgery. In particular, deep learning shows a trend away from fully supervised and rules-based methods and toward self-, partially, and semi-supervised algorithms, although the latter possess their own set of limitations.

Furthermore, the literature has demonstrated ad nauseam that when ML and ANN algorithms are tested prospectively on novel patient datasets, they perform, at best, equivalent to expert neurosurgeons in diagnostic examples. Thus, notions suggesting a diminishing scope of the neurosurgeon due to the emergence of AI should be dispelled. Rather, AI can serve to function as an adjunct to the neurosurgeon by playing a supportive role in the pre-, intra-, and postoperative phases of care. An ideal world for the neurosurgical patient of the future is one in which they are treated by a neurosurgeon clinically informed by artificial intelligence.

Yet, there are certain issues to be addressed prior to the overwhelming adoption of AI. In order to make this a truly feasible and applicable solution on a wide scale, uniform (or at least interchangeable) and globally generalizable, externally validated products are needed. Robust studies to fully elucidate the entire cost versus the cost savings from increased efficiency and improved clinical results must be conducted. This will help to inform both healthcare networks and payers on the true value of AI, thus facilitating the creation of a framework for reimbursement and funding methods. In short, greater communication and consensus among developers, healthcare systems, physicians, and payers will allow for the true potential of AI to be realized as a health solution.

References

  1. Bliss M. Harvey Cushing: A Life in Surgery. Oxford, United Kingdom: Oxford University Press; 2007
  2. Panesar SS, Kliot M, Parrish R, Fernandez-Miranda J, Cagle Y, Britz GW. Promises and perils of artificial intelligence in neurosurgery. Neurosurgery. 2020;87(1):33-44. DOI: 10.1093/neuros/nyz471
  3. Mofatteh M. Neurosurgery and artificial intelligence. AIMS Neuroscience. 2021;8(4):477-495. DOI: 10.3934/Neuroscience.2021025
  4. Sarker IH. AI-based modeling: Techniques, applications and research issues towards automation, intelligent and smart systems. SN Computer Science. 2022;3(2):158. DOI: 10.1007/s42979-022-01043-x
  5. Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to machine learning, neural networks, and deep learning. Translational Vision Science & Technology. 2020;9(2):14. DOI: 10.1167/tvst.9.2.14
  6. Niño M, Illarramendi A. Understanding big data: Antecedents, origin and later development. DYNA New Technologies. 2015;2:1-8. DOI: 10.6036/NT7835
  7. Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electronic Markets. 2021;31(3):685-695. DOI: 10.1007/s12525-021-00475-2
  8. Na KS. Prediction of future cognitive impairment among the community elderly: A machine-learning based approach. Scientific Reports. 2019;9(1):3335. DOI: 10.1038/s41598-019-39478-7
  9. Stawicki S. Application of financial analysis techniques to vital sign data: A novel method of trend interpretation in the intensive care unit. OPUS 12 Scientist. 2007;1(1):14-16
  10. Obermeyer Z, Emanuel EJ. Predicting the future - Big data, machine learning, and clinical medicine. The New England Journal of Medicine. 2016;375(13):1216-1219. DOI: 10.1056/NEJMp1606181
  11. Lee J, Warner E, Shaikhouni S, et al. Unsupervised machine learning for identifying important visual features through bag-of-words using histopathology data from chronic kidney disease. Scientific Reports. 2022;12(1):4832. DOI: 10.1038/s41598-022-08974-8
  12. Gravesteijn BY, Nieboer D, Ercole A, et al. Machine learning algorithms performed no better than regression models for prognostication in traumatic brain injury. Journal of Clinical Epidemiology. 2020;122:95-107. DOI: 10.1016/j.jclinepi.2020.03.005
  13. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology. 2019;110:12-22. DOI: 10.1016/j.jclinepi.2019.02.004
  14. Nahm FS. Receiver operating characteristic curve: Overview and practical use for clinicians. Korean Journal of Anesthesiology. 2022;75(1):25-36. DOI: 10.4097/kja.21209
  15. Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters. 2006;27(8):861-874. DOI: 10.1016/j.patrec.2005.10.010
  16. Muller MP, Tomlinson G, Marrie TJ, et al. Can routine laboratory tests discriminate between severe acute respiratory syndrome and other causes of community-acquired pneumonia? Clinical Infectious Diseases. 2005;40(8):1079-1086. DOI: 10.1086/428577
  17. Volovici V, Steyerberg EW, Cnossen MC, Haitsma IK, Dirven CMF, Maas AIR, Lingsma HF. Evolution of evidence and guideline recommendations for the medical management of severe traumatic brain injury. Journal of Neurotrauma. 2019;36(22):3183-3189. Available from: https://www.liebertpub.com/doi/10.1089/neu.2019.6474 [Accessed: December 14, 2022]
  18. Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making. 2019;19(1):281. DOI: 10.1186/s12911-019-1004-8
  19. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. npj Digital Medicine. 2018;1(1):1-8. DOI: 10.1038/s41746-018-0040-6
  20. Zhu X, Wu X. Class noise vs. attribute noise: A quantitative study. Artificial Intelligence Review. 2004;22(3):177-210. DOI: 10.1007/s10462-004-0751-8
  21. Gupta S, Gupta A. Dealing with noise problem in machine learning data-sets: A systematic review. Procedia Computer Science. 2019;161:466-474. DOI: 10.1016/j.procs.2019.11.146
  22. What are the parts of the nervous system? Published October 1, 2018. Available from: https://www.nichd.nih.gov/health/topics/neuro/conditioninfo/parts [Accessed: April 6, 2023]
  23. Shepherd GM. The Synaptic Organization of the Brain. 5th ed. Vol. xiv. Oxford, United Kingdom: Oxford University Press; 2004. p. 719. DOI: 10.1093/acprof:oso/9780195159561.001.1
  24. Ebersole JS, Pedley TA. Current Practice of Clinical Electroencephalography. Philadelphia, Pennsylvania: Lippincott Williams & Wilkins; 2003
  25. Reyes A. Influence of dendritic conductances on the input-output properties of neurons. Annual Review of Neuroscience. 2001;24:653-675. DOI: 10.1146/annurev.neuro.24.1.653
  26. Purves D, Augustine GJ, Fitzpatrick D, et al. Excitatory and inhibitory postsynaptic potentials. Neuroscience. 2nd ed. Sunderland (MA): Sinauer Associates; 2001. Available from: https://www.ncbi.nlm.nih.gov/books/NBK11117/ [Accessed: January 28, 2023]
  27. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biology. 1990;52(1):99-115. DOI: 10.1007/BF02459570
  28. Han SH, Kim KW, Kim S, Youn YC. Artificial neural network: Understanding the basic concepts without mathematics. Dementia and Neurocognitive Disorders. 2018;17(3):83-89. DOI: 10.12779/dnd.2018.17.3.83
  29. Indolia S, Goswami AK, Mishra SP, Asopa P. Conceptual understanding of convolutional neural network- a deep learning approach. Procedia Computer Science. 2018;132:679-688. DOI: 10.1016/j.procs.2018.05.069
  30. Sarker IH. Deep learning: A comprehensive overview on techniques, taxonomy, applications and research directions. SN Computer Science. 2021;2(6):420. DOI: 10.1007/s42979-021-00815-1
  31. Harbaugh RE. Editorial. Artificial neural networks for neurosurgical diagnosis, prognosis, and management. Neurosurgical Focus. 2018;45(5):E3. DOI: 10.3171/2018.8.FOCUS18438
  32. AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the Difference? Published January 19, 2022. Available from: https://www.ibm.com/cloud/blog/ai-vs-machine-learning-vs-deep-learning-vs-neural-networks [Accessed: February 26, 2023]
  33. Azimi P, Mohammadi HR, Benzel EC, Shahzadi S, Azhari S, Montazeri A. Artificial neural networks in neurosurgery. Journal of Neurology, Neurosurgery, and Psychiatry. 2015;86(3):251-256. DOI: 10.1136/jnnp-2014-307807
  34. Yang S, Zhu F, Ling X, Liu Q, Zhao P. Intelligent health care: Applications of deep learning in computational medicine. Frontiers in Genetics. 2021;12:607471. Available from: https://www.frontiersin.org/articles/10.3389/fgene.2021.607471/full [Accessed: February 8, 2023]
  35. Khurana D, Koli A, Khatter K, Singh S. Natural language processing: State of the art, current trends and challenges. Multimedia Tools and Applications. 2023;82(3):3713-3744. DOI: 10.1007/s11042-022-13428-4
  36. Tan WK, Hassanpour S, Heagerty PJ, et al. Comparison of natural language processing rules-based and machine-learning systems to identify lumbar spine imaging findings related to low back pain. Academic Radiology. 2018;25(11):1422-1432. DOI: 10.1016/j.acra.2018.03.008
  37. Bacco L, Russo F, Ambrosio L, et al. Natural language processing in low back pain and spine diseases: A systematic review. Frontiers in Surgery. 2022;9:957085. Available from: https://www.frontiersin.org/articles/10.3389/fsurg.2022.957085 [Accessed: February 23, 2023]
  38. Dave T, Athaluri SA, Singh S. ChatGPT in medicine: An overview of its applications, advantages, limitations, future prospects, and ethical considerations. Frontiers in Artificial Intelligence. 2023;6:1169595. DOI: 10.3389/frai.2023.1169595
  39. Ali R, Tang OY, Connolly ID, et al. Performance of ChatGPT and GPT-4 on neurosurgery written board examinations. Neurosurgery. 2023;00:1-13. DOI: 10.1101/2023.03.25.23287743
  40. Blease C, Kharko A, Annoni M, Gaab J, Locher C. Machine learning in clinical psychology and psychotherapy education: A mixed methods pilot survey of postgraduate students at a Swiss university. Frontiers in Public Health. 2021;9:623088. Available from: https://www.frontiersin.org/articles/10.3389/fpubh.2021.623088 [Accessed: July 31, 2023]
  41. Shickel B, Tighe P, Bihorac A, Rashidi P. Deep EHR: A survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE Journal of Biomedical and Health Informatics. 2018;22(5):1589-1604. DOI: 10.1109/JBHI.2017.2767063
  42. Kumah-Crystal YA, Pirtle CJ, Whyte HM, Goode ES, Anders SH, Lehmann CU. Electronic health record interactions through voice: A review. Applied Clinical Informatics. 2018;9(3):541-552. DOI: 10.1055/s-0038-1666844
  43. Wartman SA, Combs CD. Medical education must move from the information age to the age of artificial intelligence. Academic Medicine: Journal of the Association of American Medical Colleges. 2018;93(8):1107-1109. DOI: 10.1097/ACM.0000000000002044
  44. Oermann EK, Rubinsteyn A, Ding D, et al. Using a machine learning approach to predict outcomes after radiosurgery for cerebral arteriovenous malformations. Scientific Reports. 2016;6(1):21161. DOI: 10.1038/srep21161
  45. Azimi P, Mohammadi HR. Predicting endoscopic third ventriculostomy success in childhood hydrocephalus: An artificial neural network analysis: Clinical article. Journal of Neurosurgery. Pediatrics. 2014;13(4):426-432. DOI: 10.3171/2013.12.PEDS13423
  46. Habibi Z, Ertiaei A, Nikdad MS, et al. Predicting ventriculoperitoneal shunt infection in children with hydrocephalus using artificial neural network. Child's Nervous System. 2016;32(11):2143-2151. DOI: 10.1007/s00381-016-3248-2
  47. Asadi H, Kok HK, Looby S, Brennan P, O'Hare A, Thornton J. Outcomes and complications after endovascular treatment of brain arteriovenous malformations: A prognostication attempt using artificial intelligence. World Neurosurgery. 2016;96:562-569.e1. Available from: https://www.sciencedirect.com/science/article/pii/S1878875016309160 [Accessed: December 14, 2022]
  48. Azimi P, Benzel EC, Shahzadi S, Azhari S, Mohammadi HR. Use of artificial neural networks to predict surgical satisfaction in patients with lumbar spinal canal stenosis. Journal of Neurosurgery: Spine. 2014;20(3):300-305. Available from: https://thejns.org/spine/view/journals/j-neurosurg-spine/20/3/article-p300.xml [Accessed: December 14, 2022]
  49. Shi HY, Hwang SL, Lee KT, Lin CL. In-hospital mortality after traumatic brain injury surgery: A nationwide population-based comparison of mortality predictors used in artificial neural network and logistic regression models: Clinical article. Journal of Neurosurgery. 2013;118(4):746-752. DOI: 10.3171/2013.1.JNS121130
  50. Dumont TM, Rughani AI, Tranmer BI. Prediction of symptomatic cerebral vasospasm after aneurysmal subarachnoid hemorrhage with an artificial neural network: Feasibility and comparison with logistic regression models. World Neurosurgery. 2011;75(1):57-63. Available from: https://www.sciencedirect.com/science/article/pii/S187887501000358X [Accessed: December 14, 2022]
  51. Abouzari M, Rashidi A, Zandi-Toghani M, Behzadi M, Asadollahi M. Chronic subdural hematoma outcome prediction using logistic regression and an artificial neural network. Neurosurgical Review. 2009;32(4):479-484. DOI: 10.1007/s10143-009-0215-3
  52. Senders JT, Staples PC, Karhade AV, et al. Machine learning and neurosurgical outcome prediction: A systematic review. World Neurosurgery. 2018;109:476-486.e1. DOI: 10.1016/j.wneu.2017.09.149
  53. Masoudi MS, Rezaei E, Tahmouresi A, et al. Prediction of 6 months endoscopic third ventriculostomy success rate in patients with hydrocephalus using a multi-layer perceptron network. Clinical Neurology and Neurosurgery. 2022;219:107295. DOI: 10.1016/j.clineuro.2022.107295
  54. Tunthanathip T, Duangsuwan J, Wattanakitrungroj N, Tongman S, Phuenpathom N. Comparison of intracranial injury predictability between machine learning algorithms and the nomogram in pediatric traumatic brain injury. Neurosurgical Focus. 2021;51(5):E7. DOI: 10.3171/2021.8.FOCUS2155
  55. Emblem KE, Nedregaard B, Hald JK, Nome T, Due-Tonnessen P, Bjornerud A. Automatic glioma characterization from dynamic susceptibility contrast imaging: Brain tumor segmentation using knowledge-based fuzzy clustering. Journal of Magnetic Resonance Imaging. 2009;30(1):1-10. DOI: 10.1002/jmri.21815
  56. Emblem KE, Pinho MC, Zöllner FG, et al. A generic support vector machine model for preoperative glioma survival associations. Radiology. 2015;275(1):228-234. DOI: 10.1148/radiol.14140770
  57. Sagberg LM, Jakola AS, Reinertsen I, Solheim O. How well do neurosurgeons predict survival in patients with high-grade glioma? Neurosurgical Review. 2022;45(1):865-872. DOI: 10.1007/s10143-021-01613-2
  58. Kaka H, Zhang E, Khan N. Artificial intelligence and deep learning in neuroradiology: Exploring the new frontier. Canadian Association of Radiologists Journal. 2021;72(1):35-44. DOI: 10.1177/0846537120954293
  59. Matsoukas S, Scaggiante J, Schuldt BR, et al. Accuracy of artificial intelligence for the detection of intracranial hemorrhage and chronic cerebral microbleeds: A systematic review and pooled analysis. La Radiologia Medica. 2022;127(10):1106-1123. DOI: 10.1007/s11547-022-01530-4
  60. Stawicki SP, Wojda TR, Nuschke JD, et al. Prognostication of traumatic brain injury outcomes in older trauma patients: A novel risk assessment tool based on initial cranial CT findings. International Journal of Critical Illness and Injury Science. 2017;7(1):23. DOI: 10.4103/IJCIIS.IJCIIS_2_17
  61. Ye H, Gao F, Yin Y, et al. Precise diagnosis of intracranial hemorrhage and subtypes using a three-dimensional joint convolutional and recurrent neural network. European Radiology. 2019;29(11):6191-6201. DOI: 10.1007/s00330-019-06163-2
  62. Intracranial Hemorrhage (ICH), Aidoc. Available from: www.AIforRadiology.com/ [Accessed: February 6, 2023]
  63. Ginat DT. Analysis of head CT scans flagged by deep learning software for acute intracranial hemorrhage. Neuroradiology. 2020;62(3):335-340. DOI: 10.1007/s00234-019-02330-w
  64. Voter AF, Meram E, Garrett JW, Yu JPJ. Diagnostic accuracy and failure mode analysis of a deep learning algorithm for the detection of intracranial hemorrhage. Journal of the American College of Radiology. 2021;18(8):1143-1152. DOI: 10.1016/j.jacr.2021.03.005
  65. Smith J. Radiological Computer Aided Triage and Notification Software (510(k) Number K180647), U. FDA, Editor. Philadelphia, PA; 2018
  66. Ojeda P, Zawaideh M, Mossa-Basha M, Haynor D. The utility of deep learning: Evaluation of a convolutional neural network for detection of intracranial bleeds on non-contrast head computed tomography studies. In: Medical Imaging 2019: Image Processing. Vol. 10949. San Diego, California, USA: SPIE; 2019. pp. 899-906. DOI: 10.1117/12.2513167
  67. Wang X, Liang G, Zhang Y, Blanton H, Bessinger Z, Jacobs N. Inconsistent performance of deep learning models on mammogram classification. Journal of the American College of Radiology. 2020;17(6):796-803. DOI: 10.1016/j.jacr.2020.01.006
  68. McLouth J, Elstrott S, Chaibi Y, et al. Validation of a deep learning tool in the detection of intracranial hemorrhage and large vessel occlusion. Frontiers in Neurology. 2021;12:656112. Available from: https://www.frontiersin.org/articles/10.3389/fneur.2021.656112 [Accessed: February 6, 2023]
  69. Rava RA, Seymour SE, LaQue ME, et al. Assessment of an artificial intelligence algorithm for detection of intracranial hemorrhage. World Neurosurgery. 2021;150:e209-e217. DOI: 10.1016/j.wneu.2021.02.134
  70. Pop NO, Tit DM, Diaconu CC, et al. The Alberta stroke program early CT score (ASPECTS): A predictor of mortality in acute ischemic stroke. Experimental and Therapeutic Medicine. 2021;22(6):1371. DOI: 10.3892/etm.2021.10805
  71. Soun JE, Chow DS, Nagamine M, et al. Artificial intelligence and acute stroke imaging. AJNR. American Journal of Neuroradiology. 2021;42(1):2-11. DOI: 10.3174/ajnr.A6883
  72. Guberina N, Dietrich U, Radbruch A, et al. Detection of early infarction signs with machine learning-based diagnosis by means of the Alberta stroke program early CT score (ASPECTS) in the clinical routine. Neuroradiology. 2018;60(9):889-901. DOI: 10.1007/s00234-018-2066-5
  73. Maegerlein C, Fischer J, Mönch S, et al. Automated calculation of the Alberta stroke program early CT score: Feasibility and reliability. Radiology. 2019;291(1):141-148. DOI: 10.1148/radiol.2019181228
  74. You J, Yu PLH, Tsang ACO, Tsui ELH, Woo PPS, Leung GKK. Automated computer evaluation of acute ischemic stroke and large vessel occlusion. arXiv. 2019;4:1-8. DOI: 10.48550/arXiv.1906.08059
  75. Albers GW, Marks MP, Kemp S, et al. Thrombectomy for stroke at 6 to 16 hours with selection by perfusion imaging. The New England Journal of Medicine. 2018;378(8):708-718. DOI: 10.1056/NEJMoa1713973
  76. Nielsen A, Hansen MB, Tietze A, Mouridsen K. Prediction of tissue outcome and assessment of treatment effect in acute ischemic stroke using deep learning. Stroke. 2018;49(6):1394-1401. DOI: 10.1161/STROKEAHA.117.019740
  77. Nishi H, Oishi N, Ishii A, et al. Deep learning-derived high-level neuroimaging features predict clinical outcomes for large vessel occlusion. Stroke. 2020;51(5):1484-1492. DOI: 10.1161/STROKEAHA.119.028101
  78. Liu J, Zou X, Zhao Y, et al. Prevalence and risk factors for unruptured intracranial aneurysms in the population at high risk for aneurysm in the rural areas of Tianjin. Frontiers in Neurology. 2022;13:853054. Available from: https://www.frontiersin.org/articles/10.3389/fneur.2022.853054 [Accessed: April 6, 2023]
  79. van Gijn J, Kerr RS, Rinkel GJE. Subarachnoid haemorrhage. The Lancet. 2007;369(9558):306-318. DOI: 10.1016/S0140-6736(07)60153-6
  80. Keedy A. An overview of intracranial aneurysms. McGill Journal of Medicine. 2006;9(2):141-146
  81. Tang H, Hu N, Yuan Y, et al. Accelerated time-of-flight magnetic resonance angiography with sparse undersampling and iterative reconstruction for the evaluation of intracranial arteries. Korean Journal of Radiology. 2019;20(2):265-274. DOI: 10.3348/kjr.2017.0634
  82. Nakao T, Hanaoka S, Nomura Y, et al. Deep neural network-based computer-assisted detection of cerebral aneurysms in MR angiography. Journal of Magnetic Resonance Imaging (JMRI). 2018;47(4):948-953. DOI: 10.1002/jmri.25842
  83. Faron A, Sichtermann T, Teichert N, et al. Performance of a deep-learning neural network to detect intracranial aneurysms from 3D TOF-MRA compared to human readers. Clinical Neuroradiology. 2020;30(3):591-598. DOI: 10.1007/s00062-019-00809-w
  84. Ueda D, Yamamoto A, Nishimori M, et al. Deep learning for MR angiography: Automated detection of cerebral aneurysms. Radiology. 2019;290(1):187-194. DOI: 10.1148/radiol.2018180901
  85. Shi Z, Miao C, Schoepf UJ, et al. A clinically applicable deep-learning model for detecting intracranial aneurysm in computed tomography angiography images. Nature Communications. 2020;11(1):6090. DOI: 10.1038/s41467-020-19527-w
  86. Yang J, Xie M, Hu C, et al. Deep learning for detecting cerebral aneurysms with CT angiography. Radiology. 2021;298(1):155-163. DOI: 10.1148/radiol.2020192154
  87. Fawzi A, Achuthan A, Belaton B. Brain image segmentation in recent years: A narrative review. Brain Sciences. 2021;11(8):1055. DOI: 10.3390/brainsci11081055
  88. Virupakshappa AB. Cognition-based MRI brain tumor segmentation technique using modified level set method. Cognition, Technology & Work. 2019;21(3):357-369. DOI: 10.1007/s10111-018-0472-4
  89. Ilunga-Mbuyamba E, Avina-Cervantes JG, Garcia-Perez A, et al. Localized active contour model with background intensity compensation applied on automatic MR brain tumor segmentation. Neurocomputing. 2017;220:84-97. DOI: 10.1016/j.neucom.2016.07.057
  90. Kermi A, Andjouh K, Zidane F. Fully automated brain tumour segmentation system in 3D-MRI using symmetry analysis of brain and level sets. IET Image Processing. 2018;12(11):1964-1971. DOI: 10.1049/iet-ipr.2017.1124
  91. Achuthan A, Rajeswari M. Segmentation of hippocampus guided by assembled and weighted coherent point drift registration. Journal of King Saud University - Computer and Information Sciences. 2021;33(8):1008-1017. DOI: 10.1016/j.jksuci.2019.06.011
  92. Safavian N, Batouli SAH, Oghabian MA. An automatic level set method for hippocampus segmentation in MR images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization. 2020;8(4):400-410. DOI: 10.1080/21681163.2019.1706054
  93. Liu Z, Tong L, Chen L, et al. Deep learning based brain tumor segmentation: A survey. Complex & Intelligent Systems. 2023;9(1):1001-1026. DOI: 10.1007/s40747-022-00815-5
  94. Chen G, Li Q, Shi F, Rekik I, Pan Z. RFDCR: Automated brain lesion segmentation using cascaded random forests with dense conditional random fields. NeuroImage. 2020;211:116620. DOI: 10.1016/j.neuroimage.2020.116620
  95. Nitta GR, Sravani T, Nitta S, Muthu B. Dominant gray level based K-means algorithm for MRI images. Health Technology. 2020;10(1):281-287. DOI: 10.1007/s12553-018-00293-1
  96. Soltaninejad M, Yang G, Lambrou T, Allinson N, Jones TL, Barrick TR, Howe FA, Ye X. Supervised learning based multimodal MRI brain tumour segmentation using texture features from supervoxels. Computer Methods and Programs in Biomedicine. 2018;157:69-84. Available from: https://www.sciencedirect.com/science/article/pii/S016926071731355X?via%3Dihub [Accessed: February 25, 2023]
  97. Gunasekara SR, Kaldera HNTK, Dissanayake MB. A systematic approach for MRI brain tumor localization and segmentation using deep learning and active contouring. Journal of Healthcare Engineering. 2021;2021:e6695108. DOI: 10.1155/2021/6695108
  98. Ribalta Lorenzo P, Nalepa J, Bobek-Billewicz B, et al. Segmenting brain tumors from FLAIR MRI using fully convolutional neural networks. Computer Methods and Programs in Biomedicine. 2019;176:135-148. DOI: 10.1016/j.cmpb.2019.05.006
  99. Wu D, Ding Y, Zhang M, Yang Q, Qin Z. Multi-features refinement and aggregation for medical brain segmentation. IEEE Access. 2020;8:57483-57496. DOI: 10.1109/ACCESS.2020.2981380
  100. Silva CA, Pinto A, Pereira S, Lopes A. Multi-stage deep layer aggregation for brain tumor segmentation. In: Crimi A, Bakas S, editors. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Lecture Notes in Computer Science. New York City, New York: Springer International Publishing; 2021. pp. 179-188. DOI: 10.1007/978-3-030-72087-2_16
  101. Cui S, Mao L, Jiang J, Liu C, Xiong S. Automatic semantic segmentation of brain gliomas from MRI images using a deep cascaded neural network. Journal of Healthcare Engineering. 2018;2018:e4940593. DOI: 10.1155/2018/4940593
  102. Chen H, Qin Z, Ding Y, Tian L, Qin Z. Brain tumor segmentation with deep convolutional symmetric neural network. Neurocomputing. 2020;392:305-313. DOI: 10.1016/j.neucom.2019.01.111
  103. Guo Z, Li X, Huang H, Guo N, Li Q. Deep learning-based image segmentation on multimodal medical imaging. IEEE Transactions on Radiation and Plasma Medical Sciences. 2019;3(2):162-169. DOI: 10.1109/TRPMS.2018.2890359
  104. Sajid S, Hussain S, Sarwar A. Brain tumor detection and segmentation in MR images using deep learning. Arabian Journal for Science and Engineering. 2019;44(11):9249-9261. DOI: 10.1007/s13369-019-03967-8
  105. Li H, Li A, Wang M. A novel end-to-end brain tumor segmentation method using improved fully convolutional networks. Computers in Biology and Medicine. 2019;108:150-160. DOI: 10.1016/j.compbiomed.2019.03.014
  106. Iqbal S, Ghani MU, Saba T, Rehman A. Brain tumor segmentation in multi-spectral MRI using convolutional neural networks (CNN). Microscopy Research and Technique. 2018;81(4):419-427. DOI: 10.1002/jemt.22994
  107. Chen L, Bentley P, Mori K, Misawa K, Fujiwara M, Rueckert D. DRINet for medical image segmentation. IEEE Transactions on Medical Imaging. 2018;37(11):2453-2462. DOI: 10.1109/TMI.2018.2835303
  108. Moeskops P, de Bresser J, Kuijf HJ, et al. Evaluation of a deep learning approach for the segmentation of brain tissues and white matter hyperintensities of presumed vascular origin in MRI. NeuroImage: Clinical. 2018;17:251-262. DOI: 10.1016/j.nicl.2017.10.007
  109. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin PM, Larochelle H. Brain tumor segmentation with deep neural networks. Medical Image Analysis. 2017;35:18-31. Available from: https://www.sciencedirect.com/science/article/pii/S1361841516300330?via%3Dihub [Accessed: February 26, 2023]
  110. Myronenko A. 3D MRI brain tumor segmentation using autoencoder regularization. In: Crimi A, Bakas S, Kuijf H, Keyvan F, Reyes M, van Walsum T, editors. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Lecture Notes in Computer Science. New York City, New York: Springer International Publishing; 2019. pp. 311-320. DOI: 10.1007/978-3-030-11726-9_28
  111. Nema S, Dudhane A, Murala S, Naidu S. RescueNet: An unpaired GAN for brain tumor segmentation. Biomedical Signal Processing and Control. 2020;55:101641. Available from: https://www.sciencedirect.com/science/article/pii/S1746809419302228?via%3Dihub [Accessed: February 26, 2023]
  112. Baid U, Talbar S, Rane S, et al. A novel approach for fully automatic intra-tumor segmentation with 3D U-net architecture for gliomas. Frontiers in Computational Neuroscience. 2020;14:10. Available from: https://www.frontiersin.org/articles/10.3389/fncom.2020.00010 [Accessed: February 26, 2023]
  113. Zhou Z, He Z, Shi M, Du J, Chen D. 3D dense connectivity network with atrous convolutional feature pyramid for brain tumor segmentation in magnetic resonance imaging of human heads. Computers in Biology and Medicine. 2020;121:103766. DOI: 10.1016/j.compbiomed.2020.103766
  114. Sun J, Peng Y, Guo Y, Li D. Segmentation of the multimodal brain tumor image used the multi-pathway architecture method based on 3D FCN. Neurocomputing. 2021;423:34-45. DOI: 10.1016/j.neucom.2020.10.031
  115. Ramzan F, Khan MUG, Iqbal S, Saba T, Rehman A. Volumetric segmentation of brain regions from MRI scans using 3D convolutional neural networks. IEEE Access. 2020;8:103697-103709. DOI: 10.1109/ACCESS.2020.2998901
  116. Wang G, Li W, Ourselin S, Vercauteren T. Automatic brain tumor segmentation based on cascaded convolutional neural networks with uncertainty estimation. Frontiers in Computational Neuroscience. 2019;13:56. Available from: https://www.frontiersin.org/articles/10.3389/fncom.2019.00056 [Accessed: February 26, 2023]
  117. Kamnitsas K, Ledig C, Newcombe VFJ, et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis. 2017;36:61-78. DOI: 10.1016/j.media.2016.10.004
  118. Fidon L, Li W, Garcia-Peraza-Herrera LC, et al. Scalable multimodal convolutional networks for brain tumour segmentation. In: Descoteaux M, Maier-Hein L, Franz A, Jannin P, Collins DL, Duchesne S, editors. Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. Lecture Notes in Computer Science. New York City, New York: Springer International Publishing; 2017. pp. 285-293. DOI: 10.1007/978-3-319-66179-7_33
  119. Pham TX, Siarry P, Oulhadj H. Integrating fuzzy entropy clustering with an improved PSO for MRI brain image segmentation. Applied Soft Computing. 2018;65:230-242. DOI: 10.1016/j.asoc.2018.01.003
  120. Pham TX, Siarry P, Oulhadj H. Segmentation of MR brain images through hidden Markov random field and hybrid metaheuristic algorithm. IEEE Transactions on Image Processing. 2020;29:6507-6522. DOI: 10.1109/TIP.2020.2990346
  121. 121. Thuy Xuan Pham, Patrick Siarry, Hamouche Oulhadj; A multi-objective optimization approach for brain MRI segmentation using fuzzy entropy clustering and region-based active contour methods, Magnetic Resonance Imaging. 2019;61:41-65. Available from: https://www.sciencedirect.com/science/article/pii/S0730725X18301991?via%3Dihub [Accessed: February 26, 2023]
  122. 122. Mishro PK, Agrawal S, Panda R, Abraham A. A novel Type-2 fuzzy C-means clustering for brain MR image segmentation. IEEE Transactions on Cybernetics. 2021;51(8):3901-3912. DOI: 10.1109/TCYB.2020.2994235
  123. 123. Caesarendra W, Rahmaniar W, Mathew J, Thien A. Automated cobb angle measurement for adolescent idiopathic scoliosis using convolutional neural network. Diagnostics. 2022;12(2):396. DOI: 10.3390/diagnostics12020396
  124. 124. Chockalingam N, Dangerfield PH, Giakas G, Cochrane T, Dorgan JC. Computer-assisted cobb measurement of scoliosis. European Spine Journal. 2002;11(4):353-357. DOI: 10.1007/s00586-002-0386-x
  125. 125. Shea KG, Stevens PM, Nelson M, Smith JT, Masters KS, Yandow S. A comparison of manual versus computer-assisted radiographic measurement. Intraobserver measurement variability for cobb angles. Spine. 1998;23(5):551-555. DOI: 10.1097/00007632-199803010-00007
  126. 126. Jin C, Wang S, Yang G, Li E, Liang Z. A review of the methods on cobb angle measurements for spinal curvature. Sensors. 2022;22(9):3258. DOI: 10.3390/s22093258
  127. 127. Wang J, Zhang J, Xu R, Chen TG, Zhou KS, Zhang HH. Measurement of scoliosis cobb angle by end vertebra tilt angle method. Journal of Orthopaedic Surgery. 2018;13(1):223. DOI: 10.1186/s13018-018-0928-5
  128. 128. Sun Y, Xing Y, Zhao Z, Meng X, Xu G, Hai Y. Comparison of manual versus automated measurement of cobb angle in idiopathic scoliosis based on a deep learning keypoint detection technology. European Spine Journal. 2022;31(8):1969-1978. DOI: 10.1007/s00586-021-07025-6
  129. 129. Tomita N, Cheung YY, Hassanpour S. Deep neural networks for automatic detection of osteoporotic vertebral fractures on CT scans. Computers in Biology and Medicine. 2018;98:8-15. Available from: https://www.sciencedirect.com/science/article/pii/S0010482518301185?via%3Dihub [Accessed: February 26, 2023]
  130. Small JE, Osler P, Paul AB, Kunst M. CT cervical spine fracture detection using a convolutional neural network. AJNR. American Journal of Neuroradiology. 2021;42(7):1341-1347. DOI: 10.3174/ajnr.A7094
  131. Derkatch S, Kirby C, Kimelman D, Jozani MJ, Davidson JM, Leslie WD. Identification of vertebral fractures by convolutional neural networks to predict nonvertebral and hip fractures: A registry-based cohort study of dual X-ray absorptiometry. Radiology. 2019;293(2):405-411. DOI: 10.1148/radiol.2019190201
  132. Lehnen NC, Haase R, Faber J, et al. Detection of degenerative changes on MR images of the lumbar spine with a convolutional neural network: A feasibility study. Diagnostics. 2021;11(5):902. DOI: 10.3390/diagnostics11050902
  133. Whitehead W, Moran S, Gaonkar B, Macyszyn L, Iyer S. A deep learning approach to spine segmentation using a feed-forward chain of pixel-wise convolutional networks. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). New York City, New York, USA; 2018. pp. 868-871. DOI: 10.1109/ISBI.2018.8363709
  134. Huang J, Shen H, Wu J, et al. Spine explorer: A deep learning based fully automated program for efficient and reliable quantifications of the vertebrae and discs on sagittal lumbar spine MR images. The Spine Journal. 2020;20(4):590-599. DOI: 10.1016/j.spinee.2019.11.010
  135. Shen H, Huang J, Zheng Q, et al. A deep-learning-based, fully automated program to segment and quantify major spinal components on axial lumbar spine magnetic resonance images. Physical Therapy. 2021;101(6):pzab041. DOI: 10.1093/ptj/pzab041
  136. Cheng YK, Lin CL, Huang YC, et al. Automatic segmentation of specific intervertebral discs through a two-stage MultiResUNet model. Journal of Clinical Medicine. 2021;10(20):4760. DOI: 10.3390/jcm10204760
  137. Miotto R, Percha BL, Glicksberg BS, et al. Identifying acute low back pain episodes in primary care practice from clinical notes: Observational study. JMIR Medical Informatics. 2020;8(2):e16878. DOI: 10.2196/16878
  138. Meola A, Cutolo F, Carbone M, Cagnazzo F, Ferrari M, Ferrari V. Augmented reality in neurosurgery: A systematic review. Neurosurgical Review. 2017;40(4):537-548. DOI: 10.1007/s10143-016-0732-9
  139. Sun GC, Wang F, Chen XL, et al. Impact of virtual and augmented reality based on intraoperative magnetic resonance imaging and functional neuronavigation in glioma surgery involving eloquent areas. World Neurosurgery. 2016;96:375-382. DOI: 10.1016/j.wneu.2016.07.107
  140. Bopp MHA, Saß B, Pojskić M, et al. Use of neuronavigation and augmented reality in transsphenoidal pituitary adenoma surgery. Journal of Clinical Medicine. 2022;11(19):5590. DOI: 10.3390/jcm11195590
  141. Rychen J, Goldberg J, Raabe A, Bervini D. Augmented reality in superficial temporal artery to middle cerebral artery bypass surgery: Technical note. Operative Neurosurgery. 2020;18(4):444. DOI: 10.1093/ons/opz176
  142. Cakmakci D, Karakaslar EO, Ruhland E, et al. Machine learning assisted intraoperative assessment of brain tumor margins using HRMAS NMR spectroscopy. PLoS Computational Biology. 2020;16(11):e1008184. DOI: 10.1371/journal.pcbi.1008184
  143. Jabarkheel R, Ho CS, Rodrigues AJ, et al. Rapid intraoperative diagnosis of pediatric brain tumors using Raman spectroscopy: A machine learning approach. Neuro-Oncology Advances. 2022;4(1):vdac118. Available from: https://academic.oup.com/noa/article/4/1/vdac118/6650324 [Accessed: March 6, 2023]
  144. Chang M, Canseco JA, Nicholson KJ, Patel N, Vaccaro AR. The role of machine learning in spine surgery: The future is now. Frontiers in Surgery. 2020;7:54. Available from: https://www.frontiersin.org/articles/10.3389/fsurg.2020.00054 [Accessed: March 6, 2023]
  145. Rasouli JJ, Shao J, Neifert S, et al. Artificial intelligence and robotics in spine surgery. Global Spine Journal. 2021;11(4):556-564. DOI: 10.1177/2192568220915718
  146. Arvind V, Kim JS, Oermann EK, Kaji D, Cho SK. Predicting surgical complications in adult patients undergoing anterior cervical discectomy and fusion using machine learning. Neurospine. 2018;15(4):329-337. DOI: 10.14245/ns.1836248.124
  147. Kim JS, Merrill RK, Arvind V, et al. Examining the ability of artificial neural networks machine learning models to accurately predict complications following posterior lumbar spine fusion. Spine. 2018;43(12):853-860. DOI: 10.1097/BRS.0000000000002442
  148. Yu J, Shi Z, Lian Y, et al. Noninvasive IDH1 mutation estimation based on a quantitative radiomics approach for grade II glioma. European Radiology. 2017;27(8):3509-3522. Available from: https://pubmed.ncbi.nlm.nih.gov/28004160/ [Accessed: March 6, 2023]
  149. Reider-Demer M, Raja P, Martin N, Schwinger M, Babayan D. Prospective and retrospective study of videoconference telemedicine follow-up after elective neurosurgery: Results of a pilot program. Neurosurgical Review. 2018;41(2):497-501. DOI: 10.1007/s10143-017-0878-0
  150. Harrer S. Attention is not all you need: The complicated case of ethically using large language models in healthcare and medicine. eBioMedicine. 2023;90:104512. DOI: 10.1016/j.ebiom.2023.104512
  151. Information Blocking. HealthIT.gov. Available from: https://www.healthit.gov/topic/information-blocking [Accessed: July 31, 2023]
  152. Khanna NN, Maindarkar MA, Viswanathan V, et al. Economics of artificial intelligence in healthcare: Diagnosis vs. treatment. Healthcare. 2022;10(12):2493. DOI: 10.3390/healthcare10122493
  153. Yoon JS, Tang OY, Lawton MT. Volume-cost relationship in neurosurgery: Analysis of 12,129,029 admissions from the National Inpatient Sample. World Neurosurgery. 2019;129:e791-e802. DOI: 10.1016/j.wneu.2019.06.034
  154. Wolff J, Pauling J, Keck A, Baumbach J. The economic impact of artificial intelligence in health care: Systematic review. Journal of Medical Internet Research. 2020;22(2):e16866. DOI: 10.2196/16866
  155. Sanyal S. How much does artificial intelligence cost in 2021? Analytics Insight. 2021. p. 1. Available from: https://www.analyticsinsight.net/how-much-does-artificial-intelligence-cost-in-2021/ [Accessed: March 2, 2023]
  156. Luzniak K. Cost of AI in healthcare industry. Neoteric. 2021. pp. 1-11. Available from: https://neoteric.eu/blog/whats-the-cost-of-artificial-intelligence-in-healthcare/ [Accessed: March 2, 2023]
  157. Parikh RB, Helmchen LA. Paying for artificial intelligence in medicine. npj Digital Medicine. 2022;5(1):1-5. DOI: 10.1038/s41746-022-00609-6
  158. Abràmoff MD, Roehrenbeck C, Trujillo S, et al. A reimbursement framework for artificial intelligence in healthcare. npj Digital Medicine. 2022;5(1):1-6. DOI: 10.1038/s41746-022-00621-w
  159. Sallam M. ChatGPT utility in healthcare education, research, and practice: Systematic review on the promising perspectives and valid concerns. Healthcare. 2023;11(6):887. DOI: 10.3390/healthcare11060887
  160. Bubeck S, Chandrasekaran V, Eldan R, et al. Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv. 2023;5:1-155. DOI: 10.48550/arXiv.2303.12712
  161. Iqbal J, Jahangir K, Mashkoor Y, et al. The future of artificial intelligence in neurosurgery: A narrative review. Surgical Neurology International. 2022;13:536. DOI: 10.25259/SNI_877_2022
  162. Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Medicine. 2019;17(1):195. DOI: 10.1186/s12916-019-1426-2
  163. Kamulegeya LH, Okello M, Bwanika JM, et al. Using artificial intelligence on dermatology conditions in Uganda: A case for diversity in training data sets for machine learning. African Health Sciences. 2023;23(2):753-763. DOI: 10.1101/826057
  164. Larrazabal AJ, Nieto N, Peterson V, Milone DH, Ferrante E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proceedings of the National Academy of Sciences. 2020;117(23):12592-12594. DOI: 10.1073/pnas.1919012117
  165. Zemmar A, Lozano AM, Nelson BJ. The rise of robots in surgical environments during COVID-19. Nature Machine Intelligence. 2020;2(10):566-572. DOI: 10.1038/s42256-020-00238-2
  166. Awuah WA, Kalmanovich J, Mehta A, et al. Harnessing artificial intelligence to bridge the neurosurgery gap in low-income and middle-income countries. Postgraduate Medical Journal. 2023;99(1173):651-653. DOI: 10.1136/pmj-2022-141992
  167. Layard Horsfall H, Palmisciano P, Khan DZ, et al. Attitudes of the surgical team toward artificial intelligence in neurosurgery: International 2-stage cross-sectional survey. World Neurosurgery. 2021;146:e724-e730. DOI: 10.1016/j.wneu.2020.10.171
  168. Staartjes VE, Stumpo V, Kernbach JM, et al. Machine learning in neurosurgery: A global survey. Acta Neurochirurgica. 2020;162(12):3081-3091. DOI: 10.1007/s00701-020-04532-1
  169. Dennler C, Bauer DE, Scheibler AG, et al. Augmented reality in the operating room: A clinical feasibility study. BMC Musculoskeletal Disorders. 2021;22(1):451. DOI: 10.1186/s12891-021-04339-w
