Sensitivity and specificity of Logistic Regression and an Artificial Neural Network in the prediction of Kidney rejection in 10 training and validating datasets of kidney transplant recipients
Predicting clinical outcome following a specific treatment is a challenge that sees physicians and researchers alike sharing the dream of a crystal ball to read into the future. In Medicine, several tools have been developed for the prediction of outcomes following drug treatment and other medical interventions. The standard approach for a binary outcome is to use logistic regression (LR) [1,2] but over the past few years artificial neural networks (ANNs) have become an increasingly popular alternative to LR analysis for prognostic and diagnostic classification in clinical medicine . The growing interest in ANNs has mainly been triggered by their ability to mimic the learning processes of the human brain. The network operates in a feed-forward mode from the input layer through the hidden layers to the output layer. Exactly what interactions are modeled in the hidden layers is still under study. Each layer within the network is made up of computing nodes with remarkable data processing abilities. Each node is connected to other nodes of a previous layer through adaptable inter-neuron connection strengths known as synaptic weights. ANNs are trained for specific applications through a learning process and knowledge is usually retained as a set of connection weights . The backpropagation algorithm and its variants are learning algorithms that are widely used in neural networks. With backpropagation, the input data is repeatedly presented to the network. Each time, the output is compared to the desired output and an error is computed. The error is then fed back through the network and used to adjust the weights in such a way that with each iteration it gradually declines until the neural model produces the desired output.
ANNs have been successfully applied in the fields of mathematics, engineering, medicine, economics, meteorology, psychology, neurology, and many others. Indeed, in medicine, they offer a tantalizing alternative to multivariate analysis, although their role remains advisory since no convincing evidence of any real progress in clinical prognosis has yet been produced .
In the field of nephrology, there are very few reports on the use of ANNs [6-10], most of which describe their ability to individuate predictive factors of technique survival in peritoneal dialysis patients as well as their application to prescription and monitoring of hemodialysis therapy, analyis of factors influencing therapeutic efficacy in idiopathic membranous nephropathy, prediction of survival after radical cystectomy for invasive bladder carcinoma and individual risk for progression to end-stage renal failure in chronic nephropathies.
This all led up to the intriguing challenge of discovering whether ANNs were capable of predicting the outcome of kidney transplantation after analyzing a series of clinical and immunogenetic variables.
2. The complex setting of kidney transplantation
Predicting the outcome of kidney transplantation is important in optimizing transplantation parameters and modifying factors related to the recipient, donor and transplant procedure . The biggest obstacles to be overcome in organ transplantation are the risks of acute and chronic immunologic rejection, especially when they entail loss of graft function despite adjustment of immunosuppressive therapy. Acute renal allograft rejection requires a rapid increase in immunosuppression, but unfortunately, diagnosis in the early stages is often difficult . Blood tests may reveal an increase in serum creatinine but which cannot be considered a specific sign of acute rejection since there are several causes of impaired renal function that can lead to creatinine increase, including excessive levels of some immunosuppressive drugs. Also during ischemic damage, serum creatinine levels are elevated and so provide no indication of rejection. Alternative approaches to the diagnosis of rejection are fine needle aspiration and urine cytology, but the main approach remains histological assessment of needle biopsy. However, because the histological changes of acute rejection develop gradually, the diagnosis can be extremely difficult or late . Although allograft biopsy is considered the gold standard, pathologists working in centres where this approach is used early in the investigation of graft dysfunction, are often faced with a certain degree of uncertainty about the diagnosis. In the past, the Banff classification of renal transplant pathology provided a rational basis for grading of the severity of a variety of histological features, including acute rejection. Unfortunately, the reproducibility of this system has been questioned . What we need is a simple prognostic tool capable of analyzing the most relevant predictive variables of rejection in the setting of kidney transplantation.
3. The role of HLA-G in kidney transplantation outcome
Human Leukocyte Antigen G (HLA-G) represents a “non classic” HLA class I molecule, highly expressed in trophoblast cells.  HLA-G plays a key role in embryo implantation and pregnancy by contributing to maternal immune tolerance of the fetus and, more specifically, by protecting trophoblast cells from maternal natural killer (NK) cells through interaction with their inhibitory KIR receptors. It has also been shown that HLA-G expression by tumoral cells can contribute to an “escape” mechanism, inducing NK tolerance toward cancer cells in ovarian and breast carcinomas, melanoma, acute myeloid leukemia, acute lymphoblastic leukemia and B-cell chronic lymphocytic leukemia.  Additionally it would seem that HLA-G molecules have a role in graft tolerance following hematopoietic stem cell transplantation. These molecules exert their immunotolerogenic function towards the main effector cells involved in graft rejection through inhibition of NK and cytotoxic T lymphocyte (CTL)-mediated cytolysis and CD4+T-cell alloproliferation. 
HLA-G transcript generates 7 alternative messenger ribonucleic acids (mRNAs) that encode 4 membrane-bound (HLA-G1, G2, G3, G4) and 3 soluble protein isoforms (HLA-G5, G6, G7). Moreover, HLA-G allelic variants are characterized by a 14-basepair (bp) deletion-insertion polymorphism located in the 3’-untranslated region (3’UTR) of HLA-G. The presence of the 14-bp insertion is known to generate an additional splice whereby 92 bases are removed from the 3’UTR . HLA-G mRNAs having the 92-base deletion are more stable than the complete mRNA forms, and thus determine an increment in HLA-G expression. Therefore, the 14-bp polymorphism is involved in the mechanisms controlling post-transcriptional regulation of HLA-G molecules
A crucial role has been attributed to the ability of these molecules to preserve graft function from the insults caused by recipient alloreactive NK cells and cytotoxic T lymphocytes (CTL).  This is well supported by the numerous studies demonstrating that high HLA-G plasma concentrations in heart, liver or kidney transplant patients is associated with better graft survival [18-20].
4. Kydney transplantation outcome
In one cohort, a total of 64 patients (20,4%) lost graft function. The patients were divided into 2 groups according to the presence or absence of HLA-G alleles exhibiting the 14-bp insertion polymorphism. The first group included 210 patients (66.9%) with either HLA-G +14-bp/+14-bp or HLA-G −14/+14-bp whereas the second group included 104 homozygotes (33.1%) for the HLA-G −14-bp polymorphism. The patients had a median age of 49 years (range 18-77) and were prevalently males (66.6%). The donors had a median age of 47 years (range 15-75). Nearly all patients (94,9%) had been given a cadaver donor kidney transplant and for most of them (91.7%) it was their first transplant. The average (±SD) number of mismatches was 3 ± 1 antigens for HLA Class I and 1 ± 0.7 antigens for HLA Class II. Average ±SD cold ischemia time (CIT) was 16 ± 5.6 hours. The percentage of patients hyperimmunized against HLA Class I and II antigens (PRA > 50%) was higher in the group of homozygotes for the HLA-G 14-bp deletion. Pre-transplantation serum levels of interleukin-10 (IL-10) were lower in the group of homozygotes for the 14-bp deletion.
Kidney transplant outcome was evaluated by glomerular filtration rate (GFR), serum creatinine and graft function tests. At one year after transplantation, a stronger progressive decline of the estimated GFR, using the abbreviated Modification of Diet in Renal Disease (MDRD) study equation, was observed in the group of homozygotes for the HLA-G 14-bp deletion in comparison with the group of heterozygotes for the 14-bp insertion. This difference between the 2 groups became statistically significant at two years (5.3 ml/min/1.73 m2; P<0.01; 95% CI 1.2 -9.3) and continued to rise at 3 (10.4 ml/min/1.73m2; P<0.0001; 95% CI 6.4-14.3) and 6 years (11.4 ml/min/1.73m2; P<0.0001; 95% CI 7.7 – 15.1) after transplantation.
5. Logistic regression and neural network training
We compared the prognostic performance of ANNs versus LR for predicting rejection in a group of 353 patients who underwent kidney transplantation. The following clinical and immunogenetic parameters were considered: recipient gender, recipient age, donor gender, donor age, patient/donor compatibility: class I (HLA-A, -B) mismatch (0-4), class II (HLA-DRB1 mismatch; positivity for anti-HLA Class I antibodies >50%; positivity for anti-HLA Class II antibodies >50%; IL-10 pg/mL; first versus second transplant, antithymocyte globulin (ATG) induction therapy; type of immunosoppressive therapy (rapamycin, cyclosporine, corticosteroids, mycophenolate mophetyl, everolimus, tacrolimus); time of cold ischemia, recipients homozygous/heterozygous for the 14-bp insertion (+14-bp/+14-bp and +14-bp/−14-bp) and homozygous for the 14-bp deletion (−14-bp/−14-bp). Graft survival was calculated from the date of transplantation to the date of irreversible graft failure or graft loss or the date of the last follow up or death with a functioning graft.
ANNs have different architectures, which consequently require different types of algorithms. The multilayer perceptron is the most popular network architecture in use today (Figure 2). This type of network requires a desired output in order to learn. The network is trained with historical data so that it can produce the correct output when the output is unknown. Until the network is appropriately trained its responses will be random. Finding appropriate architecture needs trial and error method and this is where back-propagation steps in. Each single neuron is connected to the neurons of the previous layer through adaptable synaptic weights. By adjusting the strengths of these connections, ANNs can approximate a function that computes the proper output for a given input pattern. The training data set includes a number of cases, each containing values for a range of well-matched input and output variables. Once the input is propagated to the output neuron, this neuron compares its activation with the expected training output. The difference is treated as the error of the network which is then backpropagated through the layers, from the output to the input layer, and the weights of each layer are adjusted such that with each backpropagation cycle the network gets closer and closer to producing the desired output . We used the Neural Network ToolboxTM 6 of the software Matlab® 2008, version 7.6 (MathWorks, inc.) to develop a three layer feed forward neural network. . The input layer of 15 neurons was represented by the 15 previously listed clinical and immunogenetic parameters. These input data were then processed in the hidden layer (30 neurons). The output neuron predicted a number between 1 and 0 (goal), representing the event “Kidney rejection yes”  or “Kidney rejection no” (0), respectively. For the training procedure, we applied the ‘on-line back-propagation’ method on 10 data sets of 300 patients previously analyzed by LR. The 10 test phases utilized 63 patients randomly extracted from the entire cohort and not used in the training phase. Mean sensitivity (the ability of predicting rejection) and specificity (the ability of predicting no-rejection) of data sets were determined and compared to LR. (Table 1)
|Extraction_1||Test N=63||No||55||40 (73)||48 (87)|
|Yes||8||2 (25)||4 (50)|
|Extraction_2||Test N=63||No||55||38 (69)||48 (87)|
|Yes||8||3 (38)||4 (50)|
|Extraction_3||Test N=63||No||55||30 (55)||48(87)|
|Yes||8||3 (38)||5 (63)|
|Extraction_4||Test N=63||No||55||40 (73)||49 (89)|
|Yes||8||3 (38)||5 (63)|
|Extraction_5||Test N=63||No||7||40 (73)||46 (84)|
|Yes||8||4 (50)||6 (75)|
|Extraction_6||Test N=63||No||55||30 (55)||34 (62)|
|Yes||8||4 (50)||6 (75)|
|Extraction_7||Test N=63||No||55||40 (73)||47 (85)|
|Yes||8||3 (38)||5 (63)|
|Extraction_8||Test N=63||No||55||38 (69)||46 (84)|
|Yes||8||4 (50)||5 (63)|
|Extraction_9||Test N=63||No||55||44 (80)||51 (93)|
|Yes||8||2 (25)||4 (50)|
|Extraction_10||Test N=63||No||55||32 (58)||52 (95)|
|Yes||8||2 (25)||5 (63)|
|Specificity % (mean)||No Rejection||68%||85%|
|Sensitivity % (mean)||YES Rejection||38%||62%|
6. Results and perspectives
ANNs can be considered a useful supportive tool in the prediction of kidney rejection following transplantation. The decision to perform analyses in this particular clinical setting was motivated by the importance of optimizing transplantation parameters and modifying factors related to the recipient, donor and transplant procedure. Another motivation was the need for a simple prognostic tool capable of analyzing the relatively large number of immunogenetic and other variables that have been shown to influence the outcome of transplantation. When comparing the prognostic performance of LR to ANN, the ability of predicting kidney rejection (sensitivity) was 38% for LR versus 62% for ANN. The ability of predicting no-rejection (specificity) was 68% for LR compared to 85% of ANN.
The advantage of ANNs over LR can theoretically be explained by their ability to evaluate complex nonlinear relations among variables. By contrast, ANNs have been faulted for being unable to assess the relative importance of the single variables while LR determines a relative risk for each variable. In many ways, these two approaches are complementary and their combined use should considerably improve the clinical decision-making process and prognosis of kidney transplantation.
We wish to thank Anna Maria Koopmans (affiliations 1,3) for her precious assistance in preparing the manuscript