There are at least three perspectives from which one can think about using risk prediction and outcomes data in cardiothoracic surgery. The first is the perspective of the patient in need of a cardiothoracic surgical procedure and care, who wishes to know, “What are my chances?” An engaged patient and family are interested in the patient’s survival of the acute condition or event, but also in the near- and long-term outlook and quality of life beyond the recovery period.
The second is the perspective of the surgeon, wishing to know, “How am I doing?” This is the essence of the general competency in medical education known as practice-based learning and improvement, which entails reviewing and reflecting on one’s practice patterns and using this self-assessment to improve one’s practice, particularly by assessing against standards of care and specialty practice guidelines. Moreover, it can be argued that, because health care is a team sport rather than a solo act, this perspective should be expanded to reflect the whole interdisciplinary team asking, “How are we doing?”
Finally, the societal perspective—including regulators and payers—seeks information that answers whether or not we are delivering care that is safest and of highest quality for the cost and the resources invested.
The Institute of Medicine in To Err is Human articulates six key parameters by which quality of care is assessed and which are important to all three perspectives, with those parameters being for care that is safe, timely, effective, efficient, equitable and patient-centered . As decisions are made and care is delivered for each patient as an individual case, the practitioner makes what he or she believes to be the best decisions for care based on the individual patient characteristics and needs and the information available at the time. However, it is the decision-making and care delivery trends that are highly informative to all three perspectives described above.
This chapter focuses on the following areas of risk and outcome data use: 1) clinical cardiothoracic databases; 2) risk factors and risk prediction models; 3) application of risk prediction in clinical decision-making and care delivery; 4) process and outcomes data for reporting and improvement; and 5) use of database linkages to determine factors and decisions contributing to outcomes across the continuum of care.
2. Cardiothoracic databases
Surgeons have been interested in monitoring and improving outcomes since Ernest Amory Codman pioneered the study of medical outcomes. He kept “end result cards” on his patients, including long-term outcomes of at least one year, and wrote several papers on the “end result idea” in the early 1900s. The ability to speak knowledgeably about risk assessment and clinical outcomes for any clinical specialty requires data gathered on the patients treated by that specialty, whether based on diagnostic category, acute event or performance of a particular procedure or set of procedures. This is of even greater value when one can speak about one’s own patient population and performance data, using data that are complete and valid and analyzed in comparison to a large number of patients of comparable condition.
2.1. Administrative databases
Administrative data, such as the data from the Centers for Medicare and Medicaid Services database, are readily available, relatively inexpensive, and provide information on large numbers of patients. However, administrative data are usually collected by coding for billing purposes rather than collection for clinical studies and improvement, so that critical clinical context and variables are unavailable and the differentiation of comorbidities from complications can be problematic [3,4].
The University HealthSystem Consortium (UHC) is an alliance of 101 academic medical centers and nearly 200 of their affiliate hospitals, representing more than 90% of the US nonprofit academic medical centers. It has developed risk-adjusted mortality models that are based on discharge abstracts and include adjustments for differences in patient severity using the All Patient Refined Diagnosis Related Groups (APR-DRG). The UHC refers to its data and risk model as clinical (versus administrative), because the data are derived from coding of clinical conditions. However, it should be noted that the source of the data sent to UHC by hospitals and healthcare systems is the financial and administrative data from the system; the data are therefore dependent on coding and demographic information, and are drawn from the same data source used to report patient billing data to Medicare and other payers.
2.2. Clinical databases
There are numerous notable clinical registries and databases for cardiothoracic surgery patients. Local and regional databases are particularly noteworthy for their ability to draw data from physician groups and hospitals that are traditionally competitive relative to one another, but that agree to contribute data to build valid assessment of quality in cardiac care. Regional databases of note include those of the Northern New England Cardiovascular Disease Study Group (NNE) [6,7]. The New York State Department of Health has been building databases for reporting on adult cardiac surgery and percutaneous coronary interventions since the early 1990s, and subsequently added pediatric congenital cardiac surgery [8,9,10]. The Pennsylvania Health Care Cost Containment Council has databases that include hip and knee replacement, diabetes and healthcare-associated infections, besides cardiac care. Other states have followed with a statewide approach to quality, such as the State of New Jersey Department of Health and Senior Services Office of Health Care Quality Assessment, the Minnesota Cardiac Surgery Database, the Michigan Society of Thoracic and Cardiovascular Surgeons Quality Collaborative, and the Virginia Cardiac Surgery Quality Initiative.
The Veterans Affairs (VA) Cardiac Surgery Database [16,17] has been a leading force in national cardiac databases. The Department of Veterans Affairs has built on the experience in cardiac surgery to develop a database for major surgery outcomes and quality of care, as seen in the National VA Surgical Quality Improvement Program (NSQIP) database . NSQIP has proven so valuable as to now be applied to surgery programs across the country, well beyond the VA medical centers.
The predominant national adult cardiac database in the US, beyond the VA, is the Society of Thoracic Surgeons (STS) National Cardiac Database (NCD) [19,20,21]. Building on the experience with the adult cardiac surgery database, the STS now has three components to the STS National Database—Adult Cardiac, General Thoracic Surgery Database, and Congenital Heart Surgery, with availability of anesthesiology participation in addition to surgeon participation in the Congenital Heart Surgery Database. The STS Congenital Heart Surgery Database has worked collaboratively at the international level on a joint European Association for Cardiothoracic Surgery (EACTS)-STS Congenital Database Committee, particularly on standardization of definitions and naming conventions [22,23,24]. Moreover, the World Society for Pediatric and Congenital Heart Surgery has gathered surgeons from over 50 countries and from all continents except Antarctica to work on pediatric cardiology and pediatric cardiac surgery [25,26,27].
In addition to EACTS, other international registry endeavors in cardiothoracic surgery include the Society for Cardiothoracic Surgery in Great Britain and Ireland, the Australian and New Zealand Society of Cardiac and Thoracic Surgeons Database, the Japan Cardiovascular Surgery Database Organization, and the Chinese coronary artery bypass grafting (CABG) registry. While there are numerous country and international region registries, the STS also now provides the opportunity for international participants to contribute to the Adult Cardiac Surgery Database. The awareness of the need for process and outcome data for reporting and improvement has caught fire and driven endeavors around the world. Next, this chapter will explore how the data are analyzed and utilized.
3. Risk prediction—Assessing risk
When outcomes and performance results are first brought up, it seems to be an almost reflexive defense response to say, “My patients are sicker.” Thus, it is the risk-adjusted outcomes that essentially level the playing field, comparing like with like to as close a degree as possible. It is therefore of great importance to gather information that is both clinically relevant and comprehensive, to get as complete a picture as possible of the patient factors, over which the surgeon has little control, as well as of the aspects that can affect decision-making, care delivery and the outcomes of that care.
Further upstream in the continuum of care, a cardiothoracic surgeon depends on clinical judgment to assess the patient’s cardiac condition in the context of his/her general health, and then to determine whether there is technically something to offer the patient to improve that cardiac condition. Beyond the ability to do something for the patient, there then follows the consideration of whether surgical intervention is appropriate, given the patient’s condition, circumstances and preferences, and when. This is where there is great utility in having access to a reliable, easy-to-use risk prediction scoring system and tool. Granton and Cheng  have gathered information describing several of the predictive scoring systems available: the EuroSCORE [33,34], the STS score, the Parsonnet score [35,36], and regional models such as the Cleveland Clinic model and the NNE score. Additionally, the AusSCORE  has been developed and studied for Australian patients.
A core set of variables associated with outcomes in cardiothoracic surgery has evolved over time. The accuracy of risk models developed from administrative data in New York and Pennsylvania has been shown to be substantially improved by the addition of a few critical clinical variables: ejection fraction, reoperation, and left main coronary artery obstruction. One may further question how many variables are actually needed to have a robust risk prediction model. Studies from Ontario and the Cooperative CABG Database Project identified six and seven core variables, respectively. In the STS NCD, it has been demonstrated that 78% of the explained variance from the 28-variable model is derived from the eight most important predictors, which are age, surgical acuity, reoperative status, creatinine level, dialysis, shock, chronic lung disease, and ejection fraction.
While there are minor variations between the data collection forms for the various databases, the primary differences are derived from the correlation coefficients for the risk factors as calculated from the different patient populations. For purposes of description of data collection, the STS National Databases—Adult Cardiac Surgery , General Thoracic Surgery , and Congenital Heart Surgery —are used in the section that follows.
3.1. Risk factors
Preoperative risk factors. Preoperative risk factors that are collected relate primarily to the presenting features of the patient. These start with such factors as age, gender, race and ethnicity. Also collected are general resource factors such as referral pattern information via patient zip code and hospital zip code, and the payer type.
The next section after demographic information contains fields for admission, surgery and discharge dates, as well as hours of intensive care unit (ICU) stay. The general health factors are captured by fields for height, weight, smoking status, family history for heart disease, hematocrit, and white blood cell count. The comorbidities recorded include the presence or absence of the following: diabetes, dyslipidemia, renal failure requiring dialysis, hypertension, infectious endocarditis, chronic lung disease, immunosuppressive therapy, peripheral arterial disease, cerebrovascular disease and the form and timing of its effect. In addition, the current renal function is logged as the last creatinine, and for diabetics, the method of diabetes control.
The cardiac presentation section contains fields for the following: previous myocardial infarction and time interval, heart failure, NYHA classification, cardiac presentation on admission, cardiogenic shock, resuscitation, arrhythmia and type, and previous cardiovascular interventions. Preoperative medications are recorded as yes, no or contraindicated/not indicated. Preoperative hemodynamics and catheterization and echocardiographic information are also recorded.
The preoperative risk factors for general thoracic surgery include weight loss over prior three months, steroids, preoperative chemotherapy, pulmonary function tests and results, Zubrod (activity tolerance) score, and clinical staging for lung and esophageal cancer.
For the congenital database, additional information is required on date of birth, prematurity, non-cardiac anatomic abnormalities, chromosomal abnormalities and congenital syndromes. The resuscitative preoperative factors captured for the congenital heart surgery patient include cardiopulmonary resuscitation, mechanical circulatory support, and shock. Metabolic risk factors include diabetes mellitus, hypothyroidism, and steroid requirement. Gastrointestinal risk factors relate to presence of colostomy, enterostomy, gastrostomy or esophagostomy, as well as hepatic dysfunction or necrotizing enterocolitis. Neurological risk factors include neurological deficit, seizures, and stroke or intracranial hemorrhage. Respiratory risk factors include mechanical ventilation, respiratory syncytial virus, single lung, and tracheostomy. Additional factors captured include coagulation disorder, endocarditis, sepsis, and renal dysfunction.
For congenital heart surgery, there is an additional aspect of risk-adjustment that has been employed and tested. The Risk Adjustment for Congenital Heart Surgery-1 (RACHS-1) was developed by Dr. Kathy Jenkins and investigators from Children’s Hospital Boston . The RACHS-1 goal was to adjust for baseline differences in case-mix and risk when comparing mortality prior to discharge from the hospital among patients under 18 years of age undergoing surgery for congenital cardiac disease or defect. It is important to note that the RACHS-1 was not created to predict the risk of death for individual patients, but to be a tool that allows meaningful comparisons across groups of patients .
It is worth noting that the available databases do not account for nutritional state, especially for malnourished patients. This is an aspect of general health that significantly impacts wound healing ability and immune system capacity to fight infectious complications. Cardiac databases do not capture rehabilitation potential, as would be captured by a preoperative activity level and a realistic activity goal following recovery. For patients with a severely limited preoperative functional level, it would generally be unrealistic to expect a normal rate of separation from mechanical ventilation or length of stay. This also speaks to the level of postoperative recovery support required once the patient can be discharged from the hospital. Homelessness or severely limited social network support is also not captured by available databases. While these aspects of patient condition may not negatively impact the ability to heal and recover from a surgical procedure, they do affect the healthcare resources needed to achieve recovery, both in the hospital and following discharge.
Perioperative risk factors. All of the procedural databases include surgeon identification, diagnosis, primary and secondary procedures, operative start and end dates and times (including operating room entry and exit, anesthesia start and end, and skin incision start and stop), and antibiotic selection and timing for administration and discontinuation. Operation status (elective, urgent, emergent/salvage, or palliative) is recorded. Blood product administration is also captured.
Cardiopulmonary bypass utilization and associated features are recorded, including use of circulatory arrest, aortic occlusion, cardioplegia, and cerebral oximetry. Medication administration and use of intra-aortic balloon pump support are also captured. For coronary artery bypass surgery, the number of anastomoses is recorded for each type of conduit. Valve surgery procedures are captured, by valve, for type of procedure and prosthesis type utilized. Initial extubation date and time is captured to reflect the interval of intubation and ventilator support.
The general thoracic surgery perioperative data also include pathologic staging for cancer and ICU length of stay. The congenital heart surgery data also include capture of procedure location, temperature, cerebral perfusion and oximetry utilized. In addition, pediatric cardiac surgeons have found it important to address the issue of stratification of complexity in surgery for congenital cardiac diseases. The Aristotle Complexity Score was developed by an expert panel, and consists of assignment of an Aristotle Basic Complexity score to a given procedure based on potential for mortality, potential for morbidity, and technical difficulty .
Postoperative data collected. Postoperative transfusion and complications are collected for the databases. The complications are reflected by category: reoperation and reason, cardiac, neurologic, renal, pulmonary, infectious, vascular and other (gastrointestinal, urologic, hematologic). The discharge status, interval after surgery, and destination are key components, as well as the discharge medications. If readmission is necessitated, the reason is recorded. Key quality measures are captured for each of the databases.
The congenital heart surgery collection forms include sections for gathering key information for anesthesiology participation in the database—preoperative assessment, anesthetic technique, monitoring, intraoperative and postoperative pharmacology, and anesthesia adverse events. As there are a growing number of patients surviving their congenital heart defects to reach adulthood, the congenital heart surgery database also includes sections for adult cardiac surgery data components.
3.2. Risk models
Ivanov and colleagues studied the predictive accuracy of a statistical model against clinicians’ estimates of outcomes after coronary bypass surgery . They found that clinicians, when given the option, preferred to trust their own judgments. Experienced surgeons significantly overestimated the risk of operative mortality compared to their junior colleagues. Overall, the clinicians significantly overestimated the probability of operative mortality (for survivors to a greater degree than non-survivors) and ICU stay greater than 48 hours. Although no predictive model can predict the specific individual who will have an adverse event, statistical models permit reasonably accurate estimates of event rates for subgroups of patients .
The risk factors gathered by the database collection form for each patient are entered into the database, and aggregated for analysis and reporting. This can be accomplished by the individual surgeon or group with paper and pencil or an Excel spreadsheet. However, most clinicians are interested in risk-adjusted analysis of performance and outcomes to account for the particular characteristics of the patients and patient population to which care is rendered. Kozower and colleagues described their head-to-head comparison of the ability to predict risk-adjusted mortality by UHC and STS risk models . What they found is that although the UHC model demonstrated better performance in the total study population, the difference was achieved by reflecting postoperative complications, and therefore the predictive discrimination was equivalent to random chance. Thus, it is critical to use a well-constructed and validated clinical risk model that reflects the patient comorbidities and acuity, with high correlation to the endpoints of interest. The next section describes the processes by which a robust clinical risk model is developed and validated.
3.3. Development of a clinical risk model
The study population is key to the quality and performance of the statistical model, and no risk adjustment model is better than the data on which it is based . The population must be adequately defined, and exclusion applied for key missing data elements. The population is then randomly divided into two samples. The first sample of 60% of the population, known as the training sample, is for development of the risk model—used to identify predictor variables and estimate model coefficients [46,47,48]. Data from the other 40%, known as the test sample, is for the validation of the model, to assess model fit, discrimination and calibration.
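The exclusion and random 60/40 division described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names, the `REQUIRED_FIELDS` tuple and both helper functions are hypothetical and do not come from any registry's actual software.

```python
import random

def split_population(records, train_fraction=0.6, seed=0):
    """Randomly divide records into a training sample and a test sample."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def is_complete(record, required_fields):
    """True when none of the key data elements is missing."""
    return all(record.get(field) is not None for field in required_fields)

# Hypothetical records: every tenth patient lacks an ejection fraction.
REQUIRED_FIELDS = ("age", "ejection_fraction")
population = [
    {"id": i, "age": 50 + i % 30,
     "ejection_fraction": None if i % 10 == 0 else 55}
    for i in range(100)
]

# Apply the exclusion first, then divide the remaining population 60/40.
eligible = [r for r in population if is_complete(r, REQUIRED_FIELDS)]
training_sample, test_sample = split_population(eligible)
print(len(eligible), len(training_sample), len(test_sample))  # 90 54 36
```

Fixing the seed is a design choice that makes the division reproducible, which matters when the same training and test samples must be reused for every endpoint.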
For the adult cardiac surgery database, the endpoints for which risk models are developed are:
operative mortality: death during the same hospitalization as surgery regardless of timing, or within 30 days of surgery regardless of venue or site;
permanent stroke or cerebrovascular accident: a central neurologic deficit persisting longer than 72 hours;
renal failure: an increase of the serum creatinine to more than 2.0 mg/dL and double the most recent preoperative creatinine level, or a new requirement for dialysis;
prolonged requirement of mechanical ventilation support (longer than 24 hours);
deep sternal wound infection: in the most recent database versions, recorded for as long as 30 days postoperatively;
reoperation for any reason;
major morbidity or mortality: a composite defined as the occurrence of any of the above endpoints;
prolonged postoperative length of stay: length of stay more than 14 days; and
short postoperative length of stay: length of stay less than six days and patient alive at discharge.
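Because the endpoint definitions above are rule-based, they can be expressed directly as code. The following is a minimal sketch, assuming a hypothetical `PostoperativeRecord` layout with invented field names. It reads "double the preoperative creatinine" as "at least twice the preoperative value" and treats "alive at discharge" as "did not die in hospital"; both are interpretations rather than stated rules.

```python
from dataclasses import dataclass

@dataclass
class PostoperativeRecord:
    """Hypothetical per-patient outcome fields (names are illustrative)."""
    died_in_hospital: bool = False
    died_within_30_days: bool = False
    permanent_stroke: bool = False
    preop_creatinine: float = 1.0    # mg/dL, most recent preoperative value
    postop_creatinine: float = 1.0   # mg/dL
    new_dialysis: bool = False
    ventilation_hours: float = 0.0
    deep_sternal_wound_infection: bool = False
    reoperation: bool = False
    postop_los_days: int = 0

def operative_mortality(r):
    return r.died_in_hospital or r.died_within_30_days

def renal_failure(r):
    # Creatinine above 2.0 mg/dL AND at least double the preoperative level,
    # OR a new requirement for dialysis.
    rise = (r.postop_creatinine > 2.0
            and r.postop_creatinine >= 2 * r.preop_creatinine)
    return rise or r.new_dialysis

def prolonged_ventilation(r):
    return r.ventilation_hours > 24

def major_morbidity_or_mortality(r):
    # Composite: any one of the individual endpoints.
    return any([
        operative_mortality(r), r.permanent_stroke, renal_failure(r),
        prolonged_ventilation(r), r.deep_sternal_wound_infection,
        r.reoperation,
    ])

def prolonged_los(r):
    return r.postop_los_days > 14

def short_los(r):
    return r.postop_los_days < 6 and not r.died_in_hospital

patient = PostoperativeRecord(preop_creatinine=1.2, postop_creatinine=2.5,
                              postop_los_days=9)
print(renal_failure(patient), major_morbidity_or_mortality(patient))  # True True
```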
The major endpoints for the general thoracic surgery database are selected using similar principles, but with appropriate differences of interest to general thoracic surgeons and their patients. Adverse outcome measure selection is based on clinical judgment, literature review and preliminary data analysis. The postoperative endpoints selected for lung surgery are: mortality (in-hospital mortality regardless of timing or within 30 days of the procedure), tracheostomy, reintubation, initial ventilator support greater than 48 hours, adult respiratory distress syndrome, bronchopleural fistula, pulmonary embolus, pneumonia, bleeding requiring reoperation, and myocardial infarction [49,50]. The selected major outcomes for esophageal surgery consist of the following conditions: bleeding requiring reoperation, anastomotic leak requiring medical or surgical treatment, reintubation, initial ventilation greater than 48 hours, pneumonia, or in-hospital mortality (same hospitalization) regardless of timing .
The STS Congenital Database Taskforce and the Joint EACTS-STS Congenital Database Committee have critically reviewed and defined endpoints as well. Operative mortality is again defined as any death, regardless of cause, occurring within 30 days after surgery in or out of the hospital and after 30 days during the same hospitalization subsequent to the operation . Because of the extensive spectrum of congenital heart defects and the wide variety of procedures for same, the complications of interest for the congenital databases are also more extensive and detailed, with significant attention to definitions [23,43].
All candidate variables are considered, and screened for relevance to prediction at population and at individual levels. It is of great value to have expert clinician review as well, to assure clinical relevance of the variables to be included. The validity—on its face, as well as of construct and content—is key to the value of the risk model, and so it needs to make sense to the clinician users.
The definitions for predictor variables and for endpoints must be strictly standardized. Even for an unambiguous endpoint like mortality, the time period and location become important components of the definition. There are important statistical and policy implications of using in-hospital mortality (without time limit) versus 30-day all-cause mortality (without consideration of where it occurs) versus operative mortality as either of the two. The fixed time period is statistically preferable, although more difficult to obtain with completeness and accuracy than in-hospital mortality .
The risk model development team has to assess the database for several aspects of variable reporting to determine inclusion or exclusion. The first for consideration is the frequency of missing data for each variable. A second consideration is identification of variables that are collected inconsistently or with questionable reliability, even for clinically unavoidable reasons. Use of derived variables (e.g., body mass index) or redundant variables, such as glomerular filtration rate which is a complex function of variables that are consistently and regularly included, should be assessed for appropriateness of inclusion in the model. Finally, there is consideration of whether to include potentially controversial variables, especially those that raise clinical, statistical or health policy issues. Examples of such variables include race and ethnicity, and preoperative intra-aortic balloon pump. As a confirmatory check, it is helpful to review potential candidate variables against external resources, such as previous versions of a risk model, and other comparable risk models from other groups or organizations.
As described by both Clark and by Shahian and colleagues, there are three principal techniques that have been utilized for construction of cardiac surgery risk models [40,52]. Bayesian models are useful in early database experience as they are robust in the face of missing data. Logistic regression models are the most common statistical technique for risk modeling—utilized by regional and national databases such as those in New York, the VA, and the NNE. Multivariable logistic regression is utilized for the STS national databases. Some use simple, additive scores with weights derived from the logistic regression model. Shahian and colleagues found that comparative studies have generally demonstrated that logistic models offer the best overall performance [40,53,54]. There is interest in the potential advance offered by application of algorithmic models, which are also known as machine-learning techniques, as these models permit complex, nonlinear information processing. However, tests of these models have not yet shown significant improvement over logistic or Bayesian models [40,55,56].
3.4. Risk model validation
Once the multivariable logistic regression is applied, the test sample of the defined population is used to test model performance and to validate the new model against its performance for the development sample and, if the model is an update, against the old model. The C-index, a measure of discrimination, is assessed for the training (development) sample and the test (validation) sample, looking for close agreement between the two samples for each endpoint. Calibration can be assessed by plots of observed versus expected event proportions within deciles of predicted risk for the various endpoints, such as described by Shahian and colleagues [46,47,48].
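For a binary endpoint, the C-index is the proportion of (event, non-event) patient pairs in which the patient who had the event was assigned the higher predicted risk. The sketch below computes it by direct pair counting, with ties counted as one half. The function name and the toy risks are illustrative assumptions, and this quadratic-time version is meant for understanding rather than for registry-scale data.

```python
def c_index(predicted, observed):
    """Concordance index for a binary endpoint (1 = event, 0 = no event)."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0      # model ranked the pair correctly
            elif risk_event == risk_non_event:
                concordant += 0.5      # tie: half credit
    return concordant / (len(events) * len(non_events))

# Toy predicted risks and outcomes (invented for illustration).
risks = [0.02, 0.05, 0.10, 0.30, 0.40]
outcomes = [0, 0, 1, 0, 1]
print(round(c_index(risks, outcomes), 3))  # 0.833
```

A value of 0.5 corresponds to chance and 1.0 to perfect ranking. Computing the index separately on the training and test samples and comparing the two values is the agreement check described in the text.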
For each of the STS risk models (CABG surgery, valve surgery, and combined CABG and valve surgery), there have been multiple iterative refinements and updates. Raw STS risk scores must be calibrated before use. Calibration is performed on an annual basis in quarterly increments, so that the calibration factors are dynamic, updated quarterly after each data harvest. Jin and colleagues have reported that if the risk models are used without calibration, the risk scores are almost always higher than they should be, overstating risk and understating the observed-to-expected ratio.
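The arithmetic behind this effect can be illustrated with a small sketch. It assumes, for illustration only, that a calibration factor is the ratio of observed to raw-expected events in a reference population and that raw scores are multiplied by it. The function names and all numbers are invented and are not the STS procedure or STS data.

```python
def calibration_factor(reference_observed_events, reference_raw_risks):
    """Observed events divided by the sum of raw predicted risks (assumed definition)."""
    return reference_observed_events / sum(reference_raw_risks)

def observed_to_expected(observed_events, risks):
    """O/E ratio: observed events divided by the sum of predicted risks."""
    return observed_events / sum(risks)

# Invented reference population: the raw model expects 30 deaths, 24 occur.
reference_raw = [0.03] * 1000
factor = calibration_factor(24, reference_raw)     # about 0.8

# Invented program: 200 patients, 6 observed deaths.
program_raw = [0.03] * 200
program_calibrated = [risk * factor for risk in program_raw]

oe_raw = observed_to_expected(6, program_raw)                 # about 1.00
oe_calibrated = observed_to_expected(6, program_calibrated)   # about 1.25
print(round(factor, 2), round(oe_raw, 2), round(oe_calibrated, 2))
```

With the inflated raw scores the program appears to perform exactly as expected, whereas after calibration its observed mortality is about 25% above expectation. This is the sense in which uncalibrated scores understate the observed-to-expected ratio.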
Numerous types of validity can be used to scrutinize a statistical risk model. One is face validity, where the model is reasonable to experts. A second is content validity, where all important variables have been included. A third type is attributional validity, in which risk adjustment is adequate to ensure that differences in outcome are not due to patient characteristics. Finally, there is predictive validity, which provides a measure of how well the model performs on a data set other than the one from which it was developed, whether internal or external. There are two tests applied to test predictive validity. The first, calibration, assesses reliability, or the extent to which the model assigns appropriate risk to the population under consideration; the most common measure is the degree of concordance between deciles of observed and expected risk, or the Hosmer-Lemeshow test. By extension, and as is demonstrated by Tran and colleagues and Zhang and colleagues, there is also naturally great interest in comparing the risk models against each other, including extension into intermediate timeframes (e.g., one-year survival). The second test, discrimination, assesses the tradeoff between specificity and sensitivity of the risk model at various probability cut points [40,60].
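The decile-based calibration check can be sketched as follows: patients are sorted by predicted risk, divided into equal-sized groups, and the observed and expected event counts are compared within each group. The Hosmer-Lemeshow statistic sums the squared differences scaled by the binomial variance. The function names and toy data are assumptions for illustration. The p-value, conventionally taken from a chi-square distribution with (groups minus 2) degrees of freedom, is omitted to keep the sketch free of third-party libraries.

```python
def risk_group_table(predicted, observed, groups=10):
    """Rows of (group size, observed events, expected events), ordered by risk."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    table = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        events = sum(y for _, y in chunk)
        table.append((len(chunk), events, expected))
    return table

def hosmer_lemeshow(table):
    """Sum over groups of (O - E)^2 / (n * mean_risk * (1 - mean_risk)).

    Assumes every group's mean predicted risk lies strictly between 0 and 1.
    """
    statistic = 0.0
    for size, events, expected in table:
        mean_risk = expected / size
        statistic += (events - expected) ** 2 / (size * mean_risk * (1 - mean_risk))
    return statistic

# Toy data with two risk groups (invented): 10% and 50% predicted risk.
predicted = [0.1] * 50 + [0.5] * 50
well_calibrated = [1] * 5 + [0] * 45 + [1] * 25 + [0] * 25
underpredicted = [1] * 10 + [0] * 40 + [1] * 25 + [0] * 25

good = hosmer_lemeshow(risk_group_table(predicted, well_calibrated, groups=2))
poor = hosmer_lemeshow(risk_group_table(predicted, underpredicted, groups=2))
print(round(good, 3), round(poor, 3))
```

A statistic near zero indicates that observed and expected counts agree within each risk group; larger values indicate poorer calibration.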
Following the calculation of the model performance measures, the final regression coefficients can then be estimated from the combined training (development) and test (validation) samples. The algorithm, intercept and coefficients can then be deployed for the risk model, for application for risk prediction for individuals and for population analysis.
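Deploying a logistic risk model amounts to evaluating the logistic function of the intercept plus the coefficient-weighted sum of the patient's risk factors. The sketch below shows that calculation only. The intercept, the coefficient values and the variable names are made up for illustration and are not the published coefficients of the STS model or of any other model.

```python
import math

def predicted_risk(intercept, coefficients, patient):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        beta * patient[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical intercept and coefficients (NOT from any real risk model).
INTERCEPT = -5.0
COEFFICIENTS = {
    "decades_over_50": 0.4,   # age, in decades beyond 50 years
    "reoperation": 0.9,       # 1 if prior cardiac operation, else 0
    "dialysis": 1.1,          # 1 if on dialysis, else 0
}

patient = {"decades_over_50": 2, "reoperation": 1, "dialysis": 0}
risk = predicted_risk(INTERCEPT, COEFFICIENTS, patient)
print(f"predicted risk: {risk:.1%}")
```

The same function serves both uses named in the text: applied to one patient it gives an individual estimate, and summed over a population it gives the expected number of events for comparison with the observed count.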
4. Application of risk prediction in decision-making and care
4.1. Application at the bedside
Risk prediction tools that are easily accessible and user-friendly are the most valuable for application in the clinic or at the bedside. Tools providing calculations as the data are entered contribute to clinical decision-making in real-time. The STS provides an on-line risk calculator for just such a purpose . It is at the bedside or in the clinic where the counseling is being provided, and the patient’s and family’s questions and concerns for risk versus benefit and chances for recovery versus adverse outcomes are being actively considered. Data-driven decision-making is extremely helpful, especially as applied to consideration of the patient preferences and expectations for the care plan and procedure being recommended.
When the patient is high-risk for surgery, the data help convey the statistical chances for mortality and adverse outcomes to provide realistic expectations and balance to the hope for a good outcome that can potentially be held out of proportion to the reality of the situation. Sometimes this may even take the form of supporting data-driven decision-making to recommend against a surgical procedure. A surgeon’s knowing how to do a procedure does not automatically obligate the surgeon to operate, nor does it make a procedure right for every patient. In other words, it is important to consider when it may be appropriate to say ‘no,’ but doing so in an objective manner, based on the best available data and evidence.
When the patient condition and/or the cardiac anatomy is complex, and the recommendations by cardiologist and cardiac surgeon are not clearly guided by the evidence in the literature, data provided by the risk prediction tool can help provide input for shared decision-making by clinicians and patient. This is particularly enhanced by data-driven risk prediction for medical and surgical therapy, and for proceduralists from both medical and surgical subspecialties.
4.2. Individual performance trends
The clinician needs an accurate and reliable data source to answer the question, “How am I doing?” Practice guidelines, building on the evidence in the literature, provide recommendations for the standards of care on which proficient and expert clinicians have built consensus. The guidelines, however, do not provide information on the individual practitioner’s application of and adherence to the guidelines. It should be noted that following guidelines does not absolve the surgeon of applying good judgment for extenuating patient needs and circumstances. The guidelines and standards should be generally applied, however, and the exceptions and variance should occur rarely. Risk models help to assess the trend in decision-making and practice patterns. This is an important opportunity to allow the data to tell the story about actual practice patterns, as opposed to the good intentions to follow guidelines and apply standard of care. One’s perceptions of practice patterns are not always borne out by the data; thus, it is important, even imperative, to regularly evaluate oneself for practice-based learning and improvement.
Individual provider decisions on patient selection for operation are important to assess against appropriateness guidelines. Variance from recommended indications for operation should only be considered as a part of a research study, such as would be employed to assess procedure and timing of surgery for lung cancer.
4.3. Group or hospital performance trends
The trends in patient selection and provider practice patterns are also assessed for the group practice or hospital clinical service—the site of practice—by aggregating the individual provider data within the group or hospital practice. Within a group of providers there may be varying degrees of experience, and providers may be at different points along the lifelong learning curve of evolution from competence to proficiency to expertise and mastery. However, the clinical pathways by which the care is delivered offer an opportunity for setting the expected logistics for delivery of care in the perioperative cardiothoracic surgery patient. The risk-adjusted data provided as feedback to the group help the group to assess the impact of the processes and decision-making strategies commonly employed.
Although hospital and/or provider case volume has long been used as a proxy measure for quality, it is important to note that data show only a modest association of hospital procedural volume with CABG outcomes. Thus, volume may not be an adequate quality metric for CABG surgery. Database participation allows for demonstration of where low-volume providers and hospitals achieve high-quality outcomes, and where high-volume providers and hospitals do not.
Assessing patient referral and selection trends for the group or hospital against appropriateness guidelines further allows objective, data-driven feedback. A new cardiac or thoracic surgery program may be perceived to get good outcomes because it is too conservative in patient selection for surgery, “cherry-picking” cases to enhance outcomes. Alternatively, a new cardiac or thoracic surgery program may wish to build volume and be or become too liberal in its patient selection for surgery. Database feedback, grounded in the data and the appropriateness guidelines, provides a valuable reference against which to weigh the perceptions held among providers and the general community.
5. Performance data for reporting and improvement
5.1. Performance data types
There are two primary types of performance data—process measures and outcome measures. A process measure reflects how an aspect of care is delivered. An outcome measure reflects the impact of care on the patient—or how the patient does as a result of the processes of care.
5.2. Process measures
Process measures may be captured as timed intervals, as exemplified by the time interval between myocardial infarction and operation, time on cardiopulmonary bypass, cross-clamp time, time interval for mechanical ventilation (or time to extubation), length of stay in the ICU and postoperative length of stay. Other process measures are reflected as counts or measured values, such as number of blood products transfused and lowest hematocrit on bypass. A third type of process measure is reported as a yes/no response, as in the reporting of use of the internal mammary (or thoracic) artery (IMA) as bypass conduit, use of cardiopulmonary bypass (versus off-pump), or use of recommended medications at discharge, specifically aspirin, beta blockade, and cholesterol-lowering statin therapy.
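As a rough illustration of the three forms just described, the sketch below shows one way a per-case record might hold timed-interval, count or measured-value, and yes/no process measures. The class name, field names, and sample values are hypothetical and are not taken from any registry's data specification; measuring ventilation time from the end of surgery is likewise an assumption made only for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProcessMeasures:
    """Hypothetical per-case record; all field names are illustrative only."""

    # Timestamps from which a timed interval is derived
    surgery_end: datetime
    extubation: datetime
    # Count and measured value
    blood_products_transfused: int
    lowest_hematocrit_on_bypass: float  # percent
    # Yes/no measures
    ima_used: bool
    on_pump: bool
    discharge_aspirin: bool

    def ventilation_hours(self) -> float:
        """Timed interval: hours from end of surgery (assumed start) to extubation."""
        elapsed = self.extubation - self.surgery_end
        return elapsed.total_seconds() / 3600.0


# Made-up example case
case = ProcessMeasures(
    surgery_end=datetime(2024, 1, 10, 14, 0),
    extubation=datetime(2024, 1, 10, 20, 30),
    blood_products_transfused=2,
    lowest_hematocrit_on_bypass=24.0,
    ima_used=True,
    on_pump=True,
    discharge_aspirin=True,
)
print(case.ventilation_hours())
```

Storing the raw timestamps rather than a precomputed duration keeps the interval definition explicit, which matters when sites might otherwise measure the same interval from different starting points.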
Preoperative process measures include time from presentation with myocardial infarction to operation, use of intra-aortic balloon pump, preoperative creatinine, and administration of preoperative beta blockade. In general thoracic, preoperative process measures include administration of induction chemotherapy, and urgent versus elective procedure status. Intraoperative process measures in cardiac surgery include use of cardiopulmonary bypass versus off-pump surgery, use of the left IMA as a conduit for bypass, and perioperative transfusion. General thoracic intraoperative process measures include thoracotomy versus video-assisted thoracoscopic surgery, and procedure selection (extent of resection versus wedge resection). These measures are important in the assessment of patient selection and management prior to operation, as related to risk prediction for postoperative outcomes.
Postoperative process measures for cardiac surgery include administration of blood transfusion and discharge medication regimen (antiplatelet medication such as aspirin, beta blockade, and statin). Hospital postoperative length of stay, discharge destination and readmission are of increasing importance in assessing care coordination for optimal outcomes.
5.3. Outcome measures
As noted above in relation to development of risk prediction models, it is key to identify endpoints of interest for the database users, providers and patients. Obviously, it is impossible for a database to capture every conceivable outcome, but there is consensus on major adverse events for which the database or registry can provide a robust and valuable resource.
The most familiar outcome measure, of course, is mortality rate. Mortality can be reported as a raw mortality rate—a count of deaths per the patient or case denominator, or as a comparison ratio of the observed mortality rate over the expected or predicted mortality rate. Furthermore, in the section above on risk model endpoints, the various definitions for mortality consideration were reviewed. But why is there so much detail to be worked out around such an unambiguous endpoint of mortality (living versus dead is usually thought of as an easy distinction to determine), and what does it mean to the provider and/or organization? Is the surgery any more successful if the patient gets out of the hospital alive, but dies at home? What is the impact to the patient versus the provider and/or hospital if the patient dies on the 31st day after surgery instead of sooner? And why is it important to capture all-cause mortality versus cardiac or thoracic surgery-specific causes? As an example, all-cause mortality means that death from a bowel obstruction during the same hospitalization as the cardiothoracic procedure would count as in-hospital or operative mortality for the cardiac surgeon. Questions like these are what engage the concerted efforts of the taskforces and expert panels that work to build consensus on the definitions for their databases and the performance reporting from those databases.
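The two ways of reporting mortality described above can be sketched in a few lines. This is a minimal illustration, assuming that the expected number of deaths is the sum of per-patient predicted risks from some risk model. The function names and the sample data are hypothetical and do not represent any registry's reporting method.

```python
from typing import Sequence


def raw_mortality_rate(deaths: int, cases: int) -> float:
    """Raw mortality rate: count of deaths over the case denominator."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


def observed_to_expected(died: Sequence[int], predicted_risk: Sequence[float]) -> float:
    """O/E ratio: observed deaths over the sum of predicted death probabilities."""
    if len(died) != len(predicted_risk) or len(died) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    expected = sum(predicted_risk)
    if expected == 0:
        raise ValueError("expected deaths must be greater than zero")
    return sum(died) / expected


# Made-up data for ten cases: 1 = died, 0 = survived
died = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
predicted_risk = [0.02, 0.05, 0.30, 0.01, 0.04, 0.03, 0.10, 0.40, 0.02, 0.03]

print(raw_mortality_rate(sum(died), len(died)))
print(observed_to_expected(died, predicted_risk))
```

A ratio above 1 indicates more deaths than the model predicted for that case mix, and a ratio below 1 indicates fewer. The raw rate alone cannot make that distinction, because it ignores how sick the patients were.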
Major morbidities, or non-fatal adverse outcomes, commonly reported in cardiac surgery are represented by rates for unplanned reoperation (usually for postoperative hemorrhage), prolonged ventilator support requirement (>24 hours), cerebrovascular accident or new neurologic deficit, new renal insufficiency or renal failure, deep sternal wound infection, and prolonged length of stay (>7 days). Readmission within 30 days is of growing importance as providers and payers assess quality across the continuum of care, and not just around the in-hospital procedural event.
Major morbidity in general thoracic surgery includes pneumonia, adult respiratory distress syndrome, empyema, sepsis, bronchopleural fistula, pulmonary embolism, ventilator support beyond 48 hours, reintubation, tracheostomy, atrial or ventricular arrhythmias requiring treatment, myocardial infarction, reoperation for bleeding, and central neurologic event.
There is one additional way of thinking about outcomes. This is as a standardized incidence ratio for the composite outcome of any adverse outcome—any mortality or major morbidity. This allows for discussion and counseling that takes into consideration the collective impact of multiple risk factors and comorbidities on successful, uncomplicated recovery and return to quality of life.
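One way to picture the composite is as a per-patient flag that is true if any listed event occurred, so a patient with several complications is counted once. The sketch below assumes that a model supplies each patient's predicted probability of the composite. The event names, function names, and data are hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical event names for illustration; not a registry's variable list
MAJOR_EVENTS = (
    "death",
    "reoperation",
    "stroke",
    "renal_failure",
    "deep_sternal_infection",
    "prolonged_ventilation",
)


def any_adverse_outcome(case: Dict[str, bool]) -> bool:
    """True if the patient had at least one of the listed events."""
    return any(case.get(event, False) for event in MAJOR_EVENTS)


def standardized_incidence_ratio(
    cases: Sequence[Dict[str, bool]], predicted_composite_risk: Sequence[float]
) -> float:
    """Observed patients with any event over the sum of predicted composite risks."""
    if len(cases) != len(predicted_composite_risk) or len(cases) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    expected = sum(predicted_composite_risk)
    if expected == 0:
        raise ValueError("expected count must be greater than zero")
    observed = sum(any_adverse_outcome(case) for case in cases)
    return observed / expected


# Made-up data: the first patient has two events but counts once
cases = [
    {"stroke": True, "prolonged_ventilation": True},
    {"death": True},
    {},
    {},
]
predicted = [0.5, 0.25, 0.125, 0.125]
print(standardized_incidence_ratio(cases, predicted))
```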
5.4. Application of performance measures for reporting quality
Those databases with mandatory participation, such as the New York and Pennsylvania databases and the VA database described previously, provide regular reports, usually on an annual basis. Voluntary databases, such as NNE and STS, also provide regular reports to participants, although at more frequent intervals.
One of the uses of the regular reporting is to provide the individual and group with their own performance results. A second is to place those results in the context of national normative data. This allows the individual and group to assess their practice patterns against not only the guidelines and evidence in the literature, but against other practicing physicians in the specialty as a whole.
Process prevalence and variability can be studied and reported to advance the practice of cardiothoracic surgery. One example is provided by Tabata and colleagues regarding the use of the IMA graft in multivessel CABG surgery. Since use of the IMA graft has been repeatedly shown to be associated with significantly improved short-term and long-term survival in CABG, it is encouraging that the frequency of IMA use in CABG surgery is increasing. However, Tabata’s study shows that many patients do not receive the benefits of IMA grafts, and some hospitals have a very low IMA use rate, which offers a significant opportunity for continued improvement.
The STS database, as with other databases such as the NNE, has been the source of data for numerous studies on risk factors and their association with outcomes. Studies have examined the relationship to outcomes of gender, race, obesity, diabetes, age, renal function, off-pump CABG, and emergency CABG, as examples. Risk profiles and outcomes can be studied by procedure, as has been done with CABG trends over time by Ferguson and colleagues and by ElBardissi and colleagues, and with 15-year valve surgery outcome trends by Lee and colleagues. Risk factors and outcomes, especially mortality, can be studied by region, in an effort to identify regional best practices and to spread improvements in cardiac surgical outcomes.
In addition to providing reports that allow individuals and groups to compare with their peers, regionally and nationally, cardiothoracic surgery databases can also be compared with each other, to compare patient populations and assess the quality of their reports relative to one another. One example of such a study has been provided by Grover and colleagues, comparing the VA and STS cardiac surgery databases. Grover’s findings were that, in spite of the major difference in the male proportion of patients between the two databases, risk factors are otherwise very similar. Moreover, both databases have shown a significant reduction in the risk-adjusted operative mortality rate over a decade of providing risk-adjusted performance data to their participants, with observed-to-expected death ratios decreasing from 1.05 to 0.9 in the VA system, and from 1.5 to 0.9 for the STS participants. This supports the conclusion that the availability of data reports for practice-based learning and improvement improves care and outcomes for patients.
As previously noted, procedure volume has long been used as a proxy for quality outcomes. Database reports on outcomes have prompted studies of procedure volume in comparison with clinical quality measures: mortality, morbidity, and processes of care. For a voluntary database, critics have expressed concern that not having full participation can potentially skew the data, and thus the outcomes. In this case, comparing outcomes against those from a larger database with higher volume (e.g., Medicare) can help to confirm the accuracy of the data reported.
The quality measurements in adult cardiac surgery have been applied to develop a methodology for comprehensive assessment of adult cardiac surgery quality of care, including both individual measures and an overall composite quality score. A Cardiac Surgery Performance Measures Steering Committee and the associated Technical Advisory Panel convened by the National Quality Forum (NQF) selected a set of 21 structure, process, and outcomes measures to assess quality of cardiac surgery care in the US. Incorporating the trend in health care quality assessment toward the use of bundled measures and “all-or-none” scoring, the STS Quality Management Taskforce (QMTF) chose eleven individual quality measures grouped within four domains that included all relevant CABG process and outcomes measures endorsed by the NQF. The four domains of STS quality measures are 1) perioperative medical care, comprising preoperative beta blockade, discharge aspirin, discharge beta blockade, and discharge antilipid therapy; 2) operative care, defined as the use of at least one IMA; 3) risk-adjusted operative mortality; and 4) absence of postoperative morbidity, specifically renal insufficiency, deep sternal wound infection, re-exploration for any cause, stroke, and prolonged ventilation/intubation. The STS QMTF has developed and tested a composite measure of cardiac surgery quality that encompasses multiple domains of care, uses Bayesian random-effects analyses, uses all-or-none scoring where appropriate, and avoids subjective weighting of individual measures, to provide validated quality measures useful to various types of users. With the STS composite quality score in use, Shahian and colleagues looked at the association of hospital CABG volume with the STS composite quality score, and found that only 1% of composite score variation was explained by volume.
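To make the idea of “all-or-none” scoring concrete, the sketch below scores the four perioperative medication measures as a bundle: a patient counts as a success only if every measure in the bundle was met. This is a simplified illustration, not the STS methodology, which also involves Bayesian random-effects modeling. The field names and data are hypothetical, and patient eligibility and contraindications are ignored here.

```python
from typing import Dict, Sequence

# Hypothetical field names for the four medication measures named in the text
MEDICATION_MEASURES = (
    "preop_beta_blocker",
    "discharge_aspirin",
    "discharge_beta_blocker",
    "discharge_antilipid",
)


def all_or_none(patient: Dict[str, bool]) -> bool:
    """A patient passes only if every measure in the bundle was met."""
    return all(patient[measure] for measure in MEDICATION_MEASURES)


def bundle_rate(patients: Sequence[Dict[str, bool]]) -> float:
    """Share of patients who received the complete bundle."""
    return sum(all_or_none(p) for p in patients) / len(patients)


def measure_rate(patients: Sequence[Dict[str, bool]], measure: str) -> float:
    """Share of patients meeting one individual measure."""
    return sum(p[measure] for p in patients) / len(patients)


# Made-up data: two patients received everything, two each missed one item
complete = dict.fromkeys(MEDICATION_MEASURES, True)
patients = [
    dict(complete),
    dict(complete),
    {**complete, "discharge_aspirin": False},
    {**complete, "discharge_beta_blocker": False},
]

print(measure_rate(patients, "discharge_aspirin"))
print(bundle_rate(patients))
```

In this made-up example each individual measure is met for at least 75% of patients, yet only half of the patients received the full bundle. All-or-none scoring is designed to expose that kind of gap.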
5.5. Application of performance measures for improving quality
With the data reports described above, data become information provided for self-examination and self-assessment, which in turn can be the starting point for improving quality and outcomes. The individual and group reports have prompted the response of, “What does this mean?...and what can I do to improve it?” This is the start of the necessary self-examination and self-reflection, followed by system examination, on which to build the quality improvement. When the data are shared with the other members of the interdisciplinary team, they are even more powerful, as they prompt self-assessment and drive buy-in for a collaborative approach to improvement. Carefully designed and disciplined teamwork and reliable implementation of evidence-based protocols applied by an empowered front line help make improvements, especially decreasing complications and increasing cost savings.
6. Database use across the continuum of care
6.1. Linkage of databases
The STS database, as a national voluntary participation database, has seen growing participation, especially over recent years. However, it has still been important to quantify the completeness of the STS database as representative of cardiac surgery care in the US. To that end, the STS linked successfully with the Centers for Medicare and Medicaid Services (CMS) Medicare database and demonstrated high and increasing penetration and completeness of the STS database. In addition, this linkage should facilitate studying long-term outcomes of cardiothoracic surgery. The STS Congenital Heart Surgery database has also performed successful linkage to the Pediatric Health Information System database, an administrative database. This will similarly allow providers to conduct database research that capitalizes on the enhancements provided by linking both types of data to answer important clinical questions.
It has been noted repeatedly in this discussion that the surgical databases have focused on short-term outcomes, with mortality being captured as in-hospital or 30 days. But what can we learn about how the patients are doing after 30 days? What data sources may be available to the provider, beyond personal provider or staff follow-up with the patient and/or his/her primary physician? The NNE has linked that database to the National Death Index, to provide long-term survival outcomes. In the interest of more complete and accurate objective data, the STS has linked to the Social Security Death Master File, which allows for verification of “life status.” This successful linkage of the STS database to social security data has allowed examination of survival after cardiac operations (CABG, aortic valve replacement and mitral valve operations), with initial reporting on one-year survival.
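The value of such a linkage can be sketched with a toy example that joins registry records to a death file and derives one-year survival. The shared identifier `patient_key`, the record layout, and the data are all hypothetical, and real linkages generally require more careful matching than a single shared key. The sketch also assumes that every patient has at least one year of follow-up and that absence from the death file means the patient is alive.

```python
from datetime import date
from typing import Dict, List


def one_year_survival(registry: List[dict], death_file: Dict[str, date]) -> float:
    """Share of registry patients with no recorded death within 365 days of surgery.

    Assumes complete one-year follow-up and that a patient missing from the
    death file is alive (both simplifications for illustration).
    """
    if not registry:
        raise ValueError("registry must not be empty")
    survived = 0
    for record in registry:
        death_date = death_file.get(record["patient_key"])
        if death_date is None or (death_date - record["surgery_date"]).days > 365:
            survived += 1
    return survived / len(registry)


# Made-up registry and death-file records
registry = [
    {"patient_key": "A", "surgery_date": date(2020, 1, 1)},
    {"patient_key": "B", "surgery_date": date(2020, 3, 1)},
    {"patient_key": "C", "surgery_date": date(2020, 5, 15)},
    {"patient_key": "D", "surgery_date": date(2020, 6, 30)},
]
death_file = {
    "A": date(2020, 2, 10),  # died 40 days after surgery
    "B": date(2021, 7, 14),  # died more than one year after surgery
}

print(one_year_survival(registry, death_file))
```

Patient A would be captured by a 30-day or in-hospital definition only if the death fell within that window. Here the death at 40 days is visible only because of the linkage, which is the kind of information the short-term registry data alone would miss.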
In an important study looking at long-term outcomes, the STS, the American College of Cardiology Foundation, and the Duke Clinical Research Institute are collaborating on a comparative effectiveness study (American College of Cardiology Foundation–Society of Thoracic Surgeons Collaboration on the Comparative Effectiveness of Revascularization Strategies [ASCERT]) of CABG and percutaneous coronary interventions (PCI). This study has developed a long-term mortality (or survival) risk prediction model for CABG and PCI, i.e., considering outcomes across the continuum of care. In another early report on comparative effectiveness of revascularization strategies, Weintraub and colleagues have found that, among older patients with multivessel coronary artery disease not requiring emergency treatment, there was a long-term survival advantage among patients who underwent CABG as compared to patients who underwent PCI. Studies like this will continue to inform providers as they improve shared decision-making and quality, using the data-driven comparative effectiveness research.
6.2. Future opportunities for database improvement
Databases have been firmly established as sources of data for driving learning and improvement. Providers who participate in clinical registries or databases are therefore facilitating their own individual and collective practice-based learning and improvement. The variables captured in these databases will continue to be improved, both by building evidence-based consensus on definitions and by addition of variables needed to more completely represent the patients. In the risk factor discussion above, it was mentioned that there are aspects of patient condition that are not currently captured, including poor nutritional status and homelessness. Also discussed under risk factors was the challenge of capturing rehabilitation potential, relative to anticipated quality of life goal following postoperative recovery. It should be noted that Afilalo and colleagues have proposed using an integrative approach combining frailty, disability, and STS risk scores to better characterize elderly patients referred for cardiac surgery, and especially to identify those at increased risk.
While there is benefit in enhancing the databases by adding or refining variables where appropriate, it will remain imperative to exercise excellent quality control of data entered into the database, especially by applying consistent standardized definitions. As the electronic medical record becomes more pervasive in the US and elsewhere, this process may become more streamlined, but it will remain important to conduct appropriate audits for completeness and accuracy of data to decrease variability that would make the data and derived calculations suspect.
In the spirit of Dr. Codman, who in 1917 called for hospitals to release and compare outcomes data, the STS has initiated voluntary public reporting of database participant performance. The rationale is to provide transparency and promote accountability. With implementation will come the continued need for appropriate auditing and reporting of composite measures using credible data and methodologies, thus decreasing the likelihood of other entities developing measures from inferior methodologies which use unadjusted or inadequately adjusted administrative data [83,84].
A complete and accurate database is essential in order to provide vital information to patients, to providers, and to society. The patient needs valid and accurate prediction derived from data on procedural risk to aid in decision-making and to set realistic expectations for outcomes of care. The provider needs a valid and accurate risk prediction tool with which to appropriately counsel patients, families and colleagues. The provider needs valid and trusted reports on process and outcome measures in order to carry out appropriate and necessary assessment of self and system by which to improve. The public needs to see that the providers and the profession are measuring and monitoring, and using data for improvement of decision-making and care delivery, to ensure safe, timely, effective care. Cardiothoracic surgeons have been leaders in database development and testing, but even stronger leaders in applying the data to improve practice and outcomes for patients. This is a critical role and responsibility, but also an opportunity to make an invaluable contribution to our patients.