High-grade gliomas are the most common and aggressive group of primary central nervous system tumors, and are characterized by grim prognoses. The glioma with the worst associated survival is glioblastoma multiforme (GBM), a World Health Organization grade IV astrocytoma. The median survival of patients diagnosed with GBM is just 12 to 15 months. After recurrence, only 9-15% of patients are alive and progression free at 6 months (APF6), with a median survival of 9 months.
Advances in surgical techniques and neuroimaging have proven important, but nonetheless have done little to improve survival. The standard treatment for newly diagnosed GBM remains an aggressive tripartite strategy of surgery, radiation therapy (RT), and chemotherapy with adjuvant and concurrent temozolomide (TMZ), and yet GBM almost inevitably recurs. TMZ was added to the standard of care after Stupp et al. published results in 2005 showing that TMZ extended survival by 2.5 months . Other FDA approved treatments for GBM include the nitrosoureas lomustine and carmustine, as well as gliadel wafers that release carmustine and are placed during surgery. The use of these agents has been controversial and not standard, despite the fact that they have a similar effect on survival as TMZ. Upon recurrence, the treatment options for GBM are drastically diminished. Therapy is varied, and can include re-irradiation or surgery for some patients, and drug treatments. Only recently has the anti-angiogenic drug, bevacizumab, been added to the therapeutic regimen for recurrent tumors, although its impact on survival is debated . NovoTTF, an electric current therapy, is also approved for use in recurrent GBM.
Current treatment for GBM is not without toxicity. None of the GBM treatments considerably extends survival or improves quality of life, and consequently the risk-benefit ratio in treating GBM is not strongly tipped towards benefit. Indeed, the individual prognostic factors of each patient, including O6-methylguanine-DNA methyltransferase (MGMT) methylation status, presence of mutant epidermal growth factor receptor variant III (EGFRvIII), baseline performance status, age, and extent of surgical resection, usually correlate better with survival than available treatments do. Younger age, MGMT methylation, IDH1 mutation, gross total surgical resection, and a higher performance status as measured by Karnofsky Performance Score (KPS) are positive predictors of survival [4-7].
The lack of effective treatments can be largely attributed to the biology of this glioma making it elusive to treatment. GBM is largely infiltrative of the surrounding brain tissue, so separation for surgical and radiation treatment is essentially impossible, thwarting efforts to prevent recurrence. Drugs may fail to impact GBM because of their inability to cross the blood-brain-barrier (BBB). Although it has been argued that the BBB becomes permeable due to a GBM, it is likely that many cancerous cells remain in areas with an intact BBB . Additionally, the ubiquitous mutations and redundant biologic pathways in GBM allow the tumor to easily escape many therapies. Upon recurrence, GBM often mutates and takes on new genetic abnormalities, conferring resistance to treatment that previously achieved some success. Combination therapy approaches are likely the best solution to GBM’s escape and resistance mechanisms. New therapies, possibly combined with older ones, are sorely needed to see remarkable advancements in the treatment of GBM. Only a handful of agents have been FDA approved for GBM in the last four decades, and these include: nitrosoureas, gliadel wafers, TMZ, bevacizumab, and Novo-TTF. As such, clinical trials remain as vital as ever.
Clinical trials of innovative treatments are the avenue for progress in GBM treatment, but serious obstacles plague trials and discourage advancements. Both patient enrollment and the success rate of investigational agents are sub-par. Only 5% of adults with cancer enter clinical trials, and a mere 6% of the agents that go into trials are ultimately FDA approved.
A major barrier to developing new treatments has been the relative infrequency of GBM. The incidence of GBM is 11,000 diagnoses each year, with a prevalence of 25,000. Although it gained some notoriety in the media when Senator Edward Kennedy passed away as a result of the disease in 2009, GBM is still seen as an orphan disease. Its rarity contributes to a paucity of funding, and makes innovative drug development through clinical trial research slow and frustrating as a result of low patient accrual. With just 3% of patients who receive standard of care surviving past 5 years, new clinical trials will shape the future of this diagnosis, but at present only a small fraction of GBM patients participate in clinical trials. The world of clinical trials is fraught with ethical debates, a frustrating lack of standardization across trial parameters, and low funding. The efficacy of new treatments is obscured by poorly designed trials with insufficient stratification by individual prognostic factors or ill-chosen end-point analyses. However, as researchers continue to elucidate the molecular mechanisms of GBM, innovative, targeted treatments have gained promise as a possible solution. These include immunotherapies, anti-angiogenic drugs, and therapies against target molecules like EGFR. Well-designed clinical trials involving these agents are necessary to improve the low survival rate of GBM.
Although it has been debated whether or not clinical trial research should be conducted on terminally ill patients who may not reap a survival benefit, two recent studies have shown that clinical trials unequivocally help patients, irrespective of whether the trial took place before or after advances in the standard of care, including the addition of TMZ. Shahar et al. discovered a significant survival benefit for GBM patients who participated in clinical trials, regardless of their placement in the control or experimental arm, and independent of baseline performance scores or age. This important finding was corroborated by Field et al., whose group showed that patients in clinical trials had improved outcomes, once again regardless of assigned trial arm, age, or performance status. Clinical trials are valuable to the academic and medical communities and, it now appears, to the patient as well. Patients in both arms of a trial could exhibit better outcomes because of the additional close surveillance by medical teams that offer psychological and physical support. The act of following a rigid protocol, the placebo effect, and/or a change in the physician-patient relationship as a result of both being observed may also suffice as explanations. These results provoke a disconcerting question: if patients on the control arm do better than those who receive the standard of care, is it really the best standard of care?
2. Overview of clinical trial development
Clinical trials have progressive phases of development, starting with pre-clinical laboratory research, followed by either a pre-clinical phase 0 or phase I trial, and then a succession of phase II, phase III, and usually a phase IV trial. Pre-clinical studies precede human trials, and involve bench research, such as human cell lines, followed by translation into animal models implanted with human xenografts or syngeneic rodent tumors in immune-competent mice for immunological studies.
A phase 0 trial, also known as a micro-dosing exploratory investigational new drug trial (IND), or early phase I trial, is the first translational step into human use, after an investigational agent has been thoroughly studied in animal models. In this phase, only a small amount of drug is used, and the data generated are preliminary without intended indications about safety or effectiveness. A small number of patients with advanced disease and without available treatment options participate, and once the experimental agent is deemed acceptable for human use, a phase I trial is initiated.
Phase I trials are meant to examine the dosing of a new drug, its safety, and pharmacokinetic properties. Classically, the goal is to describe the maximum tolerated dose (MTD). Notably, efficacy is not evaluated at this phase. Patients in phase I trials are monitored carefully for side effects. Approximately 20 patients are enrolled, all of whom have advanced disease for which no standard treatment exists. Phase I can last from months to a year or more, depending on accrual.
Phase II trials examine the efficacy and pharmacodynamics of the investigational treatment in under 100 patients. Healthy volunteers are excluded from participation in oncology trials because of the high toxicity associated with cancer treatments. Unfortunately, the efficacy and toxicity of cytotoxic drugs increase together. In brain tumor patients, efficacy is typically measured by tumor changes, which are characterized on a set spectrum from complete tumor response to progressive disease, with evaluation performed through radiographic imaging. If a pre-determined level of efficacy is attained, the trial may advance to phase III; otherwise, the clinical trial process is halted. Phase I and II trials may be combined if a drug is already well known for another diagnosis, as there is some existing information on toxicities and dosing.
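The response spectrum mentioned above can be illustrated with a minimal sketch. The thresholds used here (a decrease of at least 50% for partial response, an increase of at least 25% for progressive disease) are assumptions borrowed from conventional bidimensional response criteria and are not stated in this text; the function name and the use of a single enhancing-area measurement are likewise hypothetical simplifications. Real criteria also take corticosteroid dose, clinical status, and new lesions into account, and typically measure progression against the smallest prior tumor size rather than baseline.

```python
def classify_response(baseline_area, followup_area):
    """Classify radiographic response from two tumor-area measurements.

    Areas are products of perpendicular diameters (for example, in cm^2).
    Thresholds are assumed, illustrative cut-offs, not values from the text.
    """
    if baseline_area <= 0:
        raise ValueError("baseline_area must be positive")
    if followup_area < 0:
        raise ValueError("followup_area cannot be negative")

    if followup_area == 0:
        return "complete response"

    # Fractional change relative to baseline (negative means shrinkage).
    change = (followup_area - baseline_area) / baseline_area

    if change <= -0.50:
        return "partial response"
    if change >= 0.25:
        return "progressive disease"
    return "stable disease"


# Example: four hypothetical follow-up scans against a 10 cm^2 baseline.
for followup in (0.0, 4.0, 9.0, 13.0):
    print(followup, classify_response(10.0, followup))
```

A phase II decision rule would then count how many enrolled patients fall into the response categories and compare that proportion with the pre-determined level of efficacy.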
Phase III trials involve a larger cohort of patients, anywhere from hundreds to thousands of people, and last for years. The intent is to compare the investigational treatment to the standard of care and determine if it confers some advantage over the standard, in terms of efficacy, side effects, or other factors. This phase usually employs both randomization and blinded enrollment, which in combination with larger numbers of participating patients, helps to reduce bias of earlier phase trials. After the completion of phase III, if the study agent possesses benefit over the standard treatment, FDA approval is sought.
Phase IV trials are not always required, and simply serve to answer any lingering questions after the drug has FDA approval, such as side effects that have not been sufficiently explored. Phase IV trials enroll many more patients and may last a long time. Drug companies often sponsor phase IV studies to act as post-marketing surveillance trials (NCI).
The clinical trial process through all of the phases is not easy or rapid. It takes a median of 172 days to open any phase trial, and with three phases this adds up to an extra 1.5 years spent on the administrative process. Many factors play into this slow approval, including inadequate resources for frequent institutional review board (IRB) meetings and unavailability of staff that can be allocated to protocol development. Other barriers, like contracting, or increased litigation and bureaucracy at academic centers, also slow the process. Solutions such as a centralized IRB have been proposed, but not realized. A more efficient administrative path would mean opening trials faster, and consequently patients could obtain the benefits of new treatments sooner; this does not need to come at the expense of ensuring that patient safety and research integrity stay intact .
One more streamlined option for drugs entering clinical trials is to apply for accelerated approval (AA). The FDA designed AA so that drugs could be more quickly approved for life-threatening conditions. It is intended for drugs that fill an unmet need in the treatment of a serious condition. Two notable social movements, namely the AIDS and breast cancer movements, helped bring the concept of AA to fruition. Patients who were sick and dying were frustrated by their inability to access promising experimental drugs that still had years before FDA approval . The advent of AA alleviated some of the regulatory sluggishness and hindrances. The results of these activist movements have trickled down to affect treatment of brain tumors; for example, both TMZ and bevacizumab have been granted AA for use in GBM.
AA is faster in part because it relies upon data from surrogate end-points, as opposed to regular approval, which is granted on the basis of data showing real clinical benefit, defined as either prolonged survival or improved quality of life. Surrogate end-points, such as progression free survival (PFS) or overall radiographic response (ORR), are supposed to reasonably predict an effect on actual clinical measures and give evidence of an acceptable risk-benefit ratio, while requiring less time to assess. Ideally, the surrogate end-point has been well validated in its relation to clinical benefit, because correlation with a clinical measure does not guarantee that clinical improvement will follow. A well-known example is the pair of drugs encainide and flecainide, which received AA in 1980 for their ability to suppress arrhythmias, a known risk factor in myocardial infarction. After AA, they were prescribed to almost half a million people; physicians even believed that it was unethical for control subjects in a randomized trial not to receive them. However, a completed randomized, post-marketing trial found that the drugs actually tripled the risk of death, and they were withdrawn from the market. As shown by this example, simply correlating an end-point with a clinical benefit can be risky. Moreover, although phase II or III trials can provide the data for AA, regular approval is contingent upon the completion of required post-marketing trials that are conducted to demonstrate actual clinical benefit; otherwise, the drug may be withdrawn from the market.
Although the importance of post-marketing trials is unquestionable, it is apparent that these are not always conducted in a timely fashion for oncology drugs. The FDA granted AA to 47 new uses for 35 oncology drugs from 1992-2010, but by 2011, 21 had not been given regular approval because the clinical benefit had yet to be proven in a post-marketing trial. Monitoring by the FDA and penalties for companies that do not finish trials in a reasonable time will be necessary to prevent ineffective, or even harmful, drugs from staying on the market and being prescribed. Confirming that a post-marketing trial is designed and ready for patient accrual at the time that AA is granted would help ensure the drug company’s compliance [17, 18].
The FDA and other regulatory agencies are not solely responsible for a challenging clinical trial process. It can also partly be attributed to the complex ethics involved in cancer clinical trials. These ethical issues are important to recognize, and play a part in the physician, patient, and greater regulatory agencies’ attitudes towards clinical trial research.
3. Ethics of clinical trials
Ethical issues for patients in brain tumor trials (as well as trials in other oncology and terminal diagnoses) can be sorted into a series of fundamental questions. The first question is whether terminally ill patients should be considered a vulnerable population. Although the GBM patients who participate in trials are often white, male, and educated, and therefore not classically categorized as vulnerable by society, their status as dying conveys a new vulnerability. A vulnerable group is defined by an inability to protect its own interests or by cognitive impairments that interfere with decision-making capacity. Physicians and others become concerned when a dying, desperate patient becomes willing to do anything, regardless of proposed risk or benefit, for the chance of an improved outcome. GBM patients undoubtedly suffer from varying degrees of cognitive impairment, and struggle physically and psychologically, so their resulting decision-making capacity could very well be reduced. Conversely, these patients arguably make other equally important decisions, such as end-of-life wishes and estate wills.
The debate regarding vulnerability should not produce the conclusion that GBM patients need to be excluded from clinical trials; the very act of exclusion of dying patients can be unjust and harm the progress of medical care . It is presumptive to assume that each patient prefers palliative care to the chance for involvement in research for future treatments, having an avenue to fight their diagnosis, or being valued in the medical research world. The notion that participation in clinical trials by desperate patients is similar to coercion is also incorrect, as being in a situation with inherently limited choices towards the end of life is distinct from coercion . Avoiding the categorization of terminally ill patients participating in trials as exploitation can be achieved by ensuring that the research is essential to the future of the diagnosis, that another population could not replace dying patients, and that risk is minimized .
A second ethics issue is the patient’s understanding of the informed consent process that is mandatory for entering a trial. Consent is intended to explain the reasons for the trial and what can be expected to happen to a patient as a result of participation. For a phase I trial, the consent process would inform the patient that the study is meant for finding dosing and safety information. Nevertheless, patients involved in phase I trials often cite their desire for therapeutic benefit as the reason for participation, even though phase I trials do not anticipate efficacy for patients [14, 21]. There is an apparent dichotomy between what the researcher, who searches for toxicity and pharmacological details, hopes to gain and what the patient, who wants to get better, hopes to gain. The fact that patients use phase I trials as a last-ditch effort to fight their disease and to attain a response invites the conclusion that the informed consent process is inadequate. If the consent process were sufficient, the patient would be aware that they very likely would not experience a response in their disease, and that the trial is simply for dosing and safety. However, this gap in understanding is not as simple as inadequate consent. A vital line must be drawn between understanding and appreciation; a patient can comprehend that the chances of receiving a therapeutic benefit are incredibly slim, and that dosing is the end-point, but what they appreciate in this scenario may remain the possibility, however small, of a benefit. Indeed, although only a third of patients report that the goal of a phase I trial is dosing information, a full 92% report that it is safety.
Furthermore, it is exceedingly difficult to grasp the psychological state of a person who is dealing with serious illness and fear, and thus inadequate to assume that healthy people fully understand the perspective that gives rise to a terminal patient’s decision to participate in a trial with an apparently skewed risk-benefit ratio . Comprehension as a result of the informed consent process needs to be pursued and confirmed along with a realization that patients and researchers may continue to value different end goals. This will avoid exploitation and protect against the formation of false hope in patients.
The third ethics issue is the balance of the risk-benefit ratio, which is exceedingly important in determining whether a clinical trial should be conducted. The risk-benefit ratio for terminally ill patients is unique, as they may be more willing to accept risks in order to gain a benefit. For example, it has historically been believed that phase I anticancer trials have an overall response rate of only 5%, and a complete response rate of 0.3-0.7%. At first glance this seems quite low, and deciding whether or not conducting these trials is necessary also depends upon the associated risks. However, these numbers do not include newer targeted therapies, and additionally do not contain descriptions of less drastic responses. In fact, more than 60% actually showed at least one objective response (tumor shrinkage of more than 50%). The risk of death from phase I associated toxicities was 0.5%. Horstmann et al. (2005) analyzed 460 oncology trials and found a 10.6% combined complete and partial response rate, and a 34.1% rate of less than partial response or stable disease, which are higher rates than previously recognized (the rate of death due to toxicity was 0.49%). It is possible that the benefits of phase I trials have been understated. Horstmann et al. also point out that the retrospective analyses lacked separation by treatment type, which greatly affects toxicity and response. In addition to higher than believed biological response rates, an encouraging result is that psychological value has been found in phase I oncology trials. Although a large amount of time and energy must be dedicated by the patient, which is precious at the end of life, it is arguably balanced by the comfort of frequent physician contact and a chance to exert willpower in a powerless situation.
Another ethical dilemma in the oncology setting is how the physician-patient relationship is altered by clinical trial participation. This issue is key because physicians are the interface between patients and clinical trial research, and enrollment of patients into trials is heavily dependent upon physician referral. Phase I trials can be frustrating for physicians. A phase I trial is traditionally designed with the 3+3 model of Fibonacci escalation, meaning that successive groups of patients receive progressively higher doses, with dose escalation from level to level until a reversible toxic event occurs. In this design, the majority of patients will receive a sub-therapeutic dose, as disease responses are typically seen at 80-120% of the maximum dose. This means 60-80% of patients in phase I trials receive an ineffective dose, along with any associated toxicities, a low overall response rate, and a historical disease remission rate of just 1%. It becomes understandable why physicians would not be enthusiastic about referring patients to phase I oncology trials. Physicians have no clinical authority over the care of their patient once they are enrolled in a trial, and it is hard to communicate to a patient that even if a benefit were to occur, the chances are not optimized for each patient in a phase I trial, because this is fundamentally contrary to the Hippocratic oath. New dosing methods have been suggested to reduce the number of patients who receive sub-therapeutic doses and to quicken phase I trials. These can take the form of intra-patient escalation, a higher starting dose, fewer patients in each level, and an accelerated dose escalation process. The Fibonacci scheme is no longer widely implemented, which eases some ethical tension for patients and physicians. Changing dosing methods to increase the likelihood of patients benefitting is desirable, but it can blur the lines between research and therapeutics. It implies that phase I trials are for benefit, when they are still conducted for research on safety and dosing. With a scheme like intra-patient escalation, physicians may have a greater tendency to think that they are helping their patient, even though the end-point of the phase I trial remains unchanged. Standard medical care and clinical trial research are not one and the same, but physicians and patients alike often fail to see the boundaries. This aside, decreasing the number of patients who are unnecessarily exposed to a drug by employing better dosing schemes is important.
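To make the cohort-based escalation discussed above concrete, the following is a minimal simulation sketch of the rule commonly referred to as "3+3". It is an illustration under stated assumptions, not the protocol of any particular trial: the dose-limiting toxicity (DLT) probabilities are hypothetical, the function name is invented for this example, and the usual step of expanding the level below a toxic dose to six patients is omitted for brevity.

```python
import random


def three_plus_three(dlt_probs, rng):
    """Simulate one cohort-of-three dose-escalation trial.

    dlt_probs: assumed true DLT probability at each dose level, lowest first.
    Returns (mtd_index, patients_per_level). mtd_index is None when even the
    lowest level is too toxic.
    """
    treated = [0] * len(dlt_probs)
    level = 0
    while level < len(dlt_probs):
        # First cohort of three patients at the current level.
        dlts = sum(rng.random() < dlt_probs[level] for _ in range(3))
        treated[level] += 3
        if dlts == 1:
            # Exactly one DLT: expand the level with three more patients.
            dlts += sum(rng.random() < dlt_probs[level] for _ in range(3))
            treated[level] += 3
        if dlts >= 2:
            # Two or more DLTs: stop; the MTD is the next lower level, if any.
            return (level - 1 if level > 0 else None), treated
        level += 1
    # No level was too toxic; report the highest level tested.
    return len(dlt_probs) - 1, treated


# Hypothetical DLT probabilities for five dose levels (illustrative only).
probs = [0.05, 0.10, 0.20, 0.35, 0.50]
rng = random.Random(0)
below = total = 0
for _ in range(2000):
    mtd, treated = three_plus_three(probs, rng)
    total += sum(treated)
    if mtd is not None:
        below += sum(treated[:mtd])
print(f"share of patients treated below the selected MTD: {below / total:.2f}")
```

Running the simulation with different assumed toxicity curves shows how a substantial share of participants is treated at levels below the dose eventually selected, which is the concern that motivates the alternative dosing schemes described above.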
Randomized phase III trials also present issues for the doctor-patient relationship. In order for a physician to send a patient to a clinical trial, they must not believe either arm of the clinical trial is superior; the Hippocratic oath would prevent the physician from accepting randomization in this case. Physicians are inclined to operate under concern for their present patient, not the future generations that will reap the results of a successful clinical trial . However, randomization and blinded enrollment are significant. Trials that do not adequately conceal patient distribution have a greater tendency to find an advantage of the new treatment over the standard. Blinded assessment produces significantly lower and more consistent scores as compared to open assessment . Dealing with ethical issues surrounding randomization is essential. Reluctance of physicians to refer patients to clinical trials is a major obstacle as it begets less effective and less powerful trials due to low patient accrual.
That being said, patient interaction with clinical trials is not wholly dependent upon a relationship with a physician. Patients do seek them out independently or find them as a result of patient advocacy and social movements, as seen with the AIDS and breast cancer activist movements. It would appear that the activist movements gained a victory for terminally ill patients who seek experimental drugs through AA. However, it is feasible that AA is monetarily helpful to pharmaceutical companies, who can profit once approval is granted, rather than being as advantageous as patients believe it to be. In the AIDS movement, the drugs that were granted AA were extremely expensive, while still being toxic and mediocre. The FDA and Congress are wary of drug companies taking advantage of patients through AA, which is why it is only granted for diseases with extreme need and why, in 1991, the FDA published specific regulations for AA in life-threatening conditions, including a post-marketing trial requirement. The crux of the issue in patients versus approval mechanisms is the ethical argument between the right to take experimental drugs and the need to collect drug data.
In 1980, the United States Supreme Court ruled that patients do not have the right to obtain unapproved drugs (Rutherford vs. U.S.). In Gonzales vs. Raich in 2005, the Supreme Court ruled that “dispensing of new drugs, even when doctors approve their use, must await federal approval”. Recently, the Supreme Court denied the requests of the Abigail Alliance, a non-profit organization founded by a young girl’s father hoping to gain access to an experimental drug for his terminally ill daughter, to allow drugs to be approved and used after phase I trials, and to let companies financially profit before approval. The possibility of exploitation of sick and desperate patients by drug companies was too great in this case. The fact remains that 66% of oncology drugs entering phase I trials fail to be approved. Thus, accepting the Abigail Alliance’s desire for drug approval after phase I trials would be dangerous.
The right to early access to drugs through AA should not compromise further data collection. Once patients have access to a drug through AA, enrolling in the post-marketing trial is not appealing, yet the purpose of the trial is to ultimately understand the risk-benefit ratio, which is critical to patients and should not be undervalued . Although AA is granted with the assumption that the agent will procure clinical benefit, this is not always the case. For example, if a drug receives AA after phase II, the reality is still that 45% of drugs fail phase III testing. Consequently, AA does need to be granted with caution and the assurance that more data will follow . It is also necessary to understand that rare adverse events may only be illuminated in large randomized trials, and very rare ones may only be discovered after the drug is being taken by large numbers of the population. A possible solution for mitigating the risk of AA would be staggered approval, in which the drug is approved for use in a small subset of a disease, and as more data are collected the drugs’ use can be expanded . A second option to reduce the risk associated with AA is expanded access for non-randomized open trials, so that more patients can enroll and receive a drug while data are generated. The National Coalition for Cancer Survivorship and the American Society for Clinical Oncology have advocated for expanded access .
The ethics of clinical trials do not have straightforward resolutions; the main point that must be adhered to through any of the ethical debates is protecting patients who have run out of medical options from exploitation or an unacceptably skewed risk-benefit ratio. The fact is that although clinical trials are complex to develop and run, their significance for GBM remains strong even in the midst of ethical and regulatory barriers.
4. Challenges in brain tumor clinical trials
Obstacles in brain tumor clinical trials will be discussed in order to illuminate why progress in the field continues to be a struggle. The issues covered will include: challenges to patient selection, trial end-point analyses, and the different trial parameters required for optimal assessment of recently developed targeted, experimental agents.
4.1. Patient selection and accrual
Patient accrual and selection in GBM clinical trials can be problematic, because without sufficient patient numbers or stratification, trials become ineffective. GBM is an orphan disease and consequently the number of patients who enter clinical trials is quite low. This problem is exacerbated by physician hesitance to refer patients to trials. As a result of low numbers, statistical power in clinical trials is diminished, which prevents detection of more subtle efficacies. For example, in a randomized clinical trial looking for a 50% increase in survival, 136 patients are needed per arm. To detect a 25% increase in survival, 411 patients must be in each arm. For this reason, in designing a clinical trial for a rare disease, the clinical outcome pre-determined as significant must be carefully considered . By using the same standard for every disease, a drug might be inaccurately deemed ineffective and not worth studying.
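The dependence of trial size on the targeted effect can be sketched with the Schoenfeld approximation for a two-arm log-rank comparison. This is only an illustration: the significance level, power, 1:1 allocation, and exponential survival are assumptions introduced here, and the function returns the total number of events (deaths or progressions), not patients per arm. It therefore does not reproduce the per-arm figures quoted above, which depend on design details not given in the text, but it shows the same qualitative pattern.

```python
import math
from statistics import NormalDist


def events_required(median_ratio, alpha=0.05, power=0.80):
    """Total events for a 1:1 two-arm log-rank test (Schoenfeld approximation).

    median_ratio: targeted ratio of median survival, experimental over control
                  (1.50 means a 50% increase). Under exponential survival this
                  corresponds to a hazard ratio of 1 / median_ratio.
    alpha, power: assumed two-sided significance level and power.
    """
    log_hr = math.log(1.0 / median_ratio)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / log_hr ** 2)


for ratio in (1.50, 1.25):
    print(f"{(ratio - 1) * 100:.0f}% increase in median survival: "
          f"{events_required(ratio)} events in total")
```

Halving the targeted improvement from 50% to 25% more than triples the number of events that must be observed, which is why the effect size declared as clinically meaningful has to be chosen with particular care in a rare disease.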
Another element in patient selection for brain tumor trials is defining the appropriate control for a treatment comparison. The possibilities include: an untreated baseline from the same patient, an untreated patient in another arm, or historical controls. Historical controls are used to compare efficacy results of many phase I and II trials, but this is not an ideal measure. Historical controls can be derived from possibly flawed literature, and trials may have differed in significant ways, like eligibility criteria, lack of control for corticosteroids, types of imaging analysis, or unaccounted prognostic factors. There is little consistency across trials because of the absence of agreed upon standard care, with the exception of the Stupp protocol after 2005 . With new, targeted therapies, the ideal control for a trial is to procure untreated and treated biological specimens, within the same patient over time. By the patient serving as his/her own control, the amount and function of a target can be easily and thoroughly compared. The issue in this approach lies in the feasibility and cost-effectiveness of multiple surgeries .
The GBM patient population is also heterogeneous with regard to individual prognostic factors. Principal prognostic factors described after a recursive partitioning analysis (RPA), performed by the Radiation Therapy Oncology Group, were the patient’s age, KPS, neurological function, mental status, and extent of surgery, which is related to survival after 89% of tumor volume resection [4, 32]. Another essential factor for prognosis is tumor location. If patients are not selected and stratified with consideration of prognostic factors, it may become impossible to assess whether observed effects are truly due to treatment or to individual differences. Understanding these prognostic factors allows patients to be sorted into four main GBM risk classes, which can be accurately compared with one another. Risk groups determined by RPA are important in phase II trials that compare data to historical controls, in order to eliminate patient selection bias. Age is an extremely important prognostic factor, followed by performance status for patients older than 50 years, and within that subset, mental status causes a significant split [4, 33].
Highlighting the vitality of prognostic factor description and separation in clinical trials was reported in a retrospective analysis by Perry et al. 1997 of the RTOG database. The group discovered that pre-treatment variables (i.e. prognostic factors) impacted survival more than actual treatment. The variables examined were the duration of symptoms, neurologic abnormalities, tumor grade, age at diagnosis, and post-operative performance status. In their examination of phase II studies, it became apparent that patient selection furnished sufficient explanation of survival differences. It is clear that patient stratification is imperative for accurate treatment analysis, even though it likely increases the required number of patients for trials as subgroups are created .
4.2. Clinical trial end-points
End point analysis has been a major point of contention and frustration in many GBM clinical trials. There is no single preferred end-point, making comparisons between trials complicated and doubts about a treatment’s efficacy hard to resolve. The importance of the end-point cannot be overstated, as an ill-chosen one can make a drug’s efficacy seem poorer or better than it actually is, which are both detrimental outcomes. The two main types of end-point analyses for efficacy in brain tumor clinical trials are tumor response (visible tumor effects along with rate and duration of response) and tumor control (PFS or OS). Tumor control is more optimal for targeted therapy effects because it takes into account cytostatic drugs that do not impact tumor burden, as compared to tumor response, which is more sensitive to cytotoxic drugs . Both PFS and tumor response rely upon radiographic imaging analysis, yet this is plagued by unreliability and lack of standardization in brain tumor patients. Radiographic imaging needs meticulous and careful assessment.
Imaging: Radiographic imaging is a chief tool for analyzing the results of an interventional agent in clinical trials in all phases. Imaging gives insight into direct and tangible changes in the GBM, but it also comes with many challenges. Treatments may lead to confusing imaging read-outs.
Pseudoprogression refers to treatment-induced changes in edema or contrast enhancement that radiographically mimic increasing tumor burden. Up to 30% of patients have shown pseudoprogression after RT/TMZ, as well as after chemotherapy wafers, immunotoxins delivered with convection enhanced delivery, viral gene therapy, and immunotherapies. A repeat scan performed 4-8 weeks later helps to rule out pseudoprogression, in order to prevent unnecessary therapy or accidental inclusion of patients in clinical trials for recurrence. Conversely, pseudoresponse is equally problematic, as it can give rise to the perception that a therapy has caused a response. The corticosteroids used by many patients considerably reduce tumor enhancement on MRI or CT imaging, giving the false appearance of a tumor response. Further, agents that target vascularity alter vessel permeability, and in turn decrease perfusion and enhancement on imaging, without the underlying tumor being affected. One example is the anti-angiogenic agent bevacizumab, which can significantly decrease enhancement on MRI within 24 hours, although the solid tumor is static.
There are commonly used methods for assessing imaging in many trials, although no standard is globally applied. Before current methods were proposed, the Levin criteria and the WHO response criteria were used to assess brain tumor imaging. The Levin criteria called for analysis of MRI images to assess the extent of enhancement, edema, and mass effect as compared to a baseline scan. Tumor responses were categorized as complete response, partial response, stable disease, or progressive disease [35, 39]. The WHO criteria evaluated contrast-enhanced computed tomography (CT) by multiplying the greatest cross-sectional enhancing tumor diameters. The Levin and WHO criteria were subsequently replaced by the Macdonald criteria in 1990 because they did not have adequately defined guidelines and did not consider artificial enhancement created by factors besides the tumor.
The Macdonald criteria use contrast-enhanced MRI, which has become the gold-standard imaging method for measuring changes in GBM. There are two different types of MRI assessment. The first is to measure the largest diameter on any single axial section, which can be done using two distinct established criteria, and the second is a volumetric measure of tumor size. The first diameter method is the Macdonald criteria (two-dimensional), outlined more than 20 years ago but still widely used, which analyze the largest enhancing tumor diameter on a single axial gadolinium-enhanced T1-weighted section. This process also considers the greatest perpendicular diameter on that section. The product of the diameters is calculated, the products are summed across lesions, and the sum from each subsequent imaging study is compared with the previous values. These results are then used to classify the outcome as complete response (CR), partial response (PR), stable disease (SD), or progressive disease (PD). The Macdonald criteria define complete response as the disappearance of all enhancing measurable and non-measurable disease for at least 4 weeks, with no new lesions, no corticosteroids, and stable or improved clinical status. A partial response requires a greater than 50% decrease in all enhancing perpendicular diameter products compared with the baseline, with no new lesions, stable or decreasing corticosteroid use, and stable or improved clinical status. Progressive disease is a 25% or greater increase in the sum of the perpendicular diameter products of contrast-enhancing lesions, a new lesion, or clinical deterioration. Stable disease is defined as anything that does not qualify as complete response, partial response, or progressive disease. Clinical status is measured with KPS or Eastern Cooperative Oncology Group (ECOG) performance status (explained below) [34,40]. The criteria take into account corticosteroid use and neurological status changes.
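The classification rules above can be sketched as a small decision function. This is a minimal illustration rather than a validated implementation: the `Assessment` fields and the function name are hypothetical, change is measured against the baseline scan only, the 4-week confirmation of a complete response is not modeled, and the handling of values that fall exactly on a threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One imaging time point (all field names are illustrative assumptions)."""
    sum_of_products: float            # sum over lesions of the two perpendicular diameters multiplied, mm^2
    new_lesion: bool = False
    on_corticosteroids: bool = False
    steroid_dose_increased: bool = False
    clinically_worse: bool = False


def macdonald_category(baseline: Assessment, current: Assessment) -> str:
    """Return 'CR', 'PR', 'SD', or 'PD' using the thresholds described in the text."""
    if baseline.sum_of_products <= 0:
        raise ValueError("baseline must contain measurable enhancing disease")
    change = (current.sum_of_products - baseline.sum_of_products) / baseline.sum_of_products

    # Progression: 25% or greater increase, a new lesion, or clinical deterioration.
    if change >= 0.25 or current.new_lesion or current.clinically_worse:
        return "PD"
    # Complete response: no enhancing disease and no corticosteroids.
    if current.sum_of_products == 0 and not current.on_corticosteroids:
        return "CR"
    # Partial response: decrease of 50% or more with stable or decreasing steroids.
    if change <= -0.50 and not current.steroid_dose_increased:
        return "PR"
    # Everything else is stable disease.
    return "SD"


baseline = Assessment(sum_of_products=1200.0)
categories = [
    macdonald_category(baseline, Assessment(sum_of_products=value))
    for value in (0.0, 500.0, 1000.0, 1600.0)
]
print(categories)
```

Checking progression first means that a shrinking lesion accompanied by a new lesion or clinical decline is still scored as progressive disease, which matches the way the criteria combine imaging with clinical status.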
Complete and partial responses are combined to comprise what is known as overall radiographic response (ORR), which is taken as the indication of positive treatment efficacy. Radiographic response is used in clinical trials because changes in the size of the tumor are presumed to predict symptomatic and survival benefit. ORR can be evaluated in a single arm trial, and it is not confounded by the natural history of the disease, so it can more accurately predict therapeutic benefit. ORR also has an advantage as an end-point because the Macdonald criteria are widely used and relatively objective, so they enable comparison between trials. However, although radiographic response and survival are believed to be related, ORR has not been directly connected to therapeutic benefit, and the significance of a partial versus a complete response is not well characterized. Furthermore, radiographic response is biased against cytostatic agents that produce stable disease as opposed to partial or complete radiographic responses.
There are also limitations associated with the Macdonald assessment method. The Macdonald criteria do not specify directions for necrotic portions, are restricted in their ability to measure irregularly shaped tumors, are subject to the judgment of the radiographic reviewer, offer no guidance for multifocal tumors, face challenges in measuring around cysts or surgical cavities, and fail to assess non-enhancing tumor [35, 40, 41]. Measuring contrast enhancement alone while ignoring non-enhancing tumor is insufficient because passage of contrast agents across the BBB can be influenced by many factors, including corticosteroids, inflammation, surgery, ischemia, anti-angiogenic drugs, radiation, radiological methods, seizure, and post-surgical effects [36, 41].
In an effort to rectify these limitations, updates to the Macdonald criteria have been proposed by the Response Assessment in Neuro-Oncology (RANO) working group. The updated criteria consider, without measuring, non-enhancing regions on T2-weighted and FLAIR MRI imaging to visualize response. Any enlargement of these areas is considered progression, and for a response to be defined as CR, PR, or SD, the non-enhancing regions must be stable or decreased. The definition of stable disease is expanded to be less than a 50% decrease, but less than a 25% increase, in enhancing disease, along with no new lesions, stable or decreasing corticosteroid use, and stable or improved clinical status. Pseudoprogression is clarified by stating that within the first 12 weeks after radiotherapy, progression can only be determined if all of the enhancement occurs outside the radiation field or is pathologically confirmed [35, 41].
The other diameter measuring method is the one-dimensional RECIST criteria, published in 2000, but not as widely utilized as the Macdonald criteria because it has not been validated in brain tumors [40, 41]. RECIST finds the longest linear enhancing diameter of a lesion in one axial plane, which is repeated on later imaging studies, regardless of changing section or orientation. Non-measurable lesions are less than 10mm, less than 2 times the imaging section thickness, and cannot be cystic foci, necrotic foci, or leptomeningeal lesions. If multiple lesions must be measured, they are done so separately and summed in final analysis . RECIST does not consider steroid use or neurological status, and faces similar challenges as the Macdonald criteria in regards to the difficulty of radiographically measuring a heterogeneous tumor . Macdonald and RECIST have been found to be concordant in other cancers, and recent evidence found that they gave similar responses for tumor response and progression in recurrent GBM treated with irinotecan and bevacizumab .
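The one-dimensional measurement described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, target lesions are fixed at baseline using only the size rules given in the text (at least 10 mm and at least twice the section thickness), and the exclusion of cystic, necrotic, or leptomeningeal lesions is left to the reader because it cannot be inferred from a diameter alone.

```python
def is_measurable(longest_diameter_mm: float, slice_thickness_mm: float) -> bool:
    """Measurable if at least 10 mm and at least twice the imaging section thickness."""
    return longest_diameter_mm >= 10.0 and longest_diameter_mm >= 2.0 * slice_thickness_mm


def select_target_lesions(baseline_mm, slice_thickness_mm):
    """Indices of lesions that qualify as measurable on the baseline scan."""
    return [i for i, d in enumerate(baseline_mm) if is_measurable(d, slice_thickness_mm)]


def sum_longest_diameters(diameters_mm, target_indices):
    """Sum the longest diameters of the same target lesions on any scan."""
    return sum(diameters_mm[i] for i in target_indices)


def percent_change(baseline_sum: float, followup_sum: float) -> float:
    if baseline_sum <= 0:
        raise ValueError("baseline sum must be positive")
    return 100.0 * (followup_sum - baseline_sum) / baseline_sum


# Hypothetical longest enhancing diameters (mm) for three lesions, 5 mm sections.
baseline_scan = [32.0, 14.0, 8.0]
followup_scan = [20.0, 9.0, 6.0]

targets = select_target_lesions(baseline_scan, slice_thickness_mm=5.0)
baseline_sum = sum_longest_diameters(baseline_scan, targets)
followup_sum = sum_longest_diameters(followup_scan, targets)
change = percent_change(baseline_sum, followup_sum)
print(targets, baseline_sum, followup_sum, round(change, 1))
```

Fixing the target lesions at baseline keeps a lesion in the sum even after it shrinks below 10 mm, so that a good response is not lost from the measurement.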
The final MRI method is computer-aided volumetric assessment, in which a computer delineates the border between enhancing and non-enhancing areas over every axial section with enhancement. Border review is conducted by a neuroradiologist to calculate the total enhancing volume. Volumetric measurements can better succeed in controlling measurement variation, and may be more sensitive to early progression and response rates . Nevertheless, defining the borders for measurement is problematic in this heterogeneous tumor . Although there are theoretical advantages, volumetric measurements are not recommended because they are not standardized and the technology is not yet widespread .
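The arithmetic behind a volumetric measurement can be sketched in a few lines. The approach shown here, summing the delineated enhancing area on each axial section and multiplying by the section spacing, is an assumption about how the total volume is obtained from the reviewed borders; the function name, the optional gap parameter, and the example areas are hypothetical.

```python
def enhancing_volume_cm3(areas_mm2, slice_thickness_mm, slice_gap_mm=0.0):
    """Approximate enhancing volume from per-section enhancing areas.

    Each section is treated as a slab whose height is the section thickness
    plus any gap between sections (an assumption of this sketch).
    """
    if slice_thickness_mm <= 0:
        raise ValueError("slice thickness must be positive")
    spacing_mm = slice_thickness_mm + slice_gap_mm
    volume_mm3 = sum(areas_mm2) * spacing_mm
    return volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


# Hypothetical enhancing areas (mm^2) on five consecutive 5 mm axial sections.
section_areas = [120.0, 340.0, 410.0, 280.0, 90.0]
volume = enhancing_volume_cm3(section_areas, slice_thickness_mm=5.0)
print(f"{volume:.1f} cm^3")
```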
Newer imaging modalities are a promising solution for imaging confusion. Spectroscopy uses metabolites to better show events in the non-enhancing tumor regions and diffusion imaging uses water diffusion into the tumor to visualize a treatment’s efficacy and allow better imaging of the non-enhancing tumor. This helps distinguish between ischemia effects and recurrence. Perfusion imaging is sensitive to blood flow and permeability and therefore an ideal measure for anti-angiogenic therapy. PET scans can help distinguish recurrence versus necrosis post-radiation therapy [30, 36, 41]. T2-weighted or fluid attenuated inversion recovery (FLAIR) MRI imaging can give insight into non-enhancing tumor and the infiltrative tumor burden. These technologies have been used in some studies, although they remain to be standardized [3, 35].
In summary, tumor response on MRI is not infallibly reliable or objective, but PFS is still a common end-point. OS is another often used end-point, and does not rely on radiographic imaging. OS and PFS are explained below.
Overall survival: Time from diagnosis or treatment initiation to death, from tumor-related or unrelated causes, is qualified as OS. Because it does not rely upon imaging, OS is a more objective measure. Another advantage over imaging is the ability to compare to historical controls if a comparator trial arm is not present. However, caution is warranted in comparisons to historical controls due to variable start times, eligibility criteria, or less than ideal radiological review . The FDA prefers OS as a gold standard end-point; OS was used in approval for gliadel wafers in recurrent GBM in 1996, gliadel wafers for first line therapy in 2003, TMZ for first line therapy in 2005, and bevacizumab for recurrent GBM in 2009. This is likely a result of the FDA’s wariness of imaging subjectivity . However, OS exhibits disadvantages in its inability to account for post-trial life-prolonging therapy, which is far from standardized. End of life care can range widely from antibiotics to aggressive steroid usage [34, 36]. OS also takes significantly longer in trials due to extended follow-up. However, including built in interim analysis such as OS at 12 months can lessen the time needed before analysis. Reduction in time required before measurement is the primary reason PFS is often utilized as an alternative end-point.
Progression Free Survival: PFS is the time between tumor stabilization, after surgery or RT/TMZ, and subsequent radiographic tumor progression. Survival and PFS are well correlated, despite inter-patient discrepancies in KPS, age, or prior chemotherapy treatment [29, 30]. PFS is often used as a surrogate end-point for clinical benefit, especially in trials for brain tumors, but it does require special consideration. PFS is optimally assessed through a blinded independent radiology review board, and a control arm has to be included for efficacy comparisons. The FDA is wary of PFS because of its reliance on subjective radiographic imaging. It is unlikely that PFS would provide adequate evidence for AA if a drug were on the market that had already shown significant survival benefit. The advantages of PFS are that it is free of corruption by life-prolonging treatments and that it draws on more frequent imaging measurements than OS. PFS also yields increases in statistical power, facilitating trials involving fewer patients and less time. The standardization of PFS has been improved by setting specified time points for imaging in order to negate time dependent assessment bias. Using 6 month PFS (6moPFS) as the determined end-point is a common choice, as it allows imaging to take place over an adequate amount of time, and this longer time point may be better linked to OS. If 6moPFS were to be substituted for OS as an endpoint in FDA AA decisions, a randomized trial would go from requiring 134 patients and 3.5 years to needing 134 patients and 1.5 years. However, the FDA remains reluctant to accept PFS as a reliable measure.
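A fixed-time-point end-point such as 6moPFS is commonly estimated with the Kaplan-Meier method, which accounts for patients who are censored before progressing. The text does not state which estimator any given trial used, so the following is only a sketch of that standard approach; the function names and the patient data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one event.

    times  -- months from baseline to progression/death or to censoring
    events -- True if progression/death was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def survival_at(curve, t):
    """Estimated probability of being progression free at time t."""
    probability = 1.0
    for time, survival in curve:
        if time > t:
            break
        probability = survival
    return probability


# Hypothetical cohort of eight patients (months, progression observed?).
months = [2.0, 3.5, 4.0, 5.0, 6.5, 8.0, 9.0, 12.0]
progressed = [True, True, False, True, True, False, True, False]

curve = kaplan_meier(months, progressed)
pfs_6mo = survival_at(curve, 6.0)
print(f"6moPFS estimate: {pfs_6mo:.1%}")
```

The patient censored at 4 months leaves the risk set without counting as a progression, which is why the estimate differs from the simple fraction of patients still progression free at 6 months.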
Secondary end-points: There also exist other relevant end-point analyses besides tumor response and tumor control in brain tumor clinical trials. Secondary end-points like quality of life (QOL) or neurocognitive function are significant because radiographic imaging changes do not necessarily give clear insight into the effect of tumor burden or the drug on the patient’s everyday life. Clinical benefit goes beyond response and survival to encompass improvement of disease-related symptoms and/or QOL. Traditionally, KPS and QOL questionnaires are used to answer questions about a treatment’s impact on the patient. KPS, which is frequently used in clinical trials, measures the patient’s activity level and medical care requirements on a scale from 0-100. KPS is more related to prognosis and overall physical status than to QOL. The ECOG score, also known as the WHO performance score, can also measure performance status. It is a five-point scale derived from KPS, but it is slightly simpler [44, 45].
The FDA and the North American Brain Tumor Coalition (NABTC) have declared QOL a secondary end-point of interest, but it is notoriously resistant to standardized measurement, and it is therefore challenging to validate QOL data to support drug approval [29, 30, 43]. A main problem in measuring QOL is that patients often do not fill out all questionnaires over time, due to symptomatic development, loss of motivation, or doctors and nurses failing to administer or explain questionnaires, so data remain incomplete. There are different questionnaires currently available to measure QOL, but none has become the standard. The Mini-Mental State Examination (MMSE) is a questionnaire intended to give insight into QOL, but it is insensitive to mild cognitive impairments and is more appropriate for detecting dementia. QOL questionnaires have been developed by the Functional Assessment of Cancer Therapy (FACT) and the European Organization for the Research and Treatment of Cancer (EORTC) that incorporate information given by the patient and their proxy, although proxies often cannot accurately recapitulate the necessary data and may exaggerate health-related QOL (HR-QOL) issues while underestimating psychological problems [34, 46, 47]. The EORTC’s QLQ-C30 is a 30 item functional and symptomatic assessment for HR-QOL. A measurement specific for brain cancer developed by this group, the EORTC QLQ-BN20, has 20 items and assesses visual disorder, motor dysfunction, disease symptoms, treatment toxicity, and future uncertainty. The FACT-BR questionnaire is specific to brain cancer, examines physical, social, emotional, and functional measures, and is more psychologically than symptomatically focused. Even if QOL cannot yet be used to support drug approval, it is an important measure in clinical trials of how a patient is coping with treatment, and a worsening QOL needs to be monitored and corrected.
4.3. Future of brain tumor clinical trials
The end-points in clinical trials for targeted therapies need to be re-evaluated. Targeted therapies frequently work in a cytostatic fashion, and judging their efficacy on imaging is often inappropriate. Because the biology of GBM is unimaginably complex, it is not likely that a single molecular agent will produce dramatic results on imaging. Instead, taking stock of biological markers before and after treatment will allow a better glimpse at if and how the drug is working. Other cancers have made use of biological markers, but this trend has not yet taken root in brain tumor clinical trials. The relationship between biological molecular changes and tumor response and clinical benefit requires comprehension before markers can be useful. Grasping how molecular markers respond to a treatment will help explain the ideal context for a drug. For example, identifying where it affects a biological pathway can shed light on combination therapy options. Preclinical evaluations in animal models will help define when molecular marker changes are significant in relation to survival or tumor burden. One biological marker in gliomas is phosphorylation of a target receptor in association with tyrosine kinase inhibitor drugs . This measure is direct evaluation of the target, rather than upstream or downstream molecules, which provides a clear picture of the drug’s effect on the intended target. However, this does not take into consideration the target as part of a larger biological environment.
An ideal picture of biological events can be gleaned through analysis of surgical tumor samples. The most powerful scenario for efficacy appraisal using biomarkers is to administer the investigational drug before acquiring a tumor sample through biopsy or craniotomy, then give more investigational drug, and finally take another surgical tissue specimen after a specified amount of time has passed. In this model, the changes in biological pathways in direct relation to a drug can be carefully and thoroughly studied, but there are many disadvantages as well. The major obstacle is that not all surgeries are therapeutically necessary. Craniotomies are expensive and risky, while less invasive biopsies are a poor substitute as they may not be representative of the tumor [29, 31].
Moreover, some investigational drugs, namely anti-angiogenic agents, preclude surgery because they interfere with blood clotting and wound healing and can complicate anesthesia. Immunotherapeutic drugs might also block necessary immune responses in unexpected ways after surgery [31, 34]. When the drug-surgery cycle is feasible, training must be in place for surgeons. In surgery, circumferential dissection yields more viable tissue than inside-out methods; thus, standardization of how tissue is acquired is a key control step for clinical trial quality. In anticipation of this issue, surgeons who work with the NABTC have been taught the expected tissue acquisition methods.
Because the drug-surgery model is not always an option for patients, non-invasive marker development will be essential for these cases. Indeed, elucidating appropriate biomarkers and avoiding unnecessary surgery would be preferred in every possible situation. Non-invasive biomarkers were explored in clinical trials of the GBM drug cediranib. The trial utilized non-invasive biomarkers, including circulating progenitor cells (CPCs), circulating endothelial cells (CECs), and amounts of pro-angiogenic proteins in the plasma. These biomarkers were coupled with MRI for perfusion, permeability, and vessel size. Briefly, the study, which will be discussed more below, discovered that the biomarkers were useful for predicting certain tumor behaviors. Furthermore, it was revealed that the drug had a short-lived response, something that may have gone undetected without molecular markers. Fully comprehending the drug’s effect is crucial, because subtle windows of efficacy such as this should not be missed or the drug will be mistakenly shelved as ineffective.
Incorporating molecular markers would merit re-vamping of phase 0, I, and II trials, hopefully into a more streamlined process. Trials that use molecular markers would preferably switch from calculating the MTD to calculating the appropriate biological dose (ABD). As stated previously, the MTD works well for cytotoxic agents, where a higher dose equals higher efficacy. However, the ABD can occur well below the MTD, and is a function of affecting the target as desired.
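The difference between the two dose definitions can be illustrated with a toy dose-escalation table. Everything in this sketch is hypothetical: the dose levels, the toxicity and target-inhibition figures, the one-third toxicity limit, and the 80% inhibition threshold are assumptions chosen only to show that a dose selected on target effect can sit well below a dose selected on tolerability.

```python
def maximum_tolerated_dose(levels, dlt_limit=1 / 3):
    """Highest dose reached before the dose-limiting toxicity rate hits the limit.

    levels -- iterable of (dose_mg, dlt_rate, target_inhibition) tuples
    """
    mtd = None
    for dose, dlt_rate, _ in sorted(levels):
        if dlt_rate >= dlt_limit:
            break
        mtd = dose
    return mtd


def appropriate_biological_dose(levels, min_inhibition=0.80):
    """Lowest dose at which the measured target effect reaches the desired level."""
    for dose, _, inhibition in sorted(levels):
        if inhibition >= min_inhibition:
            return dose
    return None


# Hypothetical escalation data: (dose in mg, DLT rate, fraction of target inhibited).
dose_levels = [
    (10, 0.00, 0.35),
    (20, 0.00, 0.82),
    (40, 0.17, 0.90),
    (80, 0.17, 0.93),
    (160, 0.50, 0.95),
]

mtd = maximum_tolerated_dose(dose_levels)
abd = appropriate_biological_dose(dose_levels)
print(f"MTD: {mtd} mg, ABD: {abd} mg")
```

In this made-up example the target is already largely inhibited at a quarter of the tolerability-defined dose, and further escalation adds toxicity with little additional target effect.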
Phase 0 trials traditionally assess pharmacokinetic and pharmacodynamic properties in a small group of patients, followed by phase I revealing the MTD. An accelerated combined phase 0/I trial of a targeted molecular agent could simultaneously define ABD and gain a preliminary understanding of the drug’s action on its target. Actually binding the target with the drug would be a requirement for entering a phase II trial. The phase 0/I design is not meant to replace the dose escalation assessment in phase I or the efficacy analysis in phase II, but it would facilitate an efficient use of time and financial resources by combining dosing with initial target data . Using molecular markers in a phase II trial to show that a molecular agent affects, not just binds the target as intended, would be crucial. Imaging and clinical data would remain helpful as an indirect end-point, especially in combination treatment trials; yet it is vital to keep in mind that a monotherapy clinical trial is unlikely to produce an obvious clinical benefit . Currently, more phase II trials are arising that have surgical arms to facilitate analysis of molecular markers . A phase II trial could go beyond its role for efficacy by identifying a relevant target population. This way, enrolling patients in clinical trials, who are not expected to benefit from it, could be avoided .
Comprehending if and how the drug works on the target will produce a knowledgeable foundation for future drug combinations, which is probably the most promising method for GBM treatment . Although a drug may not produce obvious alterations to a clinical or radiographic end-point as monotherapy, its effectiveness could be augmented by adjuvant administration of other carefully selected drugs that simultaneously target relevant connected pathways. Tailoring combinations of targeted molecular agents based on the individual’s molecular profile is a future possibility, and trials may come to be designed for patients with different markers .
Ahead of many proposed clinical trial design changes, targeted molecular agents have begun to infiltrate clinical trials for GBM, some of which will be explored below. Certain clinical trial changes have taken place, but they are not uniform across the board. In addition to targeted therapies, immunotherapies and other therapies will be discussed, as they are prevalent in the clinical trial landscape and encompass similar trial designs and possible changes, such as use of biomarkers, as targeted molecular agents. The clinical trials discussed are intended to serve as practical examples of the changes, as well as established methods, that are actually taking place in trials, rather than theoretical propositions. Two important points of focus will be end-point analysis and choice of control in the clinical trials. The clinical trials examined below have either been recently completed, or are ongoing. They are organized into categories by the type of interventional agent, namely: anti-angiogenic/receptor tyrosine kinase (RTK) inhibitors, immunotherapies, and other therapy.
5. Discussion— Current clinical trials
5.1. Receptor tyrosine kinase inhibitor agents
RTKs are an enticing target in GBM because the RTK/phosphoinositide 3-kinase (PI3K)/protein kinase B (AKT) pathway has mutations in over 80% of tumors. RTKs, such as EGFR, bind and activate PI3Ks, which in turn activate AKT, which has many downstream targets and inhibits apoptosis while stimulating growth and proliferation. One important downstream target of AKT is hypoxia-inducible factor-1α (HIF-1α), a key player in angiogenesis. EGFR is especially ubiquitous, with a 40% alteration rate through mutation, rearrangement, or amplification [49, 52]. PI3K is not the only mediator in functional RTK signaling; in fact, in vivo activation of various RTKs prevents GBM from being dependent on any single RTK for important downstream signaling. GBMs are so variable that they can be composed of cells with different driver genes of RTKs, such as EGFR and PDGF, exhibiting various activation statuses from cell to cell. Therapeutically, a silver lining is that these coexisting subpopulations have shared early genetic mutations, suggesting the existence of a possibly targetable precursor cell. RTK therapy has given sub-par evidence of efficacy in trials thus far. Clinical trial results may improve if targeted molecular therapy addresses the presence of the target and the circuitry in which it occurs, the appropriate type of trial evaluation, the fact that GBMs are complex networks with cross-talk and feedback loops, and the tendency of cancer cells to rapidly mutate due to genomic instability. Each of these elements is likely to impact the effectiveness of any targeted therapy. A lesson should be learned from the failure of HIV monotherapy and the success of combination therapy as an example of the approach needed in quickly mutating diseases.
This begs the question of why RTKs are evaluated as monotherapy in clinical trials and expected to produce clinical benefit when the efficacy of a single agent is highly unlikely due to ubiquitous presence of different activated RTKs in GBM.
Erraticism in RTK profiles from patient to patient and even within one tumor is precisely why combination therapy for RTK inhibitors would be necessary to demonstrate therapeutic success. Combinations might even be optimized if they are based on the individual’s specific RTK activation profile . This will not be an easy undertaking, as the experience with some drugs’ efficacy, namely cediranib, not correlating with the target expression indicates. Targets are not biologically isolated, and their expression does not guarantee a targeted treatment will have results. Combination therapy comes with its own issues in teasing out toxicities and efficacy of different agents.
Ensuring that the RTK inhibitors, especially anti-angiogenic therapies, actually impact the tumor is paramount to understanding their influence on efficacy. Imaging issues have already been discussed; anti-angiogenic drugs can cause convoluted imaging read-outs with transient improvement in edema and mass effect. As such, biomarkers will be vital in clinical trials, as well as other imaging options, such as T2 and FLAIR MRI or PET with amino acid tracers [36, 37]. Finally, the side effects of RTKs are not negligible, and may affect future treatments, such as surgery due to inhibited wound healing . Selecting the patient population that is most likely to benefit is essential for reducing unnecessary exposures to these drugs.
Bevacizumab is a humanized monoclonal antibody that inhibits vascular endothelial growth factor A (VEGF-A), the ligand of the VEGF receptor, a receptor tyrosine kinase associated with angiogenesis. Angiogenesis is vital to embryogenesis in order to develop the vascular system from endothelial cells. When a tumor becomes too large (greater than 1-2mm) to rely on the host vasculature, it may transition from avascular to vascular, prompted by hypoxia-induced activation of pro-angiogenic factors such as VEGF and transforming growth factor beta (TGF-β). VEGF is an important inducer of angiogenesis, and is over-expressed in GBM, as well as in many other cancers. Bevacizumab was granted FDA accelerated approval for the treatment of recurrent GBM on May 5th, 2009, based on the evidence for objective tumor response provided by two phase II studies. The progress of bevacizumab through clinical trials is discussed below.
The first of these phase II trials, AVF3708, was designed to study GBM in its first or second recurrence and give evidence for AA. The trial had two arms, bevacizumab as monotherapy and bevacizumab with irinotecan, and was historically controlled. Eligible patients may have had prior surgery or biopsy and prior RT/TMZ, and had demonstrable radiographic evidence of progression after therapy, stable or decreasing corticosteroid use 5 days before the baseline MRI, and a KPS ≥ 70. The median age of the bevacizumab arm was 54 years, and KPS was stratified in groups of 70-80 and 90-100. There were 85 patients in the bevacizumab monotherapy arm; the irinotecan and bevacizumab arm was not analyzed for approval because the FDA stated that the effects of bevacizumab and irinotecan were inseparable. 6moPFS was the designated primary trial end-point. However, the primary end-point per the FDA was objective response rate, as 6moPFS needs a comparative control to show results. An independent review committee determined the objective response rate according to Macdonald response criteria, and the MRI readings were confirmed 4 weeks later. Secondary end-points were the duration of response and safety. Historical control data were taken from an analysis by Wong et al. of eight phase II trials at the M.D. Anderson Cancer Center, which included 225 GBM patients. In these studies, both CT and MRI imaging were used, scans were performed every 2 months, and criteria similar to Macdonald were used. In the AVF3708 trial, MRI imaging was conducted every six weeks with Macdonald criteria. For the bevacizumab monotherapy arm, the historical control values were assumed to be 5% for objective response rate and 15% for 6moPFS. A significant 28.2% objective response rate, with 1 CR and 23 PR and a response duration of 5.6 months, was reported. The FDA reported a 25.9% response rate with 4.5 months duration.
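The reported response rate can be checked directly from the figures above, since the objective response rate is the number of complete plus partial responders divided by the number of patients in the arm. In the short sketch below, the function name is hypothetical, and the responder count behind the FDA figure is a back-calculation from the quoted percentage, not a number given in the text.

```python
def response_rate(n_responders: int, n_patients: int) -> float:
    """Objective response rate as a percentage of treated patients."""
    if n_patients <= 0:
        raise ValueError("number of patients must be positive")
    return 100.0 * n_responders / n_patients


n_patients = 85                 # bevacizumab monotherapy arm
sponsor_responders = 1 + 23     # 1 CR + 23 PR as reported
sponsor_orr = response_rate(sponsor_responders, n_patients)

# Assumption: responder count implied by the FDA's 25.9%, obtained by back-calculation.
fda_implied_responders = round(25.9 / 100.0 * n_patients)
fda_orr = response_rate(fda_implied_responders, n_patients)

print(f"Sponsor ORR: {sponsor_orr:.1f}%")
print(f"FDA ORR from {fda_implied_responders} implied responders: {fda_orr:.1f}%")
```

On this arithmetic, the gap between the two reported rates corresponds to a difference of about two patients in who was counted as a responder.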
The discordance rate for objective response rate assessments between two assigned, independent review radiologists was 47.1% and a third radiologist failed to agree with the first two 14.3% of the time. The high frequency of disagreement highlights exactly why the FDA is hesitant about MRI usage in clinical trials, especially with anti-angiogenic therapies that are known to change MRI readings, as stated previously . The 6moPFS reported by the trial sponsor (Genentech) was significant at 43.6%, the 6moPFS reported by the FDA was significant at 36%, and as reported by the investigator was significant at 42.6%. The discrepancy was purportedly due to the fact that Genentech used a 5.52 month cut-off, while the FDA adhered to 6 months. Although historical controls are not ideal due to possible population differences, tumor assessment frequency and criteria, along with type of follow-up, Genentech proposed that such a large change in 6moPFS is a reasonable predictor of clinical benefit.
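A discordance rate of this kind is simply the share of cases in which two readers assign different response categories. The sketch below shows the calculation on made-up reads; the function name and the example categories are hypothetical and are not data from the trial.

```python
def discordance_rate(reads_a, reads_b):
    """Percentage of cases where two reviewers assigned different categories."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both reviewers must read the same non-empty set of cases")
    disagreements = sum(a != b for a, b in zip(reads_a, reads_b))
    return 100.0 * disagreements / len(reads_a)


# Hypothetical response categories from two independent radiologists.
reader_1 = ["PR", "SD", "PD", "PR", "SD", "SD", "PD", "CR"]
reader_2 = ["PR", "PR", "PD", "SD", "SD", "PD", "PD", "PR"]

rate = discordance_rate(reader_1, reader_2)
print(f"Discordance: {rate:.1f}%")
```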
AVF3708 suffered from lack of clarity over the end-point of choice, since the FDA disagreed with Genentech. The study also relied upon historical controls, which are never optimal, and due to poor planning was unable to use data from the bevacizumab and irinotecan arm. Finally, although the study used Macdonald criteria, the objective response rate was obviously a point of contention with disagreement between reviewers. Despite these issues, this data was used to support AA.
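The response-rate arithmetic reported for AVF3708 can be checked with a short calculation. The sketch below is illustrative only. It assumes 24 responders (1 CR plus 23 PR) among the 85 monotherapy patients and compares that count with the assumed 5% historical response rate using a one-sided exact binomial test. The function names and the choice of test are assumptions made here for illustration, not the statistical method used by the sponsor or the FDA.

```python
from math import comb


def objective_response_rate(responders: int, n: int) -> float:
    """Fraction of patients with a complete or partial response."""
    return responders / n


def binomial_upper_tail(responders: int, n: int, p0: float) -> float:
    """One-sided exact binomial p-value: P(X >= responders) if the true rate is p0."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(responders, n + 1)
    )


# Figures taken from the text: 1 CR + 23 PR among 85 monotherapy patients,
# compared with the assumed historical response rate of 5%.
responders = 1 + 23
n_patients = 85
historical_rate = 0.05

orr = objective_response_rate(responders, n_patients)
p_value = binomial_upper_tail(responders, n_patients, historical_rate)

print(f"ORR = {orr:.1%}")
print(f"one-sided p-value vs. 5% historical rate = {p_value:.2e}")
```

A comparison of this kind inherits every weakness of the historical control itself, so a small p-value here says nothing about population differences, assessment frequency, or reader disagreement.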
5.1.3. NCI 06-C-0064E
A second phase II trial examined the efficacy of bevacizumab in recurrent GBM, as compared to historical controls. It used the same tumor assessment criteria (Macdonald) with an independent review process, but collected MRI every 4 instead of 6 weeks. The median age of the patients (n=56) was 54 years, and KPS was stratified with a group scoring 70-80 and a group scoring 90-100. Patients had previously undergone surgery, radiation, and systemic chemotherapy. Once again, PFS was the determined end-point, but it was agreed that the data on objective response rate and the duration of the response would be used to support bevacizumab approval. A 19.6% response rate with a median duration of 3.9 months was reported. These data supported AA along with AVF3708.
5.1.4. Post-marketing trials
After AA was granted based on these phase II trials, two post-marketing phase III clinical trials, as required by the FDA after AA, were initiated. They were intended to confirm the efficacy of bevacizumab, but this time in the primary tumor setting, and the trials have recently reported results. The first trial, AVF4396g, was sponsored by the drug company Hoffmann-La Roche and was a randomized, double-blind, placebo-controlled study of bevacizumab in combination with RT/TMZ for 921 newly diagnosed GBM patients. The median age in the treatment group was 57 years and in the placebo group it was 56 years. KPS was stratified with 32.6% and 30.3% having a score of 50-80 in the treatment and placebo groups, respectively, and 67.4% and 69.7% having a score of 90-100 in the treatment and placebo groups, respectively. Patients were also organized into recursive partitioning analysis (RPA) classes, with the majority being in class IV. The primary end-points were designated as OS and PFS, although the FDA declared OS as the principal regulatory end-point. Updated Macdonald criteria were used for imaging assessment. Patients were randomized, with stratification by RPA class, to receive RT/TMZ with or without bevacizumab (FDA 2009). The median PFS was significantly higher at 10.6 months in the bevacizumab arm versus 6.2 months in the placebo arm. OS was not significantly different in the experimental versus placebo groups, at 72.4% and 66.3% at 1 year, and 33.9% and 30.1% at 2 years, respectively. No change was observed in QOL or neurocognitive function.
The second phase III trial was sponsored by the National Cancer Institute, and enrolled 978 patients, 637 of whom were eligible to undergo randomization to receive RT/TMZ with bevacizumab or a placebo. The median age was 58 years in the randomized group. There was a KPS of 60-80 in 40% and 39% of the treatment and placebo groups, respectively, and a KPS of 90-100 in 60% and 61% of the treatment and control groups, respectively. The majority of patients had total resection, although roughly a third had partial resection. RPA class stratification was used, with most patients being in class IV. There was no significant effect on OS, 15.7 months in the bevacizumab group and 16.1 months in the placebo arm. OS was similar to the AVF4396g study, but PFS differed, possibly due to a statistical difference in the pre-specified alpha level for progression (p<0.01). PFS was improved at 10.7 versus 7.3 months. This study reported that over time patients experienced worsened quality of life, a greater symptom burden, and a decline in neurocognitive function .
The phase III trials failed to show a direct clinical benefit for survival, which jeopardizes the continued usage of bevacizumab in GBM, at least in the upfront setting. Disturbingly, these two trials presented contrasting data on the impact on the quality of life for GBM patients. This was puzzling as both studies used the EORTC’s QLQ-30 and BN20 questionnaires for quality of life assessment. AVF4396g used updated Macdonald imaging criteria that took into account non-enhancing tumor, so the possibility of missing a progression in disease was reduced, which could have affected QOL . The lessening of edema and subsequent reduced need for corticosteroids is the purported benefit to QOL seen in AVF4396g. Differences in statistics and imaging criteria between the post-marketing trials are not negligible, and point to the challenge of which results are to be most convincing.
Bevacizumab is a germane example of a drug that received AA, only to disappoint in confirmatory trials. If bevacizumab is indeed harmful to the quality of life, as suggested by the NCI’s phase III trial, its presence on the market is concerning, although the differences between the primary and recurrent settings may account for that. It could be speculated that the early studies and AA were flawed in their reliance on imaging-dependent end-points as their initial primary considerations. Objective response and 6moPFS may not accurately predict clinical benefit. The issues with imaging in the trial are substantiated by the almost 50% frequency of disagreement in AVF3708 between radiological reviewers according to the FDA. AA is also based on much smaller trials, whereas the phase III trials combined looked at almost 2,000 patients, so statistical power is greatly improved. It remains to be seen how bevacizumab will be prescribed in the future, now that it has a debatable effect on survival. It has not been removed from the market, and continues to be evaluated in clinical trials.
A receptor tyrosine kinase inhibitor that recently went through clinical trials is cediranib (AZD2171), a pan-VEGFR inhibitor with some activity against the related receptors c-Kit and the platelet-derived growth factor receptors (PDGFRA and PDGFRB). A phase II trial was conducted in 31 recurrent GBM patients receiving cediranib monotherapy, with APF6 as the primary end-point. The median age was 53 years, the median KPS was 90, and patients were not stratified. Secondary end-points were radiographic response, median OS, and toxicity. Various imaging modalities were employed (assessment was changed to include 2D and volumetric measurements) and analysis of plasma and urinary biomarkers was undertaken. In summary, the median OS was 227 days and the APF6 was 25.8%. A radiographic partial response was detected in 17/30 (56.7%) patients with volumetric measurements, but in only 8/30 (27%) using Macdonald criteria instead of volumetric analysis. This is a striking example of the differences that can arise between imaging assessment methods. As compared to historical controls (Wong et al. analysis 1999), the results were deemed encouraging. The baseline levels of different biomarkers could not be correlated with PFS or OS, but were postulated to be helpful for pharmacodynamics. However, some biomarkers did exhibit significant change, independent of PFS or OS, after treatment. A significant correlation was reported between some dynamic biomarkers and radiographic response and survival. This study demonstrated that the majority of patients reduced or stopped their corticosteroid dosages, but this was contingent upon continued treatment with cediranib. The study results were positive overall, and the importance of combined therapy was reiterated due to the belief that cediranib normalizes vessels to make other treatments more effective.
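The gap between the two response rates reported for cediranib is easier to judge alongside the uncertainty that comes with a sample of 30 patients. The sketch below is a minimal illustration, assuming the counts quoted in the text (17/30 by volumetric measurement and 8/30 by Macdonald criteria) and using a Wilson score interval. The interval method and the function name are choices made here for illustration and are not taken from the trial report.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Partial responses among the 30 evaluable patients, as reported in the text.
assessments = {"volumetric": 17, "Macdonald": 8}
n_evaluable = 30

for method, responders in assessments.items():
    low, high = wilson_interval(responders, n_evaluable)
    rate = responders / n_evaluable
    print(f"{method}: {rate:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

Because both assessments were made on the same patients, the two intervals are not a formal paired comparison. The sketch only illustrates how much uncertainty surrounds a response rate estimated from 30 patients.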
A phase III randomized, placebo-controlled, partially blinded study was initiated after the promising phase II results. The phase III trial was a 3 arm study of 325 recurrent GBM patients investigating cediranib monotherapy, cediranib with lomustine (a nitrosourea approved for use in GBM), and lomustine monotherapy. This phase took the important step of moving to combination therapy. The median age in each arm was 54 years. The patients were stratified according to KPS, with 50%, 48%, and 36.2% having a score of 70-80 in each arm, respectively, and 50%, 51.2%, and 62.5% having a KPS of 90-100 in the respective arms. Patients had previously been treated with RT/TMZ. The primary end-point was PFS, assessed by centralized, independent, blinded radiographic review, which used updated Macdonald criteria for T1, T2 and FLAIR MRI. It should be noted that this was a change from the phase II trial, which used volumetrics for imaging analysis and APF6 as the end-point. The soluble biomarkers VEGF, VEGFR2, and bFGF from plasma were measured in relation to baseline levels, but, disappointingly, they were determined to be unrelated to predicting outcome in the trial. The trial did not meet its PFS goal of significance, and did not impact OS. Since 136 patients received salvage therapy of bevacizumab monotherapy or combination therapy, survival could have been biased, yet salvage therapy was established to be similar in the different arms. Perhaps the disappointing results can be partially attributed to changes in imaging analysis methods over the course of the trial phases. There were secondary benefits of lengthened time to neurologic deterioration and reduced steroid dependence. The combination of cediranib and lomustine was not ideal, as cediranib heightened lomustine-associated toxicities, although these were manageable. The idea that cediranib might be better in combination with RT, according to pre-clinical results, was proposed.
Two phase II studies on cediranib and radiation are ongoing.
Immunotherapy has been gaining momentum as the next attractive treatment option for GBM. It promises fewer side effects, and greater efficacy because it uses the host’s own immune defense mechanisms to fight cancer while sparing normal tissue. Important immune cells to be activated are naïve and memory T cells, natural killer cells (NK), and natural killer T cells (NKT). Immunotherapies seem to show good evidence of impacting survival in clinical trials with relatively minimal toxicities .
Although immunotherapy is promising, there are inherent challenges and limitations. The BBB is often discussed as an obstacle to immunotherapy, but its role is a subject of debate, and it may not be as impenetrable as formerly proposed. Antigens from the CNS do end up in cervical and nasal lymph nodes. Also, lymphocytes are capable of crossing the BBB. In fact, activated T cells express certain molecules that enable them to penetrate the BBB. Inflammatory chemokines stimulate lymphocyte arrest and binding of cell adhesion molecules to integrins, which is fundamental to immune cells entering the brain. Furthermore, lower numbers of immune molecules and cells, like CD4 and human leukocyte antigen (HLA) class I, have been correlated with worse prognosis and survival, which would not be expected if the brain were inaccessible to the immune system.
In addition to the BBB, antigen identification is also challenging. While other cancers express tumor-specific antigens for obvious targeting, GBM has very few cancer specific antigens . Even epidermal growth factor variant III (EGFRvIII), which is found solely on GBM cells, is expressed in just 20-30% of tumors. Also, because an antigen is tumor specific does not mean that it is immunogenic . GBM cells do not adequately present antigens, limiting the activation of T cells . A third formidable obstacle to immunotherapy is the profound immunosuppression of GBM patients, both in the tumor microenvironment and systemically. Immune suppression is typically profound and cannot be completely overcome, which is a disadvantage for immunotherapy efficacy but an advantage in avoiding a different risk of immunotherapy, that of accidentally inducing autoimmunity. The occurrence of an autoimmune reaction is a risk because tumor cells are technically host cells, so they have inherent tolerance systems put in place to avoid autoimmunity, and expression patterns similar to host cells .
Therapies that are effective despite immunosuppression issues can run into a major obstacle, which is tumoral immune editing. If a therapy effectively targets certain antigens, the tumor may subsequently change its reliance on these antigens . This was illustrated by the loss of EGFRvIII expression in many recurrent tumors after administration of an EGFRvIII targeted vaccine, rindopepimut. Vaccines using whole tumor lysates or combination therapy are a possible solution to immune editing.
In addition to immune system obstacles, immunotherapy faces the same problems as other therapies with clinical trial design, imaging limitations, and end-point selection. Reliance on pre-clinical animal models is not optimal because the models have limits as surrogates, and translating the therapy’s effects into humans is complex. Moreover, obtaining an immune response does not necessarily bestow improved survival or longer time to progression . Finally, immunotherapy trials are very expensive. The simplest phase III immunotherapy cancer trial with approximately 300 patients, easy immunotherapy manufacturing, and efficient clinical readouts, surgery, and imaging might cost about 20 million dollars. A larger trial that has personalized therapy and is more technology reliant could cost in the hundreds of millions .
Another popular and prevalent vaccine for GBM is Rindopepimut (CDX-110), an EGFRvIII specific 14-mer peptide. EGFRvIII is unique in that it is solely expressed on GBM cells. An in-frame deletion causes fusion of two parts of the molecule, ultimately leading to constitutive activation that is implicated in tumorigenicity, tumor cell migration, and resistance to chemotherapy and radiation .
A phase II clinical trial (ACT III) with intradermally administered rindopepimut plus GM-CSF and maintenance TMZ in 65 newly diagnosed EGFRvIII positive GBM patients reported a PFS of 12.3 months (compared to 6.3 historically) and an OS of 24.6 months (compared to 15.0 historically). Patients had to have undergone gross total resection and traditional TMZ/RT without progression. The median age was 56 years and the median KPS was 90 . The historical control cohort consisted of 17 patients treated at the center at the same time and the populations were matched for positive EGFRvIII status, gross total resection, treatment with RT/TMZ, and no progression for 3 months post treatment . The primary end-point was considered PFS. Methylation of MGMT resulted in a significantly increased PFS. After these positive results, new clinical trials were initiated.
The current investigations of rindopepimut are a phase III trial (ACT IV) for newly diagnosed GBM patients and a phase II trial (ReACT) of recurrent GBM patients. ACT IV aims to include 700 EGFRvIII positive patients who have had gross total or incomplete resection, in an effort to reduce bias towards grossly resected patients, and who have completed standard chemoradiation. Completion is expected in 2016. The trial is a 2 arm, randomized study, comparing rindopepimut to a TMZ control, with OS as the primary measure, and PFS and safety and tolerability as secondary measures. Results have not yet been released (Clinicaltrials.gov, NCT01480479). ReACT is studying the efficacy of rindopepimut in combination with bevacizumab for patients with relapsed EGFRvIII positive GBM. Patients are sorted into two groups: those who have received and are refractory to bevacizumab, and those who have not previously received bevacizumab. Within those two groups, patients will be randomly assigned to receive rindopepimut/GM-CSF or a placebo (KLH) while continuing, re-starting, or starting bevacizumab therapy. Group 1 is bevacizumab naïve and receives bevacizumab plus rindopepimut/GM-CSF. Group 2 is bevacizumab naïve and receives bevacizumab plus a KLH placebo. Group 2C is bevacizumab refractory and receives bevacizumab plus rindopepimut/GM-CSF. The primary end-point for Groups 1 and 2 is 6moPFS, and the secondary end-points are safety and tolerability, antitumor activity, and EGFRvIII specific immune responses. OS is notably absent as an end-point. The primary end-point in Group 2C will be ORR. Imaging analyses are performed using RANO criteria. Patients must have had prior surgery and chemoradiation, be EGFRvIII positive, and be in their first or second relapse of primary GBM or first relapse of secondary GBM. Completion is anticipated for 2014 with 168 patients. Preliminary results were released in November 2013. The OS for Group 1 was 12.0 months versus 7.9 months in Group 2.
The PFS was 3.7 months in Group 1 as compared to 2.0 months in Group 2. Interestingly, the occurrence of tumor response (>50% shrinkage) for Group 1 was 37% per investigator review and 32% per expert panel review. For Group 2 it was 19% per investigator review and 25% per expert panel review. Once again, this demonstrates the subjective nature of imaging analyses. The results also indicate that patients who are EGFRvIII positive do have a poorer prognosis. In Group 2C the OS was 5.6 months and the PFS was 1.9 months. Preliminary ReACT data suggested that the anti-EGFRvIII titer is predictive of patient outcome. The results for ReACT are continuing to be assessed, but show a trend towards efficacy (Clinicaltrials.gov, NCT01498328).
There are limitations in the analysis and use of rindopepimut. An obvious problem with rindopepimut is that only 20-30% of GBMs express EGFRvIII. Furthermore, it has been documented that 82-91% of tumors lose EGFRvIII expression upon recurrence [62, 67]. Importantly, EGFRvIII expression has not been significantly related to survival. Trial design was also not ideal in rindopepimut testing. The ACT III trial did not take into account certain prognostic factors, like mutation of isocitrate dehydrogenase one/two (IDH1/2), which is known to positively affect survival; patients in the trial also had various salvage therapies that may have influenced survival. The historical control population was also much smaller than the experimental group (n=17 vs. n=65). Despite matched eligibility criteria, discrepancies between the rindopepimut study population and the historical population used for comparison in ACT III are possible and would be problematic. Bias in the clinical trials of rindopepimut could also have arisen from the fact that KPS scores were quite high; in ACT III the median KPS was 90. KPS is a key prognostic factor, and such high scores may indicate that patients in these trials had a better prognosis overall. Moreover, patients in these trials underwent gross total resection, which is not representative of the greater GBM patient population, but ACT IV and ReACT do attempt to address this by including patients with less than gross total resection.
5.3. Other therapies
An intriguing non-drug option for GBM patients is the NovoTTF system, approved by the FDA in 2011. NovoTTF is a portable, non-invasive device that is worn almost continuously, with the exception of short daily breaks for personal needs. It emits low intensity, intermediate frequency, alternating electric fields. The electric fields disturb cell division by misaligning the microtubules required for the metaphase to anaphase transition. The fields also cause movement of intracellular macromolecules and organelles in telophase. Ultimately, the cytokinetic division of the cell is impossible after failed segregation and distribution of chromosomes, organelles, and microtubules, and cell death occurs. Dividing cells have unique shapes and electric features, making rapidly multiplying cancer cells sensitive to electric field treatment.
NovoTTF was tested in 237 recurrent GBM patients in a phase III trial comparing NovoTTF monotherapy to any active chemotherapy, as determined by the physician’s judgment. The median age of patients was 54 years and the median KPS was 80. The primary end-point was OS and the secondary end-points were PFS, PFS6, radiological response, 1 year survival, QOL, and safety. A central review board did imaging analysis using the Macdonald criteria. Selected patients must have had radiation, with or without TMZ, and 80% had failed 2 or more chemotherapies and had more than one recurrence. The primary end-point of OS was not significant at 14% versus 9.6% in the active control arm. PFS6 was improved at 21% compared to 15%. The failure to produce improved survival might have been unavoidable in a patient population with more than one recurrence and failure of other treatments; other trials involving recurrent patients often enroll patients on their first recurrence. Despite not reaching survival significance, NovoTTF was approved. An important result contributing to approval was NovoTTF’s effect on QOL, assessed via QLQ-30. Enhanced QOL is attributable to the absence of serious chemotherapy-related toxicities, such as nausea, anemia, fatigue, and serious infections, although some patients treated with NovoTTF did have worsened neurologic events, including headaches and convulsions.
The FDA approval of NovoTTF was surprising in light of the FDA’s hesitance about relying on unstandardized QOL measures. However, the decision to approve NovoTTF can be understood in light of the dearth of treatments for recurrent GBM. Re-resection and re-irradiation are feasible in only a handful of recurrent patients, and chemotherapy has an extremely low response rate of less than 10%. Approval for an intervention that does not improve survival is far more feasible in a disease like GBM that has such a poor prognosis. The vote for approval was very close (7 to 6) and can be classified as a “non-inferiority” approval, meaning NovoTTF is neither better nor worse than other therapies, and it may improve QOL. Here the risk-benefit ratio was tantamount to the device succeeding in clinical trials. It is an example of how skewed the ratio is in a severe diagnosis such as GBM. The benefit did not have to be great because other approved agents have very low benefits in comparison to significant side effects.
The ideal design of clinical trials for GBM patients remains elusive. The conflict between OS, PFS, and ORR as primary end-points continues, with the FDA staying with OS. Radiographic imaging assessment remains an obstacle for PFS and ORR, which is evidenced by disagreement between radiographic reviewers (as seen in studies with bevacizumab). Although imaging criteria are being updated, the most likely progress will be through different imaging methods, like T2 MRI and FLAIR. Alternatively, biomarkers are becoming more important. They are already being used in trials, but there is still a gap in comprehension of how and when they are most useful. In RTK inhibitor clinical trials biomarkers were assessed, yet they did not always have predictive value or an expected relationship to an end-point. However, this does not mean that biomarkers are useless. Connecting tumoral and survival end-points to biomarkers will require further study, and they are indeed continuing to be analyzed in clinical trials.
Many of the investigational agents discussed above undergo changes to their end-points as they advance from phase to phase of clinical testing. Rindopepimut is an example of the myriad end-points that may be used for assessing any one agent. Furthermore, each investigational agent seems to use a different control cohort, and there is considerable variability in the historical cohorts chosen for comparative analyses, which are drawn from several literature sources. Lack of consistency across trials or even phases is detrimental because it creates confusion and complexity in analyzing investigational agents. Also, many drugs in clinical trials end up being disappointing despite promising early phase results. This phenomenon could be attributed to limitations in early phase clinical trials, such as small sample size, absence of a comparator arm, or bias in patient selection. Preventing drugs in early phases that have not demonstrated clear efficacy and safety from entering the market will continue to be an ethical and sometimes legal battle; collecting data is absolutely vital to the welfare of current and future patients and it cannot be compromised.
Another major issue is the tendency to focus on monotherapy (or monotherapy plus RT/TMZ). Monotherapy is helpful in determining how a drug works, but beyond this the therapeutic effects would be greatly augmented by creating relevant combination therapies. GBM evolves too quickly and frequently for any monotherapy to become a “cure”. Moreover, combination therapy in the form of novel drugs, in comparison to a new drug added to the RT/TMZ regimen, is not common and needs more focus. Combining novel drugs is admittedly tricky, yet it holds promise.
Clinical trials are the avenue for progress in GBM. The obstacles that inhibit the success of clinical trials are formidable, but not unsolvable. Greater patient participation, better standardization across and within clinical trials, and innovative end-points and combination therapy are the way forward.
The authors would like to thank the University of Colorado Department of Neurosurgery, The University of Colorado Cancer Center, the Cancer League of Colorado, and the National Institutes of Health RO1 # 11066485 for support.