Electroretinography (ERG) is an important clinical tool that provides an objective quantitative measure of retinal function. Decreased a- and b-wave amplitudes and prolonged latencies correlate with reductions in retinal function that may be the result of toxicity, ischemic damage, or retinal dystrophy (Fishman et al. 2001, Ophthalmology monographs). Furthermore, since the different components of the ERG waveform correspond to the different layers of the retina, one is able to attribute changes in the ERG to damage to specific retinal layers. These data can be a useful surrogate for retinal health, for example in establishing safety profiles for drugs under clinical development.
Since 1989 the International Society for Clinical Electrophysiology of Vision (ISCEV) has provided standards for the recording of ERGs. These documents provide a framework for the clinical electrophysiologist to obtain “standard” ERG recordings (Marmor 1989). The variety of permissible ERG instruments and their individual calibration requirements contributes to significant inter-laboratory variability. This variability is recognized in the ISCEV standards and partly addressed by stating “it is incumbent on the manufacturers and users to verify that full-field stimulation meets the requirements of this standard.” This places the onus for compliance on the manufacturers but leaves the clinical electrophysiologist to determine whether the recording standards are indeed met.
ERG standards have extended beyond the a- and b-wave of the full field flash ERG. The pattern ERG (PERG) is the electroretinal response to a pattern reversing stimulus such as bar gratings or checkerboard pattern. The PERG primarily reflects ganglion cell function and since it is viewed on display monitors it largely represents ganglion cell function within the macula. The peak and trough components of the PERG have been formally defined as the N35, P50 and N95 which represent the polarity (Negativity or Positivity) and the mean latency of occurrence. The ISCEV has produced standards for the recording and reporting of the PERG (Holder et al. 2007).
While the PERG provides a single waveform which represents the electroretinal response of the entire macular region, the clinical multifocal electroretinogram (mERG) provides information about local retinal function. The mERG is typically recorded and displayed as the local retinal responses of 61 or 103 regions within the central 45° of the posterior pole. The responses represent localized cone-driven ERGs obtained in the light-adapted state. While the waveform morphology of the mERG is similar to that of the full-field ERG, the electroretinal generator sites of the trough and peak are not equivalent, so it is incorrect to use the ‘a-wave’ and ‘b-wave’ labels for the mERG. Rather, the major features of the mERG waveform are labelled based on polarity and order of appearance, namely, the N1, P1 and N2 components. Unlike full-field flash ERGs, which can be seen in response to single flashes, the mERGs are highly processed responses obtained to over 16,000 presentations and are subject to more technical artifacts, including eye movement, head tilt, poor refractive capability, and poor fixation. There are ISCEV standards for the recording and reporting of mERG (Hood et al. 2008) that also deal with the identification of artifacts and their solution.
The ISCEV Standards Committee is also reviewing and examining other ERG techniques, parameters and waveforms, such as the On-Off response and the Photopic Negative Response (PhNR), and will be developing guidelines and standards for recording and reporting these electroretinal responses.
The expert consensus panel that authored the 2008 ISCEV standards also recognised that older ERG recording equipment may not comply with some of the current stimulus parameters, including background illumination level and flash stimulus levels, but they expressed their hope that manufacturers would strive to update their protocols and comply with the most recent standard. The instrument manufacturers are indeed making modifications as they bring out new equipment models. The panel stated that since updating non-compliant equipment takes time, publication of data from laboratories that do not fully comply with the current standards is permissible provided they clearly indicate all variances from the ISCEV protocol. This poses a challenge for any organisation attempting to interpret ERG recordings between centers or over time, as recording techniques and standards are continually changing.
Two significant sources of variability are the background luminance and the intensity of the stimulus flash. The most recent ISCEV standards chose to define the stimulus intensity and background luminance as single values rather than acceptable ranges (Marmor et al. 2009, 118:69-77). They also recognize that differences between equipment and calibration fluctuations would cause minor variability in the strength of the stimulus and thus provide for a tolerance of ± 10% as an acceptable amount of fluctuation in the standard.
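As a rough illustration of the ± 10% tolerance described above, a calibration reading can be checked against its nominal value in a few lines. This is a sketch only: the function name is hypothetical, and the nominal values used in the example (a 3.0 cd·s/m² flash and a 30 cd/m² background) are assumptions for illustration that should be replaced with the values in the standard being followed.

```python
def within_tolerance(measured, nominal, tolerance=0.10):
    """Return True if `measured` lies within +/- `tolerance` of `nominal`.

    `tolerance` is a fraction (0.10 means 10%).
    """
    if nominal <= 0:
        raise ValueError("nominal must be positive")
    return abs(measured - nominal) <= tolerance * nominal


# Illustrative readings against assumed nominal values.
flash_ok = within_tolerance(2.8, 3.0)          # flash strength, cd.s/m^2
background_ok = within_tolerance(34.0, 30.0)   # background luminance, cd/m^2

print("flash within tolerance:", flash_ok)
print("background within tolerance:", background_ok)
```

In this example the flash reading passes and the background reading fails, which is the kind of result that would prompt recalibration before further recordings.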
The ISCEV standards provide a good framework for the establishment of an ERG recording system but they are by no means all encompassing. Compliance with ISCEV standards at two different sites does not mean recordings of the same patient at two different time points are directly comparable. This difficulty in obtaining reproducible ERG tests at different centers has hindered the utility of ERG testing in drug development and clinical trials (Chambers 2011).
There are several challenges encountered in comparing and compiling data between different test sites that are important considerations and affect the clinical utility of ERG both for individual patients and in clinical trials. The following sections highlight some of these challenges and offer suggestions to overcome or decrease inter-site and inter-test variability in ERG recordings.
2. Equipment issues
It is the responsibility of the clinical electrophysiologist to ensure that their laboratory complies with the published ISCEV standards. The standard states “it is necessary that all electrophysiologists master the technical requirements of their chosen electrode, to ensure proper impedance, to ensure that waveforms are comparable to standard ERGs, and to define both normal values and variability for their own laboratory.” Essential to the adherence to the ISCEV standards is a thorough understanding of the implications of variability at all points in the stimulus – recording loop along with implications of fluctuations in the environment. In this section we review some of the challenges faced in recording ERGs and minimizing inter-site and inter-test variability.
2.1. The light source problem
Differences in standardized luminance pose a major challenge to comparing ERGs between clinical testing centers. The main problem is an inherent property of the most common light source, the xenon flash tube. In xenon flash tubes there is instability produced by arc-wander, which can give rise to random fluctuations of a few percent in light output (Robson 1998). All xenon tubes produce less light as they age, necessitating frequent radiometric calibration. Unfortunately, due to manufacturing tolerances, the working standard of radiometry equipment provides an overall uncertainty of about 10% (Ryer 1997). This measurement error would be compounded between sites, potentially producing a significantly greater amount of inter-site variability. One possible solution to this issue is the recent change to light emitting diodes (LEDs) for the ERG stimulus source.
While LEDs do not have the luminous efficacy of xenon flash tubes, they are still a promising light source for electroretinography since they can be precisely controlled with regard to intensity and flash duration. LEDs show remarkable flash-to-flash stability, with less than 1% variation in light intensity (Robson 2005). In addition, long-term output change is around 1% after 2000 hours of continuous output at an ambient operating temperature of +55°C (Coupland 2004a). In the late 1990s several investigators began using LED stimulator systems for clinical electroretinography in the laboratory and for intra-operative monitoring. In 2000, light-emitting diode flash stimulators became commercially available, the Espion ColorBurst™ and Espion ColorDome™ (Diagnosys LLC, Littleton, MA). These systems have now been employed as the stimulus for ERG recording systems and have significantly reduced inter-test and inter-site variability.
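To make the inter-site effect concrete, suppose two sites calibrate to the same nominal flash but each carries an independent radiometric uncertainty of about 10%. The sketch below computes the worst-case ratio between the true outputs at the two sites. Treating the uncertainty as a symmetric ± bound is a simplifying assumption, and the function name is illustrative.

```python
def worst_case_site_ratio(uncertainty=0.10):
    """Worst-case ratio of true stimulus strength between two sites.

    Each site is assumed to be calibrated to the same nominal value
    within +/- `uncertainty` (a fraction, e.g. 0.10 for 10%).
    """
    if not 0 <= uncertainty < 1:
        raise ValueError("uncertainty must be in [0, 1)")
    # One site reads high, the other reads low.
    return (1 + uncertainty) / (1 - uncertainty)


ratio = worst_case_site_ratio(0.10)
print(f"Worst-case inter-site ratio: {ratio:.2f}")
```

Under these assumptions the two stimuli could differ by roughly 22% in the worst case, even though both sites are nominally calibrated to the same value.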
2.2. Electrode issues
There are many designs of ERG recording electrodes available, including contact lens, gold foil, gold wire, corneal wick, wire loops, microfibers, as well as skin electrodes. The clinical ERG is obtained with an electrode placed at some distance from the neural elements producing the signals of interest. The electrical current originates in the retinal circuitry and spreads around the eyeball and orbit, with both spatial and temporal variations. The ERG signals are conducted from their retinal generator sites through various tissues to the surface electrode. Each electrode type has its own characteristic impedance, recording characteristics, and inherent artefacts (Coupland 2004a).
In 2006 an international survey of ERG electrode use amongst ISCEV members was conducted via email(Coupland 2006a). Members were asked which electrode system they used most often and their second choice of electrode system. Over 80% of respondents used two or more different ERG electrodes in their clinical practice. The majority of respondents (52%) use the contact lens electrode as their first choice for clinical electroretinography. The second and third most popular choices were the DTL fibre electrode (36%) and the lid hook electrode (12%) respectively. There were no respondents who use skin electrodes as their electrode of first choice in all their patients.
Of the 80% of respondents who used a second ERG electrode, lid hook electrodes were chosen by 44%; whereas, 21% of these respondents used DTL fibers or contact lens electrodes as their second choice. A small number of respondents (12%) indicated they used a skin electrode as their second choice; all of these respondents were involved in testing paediatric patients.
Respondents were also asked why they preferred the ERG electrode they were using. Those using contact lens electrodes cited better signal-to-noise ratio, quality and consistency of recorded ERGs; the durability and convenience of the lid speculum were also considered important. Those respondents choosing the lid hook electrode cited better patient acceptance, good ERG recording results, the unaltered optical quality provided by lid hook electrodes and their ease of use. Those respondents expressing preference for DTL fiber electrodes were impressed with patient acceptance, the electrode’s ease of use, the unaltered optical quality provided, the fact that it cannot be blinked out, that the electrode is disposable, and that no sterilization is needed. Skin electrodes were preferred because of ease of use, patient comfort, and the fact that no sterilization was needed.
As illustrated by this study the choice of recording electrode is varied amongst the members of ISCEV. Furthermore, when asked whether clinicians were using the best ERG electrode recording system, 72% of respondents expressed agreement; whereas, 20% of respondents weren’t sure and 8% of respondents felt they were not using the best electrode available. The chief impediments to changing to a better ERG electrode recording system were listed as the cost and time needed to collect new normative data, the time for training staff to use the new electrode effectively, and the inability to find consistent supplies of disposable microfiber electrodes.
The signal-to-noise ratio, impedance and sensitivities for each of these electrode systems differ considerably (Coupland 2006a). As highlighted by this study, it was felt that changing the electrode significantly changed the recordings, to the point where the clinical electrophysiologist felt it was necessary to recreate the normal database. If comparisons are made between recordings from different sites or at different times, it is important that the same electrode system is used to allow these comparisons to be valid. This difference is amplified between sites where not only the electrodes but the entire recording systems are different.
2.3. Calibration issues
The Calibration Standard Committee of the ISCEV has provided guidelines for the calibration of stimulus and recording parameters used in clinical electrophysiology of vision (Brigell et al. 2003, 107:185-193). The document is concerned with the calibration of the visual stimulus, including protocols for the measurement of luminous flash intensity, mean luminance, contrast and visual angle of pattern stimuli. The photometric measurement of luminance levels used for stimulation of the rods is most accurately specified in scotopic units. Unfortunately, few photometers have scotopic correction filters available and the suggested ISCEV standard is a photopic photometric calibration of the stimulus. This allows only an approximation of the rod flash luminance. In our clinical experience many clinical laboratories do not have appropriate technology or familiarity with both photopic and scotopic photometric calibration. Automated calibration routines can be usefully applied to optoelectronic (i.e. light emitting diode) stimulators, and at least one manufacturer has incorporated internal calibration of their LED stimulator. While the ISCEV standard requires that calibration of the Ganzfeld strobe flash and background luminance be performed at least every 6 months, in reality the frequency of calibration varies widely across different laboratories. Some equipment manufacturers have provided internal reminders to prompt users to perform calibration; this is particularly useful if internal automated calibration systems are provided.
The ISCEV standards also provide guidelines for the calibration of electrophysiological recording systems and include protocols for the measurement of electrode impedance and amplifier calibration(Brigell, Bach, Barber, Moskowitz, and Robson 2003, 107:185-193). The calibration of amplifiers is particularly challenging since most manufacturers do not provide a standard calibrator to pass a known signal through the system to measure the system output. Suitable signal generators must be capable of producing low amplitude output using both sine wave and square wave pulses. The use of both sine wave and square wave calibration signals allows detection of unwanted harmonic distortion and assesses the filtering characteristics of the amplifiers themselves(Brigell, Bach, Barber, Moskowitz, and Robson 2003, 107:185-193). The ISCEV standards committee is presently communicating with equipment manufacturers to develop appropriate standardized calibrators.
With increasing use and popularity of optoelectronic stimulation, internal calibration is essential because the luminous intensity of an LED is temperature dependent. This means that as temperature linearly increases or decreases, the light intensity of an LED exponentially decreases or increases respectively. Red and amber LEDs are more sensitive to temperature effects than blue and green LEDs. Some ERG systems utilizing an LED stimulator (e.g. Diagnosys LLC, Lowell, MA) allow the user to perform internal calibration of the LED system when the system is turned on, or calibration can be invoked every time a new test is selected. Presently, the ISCEV standards do not address calibration of LED systems, but this should be rectified in the next revision of the standards.
The ISCEV standards provide a good basis for creation of an ERG recording system. However, it is incumbent on the electrophysiologist managing the recording laboratory to ensure that the equipment is properly and regularly calibrated and that the recording technique is standardised and conforms to ISCEV standards.
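The temperature dependence described above is often approximated by an exponential model, I(T) = I(T0) · exp(−(T − T0)/Tc), where Tc is a characteristic temperature of the device. The sketch below uses that model to show why a stimulator that is not recalibrated can drift as it warms up. The function name and the Tc values are hypothetical placeholders, chosen only to show that a smaller Tc means greater temperature sensitivity; they are not measured values for any particular LED or ERG system.

```python
import math


def relative_led_output(temp_c, ref_temp_c=25.0, char_temp=100.0):
    """Light output at `temp_c` relative to output at `ref_temp_c`.

    Uses the approximation exp(-(T - T0) / Tc). `char_temp` (Tc) is a
    device-specific constant; the default here is a placeholder.
    """
    if char_temp <= 0:
        raise ValueError("char_temp must be positive")
    return math.exp(-(temp_c - ref_temp_c) / char_temp)


# Hypothetical characteristic temperatures (assumed, not measured).
HYPOTHETICAL_TC = {"more temperature-sensitive LED": 100.0,
                   "less temperature-sensitive LED": 500.0}

for label, tc in HYPOTHETICAL_TC.items():
    rel = relative_led_output(35.0, ref_temp_c=25.0, char_temp=tc)
    print(f"{label}: {100 * (1 - rel):.1f}% drop for a 10 degree C rise")
```

Even with these placeholder numbers, the point stands that the size of the drift depends on the LED type, which is one reason a single fixed correction factor is unlikely to substitute for internal calibration.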
3. Normative data
Dorland’s Medical Dictionary defines Normal as “agreeing with the regular and established type” (Dorland 2003). Typically in medicine this is taken as the mean ± 95% confidence intervals for a given population. Thus an ERG response for a given type of stimulation would be normal if it fell within the 95% confidence intervals for that type of recording. The challenge, however, is defining the 95% confidence intervals for ERG.
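One common way to turn a set of recordings from normal subjects into normative limits is to take the central 95% of the observed values. The sketch below does this with percentiles, which avoids assuming that amplitudes are normally distributed. This is only one interpretation of the definition above, and the function names and sample amplitudes are made up for illustration.

```python
import statistics


def reference_interval_95(values):
    """Return the (2.5th, 97.5th) percentiles of `values`."""
    if len(values) < 2:
        raise ValueError("need at least two recordings")
    # n=40 gives cut points every 2.5%; the first and last are the limits.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def is_within_normal(value, interval):
    """True if `value` falls inside the (low, high) interval."""
    low, high = interval
    return low <= value <= high


# Made-up b-wave amplitudes (microvolts) from a hypothetical normal group.
amplitudes = list(range(100, 501, 10))
interval = reference_interval_95(amplitudes)

print("95% reference interval:", interval)
print("300 uV within normal limits:", is_within_normal(300, interval))
print("95 uV within normal limits:", is_within_normal(95, interval))
```

In practice the interval would be computed separately for each response type and each stratum of the normative database, which is where the sample-size burden discussed in this section comes from.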
The need to establish a new normative database as indicated in Coupland’s 2006 survey of ISCEV members is a major hindrance to changing or updating any electrophysiological system(Coupland 2006a). To date few companies supply a normative database which can negate the need to carry out on site testing to establish site specific guides for normative values. This is contrary to most diagnostic equipment utilized in ophthalmology and is in part due to the inherent variability in recording systems identified above. Population specific factors such as age, race, pupil size, axial length, and diurnal variation, impact the ability to establish a normative database; these issues are addressed in this section.
3.1. Age factors
The effect of age on ERG parameters has been well demonstrated (Birch and Anderson 1992, 110:1571-1576; Peterson 1968:Suppl-77; Weleber 1981, 20:392-399). Birch and Anderson (1992, 110:1571-1576), for instance, described ERGs in 270 normal subjects recorded under the ISCEV standard and ranging in age from newborn to 79 years. The mean log amplitude of both rod and cone responses increased from birth, reaching its maximum at around 25 years, and then showed a progressive decline with age. While the exact factors causing the decrease in ERG amplitude in the elderly are not well understood, it is likely that preretinal media changes, reduction in photopigment optical density with age, or bipolar and/or Müller cell degeneration with aging could account for such age-related amplitude reduction (Birch and Anderson 1992).
Similar changes are observed at the other end of the age spectrum. Both scotopic and photopic ERGs are recordable at birth. There is a rapid increase in b-wave amplitude during the first 4 months in the neonatal period. By 6 months of age, b-wave scotopic sensitivity has become equivalent to that of adults(Fulton 1988, 69:101-109). Development of normal, rod-mediated vision in human infants is primarily based on studies of electroretinal function, and psychophysical adaptation to bleaching lights as well as adaptation to steady background lights. Presently, it is thought that early postnatal development of human scotopic function reflects the reorganization of the inner retinal circuitry(Fulton 1988, 69:101-109).
As with most diagnostic tests in medicine, it is important to compare any ERG recording with age matched normal subjects – but this significantly increases the number of ERG recordings required to establish a normative database. This task becomes even more monumental when one takes into consideration ethnic diversity.
3.2. Ethnic diversity
Coupland et al. (2006b) described an international study of the ERG in normal subjects using the same ERG equipment and protocol at two centers, in the People’s Republic of China and in Canada. It was assumed that any observed differences in ERG parameters would likely reflect true population differences rather than variations resulting from instrumentation or methodology. Two similarly aged populations were recruited and identical recording methodology was employed. Interestingly, there were no significant differences observed in a- and b-wave peak latency (i.e. implicit time) between the two groups; however, there were statistically significant differences in b-wave amplitude for the scotopic rod, scotopic mixed rod-cone, photopic cone and 30 Hz flicker ERGs, with amplitudes being significantly larger in the Canadian subjects. They proposed that the observed b-wave amplitude differences were due to increased axial length in Chinese eyes, as evidenced by increased mean (myopic) refractive error. This study demonstrated a correlation between axial length and b-wave amplitudes and implicit times, suggesting that recordings from individuals with pathological myopia may yield falsely abnormal values if these factors are not reflected in the normative database.
Gender differences in ERG b-wave amplitude have been noted by several investigators (Birch and Anderson 1992, 110:1571-1576; Peterson 1968:Suppl-77). Peterson (1968:Suppl-77) first noted that females had significantly larger b-wave amplitudes at all ages from 10 to 50 years. A similar small but statistically significant effect of gender was also demonstrated by Birch and Anderson (1992, 110:1571-1576), with males showing slightly smaller amplitudes, which they and others thought was likely due to greater axial length in males. Thus comparing a female subject to a normative database of males would not be valid.
3.4. Pupil size
Retinal illumination is proportional to pupil area. Coupland (2004) reported dilated pupil size in 400 consecutive patients between ages 10 and 90 years and found that dilated pupil size varied from 4 to 9.5 mm and was significantly correlated with age (p<0.001) (Figure 1). Since there was significant variation in dilated pupil size with age in the clinical population, the effect of this was studied in a small group of patients in which dilated pupil size could be strictly controlled through a set of custom-made soft contact lenses with artificial pupil sizes of 4 and 6 mm (Figure 2). ERGs obtained to a photopic ISCEV standard flash were measured in subjects through their normal fully dilated pupil and then through a 4 mm and a 6 mm artificial pupil. Variability of dilated pupil size can significantly affect the amplitude and implicit time obtained using the standard flash ERG.
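Because retinal illumination scales with pupil area, and area scales with the square of the diameter, the 4 to 9.5 mm range reported above corresponds to a large range of effective stimulus strength. The sketch below works through the arithmetic using the conventional troland definition (luminance in cd/m² multiplied by pupil area in mm²). The function names and the 30 cd/m² luminance are illustrative assumptions, and the calculation ignores effects such as the Stiles-Crawford effect.

```python
import math


def pupil_area_mm2(diameter_mm):
    """Area of a circular pupil in square millimetres."""
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    return math.pi * (diameter_mm / 2) ** 2


def retinal_illuminance_td(luminance_cd_m2, pupil_diameter_mm):
    """Retinal illuminance in trolands: luminance (cd/m^2) x area (mm^2)."""
    return luminance_cd_m2 * pupil_area_mm2(pupil_diameter_mm)


LUMINANCE = 30.0  # cd/m^2, assumed value for illustration only

for diameter in (4.0, 6.0, 9.5):
    td = retinal_illuminance_td(LUMINANCE, diameter)
    relative = pupil_area_mm2(diameter) / pupil_area_mm2(4.0)
    print(f"{diameter} mm pupil: {td:.0f} Td ({relative:.2f}x the 4 mm pupil)")
```

A 6 mm pupil admits 2.25 times as much light as a 4 mm pupil, and a 9.5 mm pupil about 5.6 times as much, for the same stimulus.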
3.5. Diurnal variation in ERG
Birch et al. (Birch, Berson, and Sandberg 1984, 25:236-238) described the range of diurnal variation for the rod b-wave amplitude to be approximately a 13% reduction occurring 1.5 hours after the onset of daylight. Interestingly, this time period corresponds to the point of maximum rod outer segment disc shedding. ERG b-waves become largest by midday. In a larger study of circadian rhythm in the dark-adapted ERG, Nozaki et al. (Nozaki, Wakakura, and Ishikawa 1983, 27:346-352) reported that a-wave amplitude showed no circadian rhythm. B-wave amplitudes were also noted to be lowest around 6 a.m. and highest at midday. B-wave amplitude showed no direct correlation with corticosteroid levels but did show correlation with dopamine β-hydroxylase. Melatonin level has also been found to affect the ERG (Rufiange et al. 2002). A significant inverse correlation was found between salivary melatonin level and ERG b-wave amplitude; when melatonin levels were highest, the ERG b-wave amplitude was lowest (Rufiange, Dumont, and Lachapelle 2002, 43:2491-2499). These findings suggest that ERG recordings in patients during clinical trials, to monitor patient improvement or progression, are best performed at approximately the same time of day.
3.6. Recording conditions
The effects of environmental conditions on the recording equipment have been outlined above. These conditions also affect the subject and can adversely impact the quality of the recordings obtained. In a study looking at optimization of visual evoked potentials (VEP), Karanjia et al.,(Karanjia, Brunet, and ten Hove 2009, 36:89-92) were able to demonstrate that use of a recumbent position improved the signal to noise ratio for the recordings. Failure to attend to patient positioning can artificially alter the quality and amplitude of all types of electrophysiological recordings and thus must be consistently addressed during the recording of ERGs.
This information demonstrates the need for appropriately selecting subjects for a normative database. Databases which are homogenous for one ethnic group, of a single age group or refractive state may not accurately reflect the normative data range for individual subjects who do not match those demographics. Furthermore, consistency in recording time and technique is essential as variation in pupil size from inadequate dilatation may artificially alter the amplitude of the recording. Thus, it is important that the subject be comfortable and tested under consistent conditions including the time of day.
As illustrated in Coupland’s survey (2006a), the need for site-specific normative databases is a serious impediment to the individual laboratory adopting new recording equipment or techniques. Establishing a new normative database is labour intensive, time-consuming and expensive. The solution, it might seem, would be a cooperative multicenter collaboration for normative ERG data collection. The advantages would include a reduction in cost, resources and testing time for the participating sites. The challenges posed by recording at different sites are addressed in the following section.
4. Multicenter recordings
Conflicting findings between testing sites have been published since the early days of ERG. For example, Sabates et al. (Sabates, Hirose, and McMeel 1983, 101:232-235) looked at the b/a wave ratio as a surrogate of inner retinal function in patients with central retinal vein occlusion (CRVO). This study was based on the understanding that the a-wave represents outer retinal function, and thus the neurons responsible for this component of the ERG would be supported by the choroidal circulation, whereas the b-wave is representative of the neuronal activity of the inner retina, which is supplied by the central retinal artery and thus more likely to be damaged in a CRVO. Sabates found that a b/a ratio of less than one was correlated with a higher likelihood of a patient having neovascularisation of the iris (NVI) as a result of the CRVO.
Sabates’s finding was refuted by Johnson et al. (Johnson et al. 1988, 106:348-352), who found that b/a ratios were greater than unity in 8 out of 9 patients who subsequently developed NVI when a maximal stimulus luminance was utilized. While the results of Johnson and Sabates appear to be at odds with each other, it is important to note that several key differences existed between their recording systems.
Differences in the luminance for both sets of experiments coupled with differences in the type of recording electrode, Ganzfeld stimulator, and recording equipment only compounded any difference in the signals recorded at the two sites. These issues make it virtually impossible to directly compare data collected in the two studies and similar limitations still exist today when trying to compare or compile data from multiple sites.
Over two decades have passed since Sabates’s and Johnson’s studies, and a basic ERG protocol has been standardized for certain responses since 1989 (Marmor 1989, 73:299-302). This ERG standard has since been updated three times, most recently in 2008 (Marmor et al. 2009, 118:69-77), and in theory would allow for comparison of recordings throughout the world. The guidelines include recommendations for commercial recording instrumentation so that the five standard ERG responses can be recorded with numerous commercially available electrodiagnostic systems. Given the number of manufacturers of stimulus and recording equipment and the number of permissible recording techniques, the number of possible combinations, and the resulting variability, remains very large.
The greatest impediment to collaborative multicenter data collection is this inter-site variability in ERG recording parameters. In order to utilize multi-site data collection, the two main sources of inter-site variance (differences in recording methodology and differences in standardized stimulus luminance) would need to be addressed. Inter-site variability resulting from differences in recording methods (e.g. different recording electrodes, filter settings, adaptation time, inter-flash intervals etc.) can be largely controlled through standardization. When a single recording system and protocol is used, a consistent ERG recording can be obtained at different test centres (Figure 3). All fifteen sets of recordings were done on a single subject and show consistent a- and b-wave amplitude and latency. In order for ERGs to be useful in clinical trials, recordings between test centres need to be consistent. Figure 3 clearly illustrates that this is possible provided the appropriate care and protocols, as discussed below, are in place.
4.1. ERG and clinical trials
The strength of ERG recordings as a primary or secondary endpoint lies in the quantitative, objective nature of the recordings. Yet the clinical significance of decreased ERG amplitudes or delays is not always clear in a clinical trial setting.
The US FDA Center for Drug Evaluation and Research is responsible for monitoring the drug development process as well as approving new drug products and monitoring adverse events after approval has been granted. ERG is first used in preclinical studies on drugs that are intended to affect electrophysiology, bind with melanin, or cause retinal lesions (Chambers 2011). In clinical studies ERGs are used as outcome measures to assess therapeutic efficacy as well as to monitor potential retinal toxicity when ERG abnormalities have been demonstrated in animal studies. In human clinical trials ERG is often used as a secondary endpoint, although it has been used as a primary endpoint in a recent safety study (Cordell et al. 2009, 127:367-373). The FDA does not usually set ERG testing standards and generally accepts the ISCEV standards (Chambers 2011). To date, changes greater than 40% in b-wave amplitude from baseline have been accepted as clinically significant (Cordell, Maturi, Costigan, Marmor, Weleber, Coupland, Danis, McGettigan, Antoszyk, Klise, and Sides 2009, 127:367-373).
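The 40% criterion mentioned above can be expressed as a simple percentage change from baseline. The sketch below is illustrative only: the function names are hypothetical, the sample amplitudes are made up, and treating a change in either direction as meeting the criterion is an assumption rather than something stated in the source.

```python
def percent_change(baseline, follow_up):
    """Percentage change of `follow_up` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline amplitude must be positive")
    return 100.0 * (follow_up - baseline) / baseline


def exceeds_threshold(baseline, follow_up, threshold_pct=40.0):
    """True if the absolute change from baseline exceeds `threshold_pct`.

    Using the absolute value (either direction) is an assumption.
    """
    return abs(percent_change(baseline, follow_up)) > threshold_pct


# Made-up b-wave amplitudes in microvolts.
print(percent_change(200.0, 110.0), exceeds_threshold(200.0, 110.0))
print(percent_change(200.0, 150.0), exceeds_threshold(200.0, 150.0))
```

Here a fall from 200 to 110 µV is a 45% reduction and would be flagged, whereas a fall to 150 µV (25%) would not.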
4.2. ERG and multicenter clinical trials
ERG has the potential to provide objective information on retinal function and thus is highly desirable in clinical trials. Differences in recording technique and equipment, however, make interpretation between centers a major challenge to the clinical utility of ERG.
New clinical trials may soon be obligated by the FDA to include ERG as one of the outcome measures if a medication is intended to directly affect the electrophysiology of cells or if there has been demonstrated ERG abnormality in animal studies(Chambers 2011). Given this FDA mandate for ERG testing as part of new drug development the ability to record from subjects at different centers during a multicenter clinical trial is becoming a necessity. To date there is one multicenter clinical trial which utilized ERG as a primary endpoint and it provides an excellent case study on how to address the challenges of multicenter ERG recordings(Cordell, Maturi, Costigan, Marmor, Weleber, Coupland, Danis, McGettigan, Antoszyk, Klise, and Sides 2009, 127:367-373).
4.3. Standardized multicenter clinical trial
Cordell et al. (Cordell, Maturi, Costigan, Marmor, Weleber, Coupland, Danis, McGettigan, Antoszyk, Klise, and Sides 2009, 127:367-373) were mandated by the FDA to examine the potential for retinal toxicity of tadalafil or sildenafil in a post-market Phase IV clinical trial conducted at 15 US clinics using standardized ERG equipment and protocols. ERG was selected as the primary measure of retinal toxicity in this clinical trial as it provided an objective quantitative measure of retinal function. Subjects were recruited at 15 centers within the United States, which required that ERG testing be conducted at all 15 centers. In order to overcome the technical limitations of multicenter recordings, all centers used the same standardized equipment, the same ERG protocol, and a single normative data set with a single website-based ERG reading center. In the process they have established guidelines which address the challenges of multicenter ERG recordings with a variety of technical and logistical solutions for the use of ERG in future clinical trials.
4.3.1. Personnel and training
Consistency of outcome across multiple testing sites can only be ensured through appropriate training and consistent monitoring of ERG outcomes to maintain quality assurance standards and this is one of the roles of the ERG reader. It is critical that all sites submit standardized ERG tests on normal subjects and empirically demonstrate consistency before receiving certification for multicenter trials.
Appropriately trained technologists are critical for multicenter ERG clinical trials. Preferably, all centers should be trained by the same trainer to ensure consistency of technique. An on-site visit during patient testing confirms performance consistency at the individual site. In the Cordell et al. (2009) study, all 15 sites received centralized training on the east and west coasts (Washington DC and Scottsdale, Arizona). In addition, all 15 sites received individualized on-site refresher training within a 4-week period following centralized training. Reproducibility and consistency of ERGs obtained on individual non-clinical trial patients at multiple sites were assessed through a centralized web-based ERG reading center. Through the use of standardised training and protocols, this study was able to demonstrate that ERG is a viable method of compiling objective multicenter data with low levels of inter-site variability.
4.3.2. Role of ERG reader
The ERG reviewer or ERG reader performs two vital roles in clinical trials using ERGs as clinical endpoints. The primary role is to monitor quality and consistency in reproducibility of ERGs submitted from clinical sites. It is essential that ERGs have quality assurance review in order that they are interpreted correctly. Technical artefact such as power mains interference, eye blink and eye movement, high frequency noise, electrical spiking, and other myogenic interference can produce spurious changes in waveform morphology which could be incorrectly interpreted. The secondary role of the ERG reader is to determine if there have been objective quantitative or subjective qualitative changes in ERG findings over time. It is critical that the ERG reader be blind to both patient identification and experimental clinical trial condition. This reduces the degree of bias as does the reliance on statistical criteria for determining ERG change.
When multiple ERG readers are used in large clinical trials a process for conflict resolution is necessary. In order to maintain consistency of quality of ERG interpretation it is essential that all readers review all waveforms or all those waveforms that are deemed to fall outside of normal limits and agree that significant changes in ERG amplitude and timing have occurred. It is essential that criteria for determining ERG change be determined before the clinical trial begins.
Multicenter recording protocols such as the one employed in Cordell et al., (2009) benefited from having a centralised electronic database of all the recordings. This allowed multiple readers to assess the complete set of data and make independent evaluations of the ERGs. Conflicting interpretations were then reassessed as a group using the predetermined criteria established prior to the commencement of recruitment. The acceptance of this clinical trial by the FDA provides tacit approval of the methodology employed in Cordell et al., (2009).
As an objective measure of retinal function, ERG is poised to play a major role in clinical trials. Cordell et al. (2009) provides a framework by which one can run a successful multicenter clinical trial that utilizes ERG as a primary endpoint. Their success was dependent on standardization of not just the recording equipment but also the training and personnel, both technicians and ERG readers, involved in the trial.