FISH Probe Counting in Circulating Tumor Cells

Presence of tumor cells in blood of patients with metastatic carcinomas has been associated with poor progression free and overall survival (Cohen et al., 2008; Cristofanilli et al., 2004; de Bono et al., 2008). Assessment of treatment targets on circulating tumor cells (CTC) before initiation of therapy may provide a means to guide therapy (Attard et al., 2009; de Bono et al., 2007; Hayes et al., 2002; Meng et al., 2006; Meng et al., 2004; Smirnov et al., 2005; Swennenhuis et al., 2009). Characterization of CTC can be performed by Fluorescence In Situ Hybridization (FISH) which has been used to prove that CTC are indeed malignant (Fehm et al., 2002; Swennenhuis, et al., 2009), and that gene amplifications, deletions and translocations related to certain therapies can be detected (Attard, et al., 2009; Meng, et al., 2006; Meng, et al., 2004).

healthy individuals (de Solorzano et al., 1998), cells from amniotic fluid (Lerner et al., 2001), and on tissue (Raimondo et al., 2005). An excellent overview of methods is available (Restif, 2006). However, to our knowledge, automated dot counting has never been applied to samples containing CTC. The nuclei and dots of these cells are extremely heterogeneous in shape and intensity, and therefore difficult to score, even by reviewers. Therefore, we investigated the error in counting FISH dots, and evaluated different methods to count the FISH dots by a computer algorithm.

Patient samples
A prospective multicenter clinical trial that evaluated the utility of counting CTC for predicting response to therapy, progression-free survival, and overall survival in metastatic castration-resistant prostate cancer patients was conducted (de Bono, et al., 2008). A total of 65 clinical centers throughout the United States and Europe participated in this study after formal institutional review board approval. All patients were required to provide written informed consent. Blood was collected before starting a new treatment and at monthly intervals prior to the next cycle of therapy.

Data acquisition for CTC enumeration
The CellTracks Magnest containing the cartridge is placed on the CellTracks Analyzer II, a semi-automated fluorescence-based microscopy system that acquires images using a 10X/0.45NA objective with filters for DAPI, FITC, PE, and APC to cover the complete surface area of the cartridge. The CellSearch software identifies objects staining with DAPI and PE in the same location and generates images for the DAPI, FITC, PE, and APC filters. From the gallery of objects, a reviewer selects the CTC, defined as nucleated DAPI+ cells larger than 4 µm that lack CD45-APC and express CK-PE; the selected cells are tabulated by the computer. Figure 1 shows an overview of the image acquisition and identification of CTC. After a scan, the cartridges were stored at room temperature until the reviewer had finished reviewing the images. Accuracy, precision, linearity, and reproducibility of the CellSearch system have been described elsewhere.
Fig. 1 (caption, partially recovered). 3) The cartridge is inserted into a Magnest to distribute the labeled cells over the analysis surface. 4) The cartridge is scanned at 10X in the CellTracks Analyzer. 5) CK+ DAPI+ objects are presented to the reviewer for CTC selection. 6) Coordinates of selected CTC are collected. 7) The coordinates and images from the scan are saved on a CD or DVD.

Samples used for FISH counting algorithm development
For algorithm development, images from cells of a patient in which no CTC were detected were used. Leukocytes which were non-specifically carried over during the CTC enrichment procedure were used as targets for this purpose, as in these cells almost no chromosomal aberrations are present and each cell should therefore contain two copies of each chromosome and gene region. The sample was labeled with a centromere specific FISH probe for chromosome 17 and a probe identifying the HER2 gene region. The centromere probe is larger than the HER2 probe, thus a difference in automated count could be expected. For the validation of the algorithm, CTC from 47 patients with hormone refractory metastatic prostate cancer labeled with probes identifying the centromeres of chromosome 1, 7, 8 and 17 were used (Swennenhuis, et al., 2009).

Sample preparation for FISH probes on CTC
Cartridges containing CTC were used for the FISH procedure. To preserve the location of the CTC for future interrogation, the buffer inside the cartridge was carefully aspirated to avoid cell movement and replaced with methanol acetic acid. After fixation the cartridges were dried using a forced air flow and processed for FISH or stored at -20 °C for later use. FISH probes specific for the centromeric regions of chromosomes 1, 7, 8, and 17, labeled with PlatinumBright-647, -550, -505, and -415, respectively, were used in this study (Kreatech, Amsterdam, The Netherlands). The probe mixture consisted of 50 µL of hybridization buffer (50% formamide / 1 x SSC / 10% dextran sulfate) containing FISH probes against chromosomes 1, 7, 8, and 17 at 2 ng/µL each. The cartridges were placed on an 80 °C hotplate for 2 min, with the glass facing towards the hotplate, and hybridized at 42 °C for 16 h. After hybridization the cartridge was washed with PBS containing DAPI as a nuclear counterstain.

Data acquisition for FISH probe detection in CTC
After hybridizing the FISH probes, the samples were scanned on a modified CellTracks Analyzer II. This analyzer is equipped with a 40X/0.63NA objective, to improve the resolution and light collection of the fluorescent FISH dots, and filter cubes to detect DAPI and PlatinumBright-647, -550, -505, and -415. The locations and images of the CTC identified in the initial 10X scan (described in section 2.3) were loaded from a CD. A software program was written to move to the locations of interest and record Z-stacks to capture signals at a range of depths of the objects of interest (Swennenhuis, et al., 2009). The DAPI signals were used to correlate the 40X scan with the 10X scan, thereby verifying whether the CTC location is correct. This was necessary, as the cells could shift up to ~200 µm due to the FISH protocol. The image acquisition procedure for the FISH probe detection is shown in figure 2.
Fig. 2. 1) The cartridge is re-opened to carefully aspirate buffer and fixate the cells. 2) FISH reagents are added and the sample is hybridized for 16 h. 3) Coordinates and images from previously assigned CTC are loaded from a CD. 4) The modified CellTracks records DAPI images at 40X at the designated coordinates and surroundings. 5) Cross correlation is performed between the DAPI images from the 40X and 10X scan to verify the correct location of the CTC. 6) After the right location is found, the FISH Z-stacks are recorded.
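The cross correlation between the two DAPI images can be illustrated with a minimal sketch. It is not the software used in this study: the function name `estimate_shift`, the FFT-based circular correlation, and the assumption that the 10X image has already been resampled to the pixel size of the 40X image are all choices made for illustration only.

```python
import numpy as np

def estimate_shift(reference, moved):
    """Estimate the (row, col) shift that maps `reference` onto `moved`.

    Assumes both images have the same shape and the same pixel size
    (i.e. the 10X image was resampled beforehand; hypothetical step).
    """
    ref = np.asarray(reference, dtype=float)
    mov = np.asarray(moved, dtype=float)
    if ref.shape != mov.shape:
        raise ValueError("images must have the same shape")
    # Remove the mean so that the correlation peak reflects structure, not offset.
    ref = ref - ref.mean()
    mov = mov - mov.mean()
    # Circular cross correlation computed in the Fourier domain.
    corr = np.fft.ifft2(np.conj(np.fft.fft2(ref)) * np.fft.fft2(mov)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks beyond half the image size correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

# Toy example: a random "DAPI" image and a copy displaced by (3, -5) pixels.
rng = np.random.default_rng(0)
dapi_10x = rng.random((64, 64))
dapi_40x = np.roll(dapi_10x, shift=(3, -5), axis=(0, 1))
shift = estimate_shift(dapi_10x, dapi_40x)
```

The recovered shift can then be added to the stored CTC coordinates before the Z-stack is recorded. A circular correlation is used here only because it is compact; for real images a padded or normalized correlation may be preferable.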

Algorithm for counting FISH signals in leukocytes and CTC
Maximum intensity profiles were created of all the Z-stacks to speed up the counting process for human reviewing. The algorithm to identify the nucleus and count the FISH probes within this nucleus consists of five general steps: 1. enhancement and segmentation of the outline of the nucleus; 2. enhancement and segmentation of potential FISH objects; 3. exclusion of objects that are too noisy; 4. measurement of intensity and morphological features of the potential objects; 5. exclusion of objects that do not meet inclusion criteria.
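Creating a maximum intensity profile from a Z-stack can be sketched as follows. This is an illustration assuming the stack is held as a NumPy array of shape (slices, rows, columns); the function name `max_intensity_projection` is hypothetical and not taken from the software used in this study.

```python
import numpy as np

def max_intensity_projection(z_stack):
    """Collapse a Z-stack of shape (slices, rows, cols) into one 2D image
    by keeping, for every pixel, the brightest value over all slices."""
    stack = np.asarray(z_stack)
    if stack.ndim != 3:
        raise ValueError("expected a stack of shape (slices, rows, cols)")
    return stack.max(axis=0)

# Toy stack: two dots that are in focus in different slices.
stack = np.zeros((3, 4, 4), dtype=np.uint16)
stack[0, 1, 1] = 900
stack[2, 2, 3] = 700
projection = max_intensity_projection(stack)
```

Both dots appear in the single projected image, which is why one 2D review per cell suffices instead of a review of every slice.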
Nuclei were located in the DAPI image, which was enhanced using a zero-crossing filter in combination with a gradient magnitude filter (Verbeek & Vanvliet, 1994). Edges were enhanced using a morphological gradient magnitude filter. The image of this filter was multiplied by an image filtered by a Laplace-plus-Dgg filter. This filter created a combination of second order derivatives in x and y, and combined it with an image of the second derivative in the direction of the maximum gradient. The combined image of the two filters was thresholded using a fixed threshold. On the outline of objects larger than 2000 pixels a distance transform was applied: every outline pixel value is replaced by its closest distance to the edge of the outline. This procedure was then followed by a watershed transform, to verify whether the outline contained multiple maxima and thus consisted of two or more closely spaced objects. Figure 3 shows how a distance transform improves the watershed transform in case of saturated DAPI signals. The nucleus that was at least 250 pixels in size and located closest to the middle of the image was selected as the final outline: only objects inside this outline are considered for FISH dot counting.
In a next step, dot-like structures were enhanced. Usually this is performed by employing a tophat filter (Netten, et al., 1997). We used a method termed multiscale product (Vermolen et al., 2008) because it appeared better suited to deal with the heterogeneity of the shapes of the FISH probes. This filter increases the intensity of objects in a range of radii, using a multiplication of Gaussian kernels with different σ (ranging from 1.5 to 3.0 pixels), with the 2D Gaussian kernel defined as $G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$. Using a range of Gaussian kernel sizes improves robustness to variations in size of the objects of interest, compared to using a single kernel size as is done in the tophat transform method.
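The dot enhancement step can be sketched as below. This is one plausible reading of the multiscale product and not the published implementation of Vermolen et al.: it assumes the response is the pixel-wise product of the image smoothed with Gaussian kernels of σ = 1.5, 2.0, 2.5, and 3.0 pixels. The intermediate σ values, the kernel truncation at 3σ, the reflecting border, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian kernel, truncated at 3 sigma (assumed cut-off)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_smooth(image, sigma):
    """Separable 2D Gaussian smoothing with reflected borders."""
    kernel = gaussian_kernel_1d(sigma)
    pad = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=float), pad, mode="reflect")
    # Convolve every row, then every column; "valid" restores the input shape.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def multiscale_product(image, sigmas=(1.5, 2.0, 2.5, 3.0)):
    """Pixel-wise product of the image smoothed at several scales."""
    response = np.ones(np.shape(image), dtype=float)
    for sigma in sigmas:
        response *= gaussian_smooth(image, sigma)
    return response

# Toy image: flat background with one 3x3 dot centered at (20, 12).
image = np.ones((41, 41))
image[19:22, 11:14] = 100.0
response = multiscale_product(image)
peak = np.unravel_index(np.argmax(response), response.shape)
```

Because the smoothed images are multiplied, a structure must respond at every scale to stay bright, which is the property that makes the product less sensitive to the exact dot size than a single kernel.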
After applying the multiscale product, objects were thresholded using the triangle threshold (Zack et al., 1977), which was multiplied by a factor of 0.1 to include all relevant dot-like structures, bright or dim. This thresholding method uses the intensity histogram of the image and is especially suited for images with few object pixels. After thresholding, a final verification was performed to exclude objects that are too noisy. The dome finding method of Restif was applied to the coordinates of the maximum of each object (Restif, 2006). This method checks nearest downhill neighbors in three level sets (up to three pixels around the maximum) and excludes objects that have more than one extra local maximum in this region. Figure 4 shows the method by which nearest downhill neighbors are checked. We allow one extra local maximum to include very closely spaced FISH dots. In this way, noisy objects are excluded, whether the noise originates from a high or a low intensity background. Finally, measurements were performed on these objects: size, maximum intensity, mean intensity, total intensity, relative intensity, roundness, and perimeter were saved for every object. Relative intensity was defined as the total intensity of the object relative to the total intensity of the brightest object within the same nucleus. Using these measurements, different exclusion criteria were tested. Combinations of measurements were tested on the leukocyte training set to exclude debris and keep the true FISH dots. Figure 5 gives a schematic overview of the procedure.
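The triangle threshold can be sketched as follows. This is a generic illustration of the method of Zack et al. and not the code used in this study; the number of histogram bins, the function names `triangle_threshold` and `dot_mask`, and the toy data are assumptions. The `factor` argument corresponds to the multiplication factor described above (0.1 in this study); the toy example uses 1.0 because its background is not a multiscale product image.

```python
import numpy as np

def triangle_threshold(values, bins=256):
    """Triangle threshold: the histogram bin that lies furthest below the
    line drawn from the histogram peak to the end of its longer tail."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    occupied = np.nonzero(hist)[0]
    first, last = int(occupied[0]), int(occupied[-1])
    peak = int(np.argmax(hist))
    # Work on the longer tail; mirror the histogram if that tail is on the left.
    flipped = (peak - first) > (last - peak)
    if flipped:
        hist = hist[::-1]
        peak, last = bins - 1 - peak, bins - 1 - first
    if last == peak:
        index = peak  # degenerate histogram: all values fall in one bin
    else:
        x = np.arange(peak, last + 1)
        line = hist[peak] + (hist[last] - hist[peak]) * (x - peak) / (last - peak)
        # The vertical gap is proportional to the perpendicular distance.
        index = int(x[np.argmax(line - hist[x])])
    if flipped:
        index = bins - 1 - index
    return 0.5 * (edges[index] + edges[index + 1])

def dot_mask(enhanced, factor=0.1):
    """Binary mask of candidate dots; `factor` scales the triangle threshold."""
    return enhanced > factor * triangle_threshold(enhanced)

# Toy image: noisy background around 10 with one bright 3x3 dot.
rng = np.random.default_rng(1)
enhanced = rng.normal(10.0, 2.0, size=(100, 100))
enhanced[40:43, 60:63] = 200.0
t = triangle_threshold(enhanced)
mask = dot_mask(enhanced, factor=1.0)
```

The threshold lands at the foot of the background peak, which is why the method works well when object pixels are too few to form a second histogram mode.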

Expert reviewing of samples
In addition to the algorithm, five expert reviewers counted FISH probe signals in the leukocytes, using a macro written in the program ImageJ (Collins, 2007). They reviewed the set two times: the first time they were asked to review all images, the second time they could skip images that were unclear in their view. In this way it could be measured how sure the reviewers were. All CTC samples were also reviewed by the five expert reviewers.

Results
492 leukocytes and 500 CTC were imaged by the modified CellTracks Analyzer II. Figure 6 shows an example of a FISH Z-stack from a leukocyte in top and slice view, from which a maximum intensity profile was created. The profiles were processed by the algorithm, requiring ~2 minutes for each sample, and counted by human reviewers, requiring ~2 hours for each sample. Figure 7 shows the different steps of the algorithm: segmentation of the nucleus, enhancement of dot-like structures, and the final outline of the nucleus and the dots.

Counting of the leukocyte training sample
After comparison of the manual and automated counts in the training sample, it became apparent that only the measurements "size" and "relative intensity" had a positive impact on the counting efficiency of the algorithm. After objects were measured and counted in the HER2 channel, the objects within a nucleus with a relative intensity lower than 30% of the brightest dot in that nucleus were excluded. For the centromere 17 channel this threshold was optimal at 25%. Objects smaller than 5 pixels were also excluded. Automatic counting of chromosomes in leukocytes resulted in an accuracy of 97.8% for the HER2 dots and 97.5% for the centromere 17 dots. Accurate here means "equal to the manual count of the subset of images on which all reviewers agreed" (n=409 for HER2 and n=347 for centromere 17). The mean inter-reviewer agreement was 92.6%±2.3% and 91.7%±1.7% and the mean intra-reviewer agreement was 96.5%±2.7% and 97.0%±1.8% for the HER2 and centromere 17 probes, respectively. Table 1 gives an overview of the counting efficiencies after review of HER2 and centromere 17 for the whole data set and for the data set containing only the images with objects that could be easily identified by the reviewer, compared with the count generated by the algorithm. In figure 8 the distributions of the counts of the PC and the five reviewers are shown. The count of the reviewers is represented by the mean and the standard deviation for each chromosome count.
Table 1. Agreement between expert reviewers and the PC algorithm when reviewing leukocytes. In the upper right part of the table (white), the agreement of the first review is shown, in which the reviewer had to review the full dataset (n=492). Second, in the lower left part of the table (dark grey), the agreement of the second review is shown, in which the reviewers only reviewed the cells they were certain of (the "obvious" dataset). Last, the intra-reviewer variation is given on the diagonal of the table (light grey).
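The two exclusion criteria can be sketched as a small filter over the candidate objects of one nucleus. This is an illustration and not the code used in this study: the `Candidate` type and `count_dots` function are hypothetical, and the sketch assumes that the brightest object is taken over all candidates in the nucleus before the size criterion is applied.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    size_px: int            # object area in pixels
    total_intensity: float  # summed intensity of the object

def count_dots(candidates, min_relative_intensity, min_size_px=5):
    """Count candidates that pass the size and relative intensity criteria.

    Relative intensity is the total intensity of an object divided by the
    total intensity of the brightest object in the same nucleus.
    """
    if not candidates:
        return 0
    brightest = max(c.total_intensity for c in candidates)
    if brightest <= 0:
        return 0
    return sum(
        1
        for c in candidates
        if c.size_px >= min_size_px
        and c.total_intensity / brightest >= min_relative_intensity
    )

# Toy nucleus: two clear dots, one dim object, one bright but tiny object.
nucleus = [
    Candidate(size_px=20, total_intensity=1000.0),
    Candidate(size_px=18, total_intensity=900.0),
    Candidate(size_px=12, total_intensity=280.0),
    Candidate(size_px=3, total_intensity=950.0),
]
her2_count = count_dots(nucleus, min_relative_intensity=0.30)
cep17_count = count_dots(nucleus, min_relative_intensity=0.25)
```

With the 30% threshold the dim object (relative intensity 0.28) is rejected and the count is two; with the 25% threshold it is kept and the count is three. The tiny object is rejected in both cases.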

Counting of a sample containing CTC
After processing a sample containing CTC it became clear that the threshold for relative intensity was strongly related to the quality of the FISH probe used, and thus probe dependent. Slight adjustment of the relative intensity criterion to a value in the range of 14%-20% was necessary to ensure reasonable counting by the algorithm. This value was correlated with the average of the maximum intensity of all the objects in a channel: if this average was high, then the relative intensity threshold should be set lower. Figure 9 shows examples of cells counted by the reviewers and the PC. Agreement of the PC with the subset of cells on which all reviewers agreed was 76.1% (n=238), 83.9% (n=280), 86.6% (n=209), and 85.3% (n=251) for probes from centromere 1, 7, 8, and 17, respectively. Mean inter-reviewer agreement was 70.9%, 75.3%, 66.8%, and 72.3% for these four channels. Figure 10 shows the agreement between all reviewers in detail and the histogram of the count.

Automated counting is necessary and feasible
We have shown that reliable automated counting of FISH probes on EpCAM+ DAPI+ CK+ CD45- cells is both necessary and feasible. Comparing expert reviews revealed that intra-reviewer variation (the same expert reviewing a data set twice) could be as high as 3.5% of the cells. Inter-reviewer variation was higher: 7.5%; these numbers were both acquired for the "easy" leukocyte samples with low copy numbers. Variation between reviewers while reviewing CTC samples could be as high as 33.2% (centromere of chromosome 8), showing that the number of signals in a nucleus is of great influence on counting accuracy, as is the knowledge of the reviewer that he or she is dealing with CTC or leukocytes. Furthermore, reviewing 500 FISH nuclei in four channels takes several hours, while the computer only needs a few minutes.
From the results it becomes clear that review of chromosomes 1 and 8 was the most difficult, for both PC and reviewer. These probes had on average a factor two lower intensities than the probes from chromosomes 7 and 17. Thus, the inter-reviewer agreement was lower, as was the agreement with the PC. The dome finding part of the algorithm revealed the same: it removed objects that were too noisy in 17% and 13% of the nuclei in the channels from chromosomes 1 and 8, respectively, and only in 8% of the nuclei in the channels from chromosomes 7 and 17. Signal to noise ratios were clearly lower in channels where the agreement was lower.

Sources of error for human and PC
Agreement between PC and reviewer was good when control samples were reviewed, and reasonable when CTC were reviewed. The difference between the two data sets could be attributed to a few main sources of error:
1. Nuclei were well separated in the leukocyte sample, whereas the CTC samples contained more clusters. While these clusters are usually easily resolved by eye, the algorithm had more difficulty in this task. Figure 9, panel B shows an example in row 1: two closely spaced nuclei with almost saturated intensity. In this case the signals of the nuclei are close to saturation and, although a distance transform and a watershed transform were applied, they were still segmented as one. The PC thus over-counted in this example.
2. Because the DAPI signal from the nuclei can vary greatly between samples, some signals fall just outside the segmented outline of the nucleus as determined by the algorithm. This is the case when the signal from the nucleus is relatively dim, as is shown in figure 9, panel B, row 4, where the reviewers counted two probes and the PC counted only one. This challenge could be resolved by dilating the outline of the nuclei more than is done now. However, closely spaced nuclei will be resolved worse in this case. The heterogeneity of the shape and size of the nucleus is largely due to the presence of ferrofluid in combination with the fixation step in the FISH procedure. The ferrofluid particles were added to keep the cells tightly located to the imaging surface. However, due to the influence of these magnetic particles and the tendency of some cells to adhere to surfaces, the DNA spreads over the surface. Ferrofluid particles that line up under influence of the magnetic field force these cells to spread even further. Thus in the DAPI images even small islands of DNA were visible that clearly were part of a bigger nucleus, making it more difficult for the algorithm to measure a perfect outline of the nucleus and include all the DNA in the dot counting.
Figure 11 shows an example of this effect.
Fig. 11. Example of a nucleus spread due to fixation of the cell. Note the vertical lines that were created by ferrofluid aggregates that follow the local magnetic field. This type of nucleus is especially difficult to segment correctly.
3. The CTC sample had a larger variety in signal quality. Although the segmentation algorithm adapts to the histogram, it is still difficult for the PC to distinguish between what a reviewer calls a "true signal" and debris. For example, when a reviewer sees two signals (a bright and a relatively dim one), he or she will usually count two. However, when five bright signals and one dim signal are seen, the dim object is more often neglected. Figure 9, panel B, rows 2 and 3 show examples of differences in counting because of relative intensity. In row 2, the reviewers counted five and the PC four, while in row 3, the reviewers counted two and the PC three probes. The PC counts 100% reproducibly, but does not take these human considerations into account. For this analysis, it is thus very difficult to get an absolute "golden truth".
4. It is still hard for the PC to distinguish between a split probe (one chromosome that gives two signals) and two closely spaced chromosomes. It is however not known how often a reviewer misclassifies such an object. A reviewer can systematically ignore or assign the split spots. The PC cannot, and counts these items according to the algorithm. The number of cells that have these split spots may vary between samples and also within samples (e.g. between lymphocytes and CTC). The PC might not be able to distinguish between these, but if this factor appears to influence the result, the PC could use the measurements of the probes (i.e. relative intensity coupled to size of closely spaced probes) to estimate the probability of these splits in the cells. Leukocytes could be used as an internal control for measuring the frequency of these splits and for estimating a relevant "size/relative intensity" threshold.
The above error sources may seem a big challenge, but are not of importance for the clinically relevant observations, which are the presence or absence of aneuploidy, to ascertain the cancerous origin of the cells, and the presence of amplifications or deletions of specific genes that may be used to guide certain therapies. CTC are very heterogeneous: within one patient a wide variety of chromosomal aberrations could be spotted. So whether a certain cell has five or six copies is of lesser importance than the fact that this number is greater than two. When comparing whether counts are greater than two or not, the reviewer and PC concur in 87%, 93%, 94%, and 94% of the cells for centromere 1, 7, 8, and 17, respectively, for the data set in which all reviewers agree. This demonstrates that in about 90% of the cases, the PC and reviewer will draw the same conclusion about the ploidy status of the cells identified as tumor cells.
[Figure caption fragment: ... 1 and 2) and two examples of difficulty of locating the true outline of the nucleus (rows 3 and 4) are given.]
It could be argued that the example of row 1 isn't suitable for reviewing at all because the background staining is too high. For reviewers, there is no real quantitative criterion whether or not to reject a certain object based on its intensity distribution. However, the PC has such a criterion: it can easily check if a maximum of an object is surrounded by more than two other local maxima. If this is the case, then an object should be excluded. We perform this verification by means of the dome finding function. In this way, the PC performs more reliably than the human reviewers.
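The comparison between PC and reviewers on the subset of cells where all reviewers agree, both for the exact count and for the "greater than two" ploidy call, can be sketched as follows. The function names and the toy counts are hypothetical; the sketch only illustrates how such agreement percentages could be computed and does not reproduce the data of this study.

```python
def unanimous_subset(reviewer_counts):
    """Return {cell index: count} for cells on which all reviewers agree.

    `reviewer_counts` holds one tuple per cell, with one count per reviewer.
    """
    return {
        i: counts[0]
        for i, counts in enumerate(reviewer_counts)
        if len(set(counts)) == 1
    }

def agreement(pc_counts, reviewer_counts, ploidy_only=False, normal_copies=2):
    """Fraction of unanimously reviewed cells where PC and reviewers concur.

    With ploidy_only=True only the call "more than `normal_copies` copies"
    is compared, not the exact number of dots.
    """
    consensus = unanimous_subset(reviewer_counts)
    if not consensus:
        raise ValueError("no cell was counted identically by all reviewers")
    hits = 0
    for i, reviewer_count in consensus.items():
        if ploidy_only:
            hits += (pc_counts[i] > normal_copies) == (reviewer_count > normal_copies)
        else:
            hits += pc_counts[i] == reviewer_count
    return hits / len(consensus)

# Toy data: five cells, three reviewers each, plus the PC count per cell.
reviewers = [(2, 2, 2), (5, 5, 5), (3, 4, 3), (6, 6, 6), (1, 1, 1)]
pc = [2, 4, 3, 6, 2]
exact = agreement(pc, reviewers)
ploidy = agreement(pc, reviewers, ploidy_only=True)
```

In this toy set the PC matches the exact consensus count in two of the four unanimous cells, yet makes the same ploidy call in all four, which mirrors why the ploidy agreement reported above is higher than the exact count agreement.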

Future research
In the future, the algorithm may be optimized further by using clinical data. For instance, when the response of a patient to a therapy is coupled to the aberrations of the genes in the CTC that the treatment is targeting, a better golden truth may be found. Furthermore, the quality of FISH could still be improved. Split probes are still a big challenge for the PC, but also for establishing a good count by reviewers. Consequently, a quality score could be set by the algorithm by measuring intensity variations, for instance in carried-over leukocytes. This score could be used as an internal control in each patient sample to adjust exclusion criteria and to reject cells that are not suitable for interpretation. Finally, removal of ferrofluid could greatly improve the segmentation of the nucleus. Aggregation of ferrofluid particles disturbs the natural shape of the nucleus and blocks a fraction of the fluorescence light. Implementation of physical filters to enrich CTC by size would not require any ferrofluid and could be an improvement in the next generation of tumor cell capturing devices.

Acknowledgments
We would like to acknowledge Ronald Sipkema for his contribution to the software for the improved CellTracks Analyzer.