Genome Profiling and Potential Biomarkers in Neurodegenerative Disorders

Neurodegenerative Diseases - Processes, Prevention, Protection and Monitoring focuses on biological mechanisms, prevention, neuroprotection and even monitoring of disease progression. This book emphasizes the general biological processes of neurodegeneration in different neurodegenerative diseases. Although the primary etiology for different neurodegenerative diseases is different, there is a high level of similarity in the disease processes. The first three sections introduce how toxic proteins, intracellular calcium and oxidative stress affect different biological signaling pathways or molecular machineries to inform neurons to undergo degeneration. A section discusses how neighboring glial cells modulate or promote neurodegeneration. In the next section an evaluation is given of how hormonal and metabolic control modulate disease progression, which is followed by a section exploring some preventive methods using natural products and new pharmacological targets. We also explore how medical devices facilitate patient monitoring. This book is suitable for different readers: college students can use it as a textbook; researchers in academic institutions and pharmaceutical companies can take it as updated research information; health care professionals can take it as a reference book, even patients' families, relatives and friends can take it as a good basis to understand neurodegenerative diseases.


Dementia and Down syndrome
Dementia, a common symptom of all three neurodegenerative diseases mentioned above, is also common in individuals with Down syndrome (DS). After about the age of 30, most individuals with DS have the characteristic plaques and neurofibrillary tangles associated with AD. As in the general population, the prevalence of AD in people with DS increases significantly with age. However, age-related cognitive decline and dementia occur 30-40 years earlier in people with DS than in the general population, reaching almost 40% in the 50s [16]. Because the life expectancy of people with DS continues to increase, dementia is becoming an important issue.

Biomarkers
Biomarker research is a rapidly growing and developing area of medicine. Continuing advances in genomic, proteomic, metabolomic and epigenomic knowledge and technologies have also made their way into neuroscience research. Biomarkers are very important indicators of normal and abnormal biological processes. By definition, a biological marker, or biomarker, is a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention [17]. Although enormous effort and extensive research have been concentrated on this area, there is still a major lack of biomarkers for diagnosis, monitoring of progression and evaluation of response to treatment in neurodegenerative disorders such as Alzheimer's disease (AD), Parkinson's disease (PD) and Huntington's disease (HD).
Biomarkers have many valuable applications, such as identification of the major neuropathological processes in a specific disease, disease detection and monitoring of health status, and early efficacy and safety evaluations in in vitro studies on tissue samples, in vivo studies in animal models, and early-phase clinical trials. They are invaluable as a diagnostic tool for identifying patients with a disease or abnormal condition, as a tool for staging the disease or classifying its extent, as an indicator of disease prognosis, and for predicting and monitoring the clinical response to treatment. Biomarkers are of extreme relevance in chronic neurodegenerative (NDG) diseases: there are no cures for these diseases, as neurons of the central nervous system cannot regenerate on their own after cell death or damage. Tremendous efforts have been made in recent years to identify neuropathological, biochemical and genetic biomarkers of these diseases, with the aim of establishing the diagnosis at earlier stages and of following the rate of progression or the response to treatment. Currently, neuropathologic diagnosis is the gold standard, but it can only be made at autopsy, after the patient's death. Biomarkers, on the other hand, may improve early diagnosis at a stage when disease-modifying therapies are likely to be most effective, as well as the monitoring of disease progression and of the efficacy of any therapeutic intervention [18].

Brain transcriptome in neurodegenerative disorders
Many research groups have tried to solve the neuropathophysiological puzzle in PD, AD, HD and DS. The human brain has been extensively studied using many approaches, including, in the last decade, a variety of "omic" technologies. Whole-genome gene expression studies in the brain in each of the four diseases individually have shown changes in the transcription of a number of genes when compared with normal human brain. We investigated, reviewed and collected data from all studies reported to date on the brain transcriptome in Parkinson's disease, Alzheimer's disease, Huntington's disease and Down syndrome and performed an integrated meta-analysis.

Methods
To present the alterations consistently reported by studies of the brain transcriptome in neurodegenerative diseases, we first searched for such reports in literature databases, then obtained raw and processed experimental data from microarray data repositories, and finally performed probe-level meta-analyses of the datasets originating from the various studies. In addition, to reveal possible commonalities and shared pathways across neurodegenerative diseases, we inspected the similarities and differences in the gene expression dysregulation occurring in these conditions. As we were primarily interested in studies with microarray results accessible from biological repositories, we searched the Gene Expression Omnibus (GEO) repository (http://www.ncbi.nlm.nih.gov/geo/), the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/) and the Stanford Microarray Database (http://smd.stanford.edu) for studies with data available in raw or processed form. Because most of the gene expression profiling experiments were performed on the Affymetrix platform, and to avoid difficulties caused by the different probe annotations used by different microarray manufacturers, only results from experiments performed on the Affymetrix U133 platform were included, which facilitated the further steps of the probe-level meta-analysis. Detailed information on the datasets included in the analyses is given in Table 1.

Microarray data pre-processing and preparation for meta-analysis
All integration and statistical steps described were performed in the R statistical environment, version 2.13.1 (http://cran.r-project.org), using Bioconductor version 2.8 packages (available at http://bioconductor.org) [19]. Raw data from all microarray experiments listed in Table 1 were obtained directly from the Gene Expression Omnibus (GEO) repository (http://www.ncbi.nlm.nih.gov/geo/) using the GEOquery package for R [20,21]. Before the meta-analysis of data from the selected studies was performed, all datasets obtained in this manner were inspected for significant inter-array differences in the distribution of probe intensities. For this purpose, raw datasets were initially examined using the arrayQualityMetrics package and, where necessary, the quantile normalization functions in the affyPLM package were applied [30,31]. Non-specific intensity and interquartile variation filters were applied using methods in the genefilter package [19]. Log2 transformations were applied where discrepancies in the data reporting format were observed. Data collections for each individual neurodegenerative disease were then merged using probeset annotations as the common denominator. With this approach we avoided potential statistical issues that arise from averaging probe intensity values to obtain a single mean intensity value for each gene, which may disregard distinct expression of different transcripts from the same gene. These steps resulted in 4 separate data matrices, each carrying data for a single disease and originating from multiple studies: the Alzheimer disease (AD), Down syndrome (DS), Huntington disease (HD) and Parkinson disease (PD) datasets.
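The pre-processing steps above can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not the authors' code: the function names, the dictionary layout (probeset ID mapped to a list of intensities, one per array) and the toy probeset IDs are all assumptions made for illustration, and ties in quantile normalization are broken by order rather than averaged as dedicated packages may do.

```python
import math


def log2_transform(matrix):
    """Log2-transform a {probeset: [intensity per array]} mapping."""
    return {p: [math.log2(v) for v in vals] for p, vals in matrix.items()}


def quantile_normalize(matrix):
    """Give every array (column) the same intensity distribution.

    The reference distribution is the mean across arrays at each rank.
    Ties are broken by sort order (a simplification).
    """
    probes = sorted(matrix)
    n_arrays = len(matrix[probes[0]])
    columns = [[matrix[p][j] for p in probes] for j in range(n_arrays)]
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(col[i] for col in sorted_cols) / n_arrays
        for i in range(len(probes))
    ]
    normalized = {p: [0.0] * n_arrays for p in probes}
    for j, col in enumerate(columns):
        order = sorted(range(len(probes)), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[probes[i]][j] = reference[rank]
    return normalized


def merge_on_probesets(*studies):
    """Keep probesets shared by all studies and concatenate their arrays."""
    shared = set(studies[0])
    for study in studies[1:]:
        shared &= set(study)
    return {p: [v for s in studies for v in s[p]] for p in sorted(shared)}


# Toy example with made-up probeset IDs and intensities.
study_a = {"probe_1": [4.0, 8.0], "probe_2": [2.0, 16.0], "probe_3": [8.0, 32.0]}
study_b = {"probe_2": [1.0], "probe_3": [2.0], "probe_4": [3.0]}
normalized_a = quantile_normalize(log2_transform(study_a))
merged = merge_on_probesets(normalized_a, study_b)
```

Merging on probeset IDs, as in `merge_on_probesets`, keeps one row per probeset instead of collapsing probesets to genes, which mirrors the rationale given in the text for not averaging intensities per gene.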

Meta-analysis
Summarized differential expression of genes in each merged dataset was calculated using the meta-analysis algorithms implemented in the RankProd package for R [32]. RankProd uses a non-parametric statistical algorithm that detects genes that are consistently highly ranked across microarray datasets originating from different experiments in different studies performed on the same condition (i.e. disease). Because this approach is based on rank statistics rather than on absolute intensity values, it allows inclusion of data originating from different laboratories, different platforms and studies potentially performed under different conditions [32]. For analyses of such multi-study data we used the RPadvance function, with the origin parameter set so that the number of distinct sources corresponded to the number of originating studies [32]. It is important to stress that several studies simultaneously reported differential expression in several different anatomical brain parts. As we wanted to facilitate the discovery of genes differentially expressed in diseased tissue in comparison with control samples, we set the origin parameter to regard such data as originating from different sources, thereby avoiding comparisons of gene expression between different brain regions rather than between affected and unaffected samples. P-values and q-values were then obtained by performing 100 permutation cycles of the complete originating datasets. An arbitrary P-value cut-off for significance of differential gene expression was set at P<0.05.
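The idea behind a rank-product meta-analysis with separate origins can be sketched as follows. This is a simplified illustration in plain Python, not the RankProd implementation and not the authors' code: ranking by the difference of group means, the permutation scheme (shuffling ranks within each origin) and all function and variable names are assumptions. Each origin (a study, or a brain region within a study) is ranked on its own, so case and control samples are only ever compared within the same source.

```python
import math
import random


def fold_change_ranks(log_expr, is_case):
    """Rank probesets within one origin by mean(case) - mean(control).

    Rank 1 is the most upregulated probeset.
    """
    diffs = {}
    for probe, values in log_expr.items():
        case = [v for v, c in zip(values, is_case) if c]
        ctrl = [v for v, c in zip(values, is_case) if not c]
        diffs[probe] = sum(case) / len(case) - sum(ctrl) / len(ctrl)
    ordered = sorted(diffs, key=lambda p: -diffs[p])
    return {p: rank for rank, p in enumerate(ordered, start=1)}


def rank_product(rank_tables):
    """Geometric mean of each probeset's ranks across origins."""
    k = len(rank_tables)
    return {
        p: math.prod(t[p] for t in rank_tables) ** (1.0 / k)
        for p in rank_tables[0]
    }


def permutation_p_values(rank_tables, n_perm=100, seed=0):
    """Fraction of permuted rank products at or below each observed one."""
    rng = random.Random(seed)
    probes = list(rank_tables[0])
    observed = rank_product(rank_tables)
    null = []
    for _ in range(n_perm):
        shuffled = []
        for table in rank_tables:
            ranks = [table[p] for p in probes]
            rng.shuffle(ranks)  # break the probeset-rank link within an origin
            shuffled.append(dict(zip(probes, ranks)))
        null.extend(rank_product(shuffled).values())
    return {p: sum(v <= observed[p] for v in null) / len(null) for p in probes}


# Toy data: two origins, two controls followed by two cases in each.
is_case = [False, False, True, True]
origin_1 = {"up": [1, 1, 5, 5], "flat": [3, 3, 3, 3], "down": [5, 5, 1, 1]}
origin_2 = {"up": [2, 2, 4, 5], "flat": [3, 3, 3, 3], "down": [4, 5, 2, 2]}
tables = [fold_change_ranks(o, is_case) for o in (origin_1, origin_2)]
p_values = permutation_p_values(tables, n_perm=100)
```

A probeset that ranks near the top in every origin gets a small rank product, which is unlikely under permutation; a probeset ranked highly in only one origin does not. Downregulated probesets can be scored the same way after reversing the sort order.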

Investigating intersections between datasets and gene set enrichment analyses
The resulting ordered lists of differentially expressed probesets were subsequently investigated for overlap between the AD, DS, HD and PD datasets. The top 1000 genes from each dataset were used, and intersections between combinations of two, three and four datasets were obtained. Venn diagrams in the results section were produced using the Venny utility available at http://bioinfogp.cnb.csic.es/tools/venny/index.html. Furthermore, to gain insight into the functional properties of the genes in the intersections, gene set enrichment analyses (GSEA) were performed using the GOstats package for R, investigating significant (uncorrected p<0.05) over- or underrepresentation of GeneOntology (GO) and KEGG terms annotating the genes occurring in the intersections [33][34][35][36]. Additionally, the DAVID tool (http://david.abcc.ncifcrf.gov/) was used to reveal the functional annotation clusters related to the intersecting genes [37]. Required annotation conversions were performed using the hgu133plus.db package from the Bioconductor annotation package collection and the biomaRt package for R in combination with the Ensembl Biomart service (http://www.biomart.org/) [38,39].
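The intersection step can be sketched with ordinary set operations. The example below is an illustration in plain Python, not the authors' workflow; the function names and the toy ranked lists are hypothetical, and the default of 1000 simply mirrors the cut-off stated above.

```python
from itertools import combinations


def top_n(ranked_probesets, n=1000):
    """Return the first n entries of an ordered list as a set."""
    return set(ranked_probesets[:n])


def all_intersections(top_sets):
    """Map every combination of two or more datasets to their shared items."""
    names = sorted(top_sets)
    result = {}
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            result[combo] = set.intersection(*(top_sets[n] for n in combo))
    return result


# Toy ranked lists (made-up identifiers), truncated to the top 3 entries.
ranked = {
    "AD": ["a", "b", "c", "d"],
    "DS": ["a", "e", "f"],
    "HD": ["a", "b", "g"],
    "PD": ["a", "b", "c", "h"],
}
top_sets = {name: top_n(lst, n=3) for name, lst in ranked.items()}
overlaps = all_intersections(top_sets)
```

For four datasets this yields 11 combinations (six pairs, four triples and one four-way intersection), which are the regions a four-set Venn diagram displays.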

Results
Altogether, our data collection comprised data from 9 whole-genome expression studies performed on samples from 4 neurodegenerative conditions (AD, DS, HD and PD). Collectively, 200, 33, 201 and 186 microarray-analysed samples were included in the investigations of AD, DS, HD and PD, respectively, which amounted to 620 separate experiments overall. A slight predominance of experiments performed on case tissues was noted in most of the studies, with a summary case:control ratio of 1.2:1 (339 affected tissues and 281 unaffected tissues included). Separate analyses of the datasets for each NDG disorder revealed significant perturbations in the expression profiles of several genes. When an arbitrary permutation p-value cut-off was set at 0.05 for upregulated genes, 5701 probesets attained significance in the AD dataset, 3291 in the DS dataset, 4174 in the HD dataset and 3043 in the PD dataset. In the downregulated gene group, significance at p<0.05 was reached for 5496 probesets in the AD dataset, 2983 in the DS dataset, 4079 in the HD dataset and 3410 in the PD dataset. A detailed view of the distribution of significance values of the top 10,000 ordered differentially expressed genes for each of the NDG disorders is shown in Figure 1. The resulting numbers of significant results are inflated by the effect of multiple testing, and therefore q-values were also estimated as described by Breitling et al. [40]. The numbers of upregulated probesets with estimated q-values below 0.05 were 3775 for the AD, 1496 for the DS, 3182 for the HD and 1894 for the PD dataset. The numbers of downregulated probesets meeting this criterion were 3624 in the AD, 652 in the DS, 3065 in the HD and 2541 in the PD dataset.

Common patterns of differential expression in neurodegenerative disorders
The conformity between the profiles of transcriptome perturbations in the four neurodegenerative diseases was initially assessed by inspecting the lists of the top 1000 differentially expressed (DE) probesets for each condition and then identifying the probesets (and genes) differentially expressed in several conditions simultaneously. The numbers of overlapping probesets are shown in Figure 2. The largest overlap was observed between the PD and HD lists, with 338 (33.8%) upregulated and 267 (26.7%) downregulated genes differentially expressed in both conditions. A detailed overview of the extent of overlap between pairs of top DE gene lists is shown in Figure 3. A notable number of probesets were DE in all four conditions: 44 upregulated and 16 downregulated, as presented in Figures 2a and 2b.

Comparative functional analyses of differential expression profile in neurodegenerative diseases
The gene set enrichment profiles of the upregulated and downregulated sets of genes presented here were calculated using the hypergeometric test in the GOstats package. The profiles of DE genes were first calculated for each disorder separately, and afterwards every intersection between combinations of the four sets of DE genes was evaluated. Results of interest from the separate GSEA analyses are presented in Table 2 (a-d) for the top 1000 downregulated DE gene sets (the data for upregulated GSEA are not shown). Several GO biological process annotations appeared in all four analyses, most notably terms related to synaptic transmission and to cognitive processes.
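The hypergeometric test behind this kind of over- or underrepresentation analysis can be written out directly. The sketch below is plain Python and only illustrates the statistic; it is not the GOstats implementation, and the function and parameter names are assumptions. Here N is the number of genes in the background, K the number of background genes annotated with a given GO term, n the number of selected (DE) genes and k the number of selected genes carrying the term.

```python
from math import comb


def hypergeom_pmf(i, N, K, n):
    """P(X = i) when drawing n genes from N, of which K carry the term."""
    return comb(K, i) * comb(N - K, n - i) / comb(N, n)


def overrepresentation_p(k, N, K, n):
    """Upper tail P(X >= k): the term occurs at least k times by chance."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


def underrepresentation_p(k, N, K, n):
    """Lower tail P(X <= k): the term occurs at most k times by chance."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(0, k + 1))


# Toy numbers: 10 background genes, 4 annotated, 3 selected, 2 of them annotated.
p_over = overrepresentation_p(2, N=10, K=4, n=3)
p_under = underrepresentation_p(1, N=10, K=4, n=3)
```

With these toy numbers the two tails are complementary (at least 2 versus at most 1 annotated gene), so they sum to 1. In practice one such test is run per GO term, which is why the uncorrected p<0.05 threshold used in the Methods should be read with multiple testing in mind.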
We also investigated the extent of similarity of the GSEA profiles across the four diseases. The top 200 enriched GO terms in each neurodegenerative disorder were inspected and compared pairwise for matching terms with the other three disorders. The greatest similarity was observed between GSEA terms annotating downregulated genes in all four disorders, as shown in more detail in Figure 4. As previously observed for overlapping genes, the greatest overlap was observed between the PD and HD GO profiles in both the upregulated (40.0% overlap) and downregulated sets (59.5% overlap).

Table 2c. Parkinson's disease (downregulated genes). GOBPID stands for GeneOntology biological process ID.

Fig. 4. Pairwise comparison of GO terms between pairs of datasets representing four neurodegenerative diseases. Percentages were calculated by dividing the number of overlapping GO terms by the number of all GO terms included in the overlap analysis (N=200). GO terms annotating upregulated genes are presented in shades of red and those annotating downregulated genes in blue.

Conclusion
We have shown that whole-genome transcription analysis may be useful for identifying and clarifying pathophysiological mechanisms in neurodegenerative diseases. We used an innovative approach of comparing and integrating experimental results from different NDG diseases, which provided new insights into common NDG processes. Elucidation of these mechanisms holds important potential for the future prediction and development of new treatments as well as for the identification of biomarkers of neurodegeneration. When intersections between groups of top DE genes were compared, the greatest overlap was found between DE genes in brain samples of patients with HD and PD, which is possibly in accordance with their primary manifestation as movement disturbances related to the function of the basal ganglia. On the other hand, this similarity is surprising, as the known etiological agents in HD and PD differ significantly, one disorder being a consequence of a monogenic disruption and the other a complex disorder with a heterogeneous combination of genetic and environmental factors [41]. The profile overlap between AD and PD, which present as clinically somewhat distinct entities, is also surprisingly high. Recently, however, it has become progressively more obvious that the two disorders share not only a significant proportion of clinical elements (movement disorder, cognitive decline, mood and psychiatric disorders) but also common pathophysiological pathways [42]. These results suggest that the clinical distinction between disease entities may not be a perfect projection of the actual processes at the cellular and molecular level. Contrary to expectation, the lowest overlap was observed between samples from patients with DS and AD, which is notable because these conditions are known to share NDG pathways related to amyloid beta deposition in neurons.
Reasons for the lower extent of overlap may lie in the significant differences in the age of the patients from whom the brain samples were obtained for studies of DS in comparison with AD. Additionally, in most instances the effect of a complete triplication of the genes located on chromosome 21 may dominate over the genes commonly dysregulated in DS and AD [29]. Also, the number of brain tissue samples profiled in microarray experiments was by far the lowest for DS among the NDG diseases investigated in our survey. Therefore, before a final answer regarding this finding can be given, more studies investigating transcriptional alterations in DS brain samples must be performed. Several GO categories were consistently singled out in the GSEA analyses of separate and overlapping genes DE in NDG disorders. Interestingly, several terms were related to processes previously associated with neuron degeneration [42], most prominently the GO terms synaptic transmission (GO:0007268), neurogenesis (GO:0022008) and terms related to higher cognitive processes (GO:0007611). Dysfunctional synaptic transmission (as in glutamate excitotoxicity) and defects in neurogenesis have repeatedly been shown to be related to various NDG diseases [42][43][44]. It is interesting that although disturbances in neuroinflammatory mechanisms have been proposed as a possible causative factor in a number of NDG diseases, our analysis of intersecting genes dysregulated in brain samples of these conditions did not single out a particular common inflammatory pathogenetic pathway. This finding may be interpreted in the light of previously recognized differences in the complement-activating immunogenic activity of plaques in different NDG diseases, resulting in the absence of commonly overlapping inflammatory genes and GO terms [42].
When we investigated the compatibility of functional profiles between the four NDG diseases, we found the greatest overlaps between sets of GO terms annotating genes downregulated in NDG diseases, where an overlap greater than 40% was observed in all pairwise comparisons of the sets of top 200 enriched GO terms. Again, the greatest functional conformity was noted between the top downregulated genes in the HD and PD as well as the AD and PD dataset pairs. Notable overlap was also observed in the functional profiles of upregulated genes, where we noted good functional conformity between the DS and HD datasets in addition to the HD-PD and AD-PD functional overlaps. It is important to stress that the genome-wide expression studies included in this survey are inherently burdened by statistical issues that originate predominantly from testing a large number of variables on a relatively small population of biological replicates (i.e. study subjects) [45]. For this reason we attempted to gain a more complete account of biological alterations in neurodegenerative diseases by merging data from several different studies investigating transcriptional changes in brain samples of distinct neurological conditions (AD, DS, HD and PD) [46]. This increased the number of biological replicates considerably, allowing potentially more reliable calling of DE genes in these conditions. There are, however, important downsides to this approach: the studies included were performed under differing conditions, in different institutions and by different research staff. Even more important is the great heterogeneity between the brain tissue samples investigated. We attempted to circumvent these issues by using appropriate RankProd meta-analysis methods, but the results must still be interpreted in light of these considerations.
It is still difficult to distinguish causal changes in the transcriptome from changes that result from previous damage to neural tissue. It is possible, however, that the similarities in transcriptome profile between clinically and pathologically distinct entities suggest a common response to an unknown initial damaging stimulus. We propose that in future, integration of various data, such as genomic in combination with transcriptomic data, should provide a way to delineate possible mechanisms by which genetic predisposition manifests as transcriptional imbalances that in turn result in the observed phenotype. Genome-wide expression profiling may nevertheless help to steer further research in a particular direction. There are also other "omics" approaches besides transcriptomics, and integrating all of them is a future challenge.