Parkinson’s disease (PD), the second most common progressive neurodegenerative disorder, was long believed to be a sporadic syndrome of non-genetic origin. The identification of distinct genetic loci responsible for rare Mendelian forms of PD has represented a revolutionary breakthrough, enabling the discovery of novel mechanisms underlying this debilitating and still incurable condition. Along with single-nucleotide polymorphisms (SNPs), other kinds of DNA molecular defects have emerged as significant disease-causing mutations, including large chromosomal structural rearrangements and copy number variations (CNVs). Due to their size variability and to the different sensitivity and resolution of detection methodologies, CNVs constitute a particular challenge in genetic studies, and the pathogenic or susceptibility impact of specific CNVs on PD is currently under debate. In this chapter, we will review the current literature and bioinformatic data describing the involvement of CNVs in PD pathobiology. We will discuss the recently highlighted role of PARK2 heterozygous CNVs, the possible common founder effects of PD gene rearrangements and the importance of mapping genetic breakpoints. We will also summarize the currently available molecular methods and bioinformatics web resources to detect and interpret CNVs. Assessing the global genome-wide burden of large CNVs and elucidating the role of rare de novo structural variants in PD may reveal new candidate genes and consequently improve the diagnosis and counselling of mutation carriers.
- Parkinson’s disease
- copy number variations
- DNA rearrangements
Parkinson’s disease (PD) is a progressive debilitating movement disorder and constitutes the second most common neurologic disease after Alzheimer’s disease, affecting approximately 1% of the population older than 65 years of age .
Clinically, most patients present with resting tremor, bradykinesia, stiffness of movement and postural instability. These major symptoms derive from the profound and selective loss of dopaminergic neurons of the substantia nigra pars compacta (SNc), although the SNc seems to become involved later, in the middle stage of the disease. The neuropathological hallmarks of PD are round eosinophilic intracytoplasmic inclusions termed Lewy bodies (LBs) and dystrophic neurites (Lewy neurites) present in surviving neurons, both composed of alpha-synuclein aggregates. In more advanced stages, patients can also develop a range of non-motor symptoms, including rapid eye movement sleep behaviour disorder, constipation, depression and cognitive decline. Although drugs such as levodopa (L-DOPA) or surgical intervention (deep brain stimulation) can alleviate the motor symptoms, they do not halt disease progression and are not effective against the non-motor aspects of the disease.
PD is primarily a sporadic multifactorial disorder, resulting from an elaborate interplay of numerous elements: genes, susceptibility alleles, environmental exposures, gene-environment interactions and their overall impact on the developing and aging brain. Important insights have been provided in recent years by studies of the genetics, epidemiology and neuropathology of PD, together with the development of experimental in vivo and in vitro models. A prominent role of common underlying pathways, such as mitochondrial dysfunction, oxidative stress and impairment of the ubiquitin/proteasome system (UPS), is now supported by accumulating genetic studies, which demonstrate that PD-associated genes directly or indirectly impinge on mitochondrial integrity, reactive oxygen species production and protein clearance.
In this review chapter, we will introduce the genetics of PD and then we will focus on copy number variations (CNVs), a form of DNA structural rearrangements, which are currently attracting researchers’ interest for their role in PD pathobiology.
2. PD: a genetic overview
PD was long believed to be a prototypical non-genetic disorder. In the last 15 years, however, the identification of distinct genetic loci responsible for rare monogenic Mendelian forms and the discovery of numerous risk factors have revolutionized this view and have provided novel clues to understand the molecular pathogenesis of this still incurable condition .
The Mendelian monogenic forms of PD are well-established and collectively account for about 30% of the familial and 3–5% of the sporadic cases . These rare forms are caused by mutations in specific genes, which are inherited from parents in either a dominant or recessive way.
In autosomal-dominant inheritance, one mutated allele of the gene is sufficient to cause the disease. The SNCA (alpha-synuclein) gene, localized in the PARK1 locus, was the first to be associated with this pattern of transmission and was identified in an Italian American family with more than 60 affected individuals distributed across five generations. Among family members, the disease was caused by the Ala53Thr (A53T) missense mutation, which induces a conformational change in the protein chain and facilitates alpha-synuclein aggregation. Additional disease-causing mutations in SNCA are currently known, including genetic variations in the coding regions (Ala30Pro, Glu46Lys), single-nucleotide substitutions in the 3′ UTR and dose-dependent genomic multiplications (duplications or triplications) [8–10] that will be discussed in the next sections. A second gene clearly associated with dominant PD inheritance is LRRK2 (locus PARK8), which encodes a large multi-domain protein called leucine-rich repeat kinase 2. LRRK2 harbours several genetic variants recognized as pathogenic, some of which represent the most frequent causes of Mendelian and sporadic PD identified so far (e.g. Gly2019Ser and mutations altering codon Arg1441). The precise physiological function of LRRK2 is unknown; however, it is probably implicated in different cellular functions such as neurite outgrowth, cytoskeletal maintenance, vesicle trafficking and autophagic protein degradation.
Some monogenic forms of PD are inherited with the autosomal recessive pattern, and frequently, symptoms have an early onset. In these disorders, mutations in both alleles (either homozygous or compound heterozygous) cause the pathological phenotype . Mutations in three genes cause the recessive typical form:
- PARKIN (locus PARK2), the most commonly mutated gene in juvenile Parkinsonism, responsible for 50% of cases;
- PINK1 (locus PARK6), which is mutated with a frequency varying from 1 to 9% depending on the ethnic background;
- DJ-1 (locus PARK7).
These three genes will be discussed in the next sections.
The recessive cases of Parkinsonism can also manifest clinically in an atypical form, meaning that patients have clinical features of PD, neuronal loss in the SNc and additional cell degeneration in other regions of the nervous system, such as the striatum. The genes currently identified as responsible for these forms of Parkinsonism are the following:
- ATP13A2 (locus PARK9), whose mutations are associated with Kufor–Rakeb syndrome, a recessively inherited, levodopa-responsive form of atypical Parkinsonism. This gene belongs to the P-type superfamily of ATPases that transport inorganic cations and other substrates across cell membranes;
- PLA2G6 (phospholipase A2 group VI), localized in the PARK14 locus, which has recently been associated with a particular Parkinsonian phenotype consisting of levodopa-responsive dystonia, pyramidal signs and cognitive/psychiatric features with onset in early adulthood;
- FBXO7 (F-box only protein 7), harboured in the PARK15 locus and responsible for Parkinsonian-Pyramidal Disease (PPD), an autosomal recessive neurodegenerative disease with juvenile onset and additional pyramidal signs. FBXO7 is a component of modular E3 ubiquitin protein ligases called SCFs (SKP1, cullin, F-box proteins), which function in phosphorylation-dependent ubiquitination.
Despite the existence of rare monogenic forms, it is now clear that PD is a genetically heterogeneous and most likely complex disorder. The complexity of PD is underlined by the fact that dozens of more or less convincing loci, genes and risk factors have been linked to the different inherited or sporadic forms. For the sake of completeness, we mention here just some of them: PARK3, GBA, UCHL1, PARK10, GIGYF2, PARK12, HTRA2, PARK16, DNAJ, HLA-DR, GAK-DGKQ, SYNJ1, GBAP1 [18–20]. This list is progressively growing thanks to the evidence provided by linkage mapping analysis, genome-wide association studies (GWAS) and high-throughput genomic biotechnologies, such as next generation sequencing and array technologies [18, 21].
However, while the detection and interpretation of some kinds of DNA molecular alterations, such as single-nucleotide polymorphisms (SNPs), are relatively simple, CNV genotyping is technically more challenging, partly because the assay is quantitative rather than qualitative. These kinds of DNA molecular defects are therefore generally under-represented in genetic studies. In the next paragraphs, we will introduce CNVs and then we will focus on what is currently known about their pathogenic or susceptibility impact on PD pathobiology.
3. CNVs: origin, classification and clinical relevance
The genomic sequence along human chromosomes is constantly changing, and this process enables humans to evolve and adapt. The scientific community has long been aware of genetic variation at either size extreme (i.e. cytogenetically recognizable elements and SNPs). However, about 10 years ago, scientists began to recognize abundant variations of an intermediate size class known as structural variations. Within this class, CNVs, which involve unbalanced rearrangements that increase or decrease the DNA content, represent the largest component by far. Currently, CNVs are defined as larger than 50 bp and can be limited to a single gene or include a contiguous set of genes. These structural variants encompass more polymorphic base pairs than SNPs and ultimately result in an altered diploid status of the DNA (i.e. gain or loss of a genomic region).
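As a minimal illustration of the definition above, the following Python sketch labels a genomic segment as a gain or a loss relative to the expected diploid state and applies the 50 bp size threshold. The `Segment` fields, the function name and the example coordinates are hypothetical and are not taken from any published tool; the sketch also assumes an autosomal region, where two copies are expected.

```python
from dataclasses import dataclass

MIN_CNV_SIZE_BP = 50   # size threshold quoted in the text
DIPLOID_COPIES = 2     # expected copy number for an autosomal region


@dataclass
class Segment:
    chrom: str
    start: int        # 1-based, inclusive (illustrative coordinates)
    end: int          # 1-based, inclusive
    copy_number: int  # estimated number of copies in the sample

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def classify_segment(seg: Segment) -> str:
    """Return 'neutral', 'below_size_threshold', 'gain' or 'loss'."""
    if seg.copy_number == DIPLOID_COPIES:
        return "neutral"
    if seg.length <= MIN_CNV_SIZE_BP:
        # Smaller unbalanced events fall outside the CNV size definition.
        return "below_size_threshold"
    return "gain" if seg.copy_number > DIPLOID_COPIES else "loss"


# Hypothetical examples: a 2 Mb segment present in four copies, and a
# 30 kb segment present in one copy.
print(classify_segment(Segment("chr4", 1, 2_000_000, 4)))   # gain
print(classify_segment(Segment("chr11", 1, 30_000, 1)))     # loss
```

Regions on the sex chromosomes would need a different expected copy number, which this sketch does not model.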
DNA structural changes and CNVs originate where specific architectural genomic elements render DNA regions very susceptible to rearrangements. These rearrangements can be classified as recurrent or non-recurrent events, depending on whether the same rearrangement is identified in unrelated individuals.
Three major mechanisms for the generation of recurrent and non-recurrent CNVs have been proposed. The most common cause of recurrent genomic rearrangements is non-allelic homologous recombination (NAHR), which occurs during both mitosis and meiosis between two DNA blocks of high homology, such as region-specific low-copy repeat sequences (LCRs). Specifically, CNVs are generated when tracts of directly oriented LCRs recombine by unequal crossing-over, leading to both deletion and duplication of the genomic fragment [23, 24] (Figure 1). LCRs with different orientations can recombine to produce NAHR-mediated inversions or more complex genomic rearrangements.
Non-recurrent CNVs result mainly from the non-homologous end joining (NHEJ) mechanism. NHEJ is the major cellular mechanism for double-strand break (DSB) repair. Upon a DSB, NHEJ reconnects chromosome ends, very often editing them before ligation and leaving an information scar (i.e. random nucleotides added at the site of the breakage to facilitate the strands’ alignment and ligation) (Figure 2). NHEJ is more often associated with deletions and chromosomal translocations. However, complicated DNA intermediates have been proposed as origin mechanisms for duplications as well.
The third mechanism proposed to trigger non-recurrent genomic rearrangements is fork stalling and template switching (FoSTeS). FoSTeS occurs when the DNA replication machinery pauses, the lagging strand dissociates from the polymerase holoenzyme and the template is switched to another region of the genome that is usually in physical proximity to the original replication fork. Replication continues on the wrong template until the original fork is restored. Such template switching may occur several times before the replication process gets back to its original template, resulting in complex rearrangements (Figure 3). The pausing and stalling of the DNA replication machinery are common at certain nucleotide motifs and repetitive DNA sequences; however, such events can also occur due to chemical changes in DNA structure, such as DNA lesions or DNA alkylation. CNVs generated by FoSTeS mainly originate during the S phase of the cell cycle as a consequence of DNA repair mechanisms. It should be noted that the CNVs created through FoSTeS are difficult to distinguish from those generated by microhomology-mediated break-induced replication (MMBIR), a repair mechanism that relies on small-scale homology of DNA sequence at the ends of DSBs.
CNVs are very common and constitute a prevalent source of genomic variations. These alterations may account for adaptive or behavioural traits, may have no phenotypic effects or can underlie diseases . For this reason, determining the clinical significance of CNVs is very challenging and relies heavily on both frequency information from healthy control cohorts and databases with previously reported clinically relevant CNVs.
In a clinical setting, CNVs are categorized into five groups (according to the American College of Medical Genetics and Genomics practice guidelines):
- Abnormal or pathogenic (e.g. well-established association with a disease);
- VOUS (variants of uncertain clinical significance – rare or private CNV);
- Benign (a polymorphic variant detected in a normal individual without clinical significance).
Specific large CNVs and single-gene dosage alterations have emerged as critical elements for the development and maintenance of the nervous system and appear to contribute to heritable or sporadic neurological diseases, such as neuropathies, forms of epilepsy, autistic syndromes, psychiatric illnesses and also neurodegenerative diseases [23, 29–33]. In the next paragraphs, we will focus on the current evidence for the recurrence of CNVs in PD pathobiology.
4. CNVs in PD
4.1. Single-gene CNVs in familial PD genes
Accumulating evidence shows that alpha-synuclein gene (SNCA) copy number gains play a major role in the disease severity of PARK1. Singleton et al. were the first to describe a genomic triplication of chromosome 4q21–22 containing the SNCA locus within a large family with autosomal-dominant PD and dementia, called the Spellman–Muenter family or Iowa Kindred. The size of the triplicated region, confirmed with quantitative polymerase chain reaction (PCR) and fluorescence in situ hybridization (FISH), was over 2.0 Mb.
After the initial description, numerous families with SNCA genomic alterations have been reported. PD families with members carrying four copies of the SNCA gene have been detected in South Africa, Iran, Japan, Pakistan and Italy [35–40]: in general, triplication generates very high expression of mRNA and protein and influences the clinical manifestations of PD, causing severe forms of Parkinsonism similar to dementia with Lewy bodies. Duplications have been reported in more families than triplications [33, 38, 41–51]. In some of the patients with the same ethnic background (Japan), haplotype analysis revealed their derivation from common founders [43, 44]. In contrast to triplication carriers, the clinical phenotype of patients with duplicated SNCA resembles idiopathic PD, mainly with late age at onset, good response to levodopa therapy, slower disease progression and no early development of dementia. An interesting familial case, the Swedish-American family (named the “Lister family”), presents both duplicated and triplicated SNCA carriers within different branches of the pedigree (branches J and I), suggesting a primary duplication event followed later by another one, resulting in the triplication [52, 53]. A similar pedigree, the Ikeuchi family, has both heterozygous and homozygous duplications arising from a consanguineous marriage (the latter producing a pseudo-triplication). The individual with the homozygous SNCA duplication showed severe Parkinsonism similar to that of a triplication carrier.
In 2007, Ahn et al. first reported two sporadic patients with SNCA duplication from a large screen of PD patients. The age at onset was 65 and 50 years for the two patients. Their clinical course was similar to typical sporadic PD without severe progression or cognitive decline. Other cases of sporadic PD carrying de novo SNCA duplication were later revealed by different detection assays [55–59].
The breakpoints of SNCA multiplications differ among families and sporadic patients. The largest multiplication detected so far is about 41.2 Mb, contains 150 genes and is defined as a partial trisomy 4q. The smallest one spans about 0.2 Mb and was found in a Japanese family. The size and gene make-up of each multiplied region do not seem to severely influence the clinical presentation of the carrier. The single common determining factor shared by all patients with SNCA multiplications is the presence of more than two copies of the entire gene.
Significant data concern the mosaicism of SNCA rearrangements, that is, the situation in which not all cells of the body have the same genetic composition and specific groups of cells have a different genomic architecture. In this regard, Perandones et al. have reported two interesting cases. In these patients, the exon dosage test conducted on peripheral blood revealed no alterations, while FISH analysis conducted on interphase cells from the buccal cavity (oral mucosa cells) revealed a substantial percentage of cells with SNCA triplication or duplication. It should be noted that both patients displayed a Parkinsonian clinical phenotype already described for SNCA duplication or triplication carriers. Since usually only DNA from peripheral blood lymphocytes is examined for SNCA rearrangements, the examination of cells from other tissues should be considered in order to detect low-grade mosaicism.
Although the SNCA story suggests a gain of function, several early-onset forms of PD have demonstrated the role of loss of function in the aetiology of the disease. The most common loss-of-function causes of early-onset PD are mutations in the Parkin (or PARK2) gene, one of the largest known genes of our genome, located on the long arm of chromosome 6 (6q25.2-q27) and encoding an E3 ubiquitin ligase. Mutations of PARK2 are particularly frequent in individuals with evidence of familial recessive inheritance and account for 50% of the cases of juvenile PD. Parkin mutations also explain ~15% of the sporadic cases with onset before 45 [61, 62] and act as susceptibility alleles for late-onset forms of PD (2% of cases).
The PARK2 gene has a high mutation rate since it is located in the core of FRA6E, one of the most mutation-susceptible common fragile sites of the human genome. Accordingly, more than 200 putative pathogenic mutations have been reported worldwide, affecting numerous ethnic populations [33, 35, 59, 64–77]. The PARK2 mutation spectrum includes homozygous or compound heterozygous missense and nonsense point mutations, as well as several exon rearrangements (both duplications and deletions) involving all 12 exons and the promoter region. The number of studies focusing on PARK2 CNVs is so large that describing them one by one would be impractical. The Parkin CNV mutations published so far are summarized in Figure 4 and are collected in the Parkinson Disease Mutation database (http://www.molgen.vib-ua.be/PDMutDB), to which the reader is referred for a more complete overview. Recently, our research group has outlined a complex alternative splicing mechanism regulating the expression of PARK2 [78–80]. These data suggest the existence of five additional exons that, however, have never been considered for dosage screening.
CNV rearrangements involving PARK2 exons account for 50–60% of all pathogenic anomalies, rendering gene-dosage assays essential in Parkin mutational screening. However, the hot-spot nature of this gene makes its quantitative analysis a particular challenge, and several issues need to be pointed out in this regard.
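To make the idea of a gene-dosage assay concrete, the sketch below estimates the copy number of a target exon from quantitative PCR data with the widely used comparative Ct (2^-ΔΔCt) calculation. It is only an illustration: it assumes roughly 100% amplification efficiency for both amplicons, a reference gene present in two copies, and a control sample with two copies of the target. The function name and all Ct values are hypothetical and do not come from the studies cited here.

```python
def relative_copy_number(ct_target_sample: float,
                         ct_ref_sample: float,
                         ct_target_control: float,
                         ct_ref_control: float,
                         control_copies: int = 2) -> float:
    """Estimate target copy number with the comparative Ct method.

    Assumes ~100% PCR efficiency, i.e. one cycle corresponds to a
    two-fold difference in starting template.
    """
    # Normalize the target to the reference gene within each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the test sample with the two-copy control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return control_copies * 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target exon amplifies one cycle later in
# the patient than in the control, consistent with a heterozygous deletion.
copies = relative_copy_number(26.0, 24.0, 25.0, 24.0)
print(round(copies, 2))  # 1.0
```

An estimate near 1 would be compatible with a heterozygous deletion and one near 3 with a heterozygous duplication. A homozygous deletion gives no amplification at all and has to be handled separately, and such a ratio by itself says nothing about whether altered exons lie on the same allele.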
The first issue is the determination of the mutational phase of the rearrangements, that is, establishing whether amplified or deleted exons are really contiguous on the same allele. Kim et al. have shown that phase determination is a prerequisite for PARK2 molecular diagnosis: by phase determination, several patients with apparently contiguous multi-exon deletions were re-diagnosed as compound heterozygotes. Simple gene-dosage assays do not seem sufficient to determine the phase of rearrangements, and therefore the true incidence of molecularly confirmed Parkin-type early-onset PD may be underestimated.
A second important point concerns breakpoint mapping, which can be useful to compare exon rearrangements between patients and families and to study the possible underlying mechanism. Only a few papers have addressed this issue so far, and they mostly report rearrangements within the region between PARK2 exons 2 and 8 [25, 82]. In the majority of mapped cases, micro-homologies at breakpoint junctions were present, thus supporting NHEJ and FoSTeS/MMBIR as the major mechanisms responsible for PARK2 genomic rearrangements.
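The notion of micro-homology at a breakpoint junction can be illustrated with a small Python sketch. Given a reference sequence and the coordinates of a deletion, it returns the bases that are identical at the two breakpoints, so that the junction could be placed at more than one position. The function name and the toy sequence are hypothetical; real breakpoint analyses start from sequenced junction fragments aligned to the reference genome, which this sketch does not attempt.

```python
def junction_microhomology(reference: str, del_start: int, del_end: int) -> str:
    """Return the micro-homology at a deletion junction.

    reference[del_start:del_end] is the deleted segment
    (0-based coordinates, end-exclusive). An empty string means a
    blunt junction without micro-homology.
    """
    if not 0 <= del_start < del_end <= len(reference):
        raise ValueError("invalid deletion coordinates")
    ref = reference.upper()

    # Count how far the deletion could be shifted to the left
    # without changing the resulting junction sequence.
    left = 0
    while (del_start - left - 1 >= 0
           and ref[del_start - left - 1] == ref[del_end - left - 1]):
        left += 1

    # Count how far it could be shifted to the right.
    right = 0
    while (del_end + right < len(ref)
           and ref[del_start + right] == ref[del_end + right]):
        right += 1

    return ref[del_start - left:del_start + right]


# Toy reference: two copies of "ACG" flank the segment "TTTTTT".
toy_ref = "GGATC" + "ACG" + "TTTTTT" + "ACG" + "CCTAA"
print(junction_microhomology(toy_ref, 8, 17))  # ACG
```

Because the same junction can be described with shifted coordinates, the function scans in both directions, so that equivalent descriptions of one deletion yield the same micro-homology.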
Despite these hypotheses, it cannot be excluded that PARK2 CNVs may arise in some minor ethnic groups from an ancient founder. For example, haplotype analysis in four families from the Netherlands has shown a common haplotype of 1.2 Mb responsible for exon 7 duplication and a common haplotype of 6.3 Mb responsible for exon 4 deletion.
The pathogenic role of single heterozygous PARK2 CNVs is a matter of ongoing debate. Several studies have sought to address this issue, but the findings published so far are inconsistent and conflicting. Some reports indicate that heterozygous CNV mutations in PARK2 are associated with increased PD risk [49, 68], whereas others found no association [33, 69]. Not only association studies but also examinations of families have yielded contradictory results. Heterozygous family members of homozygous carriers have been described with mild PD signs of late onset [84, 85], whereas another group found no typical clinical signs of the disease. Very recently, Huttenlocher et al. genotyped a large sample of Icelanders, observing that PD patients were more often heterozygous carriers of CNVs than controls, and confirmed these results by a meta-analysis. These findings again seem to suggest that heterozygous carriers of PARK2 CNVs have a greater risk of developing PD than non-carriers.
Pathogenic mutations in the PTEN-induced kinase 1 (PINK1) gene are much less common than PARK2 mutations, with a frequency varying from 1 to 9% depending on the ethnic background. The encoded protein is a putative serine/threonine kinase of 581 amino acids involved in the mitochondrial response to cellular and oxidative stress.
Homozygous deletions involving different combinations of exons 4–8 have been described in both familial and sporadic early-onset cases from Japan, Brazil, Sudan and Iran [38, 88–91]. In only one of them has breakpoint analysis been performed, revealing a complex rearrangement combining a large deletion and the insertion of a duplicated sequence from intron 2 of the neighbouring DDOST gene. As suggested by breakpoint analysis, this rearrangement may result from the FoSTeS mechanism. Only one case of a sporadic patient carrying two different CNVs in PINK1 (compound heterozygous mutations) has been reported until now, consisting of an exon 2 deletion in one allele and an exons 2–4 deletion in the other.
The mutation spectrum of PINK1 CNVs is broadened by the heterozygous cases, which, however, do not explain the recessive inheritance. The largest heterozygous deletion published so far includes the entire PINK1 gene and spans 56 kb. This deletion also partly involves two neighbouring genes and two highly similar AluJo repeat sequences. It is likely that the deletion results from an unequal crossing-over between these two sequences. Further heterozygous deletions involving exons 1, 3–8 and exon 7 have been described in familial or sporadic cases of early-onset PD [72, 93, 94].
The PARK7 locus on chromosome 1p36 was localized by homozygosity mapping in two consanguineous families from genetically isolated communities in the Netherlands and Italy [95, 96]. In one of the families, a 14 kb deletion involving the first five of seven exons in the DJ-1 gene has been identified . Furthermore, three siblings of Iranian origins born from consanguineous parents and carriers of a homozygous deletion of exon 5 have been described .
The product of DJ-1 is a highly conserved multifunctional protein belonging to the peptidase C56 family . It acts as positive regulator of transcription, redox-sensitive chaperone and sensor for oxidative stress and apparently protects neurons from ROS-induced apoptosis [98, 99]. Alu repeat elements flank the deleted sequence on both sides, suggesting that unequal crossing-over is likely at the origin of this rare genomic rearrangement .
Further heterozygous CNVs (both deletions and duplications) involving the exons of the DJ-1 gene have been published so far [93, 100–102], although they do not completely explain the recessive pattern of the PD phenotype.
ATP13A2 mutations are associated with Kufor–Rakeb syndrome (KRS), a recessively inherited, levodopa-responsive form of atypical Parkinsonism. The gene encodes a large protein belonging to the ATPase transmembrane transporters, which has recently been identified as a potent modifier of the toxicity induced by alpha-synuclein.
To our knowledge, just one family from Iran with deletion of ATP13A2 has been reported . Specifically, three affected siblings born from consanguineous parents have been described as carriers of a homozygous deletion of exon 2. All three affected individuals had moderate mental retardation, aggressive behaviours, visual hallucinations, supranuclear vertical gaze paresis, slow vertical saccades and dystonia. Cognitive function deteriorated rapidly and all three affected individuals had dementia by age 10. Further clinical and genetic follow-up of KRS patients will increase the knowledge on the natural history and clinical features of this syndrome.
4.2. Rare single-gene CNVs in other PD-genes
The tyrosine hydroxylase (TH) gene encodes a monooxygenase that catalyses the conversion of L-tyrosine to L-dihydroxyphenylalanine (L-DOPA), which constitutes the rate-limiting step in dopamine biosynthesis. Consistent with the essential role of TH in dopamine homeostasis, missense mutations in TH have been associated with severe Parkinsonism-related phenotypes, such as Segawa’s syndrome, L-DOPA-responsive infantile Parkinsonism or L-DOPA-responsive dystonia (DRD) in recessive form .
In 2010, Bademci et al. reported for the first time a 34 kb deletion of the entire TH gene in a PD patient with an age at onset of 54 years, who presented no evidence of dystonia but was responsive to L-DOPA treatment. The deletion was first identified by CNV analysis in a GWAS using SNP arrays, then confirmed by multiple quantitative PCR assays, and was not found in any of 642 controls.
More recently, Ormazabal et al. reported a DRD patient with a deletion in TH encompassing several exons. A heterozygous exon 12 deletion was first identified by MLPA. Additional analysis with long-range PCR over the breakpoints confirmed the deletion, which encompasses a segment of 716 bp (c.1197 + 25_1391del) including exon 12 and part of exon 13. The patient also had a heterozygous mutation (c.1 – 70 G > A) in the promoter region on the other allele. The promoter mutation was believed to be the underlying cause of the DRD phenotype, but the authors recommend the inclusion of structural variant analysis for those patients with clinical and biochemical features of TH deficiency lacking molecular confirmation by the usual sequencing techniques.
In 2011, two independent groups reported the identification of the same missense mutation (p.Asp620Asn) in the vacuolar protein sorting 35 (VPS35) gene as a novel disease-causing mutation in large autosomal-dominant PD pedigrees of Austrian and Swiss origin [107, 108]. After this discovery, Verstraeten et al. performed in-depth sequence and dosage analyses of VPS35 in an extended LBD patient group comprising patients with PD, PD with dementia and dementia with LBs living throughout Flanders. The dosage analyses were performed using multiplex amplicon quantification, but no CNVs were detected in this patient cohort. Despite these data, it cannot be excluded that CNVs in VPS35 may occur in other PD patient groups and contribute to PD onset. It has very recently been demonstrated that deletion of VPS35 specifically in the dopaminergic neurons of mice results in PD-like deficits, including loss of dopaminergic neurons and accumulation of alpha-synuclein.
The progranulin (PGRN) gene is expressed in a wide variety of tissues, including neuronal and microglial populations of the central nervous system, and encodes an autocrine growth factor. Its loss-of-function mutations are responsible for ubiquitin-positive frontotemporal lobar degeneration linked to chromosome 17 (FTLDU-17). A deletion of exons 1–11 of PGRN has been reported in a patient with typical PGRN neuropathology and also in his sister, who presented with PD. The deletion resulted from a non-homologous recombination event and was measured by using quantitative multiplex PCR of short fluorescent fragments. Although PGRN mutations are certainly not a major cause of PD, these data suggest that PGRN CNVs may contribute to PD mechanisms.
A relevant feature of PD development is the abnormal iron deposition in the SN of PD patients. The HMOX (heme oxygenase) protein degrades the heme ring to biliverdin, free ferrous iron and carbon monoxide, which is the rate-limiting step in heme catabolism. The isoform HMOX1 is highly inducible in response to oxidative stress, which is considered a significant pathway altered in PD. Based on these findings, Ayuso et al. analysed exon 3 dosage alterations in the HMOX1 gene in 691 patients suffering from PD and 766 healthy control individuals. CNVs of the HMOX1 gene were analysed using a TaqMan assay designed to hybridize just within HMOX1 exon 3. CNV analyses in the whole study group revealed three patients with PD and six control individuals with a single copy of the HMOX1 gene. Thus, the authors conclude that HMOX1 CNVs exist but do not seem to have a major association with PD risk. However, the occurrence of structural alterations in other HMOX1 exons cannot be excluded.
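The carrier counts quoted above (3 of 691 patients and 6 of 766 controls with a single HMOX1 copy) can be arranged in a 2×2 table to illustrate how such a case-control comparison might be examined. The Python sketch below computes an odds ratio and a two-sided Fisher's exact test using only the standard library. It is an illustration of the arithmetic under these counts and is not necessarily the statistical procedure used by the original authors; the function names are hypothetical.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(x: int) -> float:
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    p_observed = pmf(a)
    total = 0.0
    for x in range(lowest, highest + 1):
        p = pmf(x)
        if p <= p_observed * (1 + 1e-9):  # tolerance for float rounding
            total += p
    return min(1.0, total)


# Rows: PD patients, controls. Columns: single-copy carriers, non-carriers.
carriers_pd, noncarriers_pd = 3, 691 - 3
carriers_ctrl, noncarriers_ctrl = 6, 766 - 6

print(round(odds_ratio(carriers_pd, noncarriers_pd,
                       carriers_ctrl, noncarriers_ctrl), 2))  # 0.55
print(fisher_exact_two_sided(carriers_pd, noncarriers_pd,
                             carriers_ctrl, noncarriers_ctrl))
```

With so few carriers in either group, the estimate is very imprecise, which is in line with the cautious conclusion reported above.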
4.3. The 22q11.2 deletion
The 22q11.2 deletion syndrome (22qDS), also known as Di George or velocardiofacial syndrome, is a multi-system disorder caused by a chromosomal microdeletion most commonly involving a 3 Mb segment on the long arm of chromosome 22. Several case reports of individuals with the hemizygous deletion of chromosome 22q11.2 and some clinical features of Parkinsonism have suggested that this genetic anomaly may also confer an increased risk of early-onset PD [113–115]. Some of these cases were reported to be treated with L-DOPA, while in other case, presynaptic dopamine imaging indicated degeneration of the nigrostriatal dopamine system [113–115]. To investigate the association between 22q11.2 deletions and PD, Butcher and colleagues  assessed the occurrence of a clinical diagnosis of PD in a well-characterized cohort of adults with 22qDS. They also examined available brain tissue from 3 individuals with 22qDS and an ante-mortem diagnosis of PD. They reported four patients with the 22q11.2 hemizygous deletion and diagnosed of early-onset PD. Three of them also displayed typical neuropathological features with prominent LBs and Lewy neurite formations on autopsy examination. The authors concluded chromosome 22q11.2 deletions may represent a novel risk factor for early-onset PD and, with their neuropathological data, excluded that Parkinsonian clinical features can be due to adverse effects of antipsychotics.
Ogaki and Ross proposed that it is not the microdeletion per se that is responsible for the phenotype, but rather the complete loss of function of a gene at the locus due to the combination of the deletion and a mutation on the other allele. Indeed, the chromosome 22q11.2 region contains some excellent candidate genes, such as COMT, which encodes catechol-O-methyltransferase, an enzyme involved in the catabolism of catecholamines including dopamine, and which therefore plays a role in regulating dopamine levels (COMT inhibitors have been used in the treatment of PD). The deleted region also contains SEPT5, encoding SEPTIN5, which functionally interacts with PARK2, and DGCR8, which encodes a subunit of a complex that mediates the biogenesis of microRNAs, including miR-185 (also encoded within the chromosome 22q11.2 deletion), which is predicted to target LRRK2.
Very recently, a 37-year-old early-onset PD patient was found to carry the 22q11.2 deletion while lacking the more severe features of 22qDS, such as cardiac defects, palatal defects and hypocalcemia. The deletion was revealed by chromosomal microarray analysis, suggesting that this genetic test should be considered as part of the evaluation of patients with early-onset PD and other features associated with 22qDS. Interestingly, Perandones et al. reported a case of mosaicism in a patient of Ashkenazi Jewish origin with a history of midline defects and PD onset at 46 years. In this patient, FISH detected a mosaic 22q deletion in 24% of the analysed blood cells, highlighting the relevance of performing cell-by-cell analysis, at least until single-cell sequencing becomes optimized and generally available. The pathogenesis of early-onset PD in patients with 22qDS remains unknown but, if elucidated, it may contribute to understanding the aetiology of PD and ultimately to prevention and treatment strategies.
4.4. CNVs involving the mtDNA of PD patients
Mitochondrial dysfunction has been implicated in the pathogenesis of idiopathic PD. Accumulating evidence of mitochondrial DNA (mtDNA) anomalies has been reported in PD patients, including increased mtDNA deletions/rearrangements in both cerebral areas (SN and striatum) and peripheral tissues (skeletal muscle) [120–122]. The number and variety of mtDNA deletions/rearrangements seem to be selectively increased in the SN of PD patients compared with patients with other disorders (multiple system atrophy and dementia with LBs), patients with Alzheimer's disease and age-matched controls.
More recently, Gui et al. analysed the copy number of mtDNA using quantitative real-time PCR in 414 PD cases and 231 healthy subjects from mainland China. The level of mtDNA was significantly decreased in the peripheral blood of PD patients compared with that of healthy controls. Furthermore, a lower mtDNA copy number was detected more frequently in the group with older age at onset than in the younger group, suggesting that mtDNA content might be an important factor in PD progression. In addition, using direct sequencing, they examined mutations in the D-loop (the region that controls replication and transcription of mtDNA) and in the nuclear POLG1 gene. The results revealed that 17% of the PD patients carried mutations in the D-loop of mtDNA and that patients carrying mutated POLG1 had a significantly lower mtDNA copy number than PD patients without POLG1 alterations, suggesting a role for POLG1 variations in reducing mtDNA copy number in PD.
5. Genome-wide studies to map CNVs in PD
Traditional attempts to identify the genetic lesions underlying monogenic forms of disease have relied on linkage mapping in large pedigrees with several affected individuals of known relationship. Genome-wide assays now allow the rapid production of high-quality, ultra-dense genotypes, enabling fast mapping and localization of genomic deletions and duplications. However, there is still no consensus on the best approach for the detection or analysis of genome-wide CNVs, and very few studies have been conducted so far.
The first pilot analysis of structural genetic variation in a large cohort of PD patients and neurologically normal controls was carried out using a genome-wide SNP association approach. In this study, no new regions associated with PD were identified, but several deletions and duplications in PARK2 were observed and confirmed by independent gene dosage experiments. Some months later, Kim et al. investigated CNVs in PD patients by array-based comparative genomic hybridization (CGH). They observed several candidate CNVs on many chromosomes, but ultimately concluded that these did not involve any regions harbouring genes implicated in PD pathogenesis or progression.
In 2011, Pankratz et al.  presented the results of the first systematic genome-wide analysis of CNVs for PD using CNV calling algorithms. The final sample included 816 cases and 856 controls. They replicated the association of PD susceptibility with PARK2 CNVs and also detected CNVs in two novel genes, DOCK5 and USP32, associated with an increase in risk for PD at genome-wide significance. However, neither of these novel loci could be validated with independent molecular tests.
To identify novel CNVs and to evaluate their contribution to the risk of PD, Liu et al. more recently conducted a genome-wide scan for CNVs in a case–control dataset (268 PD cases and 178 controls), focusing on a genetic isolate, the Ashkenazi Jewish population. Using high-confidence CNVs, this research group examined the global genome-wide burden of large and rare CNVs. Overall, global burden analyses did not reveal significant differences between cases and controls, but deletions were found 1.4 times more often in cases. Interestingly, a total of 81 PD cases carried a rare genic CNV that was absent in controls. Of note, duplication of the OVOS2 (ovostatin 2) gene on chromosome 12p11.21 was identified as a significant risk factor for PD. In one PD case, a deletion was observed spanning the NSF and WNT3 genes, a region previously identified as a "top hit" in GWAS studies and physically close to the MAPT gene. Other interesting genes include ATXN3, FBXW7, CHCHD3, HSF1, KLC1 and MBD3, which participate in disease pathways together with known PD genes.
Whole-genome microarray detection was used to identify somatic CNVs in PD patients by Pamphlett et al., who investigated the existence of candidate brain-situated genetic variations missing in blood DNA. In this study, a total of 45 CNVs were found in PD brain samples but not in any control brain, involving genes related to mitochondrial function (BCL2, IMMP2L), cellular vesicle formation (NRSN1) and apoptosis (BCL2), pathways implicated in the pathogenesis of PD. Additional private rare CNVs observed in PD brains were also reported. This study shows that brain-specific CNVs can be detected, and raises the possibility that brain-situated mutations could underlie some cases of PD.
6. How to detect and analyse CNVs: molecular methods and bioinformatics web resources
CNVs can be detected and analysed by laboratory methods restricted to certain locations on chromosomes (locus-specific levels), or targeting the whole genome (genome-wide level). In the next sections, we will briefly describe the traditional methodologies, the currently available high-throughput biotechnologies, and the supporting bioinformatics tools to help the detection, analysis and interpretation of CNVs.
6.1. Locus-specific methods
6.1.1. RFLP-Southern blotting
The most conventional method for detecting structural rearrangements is RFLP-Southern blot, which relies on DNA digestion with rare-cutting enzymes, separation of the digested DNA fragments by pulsed-field gel electrophoresis, transfer to membranes and hybridization with appropriate probes. The RFLP-Southern blot method may be a good choice to resolve large CNVs and structural variations. However, it has some drawbacks: 1) very low resolution; 2) a higher cost per analysis compared to other methods; 3) time-consuming and laborious procedures that require more than a week; 4) the need for purified high molecular weight DNA. For these reasons, RFLP-Southern blot is not widely used in either clinical or research settings.
6.1.2. Fluorescence in situ hybridization
FISH is an extremely useful method for the detection of chromosomal abnormalities. This method relies on the hybridization of fluorescently labelled DNA probes to metaphase chromosome spreads (resolution 1–3 Mb) or interphase nuclei (50 kb–2 Mb). The highest resolution is obtained by fibre-FISH (5–500 kb), in which probes are visualized on mechanically stretched chromosome fibres; this is currently the preferred method to precisely determine the genomic structure of complex CNVs. Nowadays, FISH with multiple probes labelled in different colours (multicolour FISH) is widely used in clinical diagnostics as a screening tool to confirm the presence of CNVs and other structural variants. However, FISH has several limitations. Locus-specific probes are expensive, and the procedure is time-consuming and labour-intensive. Furthermore, only a limited number of chromosomal loci can be screened in a single experiment, and the identification of an abnormality is highly dependent on the DNA probe used, specifically on its size and hybridization site.
6.1.3. PCR-based approaches
The first PCR-based technique used for targeted CNV analysis was real-time quantitative PCR (qPCR), which combines "traditional" endpoint PCR with fluorescent detection technologies to record the accumulation of amplicons during PCR cycling. Fluorescence monitoring systems are based on fluorescent probes (e.g. TaqMan, Scorpions, FRET probes) or DNA-intercalating agents (SYBR green). As target sequences accumulate during PCR, the fluorescence signal increases. Amplicon quantification relies on the threshold cycle number (Ct), the cycle at which the fluorescence of the amplified target crosses a fixed threshold, which is directly related to the amount of starting target. A higher or lower starting copy number of a genomic DNA target will therefore result in a significantly earlier or later increase in fluorescence, and thus in a decreased or increased Ct, respectively. The qPCR technique offers great flexibility and adaptability, can be carried out in a closed system, thus reducing the risk of PCR and sample contamination, and does not require post-processing of PCR products. Therefore, it has become one of the most popular methods for CNV analysis, especially for the validation of results obtained by microarray tests.
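The relation between Ct and starting copy number can be illustrated with the widely used comparative Ct (ΔΔCt) calculation. The sketch below is only a minimal example under stated assumptions: the function name, the Ct values, the use of a two-copy reference gene and a diploid calibrator sample, and the default amplification efficiency of 2 (perfect doubling per cycle) are all hypothetical and are not taken from the studies discussed in this chapter.

```python
def copy_number_from_ct(ct_target_test, ct_ref_test,
                        ct_target_cal, ct_ref_cal,
                        calibrator_copies=2, efficiency=2.0):
    """Estimate target copy number with the comparative Ct method.

    ct_*_test : Ct values measured in the test sample
    ct_*_cal  : Ct values measured in a calibrator sample of known copy number
    efficiency: fold increase of amplicon per cycle (2.0 = ideal doubling)
    """
    # Normalise the target to the reference gene within each sample
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_cal = ct_target_cal - ct_ref_cal
    # Compare the test sample with the calibrator
    delta_delta_ct = delta_ct_test - delta_ct_cal
    # A later Ct (positive delta-delta-Ct) means fewer starting copies
    return calibrator_copies * efficiency ** (-delta_delta_ct)

# Hypothetical Ct values: the target crosses the threshold one cycle later
# in the test sample than in the calibrator, relative to the reference gene.
deleted = copy_number_from_ct(26.0, 24.0, 25.0, 24.0)
normal = copy_number_from_ct(25.0, 24.0, 25.0, 24.0)
print(f"estimated copies (delayed Ct): {deleted:.2f}")
print(f"estimated copies (same Ct):    {normal:.2f}")
```

In this toy example a delay of one cycle corresponds to half the starting template, that is, one copy instead of two, as expected for a heterozygous deletion.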
Another widely used PCR-based approach is MLPA (multiplex ligation-dependent probe amplification). Each MLPA probe consists of two oligonucleotide hemiprobes, one synthetic and one derived from the single-stranded M13 bacteriophage. These oligonucleotides hybridize to adjacent sites of the target sequence. Each hemiprobe is flanked by a universal PCR primer site, and one of the hemiprobes also carries a "stuffer" sequence that gives each probe set a different fragment length. After hybridization to genomic DNA, the two hybridized hemiprobes are ligated, so that the number of ligated probes is proportional to the target copy number. PCR amplification is then carried out using a single universal dye-labelled primer set. The resulting PCR products are separated by capillary gel electrophoresis, followed by data analysis to identify CNVs. Since the relative quantity of each PCR product is proportional to the number of copies of the target sequence, results are given as allele copy numbers relative to normal controls. MLPA was specifically developed to screen up to 50 (on average 20–40) independent loci simultaneously, with results typically available after 1–3 days.
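The comparison with normal controls described above is commonly expressed as a dosage quotient: each probe peak is first normalised to reference probes within the same sample and then divided by the corresponding value in a control sample. The following sketch illustrates the arithmetic only. The probe names, peak heights and the cut-off values of 0.7 and 1.3 are hypothetical assumptions, and real analysis software applies additional quality checks.

```python
from statistics import mean

def dosage_quotients(sample_peaks, control_peaks, reference_probes):
    """Return {probe: dosage quotient} for every non-reference probe."""
    def normalise(peaks):
        # Intra-sample normalisation: divide each peak by the mean reference peak
        ref_mean = mean(peaks[name] for name in reference_probes)
        return {name: height / ref_mean for name, height in peaks.items()}

    sample = normalise(sample_peaks)
    control = normalise(control_peaks)
    # Inter-sample normalisation: compare each probe with the same probe in the control
    return {name: sample[name] / control[name]
            for name in sample if name not in reference_probes}

def classify(dq, loss_cutoff=0.7, gain_cutoff=1.3):
    """Label a dosage quotient using assumed cut-offs around the expected value of 1.0."""
    if dq < loss_cutoff:
        return "deletion"
    if dq > gain_cutoff:
        return "duplication"
    return "normal"

# Hypothetical peak heights (arbitrary fluorescence units)
reference_probes = {"ref_1", "ref_2"}
patient = {"ref_1": 1000.0, "ref_2": 2000.0, "exon_A": 750.0, "exon_B": 1500.0}
control = {"ref_1": 1000.0, "ref_2": 2000.0, "exon_A": 1500.0, "exon_B": 1500.0}

quotients = dosage_quotients(patient, control, reference_probes)
for probe, dq in sorted(quotients.items()):
    print(probe, round(dq, 2), classify(dq))
```

A quotient near 0.5 is what a heterozygous deletion would be expected to produce, and a quotient near 1.5 a heterozygous duplication.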
6.2. Genome-wide high-throughput biotechnologies
As anticipated, the traditional methodological approaches described above have objective limits: they are time-consuming and labour-intensive, they require multiple steps and costly equipment and, above all, they do not provide a complete genomic overview of structural imbalances at sufficiently high resolution. The recent development of aCGH and next generation sequencing (NGS) biotechnologies has dramatically improved and accelerated the detection and characterization of multiple CNVs, offering high reproducibility, high resolution and scalability for complete genome-wide mapping of imbalances [129–133].
6.2.1. CGH array
The application of whole-genome high-resolution aCGH (comparative genomic hybridization array) platforms for detecting deletions or duplications has grown extensively. This biotechnology is now recognized as the first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. In addition, several customized high-density aCGH platforms, designed to focus on specific clinically relevant chromosomal locations with improved resolution, have been developed [133, 135, 136]. Customized aCGH platforms have already been applied to different human diseases, including neuromuscular diseases, cancer, autism, epilepsy, multiple sclerosis, and mitochondrial and metabolic disorders [132, 137–143].
The CGH method labels the test (unknown or experimental) and reference DNA samples with two different fluorescent dyes and hybridizes them simultaneously on a glass microarray spotted with millions of probes. By measuring the ratio between the fluorescence signals of the two dyes, and assuming that the reference sample has a normal diploid copy number, CNVs can be detected as either a gain or a loss of signal in the test sample.
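The dye ratio is usually expressed on a log2 scale, so that equal signals give 0, a single-copy loss in a diploid genome gives log2(1/2) = −1 and a single-copy gain gives log2(3/2) ≈ 0.58. The sketch below shows this calculation for a few probes. The intensity values, the function names and the calling thresholds of ±0.3 are hypothetical assumptions chosen for illustration; in practice thresholds depend on the platform and on noise levels.

```python
from math import log2

def log2_ratios(test_intensity, reference_intensity):
    """Per-probe log2(test / reference) fluorescence ratio."""
    return [log2(t / r) for t, r in zip(test_intensity, reference_intensity)]

def call_probe(ratio, loss_threshold=-0.3, gain_threshold=0.3):
    """Label one probe as loss, gain or normal using assumed thresholds."""
    if ratio <= loss_threshold:
        return "loss"
    if ratio >= gain_threshold:
        return "gain"
    return "normal"

# Hypothetical background-corrected intensities for three probes
test_signal = [1000.0, 500.0, 1500.0]
reference_signal = [1000.0, 1000.0, 1000.0]

ratios = log2_ratios(test_signal, reference_signal)
calls = [call_probe(r) for r in ratios]
for ratio, call in zip(ratios, calls):
    print(f"log2 ratio {ratio:+.2f} -> {call}")
```

Real analyses do not call single probes in isolation but require several consecutive probes to agree before reporting a CNV.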
6.2.2. Next generation sequencing
In the last few years, NGS technology has become more frequently used in various fields of life science. NGS is a high-throughput technology that can output millions or billions of short reads from shotgun sequencing, and thus provides high-resolution mapping of genomic regions. The huge amount of data can be used for de novo assembly, SNP calling, and the detection of structural variations and CNVs at high resolution. Recently, a variety of algorithms designed to analyse CNVs have been proposed [144–147], and there is not yet agreement on which performs best. The application of next-generation sequencing platforms to PD genetic studies promises to improve resolution and reveal new clues to its molecular causes.
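As an illustration of how short reads can be turned into CNV calls, the sketch below follows the read-depth principle, one common family of approaches: reads are counted in fixed-size genomic bins, the counts are normalised and compared with a control sample, and runs of consecutive deviating bins are reported. This is a deliberately simplified example and not one of the published algorithms cited above. The bin size, thresholds, read counts and function names are hypothetical assumptions.

```python
from math import log2
from statistics import median

def depth_log2_ratios(test_counts, control_counts, pseudocount=1.0):
    """Median-normalised log2 ratio of test versus control read counts per bin."""
    test_median = median(test_counts) + pseudocount
    control_median = median(control_counts) + pseudocount
    ratios = []
    for t, c in zip(test_counts, control_counts):
        test_norm = (t + pseudocount) / test_median
        control_norm = (c + pseudocount) / control_median
        ratios.append(log2(test_norm / control_norm))
    return ratios

def call_segments(ratios, bin_size, loss=-0.4, gain=0.3, min_bins=2):
    """Merge consecutive bins with the same state into (start, end, state) segments."""
    segments = []
    start, state = None, None
    # A trailing neutral value acts as a sentinel that closes any open run
    for i, ratio in enumerate(list(ratios) + [0.0]):
        current = "loss" if ratio <= loss else "gain" if ratio >= gain else None
        if current != state:
            if state is not None and i - start >= min_bins:
                segments.append((start * bin_size, i * bin_size, state))
            start, state = i, current
    return segments

# Hypothetical read counts in eight consecutive 1 kb bins
test_counts = [100, 102, 50, 49, 51, 98, 100, 101]
control_counts = [100] * 8

ratios = depth_log2_ratios(test_counts, control_counts)
segments = call_segments(ratios, bin_size=1000)
print(segments)
```

Published tools add corrections that are omitted here, such as adjustment for GC content and mappability, and statistical segmentation instead of fixed thresholds.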
6.3. Online web resources
The collection of CNVs into web databases has provided an essential tool for diagnostic laboratories interpreting results and for researchers. Publicly available repositories include the Database of Genomic Variants (DGV; http://dgv.tcag.ca/dgv/app/home), the International Standard for Cytogenomic Arrays (ISCA; http://www.clinicalgenome.org), the Database of Chromosomal Imbalance and Phenotype in Human using Ensembl Resources (DECIPHER; http://decipher.sanger.ac.uk), and the NCBI Database of Genomic Structural Variations (dbVAR; http://www.ncbi.nlm.nih.gov/dbvar).
CNVs deposited in these databases are currently loaded into the Human Genome Browser (http://genome-euro.ucsc.edu/index.html) as searchable tracks (Figure 5). Altogether, these databases provide updated information on CNV frequencies in unaffected controls, and classify deposited CNVs as benign, likely benign, variant of uncertain significance (VOUS), likely pathogenic or pathogenic.
7. Concluding remarks
Several lines of evidence suggest an extensive and complex genetic action of CNVs on PD etiopathogenesis. Thus far, unfortunately, only a small portion of the genetic variance has been identified, and substantial components remain unknown. Assessing the global genome-wide burden of large CNVs and elucidating the role of de novo rare structural variants in PD may reveal new candidate genes, explain a portion of the "missing heritability" (for example, new susceptibility or causative factors that overall converge on the PD syndrome) and consequently improve diagnosis and counselling of mutation carriers. The forthcoming new era of genomic data promises to increase resolution and uncover interesting new perspectives.
This work was supported by the Italian Ministry of Education, Universities and Research through Grant CTN01_00177_817708 and the international Ph.D. program in Neuroscience of the University of Catania. The authors gratefully acknowledge Cristina Calì, Alfia Corsino, Maria Patrizia D’Angelo and Francesco Marino for their administrative and technical support.