Candidate gene list. OMIM, Online Mendelian Inheritance in Man; AR, autosomal recessive; AD, autosomal dominant; XL, X-linked; nd, not defined; *, in progress (OMIM).
Inherited macrothrombocytopenias comprise a heterogeneous group of inherited platelet disorders that are characterized by large platelets, thrombocytopenia and bleeding tendencies in affected individuals. Diagnostic platforms have traditionally involved a battery of complex phenotypic tests that often fail to reach a diagnosis. Next-generation sequencing lacks the pre-analytical and analytical shortcomings of these tests and provides an attractive alternative diagnostic approach. Our group has developed a candidate gene array targeting genes known to affect platelet function and tested it in a large cohort of Australasian patients with presumed platelet function disorders, particularly macrothrombocytopenia. This array identified causative variants in a significant proportion of patients with uncharacterized platelet disorders, including transcription factor mutations that cannot easily be diagnosed with standard platelet phenotyping procedures. We propose that targeted genotypic screening can identify the genetic basis of platelet function defects and has the potential to be developed into a powerful clinical platform to help clinicians diagnose these rare disorders.
- Inherited macrothrombocytopenia
- next-generation sequencing
- candidate gene array
Platelets are essential for clot formation after tissue trauma. Initiation of the platelet plug occurs by adhesion of platelets to the damaged vascular endothelium, mediated by interactions of glycoprotein Ib/IX/V complexes with von Willebrand factor (vWF), and of GPVI and integrin α2β1 with collagen. Extension of the platelet plug requires activation of αIIbβ3 through an “inside-out” signaling cascade, which enables receptor cross-linking with fibrinogen and vWF and activation of “outside-in” signaling events [1, 2].
Primary hemostasis relies on both adequate function and number of platelets. Abnormalities in platelet function and/or number may be acquired (liver disease, chronic kidney disease) or inherited (inherited platelet function disorders, IPFDs, or inherited platelet number disorders, IPNDs). The group of inherited macrothrombocytopenias is included in the heterogeneous IPNDs and is characterized by large platelets, thrombocytopenia and bleeding tendencies in affected individuals (Figure 1A, Figure 1B, Figure 1C and Figure 1D).
Unfortunately, inherited macrothrombocytopenia is under-recognized: the presence of large platelets on blood film examination often leads to a misdiagnosis of immune thrombocytopenic purpura (ITP), resulting in inappropriate treatment with steroids or, in some cases, removal of the spleen. Diagnostic algorithms have traditionally been based around biological laboratory tests examining functional properties and activation pathways in isolated platelets [3, 5–7]. This phenotypic approach is poorly standardized, technically difficult and not easily reproducible [6–11]. In addition, numerous pre-analytical variables may affect phenotypic test results. These variables include the effects of food (garlic), alcohol, drugs (herbal remedies, non-steroidal anti-inflammatory drugs, anti-platelet medications) and stimulants (smoking and caffeine) on platelet function; activation of platelet samples during venipuncture and transport, which necessitates careful sample handling; and the relatively large volume of blood needed (a major problem when assessing pediatric samples) [12–14]. Despite these complex phenotypic tests, many cases remain without a definitive diagnosis.
Genetic technology may overcome many of the problems surrounding phenotypic testing for thrombocytopenia as DNA is stable, can easily be transported long distances and is not affected by diet or drugs. Moreover, genetic-based tests have provided opportunities to reduce redundancy and heterogeneity of diagnostic algorithms and have shifted our ability to describe inherited platelet disorders from a level of the defective platelet pathway involved, to a molecular level.
The Sanger sequencing method has long been considered the “gold standard” technology to rapidly analyze small regions across a limited number of samples, but it is not suited to screening large numbers of genes in multiple patients. The emergence of next-generation sequencing (NGS) technologies as a diagnostic approach has made it possible to generate more sequence per test, increasing the number of gene targets and decreasing costs [17, 18]. Human whole genome sequencing (WGS) and whole exome sequencing (WES) [19, 20] have proven to be clinically appropriate and practical modalities for describing new genetic mutations in families and identifying known pathogenic mutations in individuals formerly without a diagnosis.
Testing approaches may vary depending on whether a novel genetic mutation is likely. WGS and WES are powerful platforms for discovering novel causal variants in individuals with rare penetrant monogenic disorders, whilst a candidate gene approach allows assessment of known mutations in genes causing clinical phenotypes.
Whole genome approaches incorporating NGS have recently reported novel mutations in an essential platelet transcription factor GFI1B [22, 23], and a WES approach followed by targeted Sanger sequencing was used successfully to describe mutations in
2. Materials and methods
Diagnostic assessment of patients with uncharacterized thrombocytopenia was performed as part of a human research ethics committee approved study conducted in accordance with the Declaration of Helsinki.
Following informed written consent, 20 ml of blood was taken from an antecubital vein and collected into EDTA tubes. This blood was easily transported, in some cases, over 1,000 km between diagnostic sites in Australia.
A total of 95 patient DNA samples were analyzed. This included two internal controls for which DNA-based diagnosis had previously been established by Sanger sequencing.
Thirty-two male patients (mean age 37.4 years, range 18–92 years) and 44 female patients (mean age 38.7 years, range 18–79 years) were included in the NGS assay. The mean age of the cohort was 38.1 years (range 18–92 years). Sixteen de-identified DNA samples were received from referring institutions for which no additional laboratory data were available.
Phenotypic testing data were available for 59 (62.1%) individuals. This included platelet functional analysis (PFA) (
2.2. DNA preparation
Genomic DNA (gDNA) was isolated from peripheral blood leukocytes using the Wizard® Genomic DNA purification kit (Promega, Alexandria, NSW, Australia). DNA quality and concentration were assessed using the Nanodrop™ 1000 spectrophotometer (Thermo Scientific, Scoresby, Vic, Australia), which measures the purity of DNA by the ratio of absorbance at 260 and 280 nm. Samples with ratios between 1.8 and 2.0 were accepted for analysis, whilst lower ratios may indicate the presence of contaminants and these samples were not processed further. At least 250 ng of input gDNA was prepared per sample.
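The acceptance rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the `DnaSample` class, its field names and the `passes_qc` function are hypothetical, and rejecting ratios above 2.0 is an assumption, since the text only states that ratios below the accepted window were not processed further.

```python
from dataclasses import dataclass

# Thresholds taken from the text: A260/A280 between 1.8 and 2.0, at least 250 ng input.
MIN_RATIO = 1.8
MAX_RATIO = 2.0
MIN_INPUT_NG = 250.0


@dataclass
class DnaSample:
    """Hypothetical record for one gDNA sample (names are illustrative)."""
    sample_id: str
    a260: float      # absorbance at 260 nm
    a280: float      # absorbance at 280 nm
    input_ng: float  # amount of gDNA available for library preparation


def purity_ratio(sample: DnaSample) -> float:
    """Return the A260/A280 ratio used as a purity estimate."""
    if sample.a280 <= 0:
        raise ValueError("A280 must be positive to compute a purity ratio")
    return sample.a260 / sample.a280


def passes_qc(sample: DnaSample) -> bool:
    """Accept a sample only if purity and input amount meet the thresholds.

    Assumption: ratios above MAX_RATIO are also rejected; the source text
    only describes the handling of ratios below the window.
    """
    ratio = purity_ratio(sample)
    return MIN_RATIO <= ratio <= MAX_RATIO and sample.input_ng >= MIN_INPUT_NG
```

For example, a sample with absorbances of 1.9 and 1.0 and 300 ng of gDNA would be accepted under these assumptions, whereas the same sample with only 200 ng would not.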
2.3. Candidate gene identification and gene panel design
An extensive literature search using public databases was performed to assemble an initial candidate gene list of all genes reasonably hypothesized to have an impact on platelet number and size (
A TruSeq custom amplicon (TruSeq® Custom Amplicon Kit, Illumina Inc., Scoresby, Vic, Australia) specific for the target regions of the selected 19 genes (Table 1,
|Gene name||Inheritance||Disorder (abbreviation, OMIM)|
|Alpha-Actinin-1||AD||α actinin-related thrombocytopenia (α actinin-RT, 615193)|
|Thrombospondin receptor (Glycoprotein IV)||AD||Familial thrombocytopenia with GPIV deficiency (nd, 608404)|
|V-Ets avian erythroblastosis virus E26 oncogene homolog 1||nd||nd|
|Coagulation factor II (thrombin) receptor||nd||nd|
|Friend leukaemia virus integration 1||AD||Paris-Trousseau syndrome / Jacobsen syndrome (TCPT/JBS, 188025, 600588)|
|GATA-binding protein 1||XL||GATA1-related disorders (GATA1-RD, 300367, 314050)|
|Growth factor-independent 1B||AD||GFI1B-related thrombocytopenia (GFI1B-RT, 187900)|
|Glycoprotein 1b-alpha polypeptide||AR||Bernard Soulier syndrome (BSS, 231200); Platelet type-von Willebrand disease (PT-VWD, 177820); Velocardiofacial syndrome (VCFS, 192430); Mediterranean thrombocytopenia (nd, 153670)|
|Glycoprotein 1b-beta polypeptide||AR||Bernard Soulier syndrome (BSS, 231200)|
|Glycoprotein VI||AR*||Bleeding disorder, platelet type 11|
|Glycoprotein IX||AR||Bernard Soulier syndrome (BSS, 231200)|
|Integrin, alpha-2||AR||GPIa/IIa deficiency (giant platelets and mitral valve insufficiency) (nd, nd)|
|Integrin, alpha-2B||AD||Monoallelic ITGA2B/ITGB3-related thrombocytopenia (ITGA2B/ITGB3-RT, 187800)|
|Integrin, beta-1||AR||GPIa/IIa deficiency (giant platelets and mitral valve insufficiency) (nd, nd)|
|Integrin, beta-3||AD||Monoallelic ITGA2B/ITGB3-related thrombocytopenia (ITGA2B/ITGB3-RT, 187800)|
|Myosin heavy-chain 9||AD||MYH9-related disease (MYH9-RD, 155100)|
|Neurobeachin-like 2||AR||Gray platelet syndrome (GPS, 139090)|
|Purinergic receptor P2Y, G protein-coupled 12||AR*||Bleeding disorder, platelet type 8|
|Runt-related transcription factor 1||AD||Platelet disorder, familial, with associated myeloid malignancy (FDP/AML, 601399)|
|Tubulin, beta-1||AD||β1 Tubulin-related thrombocytopenia (β1 tubulin-RT, 613112)|
2.4. Next-generation sequencing
The TruSeq custom amplicon library preparation kit and the MiSeq Illumina sequencer platform (Illumina Inc.) were used to create the sequencing library and perform resequencing, respectively. All steps were performed in-house according to the manufacturer’s instructions [27, 28].
Library preparation was performed by enrichment of the target regions using an amplicon-based multiplex polymerase chain reaction (PCR) method. Here, a custom amplicon tube (CAT) containing upstream and downstream oligonucleotides specific for the target regions was hybridized to the unfragmented gDNA samples in a 96-well plate. Unbound oligonucleotides were then removed by a series of wash steps using manufacturer-supplied reagents. A proprietary extension–ligation mix containing DNA polymerase and ligase (Illumina Inc.) extended and ligated the upstream bound oligonucleotide through the targeted region to the 5′ end of the downstream oligonucleotide. The resulting extension–ligation products containing the targeted genomic region flanked by common sequences required for amplification were then amplified by standard PCR on a thermal cycler. The amplicon size (250 bp), the number of amplicons in the CAT (632 amplicons) and the type of input DNA (high quality) determined the number of PCR cycles (
The MiSeq Illumina instrument was used to resequence the pooled library by paired-end sequencing. The DNA library was immobilized on the single-use glass-based MiSeq flow cell through the adapter sequences. Bridge PCR amplification then generated clusters of clonal copies of each DNA molecule. These templates were then sequenced using platform-specific reversible dye terminator sequencing-by-synthesis chemistry. Sequence alignment to the reference genome (GRCh37/hg19) was performed using on-instrument software (MiSeq Reporter software, Illumina Inc.) that aligned the reads in BAM format and output variant calls in .vcf files. Variant calls were generated using ANNOVAR software (http://www.openbioinformatics.org/annovar) with an acceptance threshold Q-score of 30, corresponding to a 1:1000 error rate, and genomic datasets were viewed using the Integrative Genomics Viewer (IGV) (www.broadinstitute.org/igv/). Sanger sequencing was performed to provide data for bases with insufficient coverage and to validate variants of clinical significance.
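The correspondence between a Q-score of 30 and a 1:1000 error rate follows from the standard Phred definition, Q = -10 log10(p), where p is the probability that a base call is wrong. The short Python sketch below illustrates the conversion in both directions; the function names are illustrative and are not part of any Illumina or ANNOVAR software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q into the base-call error probability p."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability p (0 < p <= 1) into a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


# Q30 is the acceptance threshold quoted in the text.
q30_error = phred_to_error_probability(30)
```

Under this definition each additional 10 Q-score points corresponds to a tenfold lower error probability, so Q20 corresponds to 1 error in 100 calls and Q30 to 1 in 1000.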
2.5. Data analysis
The University of California, Santa Cruz (UCSC) genome browser (http://genome.ucsc.edu) was used for variant analysis, and variants were cross-checked against databases including the NHLBI Exome Sequencing Project (ESP), the 1000 Genomes Project Database and the Database of Single-Nucleotide Polymorphisms (dbSNP, http://www.ncbi.nlm.nih.gov/SNP/). The bioinformatic tools Sorting Intolerant From Tolerant (SIFT, http://sift.jcvi.org/), Polymorphism Phenotyping-2 (PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/) and MutationTaster (http://www.mutationtaster.org/) were used to predict variant effects on protein structure and function for variants lacking published literature.
2.6. Nomenclature and descriptions for variant reporting
All variants identified were annotated according to Human Genome Variation Society (HGVS) nomenclature for clinical reporting (http://www.hgvs.org). The variant elements included gene name, zygosity, cDNA nomenclature, protein nomenclature, exon number and clinical assertion.
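As a rough illustration of how the six reported variant elements fit together, the following Python sketch models one report entry. It is a hypothetical data structure, not the reporting software used in the study: the class name, field names, the set of zygosity terms and the example values (gene `GENE1`, `c.100A>G`, `p.(Lys34Glu)`) are all placeholders invented for illustration, and the checks only confirm the HGVS `c.` and `p.` prefixes rather than validating full HGVS syntax.

```python
from dataclasses import dataclass

# Assumed vocabulary for zygosity; the source text does not list the terms used.
ZYGOSITIES = {"heterozygous", "homozygous", "hemizygous"}


@dataclass(frozen=True)
class ReportedVariant:
    """One variant entry carrying the six elements named in the text."""
    gene: str                # gene name
    zygosity: str            # zygosity of the variant
    cdna: str                # cDNA-level HGVS description, e.g. "c.100A>G"
    protein: str             # protein-level HGVS description, e.g. "p.(Lys34Glu)"
    exon: int                # exon number
    clinical_assertion: str  # free-text clinical classification

    def __post_init__(self) -> None:
        # Minimal sanity checks only; this is not a full HGVS validator.
        if self.zygosity not in ZYGOSITIES:
            raise ValueError(f"unknown zygosity: {self.zygosity}")
        if not self.cdna.startswith("c."):
            raise ValueError("cDNA description should start with 'c.'")
        if not self.protein.startswith("p."):
            raise ValueError("protein description should start with 'p.'")
        if self.exon < 1:
            raise ValueError("exon number must be a positive integer")

    def summary(self) -> str:
        """Return a single-line summary suitable for a report body."""
        return (f"{self.gene} {self.cdna} {self.protein}, exon {self.exon}, "
                f"{self.zygosity}; {self.clinical_assertion}")


# Placeholder entry; none of these values come from the study cohort.
example = ReportedVariant(
    gene="GENE1",
    zygosity="heterozygous",
    cdna="c.100A>G",
    protein="p.(Lys34Glu)",
    exon=2,
    clinical_assertion="uncertain significance",
)
```

Keeping the clinical assertion as free text is a deliberate simplification here, because the classification categories adopted by the authors are not reproduced in this section.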
Descriptions of sequence variations were adapted from the American College of Medical Genetics and Genomics (ACMG) recommendations for standards for interpretation and reporting of sequence variations and are listed below:
3.1. Next-generation sequencing platform performance
Next-generation sequencing on the Illumina platform produced 13 690 589 (96.74%) reads that passed initial filtering. This process removes any clusters demonstrating excessive intensity corresponding to bases other than the called base. Only reads that passed the quality filter were assigned a quality score. A quality score of Q30, predictive of an error probability of ≤0.1%, was accepted in the run. One sample was excluded from analysis due to poor DNA quality that generated poor quality scores across all genomic regions.
Overall coverage across all genomic targets was 92.3%. This was consistent with the initial software prediction.
3.2. Candidate gene panel results
A total of 703 non-synonymous variants were detected; 75 of these variants were novel and had not been reported in the dbSNP database. An average of eight non-synonymous variants was detected per patient.
Two individuals with known mutations in
Pathogenic mutations were detected in 16 individuals (17.4% of the cohort), whilst 36 individuals (39.1%) had VUS and 40 individuals (43.5%) were without identifiable pathogenic mutations (Table 2, Table 3).
|16||60 mutations in 36 individuals|
|Number of individuals without pathogenic mutations identified: 40|
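The proportions quoted for the three outcome groups can be checked with a short calculation. The sketch below assumes that the denominator is the 92 individuals obtained by summing the three groups (16 + 36 + 40), which reproduces the reported 17.4% and 39.1%; the dictionary keys and the function name are illustrative only.

```python
def group_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Return each group's share of the total as a percentage, to one decimal place."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one individual is required")
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


# Counts taken from the text; the group labels are illustrative.
cohort_counts = {
    "pathogenic mutation": 16,
    "variant of uncertain significance": 36,
    "no pathogenic mutation identified": 40,
}

cohort_total = sum(cohort_counts.values())
cohort_percentages = group_percentages(cohort_counts)
```

Under this assumption the third group works out to approximately 43.5% of the analyzed individuals, and the three shares sum to 100%.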
The candidate array was successful in detecting mutations in genes commonly associated with macrothrombocytopenia and included a total of nine
A homozygous mutation of
Sanger sequencing was also performed in selected samples across regions of low coverage (Q < 30) from those genes in which the clinical significance is widely accepted and included,
The diagnosis of IPFDs and IPNDs using classic phenotypic methods poses a challenge to clinicians and laboratory scientists due to lack of consensus over classification and diagnostic criteria, poor standardization of tests and heterogeneity of traditional diagnostic approaches. This diagnostic conundrum is evident in our cohort, where only 11 patients received a suspected diagnosis to a pathway level following multiple previous phenotypic tests. In addition, only 62% of patients received any form of phenotypic test, reflecting the difficulty of accessing these specialized techniques in many centers.
Sanger sequencing is widely regarded as a reliable platform for routine diagnostic genetic testing and small-scale projects. However, effective analysis of numerous disease-associated genes by Sanger sequencing in a diagnostic setting is time-consuming, expensive and not always feasible. A candidate gene array was selected as it has the potential to simultaneously analyze all of the selected coding regions of disease-targeted genes. Moreover, relative to WES and WGS, it provides good gene coverage and representation of exons, is relatively fast and cheap, and minimizes the problems of unexpected findings and of developing complex downstream bioinformatic pipelines for analysis.
We have demonstrated that high-quality sequence data can be generated from a candidate group of platelet genes using the Illumina MiSeq platform. Our candidate gene panel comprised 19 genes associated with IPNDs, predominantly inherited macrothrombocytopenia. Pathogenic mutations were detected in 17.4% of the cohort. The greatest number of mutations was detected in the
Transcription factors are the key regulators of the development of the hemostatic platelet from blood stem cells. Stem cells differentiate into a bipotent megakaryocyte-erythroid progenitor, then a committed megakaryocyte that undergoes endoreplication before extending proplatelets from the cytoplasm into the bone marrow sinusoid, forming platelets. This complex differentiation pathway is orchestrated by the activation and repression of groups of genes important for blood cell development via transcription factors [46, 47]. The candidate gene panel contained four genes that encode hemopoietic transcription factors: FLI1, GATA1, GFI1B and RUNX1. Definitive diagnosis of platelet disorders caused by mutations in these genes solely by phenotypic testing is not possible. We detected a pathogenic mutation in one of these genes,
The yield of pathogenic variants reported above may have been improved by more stringent patient selection criteria. In this study, all patients suspected of an inherited thrombocytopenia by treating hematologists were included regardless of the platelet phenotype. That is, not all patients demonstrated macrothrombocytopenia. In addition, in 16 cases only DNA was received and the platelet phenotype was not known. Noting that 15 of the 19 genes on the candidate panel are known to cause macrothrombocytopenia and that only 5 genes on the panel (
Variants of uncertain significance (VUS) were detected in over a third of the cohort (39.1%). Thirteen samples contained more than one VUS. One sample contained five VUS in five different genes (
Coverage is a crucial metric for establishing accuracy as well as analytical sensitivity and specificity of an NGS testing platform. Coverage requirements depend on the application of the NGS test. In general, sequencing more reads will increase the power of the assay. We determined the necessary coverage level based on recommendations from the Royal College of Pathologists of Australasia, whose guidance complies with National Pathology Accreditation Advisory Council (NPAAC) standards for testing of human nucleic acids, and combined this advice with recommendations from published literature and other international bodies such as the ACMG. Our accepted Q score (Q30) was met in 92.3% of all genomic targets and in 97% of exonic targets. The read coverage distribution curve displayed a classic Poisson-like distribution, indicating uniformity of coverage; these data, together with the high quality of base calls, suggested that the NGS platform is able to deliver reliable sequence data. However, there were also areas of lower coverage where the platform did not perform as well and lacked sensitivity. These regions were identified at genomic targets in
An important aspect of the post-analytical process is the timely provision of a genomic test report. In the setting of inherited platelet disorders, a false negative interpretation may lead to a falsely conservative bleeding prophylactic strategy at the time of surgery, in turn placing the individual at a potentially increased risk of bleeding. A false positive result, on the other hand, may cause undue stress to the individual and their family. A genomic test report was therefore carefully and consistently structured, taking into consideration recommendations from professional bodies such as the RCPA and ACMG. The report (Appendix 1) contained a summary of the genes analyzed, reflected the scope and limitations of the assay and indicated the context in which the test was performed. A clear, succinct, interpretative comment was made regarding the detected variant. This indicated whether or not the detected variant was associated with the clinical phenotype and highlighted variants of uncertain significance. The body of the report detailed, in a structured format (see materials and methods), any detected pathogenic or clinically relevant variants and whether these had been previously described. An interpretation of the significance of the detected variant was supported by relevant references where possible, and recommendations regarding additional validation tests and/or genetic counseling and clinical screening were provided. Following the main body of the report, DNA variants that were considered to be non-pathogenic were listed. The report concluded with a description of the test method and its limitations.
In conclusion, our study has demonstrated the potential to successfully diagnose inherited macrothrombocytopenia in cases that remained uncharacterized by traditional phenotypic approaches. Optimization of this format will provide patients an opportunity for a “one stop, one step” testing platform that is cost-effective and not affected by the pre-analytical variables that hinder current testing methods based on functional analysis of platelets. However, the translation of NGS from a powerful research tool into the clinical laboratory will require co-operation from international groups to establish best practice, quality and reporting standards for these conditions, as well as to generate reliable databases that link platelet phenotypes to genotypes so that hemostasis clinicians can be given the best possible advice.
Validation by Sanger sequencing has not been performed on clinically significant or novel detected variants and should be considered by the referring clinician.
Variant 2: No.
The pathogenicity of variant 2 is uncertain as information regarding this mutation is not available in the reported literature. Note that the classification of variants of uncertain/unknown significance may change over time if additional information on these conditions becomes available in the reported literature.
Utsch B, et al. Bladder exstrophy and Epstein type congenital macrothrombocytopenia: evidence for a common cause? (Letter) Am J Med Genet 2006;140A:2251–3.
Kunishima S, et al. Immunofluorescence analysis of neutrophil non-muscle myosin heavy chain-A in MYH9 disorders: association of subcellular localization with MYH9 mutations. Lab Invest 2003;83:115–22.
MYH9-related disorders have an autosomal dominant inheritance. Genetic counselling is recommended for this individual and their family. Family screening may be appropriate after appropriate genetic counselling.
A TruSeq custom amplicon specific for the target regions of 19 genes,
It is also possible that a particular genetic mutation was not recognised as the underlying cause of the genetic disorder due to incomplete scientific knowledge of the impact of all variants at this point in the literature.
An example of an NGS report.