Stem Cells for Modeling Human Disease

Human pluripotent stem cells (PSCs), in the form of human embryonic stem cells (hESCs) or induced pluripotent stem cells (iPSCs), can grow indefinitely in vitro while maintaining their capacity to differentiate into the three primary germ layers: mesoderm, endoderm and ectoderm. Different protocols have been developed to differentiate PSCs into almost any cell type, with varying degrees of success. This technology has allowed scientists to use patient-derived iPSCs to study the pathophysiology of a disease by analyzing the phenotype of the cells derived from these iPSCs. However, control iPSCs obtained from healthy individuals will always have a different genomic background from the patient's iPSCs, which makes the cell phenotype difficult to interpret. The recent appearance of specific nucleases [zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR) system] has made it possible to edit the genome of PSCs. We can now generate syngeneic hESCs or iPSCs harboring the desired mutation and compare the resulting cells with those derived from genetically identical PSCs that differ only in the mutated gene. In this chapter, we summarize the progress made in this field and discuss the different approaches that have recently been used to generate syngeneic human pluripotent cellular models for different pathologies.


Introduction
Disease models are an essential tool for elucidating the molecular basis of many pathologies and for developing novel therapies. Historically, given the scarcity of patient cells, model organisms made it possible to clarify the cellular mechanisms underlying human disease. Drosophila melanogaster, Caenorhabditis elegans and zebrafish have been very helpful in dissecting basic disease mechanisms [1][2][3]. However, the physiological simplicity of these organisms and their phylogenetic distance from humans limit their use as models for human disease. The most popular animal models for studying human diseases are based on genetically engineered mice. For an updated list of available models, see http://www.informatics.jax.org/humanDisease.shtml. A quantum leap was made with the introduction of "humanized" mouse models, in which various types of human cells are engrafted and function as they would in human organs [4][5][6]. These humanized mouse models can harbor human hematopoietic stem cells (HSCs), facilitating the in vivo analysis of human hematological and immunological diseases. Despite the potential of humanized animal models, several issues remain to be overcome, such as insufficient intercellular interactions and the physiological differences between humans and the animal models. These differences could explain in part why some drugs tested in animal models fail in the corresponding clinical trials. Recent progress in the field of regenerative medicine allows the generation of patient-specific stem cells that are suitable for generating human disease models. This can be done by two different approaches: the first is to derive stem cells from patients with the target disease, and the second is to genetically alter human stem cells with gene editing tools.

Disease modeling with human stem cells derived from patients
The advantage of using human stem cells derived from patients is that they can be isolated at different stages of disease severity. For example, by isolating stem cells from patients with end-stage disease, we could study the phenotypes that result from the combination of insults that brought the patient to that particular stage. These phenotypes are practically impossible to mimic by other means, such as genetic or epigenetic manipulation.

Human adult stem cells
Adult stem cells are multipotent cells found in all adult tissues, where they participate in the physiological regeneration of the tissue in which they reside. The best-characterized adult stem cells, and those with the best prospects for use in human models of disease, are neural progenitor cells (NPCs), mesenchymal stromal cells (MSCs) and HSCs:

Neural progenitor cells
NPCs comprise a relatively undifferentiated cell population in the central nervous system (CNS) that gives rise to a broad array of specialized cells, including neurons and glial cells. These cells can be isolated and cultured in vitro. NPCs derived from patients with known mutations associated with a specific disease provide an excellent "in dish" model of neurogenetic disease, allowing a direct study of the cellular pathogenic mechanisms. Amyotrophic lateral sclerosis (ALS), commonly referred to as Lou Gehrig's disease, is a fatal neurodegenerative disease characterized by loss of motor neurons (MNs) in the motor cortex, brain stem and spinal cord, resulting in muscle paralysis and ultimately death due to respiratory failure. A human cellular model for ALS has been developed from NPCs isolated from the post-mortem spinal cord of ALS patients. The astrocytes derived from these NPCs provide the first in vitro model system to investigate common disease mechanisms and evaluate potential therapies for ALS [7]. A second model, this time of a neurodevelopmental disorder, was established using immature neural cells derived from the post-mortem brain tissue of a 25-year-old man with fragile X syndrome (FXS) [8]. The cultured fragile X cells displayed many of the characteristics of NPCs, as well as the molecular hallmarks of FXS, including CGG repeat expansion. These two models allowed the efficacy of new therapeutic agents to be studied.

Mesenchymal stromal cells
MSCs represent another interesting alternative for disease modeling. The International Society for Cellular Therapy (ISCT) established that MSCs must be purified from stromal populations based on plastic adherence; must be positive for CD105, CD90 and CD73; must be negative for MHC-II, CD11b, CD14, CD34, CD45 and CD31; and must express low levels of MHC-I. In addition, MSCs must differentiate in vitro into osteocytes, chondrocytes and adipocytes. MSCs can be obtained from bone marrow, adipose tissue, synovial membranes, dental pulp, Wharton's jelly, umbilical cord blood and liver tissue, among other sources.
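The criteria listed above can be written down as a simple check. The following Python sketch is only an illustration: the function name, the "pos"/"neg"/"low" encoding of marker status and the dictionary layout are assumptions made here, not part of the ISCT definition, and the marker panel is copied from the text above.

```python
# Marker panel as summarized in the paragraph above.
POSITIVE = {"CD105", "CD90", "CD73"}
NEGATIVE = {"MHC-II", "CD11b", "CD14", "CD34", "CD45", "CD31"}
LOW = {"MHC-I"}
REQUIRED_LINEAGES = {"osteocyte", "chondrocyte", "adipocyte"}


def meets_msc_criteria(plastic_adherent, markers, lineages):
    """Return True if a cell population satisfies every criterion in the text.

    plastic_adherent: bool, whether the cells adhere to plastic.
    markers: dict mapping marker name to "pos", "neg" or "low"
             (hypothetical encoding chosen for this sketch).
    lineages: iterable of lineages obtained by in vitro differentiation.
    """
    if not plastic_adherent:
        return False
    if any(markers.get(m) != "pos" for m in POSITIVE):
        return False
    if any(markers.get(m) != "neg" for m in NEGATIVE):
        return False
    if any(markers.get(m) != "low" for m in LOW):
        return False
    # All three differentiation outcomes must have been observed.
    return REQUIRED_LINEAGES <= set(lineages)


# Example profile for a hypothetical adipose-derived population.
profile = {m: "pos" for m in POSITIVE}
profile.update({m: "neg" for m in NEGATIVE})
profile.update({m: "low" for m in LOW})
print(meets_msc_criteria(True, profile, ["osteocyte", "chondrocyte", "adipocyte"]))
```

A population that fails any single criterion (for example, one that is CD45-positive) is rejected, which mirrors the all-or-nothing wording of the definition.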
MSCs are a very attractive source for disease modeling because they are able to give rise to different tissue types. MSCs derived from patients could therefore be used to model diseases affecting the tissues into which MSCs can differentiate. In contrast to induced pluripotent stem cells (iPSCs), MSCs are not generated by expressing genes involved in oncogenic cell transformation (discussed later). Owing to their relative abundance within the body, MSCs could be used as novel human in vitro models for several diseases. Dossena et al [9] isolated MSCs from the adipose tissue of patients with spinal and bulbar muscular atrophy (SBMA), a late-onset progressive neurodegenerative disorder caused by a trinucleotide (CAG) repeat expansion within the coding region of the androgen receptor gene. CAG expansions encode longer polyglutamine (polyQ) chains in the resulting protein [10]. Dossena et al showed that MSCs isolated from the adipose tissue of SBMA patients form nuclear polyQ inclusions, producing a robust pathogenic polyQ phenotype in vitro [9]. Unfortunately, there is still some controversy regarding the differentiation of MSCs into neurons, which limits the relevance of this model for studying neurodegeneration. The same limitation applies to diseases affecting other tissues or organs into which MSCs cannot differentiate, such as liver, blood or muscle.
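The link between CAG repeat number and polyQ length mentioned above is direct, because each CAG codon encodes one glutamine (Q). The Python sketch below counts the longest uninterrupted in-frame run of CAG codons in a coding sequence; the function name and the toy sequences are hypothetical and serve only to illustrate the idea, not to reproduce any analysis from the cited studies.

```python
def longest_cag_run(coding_seq):
    """Length, in codons, of the longest uninterrupted run of CAG codons.

    The sequence is read in frame from its first nucleotide, so each CAG
    counted here corresponds to one glutamine in the encoded polyQ tract.
    """
    seq = coding_seq.upper()
    best = run = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] == "CAG":
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


# Toy sequences (hypothetical): a short tract and an expanded tract.
short_allele = "ATG" + "CAG" * 5 + "GAA"
expanded_allele = "ATG" + "CAG" * 12 + "GAA"
print(longest_cag_run(short_allele), longest_cag_run(expanded_allele))
```

The expanded allele yields a proportionally longer polyQ tract, which is the property exploited when a repeat-expansion phenotype is modeled in vitro.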

Hematopoietic stem cells
HSCs have the potential to give rise to all hematopoietic cells in vitro and are therefore a potential source for mimicking diseases that affect the hematopoietic system, including primary immunodeficiencies (PIDs) and autoimmune diseases. Most applications of human HSCs in disease modeling have focused on the development of humanized mouse models. In these systems, human HSCs are inoculated into immunodeficient mice, where they engraft and regenerate most of the hematopoietic system of the mouse with human cells. The engrafted cells can be used to study the behavior of the different populations or to test therapeutic interventions. These models have been used to study infectious diseases such as HIV-1 [11] and Ebola virus [12], as well as PID [13] and other disorders of the hematopoietic system [14].

Human embryonic stem cells from embryos with genetic diseases
Human embryonic stem cells (hESCs), derived from pre-implantation embryos, were the first human pluripotent stem cells (PSCs) to be isolated. Thanks to pre-implantation genetic diagnosis (PGD), it is now possible to generate hESCs from embryos carrying monogenic diseases [15][16][17]. Using this approach, hESCs have been derived from embryos with FXS, Huntington's disease (HD) and familial dysautonomia (FD) [18][19][20][21]. The authors observed that in vitro differentiation of FXS-hESCs into neurons resulted in abnormal neurogenesis and poor neuronal maturation, mimicking the developmental events that take place during neurogenesis in FXS patients. Similarly, Feyeux et al [22] demonstrated down-regulation of the Huntingtin (HTT) gene in neurons derived from HD hESCs and identified early-stage mitochondrial dysfunction during development. In the case of FD, or Riley-Day syndrome, Lefler and colleagues [21] used FD-diagnosed embryos to show that IKAP is likely a vesicular-like protein involved in neuronal transport and synaptic integrity in FD hESCs, probably reflecting some of the peripheral nervous system (PNS) neuronal dysfunction observed in FD. However, the scarcity of PGD, legal concerns regarding parental consent for embryo donation and ethical considerations have made this approach very difficult.

Induced pluripotent stem cells
The development of iPSC technology [23] bypassed these limitations, since we can now generate hESC-like cells from patients. The ability of iPSCs to differentiate into virtually any tissue or cell type makes these cells an excellent tool for modeling human disease. They allow patient-specific, disease-relevant cells to be generated in a virtually continuous manner. This is essential when the disease results from an interplay of genetic risk factors rather than from a single point mutation. This is the case for the vast majority of neurodegenerative disorders, of which only 5-10% are Mendelian disorders caused by a point mutation; the rest result from multifactorial genetic association. Several iPSC models have been developed for neurodegenerative diseases, including HD [24][25][26][27][28][29], Alzheimer's disease [30] and spinal muscular atrophy [31][32][33][34][35][36]. Several inherited bone marrow failure (BMF) syndromes have also been modeled in vitro by deriving the corresponding iPSCs and applying the appropriate differentiation protocols. This is the case for defective telomere elongation [37], Fanconi anemia [38][39][40] and congenital amegakaryocytic thrombocytopenia [41]. Following the same procedure, Wang and colleagues obtained iPSC-derived cardiomyocytes from patients with Barth syndrome (BTHS), an X-linked cardiac and skeletal mitochondrial myopathy caused by mutation of the TAZ gene, and gained new insights into the disease [42].

Disease modeling with gene editing tools and stem cells
iPSCs and hESCs from embryos with genetic diseases (hESCs-GD) provide powerful tools for modeling human diseases, although the different genetic backgrounds of the control cells and the patient-specific cells make the results difficult to interpret [43]. A solution to this problem is the use of genome editing (GE) technologies to introduce the desired mutations into iPSCs or hESCs [44]. This approach allows the direct comparison of PSCs harboring the desired mutation with isogenic control cell lines. However, until recently, this approach suffered from low efficiency and specificity. This situation changed considerably with the appearance of specific nucleases (SNs): the introduction of double-strand breaks (DSBs) in the targeted DNA sequence can increase the frequency of homologous recombination by up to 10,000-fold. Several SNs have been described, such as the meganuclease I-SceI, the zinc-finger nucleases (ZFNs), the transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system (CRISPR/Cas9).
Meganucleases, also called homing endonucleases, are a class of highly sequence-specific enzymes that recognize relatively long DNA sequences of 12 to 30 bp. The recognition site of a meganuclease can be engineered in order to target specific sites within the genome. One of the major advantages of meganucleases is their small size, which allows them to be packaged into a single viral vector for efficient delivery. However, this technology requires substantial protein engineering expertise and has been limited to a restricted set of DNA targets [45]. A major breakthrough came with the appearance of ZFNs [46][47][48], the first SNs capable of targeting almost any DNA sequence in the human genome. ZFNs are chimeric proteins that combine a nuclease domain (FokI) with a zinc-finger domain (ZFD) that recognizes the targeted DNA sequence. The specificity is therefore determined by the ZFD, which harbors four to six zinc fingers of about 30 amino acids each. ZFNs are easier to construct than meganucleases, but obtaining site-specific ZFNs still requires intensive labor.
Soon after the appearance of ZFNs, a new class of protein-based SNs, the TALENs, was described [49][50][51]. TALENs combine the catalytic domain of an endonuclease (FokI) with a DNA-binding domain derived from the transcription activator-like effectors of plant-pathogenic Xanthomonas species. TALENs recognize specific DNA sequences via DNA-binding domains composed of nearly identical repeat units of 34 amino acids. Like ZFNs, site-specific TALENs can be generated in almost any laboratory with good molecular biology expertise. However, the wide distribution of ZFNs and TALENs was hindered by the complexity of their design. The most recently described SN system is CRISPR/Cas9. In contrast to the previously described nucleases, this system uses a short RNA molecule, called guide RNA (gRNA), to direct the Cas9 nuclease to the target sequence. This simplicity has opened the use of SNs to almost any cell and molecular biology laboratory. All these endonucleases generate DSBs, which are subsequently repaired either by imprecise non-homologous end-joining (NHEJ), which introduces insertions or deletions at the site of the break, or by homology-directed repair (HDR), which allows precise editing.
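Because CRISPR/Cas9 targeting depends only on the gRNA sequence, candidate target sites can be listed with a few lines of code. The Python sketch below is a minimal illustration under assumptions that are not stated in the text: it assumes the commonly used Streptococcus pyogenes Cas9, which requires a 20-nucleotide protospacer immediately followed by an NGG protospacer-adjacent motif (PAM), and it ignores off-target scoring entirely. The function names are hypothetical.

```python
import re

# Translation table used to complement a DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, length=20):
    """List candidate Cas9 target sites on both strands.

    Assumes an NGG PAM directly 3' of the protospacer (SpCas9 convention).
    Returns tuples (strand, start, protospacer, pam), where start is the
    0-based position on the strand being scanned, read 5' to 3'.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


# Toy example (hypothetical sequence, not a real locus).
for hit in find_protospacers("GATTACAGATTACAGATTACATGGCC"):
    print(hit)
```

In practice, each candidate would still need to be checked for specificity against the whole genome before use; that step is outside the scope of this sketch.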
All these GE technologies have opened up the possibility of obtaining almost any mutation in any cell type. Different groups have therefore already applied these technologies to model human disease by generating disease-causing mutations in primary stem cells:

Editing hHSCs for disease modeling
Gene therapy targeting HSCs is now a real option for several genetic diseases, including severe combined immunodeficiency (SCID) and other severe non-SCID PIDs. PID patients have classically been treated by allogeneic HSC transplantation or by correcting the patient's own HSCs through the insertion of a functional copy of the affected gene with a viral vector [52,53]. However, to further improve HSC-based gene therapy, HSC cellular models capable of recapitulating the disease phenotype are needed. The low frequency of these disorders and the difficulty of obtaining HSCs from patients preclude the use of patient HSCs as standard tools for disease modeling and preclinical testing. As an alternative, GE tools can be used to generate the disease-causing mutation in HSCs obtained from umbilical cord. For example, X-linked severe combined immunodeficiency (SCID-X1) has recently been modeled in vitro by using specific ZFNs to introduce a mutation in the IL2RG gene [54]. Sickle-cell disease (SCD), another rare disease, caused by a mutation in the β-globin gene, has also been modeled using this strategy. SCD can be cured by allogeneic HSC transplantation. However, this is only possible when a matched donor is available, making the development of gene therapy using autologous HSCs a highly desirable alternative [55]. A model was required in order to improve gene therapy for this rare disease. To this end, healthy HSCs were electroporated with ZFN-encoding mRNA that specifically targeted the β-globin gene, generating SCD-like HSCs. The authors showed that the SCD-like HSCs generated numbers and patterns of erythroid and myeloid clones similar to those of HSCs from SCD patients. The SCD-like HSCs maintained their capacity to repopulate immune-deficient NSG mice, representing an excellent disease model for SCD [56]. Finally, it is worth mentioning that the gene editing approach using stem cells has been used not only for disease modeling but also for disease treatment. This has been done by disrupting CCR5, a clinically relevant gene that encodes the principal co-receptor involved in human immunodeficiency virus (HIV) infection [57]. A recent clinical trial showed for the first time that CCR5 ablation in immune cells of 12 patients with HIV enhanced their resistance to HIV infection [58].

Editing hESCs for disease modeling
As mentioned before, hESCs-GD can be generated and could in theory be an excellent tool for any disease. We also mentioned the problems this strategy faces if we want to extend its use to model any disease [43]. GE technologies have allowed the development of a very potent alternative: the generation of gene-edited hESCs harboring the disease-causing mutation. This strategy has the additional advantage of providing isogenic hESCs that differ only in the mutation and are otherwise genetically identical.
Below, we describe some examples of how different groups have used GE to model different diseases:
Cancer. Chromosomal translocations are signatures of numerous cancers and lead to the expression of fusion genes that act as oncogenes. These aberrant translocations can be induced in vitro in hESCs using gene editing tools. For instance, the translocations associated with Ewing sarcoma and anaplastic large cell lymphoma (ALCL) were recently reproduced in vitro by means of ZFNs and TALENs [59]. Breakpoint junctions recovered from MSCs derived from the gene-edited hESCs fully recapitulated the genomic landscape found in tumor cells from Ewing sarcoma patients.
Monogenic diseases. Disrupting a single gene by GE is easier than promoting translocations or multiple gene mutations. Human cellular models of monogenic diseases were the first to be generated by GE as an alternative to iPSCs from patients. An elegant demonstration of the importance of using isogenic cell lines to model disease was provided by Reinhardt et al [60]. These authors generated iPSC lines from patients with the LRRK2 mutation (a Parkinson's disease-related mutation) and from healthy individuals without the mutation. In addition, they used GE to correct the mutation, generating isogenic cell lines that differed only in the LRRK2 mutation. Clustering of the gene expression profiles of iPSC-derived neurons showed that the healthy and mutant iPSC lines did not cluster in separate groups, as would be expected. Rather, one of the healthy lines clustered closely with a mutant line, and one healthy line was very different from all of the other lines. Therefore, no conclusion could be drawn from these disease-specific iPSC lines [60]. Interestingly, the only cell lines that clustered close together were pairs of mutant lines with and without correction of the mutation by GE. Taking these data into consideration, the authors managed to demonstrate that mutation of LRRK2 induced dysregulation of the CPNE8, MAP7, UHRF2, ANXA1 and CADPS2 genes. Other good examples of the relevance of using isogenic cell lines were shown by Li et al [61] and by our group [62]. Li et al generated a model for Rett syndrome by mutating the MECP2 gene. MECP2 mutant neurons mimic the defects observed in this disorder, and unbiased global gene expression analyses (made possible by the use of isogenic cell lines) showed that the MECP2 protein functions as a global gene activator in neurons but not in neural precursors. Similarly, our group has generated human cellular models for Wiskott-Aldrich syndrome (WAS) by mutating the WAS gene in hESCs using ZFNs [62]. Using these models, we uncovered that the absence of WAS protein also affected early hematopoiesis and megakaryocyte development, a phenotype that could not be observed when using iPSCs from WAS patients [63].
Several other monogenic diseases have been modeled by GE of hESCs (see Table 1). Martinez et al used TALENs to knock out KCTD13 in hESCs to model Timothy syndrome, a variant of an autism disorder, allowing rapid drug screening [80]. Vanuytsel et al [79] generated FANCA-deficient hESCs, a human model for Fanconi anemia, a recessive disorder characterized by progressive BMF, congenital abnormalities and the development of malignancies. The authors used ZFNs to introduce a selection cassette into exon 2 of FANCA, disrupting its expression in hESCs. Interestingly, they could not obtain homozygous FANCA mutants because these cells underwent growth arrest; only heterozygous populations could be obtained. Zou et al generated hESC models of paroxysmal nocturnal hemoglobinuria (PNH) [77] by mutating the PIG-A gene. Recently, a hepatocyte-like cell line has been derived from hESCs treated with TALENs targeting the SORT1 gene, making it possible to model coronary artery disease in vitro [81]. The same group has also generated derived cell lines with specific mutations in the AKT2 gene, which plays an important role in the development of metabolic syndrome, and in PLIN1, which encodes the perilipin protein and, when altered, is responsible for the autosomal-dominant subtype of partial lipodystrophy [81].

Editing iPSCs for disease modeling
As mentioned before, iPSCs can be obtained from any patient and represent very potent tools for disease modeling. However, several considerations should be taken into account when interpreting the phenotypes of iPSCs from patients and controls, since they have different genetic backgrounds. Indeed, a major concern when interpreting data obtained from disease-associated iPSCs is the choice of appropriate controls. In addition, several studies have shown that the generation and expansion of some iPSC lines is accompanied by genetic and epigenetic aberrations, ranging from single-nucleotide variants and chromosomal deletions to differences in the methylation pattern of their genomes. An interesting solution to this problem is the GE of iPSCs from healthy individuals, as described previously for hESCs, in order to generate "home-made" disease-specific iPSCs (Figure 1). This approach allows the direct comparison of a "gene-edited" iPSC line with its isogenic control. Below, we describe some examples of modeling human disease using this approach.
Parkinson's disease (PD) iPSCs were generated by mutating the LRRK2 gene in iPSC lines from a healthy control using ZFNs. Neurons derived from gene-edited PD iPSCs exhibit reduced neurite outgrowth and increased apoptosis in response to oxidative stress when compared with neurons derived from isogenic control iPSCs [60,82].
Park et al generated hemophilia A iPSCs using a TALEN pair to invert a 140-kbp DNA segment of the F8 gene (a mutation shared by half of hemophilia A patients) in control human iPSCs. The authors demonstrated that TALENs could be used to generate disease models associated with chromosomal rearrangements [66].
The chemokine (C-C motif) receptor 5 (CCR5) serves as an HIV-1 co-receptor and is essential for cell infection by the HIV-1 virus. Loss of this receptor confers protection against HIV-1 infection. Accordingly, several groups have generated CCR5-negative iPSCs using CRISPR/Cas9 [68] or ZFNs [69]. HSC-like cells derived from these CCR5-negative iPSCs could be used in the future to model resistance to HIV-1 infection [69,70,83].
Antibodies against the human platelet alloantigens (HPAs) cause severe alloimmune bleeding disorders. HPA-1b/b platelets (variant Pro33 within the Integrin β3) are relatively rare in the population and difficult to obtain for transfusion or diagnosis. Zhang et al [73] used CRISPR/Cas9 GE to generate Integrin β3-Pro33 (HPA-1b) iPSCs to model this variant. Megakaryocyte progenitors derived from these iPSCs expressed the HPA-1b alloantigenic epitope.
In a similar approach, Liu et al [76] used CRISPR/Cas9 and TALENs to generate iPSCs deficient in the SCN1A protein to model epilepsy. Using SCN1A-deficient iPSCs, they were able to explore the mechanism of epilepsy caused by loss of the SCN1A protein, showing a reduction of the amplitudes and an increase of the action potential thresholds.
A novel model of Barth syndrome was obtained from the control human iPSC line PGP1 by using CRISPR/Cas9 to mutate exon 6 of the TAZ gene, and it proved valid for studying this monogenic mitochondrial cardiomyopathy [42,84]. Several other diseases have been modeled in vitro using one of the previously described GE tools; for more details, see Table 1.