Ebola Virus’s Glycoproteins and Entry Mechanism

Ebola virus glycoprotein (GP) is the only protein expressed on the surface of the virus. The GP proteins play critical roles in the entry of the virus into the cell and in the evasion of the immune system. The GP gene is transcribed into the membrane GP, which consists of two subunits, GP1 and GP2, and the secretory GP (sGP). The main function of GP1/2 is to attach the virus to the target cell's membrane, whereas sGP has multiple functions in Ebola pathogenesis, such as inactivating neutrophils through CD16b, causing lymphocyte apoptosis and vascular dysregulation. Many studies have focused on better understanding the GP mechanism and aim at developing new antibodies and drugs such as VSV-EBOV, cAd3-EBO Z, rVSVN4CT1 VesiculoVax, a 'C-peptide' based on the GP2 C-heptad repeat region (CHR) targeted to endosomes (Tat-Ebo) and MBX2270. In this chapter, we discuss the Ebola viral glycoproteins: their genomic organization, synthesis, roles and functions. We also address the mechanisms of pathogenicity associated with Ebola GPs.


Introduction
Since the beginning of 2014, cases of Ebola virus disease have been reported in four African countries: Guinea, Liberia, Sierra Leone and Nigeria. WHO announced the end of the Ebola outbreak in January 2016 [1]; despite this, according to the WHO, new cases were later declared in Sierra Leone, Liberia and Guinea [2,3]. What this highlights is that the risk of the Ebola [...] related to viruses taken early in March 2014. The cluster found in the large Conakry region is related to two other lineages. The second cluster contains sequences from Sierra Leone and Guinea; it could reflect a reintroduction from Sierra Leone, or the circulation in Guinea of strains related to those initially introduced into Sierra Leone. Finally, a third group of viruses is located in Conakry, Forécariah, Dalaba and, to a limited extent, Coyah. Several sequences from Sierra Leone are grouped within the third cluster. Such a phylogenetic structure suggests that there have been multiple migrations of EBOV into Guinea from Sierra Leone [10,11]. Moreover, according to phylodynamic studies, the virus has accumulated a significant number of mutations, at a rate of 7 × 10⁻⁴ substitutions per site per year [12]. Figure 1 represents a phylogram of the Ebola virus.
Figure 1. Phylogram of Ebola virus obtained with BEAST, phylogenetic software based on Bayesian evolutionary analysis. The first cluster (GP1) contains sequences from Guinea alone. The second cluster (GP2) contains sequences from Sierra Leone and Guinea. The third cluster (GP3) collects the sequences from Forécariah and Dalaba.
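To give a feel for the substitution rate quoted above, the following minimal sketch converts a per-site rate into an expected number of substitutions per genome per year. The genome length of about 19,000 bases is an approximation assumed here for illustration, and the function name `expected_substitutions` is hypothetical.

```python
# Rate quoted in the text: substitutions per site per year.
RATE_PER_SITE_PER_YEAR = 7e-4

# Assumption for illustration: an EBOV genome of roughly 19,000 bases.
APPROX_GENOME_LENGTH = 19_000


def expected_substitutions(rate, genome_length, years=1.0):
    """Expected number of substitutions across the genome over `years`."""
    return rate * genome_length * years


if __name__ == "__main__":
    per_year = expected_substitutions(RATE_PER_SITE_PER_YEAR, APPROX_GENOME_LENGTH)
    print(f"about {per_year:.1f} substitutions per genome per year")
```

Under these assumptions the rate corresponds to roughly 13 substitutions per genome per year; the figure scales linearly with the genome length and time span chosen.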

Structural roles and genomic information on glycoprotein
The Ebola genome mainly consists of seven genes, which encode eight proteins. The genes are delimited by conserved transcriptional signals; each gene starts with an initiation site at its 3′ end and ends with a termination (polyadenylation) site. Zaire Ebola virus (EBOV) is a member of the family Filoviridae in the order Mononegavirales. According to the ICTV (International Committee on Taxonomy of Viruses) classification, the order Mononegavirales contains four families in addition to Filoviridae: Bornaviridae, Nyamiviridae, Paramyxoviridae and Rhabdoviridae.
Families of the Mononegavirales can be distinguished by genome size and coding capacity, virion morphology (filamentous, pleomorphic, spherical or bacilliform), viral pathogenesis and host organism. The nomenclature of the Filoviridae has changed several times, for various reasons and because of the information discovered through the study of these viruses. The aim was to prepare a classification that reflects current knowledge and conforms to the international code of nomenclature [13]. Unfortunately, in the case of the Filoviridae, approaching the ICTV has not been useful because both species and virus names had already been implemented in non-Latinized binomial form [14,15]. Under the new nomenclature, "Zaire ebolavirus" denotes the species whose member is "Ebola virus," the most common member, abbreviated "EBOV." The new classification also contains a new genus called "Cuevavirus," which is newly discovered and contains the species "Lloviu cuevavirus" [14].
The common characteristics of this family are as follows: a linear, single-stranded, non-segmented, negative-sense RNA genome and filamentous particles; genes arranged in a fixed order, always starting with the nucleoprotein gene at the 3′ end (3′UTR) and finishing with the RNA polymerase gene at the 5′ end (5′UTR); viral replication that occurs after synthesis of the antigenome; RNA that is stored within the helical nucleocapsid; and an RNA polymerase gene that is the largest gene of the Filoviridae [16].
Besides these common characteristics, other features distinguish the genomes of filoviruses from those of the rest of the order Mononegavirales, such as the location of overlapping genes in Filoviridae. The overlaps are 18-20 bases long and are limited to the conserved sequences of the transcription signals [17]. Bioinformatics analysis of the amino acid sequences of Filoviridae proteins has shown different degrees of identity. The nucleoprotein (NP) is the most conserved protein, with significant identity except in the C-terminal portion. The rates of conservation for the other proteins are as follows: 33% for VP35, 27% for VP40, 34% for GP, 33% for VP30 and 37% for VP24 [17][18][19].
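Conservation rates of this kind are usually derived from pairwise alignments. The sketch below shows one common way to compute percent identity between two already-aligned sequences; it is an illustration only, not the method used in the cited studies. The function name `percent_identity`, the choice to ignore gapped columns, and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Other definitions divide by the full alignment length or by the shorter
    sequence length; the denominator used here is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real filovirus sequences.
    print(round(percent_identity("MKTV-LSA", "MRTVGLTA"), 1))
```

Because the denominator differs between tools, identities reported by different programs for the same pair of proteins are not always directly comparable.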
In Figure 2, the genome of EBOV is shown schematically. The scheme shows that the fourth gene from the 3′ end of the viral genome encodes the glycoprotein (GP). The GP gene contains two reading frames and gives rise to two mRNAs: ORF I encodes sGP, a non-structural glycoprotein of about 50-70 kDa that is efficiently secreted by infected cells, and ORF II encodes a transmembrane glycoprotein of 120-150 kDa. The gene is about 2406 nucleotides long and is located between nucleotides 5900 and 8308, between the VP40 and VP30 genes. [...] 19,000 base, where L is the length of the gene and is more conserved than VP40, which is the more polymorphic gene; GP shows an average level of polymorphism. The sequence 6039-6923 coding for GP1 is followed by a polyadenine tract before the GP2 sequence at positions 6924-8068.
The GP gene is located between nucleotides 5883 and 8288. The mRNA of the small, non-structural secreted glycoprotein is encoded by the region 6022-7116, whereas the membrane GP is encoded by the region 6022-8051, in which the region 6022-6906 codes for the GP1 subunit and the region from 6906 to 8051 codes for GP2; here we observed that for sGP and GP1/2 the coding starts separately.
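The coordinates above can be checked with simple arithmetic. The sketch below assumes that the quoted positions are 1-based and inclusive, and that the membrane GP transcript carries one extra adenine relative to the genomic template (the seven-versus-eight adenine difference described in the section on sGP). The names `REGIONS` and `span_length` are hypothetical.

```python
# Coordinates as quoted in the text (assumed 1-based, inclusive).
REGIONS = {
    "GP gene": (5883, 8288),
    "sGP coding region": (6022, 7116),
    "membrane GP coding region": (6022, 8051),
}


def span_length(start, end):
    """Number of nucleotides in an inclusive coordinate range."""
    return end - start + 1


lengths = {name: span_length(*coords) for name, coords in REGIONS.items()}

if __name__ == "__main__":
    for name, length in lengths.items():
        print(f"{name}: {length} nt, remainder {length % 3} when divided by 3")
    # One extra adenine in the transcript restores a multiple of three.
    edited = lengths["membrane GP coding region"] + 1
    print(f"membrane GP region plus one adenine: {edited} nt = {edited // 3} codons")
```

Under these assumptions the gene spans 2406 nucleotides, the sGP region is a whole number of codons (1095 nt), and the membrane GP region becomes one (2031 nt, 677 codons) only after the extra adenine is counted. If the quoted region includes the stop codon, which is an assumption, this would correspond to the 676 amino acids reported for GP.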
The GP gene is flanked by conserved transcription start and stop signals: "CUACUUCUAAUU" as the transcription start signal, between nucleotides 5883 and 5894, and "UAAUUCU" as the transcription stop signal. The end of the GP gene, which contains a poly-U region (UUUUUU) or, including the stop signal, (UAAUUCUUUUU), is typically located in the region 8278-8288. It is plausible that the purpose of these poly-U tracts in the genome, at these sites, is to generate a poly-adenine tract during transcription, retaining the mRNA and forming a premature protein [19].
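A signal of this kind can be located in a sequence with a simple pattern search. The following minimal sketch scans an RNA string for the stop signal followed by a run of uridines; the test sequence is a toy string, not the EBOV genome, and the function name `find_stop_signals` and the minimum run length are assumptions.

```python
import re

# "UAAUUC" followed by two or more U's, so that "UAAUUCUUUUU" is matched whole.
# The minimum run length of two is an assumption made for illustration.
STOP_SIGNAL = re.compile(r"UAAUUCU{2,}")


def find_stop_signals(rna):
    """Return (start, end, motif) tuples using 1-based inclusive positions."""
    return [(m.start() + 1, m.end(), m.group()) for m in STOP_SIGNAL.finditer(rna)]


if __name__ == "__main__":
    toy_sequence = "GGCAUCUAAUUCUUUUUGACC"  # hypothetical fragment
    for start, end, motif in find_stop_signals(toy_sequence):
        print(start, end, motif)
```

Applied to a real genome record, the reported positions would depend on the sense and numbering of the sequence supplied, so they should be compared with the coordinates in the text with care.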
A major difference between the Ebola and Marburg species has been shown with regard to the GP gene. In contrast to Ebola, whose genome encodes two different forms of GP (sGP and GP), the GP gene of Marburg encodes only a single GP.
The work of Sullivan et al. [20] has dismissed that a single mutation at position 77 or 121 in the sequence of GP2 is able to influence the cytotoxicity and the immunogenicity of the virus. A post-transcriptional modification is also able to have the same effect [20]. The post-transcriptional change may be due to the effects of prophylactic drugs or of antibodies that are specifically reactive with GP2 [20]. The structure of EBOV GP and its interaction with the human antibody KZ52 are shown in Figure 3. The GP1 subunit binds the GP2 subunit through a non-covalent bond; therefore, the residues of GP at positions 266 and 476 do not significantly affect viral entry [21]. In the crystal structure of the protein, three GP1 subunits (blue) are bound to three GP2 subunits (green), and the human antibody KZ52 (yellow) interacts with GP at the base of the chalice [7,14].
Multiple studies have shown that the GP gene products play the following roles in pathogenesis:
-Binding to the receptor is performed by the GP1 subunit, whereas the GP2 subunit is responsible for fusion and viral entry [22].
-The EBOV secretory GP has been implicated in inducing apoptosis of B and T lymphocytes, although one investigation suggests that the soluble GP has no role in apoptosis [23]. It is also involved in antigenic subversion as well as in restoring the barrier function of endothelial cells [6].

The transmembrane glycoprotein GP1/2
EBOV GP is 676 amino acids long, with an apparent molecular weight of 150 kDa. The glycoprotein is synthesized as a precursor, pre-GP (or GP0), corresponding to the full length of the gene. A series of post-translational events leads to the maturation of the viral glycoprotein: N-glycosylation of the protein in the endoplasmic reticulum and O-glycosylation in the Golgi. The pre-GP precursor is finally cleaved in the trans-Golgi compartment by the protease furin into two subunits, the extracellular GP1 (501 amino acids) and the transmembrane GP2 (175 amino acids), which are linked together by disulphide bridges [24]. GP plays a role in pathogenesis through regulation of the adaptive response. It is at the origin of [24]:
-a reduction in the availability of adhesion molecules;
-disruption of the presentation of viral antigens to lymphocytes; and
-immunosuppressive motifs carried by glycoproteins GP1 and GP2.

The secretory glycoprotein (sGP)
We hypothesize that the alteration of the homeostasis system and vascular system observed during Ebola virus disease could be, at least in part, caused by these soluble glycoproteins [25].
Recent work shows that the expression of GP1/2 or sGP is determined by a slippage process in which a run of adenine nucleotides plays an important role in switching between the expression of sGP and GP1/2. If the transcription complex reads only seven adenines, the transcript encodes the secretory GP, whereas reading eight adenines leads to the transcription of pre-GP1/2, the precursor of the two disulphide-linked subunits GP1 and GP2. Figure 4 illustrates this frameshifting in the Ebola virus GP gene [26].
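The consequence of reading seven versus eight adenines can be illustrated with a toy sequence. The sketch below does not use the EBOV sequence: the flanking nucleotides and the names `build_transcript` and `translate` are hypothetical, chosen only to show how a single extra adenine shifts the downstream reading frame and changes where translation stops.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical flanking sequences (not taken from the EBOV genome).
UPSTREAM = "AUGGCU"
DOWNSTREAM = "CUUGACCGGUUCAGG"


def build_transcript(n_adenines):
    """Toy mRNA with a run of `n_adenines` adenines between the flanks."""
    return UPSTREAM + "A" * n_adenines + DOWNSTREAM


def translate(rna):
    """Translate from the first nucleotide until a stop codon or the end."""
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    for n in (7, 8):
        print(n, "adenines:", translate(build_transcript(n)))
```

In this toy example the seven-adenine transcript reaches a stop codon shortly after the run, whereas the eight-adenine transcript is read in a different frame and continues further. Only this frame shift is meant to be illustrative; the peptides themselves carry no biological meaning.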

Viral cycle and pathogenicity
The pathogenesis of the Ebola virus begins with the infection of immune cells such as macrophages, dendritic cells (DCs) and monocytes, which are the first to come into contact with the viral particles. Interferon production is altered in the infected cells, and the degree of alteration differs from one cell type to another. DCs and monocytes show the greatest alterations and, consequently, IFN production decreases significantly. Furthermore, lymphocyte apoptosis and a global dysfunction of specific immune cells are observed as the viral burden increases. In addition, the virus is responsible for other dysfunctions such as neutrophil inactivation, induction of apoptosis, inhibition of the immune response, involvement in viral entry into epithelial cells, vascular dysregulation and immune evasion [24,27,28]. The spread of the virus throughout the body, including the vital organs, combined with immune system dysfunction, leads to death.
The Ebola virus attacks the whole body, causing increasing disseminated intravascular coagulation that quickly degrades haemostasis and the functioning of vital organs. Infection destroys endothelial cells, mononuclear phagocytes (monocytes, macrophages, dendritic cells, mast cells) and hepatocytes [29]. The mechanism of pathogenicity and the viral cycle can be divided into three phases:
-Extracellular stage and penetration, in which GP and sGP play the paramount role.
-Intracellular stage and the role of the replication complex.
-Roles of VP24, VP40 and VP35 in the evasion of the immune system.

Target cells and receptors
The tropism of the Ebola virus depends on the expression of entry receptors by the target cell. Several receptors for Filoviridae have been identified.
The Ebola GP binds to C-type lectins such as DC-SIGN, L-SIGN and hMGL, which are expressed by monocytes, dendritic cells and macrophages [30][31][32][33]. The virus also uses other ubiquitous molecules expressed by non-monocytic cells to enter target cells [34]. In addition, the Ebola virus uses a process called antibody-dependent enhancement (ADE) to attach to host cells and facilitate entry [35]. GP has been shown to bind the IgG Fc receptor IIIb and to form a cross-linked virus-antibody-complement complex with Fcγ III, which explains the rapid spread of the virus throughout the body (liver, brain, heart and endothelial cells) [35]; another study demonstrated that T-cell immunoglobulin and mucin domain 1 is a receptor for the Ebola virus [35].
The Ebola virus infects most types of cells. However, macrophages and dendritic cells allow a strong replication and spread of the virus through the lymph and blood circulatory system. Thus, the virus reaches lymph nodes, liver and spleen, and spreads to other tissues.

Extracellular role of GP
The infective dose of EBOV is about 1-10 virions by aerosol in non-human primates. Despite this small amount of virus, the structure and composition of the virion allow it to cause serious damage to the infected organism.
After infection, the virus prevents and interferes with the immune response via its glycoproteins (GP), which is one of the reasons why the Ebola virus is fatal. The EBOV glycoprotein is the only viral protein expressed on the surface of the virion and is essential for binding to host cells and for catalysing membrane fusion, in addition to its other roles in pathogenicity. GP is combined with carbohydrates that help it evade the immune system; this sugar coating makes it more difficult for the immune system to detect that a virus is present. In addition, secreted GP inhibits host antibodies. This is accompanied by the rapid neutralization of certain populations of T lymphocytes by a superantigen effect. GP is the essential protein in the mechanism of penetration. Both secreted and transmembrane GP inhibit neutralization by natural antibodies (Ab), thanks to the carbohydrates combined with GP. GP bound to Ab readily attaches to the cell membrane through C1q (via the complement system). This attachment promotes deposition on the host cell and is followed by penetration of the virus via the macropinosome pathway, the route by which cells take up extracellular solutes. Moreover, the possible use of protein G and clathrin indicates a role for actin in the penetration of the virus and suggests that the virus carries out a large part of its action through these interactions (Figure 5). GP binds to neutrophils and endothelial cells through DC-SIGN (dendritic-cell-specific ICAM3-grabbing non-integrin) and L-SIGN (liver and lymph node SIGN), which link the cell to GP via carbohydrate determinants. The bonds formed with the neutrophil receptor CD16 cause a significant reduction in signalling through CR3 and Fcγ receptor II B, preventing virus clearance.
Strong pro-inflammatory responses have been shown to be induced by the engagement of EBOV GP with TLR-4 and by the activation of the NF-κB transcription factor. GP is responsible for cytotoxicity in endothelial cells through the secretion of endosomal proteolytic enzymes (such as cathepsin), which destroy the vascular endothelium and increase vascular permeability and haemorrhagic signs.
In addition, GP binds multiple nearby IgG molecules, which allows the binding of C1 to the Fc region of the antibodies, which is thermolabile, and interacts with cell-surface molecules. The C1 complex consists of C1q and two serine protease proenzymes, C1r and C1s [26], and allows the virus to bind to cell membranes. The virus then enters the cell via macropinocytosis [36] (Figure 6).
The total amount of BST2 in the cell does not change, but the level of BST2 at the cell surface decreases in the presence of GP. This suggests that GP masks BST2; in the absence of GP, VP40 co-immunoprecipitates and co-localizes with BST2. GP therefore appears to inhibit this interaction [37].

Intracellular action of GP
Macrophages and dendritic cells are the first to be infected, but the virus can infect most cell types, with the notable exception of lymphocytes and other non-adherent cells [38]. Several studies have shown that EBOV binding to receptors is relatively non-specific. For example, EBOV may attach to C-type lectins, which interact with glycans on EBOV GP, as well as to phosphatidylserine (PtdSer) receptors, which interact with the viral envelope; both interactions improve EBOV entry [39,40]. PtdSer receptors include Gas6 or protein S and the TAM family receptors (TYRO3, AXL and MER).
Their internalization was found to be independent of clathrin- or caveolae-mediated endocytosis, but they co-localized with sorting nexin (SNX) 5 [42]. Once internalized, the virus must be carried to an intracellular compartment containing the factors essential for the activation of GP. The virus is initially located inside the cell in macropinosomes. The cysteine proteases cathepsin B and cathepsin L cleave GP and remove over 60% of its peptide mass; the interaction of GP1 with NPC1 is then intended to promote fusion of the viral membrane with the membrane of the vesicle. A drop in pH in the macropinosomes marks the end of this step and triggers fusion of the host membrane with the virus. The transcription complex is released first, followed by the viral genome [37,43] (Figure 6).
Figure 6. Diagram summarizing the virus entry mechanism and the various stages of internalization and replication. After attachment of the virus to the cell membrane, it activates the formation of macropinosomes via intracellular signals, including a role for HAVCR1 (TIM-1), which has recently been demonstrated to act as a receptor or cofactor for entry of Ebola virus. Moreover, reduced expression of endogenous TIM-1 in highly permissive cell lines leads to a reduction in the infectivity of Ebola virus [44]. Cleavage of GP by cathepsins B and L allows fusion of the endosomal membrane with the virus, releasing the virus into the cytoplasm. In order, VP30, VP35 and L are the first to be released into the cytoplasm, followed by the negative-sense ssRNA viral genome (Image from ViralZone 2014 [45]).
The transcription and replication complex (VP35, N and L) ensures the conversion of negative-sense RNA into positive-sense RNA for transcription and translation of the viral genome. The first step is the activation of the transcription complex by the binding of zinc in the active site (70-90) of the co-activator VP30. VP30 binds to VP35, L and N to start transcription and translation. The mRNAs are translated using host ribosomes. At this point, we can say that the Filoviridae are independent in their replication machinery and need only a transcriptional signal (zinc and ribosomes) from the host cell.
GP matures in the Golgi apparatus, where the GP precursor is cleaved into GP1 and GP2; the mature GP is then expressed at the surface of the cytoplasmic membrane, and sGP is secreted.
VP35, VP24 and VP40 play roles in the inhibition of immune responses by inhibiting the translation and signalling of antiviral genes through a succession of kinase phosphorylation reactions. As the polymerase complex moves along the RNA template, it stops and re-initiates at each gene junction; individual genes are thus transcribed sequentially in their 3′-5′ order. The 3′ regions of the viral genome and antigenome contain the replication promoter sites for positive- and negative-sense RNA synthesis; they are approximately 176 bases long [46]. The virus acts on microtubules and immunosuppressive genes to inhibit cell division. As the number of virions increases, the host cell bursts and dies, or undergoes apoptosis, owing to the speed of virion replication, which reaches approximately 10⁹ plaque-forming units (PFUs) in tissue within 7-10 days [26].

The spread of the virus in the body and vital organs causes haemorrhage and fever through uncontrolled hyperactivation of cytokines via transmembrane GP, and the activity of NK cells causes diarrhoea because of the infection of cells of the digestive system. In addition, the infection of pneumocytes, hepatocytes and cardiovascular cells accelerates the death of the patient.

Strategies for the inhibition of the Ebola virus
Several methods can be used to inhibit the Filoviridae and, more particularly, the glycoprotein. Inhibiting GP blocks entry into the cell and thus limits viral infection. Table 1 shows different crystal structures of GP in complex with cellular proteins that contribute to the pathology and suggest sites to inhibit. Inhibitors must be able to inhibit the single active site of GP1 or both active sites of GP2. The inhibitors may be antibodies or small molecules that interact with EBOV proteins in a way that limits their action. EBOV is able to exploit multiple pathways, including immune pathways; it is therefore important to design an inhibitor, or a cocktail of inhibitors, that acts on multiple targets at the same time. In our opinion, the best way to reduce the devastating action of EBOV is the combined inhibition of GP and VP30. Inhibiting GP reduces the adverse effects caused by GP. Many antibodies and inhibitors have been developed, and some of them are in clinical trials [47]. The inhibition of VP30, in turn, is the best way to inhibit the replication process and then eliminate the virus via mRNA degradation by RNase.

Biosecurity, biosafety and Ebola virus
A good understanding of the mechanisms of the virus's action allows us to manipulate the virus at a biosafety level lower than BSL-4. The virus is composed of two complementary units that are both essential for infection: the genome and the VP30, VP35, VP40 and L proteins; the combination of the genome with these four proteins is capable of inducing infection. As Ebola is a negative-sense single-stranded RNA virus, its genome, once isolated from the microenvironment formed by these four proteins, cannot trigger a viral reaction. Moreover, Ebola cannot synthesize a DNA genome; its cDNA copy is therefore synthesized only in a laboratory, in which all manipulation conditions are easily manageable.
The best way to control the transcription and translation processes in the laboratory is not to allow transcription/translation of the complete genome. Moreover, isolating the proteins from the genome and from the other proteins stops the pathogenic effects. The pathogenic effect results from the cooperation and integration of the different proteins encoded by the viral genome; the loss of a single protein inhibits the virus, which loses its genetic information through RNA degradation in the cell, as in the case of VP30.
It is possible to clone cDNA in E. coli in a BSL-2 laboratory [52,53], which is consistent with the approach outlined in the BMBL; the laboratory is responsible for developing and implementing appropriate biosecurity measures. Cloning the cDNA of a gene can accelerate scientific research and help in the discovery of new drugs. The study of individually cloned, altered proteins is also possible in animal models.

Conclusions
Based on the data presented in this chapter, Ebola has developed multiple pathways and modes for evading the immune system and for internalization into target cells. Further studies are necessary for a good understanding of the entry mechanism. However, specific viral proteins, or even the cDNA genome dissociated from the proteins, can be studied at BSL-2 because the effects of the virus depend on the presence of the genome associated with structural and functional proteins; this makes it possible to study the virus in laboratories at biosafety level 3 or 2. Such studies may even accelerate the search for new inhibitors by pharmaceutical and vaccine companies.