Group A Streptococcus pyogenes (GAS) is a human pathogen that commonly causes superficial infections such as pharyngitis but can also lead to systemic and fatal diseases. GAS infection remains a major threat in regions with insufficient medical infrastructure, leading to half a million deaths annually worldwide. The pathogenesis of GAS is mediated by a number of virulence factors, which facilitate bacterial colonization, immune evasion, and deep tissue invasion. In this review, we discuss the molecular interactions between host proteins and the virulence factors that target the fibrinolytic system, including streptokinase (SK), plasminogen-binding group A streptococcal M-like protein (PAM), and streptococcal inhibitor of complement (SIC). We discuss our current understanding, drawn from structural studies, of how these proteins manipulate the fibrinolytic system during infection.
- β-hemolytic Streptococcus
- plasminogen-binding streptococcal M protein
- streptococcal inhibitor of complement
- host-pathogen coevolution
Group A Streptococcus (GAS) is a strict human pathogen that leads to diverse clinical manifestations, ranging from superficial infections, such as pharyngitis, to severe cases of streptococcal toxic shock syndrome and necrotizing fasciitis, mainly in children and young adults. GAS infection can also lead to a range of post-streptococcal autoimmune sequelae such as acute rheumatic fever, rheumatic heart disease, and acute glomerulonephritis [2, 3]. Life-threatening systemic GAS infection is more prevalent in, but not limited to, regions with insufficient medical infrastructure and is estimated to cause more than half a million deaths annually worldwide [4, 5]. Through coevolution, GAS has perfected its ability to manipulate the host fibrinolytic system for invasion. In humans, the plasminogen/plasmin (Plg/Plm) system plays a key role in fibrinolysis, tissue remodeling, and wound healing [6, 7, 8, 9]. This review focuses on the current understanding of the molecular mechanisms adopted by GAS to hijack the host Plg/Plm system during infection.
2. The plasminogen/plasmin system
The early observation by Dr. William S. Tillett in 1933 that streptococci stimulate fibrinolysis triggered the subsequent discoveries of how streptococci manipulate the fibrinolytic system to facilitate blood clot dissolution. The actual protein responsible for clot lysis is in fact a constituent of human plasma, rather than of the bacteria, and is not fibrinolytic until activated by the streptococcal protein named streptokinase (SK) [12, 13]. This human lytic factor is plasmin (Plm), the activated form of plasminogen (Plg).
2.1 Structure of Plg
Plasmin (Plm) is a plasma serine protease responsible for many physiological functions such as cell migration, wound healing, inflammation, and prohormone processing. Plm circulates in an inactive zymogen form called plasminogen (Plg).
Primarily synthesized and secreted by the liver, native Plg is an 89–92 kDa glycoprotein comprising seven domains: an N-terminal PAN-apple domain (PAp), followed by five homologous kringle domains (KR-1 to KR-5) and a serine protease domain (SP) (Figure 1a) [19, 20]. The PAp domain is important for maintaining a compact (closed) conformation in the circulation. Each KR domain has a lysine-binding site (LBS) that consists of the Asp-X-Asp/Glu motif (except KR-3, which carries the Asp-X-Lys mutation) and that recognizes and binds to surface lysine or arginine residues. The KR domains thereby facilitate the binding of Plg and Plm to substrates and targets (such as fibrin and cell surface receptors), which leads to the conformational change from closed to open. SP is the catalytic domain. In the zymogen form, residues His603, Asp646, and Ser741 (the catalytic triad) adopt an inactive configuration.
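The Asp-X-Asp/Glu pattern described above can be illustrated with a short motif scan. The following is a minimal sketch only: the function name `find_lbs_like_motifs` and both peptide strings are invented for illustration and are not real Plg kringle sequences. A sequence hit on its own would not indicate a functional LBS, which depends on the three-dimensional kringle fold.

```python
import re

# Asp-X-Asp/Glu in one-letter code: D, any residue, then D or E.
# A lookahead is used so that overlapping hits are also reported.
LBS_LIKE_MOTIF = re.compile(r"(?=(D[A-Z][DE]))")


def find_lbs_like_motifs(sequence):
    """Return (1-based position, tripeptide) for every Asp-X-Asp/Glu hit."""
    return [(m.start() + 1, m.group(1))
            for m in LBS_LIKE_MOTIF.finditer(sequence.upper())]


# Hypothetical toy peptides (NOT real kringle sequences).
toy_intact = "CKTGDGENYRG"    # contains D-G-E, matching Asp-X-Glu
toy_kr3_like = "CKTGDGKNYRG"  # D-G-K, mimicking the Asp-X-Lys change in KR-3

print(find_lbs_like_motifs(toy_intact))
print(find_lbs_like_motifs(toy_kr3_like))
```

In this toy case the first peptide yields a single hit and the KR-3-like peptide yields none, mirroring the loss of the motif by the Asp-X-Lys substitution.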
2.2 Physiological activation of Plg
In mammals, the two key physiological Plg activators are tissue-type (tPA) and urokinase-type (uPA) Plg activators (Figure 1b). Activation of Plg requires its co-localization with the activators; the expression of these activators is regulated both spatially and temporally in vivo [22, 23, 24]. Thus, Plm plays a key role in fibrinolysis intravascularly, on the surface of fibrin clots in the presence of tPA, and in cellular migration and tissue remodeling extravascularly, in the presence of uPA bound on cell surfaces.
Upon binding to the targets, Plg adopts an open conformation. The activation loop, which is obstructed by the linker between KR-3 and KR-4 in the closed conformation, becomes exposed. The activation bond (Arg561-Val562) is then proteolytically cleaved by uPA and tPA [25, 26]. The nascent N-terminal Val562 moves by 11.6 Å forming a salt bridge with Asp740; this triggers a series of conformational changes and thus allows the formation of a functional substrate binding site and catalytic pocket (Figure 1c). In Plm, the heavy chain (N-terminal domains, 63 kDa) and the light chain (SP domain, 25 kDa) are linked together by two disulfide bonds, between Cys558-Cys566 and Cys548-Cys666.
Serine protease inhibitors (termed serpins) play a key regulatory role in ensuring that there is neither aberrant activation of Plg nor free Plm in the circulation. Under physiological conditions, the activity of plasminogen activators is modulated by their specific plasminogen activator inhibitors (PAI-1 and PAI-2) (Figure 1b). Active Plm that is not physically immobilized is removed immediately from the circulation by the Plm-specific inhibitor α2-antiplasmin (α2-AP) [29, 30].
3.1 Structure and function of SK
SK is secreted by GAS as a 47 kDa protein and consists of three homologous domains, termed α, β, and γ, held together by flexible linker loops. Each domain adopts a β-grasp fold consisting of 4–5-stranded β sheets and a central α-helix or a coiled coil. The interaction between Plg/Plm and SK is evolutionarily conserved and strictly species specific [32, 33]. SK variants secreted by GAS isolated from different species (e.g., from human, pig, and horse) are incapable of any cross-species reactivity and are therefore predicted to share not only low sequence identity but also low structural homology.
X-ray crystallography studies of the binary complex of the Plm SP domain (μPlm) and SK reveal that SK wraps around the SP domain, forming a horseshoe-shaped structure [31, 34] (Figure 2). Superposition of full-length closed Plg onto the μPlm-SK structures further suggests that SK can bind Plg while Plg remains in its closed conformation, without any steric clashes (Figure 2). This observation provides fundamental insights into the mode of Plg activation by SK, as discussed in the next section.
SK is not a protease, nor does it activate Plg by proteolytically cleaving the activation loop as uPA and tPA do (see above). It forms a 1:1 stoichiometric complex with Plg through a rapid binding reaction, with an association rate of 5.4 × 10⁷ M⁻¹ s⁻¹. Binding of SK to free Plg results in the formation of catalytically active Plg (Plg*) (Figure 3). The SK-Plg* binary complex [37, 38] cleaves Plg, in either the closed or the open conformation, to form Plm.
The Plm generated has a much higher (~57,000-fold) affinity for SK than Plg does (KD of 11 pM and 624 nM, respectively), such that the Plg in the SK-Plg* complex is replaced by Plm to form the final and irreversible SK-Plm complex (Figure 3) [39, 40]. The inhibitory capacity of α2-AP is significantly reduced, with a ~10,000-fold lower affinity for the SK-Plm complex than for Plm (rate constants of 1.4 × 10² and 5.4 × 10⁷ M⁻¹ s⁻¹, respectively) [36, 41]; accordingly, GAS infection could potentially generate unregulated pericellular proteolytic (i.e., Plm) activity within the host.
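The fold difference in affinity quoted above follows directly from the two dissociation constants. The sketch below reproduces that arithmetic and, purely as an illustration, estimates how SK would partition between Plg and Plm. The partition estimate assumes a simple, reversible, single-site competition with both ligands in excess over SK; this is an assumption made here for illustration and not a model taken from the cited studies, which describe the SK-Plm complex as effectively irreversible. The function names are hypothetical.

```python
# Dissociation constants quoted in the text, expressed in molar units.
KD_SK_PLG_M = 624e-9  # SK-Plg, 624 nM
KD_SK_PLM_M = 11e-12  # SK-Plm, 11 pM


def fold_tighter(kd_weak, kd_tight):
    """How many times tighter the second interaction is than the first."""
    return kd_weak / kd_tight


def fraction_of_sk_on_plm(plg_molar, plm_molar):
    """Share of ligand-bound SK that carries Plm rather than Plg.

    Assumption (illustrative only): reversible competition for one site,
    with free ligand concentrations approximated by the totals.
    """
    plg_term = plg_molar / KD_SK_PLG_M
    plm_term = plm_molar / KD_SK_PLM_M
    return plm_term / (plg_term + plm_term)


ratio = fold_tighter(KD_SK_PLG_M, KD_SK_PLM_M)
print(f"Plm binds SK about {ratio:,.0f}-fold more tightly than Plg")

# Even when Plm is 1000-fold less abundant than Plg, it dominates SK occupancy.
print(f"{fraction_of_sk_on_plm(plg_molar=1e-6, plm_molar=1e-9):.3f}")
```

Under these assumptions, even a small amount of newly generated Plm would be expected to displace Plg from SK, consistent with the replacement step described in the text.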
3.2 Plg activation by SK
How does SK activate Plg without cleaving the activation loop? The current model suggests that the N-terminal Ile1 residue of SK inserts into the binding cleft of Val562 in the SP domain and forms a salt bridge with Asp740. Accordingly, it induces a conformational change and formation of a functional catalytic site [42, 43, 44]. This “molecular sexuality” mechanism of cofactor-induced zymogen activation is also reported in the activation of prothrombin-2 by staphylocoagulase from Staphylococcus aureus.
The activation loop of Plg has evolved, via negative selection, to be a poor substrate of Plm, which minimizes the risk of autoactivation. Binding to SK, however, changes the shape of the substrate binding pocket. In doing so, SK-Plg* and SK-Plm become highly specific in the binding and cleavage of the Plg activation loop, and this leads to a total deregulation of the fibrinolytic system.
Lastly, how does SK-Plg* or SK-Plm access the activation loop of Plg, which is shielded in the closed conformation as previously discussed? Published data suggest that SK mediates a conformational change in the substrate Plg. Specifically, the substrate binding site of SK-Plm is situated at the tip of the protruding 250-loop region (residues Ala251-Ile264) of the SK β domain (Figure 2). Mutation studies reveal that residues Arg253, Lys256, and Lys257 of the same 250-loop can also bind simultaneously to the substrate Plg via the LBS of its KR-5 domain, forming a ternary complex [31, 49, 50]. Thus, it is conceivable that the SK β domain peels KR-5 away from the PAp domain, which leads to the formation of an open Plg with its activation loop exposed.
Further, SK has a 20-fold higher affinity for Plg in the open conformation, presumably due to additional interactions with other KR domains [40, 51]. Specifically, the C-terminal Lys414 of the SK γ domain has been shown to interact with the KR-4 LBS [52, 53]. Apart from Lys414, other Lys residues located in the β and γ domains might also be involved in binding to other KRs; together they promote a remarkably high-affinity interaction between SK and Plg/Plm in their open conformation. However, without any structural data on the co-complexes of the relevant domains, the exact mechanism of the LBS-dependent interactions remains speculative.
3.3 Classification of streptokinase
All invasive GAS strains express SK to enhance dissemination [55, 56] and colonization within the host. Interestingly, the SK alleles are polymorphic and can be subdivided into two phylogenetic lineages based on the highly variable β domain, namely, cluster 1 (SK1) and cluster 2 (SK2) (Figure 4a) [38, 58, 59]. The sequence identities of the α, β, and γ domains between GAS strains are 77, 55, and 84%, respectively. GAS from different clusters show different properties in Plg activation, receptor expression, and receptor binding.
The SK1-Plg complex is enzymatically active (Figure 4b) but has been shown to be susceptible to α2-AP inhibition. Furthermore, SK1-Plg can bind to fibrinogen (Fg) and form the Fg-SK1-Plg ternary complex without any change in enzymatic activity. SK1-Plg binds directly to Plg receptors such as glyceraldehyde-3-phosphate dehydrogenase and enolase, whereas Fg-SK1-Plg binds to M protein receptors such as M1.
SK2 is further subdivided into two clusters: SK2a and SK2b. Like SK1 strains, SK2a strains express M protein and other Plg receptors, and the SK2a-Plg* complex is enzymatically active. One striking difference is that both SK2a-Plg* and SK2a-Plm are resistant to α2-AP inhibition. SK2b, on the other hand, is co-expressed with a specific Plg receptor called plasminogen-binding group A streptococcal M-like protein (PAM, see next section) [38, 59]. SK2b has a lower affinity for Plg (30-fold lower than SK1 and SK2a), and the SK2b-Plg complex is enzymatically inactive. Thus, Plg activation by SK2b is strictly limited to the bacterial cell surface. Once the quaternary PAM-SK2b-Plg-Fg complex has formed, it is resistant to α2-AP (Figure 4b).
The polymorphism and functional differences between the SK variants result in different physiopathology of streptococcal infection. For example, the PAM-expressing SK2b strains, in which Plm activity is restricted to the cell surface, are able to sustain much longer-lasting skin infections [37, 62].
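The cluster properties described in this section can be collected into a small lookup structure for side-by-side comparison. This is a minimal sketch: the class name `SKCluster`, its field names, and the helper `surface_restricted` are hypothetical, and the boolean fields simplify the text (for SK2b, resistance to α2-AP refers to the quaternary PAM-SK2b-Plg-Fg complex, not to the soluble SK2b-Plg complex).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SKCluster:
    name: str
    soluble_complex_active: bool  # is the SK-Plg complex active in solution?
    alpha2_ap_resistant: bool     # resistance to alpha2-AP, as stated in the text
    receptors: Tuple[str, ...]    # receptors associated with the cluster
    note: str = ""


SK_CLUSTERS = {
    "SK1": SKCluster(
        "SK1", True, False,
        ("glyceraldehyde-3-phosphate dehydrogenase", "enolase",
         "M protein (e.g., M1, via Fg-SK1-Plg)"),
    ),
    "SK2a": SKCluster(
        "SK2a", True, True,
        ("M protein", "other Plg receptors"),
    ),
    "SK2b": SKCluster(
        "SK2b", False, True,
        ("PAM",),
        note="~30-fold lower Plg affinity; resistance applies to PAM-SK2b-Plg-Fg",
    ),
}


def surface_restricted(cluster):
    """Plg activation is confined to the cell surface if the soluble complex is inactive."""
    return not cluster.soluble_complex_active


for cluster in SK_CLUSTERS.values():
    print(cluster.name, "surface-restricted:", surface_restricted(cluster))
```

Encoding the clusters this way makes the key contrast explicit: only SK2b depends on a surface receptor (PAM) for activity.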
4. Plg-binding group A streptococcal M-like protein (PAM)
M protein is the major virulence determinant of GAS. It belongs to a family of dimeric coiled-coil surface-associated proteins. Under the electron microscope, it appears as a fibrillar coat on the bacterial surface. The sequence of M proteins is highly variable, especially in the first 50 residues at the N-terminus, known as the hypervariable region (HVR). Strain typing based on the HVR sequence has identified more than 250 M subtypes to date. The variable region confers affinity to different host molecules, such as Fg, immunoglobulin, and complement factor H. A number of reviews have been published on the sequence patterns and functions of the M protein family [64, 69], and these topics will therefore not be covered in the current paper. Here, we focus on the structure and function of PAM, a specific Plg receptor that mediates Plg activation by SK2b.
4.1 Structure of PAM
PAM is encoded by the emm gene situated in the multiple gene activator (mga) regulon. The mga regulon contains a varying number of emm or emm-like genes and forms the basis of the five different emm patterns (types A-E). PAM-positive GAS strains are exclusively emm pattern D [70, 71].
PAM has the overall domain architecture of an M protein, which includes a hypervariable region (HVR) at the N-terminus, followed by the variable A and B repeat domains, the conserved C and D domains, and an anchor region (Figure 5a). In the precursor protein, a signal sequence precedes the HVR and is removed upon secretion. The anchor region contains an LPXTG motif that is responsible for sortase-mediated crosslinking of the C-terminus to the cell wall peptidoglycan.
To date, no binding target or function has been assigned to the HVR. However, as this region extends the farthest from the cell surface, it might serve as a hypermutable decoy that promotes GAS evasion of host immunity, as observed for the HVR of the M1 and M5 proteins.
The A repeat domain consists of up to two tandem repeats termed a1 and a2. The a1 and a2 repeats each harbor a conserved Plg-binding motif consisting of an arginine-histidine dipeptide (termed the RH motif). PAM variants differ mainly in the HVR and A repeat region and can be divided into three classes based on the A domain arrangement, namely, I, II, and III (Figure 5a). All classes have the a2 repeat: class I has both the a1 and a2 repeats, class II has only a2, and class III has both repeats as in class I, but separated by a three-residue insertion. In variants such as PAMNS265 and PAMNS32, the second RH motif is mutated to Arg-Tyr and Gly-His, respectively (Figure 5a). Despite these variations, all PAM variants bind to human Plg with high affinity [71, 73, 74].
Based on NMR studies, the structure of the HVR and A domain is predominantly disordered, and binding to Plg results in a major conformational change and the formation of α-helical structures (Figure 5b). This observation was further supported by experimental data published in a recent study, which revealed that the conformational switch can be detected even without binding to Plg and that the alternation between a disordered and a dimeric α-helical structure occurs in a temperature-dependent manner, similar to that previously reported for the M1 protein. This observation could be explained by conformational sampling of the flexible domains.
Other than the aforementioned dynamic and dimeric interaction at the N-terminal HVR and A domains, the current structural model of PAM is a coiled-coil dimer, which is stabilized via the interaction of the C and D domains. It is proposed that at least two C domains are required for stable dimer formation. However, PAMNS455, one of the smallest PAM variants identified to date, contains only one C domain (Figure 5a). While it has been shown that PAMNS455 has high affinity for Plg, the question remains whether and how PAMNS455 maintains the dimeric assembly.
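The three-class scheme described above can be written as a small decision rule. This is a minimal sketch that assumes a PAM variant has already been annotated for the presence of each repeat and for the length of any insertion between them; the function name `classify_pam` and its parameters are hypothetical, and arrangements not described in the text are rejected instead of being guessed.

```python
def classify_pam(has_a1, has_a2, insertion_length=0):
    """Assign PAM class I, II, or III from the A repeat arrangement.

    insertion_length is the number of residues separating a1 and a2
    (0 when the repeats are directly adjacent).
    """
    if not has_a2:
        raise ValueError("all PAM classes described in the text carry the a2 repeat")
    if not has_a1:
        return "II"   # a2 only
    if insertion_length == 0:
        return "I"    # a1 and a2 in tandem
    if insertion_length == 3:
        return "III"  # a1 and a2 separated by a three-residue insertion
    raise ValueError("arrangement not covered by the three classes in the text")


print(classify_pam(has_a1=True, has_a2=True))                      # I
print(classify_pam(has_a1=False, has_a2=True))                     # II
print(classify_pam(has_a1=True, has_a2=True, insertion_length=3))  # III
```

Note that the rule considers only the repeat arrangement; point mutations within an RH motif, such as those mentioned for PAMNS265 and PAMNS32, do not change the class under this scheme.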
4.2 Binding mechanism of PAM to Plg
PAM binds directly to both Plg and Plm with high affinity (KD of ~1 nM) [77, 78], through binding of the RH motifs in the A repeat region to the Plg KR2 domain (Figure 6). Based on the crystal structures of the a1 repeat-KR2 binary complex, the side chains of the RH motif residues Arg101 and His102 form a pseudo-lysine moiety (called the lysine isostere) and bind to the LBS of KR2 [79, 80]. Residues flanking the RH motif, such as Asp91, Glu93, Leu97, Lys98, and Glu104, mediate further intermolecular interactions by binding to residues of KR2 outside the LBS, namely, Tyr200, Arg220, Arg234, and Trp235. These additional interactions play important roles in stabilizing the complex. Of these residues, Tyr200 and Arg220 are unique to KR2; accordingly, they may drive the specificity of the A repeats toward the KR2 domain. In doing so, PAM is expected not only to bind tightly to Plg but also to do so without competing with SK binding. Further structural studies would be required to validate this hypothesis.
Outside the A repeats-KR2 binding interface, many questions remain to be addressed regarding the interaction between Plg and PAM. For instance, both a1 and a2 were shown to bind KR2, but would a single PAM monomer bind to two Plg molecules? Further, KR2 in closed Plg is inaccessible. How does PAM bind to KR2? Does it induce a conformational change in Plg prior to complex formation? Furthermore, the N-linked glycosylation of Plg at KR3 in Plg glycoform I reduces its affinity for PAM; does KR3, which does not have a functional LBS, mediate exosite interaction(s) with PAM?
5. Streptococcal inhibitor of complement (SIC)
Streptococcal inhibitor of complement (SIC) is a 31-kDa secreted virulence factor found in M1 and M57 GAS serotypes. SIC is named for its inhibition of complement-mediated cell lysis. SIC binds to complement system regulators such as histidine-rich glycoprotein, clusterin, and the membrane attack complex C5b-C9 (Figure 7a). Subsequent research revealed that SIC also binds to antimicrobial peptides [84, 85], extracellular histones, fibrin, thrombin, and plasminogen. Accordingly, the physiological role of SIC is to manipulate the host defense system for infection and invasion. Of particular interest to the current review is that it inhibits the fibrinolytic system through binding to Plg.
5.1 Structure and function of SIC
SIC consists of an N-terminal signal peptide that is cleaved upon secretion; the mature form has a short repeat region followed by three tandem repeats of about 30 residues each (Figure 7b). The three-dimensional structure of SIC is currently unknown, and there is no apparent sequence identity with proteins in databases such as Pfam.
In addition to its well-known roles in suppressing the host defense system, SIC has been shown to modulate the fibrinolytic system. It was proposed that SIC inhibits SK-mediated Plg activation. Specifically, SIC-positive GAS entrapped in a fibrin clot survive for much longer than the SIC-negative strain. The entrapped bacteria colonize before disseminating from the primary infection sites.
SIC is expressed in the early growth phase of M1 GAS; its role in temporally regulating the activity of SK was reported only in a recent study. It was shown that the Plg-binding motif(s) in SIC is located within the C-terminal 200 residues, which presumably bind the Plg KR domains. Significantly, SIC does not bind to Plm but binds specifically to Plg, competing with SK for Plg. It remains to be determined experimentally whether the C-terminal domain of SIC also binds to the Plg KR5 and/or SP domains as SK does, as discussed previously.
The fibrin network plays a pivotal role in innate immune defense by entrapping pathogens within the primary infection sites. GAS infection studies in animals have provided strong evidence that GAS has the ability to manipulate the host fibrinolytic system at many levels [88, 89]. On one hand, hijacking the host Plg/Plm on the bacterial surface provides an energy-efficient strategy to break down the fibrin network during dissemination [55, 57, 90], and this is achieved with the aid of PAM. Using GAS strains that express both the SK2b and PAM genes, it was shown that inactivation of either gene significantly reduces virulence. SIC, on the other hand, allows the bacteria to use the fibrin network as a shelter during the initial colonization phase, and it simultaneously inhibits the complement system to ensure the survival of the bacteria in the early infection phase. The combined effects of these virulence factors perhaps allow the SIC-expressing M1 strain to be one of the most invasive GAS strains.
GAS has evolved into a formidable pathogen through millennia of coexistence with its human host and natural selection; it is both invasive and evasive, manipulating host immunity with a plethora of virulence factors. The three extracellular virulence factors discussed in this review specifically modulate the fibrinolytic system via an assembly of Plg modulators. Ironically, these virulence factors are capable of outranking their human counterparts in terms of efficiency and affinity. SK, for instance, is the most efficacious Plg activator ever discovered, and it was therefore the first therapeutic approved for the treatment of thrombotic disorders including myocardial infarction and pulmonary embolism. With the increasing prevalence of antibiotic-resistant superbugs, GAS infection is expected to pose a risk to public health worldwide. A better understanding of the molecular mechanisms by which these virulence factors manipulate host immunity will provide insight into the future development of treatments for GAS infection.
This work was supported in part by the Australian National Health and Medical Research Council. J. C. W. is an Australian Laureate Research Fellow.
Conflict of interest
There is no conflict of interest.