A rare norovirus (NoV) genotype, GII.17, recently emerged and rapidly became predominant in most East Asian countries in the winter of 2014–2015. In this study, we report the diversity of NoV GII.17 in detail; a total of 646 GII.17 sequences obtained during 1978–2015 were analyzed and subjected to meta-analysis. At least five major recombinant GII.17 clusters were identified. The recombinant variant groups appeared to have emerged in the following order: GII.P4-GII.17 (1978–1990), GII.P16-GII.17 (2001–2004), GII.P13-GII.17 (2004–2010), GII.Pe-GII.17 (2012–2015) and GII.P3-GII.17 (2011–2015). The newly emerged GII.P3-GII.17 variant, which exhibited significant sequence and structure variations, is evolving toward a unique lineage. Our results indicate that circulation of GII.17 appears to change every 3–5 years due to replacement by a newly emerged variant and that the evolution of GII.17 is sequentially promoted by inter-genotype recombination, which contributes to the exchange between non-GII.17 and GII.17 RdRp genes and drives the evolution of GII.17 capsid genes.
- global prevalence
Norovirus (NoV) is the predominant viral etiological agent of acute gastroenteritis across all ages, although the elderly and young children are usually more susceptible. Currently, the genus Norovirus can be divided into at least seven genogroups (GI–GVII). Of these, GI, GII and GIV can be detected in samples from human gastroenteritis cases [2, 3]. Within each genogroup, NoV strains can be further subdivided into diverse (over 40) genotypes based on sequence similarity of the RNA-dependent RNA polymerase (RdRp) and major capsid genes [2, 3]. Globally, GII strains account for over 75% of human NoV cases [4, 5], and a single genotype, GII.4, has been responsible for the majority of outbreaks since the 1990s [6, 7], with novel GII.4 variants emerging every 2–4 years [6, 7, 8].
In the winter of 2014–2015, a rare NoV genotype, GII.17, emerged in most East Asian regions, including China (Guangdong [9, 10, 11, 12, 13], Jiangsu, Zhejiang, Hebei, Hong Kong, Taiwan, Beijing and Shanghai), Japan and South Korea. Soon after, the novel GII.17 strain became the predominant NoV strain, replacing the pandemic strain GII.4 Sydney 2012, which had been responsible for the majority of gastroenteritis outbreaks in this region. A limited number of cases were also reported in other countries (Italy, Romania and the USA).
Recombination allows a substantial exchange of genetic material and is a major driving force of viral evolution [2, 27]. However, although multiple studies have reported on the evolutionary dynamics of NoVs, the role of recombination in shaping NoV evolutionary history remains an important open question. Given the importance of GII.17 NoV as a cause of epidemic gastroenteritis in recent years (2014–2015), it is crucial to better understand how this genotype has evolved over time.
This study aimed to determine the mechanisms of evolution of norovirus GII.17 strains from 1978 to 2015, with particular focus on the effects of recombination events on the acquisition of non-GII.17 RdRps.
2. Materials and methods
2.1. NoV GII.17 sequence datasets
Sequence datasets were constructed following the strategy described by Yu et al. Briefly, NoV GII.17 sequences were collected through two separate routes (Figure 1): first, a search of the GenBank nucleotide database in December 2015 using keyword combinations such as “norovirus and GII.17” and “norovirus and 17”; and second, a search of the PubMed and Google Scholar literature databases for papers published between 2003 and 2015 that contained both “norovirus” and “GII.17” in the title, keywords or abstract. Non-English literature was excluded.
Subsequently, all the NoV GII.17 sequences were downloaded in FASTA format. The corresponding information for each sequence was then edited in Geneious into a uniform format that included the sequence name (or accession number), length, sample source, sampling time and sampling site. Duplicated sequences were removed. To remove non-GII.17 sequences, all candidate sequences were analyzed with an online genotyping tool, which is designed to identify norovirus genotypes based on phylogenetic analysis.
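The curation steps described above (pooling the two sources, removing duplicates and discarding non-GII.17 entries) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the `Record` type, its fields and the accession strings are hypothetical, and the `genotype` field stands in for whatever label the online genotyping tool returned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    accession: str   # hypothetical identifier, e.g. "ACC0001.1"
    sequence: str
    genotype: str    # hypothetical label taken from a genotyping step


def curate(genbank, literature, wanted="GII.17"):
    """Pool both sources, keep only the wanted genotype, drop duplicates.

    Duplicates are detected by accession, ignoring case and any
    ".version" suffix (an assumption made for this sketch).
    """
    seen = set()
    kept = []
    for rec in list(genbank) + list(literature):
        if rec.genotype != wanted:
            continue
        key = rec.accession.split(".")[0].upper()
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept
```

Keeping the first occurrence of each accession means a GenBank record takes precedence over the same sequence recovered from the literature search.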
2.2. Phylogenetic analysis
Nucleotide sequences were aligned using the ClustalW program. Phylogenetic analysis was performed with the MEGA 5.1 package based on partial sequences of ORF1 (241 nt) and ORF2 (181 nt). The reference strains were retrieved from the NCBI nucleotide database. Phylogenetic trees were reconstructed using the maximum likelihood method with the Tamura-Nei model. Bootstrap support was calculated with 1000 pseudo-replicate datasets. The distance scale represents the number of nucleotide substitutions per site.
2.3. Recombination analysis
To detect recombination events, sequences were aligned with the ClustalW program in the MEGA 5.1 package and then checked manually. The reference strains were retrieved from the NCBI nucleotide database. NoV strains were defined as recombinants if they were grouped into different genotypic clusters on the phylogenetic trees, which were reconstructed using full-length sequences of ORF1 (5107 nt) and ORF2 (1623 nt), respectively.
The Simplot method was employed to identify the recombination breakpoints and to further verify the NoV recombinants. The bootstrap values were plotted for a window of 400 nt, moving in increments of 10 nt along the alignment.
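The sliding-window scheme (a 400 nt window advanced in 10 nt steps) can be illustrated with a short sketch. Note the substitution: for simplicity this computes plain percent identity per window rather than the bootstrap support that was plotted in the analysis, and the function names are hypothetical rather than part of Simplot.

```python
def window_starts(alignment_length, window=400, step=10):
    """Return 0-based start positions of every full window."""
    if alignment_length < window:
        return []
    return list(range(0, alignment_length - window + 1, step))


def windowed_identity(query, reference, window=400, step=10):
    """Percent identity of two aligned sequences in each window.

    Returns (window midpoint, percent identity) pairs. Columns where
    the query has a gap ("-") are counted as mismatches.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must come from the same alignment")
    profile = []
    for start in window_starts(len(query), window, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        matches = sum(1 for a, b in zip(q, r) if a == b and a != "-")
        profile.append((start + window // 2, 100.0 * matches / window))
    return profile
```

In a profile of this kind, a recombination breakpoint would appear as the position where the query's closest reference switches from one genotype to another.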
2.4. Homology modeling
The tertiary structures of the capsid P domains of NoV GII.17 were modeled using the SWISS-MODEL online server. The recently published X-ray crystal structure of the GII.17 P domain dimer (PDB accession: 5F4O) was used as the template for generating homology models. The constructed models were examined and edited using PyMOL.
3. Results
3.1. NoV GII.17 sequence dataset
A total of 472 and 1698 sequences were obtained from the GenBank nucleotide database using “norovirus and GII.17” and “norovirus and 17” as search keywords, respectively. After manually checking sequence genotyping results, the non-GII.17 sequences (n = 1614) of the above two sequence datasets were excluded from subsequent analysis (Figure 1). A literature search yielded 67 citations, among which 57 of the research articles (85%) reported norovirus GII.17 sequences (n = 362). All screened sequences from these two independent sources were combined, and duplicated sequences were removed. Finally, a total of 646 sequences belonging to NoV GII.17 were obtained (Figure 1).
These 646 sequences ranged from 205 to 7570 nt in length. Over 60% of them (n = 427) were shorter than 400 nt, while 57 sequences covered the nearly complete or the complete viral genome of more than 7000 nt. All the NoV GII.17 sequences were located within ORF1, ORF2 or the region overlapping ORF1 and ORF2. Specifically, 3.41% of sequences (n = 22) belonged to ORF1, and 79.26% (n = 512) belonged to ORF2. The remaining 112 sequences contained regions from both ORF1 and ORF2.
3.2. Genetic diversity of NoV GII.17
Genotyping analysis revealed that many of the GII.17 sequences were genetic recombinants, since two distinct genotypic regions, ORF1 (RdRp) and ORF2 (VP1), were identified in the same sequence. All the recombinants contained the same ORF2 genotype, GII.17, while exhibiting varying ORF1 genotypes: GII.P3, GII.P13, GII.P16, GII.Pe and GII.P4. Phylogenetically, all the GII.17 ORF2 sequences were categorized into at least five major clusters. Interestingly, each cluster was composed of only one type of GII.17 recombinant, namely GII.P4-GII.17, GII.P16-GII.17, GII.P13-GII.17, GII.Pe-GII.17 and GII.P3-GII.17 (Figure 2A). The collection dates of the corresponding strains within each cluster indicated a temporally sequential clustering and were distributed as follows: GII.P4-GII.17 cluster during 1978–1990, GII.P16-GII.17 cluster during 2001–2004, GII.P13-GII.17 cluster during 2004–2010, GII.Pe-GII.17 cluster during 2012–2015 and GII.P3-GII.17 cluster during 2011–2015 (Figure 2A).
Notably, 27 sequences did not belong to any of the above-described lineages (Figure 2A); however, they were more closely related to the GII.P13-GII.17 cluster. Coincidentally, all of these sequences were isolated in a time period (1998–2012) similar to that of the GII.P13-GII.17 cluster (2004–2010) (Figure 2A).
The GII.P3-GII.17 cluster, containing 486 sequences, was the largest recombinant lineage of the five recombinant clusters, with most of the sequences discovered in the past 2 years (2014 and 2015) (Figure 2A). In addition, the VP1 sequences (viral protein 1, the major capsid protein) of the recently emerged epidemic strains during 2014 and 2015 were also classified into this GII.P3-GII.17 cluster (Figure 2A).
3.3. Recombination confirms the emergence of GII.17
Given that only a small portion of the ORF2 sequences (181 nt) was included in the phylogenetic tree (Figure 2A) and that the bootstrap value of the GII.P3-GII.17 cluster was less than 80%, two more phylogenetic trees were constructed using the full-length sequences of ORF1 and ORF2 (n = 54) in order to further verify the recombination in sequences from the GII.P3-GII.17 cluster. On the ORF1 tree (Figure 2B), all the sequences were grouped into GII.3 genotype clusters with a high bootstrap value (99%). On the ORF2 tree (Figure 2C), sequences were grouped into GII.17 genotype clusters with a high bootstrap value (99%). These results demonstrated that the new epidemic strains were in fact GII.P3-GII.17 intergenic recombinants.
Simplot analysis was also performed to confirm the recombination events. The candidate recombinant sequences shared high similarities with GII.3 in ORF1 and with GII.17 in ORF2, respectively. In addition, recombination breakpoints were identified within the overlapping regions of ORF1 and ORF2.
3.4. Sequence variations of the GII.17 P domain
To explore the evolution of NoV GII.17 capsid genes and the importance of intergenic recombination, five representative P domain sequences, one from each distinct cluster (GII.P3, GII.P13, GII.P16, GII.Pe and GII.P4) discovered between 1978 and 2015, were subjected to analysis. Sequence similarity among all complete genomic sequences was analyzed using Simplot. The GII.P16-GII.17 and GII.Pe-GII.17 variants shared high similarity with the earliest GII.17 variant (GII.P4-GII.17, 1978), with identities of 92.73% and 90.45%, respectively (Figure 3B). However, the GII.P13-GII.17 variant, which emerged after GII.P16-GII.17, showed lower similarity to GII.P4-GII.17 (82.96% identity). The recently emerged GII.P3-GII.17 variant showed the largest divergence (76.12% identity) from the GII.P4-GII.17 variant (Figure 3B). Interestingly, a typical “V” shape was observed in our Simplot analysis (Figure 3A), suggesting that the P2 subdomain is the most variable region in the GII.17 capsid and continues to evolve.
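Identity values of the kind reported here can be computed from a pairwise alignment as in the following sketch. The gap handling is an assumption made for illustration (columns where both sequences have a gap are skipped, and a gap opposite a residue counts as a mismatch); Simplot's own settings may differ, so this is not expected to reproduce the published percentages exactly.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue          # column carries no information for this pair
        compared += 1
        if a == b:            # a gap opposite a residue never matches
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

For example, `percent_identity("ACGT", "ACGA")` compares four columns, of which three match.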
A structure-based sequence alignment of the P domain (amino acid) was performed to examine any potential differences among these five GII.17 variants. Overall, P1 subdomain sequences were relatively conserved (Figure 4); however, significant sequence differences were observed in the outer loop regions in the P2 subdomain that includes the B-loop (aa 291–299), P-loop (aa 339–352), A-loop (aa 375–383) and T-loop (aa 395–400) (Figure 4). In addition, slight changes were also found in the S-loop (aa 438–445) located in the P1 subdomain and in the U-loop (aa 408–417) located within the junctional region of the P1 and P2 subdomains (Figure 4).
Interestingly, various mutations were also identified in the P2 subdomain, especially in the loops on the outer surface of the protein. Compared with the other four variants, the recently emerged GII.P3-GII.17 variant carried three deletions (aa 349, 350 and 381) and two insertions (aa 379 and 397) (Figure 4). These mutations, which are located in the P-loop (aa 349 and 350), A-loop (aa 379 and 381) and T-loop (aa 397), might have altered the binding capacity of NoV to host human histo-blood group antigens (HBGAs) (Figure 4). These results also suggested that the GII.17 P domain might have evolved from a non-prevalent strain into an epidemic GII.17 variant strain (GII.P3-GII.17).
The position of amino acids in VP1 corresponds to AGI17592. Arrows indicate deletions (aa 349, 350 and 381) and insertions (aa 379 and 397) in the recently emerged GII.P3-GII.17 variant, as compared to the other GII.17 variants.
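The relationship between the reported insertion and deletion positions and the surface loops can be checked with a small lookup. The sketch below only restates the loop boundaries and positions given in this section (numbered according to AGI17592); the function and variable names are hypothetical.

```python
# Loop boundaries (inclusive amino acid positions) as listed in this section.
LOOPS = {
    "B-loop": (291, 299),
    "P-loop": (339, 352),
    "A-loop": (375, 383),
    "T-loop": (395, 400),
    "U-loop": (408, 417),
    "S-loop": (438, 445),
}

# Positions reported for the GII.P3-GII.17 variant.
INDELS = {"deletion": [349, 350, 381], "insertion": [379, 397]}


def loop_for(position):
    """Return the name of the loop containing the position, or None."""
    for name, (start, end) in LOOPS.items():
        if start <= position <= end:
            return name
    return None


def annotate(indels=INDELS):
    """Pair every reported position with the loop it falls in."""
    return {
        kind: [(pos, loop_for(pos)) for pos in positions]
        for kind, positions in indels.items()
    }
```

Under these boundaries, positions 349 and 350 fall within aa 339–352, positions 379 and 381 within aa 375–383, and position 397 within aa 395–400.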
3.5. Structure shift of the GII.17 P domain
To better understand how this novel NoV GII.17 spread so rapidly and widely, the P domain structures of all five GII.17 variants were constructed based on homology modeling. Comparison of the overall structures of the five GII.17 variants revealed that most amino acid substitutions occurred in the P2 subdomain, while the P1 subdomain remained relatively conserved.
Figure 5 shows the surface loops that formed the conventional GII NoV binding interface (B-, T-, N-, P-, U-, S- and A-loops). The front-side view indicates that the outer structures of the N- and U-loops of all five GII.17 variants remained relatively unchanged (Figure 5). However, compared to the other four GII.17 variants, distinct structural changes were observed in the A-, B- and T-loops of the GII.P3-GII.17 variant (Figure 5). The top-side view revealed that the P-loops of the GII.P13-GII.17 and the GII.P3-GII.17 variants are protruding from the surface of the proteins (Figure 5), a feature clearly different from those observed in the other three GII.17 variants.
To clarify the specific changes observed in the loops, superpositions of the P domain structures of the five representative GII.17 variants were constructed (Figure 6). The most prominent difference was observed in the P-loop (Figure 6). Three distinct P-loop structures were observed among the five GII.17 variants: the GII.P4-GII.17, GII.P16-GII.17 and GII.Pe-GII.17 variants shared the same structure, whereas the GII.P13-GII.17 and GII.P3-GII.17 variants each presented a unique structure (Figure 6). In addition, the T-, B-, U- and A-loops also exhibited considerable differences. Notably, the GII.P4-GII.17, GII.P16-GII.17, GII.Pe-GII.17 and GII.P13-GII.17 variants shared similar T-, B-, U- and A-loops, while the GII.P3-GII.17 variant displayed a unique structural shift in these four loops (Figure 6).
3.6. Global distribution of NoV GII.17 variants
Sequences of the five distinct variants of GII.17 (n = 646) were obtained from at least 24 countries and regions on five continents: Asia (China, Hong Kong, Taiwan, Japan, Bangladesh, Singapore, South Korea, Pakistan and Thailand), Europe (Italy, France and Russia), America (Brazil, Mexico, Nicaragua, Uruguay, French Guiana and the USA), Africa (Cameroon, Ethiopia, Kenya, Morocco and South Africa) and Oceania (Australia). The sequences were unevenly distributed among these regions, with certain countries over-represented compared to others. Most of the sequences were obtained in Asia (89.30%) and Africa (7.31%). The number of sequences collected in China was the highest (71.12%), followed by South Korea (9.80%), Kenya (4.99%) and Japan (4.81%).
Since most of these NoV GII.17 sequences were categorized into five recombinant variant clusters, it is important to understand how the different GII.17 variants are distributed globally. Only three sequences belonged to the GII.P4-GII.17 variant, and these were collected from French Guiana and Pakistan (Figure 7A). Most of the sequences of the GII.P16-GII.17 variant were isolated in Japan (n = 17) (Figure 7A). In addition, sequences of the GII.Pe-GII.17 variant were well dispersed among different countries including Cameroon (n = 1), Ethiopia (n = 6), China (n = 7), South Korea (n = 1), Brazil (n = 2) and the USA (n = 1) (Figure 7A). Sequences from the GII.P13-GII.17 variant were even more widely distributed: China (n = 3), Japan (n = 1), Bangladesh (n = 10), Singapore (n = 3), South Korea (n = 3), Thailand (n = 2), France (n = 2), Russia (n = 2), Brazil (n = 2), Mexico (n = 1), Nicaragua (n = 1), Uruguay (n = 1), Cameroon (n = 3) and South Africa (n = 1) (Figure 7A). As for the GII.P3-GII.17 variant, its sequences were also found in many countries, with most of them from China (n = 389), South Korea (n = 50), Kenya (n = 28), Japan (n = 9) and Thailand (n = 4) (Figure 7A).
Interestingly, multiple GII.17 variants were often observed in one country or region (Figure 7A). For example, over 70% of the GII.17 sequences (399 of 561) were detected in China, including sequences from the GII.P3-GII.17, GII.P13-GII.17 and GII.Pe-GII.17 variants (Figure 7A). Except for GII.P4-GII.17, the other four GII.17 variants were found in South Korea, with GII.P3-GII.17 being the most dominant. Different GII.17 variants were also observed in Japan (GII.P3-GII.17, GII.P13-GII.17 and GII.P16-GII.17), Cameroon (GII.P3-GII.17, GII.P13-GII.17 and GII.Pe-GII.17), the USA (GII.P3-GII.17, GII.P16-GII.17 and GII.Pe-GII.17), Thailand (GII.P3-GII.17 and GII.P13-GII.17) and Brazil (GII.P13-GII.17 and GII.Pe-GII.17) (Figure 7A).
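The country-level shares can be recomputed from the per-variant counts quoted in this section. The sketch below covers only China, Japan and Kenya, because these are the countries whose per-variant counts are all stated; it assumes that 561 (the denominator given for China) is the denominator for every percentage. The names are hypothetical.

```python
# Assumption: 561 is the denominator behind all reported country shares.
TOTAL = 561

# Per-variant counts quoted in the text (ORF1 genotype -> sequences).
COUNTS = {
    "China": {"GII.Pe": 7, "GII.P13": 3, "GII.P3": 389},
    "Japan": {"GII.P16": 17, "GII.P13": 1, "GII.P3": 9},
    "Kenya": {"GII.P3": 28},
}


def country_share(country, counts=COUNTS, total=TOTAL):
    """Return (number of sequences, percentage of total) for one country."""
    n = sum(counts[country].values())
    return n, round(100.0 * n / total, 2)


for name in COUNTS:
    n, pct = country_share(name)
    print(f"{name}: {n} sequences ({pct:.2f}%)")
```

With these inputs the per-variant counts sum to 399, 27 and 28 sequences, which correspond to the 71.12%, 4.81% and 4.99% reported for China, Japan and Kenya.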
3.7. Yearly prevalence of GII.17 variants
Figure 7B shows the yearly distribution of the GII.17 variants detected worldwide. The GII.17 sequences isolated from 1978 to 2015 were unevenly distributed. The number of isolated sequences peaked in 2001, 2006 and 2009, then continued to increase significantly from 2011 to 2015 and reached the highest peak in 2015 (Figure 7B). This result suggested that circulation of GII.17 might change every 3–5 years due to replacement by a newly emerged variant. The GII.P4-GII.17 variant, which emerged the earliest, only appeared in 1978 and 1990 (Figure 7B). About a dozen years later, the GII.P16-GII.17 variant was discovered in 2002 and persisted until 2004, when the GII.P13-GII.17 variant emerged and replaced it. A few of the GII.P3-GII.17 variants emerged between 2009 and 2013, and the number of sequences that belonged to this variant sharply increased during 2014 and 2015. Over 70% of the GII.17 sequences belonged to this GII.P3-GII.17 variant and were isolated during this period (Figure 7B). It is worth noting that the GII.Pe-GII.17 variant was also observed between 2012 and 2015, although the number of sequences identified during this period was far lower than that of the GII.P3-GII.17 variant (Figure 7B).
4. Discussion
NoV, one of the leading causes of human acute gastroenteritis, is widely distributed around the world. Over the past decades, NoV GII.4 has been the predominant genotype, involved in numerous epidemic outbreaks, for example, in 2002, 2004, 2006, 2009 and 2012. Recently, a novel NoV GII.17 variant has emerged and is responsible for multiple disease outbreaks, mainly in China and Japan [9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. GII.17 replaced GII.4 as the dominant strain in these regions during 2014–2015.
NoV GII.17 strains have been circulating among various human populations for over 37 years, with previously emerged GII.17 sequences being sporadically detected in multiple regions on several continents, including Africa, Asia, Europe, North America and South America (Figure 7A). However, it is still unclear why this recently emerged GII.17 variant spread so rapidly and widely within such a short time period.
4.1. Genetic diversity and recombination
Previous studies have suggested that GII.17 sequences could be classified into distinct clusters based on the divergence of VP1. However, only a few GII.17 sequences were analyzed in these studies [14, 15, 18, 20, 24, 26]. In order to investigate the diversity of NoV GII.17 in detail, a large sequence dataset was constructed and subjected to meta-analysis and genotyping in this study (Figure 1). Surprisingly, at least five major recombinant GII.17 clusters, including GII.P4-GII.17, GII.P16-GII.17, GII.P13-GII.17, GII.Pe-GII.17 and GII.P3-GII.17, were identified. In addition, each recombinant group appeared to have emerged following a particular time order. For example, the earliest group of GII.P4-GII.17 was isolated in 1978, followed by GII.P16-GII.17 (2001–2004), GII.P13-GII.17 (2004–2010), GII.Pe-GII.17 (2012–2015) and GII.P3-GII.17 (2011–2015). This indicates that the evolution of GII.17 is sequentially promoted by inter-genotype genetic recombination, which contributes to the exchange between non-GII.17 and GII.17 RdRp genes and promotes the evolution of GII.17 capsid genes. Consequently, these recombination events could have affected the antigenic properties of NoVs and accelerated the emergence of novel epidemic variants or strains [17, 23], for example, GII.P3-GII.17. The genomic diversity of the rare GII.17 genotype demonstrated in this study is far greater than previously reported.
Moreover, based on the phylogenetic trees of RdRp and VP1 (Figure 2B and C), all GII.P3-GII.17 sequences were further subdivided into clusters I and II. Cluster I contains the variants identified from 2013 to 2014, while the variants isolated from 2014 to 2015 comprise cluster II. Our results suggest that after RdRp recombination with the pandemic genotype of GII.P3, the subsequently emerged GII.17 strains underwent various modifications at the capsid [23, 35], which, consequently, promoted the emergence of new variants (Figure 3).
Notably, the co-circulation of the GII.P3-GII.17 variant with the non-epidemic strains of GII.Pe-GII.17 indicates that the evolution of GII.17 strains was driven by multiple mechanisms in different directions within the human population in recent years (Figure 2A).
4.2. Shift and variation in capsid structure
Chan et al. and Lu et al. have previously reported that the GII.17 VP1 evolved faster than GII.4 VP1. The GII.P3-GII.17 VP1 evolves at a rate of 5.68 × 10⁻³ nucleotide substitutions per site per year, which is comparable to that of GII.4 VP1 (5.3–6.3 × 10⁻³ substitutions per site per year) and faster than that of other NoV genotypes such as the epidemic genotypes GII.3 and GII.7 (1.961 × 10⁻³ and 2.36 × 10⁻³ nucleotide substitutions per site per year, respectively). In this study, compared with previously circulating GII.17 variants, the recently emerged GII.17 variant was found to exhibit significant sequence and structure variations.
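To give these rates a more tangible scale, a rate per site per year can be converted into an expected number of substitutions across a gene. The sketch below is a back-of-the-envelope illustration only: it uses the full-length ORF2 size of 1623 nt quoted in the methods and applies it to every genotype, which is a simplifying assumption, and the names are hypothetical.

```python
ORF2_LENGTH_NT = 1623  # full-length ORF2 size quoted in the methods

# Rates quoted in the text, in nucleotide substitutions per site per year.
RATES = {
    "GII.P3-GII.17": 5.68e-3,
    "GII.3": 1.961e-3,
    "GII.7": 2.36e-3,
}
GII4_RANGE = (5.3e-3, 6.3e-3)  # range quoted for GII.4 VP1


def expected_substitutions(rate, sites=ORF2_LENGTH_NT, years=1.0):
    """Expected substitutions over the given number of sites and years."""
    return rate * sites * years


for genotype, rate in RATES.items():
    per_year = expected_substitutions(rate)
    print(f"{genotype}: about {per_year:.1f} substitutions per year")
```

On this rough scale, the rate quoted for GII.P3-GII.17 corresponds to roughly nine substitutions per year across ORF2, compared with about three to four for GII.3 and GII.7.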
First of all, the new GII.P3-GII.17 variant showed the most noticeable sequence divergence (76.12% of nucleotide identity) compared to the other GII.17 variants, for example, GII.P16-GII.17, GII.P13-GII.17 and GII.Pe-GII.17 (Figure 3), which could be a driving force of antigenic drift of the new GII.17 variant.
Secondly, at the amino acid level, the P domain of the GII.P3-GII.17 variant appeared to have evolved more rapidly than the other variants (Figure 4). Specifically, three deletions and two insertions were observed in the new GII.P3-GII.17 variant (Figure 4). The homology structure of the P domain also showed that many mutations were located in the human histo-blood group antigen (HBGA) binding loops of the newly emerged GII.P3-GII.17 variant (Figure 5). Notably, most amino acid substitutions were found in the P2 subdomain. These results suggest that these mutations most likely affect the HBGA binding property of the new GII.17 variant, which in turn, might expand its host range and prevalence [17, 21, 35]. It is important to note that changes at the antigenic epitopes of the viral capsid might also lead to adaptive advantages that contribute to the rapid spread of the virus.
The observed structural variations led us to hypothesize that, in the beginning of GII.17 evolution, the binding interface of these viruses remained relatively stable from 1978 to 2004, as reflected in the variants GII.P4-GII.17 (1978–1990) and GII.P16-GII.17 (2001–2004). Soon after that, a significant change occurred at the surface loops of the capsid (GII.P13-GII.17, 2004–2010), followed by more dramatic changes in 2011–2015 (GII.P3-GII.17) (Figure 2A). Interestingly, the loop structures of the GII.Pe-GII.17 variant remained relatively constant, even though it was in circulation during the 2012–2015 period, indicating that this variant retained the conventional GII.17 HBGA binding interfaces. Our results also suggested that the GII.P3-GII.17 variant might have evolved as a unique lineage separated from the other GII.17 variants.
Thirdly, it is also possible that variations in the rate of evolution of the capsid were promoted by recombination, since recombination resulted in the same capsid lineage being associated with different RdRps. Recombination at breakpoints between the RdRp and capsid genes might have allowed GII.17 to acquire an RdRp with lower fidelity and/or increased replication efficiency, resulting in a higher rate of evolution for the associated capsid gene. Moreover, within a short time period, the human host might not yet have adapted to the new GII.17 capsid, which might also support the rapid spread of the new GII.17 variant.
4.3. Waterborne GII.17 strains
NoV particles are usually detected in environmental water or in effluents from wastewater treatment plants throughout the year [40, 41, 42]. In fact, the attack rate of waterborne NoV (over 11%) was found to be significantly higher than that associated with person-to-person or environmental transmission.
Interestingly, the dataset in this study revealed that approximately 20% (129 of 646) of GII.17 sequences were collected from water samples or water-related outbreaks. Such waterborne GII.17 outbreaks have been reported in many countries, for example, the USA, in 2005; South Korea, in 2008–2012; Guatemala, in 2009; and China, in 2014–2015 [16, 46]. Moreover, many GII.17 sequences were also detected in environmental water samples from around the world, such as Shandong (China), Singapore [48, 49], Japan [50, 51], South Africa [52, 53], Kenya and New Orleans (USA).
Since the capsid protects genomic RNA from the environment, it is assumed that capsid degradation could lead to viral inactivation, probably followed by the degradation of the unprotected viral RNA. Our data suggest that the capsid of the GII.17 virus may be more stable and can therefore persist longer in water. Interestingly, Arthur et al. demonstrated that Tulane virus (a novel human NoV surrogate) also remained stable in surface water (<1 log10 reduction) after 28 days; however, the viral load in groundwater was reduced by ≥3.5 to 4 log10 by day 21. Consequently, improving surveillance and systematic monitoring of environmental water samples could provide valuable information on viral circulation and enable further assessment of the emergence of novel NoVs at an earlier stage.
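For readers less familiar with the log10 reduction scale used for the Tulane virus data, the sketch below shows the standard conversion between titers, log10 reduction and surviving fraction. The function names and any titers passed to them are hypothetical; no values from the cited study are reproduced here.

```python
import math


def log10_reduction(initial_titer, final_titer):
    """Log10 reduction between two positive titers in the same units."""
    if initial_titer <= 0 or final_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(initial_titer / final_titer)


def surviving_fraction(log_reduction):
    """Fraction of the initial titer remaining after a given reduction."""
    return 10.0 ** (-log_reduction)
```

A reduction of less than 1 log10 means that more than 10% of the initial titer remains, whereas a reduction of 3.5 log10 or more leaves less than 0.1%.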
4.4. Nomenclature of the new GII.17 variant
To coordinate the assignment of new genotypes and variants, a dual-typing system based on the complete capsid (VP1) and the partial polymerase (1300 nt) was proposed. In 2014, the RdRp genotype of the Kawasaki 323 strain emerged, but it could not be assigned to any of the known genotypes in the database. Instead, it was assigned as the GII.P17 RdRp genotype, and the NoV variants were named Hu/GII/JP/2014/GII.P17-GII.17. Meanwhile, the Hu/GII.17/Gaithersburg/2014/US strain was categorized into the same group as the GII.3 strains based on its RdRp sequence. However, based on the bootstrap values (<70%) and genetic distances (<0.143), there was insufficient confidence to classify these variants with any of the known RdRp genotypes. In addition, Fu and coauthors reported that most NoV GII.17 strains detected in the 2000s were affiliated with a GII.3-like RdRp genotype and that the new GII.17 variant might be a recombinant strain that expresses a GII.3-like RdRp gene and a GII.17 capsid gene.
Viruses that exhibit a GII.17 VP1 genotype have previously been reported to harbor a GII.P13 ORF1 genotype, although recombinants expressing an ORF1 GII.P16, GII.P3 or GII.P4 genotype have also been identified. Sequence comparison showed that the ORF1 region of the novel GII.17 viruses, which clusters between the GII.P3 and GII.P13 viruses, had never been detected before. Since this was the first orphan ORF1 sequence associated with GII.17, it was designated GII.P17 according to the criteria listed in the proposal for a unified NoV nomenclature and genotyping. The novel GII.17 virus was named Kawasaki 2014 after the first near-complete genome sequence (AB983218) was submitted to GenBank. The typing tool was updated to ensure correct classification of both ORF1 and ORF2 sequences of the new GII.P17-GII.17 viruses.
According to the phylogenetic tree of ORF1 (Figure 2B), all the recently emerged GII.17 sequences are closely related to GII.P3, with a bootstrap value of 99%. Moreover, Simplot analysis also supports that the new GII.17 sequences had undergone recombination between a GII.P3 RdRp and a GII.17 capsid gene, and the recombination breakpoint is located within the ORF1/ORF2 overlapping region. Taken together, our results led us to propose that the new GII.17 sequences should be named as a recombinant genotype, GII.P3-GII.17, or at least be classified as a new GII.17 variant.
In conclusion, the genetic diversity and global prevalence of NoV GII.17 from 1978 to 2015 were analyzed in this study. A highly diverse genome was discovered within this rare genotype of GII.17. For example, at least five major recombinant GII.17 clusters were found to have emerged following a particular time order. The circulation of GII.17 changes every 3–5 years due to replacement by a newly emerged variant, and the evolution of GII.17 is sequentially promoted by inter-genotype genetic recombination. Most of the GII.17 sequences were detected in Asian countries including China, South Korea and Japan, and multiple GII.17 variants were found in one country or region. Moreover, the recently emerged GII.P3-GII.17 variant exhibited significant sequence and structure variations and had evolved as a unique lineage. The results presented in this study contribute to the understanding of the evolution and persistence of NoV GII.17 in the human population through analysis of recombination in the viral genome. Our findings also provide important insights into the future monitoring of the global circulation of this novel GII.17 variant.
We thank the anonymous contributors for collecting and sharing the NoV sequences. This work was supported partially by the National Key Research and Development Program of China (SQ2017YFC160114), the National Natural Science Foundation of China (41376135 and 31570112), Doctoral Fund of Ministry of Education of China (20133104110006) and Innovation Program of Shanghai Municipal Education Commission (14ZZ144), China.
Conflict of interest
The authors have declared that no competing interests exist.