The Age of Enlightenment is the period in which the method of reasoning known as the Scientific Method was developed. This revolution in science began with the description of the sun, rather than the earth, as the center of our solar system. Natural phenomena previously explained by spiritualists were now described by science. Given our still evolving understanding of influenza, it is perhaps no coincidence that we describe the combined effects of the influenza virus gene segments with the word ‘constellation’, which has astrological roots describing the position of the stars. Interestingly, the name influenza also has astrological roots: it was borrowed from the Italian word influenza in the mid-17th century which, in turn, was derived from the Medieval Latin word influentia, a 14th century term that refers to the influence of the stars. The scientific rationale to describe the influence of the influenza gene constellation on virus phenotype is currently being resolved. Here we try to shine some light on the subject by providing the reader with background information, recent experimental results and a framework for questions that remain unanswered.
Influenza is a common infectious respiratory disease caused by influenza viruses. The host range of these viruses can include birds, humans and other mammals. Influenza viruses cause seasonal epidemics and are almost globally ubiquitous. They cause significant morbidity and mortality each year yet some infected persons remain asymptomatic. Influenza is typically transmitted by aerosols produced by coughing or sneezing. Although virus particles on contaminated surfaces can be easily inactivated, the virus is still able to spread easily and rapidly. Vaccination is the recommended approach to prevent disease because of the possible emergence of drug resistance.
Vaccines are produced each year to counter the currently circulating seasonal strains. The influenza vaccine seed viruses used to produce the immunogenic proteins are reassortant viruses. That is, they contain a mix of gene segments from different viruses. The genomes of the influenza A and B viruses are made up of eight negative-strand RNA segments. The haemagglutinin (HA) and neuraminidase (NA) proteins found on the surface of the virus are encoded on two different segments. Usually these two segments from a seasonal virus are combined with another six gene segments from a high yield strain to make a vaccine seed virus. HA is the major immunogenic protein recognized by the host immune system. Because of influenza’s high rate of mutation, and the capacity of the genome to tolerate many mutations, there is a need to update the influenza vaccine seed virus strains each year. The influenza virus is able to avoid control by the host immune system via two major types of genetic change. Antigenic drift is the process of gradual genetic mutation, especially in the HA gene, that results in newer viruses not being well recognized by antibodies that recognized the progenitor virus. Antigenic shift is the replacement of one or more segments from one influenza virus with those of another. This unpredictable event can lead to a change in host range, transmission or pathogenicity. Likewise, genetic reassortment, the mixing of genomic segments from different strains, can generate undesirable characteristics in the influenza vaccine seed viruses. Here we explore possible reasons for this and describe approaches that might be beneficial to the development of influenza vaccine seed viruses.
2. The segmented influenza genome
The Orthomyxoviridae family comprises negative-sense, segmented RNA viruses. Three influenza genera, influenza A, B and C viruses, belong to the family. Eight negative-sense RNA segments make up the viral genome for the A and B viruses, one more than the influenza C virus genome which has seven segments (Figure 1). The terminal ends of each gene segment are conserved and this allows control over several aspects of the influenza lifecycle. The terminal sequences are partially complementary and can form structures that serve as regulatory signals for transcription and replication (Figure 1). The structures are dynamic and it is thought that switching between the structures allows different steps of replication to occur.
Each negative-sense viral RNA is encapsidated with nucleoprotein to form a ribonucleoprotein (RNP). Attached to the 5’ end of each segment is the influenza polymerase complex. This arrangement allows a messenger RNA (mRNA) from each segment to be transcribed independently of other segments in the nucleus. Most of the segments encode a single protein but some of the RNA segments are spliced or have alternative translation mechanisms so that usually more than 10 proteins are made during an infection [3;4]. The mRNAs transcribed from all the segments are exported from the nucleus to the cytoplasm where they are translated into the viral proteins. The nucleoprotein and polymerase proteins each contain nuclear localization sequences and are imported into the nucleus where they participate in the production of new viral RNA. Some of the remaining proteins are processed in the secretory pathway and transported to the cell surface for incorporation into the virions while others remain in the cytoplasm or nucleus and modulate the host cell immune response.
The negative-sense segmented genome confers both advantages and disadvantages on the virus. Having the genome divided up into segments creates some challenges such as ensuring that one of each segment is packaged into one virion. However, it also helps alleviate a problem faced by many RNA viruses, the high error rate inherent in RNA synthesis. The error rate for the influenza viruses has been calculated to be 2.0 × 10⁻⁶ and 0.6 × 10⁻⁶ mutations per site per infectious cycle for influenza A and B respectively. Rates ranging from 3.72 × 10⁻⁴ to 6.77 × 10⁻⁴ substitutions per site per year have been calculated for the influenza C segments. Influenza viruses can exist as a quasispecies, that is, a group of diverse viruses that collectively contribute to the characteristics of the population (reviewed in ). This enables mutations to exist that, by themselves, may not increase the fitness of the virus and could even be detrimental. A combination of these mutations, that together increase the fitness of the virus, may result in a virus with some selective advantage. Such a combination could occur by gene reassortment. The separation of different mutations on different virus segments facilitates this process and allows reassortant viruses to be made if a cell is infected with two influenza viruses at the same time.
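To give a feel for what the per-site error rates quoted above imply for a whole genome, the following minimal sketch multiplies each rate by an approximate genome length. The genome lengths used here (roughly 13,600 nucleotides for influenza A and 14,600 for influenza B) are round-figure assumptions introduced for illustration and are not taken from this chapter. Treating mutations as independent events (a Poisson assumption) is likewise a simplification.

```python
import math

# Per-site rates quoted in the text (mutations per site per infectious cycle).
RATE_PER_SITE = {"A": 2.0e-6, "B": 0.6e-6}

# Approximate total genome lengths in nucleotides.
# ASSUMPTION: round figures chosen for illustration only.
GENOME_LENGTH_NT = {"A": 13_600, "B": 14_600}


def expected_mutations_per_genome(virus_type: str) -> float:
    """Expected number of mutations in one genome copy per infectious cycle."""
    return RATE_PER_SITE[virus_type] * GENOME_LENGTH_NT[virus_type]


def fraction_mutation_free(virus_type: str) -> float:
    """Poisson estimate of the fraction of genome copies with no mutation."""
    return math.exp(-expected_mutations_per_genome(virus_type))


if __name__ == "__main__":
    for virus_type in ("A", "B"):
        print(
            virus_type,
            round(expected_mutations_per_genome(virus_type), 4),
            round(fraction_mutation_free(virus_type), 4),
        )
```

Under these assumptions only a small minority of genome copies carry a new mutation in any one cycle, yet the very large number of genomes produced during an infection is still enough to sustain a diverse quasispecies.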
Reassortment occurs in all three influenza virus genera [6;8;9], but there is no evidence of reassortment between the genera. This is likely due, in part, to the level of divergence between the viruses, both in the non-coding regulatory elements and in the proteins which interact with each other. Also, there is evidence that one virus may suppress another through pathways that are not well understood. Reassortment has long been known to occur naturally in humans, swine and birds [11-13].
Often evidence of reassortment is based on incongruence in the phylogenetic trees of each of the segments. Although naturally occurring reassortants can be compared to previously sequenced strains, the actual strains that a particular segment is derived from, and the steps in reassortment, are only deduced. The different segments from the recent 2009 pandemic H1N1 virus were phylogenetically similar to human, avian, and classical, Eurasian and triple reassortant swine virus segments. Using all the available whole virus genome sequences, Bokhari et al. used a bioinformatics approach to determine which viruses were most likely homologous to the ancestors of the 2009 pandemic strain, and what reassortments needed to occur. Interestingly, they found that among 92% of the possible paths there were certain bottleneck viruses. That is, these viruses contained the mutations and segment reassortments that made them most like the next virus in the reassortment path that eventually gave rise to the 2009 pandemic strain. This suggests that there are certain sequence requirements that need to be present before reassortment occurs.
Sequence analysis indicates that a disproportionate number of the naturally occurring reassortants are the result of novel haemagglutinin and/or neuraminidase genes being introduced into a previously circulating strain. The introduction of the HA and NA segments into another strain is also the goal of influenza vaccine seed virus strain construction.
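The idea of segment-by-segment incongruence can be illustrated with a deliberately simplified sketch. Instead of building phylogenetic trees, it assigns each segment of a query virus to the reference genome with the fewest nucleotide differences and flags the virus when the segments do not all point to the same reference. This is a stand-in for illustration only: it is not the method used in the studies cited here, and the function names, lineage labels and toy sequences are hypothetical. It also assumes the sequences for each segment are already aligned to equal length.

```python
from typing import Dict

Genome = Dict[str, str]  # segment name -> aligned nucleotide sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_reference_per_segment(
    query: Genome, references: Dict[str, Genome]
) -> Dict[str, str]:
    """For each query segment, name the reference with the fewest differences."""
    assignments = {}
    for segment, sequence in query.items():
        assignments[segment] = min(
            references,
            key=lambda name: hamming(sequence, references[name][segment]),
        )
    return assignments


def looks_reassortant(assignments: Dict[str, str]) -> bool:
    """True when different segments are closest to different references."""
    return len(set(assignments.values())) > 1


if __name__ == "__main__":
    # Toy data (hypothetical): two reference lineages, two segments each.
    references = {
        "lineage_X": {"HA": "AAAAAAAA", "PB2": "CCCCCCCC"},
        "lineage_Y": {"HA": "GGGGGGGG", "PB2": "TTTTTTTT"},
    }
    query = {"HA": "AAAAAAAG", "PB2": "TTTTTTTC"}
    result = closest_reference_per_segment(query, references)
    print(result, looks_reassortant(result))
```

As the text notes, such an assignment only suggests the likely source of a segment; the actual parental strains and the order of reassortment events remain inferred.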
2.1. Making vaccine seed virus
There are two predominant ways that an influenza virus can be engineered: by simultaneous infection of a cell with two viruses, each bearing some desired trait, or by reverse genetics. Both methods are commonly used to produce influenza vaccine seed viruses. A disadvantage of co-infection is that it may result in the production of viruses with unwanted combinations of the gene segments. A disadvantage of the reverse genetics approach is that it may be difficult to generate a virus if the introduced HA and NA genes cause a detrimental gene constellation effect.
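The scale of the "unwanted combinations" problem in co-infection can be sketched by enumerating every way eight segments can be drawn from two parents. The sketch below assumes free, independent assortment of segments, which the later sections of this chapter suggest is not what happens in practice, and the parent labels and function names are hypothetical. Segment names follow the influenza A numbering used in this chapter (segments 1 to 8).

```python
from itertools import product

# Influenza A segments 1-8, named by their main gene products.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def all_constellations(parents=("seasonal", "high_yield")):
    """Every assignment of a parent to each of the eight segments."""
    return [
        dict(zip(SEGMENTS, combo))
        for combo in product(parents, repeat=len(SEGMENTS))
    ]


def is_six_two(constellation, donor="seasonal", backbone="high_yield"):
    """True for the 6:2 seed: HA and NA from the donor, the rest from the backbone."""
    return all(
        constellation[segment] == (donor if segment in ("HA", "NA") else backbone)
        for segment in SEGMENTS
    )


if __name__ == "__main__":
    constellations = all_constellations()
    parental = [c for c in constellations if len(set(c.values())) == 1]
    six_two = [c for c in constellations if is_six_two(c)]
    print(len(constellations), len(parental), len(six_two))
```

Of the 2^8 = 256 possible constellations, two are the parental genotypes and exactly one is the desired 6:2 combination, which is why selection (for example with antiserum) or reverse genetics is needed to obtain the seed virus.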
In the late 1950s and throughout the 1960s Edwin Kilbourne pioneered the development of genetic recombination with influenza. He recognised that an influenza vaccine seed virus should have certain desirable characteristics: good growth, low virulence, thermal stability and the proper antigenicity. At that time, development of high yield vaccine seed virus strains was via empirical methods such as mouse-lung passage. Kilbourne developed and promoted the use of “the deliberate mating of 2 or more viruses, each bearing a desired trait” so that “an appropriate progeny virus can be selected without the need for tedious ‘adaptation’ until appropriate mutants, if any, become manifest”. This was achieved by infecting an egg with a combination of a non-infective strain (with a high yield characteristic) and an infective influenza strain (with the desired antigenic trait) in the presence of antiserum to suppress the antigenic proteins of the non-infective strain. This allowed for the selection of virus with the same antigenicity as the infective strain and the same growth characteristics as the non-infective strain. This method for making vaccine seed virus strains has been widely used since. One drawback of this approach is that, because the antiserum only suppresses virus expressing the non-infective strain surface proteins, segments from the infective strain, in addition to the HA and NA segments, are often present in the resulting strains [18;19]. This may result in an undesirable trait being present in the vaccine seed virus strain and, because a trait carried on a whole reassorted segment cannot be lost through a single point mutation, such traits do not revert easily.
More recently, with the development of reverse genetics, it has been possible to make reassortant viruses from cloned viral segments. This allows genetically defined vaccine seed virus strains to be produced and the methodology has been employed extensively for the production of live attenuated vaccine seed virus strains. The cold adaptation and attenuation mutations are spread out on multiple segments. A large number of human influenza vaccine seed virus strains have been made with the cold adapted strains A/Ann Arbor/6/60-H2N2 and B/Ann Arbor/1/66. High yield attenuated backbone strains for vaccination of livestock such as birds or pigs have also been developed and are typically made using reverse genetics [21-23]. In addition, viruses expressing the haemagglutinin from highly pathogenic avian strains (H5N1) need to have the polybasic cleavage site removed by mutagenesis, and reverse genetic viruses made, to reduce pathogenicity. Viruses made by reverse genetics use the same plasmid derived internal gene segments. In contrast, vaccine seed strains made using in ovo reassortment are sometimes made with a recent high growth reassortant as the non-infective strain. This could result in the carrying forth of internal segments that are not from the high yield strain Puerto Rico/8/1934 (PR/8) or mutations that have appeared during the passage of the earlier reassortant strain. Although the reverse genetically engineered viruses are genetically defined, there is no avenue for reassortment to occur if there is some incompatibility between the glycoproteins from the seasonal strain and the remaining proteins from the backbone virus.
Influenza viruses are frequently isolated and propagated in tissue culture. Madin-Darby canine kidney (MDCK) cells are widely used because they are quite susceptible to influenza virus infection. This is because the antiviral activity of MDCK cells is lacking due to inadequate interferon-induced myxovirus resistance protein 1 (Mx1) activity. As noted above, the introduction of HA or NA segments into circulating strains is dominant among naturally occurring reassortants. In contrast, recombinants generated in MDCK cells with no selection show a positive correlation between other segments: most of the segment pairs that segregate with each other in MDCK cells were polymerase combinations [25-27]. This suggests that for naturally occurring reassortants there is some selective pressure that is not present in laboratory-based experiments. One explanation for this bias may be the limited MDCK cell antiviral response; without this response the replication efficiency of the virus might be the main limiting factor. Thus, viruses with the most efficient combinations of polymerase factors will become dominant. One report of recombinants generated in eggs without selection described a preference for cosegregation of the HA and M segments. Although the number of reports regarding reassortment without selection is limited, the current data suggest that egg-based experiments may more closely reflect naturally occurring events.
The replicase proteins may play a role in reassortment via their independent interaction with each RNA segment. Each genomic segment has its own replicase proteins associated with it when it enters the nucleus. A doubly infected cell is capable of producing each of the segments from both viruses independently of each other. The timing of the overall replication will depend on many factors such as which segments are imported into the nucleus first, translation promoter sequences and replication signals on each segment, and the induced host response. If a cell is simultaneously infected with two viruses, it is possible that early in the infection the polymerase that transcribes RNA more efficiently will control the dynamics of the infection. The resulting dynamics between the host cell and the viruses may favor the production of either virus or a reassortant. It is unknown if the dynamics of transcription and replication play a role in reassortment but, given that the transcription and replication signals on each segment can differ, it is easy to imagine the dynamics of an infection being altered when a cell is infected with two viruses. One could also imagine polymerase proteins from one strain transcribing or replicating another strain’s RNA, or a polymerase protein from one virus interacting with, and altering the activity of, a polymerase protein from the other virus.
There are two major surface glycoproteins for the influenza A and B viruses. The haemagglutinin (HA) protein is a sugar-binding protein that facilitates virus entry into epithelial cells that have sialic acid sugars on the cell surface. After the HA is cleaved by a protease, the virion is imported into the cell by endocytosis. Virus replication culminates with the accumulation of new virions at the cell surface. The neuraminidase (NA) cleaves the glycosidic linkages of the sialic acids to mediate virion release. The HA protein is encoded by segment 4 and the NA protein by segment 6 in influenza A and B viruses. Influenza C has only seven segments. The haemagglutinin-esterase-fusion (HEF) glycoprotein is encoded on segment 4 and this protein performs the functions analogous to HA and NA of influenza A and B viruses.
In contrast to the influenza B and C viruses, which are only sorted by type and strain, the influenza A viruses are sorted into subtypes. The HA and NA proteins are used for virus classification. There are at least 16 different HA subtypes and 9 different NA subtypes among the influenza A viruses. The HA subtypes are divided into two groups. Certain subtypes from both groups are able to infect and transmit among humans. Most human influenza A infections are from H1N1, H2N2, and H3N2 subtypes. Occasionally a strain will jump the species barrier. A limited number of avian subtypes (H5, H7 and H9) have infected humans. Sometimes the disease is much more severe than that from a human influenza strain but these strains seem to lack the ability to transmit from human to human efficiently. There is a fear that these highly virulent viruses may reassort with human viruses creating virulent viruses that spread easily among humans. The emergence of new pandemic strains by reassortment is clearly a concern. An additional subtype was recently identified in bats. The HA from the bat virus is more similar to the Group 1 HAs (subtypes H1, 2, 5, 6, 8, 9, 11, 12, 13 and 16) than the Group 2 subtypes but the NA shows no similarity to any previously identified NA subtypes. It remains to be seen if these viruses can reassort and cause disease in humans.
Infection of humans with viruses containing swine origin HA and NA is known to occur. Like other zoonoses, most of these swine viruses do not spread efficiently in humans. However, the swine origin 2009 pandemic H1N1 virus spread around the world, supplanting the prior seasonal human H1N1 strains. This virus was a reassortant derived from a North American triple reassortant H1N2 swine strain and a Eurasian H1N1 swine strain.
2.3. Replicase proteins
Four proteins are required for influenza virus replication: the nucleocapsid protein (NP) and the three polymerase proteins PB1, PB2 and PA. The polymerase proteins are the larger influenza proteins and are encoded in the largest segments 1-3. NP is encoded in segment 5. The RNA from each viral segment forms a ribbon-like closed superhelical structure (reviewed in ). The 5’ and 3’ ends of the RNA are in close proximity to one another and are associated with the three polymerase proteins. Nucleoprotein is associated with the remaining genomic RNA and there is one NP monomer present for each 24 nucleotides. Nuclear localization sequences on NP facilitate import of each ribonucleoprotein complex (RNP) into the nucleus (reviewed in ). Inside the nucleus mRNA transcription and viral replication take place.
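The stoichiometry of one NP monomer per 24 nucleotides gives a quick estimate of how many NP monomers coat each segment. The sketch below applies that ratio to approximate influenza A segment lengths; the lengths are assumed typical values (close to those usually given for a PR/8-like genome) and are not taken from this chapter. The estimate also ignores the terminal nucleotides that are bound by the polymerase rather than by NP.

```python
# Approximate influenza A segment lengths in nucleotides, keyed by segment number.
# ASSUMPTION: typical values used for illustration only.
SEGMENT_LENGTH_NT = {
    1: 2341,  # PB2
    2: 2341,  # PB1
    3: 2233,  # PA
    4: 1778,  # HA
    5: 1565,  # NP
    6: 1413,  # NA
    7: 1027,  # M
    8: 890,   # NS
}

NT_PER_NP_MONOMER = 24  # ratio stated in the text


def np_monomers(length_nt: int) -> int:
    """Estimated NP monomers on one segment, rounded to the nearest whole monomer."""
    return round(length_nt / NT_PER_NP_MONOMER)


def np_monomers_per_genome() -> int:
    """Estimated NP monomers needed to coat one copy of all eight segments."""
    return sum(np_monomers(length) for length in SEGMENT_LENGTH_NT.values())


if __name__ == "__main__":
    for number, length in SEGMENT_LENGTH_NT.items():
        print(number, length, np_monomers(length))
    print("total", np_monomers_per_genome())
```

With these assumed lengths the estimate ranges from under 40 monomers on the smallest segment to almost 100 on each of the largest, so NP is required in far greater amounts than the polymerase proteins, which are present once per RNP.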
Two of the polymerase proteins, PB1 and PB2, have biochemical interactions with NP proteins. In addition, the three polymerase proteins interact with each other. The carboxyl-terminus of PA interacts with the amino-terminal end of PB1 and the carboxyl-terminal end of PB1 interacts with the amino-terminal end of PB2. The same arrangement is described for both the negative strand viral RNA (vRNP) and the positive strand copy RNA (cRNP) with the polymerase being associated with the 5’ end of the RNAs. New negative-strand viral genomes are derived from the cRNP and also have a newly synthesized polymerase complex associated with the 5’ end. The NP and three polymerase proteins all have nuclear localization signals which enable them to be imported into the nucleus after they are synthesized in the cytoplasm. There is evidence that PA and PB1 associate with each other before localizing to the nucleus (reviewed in ).
Both types of positive-strand RNA, cRNP and mRNA, are generated from the vRNP. In contrast to the vRNP and cRNP, the mRNA is not associated with NP. The viral mRNAs also have a capped 5’ leader sequence snatched from a cellular mRNA and a polyadenylated 3’ end. It is not yet known exactly what regulates the polymerase complex so that it makes two distinct products from one template. The cap-binding domain is in PB2 and the endonuclease domain is in PA. Together these parts of the polymerase complex capture and remove the 5’ capped region of cellular mRNA and this is used as the priming sequence for viral mRNA production. The RNA-dependent RNA polymerase domain required for all RNA production is in the PB1 protein.
Combining the polymerase proteins from different strains to produce chimeric polymerase complexes has been studied with regard to polymerase activity and pathogenicity. It is sometimes found, but not always, that increased polymerase activity leads to more virus and increased pathogenicity. Most recent studies have focused on the replicase genes from the 2009 pandemic H1N1 strain and prior seasonal strains. It was found that the pandemic PB2 gene combined with seasonal PB1, PA and NP genes resulted in significantly less polymerase activity [33;34]. Conversely, inclusion of a seasonal PB2 gene in a pandemic background significantly increased polymerase activity. When the corresponding reassortant viruses, with a PR/8 backbone, were generated the growth kinetics for both types were reduced. This suggests that the level of polymerase activity needs to be optimized for the best virus production in vitro. In addition, these viruses had higher mouse LD50 values, suggesting polymerase activity and replication are important for virulence. Interestingly, introduction of the pandemic NP gene into a seasonal virus also dramatically reduced the virus replication and pathogenicity, demonstrating that both altered polymerase and RNP could give rise to detrimental gene constellation effects.
In an analysis of reassortant viruses with a 2009 pandemic strain background it was found that introduction of a PA, PB1 or PB2 segment from another virus typically reduced the virus titer [27;35]. This included instances when all three segments from another virus increased polymerase activity (A/swine/Korea/JNS06/04 or A/mallard/Korea/6L/07) or reduced polymerase activity (A/duck/Korea/LPM91/06 or A/aquatic bird/Korea/ma81/07). Again, this suggests that the level of polymerase activity needs to be optimized for efficient virus production in vitro. Each of these viruses was less pathogenic in mice but several other viruses were generated that were more pathogenic in mice. One, containing just the PA segment from A/aquatic bird/Korea/ma81/07 in the 2009 pandemic backbone, had a similar level of polymerase activity to the reassortant virus containing all three A/aquatic bird/Korea/ma81/07 polymerase genes, indicating that polymerase activity per se is not the cause of pathogenicity. It is possible that specific virulence determinants are associated with the PA segment but, in the absence of a gene constellation effect, this would not account for the lower pathogenicity of the virus containing all three polymerase segments.
One PB2 virulence marker, the amino acid at position 627, is a determinant of host range and contributes to pathogenicity in mice. It has been shown that introduction of a PB2 gene from a low pathogenic H1N1 virus into the highly pathogenic 1918 strain attenuated the virus in mice but pathogenicity was restored with an E627K mutation. In contrast, studies of swine influenza in pigs have shown that there is no correlation between pathogenicity and viruses with either a swine- or avian-origin PB2 gene containing the 627K or 627E mutation. While it has been suggested that the 627 residue mediates an interaction with NP, and the strength of this interaction correlates with polymerase activity, recent evidence suggests that restricted activity is due to a lack of compatibility with a host cell factor. Thus, although amino acid signatures of virulence may be important in the context of genetic drift, these results demonstrate that gene constellation effects can attenuate virulence in some hosts.
Clearly, although specific functions of the replicase complex reside in each protein, the interaction of the replicase proteins plays a large role in several virus attributes including replication and virulence. At present a large amount of genetic information is available for the replicase genes. Unfortunately, our current understanding of influenza replication does not enable the prediction of replication efficiency or associated pathogenicity based on replicase gene sequence alone. While new functional information is being generated regularly, a more complete understanding of influenza replication and its contribution toward pathogenicity will require more comprehensive structure-function information.
2.4. Other proteins
There are at least four additional proteins produced during influenza infection of a cell. The two segments not mentioned so far, segments 7 and 8, are the smallest genome segments. In both influenza A and B viruses each of these segments encodes at least two proteins. The analogous segments in influenza C viruses are segments 6 and 7 (Figure 1).
Influenza A segment 7 encodes two matrix proteins: M1 and M2. The M2 proteins from influenza B and C viruses are called BM2 and CM2 respectively. M1 binding to RNPs in the nucleus inhibits viral transcription [42;43]. M1 proteins form a continuous shell on the inner side of the lipid bilayer. M2, and the analogous BM2 and CM2 proteins, are ion-channel proteins that form homotetramers in the virus envelope. These small hydrophobic integral membrane proteins allow hydrogen ions to enter the viral particle from the endosome. The lower pH causes M1 to dissociate from the RNPs leading to the uncoating of the virus. Different coding strategies are used by the different influenza species (Figure 2 and ). M2 protein is translated from a spliced transcript while BM2 protein is translated by a coupled termination/reinitiation event [45;46]. In contrast, CM1 is translated from a spliced transcript and CM2 is produced by peptidase cleavage of a precursor protein [47;48]. Influenza B viruses encode an additional small hydrophobic integral membrane protein, NB, on segment 6. The open reading frame starts 4 nucleotides upstream from the NA ORF (Figure 2). Although NB is conserved in influenza B genomes it is apparently not essential and its function remains unknown at present. Interestingly, influenza A viruses also encode an alternative M2 protein on a splicing variant. The same conserved redundancy in two different influenza types highlights the importance of the ion channels for the viruses. This diversity in coding and expression of similar functions may also be a reason why reassortant viruses containing segments from different influenza types are not readily obtained.
The smallest influenza segment encodes the non-structural protein NS1 and the nuclear export protein (NEP). NEP is translated from a spliced transcript and is incorporated into the virions in small numbers (Figure 2). The major role of NS1 is to modulate the host immune response. It is a multifunctional protein that interacts with several host proteins and has an RNA binding domain (reviewed in ). Protein sequence features from influenza NS1 proteins indicate that there are variant types that seem to correlate with certain host species. The exact nature of this relationship has not been teased out yet. The NS1 protein from the 2009 pandemic strain was less effective at blocking the innate immune response in cultured cells than those of seasonal strains, but attempts to make the 2009 NS1 protein more like that of the seasonal strain did not result in the same effect; rather, the virus had reduced virulence and was more easily cleared. A better understanding of the relationship between NS1 from specific virus strains and the host cell type could lead to the development of vaccine seed virus backbone strains that are more suitable for vaccine production.
The presence of segment 7 or 8 from differing viruses can alter the phenotype of another virus. For example, addition of different NS segments from an H3N2 virus or different H5N1 viruses into a PR/8 backbone could result in no attenuation or complete attenuation [53;54]. Similarly, the same gene can have different effects on different viruses. For example, replacement of the A/Korea/82 (H3N2) M segment with the A/Ann Arbor/6/60 M segment attenuated the virus. However, introducing the same A/Ann Arbor/6/60 M segment into A/Udorn/72 (H3N2) did not attenuate the virus. This clearly demonstrates that, in this instance, the gene constellation has a greater impact on the virus phenotype than an individual segment.
The characterization of the laboratory generated reassortants provides useful hypothesis-driven information. Natural reassortment of the smallest influenza segments may give additional information about the role these segments play in the virus lifecycle. Some viruses isolated recently from North American pigs contain the 2009 H1N1 M segment in the context of a previously endemic H1N2 strain. This suggests that this particular gene constellation may increase viral fitness. Supporting this, Chou et al. were able to show that inclusion of the M segment in a PR/8 backbone was essential for transmission in guinea pigs. Another group reported that the neuraminidase segment from the 2009 H1N1 strain, in addition to the M segment, was required for efficient replication and transmission in pigs. Finally, Hause et al. reported transmission but lower viral titers for the reassortant viruses containing the M segment from the 2009 pandemic strain in pig lung homogenate when compared to infection with the parental strains. The major difference between the viruses analyzed by these groups is that the backbone strains differed; one group used contemporary H1N2 swine viruses to generate reassortants while the other groups used laboratory adapted strains. The different outcomes observed are based on gene constellation effects.
3. The gene constellation effect
It stands to reason that if a certain gene constellation confers some desirable attribute then viruses containing those segments should occur more often in the population over time than other reassortments. That is, the combination of gene segments should occur independently many times in a large enough population if the same parental viruses continue to co-circulate in the population. However, if the reassortant has a relative fitness that is much greater than other circulating strains then the virus may quickly replace other virus strains in the population. An example of this is the emergence and spread of the 2009 pandemic H1N1 strain. Very quickly, and during a season not typically associated with high influenza rates, the 2009 pandemic strain became the prevalent H1N1 strain and prior seasonal H1N1 strains become less common .
Possibly more common, but less well documented, are reassortants that have a small increase in fitness compared to the parental strains. A recent analysis of reassortant H3N2 viruses in swine demonstrated that multiple reassortants generating the same gene constellation occurred . Several H3N2 swine lineage viruses were isolated from humans in 2011 and found to contain the segment 7 from the 2009 pandemic H1N1 virus. This prompted Nelson and colleagues to analyze the reassortants present in swine populations. What they discovered was that, in addition to the reassortants that transmitted to humans, reassortants in swine with a range of genetic backbones contained the 2009 pandemic segment 7 . It is not known if the presence of the 2009 segment 7 in swine viruses plays a role in viral fitness in swine or if it has a role in zoonotic infection of humans, but clearly the presence of this segment gives rise to viruses from different reassortment events that are stably represented in the population.
In addition to the appearance of this particular gene constellation in North American pigs, gene constellations involving all segments from the 2009 pandemic strain are becoming dominant in other parts of the world. It has been reported that the 2009 pandemic strain is present in pigs and reassorting with H1N2 and H3N2 strains [62-65]. It is not clear how the combination of gene segments present in strains like the 2009 pandemic strain results in greater viral fitness, but the study of different viral characteristics has given us some insight. Here we highlight research describing the effect different gene constellations have on viral fitness.
3.1. Altered pathogenicity
There have been many efforts to understand which gene segments contribute to pathogenicity. If a particular segment were known to contribute to pathogenicity in a vaccine strain then safeguards could be put in place to prevent the generation of the gene constellations containing the offending segment(s). The current recommendation for Institutional Biosafety Committees in the United States is that gene constellation be included in the evaluation for determining the biocontainment level for influenza work . One difficulty in applying this recommendation is that the pathogenicity of a virus in one host species may differ greatly from the pathogenicity in another host species. Another difficulty is that new strains emerge and evolve faster than the pathogenicity of the gene combinations can be assessed. Here we reflect on what is known about gene constellation and pathogenicity.
In the 1970s it became clear that both the glycoproteins and the internal proteins play a role in pathogenicity. In many experiments segments from a human or animal origin virus were introduced into a pathogenic avian influenza virus and pathogenicity tested in chickens. Often the pathogenicity was reduced [67;68]. However, increased monitoring of avian influenza viruses in Hong Kong indicated that most naturally occurring reassortant H5N1 viruses were lethal to birds . Serial passage of pathogenic avian influenza in MDCK cells and selection of large plaques resulted in reduced pathogenicity in mice suggesting that differences in the host cell type that the virus is propagated in play a role in pathogenicity . Interestingly, the attenuated variants all had common mutations in the polymerase genes and grew to higher titers on MDCK cells than virus purified from small plaques. This suggests that an equilibrium between replication efficiency and pathogenicity is being altered when a virus is adapting to a new host. This is in contrast to the increased pathogenicity seen when faster growing viruses are compared to slower growing viruses in the same host; for example, avian viruses grown in eggs [71;72].
There are other examples of changes in the replication machinery of the influenza virus affecting pathogenicity. In one study it was found that reassortant avian H5N1 viruses containing the PB2 gene from a human H3N2 virus had increased pathogenicity in mice . It was also shown that the introduction of a human PB1 segment alone did not enhance pathogenicity but had a cooperative effect when the human PB2 was present. As noted above, the segments containing the polymerase often cosegregate, which could be troubling if this enhances pathogenicity.
Pathogenicity is not only dependent on the virus, it also depends on the host species and even where in the host the virus replicates . The pathogenicity caused by a virus is often due to the host response to the virus and is perhaps best exemplified by the cytokine storm. In such instances an excessive amount of proinflammatory cytokines are released and inflammation spreads from the site of infection. Acute lung injury, or the more severe acute respiratory distress syndrome, is associated with influenza infections (reviewed in ). As the NS-1 protein encoded on segment 8 has a role in modulating the host immune response, one would expect that segregation of segment 8 during reassortment might alter virus pathogenicity. However, while addition of an H3N2 NS segment to a 2009 H1N1 virus increased replication efficiency, the virus was not as pathogenic in mice as the parental H1N1 strain . Also, the addition of an H5N1-derived NS segment to a PR/8 backbone attenuated the virus when tested in mice . Because PR/8 is a high growth strain these results may not be truly representative of the effect the NS gene can have. When the NS segment from a highly pathogenic H5N1 strain was added to a highly pathogenic H7N1 strain the virus was more pathogenic in mice . Here the observed increase in virulence was also associated with enhanced cell tropism. This demonstrates the potential for reassortants created in a host specific manner to gain the ability to jump the species barrier.
The 2009 H1N1 pandemic virus arose by reassortment of swine influenza viruses of different lineages. The NA and M gene segments were from the Eurasian avian-like swine virus lineage and the remaining segments were from the North American triple reassortant lineage. The triple reassortant lineage emerged in the 1990s with the PB2 and PA segments derived from an avian virus, the PB1 from a human virus and the remaining segments from the classical swine lineage. With all three swine lineages circulating, and reassorting, there is a concerted effort to characterize current and possible reassortants for their potential to infect humans [62;65]. The ferret is a widely used model for assessing pathogenicity and transmission of viruses. Unlike mice, but like humans, ferrets infected with a seasonal influenza strain present with an increase in temperature, nasal secretions, sneezing and sometimes with a cough, making them suitable for study of human viruses. Triple reassortant swine viruses isolated before 2009 did not produce clinical signs of respiratory symptoms like sneezing and nasal secretions . The pathogenicity of these viruses was similar to the 2009 pandemic virus with more lung pathology than seasonal viruses . All the triple reassortant viruses transmitted between ferrets by direct contact but only those with human-like HA and NA were transmitted efficiently by respiratory droplets [65;78]. Addition of a seasonal H3N2 NA to a 2009 pandemic virus gave rise to more severe pulmonary lesions in ferrets demonstrating the importance of gene constellation . Further, it has been shown that the tropism of such a virus is linked to the balance between HA and the NA even when the replication competence is lower . For influenza vaccine seed viruses it has been hypothesized that a lower NA content in the virion can increase HA content .
3.2. Altered growth rates
A high growth phenotype in a virus seed strain is beneficial for vaccine production. There are many proteins that might have an effect on the virus growth rate. The envelope proteins determine the infectivity through effects on attachment, entry and budding. The replicase proteins affect the speed of transcription and replication. And finally, NS1 protein modulates the host response. Thus, differing combinations can result in changes in growth rates. Many vaccine seed strains have been made using PR/8 as the backbone, that is, replacement of the HA and NA segments whilst retaining the six remaining PR/8 segments. Usually the introduced HA and NA segments come from the same virus so one would assume that the encoded proteins are compatible with each other. Thus, any reduction in growth rate is due to the change in interactions between the HA and NA combination and other proteins. Increased growth rates are often achieved by passaging the virus but the enhanced growth sometimes results in antigenic changes to HA rather than adaptation of the other proteins.
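The 6:2 arrangement described above can be written down as a simple mapping from segment to parent strain. The following is a minimal sketch for illustration only; the function names and the strain label "seasonal" are hypothetical and not taken from any cited study. It also counts how many segment combinations two parental viruses with eight segments can in principle give rise to.

```python
from itertools import product

# The eight influenza A genome segments, in order of segment number (1 to 8).
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def constellation(backbone, donor, donor_segments):
    """Return a dict mapping each segment to the parent strain it comes from."""
    return {seg: (donor if seg in donor_segments else backbone) for seg in SEGMENTS}


def all_constellations(backbone, donor):
    """Yield every possible combination of segments from two parental strains."""
    for choice in product([backbone, donor], repeat=len(SEGMENTS)):
        yield dict(zip(SEGMENTS, choice))


# Classical 6:2 seed virus: HA and NA from the seasonal strain, the rest from PR/8.
six_two = constellation("PR/8", "seasonal", {"HA", "NA"})

# Two parents and eight segments give 2**8 combinations, including both parents.
total = sum(1 for _ in all_constellations("PR/8", "seasonal"))

print(six_two)
print(total)
```

Passing `{"HA", "NA", "PB1"}` as `donor_segments` gives a 5:3 constellation instead. The enumeration only lists combinations; it says nothing about which of them are viable or high yielding, which is the experimental question discussed in this section.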
Many insights about virus growth have come from analysis of changes on one segment. It has been observed that culture of many influenza viruses in eggs results in amino acid changes in HA as the virus adapts to the new host. Similar observations have been made when viruses are grown in different cell types. For example, most viruses have an asparagine residue in the haemagglutinin at position 117 (H1 subtype) or 116 (H3 subtype). Substitution of this residue with aspartic acid does not alter the growth in MDCK cells but enhances growth in Vero cells . This mutation was shown to alter the pH range for virus membrane fusion indicating that this is an important factor for optimal growth . However, HA does not act in isolation and the best growth of vaccine seed strains will depend on how differences in HA affect interactions with other proteins, in particular, the other major envelope proteins NA and M.
The introduction of HA from a seasonal H1N1 virus into the 2009 pandemic strain backbone resulted in larger plaque size and higher viral titers in cell culture. It was further shown that, in contrast to the predominantly filamentous parental strains, the reassortant virus was predominantly spherical, and that enhanced yields could be obtained by introducing the same seasonal HA into other swine-origin backbones. Using the opposite approach, introducing swine-origin segments into the seasonal virus backbone, Octaviani et al. were able to show that the high yield was primarily due to the presence of swine-origin HA and M segments.
The HA and NA segments from A/Vietnam/1194/2004 and the PR/8 backbone were combined to make a reassortant prepandemic vaccine seed virus but it gave low antigen yield and did not grow well. Incorporation of the M gene segment from the A/Panama/2007/1999 H3N2 strain or from the A/Vietnam/1203/2004 H5N1 strain enhanced growth . The M1 proteins differed at positions T167A, R174K, I219V, A227T and A239K, while the M2 proteins differed at positions N31S, R54L, Y57H, S82N and G89S from the PR/8 segment 7 proteins. It is unknown how the synergy between these segments works but, with structures available for the major envelope proteins and advances in electron tomography, these interactions may be revealed in the near future .
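Residue differences such as those listed above are written in a compact residue-position-residue notation. As a small illustrative sketch (the helper name `parse_substitution` is hypothetical), the codes can be split into their parts so that the differing positions in M1 and M2 can be tabulated or compared across strains. The parser does not assume which of the two residues belongs to PR/8, since the text does not state the direction.

```python
import re

# One-letter residue, numeric position, one-letter residue (for example "T167A").
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'T167A' into (first residue, position, second residue)."""
    match = SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    first, position, second = match.groups()
    return first, int(position), second


# Differences listed in the text for the two proteins encoded on segment 7.
M1_DIFFERENCES = ["T167A", "R174K", "I219V", "A227T", "A239K"]
M2_DIFFERENCES = ["N31S", "R54L", "Y57H", "S82N", "G89S"]

m1_positions = [parse_substitution(code)[1] for code in M1_DIFFERENCES]
m2_positions = [parse_substitution(code)[1] for code in M2_DIFFERENCES]

print(m1_positions)
print(m2_positions)
```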
Several vaccine seed strains have included the seasonal PB1 gene [18;19]. Growth rates were also shown to improve when the indigenous PB1 was included in the 2009 H1N1 reassortant and in an H5 reassortant. However, this result did not extend to a different H5 reassortant. We postulated that certain residues within the PB1 protein might be important for growth and yield. Rather than target those amino acids known to be involved in enzyme activity or in protein interactions with the other polymerase proteins, we made changes based on sequence similarity between diverse PB1 proteins that reassorted in ovo during the production of influenza vaccine seed viruses. Using this impartial approach, we made changes to the PR/8 PB1 gene that, when combined with the HA and NA from the low yield H3N2 Wyoming/03 strain, resulted in faster growth in both egg and cell culture.
3.3. Altered protein production
The 5’ untranslated regions of the influenza A genomic segments contain signals that stimulate translation. These signals regulate the amount of protein produced from each segment and differ between segments. The non-coding regions also differ in length among the different segments and virus types [88-90]. The sequence motifs AGGGU and GGUAGAUA, which are recognized respectively by the host protein G-rich RNA sequence binding factor 1 (GRSF-1) and the viral NS1 protein, may also be present in the non-coding regions (reviewed in ). Both of these proteins have been shown to stimulate translation [91;92]. In addition, there are changes in translation that are most likely due to changes in the RNA structure. Single nucleotide changes in the 5’ and 3’ non-coding regions of PA were shown to have no effect on translation individually, but together these changes almost completely abolished protein expression. It is not known if the changes altered mRNA production or affected translation itself, but one would expect that the resultant loss of viral protein would affect virus replication and possibly virulence. Indeed, the mutations that resulted in low protein production were based on the sequence of a low pathogenic avian influenza virus.
Very similar viruses may produce quite different amounts of viral protein and several groups have tried to find the underlying reasons. It is common for several reassortants to be made using different seasonal isolates in an effort to create a suitable high yield vaccine seed virus. As noted previously, the PB1 segment from the seasonal strain was found in many high yield reassortants made for vaccine manufacture. Analysis of protein production from reassortants with or without the seasonal PB1 segment and the 2009 H1N1 HA and NA has been performed. One group found that the presence of the PB1 from the 2009 strain increased protein yield while another found that it decreased protein yield [85;93]. Both groups used PR/8 as the high yield donor strain but they each used different 2009 pandemic isolates. This highlights one difficulty in predicting what gene constellations may be beneficial for protein production: minor strain variations may have major translational effects. Comparison of both the PR/8 strains and both 2009 pandemic strains used in these studies may provide interesting information.
3.4. Incomplete genomes and RNA structure
The in vitro production of reassortant viruses using the classical reassortant method involves high multiplicities of infection to increase the chance of both viruses infecting the same cell. Early studies demonstrated that passage of influenza at high multiplicities of infection could result in the production of non-infectious particles [94;95]. These particles are now known as defective interfering, or DI, particles. Subsequently it was discovered that the polymerase genes frequently contained deletions [96;97]. Thus, although eight genomic segments were packaged, the viruses could not make proteins essential for replication. In addition, non-infectious particles lacking the glycoproteins have also been described. This explains the loss of infectivity even though many particles are detected by haemagglutination. Kaverin et al. were able to show that one fraction of a DI population could complement another fraction of the DI population. More recently, Odagiri and Tashiro were able to show that non-coding sequences were responsible for the preferential packaging of DI RNA rather than the full length RNA segment. It is not clear when or how the portions of the RNA are deleted but it is likely that the structure and the sequence of the RNA play a significant role. Using a bioinformatics approach, Priore et al. analyzed the extensive base pairing that exists throughout the genomic segments of avian, swine and human influenza A viruses. The results indicated that there were significant differences between species in the PB2, NP, M and NS-containing segments. These differences were only on the positive strand, which could indicate a role in either protein production or negative strand synthesis. Given that these segments do not reassort as frequently as the other polymerase segments or glycoprotein segments, it would seem that genome-wide RNA structural organization does not contribute to reassortment. However, it has been postulated that global organizational RNA structure could be a mechanism by which the virus adapts to the host environment, leaving open the possibility that the RNA structure of a particular segment affects the chances of it being involved in reassortment.
DI particles represent an evolutionary dead end with regard to a natural infection. However, particles that lack a complete genome could be either detrimental or beneficial in vaccine production. Particles that are defective in the polymerase will alter growth characteristics and would not function in a live attenuated vaccine. In contrast, particles with incomplete genomes represent an abundance of antigen with no pathogenicity, which could be viewed as desirable in an inactivated vaccine.
Understanding the gene constellation effect in influenza is important, especially for vaccine production. The mixing and matching of influenza genomic segments in nature and in the laboratory gives rise to new viruses with phenotypes that differ from the ancestral viruses. In nature, this may be a more pathogenic virus or one that has an expanded host range. In the laboratory, attenuated viruses with good growth characteristics and high protein yield are desirable for study and vaccine production. A greater understanding of what contributes to the gene constellation effect may enable researchers to produce influenza vaccine seed viruses that facilitate production with reduced risk of infection.
As we have described in this chapter, current research has provided some insight into the genomic features that contribute to the gene constellation effect, but more work needs to be done. Some segments, such as those encoding the glycoproteins and the polymerase proteins, appear to be more frequently involved in reassortments. The reassortment of the polymerase proteins is more common in laboratory manipulation whereas the reassortment of glycoproteins is more common in nature. The beneficial effects of certain protein-protein interactions may be the underlying impetus behind some of these reassortments. For example, certain combinations of HA, NA and M can lead to changes in transmission and growth. Likewise, certain combinations of PB1, PB2 and PA can affect polymerase activity and growth. Also, the two smallest segments have effects on cell tropism and viral fitness. The amount of polymerase activity is not directly associated with virus titer, suggesting that other factors affecting replication must be balanced with replication efficiency. Changes in the polymerase segments can also affect pathogenicity, especially when the virus is adapting to a new host cell. Changes in the glycoproteins have also been shown to affect pathogenicity. While much work has focused on either the glycoproteins or the replicase proteins independently, some of the work described here demonstrates that these two groups of proteins have an effect on each other. The interaction between these two groups of proteins at a functional level needs to be elucidated.
While several groups have analyzed the genomes of reassortant viruses there is still a great need for better understanding what features contribute to genomic reassortment. With more whole virus genome sequences available for analysis there is a better chance that the features important for reassortment can be determined. Retrospective analysis of reassortant viruses can illuminate which genomic features are compatible. In vitro construction of reassortant viruses can highlight which segments, or parts of segments, are not compatible. After a reassortment event is detected there needs to be more analyses of the mutations that occurred in each segment as they may have facilitated the reassortment event. Mutations necessary for reassortment would occur prior to reassortment and perhaps be present in bottleneck viruses. Mutations that occur with passage are those that increase the fitness of the reassortant. Description of both types of mutations would enhance our understanding of the network of interactions between viral proteins. In addition to the changes in coding sequences, analysis of the untranslated regions of the genomes is also important. There is no available information about the compatibility of segments with the replication and translation machinery, or how this contributes to the gene constellation.
Finally, an understanding of the gene constellation effect will allow for the selection of better reassortant viruses for vaccine production. Currently both the in ovo and reverse genetics methods use an empirical approach to obtain the best viruses that express the desired HA and NA proteins. Knowing how the different segments contribute to the network of interactions that result in high yield will enable researchers to produce strains that will provide the best backbone for an influenza vaccine seed virus. The optimal backbones may be universal, or differ for the different virus subtypes, or differ according to the host species that the virus providing the HA and NA infects. But without optimal virus backbones, the production of high yield reassortant influenza vaccine seed viruses will remain inefficient.