Nudivirus Genomics and Phylogeny

Viruses are small infectious agents that can replicate only inside the living cells of susceptible organisms. Understanding the molecular events underlying the infectious process has been of central interest for improving strategies aimed at combating viral diseases of medical, veterinary and agricultural importance. Some viruses cause dreadful diseases, while others are also of interest as tools for gene transduction and expression and in non-polluting insect pest management strategies. The contributions in this book provide the reader with a perspective on the wide spectrum of virus-host systems. They are organized in sections based on the major topics covered: viral genome organization, regulation of replication and gene expression, genome diversity and evolution, and virus-host interactions, including clinically relevant features. The chapters also cover a wide range of technical approaches, including high-throughput methods to assess genome variation or stability. This book should appeal to all those interested in fundamental and applied aspects of virology.


Introduction
The nudiviruses (NVs) are a diverse group of arthropod-specific large DNA viruses. They form rod-shaped, enveloped virions and replicate in the nucleus of infected cells. Nudivirus genomes are covalently closed circular molecules of double-stranded DNA. Some nudiviruses have been used as potential biocontrol agents for the management of economically important arthropod pests (Burand 1998, Huger 1966). A variety of non-occluded, rod-shaped dsDNA viruses replicating in the host nucleus have been observed in various host species belonging to Lepidoptera, Trichoptera, Diptera, Siphonaptera, Hymenoptera, Neuroptera, Coleoptera, Homoptera, Thysanura, Orthoptera, Acarina, Araneina, and Crustacea. They had been considered as "non-occluded baculoviruses" (Huger and Krieg 1991) or, more recently, as nudiviruses (Burand 1998). Most of these viruses were identified solely on the basis of morphological features and very limited biological data. Accordingly, it remains unclear whether they form a monophyletic or a polyphyletic lineage, and whether they are genetically related to each other, to the well-investigated baculoviruses, or to other large dsDNA viruses.
Thus far, only a few nudiviruses have been studied in detail. The Oryctes rhinoceros nudivirus (OrNV), formerly known as the rhinoceros beetle virus or Oryctes baculovirus, was discovered in the 1960s and has been widely used to control the rhinoceros beetle (O. rhinoceros) in coconut and oil palm in Southeast Asia and the Pacific (Huger 1966, Jackson et al. 2005). It has an enveloped rod-shaped virion and replicates in the nucleus of infected midgut and fat body cells (Huger 1966, Payne 1974, Payne et al. 1977). Heliothis zea nudivirus 1 (HzNV-1), formerly known as Hz-1 virus or the non-occluded baculovirus Hz-1, was originally described as a persistent viral infection in the IMC-Hz-1 cell line isolated from the adult ovarian tissues of the corn earworm Heliothis zea (Granados et al. 1978). It can also persistently infect several other lepidopteran cell lines, e.g. IPLB-1075 (H. zea), IPLB-SF-21 (Spodoptera frugiperda), IPLB-65Z (Lymantria dispar) and TN-368 (Trichoplusia ni) (Granados et al. 1978, Kelly et al. 1981, Lu and Burand 2001). In contrast, clear infections have not been observed when the virus was inoculated into larvae of H. zea, H. armigera, Estigmene acrea, S. frugiperda, and S. littoralis (Granados et al. 1978, Kelly et al. 1981). The molecular mechanisms underlying this defective host infection by HzNV-1 remain to be explored and may shed light on viral evolution. Gryllus bimaculatus nudivirus (GbNV) infects nymphs and adults of several field crickets (G. bimaculatus, G. campestris, Teleogryllus oceanicus and T. commodus) and replicates in the nuclei of infected fat body cells (Huger 1985). Heliothis zea nudivirus 2 (HzNV-2), previously known as gonad-specific virus, H. zea reproductive virus or Hz-2V, was first observed in the gonads of adult corn earworm H. zea. Its infection causes deformities of the reproductive organs of the insect hosts, which in turn lead to sterility in both female and male moths (Burand and Rallis 2004, Raina et al. 2000). HzNV-2 is also able to infect other noctuid species and to replicate in two lepidopteran insect cell lines derived from ovarian tissues, TN-368 and Ld652Y (Burand and Lu 1997, Lu and Burand 2001, Raina and Lupiani 2006).

Infection cycles and gene expression
Only very limited data on the infection cycle of nudiviruses are available. Their life cycle in either cell culture or natural hosts is still poorly understood. HzNV-1 has a biphasic infection process, with latent and productive phases in its life cycle. In the latent phase of infection, viruses either exist as episomes or insert their DNA into the host genome (Lin et al. 1999), and remain latent for many passages in the infected insect cells (Chao et al. 1992, Lin et al. 1999, Wood and Burand 1986); virus particles are undetectable in most of these latently infected cells. Sometimes virions are released from as few as 0.2% of latently infected cells, resulting in low viral titers (around 10^3 PFU/ml) in the culture medium (Chao et al. 1998, Lin et al. 1999). During the productive infection cycle, in contrast, high titers of virus progeny are produced, resulting in the death of most cells. Often, however, a small proportion of the cells, usually less than 5%, become latently infected, and viruses stay in these cells for a prolonged period of time (Chao et al. 1992, Wood and Burand 1986). Upon in vitro infection, OrNV appears to attach to cultured cells and is subsequently internalized by pinocytosis (Crawford and Sheehan 1985), a mechanism involving the formation of invaginations of the cell membrane, which close and break off to generate virus-containing vacuoles in the cytoplasm. However, it remains unknown how the viral DNA is released into the cytoplasm and eventually enters the nucleus. During the later stage of replication, along with the cytopathic changes to the nucleus, the virogenic stroma develops, where the viral envelopes and nucleocapsid shells are produced and subsequently packaged with viral DNA. Finally, the mature virions enter the cytoplasm and bud through the cell membrane (Crawford and Sheehan 1985).
In vitro sequential expression of viral genes encoding structural and intracellular proteins has been divided into early, intermediate and late stages in the replication cycle of OrNV (Crawford and Sheehan 1985). The temporal gene expression profiles of HzNV-1 during productive infection are divided into three stages: (i) the early stage, 0 to 2 h p.i.; (ii) the intermediate stage, 2 to 6 h p.i.; and (iii) the late stage, which includes all virus-specific events appearing after 6 h p.i. (Chao et al. 1992). Persistency-associated transcript 1 (PAT1), expressed by persistency-associated gene 1 (pag1), is the only detectable transcript during latent infection of HzNV-1 (Chao et al. 1998).

Taxonomy and nomenclature
Given that they share similar structural and replication features with the baculoviruses of insects, nudiviruses were previously classified as the so-called "non-occluded baculoviruses" (NOBs) (Huger and Krieg 1991). NOBs were later removed from the family Baculoviridae because no genetic data were available to support their relationship (Mayo 1995). Nudiviruses have also been referred to as intranuclear bacilliform viruses (IBVs). Notably, unlike baculoviruses, nudiviruses generally lack occlusion bodies (OBs). The genus name Nudivirus has been proposed to accommodate this group. Based on the currently available morphological and molecular data, the following demarcation criteria were proposed for classification of a candidate virus into the genus Nudivirus: (i) the viral genome consists of a large circular dsDNA molecule; (ii) a set of conserved core genes is shared among members, and viruses propagate in the nuclei of infected host cells; (iii) the virion is rod-shaped and enveloped; (iv) viruses are transmitted by the peroral and/or parenteral route, and infect larvae and/or adults with diverse tissue and cell tropisms (Wang et al. 2007a). Obviously, these demarcation criteria need to be complemented with more biological properties, such as virion properties, infection and replication strategies, host range and virus ecology, as these become available. To name a nudivirus species, it was suggested to follow the nomenclature for other large eukaryotic dsDNA viruses, i.e. the host name followed by the suffix "nudivirus" (Wang et al. 2007c).
Presently, nudiviruses comprise five tentative species: OrNV, GbNV, HzNV-1, HzNV-2, and Penaeus monodon nudivirus (PmNV) (Wang and Jehle 2009). Considering their similarities to baculoviruses and, on the other hand, taking their distinct biological and ecological features and virion properties into account, the establishment of an independent family "Nudiviridae" within a new order "Baculovirales", along with the Baculoviridae, seems most appropriate. The establishment of an order "Baculovirales" would allow subsequent flexible integration of other "baculovirus-related" but highly diverged viruses, such as the proposed "Hytrosaviridae" or the Nimaviridae, without taxonomic re-definition of the family Baculoviridae.

Genome size
HzNV-1 was the first completely sequenced nudivirus (Cheng et al. 2002). Its genome is 228,089 bp in size, has a G+C content of 42%, and encodes 154 ORFs (Table 1). HzNV-1 ORFs are randomly distributed on both DNA strands, with 45% in clockwise and 55% in counterclockwise orientation. HzNV-2, a close relative of HzNV-1, has a genome of 231,621 bp, only slightly longer than that of HzNV-1, and an identical G+C content of 42% (Wang et al. 2007a). Later on, the genome of OrNV, the first discovered nudivirus, was partially sequenced (Wang et al. 2007c). Recently, the complete genome sequence of OrNV was obtained using DNA generated by multiple displacement amplification (MDA). The OrNV genome is 127,615 bp in size with a G+C content of 42% and contains 139 ORFs (Table 1, Fig. 1) (Wang et al. 2011). Thus far, the smallest nudivirus genome sequenced is that of GbNV, which is 96,944 bp in length with a G+C content of 28% and contains 98 ORFs. Among them, 58% are in clockwise orientation and 42% are in the reverse direction.
[Figure: ORF map. Colour key: black, GbNV-specific ORFs; blue, HzNV-1 homologues; green, baculovirus, HzNV-1 and OrNV homologues; pink, OrNV homologues; yellow, HzNV-1 and OrNV homologues; light blue, cellular homologues; grey, baculovirus homologues. Taken from Wang et al. (2007b) with permission from the American Society for Microbiology.]

Gene order
As observed in other viral families (e.g., the Baculoviridae), gene order is poorly conserved among nudivirus genomes. OrNV and GbNV share a number of gene clusters, comprising 2-7 collinearly arranged genes, distributed throughout their genomes (Wang and Jehle 2009). In contrast, only two gene clusters were detected between OrNV and HzNV-1 (Wang and Jehle 2009, Wang et al. 2011). However, a gene cluster of helicase, pif-4/19 kda, and/or lef-5 is present in all three nudivirus genomes (Fig. 3), which is similar to the conserved core gene cluster of four genes (helicase, pif-4/19 kda, 38K and lef-5) found in all sequenced baculoviruses (Herniou et al. 2003, Jehle and Backhaus 1994). Hence, core gene clustering strongly supports the hypothesis of a common ancestor of nudiviruses and baculoviruses.
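The notion of collinearly arranged gene clusters can be illustrated with a minimal sketch. It assumes that each genome has already been reduced to an ordered list of unique orthologue identifiers, and it treats a cluster as a run of genes that are immediate neighbours in both genomes, in either orientation. The function name and the toy gene orders are hypothetical and do not reproduce the published OrNV, GbNV or HzNV-1 gene orders; the circularity of the genomes is ignored.

```python
def collinear_clusters(order_a, order_b, min_size=2):
    """Return runs of genes that are adjacent in both gene orders.

    order_a, order_b: lists of unique orthologue identifiers in genome order.
    A run is extended while the next gene in genome A is an immediate
    neighbour (position +1 or -1) of the previous gene in genome B.
    """
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    clusters = []
    run = []
    for gene in order_a:
        extends = (
            bool(run)
            and gene in pos_b
            and abs(pos_b[gene] - pos_b[run[-1]]) == 1
        )
        if extends:
            run.append(gene)
            continue
        # The current run ends here; keep it only if it is long enough.
        if len(run) >= min_size:
            clusters.append(run)
        run = [gene] if gene in pos_b else []
    if len(run) >= min_size:
        clusters.append(run)
    return clusters


# Toy gene orders (hypothetical, for illustration only).
genome_a = ["g1", "helicase", "pif-4", "lef-5", "g2", "g3"]
genome_b = ["g3", "lef-5", "pif-4", "helicase", "x", "g1"]
print(collinear_clusters(genome_a, genome_b))
```

In this toy case the three genes form one cluster even though their orientation is inverted in the second genome, which is the sense in which a cluster can be shared despite poor overall conservation of gene order.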

Repetitive regions
Repetitive sequence regions (Rsr) were detected in all three sequenced nudivirus genomes. They are variable in length and number and are distributed throughout the genome. They are homologous neither to each other, within or between genomes, nor to those of other large dsDNA viruses, such as baculoviruses, hytrosaviruses and white spot syndrome virus (WSSV). Rsr appear to be a universal feature of all large dsDNA viruses.

Promoter motifs
A promoter motif, TTATAGTAT, was identified in the upstream regulatory regions of the HzNV-1 late genes p34 (ORF79) and p51 (ORF64) (Guttieri and Burand 1996, Guttieri and Burand 2001). It was also found within 200 bp of the initiation codon of HzNV-1 ORF81 based on in silico sequence analysis (Cheng et al. 2002). Although consensus early and late promoter motif sequences similar to those of baculoviruses were predicted in nudivirus ORFs, convincing experimental data remain unavailable (Cheng et al. 2002, Wang et al. 2007c).
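An in silico search of this kind can be sketched in a few lines. The example below is a minimal illustration, not the procedure used in the cited studies: it scans a fixed window upstream of a start codon on the forward strand only, ignores the circularity of the genome, and uses a made-up toy sequence. The function name and parameters are hypothetical.

```python
def motif_hits_upstream(genome, start_codon_pos, motif="TTATAGTAT", window=200):
    """Return 0-based positions of `motif` within `window` bp upstream
    of the start codon at `start_codon_pos` (forward strand, linear sequence)."""
    genome = genome.upper()
    lo = max(0, start_codon_pos - window)
    region = genome[lo:start_codon_pos]
    hits = []
    i = region.find(motif)
    while i != -1:
        hits.append(lo + i)          # convert back to genome coordinates
        i = region.find(motif, i + 1)
    return hits


# Toy sequence (hypothetical): the motif lies 25 bp upstream of the ATG.
toy = "ACGT" * 10 + "TTATAGTAT" + "C" * 16 + "ATGAAATAA"
atg = toy.index("ATG")
print(motif_hits_upstream(toy, atg))             # motif found inside the window
print(motif_hits_upstream(toy, atg, window=10))  # window too short, no hit
```

For genes on the reverse strand, the same function could be applied to the reverse complement of the sequence with the coordinates converted accordingly.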

Untranslated regions
Among the HzNV-1 transcripts, the early gene hhi1 (HzNV-1 HindIII fragment 1 gene) contains a 5' untranslated region (UTR) of 270 nucleotides (nt) which, together with the 62 bp upstream of it, composes the hhi1 early promoter (Wu et al. 2008, Wu et al. 2010); the HzNV-1 late gene p34 (ORF79) possesses two 5' UTRs of 16 and 17 nt, differing by 1 nt, and both 5' UTRs overlap with the identified 9 bp late promoter motif of p34 (Guttieri and Burand 1996); as for the HzNV-1 late gene p51 (ORF64), the major late transcriptional initiation site is at −205 bp relative to the translational start codon, and seven minor late start sites are located at various positions upstream of this primary site (Guttieri and Burand 2001). Putative polyadenylation signals (AATAAA) were found downstream of the stop codons of p34 and p51 (Guttieri and Burand 1996, Guttieri and Burand 2001). Thus far, nothing is known about how UTRs affect the translational efficiency of nudivirus genes.

Open reading frames (ORFs)
Computer-assisted ORF prediction included all sequences starting with ATG followed by 50 or more amino acid (aa) codons and showing minimal overlap with other ORFs. ORFs with fewer than 50 aa were considered putative genes only in cases of clear homology to ORFs in other dsDNA viruses.
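The length criterion can be made concrete with a minimal sketch. It scans all six reading frames of a linear sequence for stretches that begin with ATG and run to the next in-frame stop codon, and keeps those of at least 50 codons. Counting the initiating ATG towards the 50 codons is an assumption, as are the function names; the overlap filter and the homology-based rescue of shorter ORFs described above are not modelled, and the origin of a circular genome is not wrapped.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=50):
    """Return (strand, start, end) for each ORF of at least `min_codons` codons.

    Coordinates are 0-based, half-open, on the forward strand, and include
    the stop codon. Within a frame, the first ATG after a stop is used.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        end = i + 3
                        if strand == "+":
                            orfs.append((strand, start, end))
                        else:
                            orfs.append((strand, n - end, n - start))
                    start = None
    return orfs


# Toy sequence (hypothetical): one ORF of 61 codons plus a stop codon.
toy = "CC" + "ATG" + "GCT" * 60 + "TAA" + "GG"
print(find_orfs(toy))
```

Using the first ATG after a stop codon yields the longest candidate in each frame; choosing among alternative in-frame start codons would need additional evidence such as promoter motifs or homology.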

Gene content and conserved gene functions
There are 66, 34, and 33 homologous genes shared by OrNV and GbNV, OrNV and HzNV-1, and GbNV and HzNV-1, respectively (Table 2), suggesting that OrNV and GbNV are more closely related to each other than to HzNV-1. OrNV, GbNV and HzNV-1 have 33 genes in common (Table 2). Strikingly, 20 of them are homologues of baculovirus core genes, which are present in all 54 baculovirus genomes that had been deposited in GenBank as of July 2011. The 31 baculovirus core genes play crucial roles in the virus replication cycle and are the evolutionarily conserved marker genes used in the identification, classification and phylogeny of baculoviruses (Herniou et al. 2003, Herniou and Jehle 2007, Jehle et al. 2006a, Jehle et al. 2006b, van Oers and Vlak 2007). Nine other ORFs are likely involved in DNA replication, repair and recombination, and nucleotide metabolism; one is homologous to the baculovirus iap-3 gene; two others are nudivirus-specific ORFs of unknown function (Table 2). The presence of 20 baculovirus core genes in nudiviruses strongly indicates that nudiviruses and baculoviruses are the closest lineages among the viruses known so far.
Besides nudiviruses, homologues of baculovirus core genes were also detected in two salivary gland hypertrophy viruses (SGHVs). Surprisingly, several core gene homologues of baculoviruses were identified in the marine WSSV as well (Wang et al. 2011), suggesting that WSSV, as suspected since it was first observed, is evolutionarily related, albeit distantly, to baculoviruses. Most strikingly, nudiviruses, SGHVs and WSSV possess homologues of the genes encoding peroral infectivity factors (p74, pif-1, pif-2 and pif-3) (Wang et al. 2011). These four pif genes are conserved among all sequenced baculoviruses and are absolutely crucial for successful peroral infection of insect hosts. As midgut infection is the essential first step in the invasion by baculoviruses, PIFs may be key determinants of host range and virulence. Accordingly, it seems reasonable to hypothesize that a highly conserved mode of virus-host interaction upon primary infection is present in nudiviruses, baculoviruses, SGHVs and WSSV. However, only limited data on the function of the PIF proteins are available for baculoviruses (Slack and Arif 2007), let alone for nudiviruses, SGHVs and WSSV. Obviously, in-depth exploration of the molecular mechanisms of the PIF proteins and their homologues is crucial for a better understanding of the host range, zoonotic behaviour, and epizootic or enzootic disease of these viruses. In addition, nudiviruses appear to share homologues of the transcription apparatus of baculoviruses, suggesting that a similar mode of late gene transcription is used in nudiviruses as well (Wang et al. 2011). Taken together, these findings provide crucial clues to the origin and evolution of arthropod-specific large dsDNA viruses. The biochemical and biological functions of the genes predicted in nudiviruses remain unknown. Only the occlusion body protein-encoding gene of PmNV has been molecularly characterised, revealing no homology to any other gene deposited in GenBank (Chaivisuthangkura et al. 2008).

Phylogenetic analysis
Owing to the scarcity of information on other distinguishing features, single-gene phylogeny and/or phylogenomics has become the most important approach to delineate the relationships of the viruses concerned at the strain and species level. However, single-gene phylogenies have fallen increasingly into disfavor given the recognition that gene trees can often differ substantially from the underlying species tree because of a variety of evolutionary events, in addition to simple stochastic or analytical error. This is likely to be especially true for DNA viruses, given the substantial evolutionary dynamics intrinsic to their genomes. Thus, (i) horizontal gene transfer (HGT), i.e., exchange with other viruses, symbiotic bacteria and hosts; (ii) homologous or nonhomologous recombination with other viruses; (iii) gene/domain duplication and rearrangement; and (iv) lineage-specific gene loss/expansion all impose significant complications on both the bioinformatic detection of orthologous genes and the accuracy of the resulting gene trees with respect to the overall species tree (Shackelton and Holmes 2004).
To overcome these problems, a set of conserved genes was analysed using both the supertree and supermatrix approaches. Multiple sequence alignments of individual genes were performed using any of T-Coffee (Notredame, Higgins and Heringa 2000), MUSCLE (Edgar 2004), ClustalW/X (Chenna et al. 2003), MAFFT (Katoh et al. 2002) and Kalign (Lassmann and Sonnhammer 2005), and were manually refined as needed. Sequence alignment quality was assessed using MUMSA (Lassmann and Sonnhammer 2005). In particular, 20 of the 30 baculovirus core genes (Table 2) were analysed, considering that they are evolutionarily more conserved than other nonessential genes and that homologues of all or most of them are present in NVs, SGHVs, and WSSV as well as, albeit more distantly, in other large eukaryotic dsDNA viruses such as NCLDVs (nucleocytoplasmic large DNA viruses) and herpesviruses (Table 3).
Table 3. Ancient core genes identified in NALDVs, WSSV, NCLDVs, and herpesviruses. Black squares, homologue detected in all available genomes; grey squares, in many but not in all available genomes; white squares, in few available genomes; -, not detected. Homologue definition, gene name in NALDVs / in NCLDVs. Ascovirus is considered to be a member of the NCLDVs because of its high sequence similarity to iridovirus.
The supertree and supermatrix frameworks represent alternative strategies for the issue of data combination. In the supermatrix approach, all the primary character data are combined into a single supermatrix that is analysed using standard phylogenetic methods (de Queiroz and Gatesy 2007). By contrast, the supertree approach combines phylogenetic trees derived from individual partitions of the full data set (here the individual gene trees) to likewise derive a single, joint phylogenetic estimate (Bininda-Emonds 2004a). Thus, the supertree approach addresses conflict and congruence at the level of the source trees rather than at the level of the primary data (Bininda-Emonds 2004b). Although this approach has been criticised because of the inherent loss of information (among others, see de Queiroz and Gatesy 2007), numerous simulation studies have demonstrated that this loss of information is not detrimental in practice (see Bininda-Emonds 2004a). Moreover, the contrasting approaches of the supertree and supermatrix frameworks form the basis of the global congruence framework (Bininda-Emonds 2004b), whereby increased confidence is placed in those clades common to both approaches and increased attention is demanded on conflicting solutions, particularly when each is strongly supported.
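The data-combination step of the supermatrix approach can be sketched as follows. This is a minimal illustration, assuming each gene alignment is held as a dictionary mapping taxon name to aligned sequence; taxa lacking a gene are padded with a missing-data symbol, and the column range of each gene is recorded so that a partitioned model could be applied later. The function name, the "?" padding symbol and the toy alignments are assumptions, not part of the published analysis.

```python
def build_supermatrix(alignments, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: dict of gene name -> dict of taxon -> aligned sequence.
    Returns (matrix, partitions) where matrix maps taxon -> concatenated
    sequence and partitions lists (gene, first_column, last_column), 1-based.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    offset = 0
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not aligned")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this gene with missing-data symbols.
            pieces[taxon].append(aln.get(taxon, missing * length))
        partitions.append((gene, offset + 1, offset + length))
        offset += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy alignments (hypothetical taxa and sequences).
toy = {
    "geneA": {"v1": "MKT", "v2": "MRT"},
    "geneB": {"v1": "LL-V", "v3": "LIAV"},
}
matrix, partitions = build_supermatrix(toy)
print(matrix)
print(partitions)
```

Because every taxon contributes a full-length row, genes that are absent from a virus become blocks of missing data, which is one reason a taxon with few shared genes can be difficult to place in a supermatrix analysis.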
For the supertree analyses, phylogenetic analyses of the individual gene trees were performed under a maximum likelihood (ML) framework using RAxML 7.0.4 (Stamatakis, Hoover and Rougemont 2008). Optimal substitution matrices for each amino acid data set were selected initially using the Perl script ProteinModelSelector (http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm) as implemented in batchRAxML (http://www.molekularesystematik.uni-oldenburg.de/33997.html) and then applied for the full ML analysis of each gene tree. In all cases, rate heterogeneity between sites was accounted for using the CAT approximation of the gamma distribution (Stamatakis 2006). The former is an approximation of the latter that is both computationally more efficient in terms of its memory demands and overall speed, and provides equivalent results (Stamatakis 2006). However, all final likelihood values were obtained under a true gamma distribution. ML analysis used the new fast bootstrapping approach (Stamatakis et al. 2008) that simultaneously obtained the ML tree as well as estimates of nodal support based on a nonparametric bootstrap (Felsenstein 1985). Bootstrap values were based on 1000 replicates. Gene trees were rooted on the herpesviruses HHV-3, HHV-4, and HHV-5 (as a monophyletic group) because they share the minimum number of conserved ancestral genes with the other viruses (Table 3); trees lacking herpesviruses were treated as unrooted.
The supertree analysis used the method of matrix representation with parsimony (MRP) (Baum 1992, Ragan 1992), whereby the topology of each gene tree was encoded using additive binary coding: for each node in turn, all taxa descended from that node are scored as "1", all taxa otherwise present on the tree are scored as "0", and all remaining taxa as "?". Semi-rooted coding was employed in that rooted gene trees included an all-zero fictitious outgroup taxon to root the supertree; for unrooted gene trees, this taxon was coded using "?" (Bininda-Emonds, Beck and Purvis 2005). The matrix representations of all source trees were then combined into a single matrix that was analyzed using maximum parsimony (MP). Individual pseudocharacters in the matrix were weighted according to the bootstrap support of their corresponding nodes, a procedure that improves the accuracy of the supertree analysis by helping account for differential support within the primary character matrices (Bininda-Emonds and Sanderson 2001). MP searches in PAUP* v4.0b10 (Swofford 2002) used a heuristic search strategy based on a random addition sequence (10000 replicates), TBR branch swapping, and with up to 50000 equally most parsimonious trees (MPTs) being saved. The supertree was taken to be the 50% majority-rule consensus of all MPTs. Support for the nodes in the supertree was estimated using the rQS index (Bininda-Emonds et al. 2003, Price, Bininda-Emonds and Gittleman 2005) restricted to informative gene trees only; analyses used the Perl script QualiTree (http://www.molekularesystematik.uni-oldenburg.de/33997.html).
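The additive binary coding described above can be sketched for rooted source trees. The example is a minimal illustration and not the software used in the study: trees are written as nested tuples of taxon names, the root node is skipped because its column would be uninformative, and the fictitious outgroup is scored "0" in every column. Bootstrap weighting of the pseudocharacters and the "?" coding of the outgroup for unrooted trees are not modelled; the function names and toy trees are hypothetical.

```python
def leaves(node):
    """Return the set of taxon names below a node (a str or a nested tuple)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(tree):
    """Return the leaf sets of all internal nodes below the root."""
    found = []

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.append(leaves(node))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found


def mrp_matrix(trees, outgroup="MRP_outgroup"):
    """Encode rooted trees as an MRP matrix (taxon -> string of 1/0/?)."""
    all_taxa = sorted(set().union(*(leaves(tree) for tree in trees)))
    columns = []
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            column = {}
            for taxon in all_taxa:
                if taxon in clade:
                    column[taxon] = "1"   # descended from this node
                elif taxon in present:
                    column[taxon] = "0"   # on the tree, outside the node
                else:
                    column[taxon] = "?"   # absent from this gene tree
            column[outgroup] = "0"        # all-zero outgroup roots the supertree
            columns.append(column)
    return {t: "".join(c[t] for c in columns) for t in all_taxa + [outgroup]}


# Toy gene trees (hypothetical taxa).
toy_trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
for taxon, row in mrp_matrix(toy_trees).items():
    print(taxon, row)
```

The resulting rows could be written to a parsimony program as a standard character matrix; each column corresponds to one node of one gene tree, so conflicting gene trees appear as incompatible columns.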
The rQS index measures the number of gene trees that explicitly support or conflict with a given node on the supertree. Values of 1 and -1 indicate universal support or conflict, respectively, among the set of gene trees (Fig. 4). For the supermatrix analysis, all individual gene data sets were concatenated into a single, larger matrix that was analyzed using RAxML. Analysis used the same method as for the individual gene trees, except that a partitioned model was used whereby each gene partition was modeled individually according to the optimal model of evolution determined previously. Support values for each tree were also estimated using the support measure for the other technique. In other words, the rQS index was also applied to the supermatrix tree to estimate the support for its nodes across the gene trees and the bootstrap values for the nodes on the supertree were estimated using the 1000 bootstrap replicate trees derived from the supermatrix analysis (Fig. 4).
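The logic of a support measure of this kind can be sketched as follows. This is a simplified reading of the description above, not the QualiTree implementation: each gene tree casts a vote of +1 if it contains the node (after restricting the node to the taxa present in that gene tree), -1 if one of its clades partially overlaps the node, and 0 otherwise, and the index is the mean vote. Averaging over all supplied gene trees, the rooted-clade representation and all names are assumptions.

```python
def node_vote(clade, tree_taxa, tree_clades):
    """Vote of one rooted gene tree on a supertree node: +1, -1 or 0."""
    restricted = clade & tree_taxa
    # The node says nothing about this tree if it collapses to fewer than
    # two taxa or to the whole taxon set of the gene tree.
    if len(restricted) < 2 or restricted == tree_taxa:
        return 0
    if restricted in tree_clades:
        return 1
    for other in tree_clades:
        shared = restricted & other
        # Partial overlap: the two groupings cannot sit on the same tree.
        if shared and shared != restricted and shared != other:
            return -1
    return 0


def rqs(clade, gene_trees):
    """Mean vote across gene trees, each given as (taxon set, list of clades)."""
    votes = [node_vote(clade, taxa, tree_clades) for taxa, tree_clades in gene_trees]
    return sum(votes) / len(votes)


# Toy gene trees (hypothetical): one supports {A, B}, one conflicts, one is silent.
t1 = (frozenset("ABCD"), [frozenset("AB"), frozenset("CD")])
t2 = (frozenset("ABC"), [frozenset("AC")])
t3 = (frozenset("ACD"), [frozenset("CD")])
node = frozenset("AB")
print(rqs(node, [t1, t2, t3]))
print(rqs(node, [t1, t3]))
```

Under this sketch a value near zero can arise either from balanced support and conflict or from many uninformative gene trees, which is why the number of informative trees matters when comparing nodes.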

Common ancestry of NVs, baculoviruses and SGHVs
In the light of gene content analysis, an evolutionary link among NVs, baculoviruses, SGHVs and WSSV is most plausible. Consequently, it should be possible to analyze their phylogenetic relationship on the basis of their shared conserved ancestral genes. When these 20 single gene trees were inferred, most of the nodes showed medium to high bootstrap values, with average values across an entire gene tree ranging from 57.6 ± 10.9 (helicase; n = 17 nodes) to 99.0 ± 0.8 (p47; n = 4 nodes), suggesting the trees are topologically reliable on the whole (Wang et al. 2011).
Supermatrix analysis (based on the 20 core genes indicated in Table 2) and supertree analysis (based on the 20 single core gene trees) were then performed (Wang et al. 2011). The supermatrix tree and the supertree were highly congruent (Fig. 4). In both cases, the monophyly of each of the NVs, baculoviruses, and SGHVs was strongly supported, the branching patterns within each of the baculovirus and NV clades were in good agreement with the current picture of their phylogeny, and a common ancestor of baculoviruses and NVs was suggested (Fig. 4). Hence, both the supertree and the supermatrix tree indicate that baculoviruses and NVs together are monophyletic; they can be considered the minimal group forming what we term the nuclear arthropod-specific large DNA viruses (NALDVs). The two methods conflict, however, in positioning the SGHVs within (or at least as sister lineage to) the NALDVs (supermatrix tree) or outside the NALDV group (supertree). For each tree, the preferred position enjoys better support than that from the other analysis based on the most appropriate support measure. For instance, the supertree placement of the SGHVs has an rQS index value of 0.455 compared to a value of 0.143 supporting the grouping of SGHVs with the baculoviruses and NVs. However, whereas the supermatrix placement of the SGHVs enjoys some rQS support (0.143 as mentioned), the supertree placement has virtually no bootstrap support (0.6 compared to 99.4). Thus, the supermatrix placement of the SGHVs as sister lineage to baculoviruses and NVs, with the SGHVs being members of the NALDVs, seems justifiable.

The "Monodon baculovirus" represents a nudivirus
Blastp searches revealed that a number of nudivirus homologues are present in the partially sequenced genome of the so-called "Monodon baculovirus" of the shrimp Penaeus monodon (21,150 bp in total; GenBank accession no. EU246943, EU246944, EF458632, AY819785).

When the annotated ORFs of the shrimp Monodon baculovirus (MBV) were used as queries in BLAST similarity searches, best hits were frequently found with HzNV-1 (Wang et al. 2011). Different phylogenetic analyses of the homologues of the baculovirus core genes lef-9, vlf-1, lef-5 and 38K, including single gene tree inference as well as both supermatrix and supertree analyses, unequivocally revealed a clear relationship between MBV and the non-occluded HzNV-1 (Fig. 4) (Wang et al. 2011). Given that seven other MBV and HzNV-1 ORFs are also highly similar, it is strongly suggested to consider MBV as an occluded member of the NVs and to rename it Penaeus monodon nudivirus (PmNV) (Wang et al. 2011).

WSSV might be related to the NALDVs
The position of WSSV, however, differs between the two trees: it is nested deep within the NALDVs in the supermatrix tree but is sister to the clade of NCLDVs plus NALDVs in the supertree (Fig. 4). Support for either position, based on either the rQS index or the bootstrap, is weaker than that for other clades in the tree. The different positions for WSSV reflect how the two methods deal with the restricted, conflicting information that is available for this virus. Although WSSV shares six genes with the other viruses, only two of these (DNA polymerase and p33) are phylogenetically informative as to its potential placement with respect to the NALDV and NCLDV groups, because homologous counterparts are available in both groups. The remaining four genes (p74, pif-1, pif-2, and pif-3) are restricted to baculoviruses, NVs, SGHVs, and WSSV. The resulting trees are therefore essentially unrooted, and it is not possible to determine whether WSSV nests within the NALDVs (contradicting the supermatrix placement) or is sister to them (consistent with both placements). Of the two informative genes, only p33 associated WSSV with the NALDVs; the DNA polymerase grouped it within the NCLDVs (Wang et al. 2011). The supermatrix analysis is influenced largely by the relative number of amino acids (aa) supporting a given position. In the current context, DNA polymerase with ~3000 aa residues clearly outweighs the ~1000 aa residues of p33, thereby favouring the placement of WSSV with the NCLDVs. By contrast, the supertree analysis is more sensitive to the number of trees supporting a given position and, importantly, the relative node support within those trees (in a weighted supertree analysis). Thus, although the DNA polymerase tree places WSSV within the NCLDVs, this position is very poorly supported and is outweighed by its more robust placement within the NALDVs in the p33 tree (Wang et al. 2011). As a result, WSSV was excluded from the NCLDV group in the supermatrix tree.
Thus, the phylogenetic analyses are equivocal with respect to the evolutionary relationships of WSSV based on the current data set, and more genes need to be sampled to resolve its placement. Nevertheless, other sources of evidence suggest that WSSV is more closely related to the NALDVs than to other DNA viruses. Notably, WSSV shares six conserved homologous genes with the NALDVs, but rarely possesses genes homologous to those of the numerous other marine viruses colonising the same aquatic ecological niches. It therefore seems that WSSV is a very ancient virus that has undergone extremely divergent evolution, as witnessed by the branch lengths generally subtending this virus (Fig. 4). This, in turn, hampers identification of its gene homologues and reconstruction of its phylogenetic affinities using present-day alignment-based methods. In contrast, when an alignment-free whole-proteome phylogenetic analysis was applied, WSSV clustered with SGHVs (Wu et al. 2009), which coincidentally is in agreement with the presented hypothesis of WSSV's evolutionary link to the NALDVs. However, in the study by Wu et al. (2009) the SGHVs and WSSV were placed within the herpesviruses, although there is no evidence of a relationship among these viruses when structural, biological and other genome features are considered.

A common ancestry of nudiviruses, baculoviruses, hytrosaviruses, and WSSV
Taken together, 20 baculovirus core gene homologues were identified in nudiviruses, 12 in SGHVs, and six in WSSV. This shared gene content of baculoviruses, nudiviruses, SGHVs, and WSSV is important evidence for a proposed common ancestry of these viruses. Any other explanation, e.g., horizontal gene transfer of these genes, seems less probable. It is therefore proposed that baculoviruses, nudiviruses, hytrosaviruses, and WSSV most likely shared a common ancestor and form a highly diverse group of nuclear arthropod-specific large DNA viruses (Wang and Jehle 2009, Wang et al. 2011).

Acknowledgement
This work was funded by grants from Shanghai Municipal Education Commission (the Eastern Scholar Project and the Leading Academic Discipline Project) and Shanghai Municipal Science and Technology Commission (Project no. 10540503000).