Recent Progresses in Ab Initio Electronic Structure Calculation toward Understandings of Functional Mechanisms of Biological Macromolecular Systems

In this chapter, we present recent advances of theoretical analyses toward understandings of functional mechanisms of biological macromolecular systems, employing ab initio electronic structure calculations. Two distinct types of triggers to invoke dramatic rearrangements of electronic structures in the reaction centers are revealed by full ab initio quantum mechanics (QM) calculations (first example) and hybrid ab initio QM/molecular mechanics (MM) molecular dynamics (MD) calculations (second example). First, we demonstrate dramatic rearrangements of molecular orbitals (MOs) induced by binding of a hydroxyl ion (OH (cid:1) ) to the [4Fe-3S] cluster found in hydrogenases, which catalyzes both dissociation and production of dihydrogen (H 2 ). This induces the significant delocalization of the LUMO, resulting in formation of electron transfer pathways required for the catalysis. Thus, in organisms, just a tiny species (e.g. OH (cid:1) ligand) can play a key role for the biological functions. Second, we indicate dynamical rearrangements of MOs occurring in the enzymatic reactions of RNA-protein complexes. As the catalysis proceeds, the reactive MOs, which do not belong to the frontier orbitals in the initial stages of the reaction, are dramatically reconstituted in the hybrid ab initio QM/MM MD simulations, resulting in the frontier orbitals, which is a feature characteristic to biological macromolecular systems. the mutation, whereas in the Val and Ile systems, those were preserved. Our modeling studies elucidated that these were due to the absence and presence of compensation mechanisms in the Leu and Val/Ile protein moieties, respectively. Actually, in our previous study, we constructed the atomistic structural model of the Val system (i.e., the ValRS (cid:3) threonyl-tRNA Val (misaminoacylated) complex) and showed that for the Val system with a “ defective ” ribozymal activator of tRNA Val , the protein (ValRS) moiety could activate the nucleophilic water molecule (the red arrow), which is referred to as the protein enzyme (note that the definition of protein enzyme described here is different from that employed by Cech [64]). As discussed in the text, this transition from the hybrid ribozyme/protein catalyst toward the protein enzyme may fill a gap found in the evolutionary transition from the RNA world to the current RNP world, which could possibly occur in primordial biological macromolecular systems. © Sakabe et al., [25].


Introduction
Why are the theoretical analyses employing ab initio quantum mechanics (QM) calculations required to understand biological systems? In an organism, so many catalytic reactions are present, for example, transcription, DNA replication and repair, protein biosynthesis, respiration, photosynthesis, and synthesis and degradation of biological compounds (metabolites) such as amino acids, nucleotide, and lipid. In order to understand the mechanisms of biological functions, analyses of the electronic structure changes for the catalytic reactions are essential.
Up to date, QM calculations have been employed to understand many biochemical reactions, although the system sizes of such macromolecular systems are huge. In this section, we will briefly introduce several substantial issues in QM methods that have frequently been employed in analyses of biological systems. From biochemical and biophysical points of view, these descriptions are also relevant to the construction of the QM models, spin assignments, selection of QM/MM methods, the QM calculation methods, basis sets, and so on.

Construction of QM model system
To obtain precise geometric and electronic structures employing the QM calculations, high-quality three-dimensional (3D) structures are indispensable. In most studies of biological macromolecular systems, the initial 3D structures for the theoretical analyses are retrieved from Protein Data Bank (PDB) website (https://www.rc sb.org/), which provides 3D structures of biological macromolecules analyzed employing X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscope (EM) experiments. Currently, the PDB site contains more than 129,300 X-ray structures, 12,300 NMR structures, and 2400 EM structures.
Although state-of-the-art methodologies, such as the X-ray free electron laser (XFEL) and cryo-EM, provide high-quality 3D structures, the resolution of most experimental structures is still insufficient to observe hydrogen atoms. Indeed, only 0.5% of X-ray and EM structures are under 1.0 Å resolutions, and 80.3% are in the range of 1.4-2.8 Å resolutions. Thus, one needs to attach the hydrogen atoms in chemically appropriate manners. This is also an important issue, and so we need enough time to carefully identify the appropriate configurations for attachments of hydrogen atoms.
Since the computation costs of QM calculations are too large to include the entire biological macromolecular systems, QM models are usually extracted and thereby include the numbers of atoms in the ranges of 50-100 atoms. The truncated boundary carbons of the QM models are usually capped by the methyl group. Other crucial moieties in the systems, which can significantly affect geometric and electronic structures of the active centers, such as ligands of the transition metal binding sites and hydrogen-bonded waters, should also be included in the extracted QM models. Notably, to overcome the increase of computational costs by including large environmental moieties into the QM models such as the bulk water molecules as the solvent, hybrid quantum mechanics and molecular mechanics (MM) (i.e., classical mechanics) schemes have been developed up to date (the hybrid QM/MM calculation method is discussed in Section 1.3).

Spin assignments of the system
As mentioned, we require 3D structures of the calculation models, basis sets, the (total) charges, spin multiplicities, and initial wave functions of the systems, to perform the QM calculations. Since the QM calculations may also suffer from the nonlinear, initial guess, and local minimum problems, starting from appropriate initial 3D structures and wave functions is very important to obtain the correct states. In particular, the total spin of the system including multiple transition metals must be explored by carefully providing multiple combinations of spin assignments and thereby should be determined by identifying the spin state with the optimal total energy value among all of those spin states.
To provide the total spin assignments, it is convenient to divide the systems into some fragmental moieties, such as a transition metal and its coordinated ligands as a subsystem, which leads to spin assignments for a part of the QM models. Although this could be helpful to get a reasonable solution, there is no warranty of the convergence and acquisition of the correct solution. In Section 2, we discuss more details of this issue.

Hybrid functional in density functional theory (DFT) calculation
Hybrid functional approximation was introduced by Becke [1] in 1993 and has become one of the most popular computational approaches. By incorporating a portion of the exact exchange energy originated from the Hartree-Fock theory coupled with the exchange-correlation energies, the hybrid functional approach improves molecular properties that poorly described with simple ab initio functionals, such as bond lengths, vibration frequencies, and atomization energies [2]. In general, a hybrid functional is described as a linear combination of the Hartree-Fock exact exchange energy functional and the exchange-correlation density functionals; their weights are determined by a fitting procedure such as to reproduce experimental or highly advanced calculated thermochemical data.
Becke proposed the following exchange-correlation approximation [1], where a 0 , a x , and a c are the parameters determined by the fitting procedure (i.e., a 0 = 0.20, a x = 0.72, and a c = 0.81). Herein, subscripts X and C represent the exchange and correlation, respectively, E LSDA XC represents the exchange-correlation energy within the framework of the local spin-density approximation (LSDA), E exact X is the exact exchange energy, and E B88 X and E PW91 X are the gradient corrections for the exchange by Becke and Perdew-Wang, respectively.
The B3LYP functional, which has been one of the most widely used functionals in molecular quantum calculation fields, employs the nonlocal correlation provided by the LYP expression (Lee-Yang-Parr) and the Becke88 exchange functional [3], and VWN local-density approximation that was constructed by Volsko, Wilk, and Nusair (VWN) [4]. Thus, the general formula of the hybrid functionals can be written as follows (this is exploited in Gaussian software; http://gaussian.com/dft/), Various combinations of nonlocal exchange functionals and correlation functionals can be employed here. In the B3LYP functional, the Becke's three-parameter functional and the LYP correlation functional were combined with the following parameters, In an attempt to improve the B3LYP functional, the long-range correlations are incorporated into the cam-B3LYP [5] and LC-BLYP [6] functionals. However, to the best of our experiences on applying to biological macromolecular systems, selection of functionals is not so simple, and thus, careful investigations based on computational trials are required to determine them. In the following two examples described in this chapter, such examinations were actually performed, and thereby, the B3LYP functional was adopted.

Hybrid QM/MM calculation scheme
In 1976, Warshel and Levitt [7] developed a QM/MM method, in which QM calculation is combined with classical mechanics calculation, to obtain the electronic structures of the QM region with consideration of the environmental effects, such as protein, membrane, and solvent water molecules. In this strategy, the QM calculation is adopted to the active site (QM region), and for the remainder of the system, the MM calculation is adopted (MM regions) ( Figure 1).
Great progresses have been achieved up to date for improvement of QM/MM calculation algorithms and their applications to biological systems [8][9][10][11][12][13][14][15][16][17][18][19][20][21]. Importance of the environments has been reported from many QM/MM studies. For example, polarization from MM region affects both electronic structure and geometric structure [22]. Recently, we reported that in huge biological macromolecular systems such as complexes of aminoacyl-tRNA synthetases (aaRSs) and their cognate tRNAs, dynamical, geometrical changes induced dramatical rearrangements of the electronic structures in the catalytic sites, which thus generated the productive states in the reactions [23][24][25] (see Section 3). By contrast, in Section 2, full (ab initio) QM calculations were solely employed for the analysis of the electronic structures of a transition metal cluster found in proteins, since in most cases, such a structure is buried in protein environments.

Energy expression for QM/MM calculation
In the framework of QM/MM methodology, an entire system is divided into two regions: QM region, which is described by quantum mechanics principles, and MM region, which is described by molecular mechanics (i.e., classical mechanics). Due to the presence of the interactions between QM and MM regions, the total energy of the entire system can be formally written as follows: The inclusion of the energy term of QM-MM interactions, E(QM/MM), enables a more realistic description of the system, compared with isolated QM calculations. In terms of the treatment of the electrostatic interaction between the QM and MM regions, QM/MM methodologies are divided into two groups, subtractive and additive schemes. Herein, we discuss the advantages and disadvantages of the QM/MM methodologies in the comparison of these two schemes.
Subtractive schemes consist of the three steps as follows: (1) an MM calculation on the entire system, (2) a QM calculation on the QM region, and (3) an MM calculation on the QM region. Then, QM/MM energy of the entire system can be formulated as follows: The subscript indicates the type of calculation (QM or MM calculation), and the region on which the calculation is performed is described in parentheses. An advantage of the subtractive schemes is simplicity. Explicit descriptions of interactions between QM and MM regions are not required. In addition, artifacts that might be caused by using link atom schemes to cap the truncated bonds at the QM-MM boundary (described below) can be avoided. On the other hand, disadvantages of the subtractive schemes are the following. (1) Force fields are required for describing the QM region that often includes ligands and intermediate structures of enzymatic reactions; in general, reliable force fields of the molecules are not prepared, and additional QM calculations should be carried out for the parameterization every time a new system is studied. (2) The electrostatic interactions between the QM and MM regions are described at molecular mechanics level; that is, the interactions are calculated by the Coulomb interactions between fixed atomic charges in the QM and MM regions. Such descriptions cannot represent polarization of the QM region induced by the environment surrounding the QM region.
On the other hand, the additive schemes can take into account the polarization effects. The energy expression for the additive schemes is given in Eq. (6): A characteristic feature of this scheme is the presence of the energy term with respect to the interactions between QM and MM regions, described as follows: To calculate electrostatic interactions, that is, the first term in the left side of Eq. (7), one-electron integrals in the QM Hamiltonian incorporating MM partial charges can be used.
The symbols q j are the MM partial charges located at R j . Q k are the nuclear charges of the QM atoms at R k , and r i represents electron positions. N, M, and L represent number of electrons, MM atoms to be incorporated into the one-electron integrals, and QM atoms, respectively. Using the additive scheme, the electronic structures of the QM region are affected by the charge distribution of the environment. An advanced approach is to consider polarization of the MM region by the QM region (i.e., to allow the partial charges to be changed according to changes in the electronic structure of the QM region). However, polarizable force fields with broader applications have not yet emerged, while many efforts to account for the polarization effects were made [26,27].

Ab initio QM analysis of electron transfer (ET) mechanisms in hydrogenase 2.1 [NiFe] hydrogenase
Hydrogenase is an enzyme that can catalyze dihydrogen (H 2 ) to water molecules and its reverse process [28]. Due to the reversible oxidation properties in the H 2 catalysis, hydrogenase has been focused in biotechnological devices, such as generation of H 2 from solar energy [29]. However, most [NiFe] hydrogenases, which are classified as standard hydrogenase, are sensitive to the explosion of the O 2 ; that is, their activities decrease in an aerobic condition. By contrast, some hydrogenases preserve their activities even in the presence of O 2 , which are referred to as O 2tolerance.
Herein, we focus on membrane-bound [NiFe]-hydrogenases (MBHs), which are O 2 -tolerant hydrogenases. The 3D structures and active sites of MBHs are very similar to those of the standard hydrogenases except for the transition metal (iron) binding site that are located in the proximity of the catalytic center where Ni and Fe are bound, which are referred to as the proximal and [NiFe] active clusters, respectively ( Figure 2).
Although the mechanisms of the O 2 -tolerance still remained to be resolved, the structural differences of the proximal clusters between MBHs and the standard hydrogenases were suggested to be responsible for the O 2 -tolerance mechanisms. More specifically, the proximal cluster of the standard hydrogenase contains [4Fe-4S]-4Cys cluster, while that of MBH contains [4Fe-3S]-6Cys cluster ( Figure 2). In fact, two cysteine residues in MBH are replaced with glycine (Gly) residues in the standard [NiFe]-hydrogenases.
In addition, for the proximal cluster of MBH, three charge states have been reported; that is, the reduced, oxidized, and superoxidized states. Moreover, the 3D structure of the proximal cluster in MBH is also changed depending on those redox states. Moreover, combined crystallographic and spectroscopic analyses have recently suggested that a hydroxyl ion (OH À ) was attached to a Fe ion in the superoxidized states of the proximal cluster in Ralstonia eutropha MBH [30]. However, its functional role has still remained to be clarified.
In this section, we first introduce the way of how we investigated the electronic structures of the proximal cluster of Ralstonia eutropha MBH. Herein, systematic exploration of spin combinations of the proximal cluster was essential to obtain the reliable calculation data. For the calculation models, we constructed two structural models of the proximal cluster in the presence and absence of the hydroxyl ion and compared their detailed electronic structures, to reveal the functional role of the hydroxyl ion. To evaluate the total energy of various spin states and the electronic structures of the optimum energy states in the presence and absence of the hydroxyl ion, we employed full (ab initio) QM calculations with the use of the B3LYP functional as mentioned above (see Section 1.3).

Exploration of spin assignments
To build structural models, we employed the atomic coordinates of the proximal cluster in the superoxidized state of the crystal structure of Ralstonia eutropha MBH (PDB ID: 4IUD). Our computation models included the iron-sulfur cluster (i.e., [4Fe-3S]) and six Fe-coordinated cysteine (Cys) residues (i.e., Cys17, Cys19, Cys20, Cys120, Cys115, and Cys149). Moreover, we also included three amino acid residues (i.e., Ser21, Glu76, and His229), two crystal water molecules, and OH À ion, all of which are coordinated to the iron-sulfur cluster.
Six amino acid residues (Cys17, Cys115, Cys120, Cys149, Glu76, and His229) were truncated by Cα atoms with the attachment of methyl groups (▬CH 3 ). As mentioned in the last section, we constructed another similar model that did not include the OH À ion to reveal its effects, and thus 103 and 101 atoms were included in our structural models. These are referred to as the original (model 2) and Δ(OH À ) (model 1) models, respectively ( Figure 2).
Spectroscopic experiments elucidated the formal charge and total spin of the [4Fe-3S] cluster in the superoxidized state as +5 and 1/2, respectively [32]. For each of the iron and sulfur ions in the [4Fe-3S] proximal cluster, we set the formal charge as Fe 2+ or Fe 3+ , and S 2À , as found in the previous study [32]. Thus, the core consists of three Fe 3+ , one Fe 2+ , and 3S 2À ions.
Then, we constructed simple small fragmental models that were extracted from the 3D structure of the proximal cluster core: each of three Fe ions labeled as Fe2, Fe3, and Fe4 form a tetrahedral structure in the [4Fe-3S] proximal cluster, while the other Fe ion labeled as Fe1 form bipyramidal structures together with S Cys19 , S Cys17 , S1, (O atom of the hydroxyl ion), and S2 atoms ( Figure 2).
Thus, we built small models including only the core atoms (i.e., Fe1, S Cys19 , S Cys17 , S1, and S2) in the presence and absence of the OH À and evaluated the total energy values of the models. The analysis revealed that the optimum spin states with the minimum total energies were composed of the high spin states of Fe ions, which is consistent with the previous experimental data [33].
Herein, to specify the spin assignments of the [4Fe-3S] cluster, we represent them employing the nomenclature, BSij; that is, BS is an abbreviation of the broken symmetry state, and i and j indicate the (serial) numbers of Fe ions, as follows. Due to the two constraints, that is, (1) Fe ions take the high spin states as found above, and (2) the total spin sum is 1/2 (experimental data), the possible spin combinations of Fe ions are deduced as 5/2, À4/2, and À 5/2. We adopt the indices i and j that should be corresponding to the spin states of À4/2 and À5/2 (of Fe), respectively [34]. For example, BS12 represents that spin of Fe1 and Fe2 are assigned to the À4/2 and À5/2, respectively, and thus, 5/2 spin state is assigned to Fe3 and Fe4. Thus, BS12 represents (Fe1, Fe2, Fe3, Fe4) = (À4/2, À5/2, +5/2, +5/2).
Based on these considerations, we found that 12 spin assignments are possible for each structural model, and thus, we performed 24 QM calculations, to identify the optimal spin states of the [4Fe-3S] proximal cluster in the presence and absence of the hydroxyl ion. We employed the Gaussian09 package for all QM calculations with the B3LYP functional [3,35]. For the [4Fe-3S] core and atoms that are coordinated to the Fe ions, the triple-ζ valence polarized (TZVP) basis set [36,37] was adopted, and for the other atoms, the 6-311G** basis set [38] was employed. We performed geometry optimization with all hydrogen atoms being movable.
As a result of the analysis, we found that the total energy of BS12, BS21, BS13, and BS31 was smaller than the other states in model 1 and that the total energy of BS12, BS21, BS34, and BS43 was lower than the other states in model 2. Thus, we indicated that the favorable spin assignments were depending on the presence and absence of hydroxyl ion in the proximal cluster.
Note here that in previous studies employing DFT calculations and the simple iron-sulfur clusters, such as [2Fe2S], [3Fe4S], and [4Fe4S], BSij and BSji were shown to be identical [39,40]. However, this equivalence of BSij and BSji was not satisfied in the present case, since the [4Fe-3S] proximal cluster in the MBH is distorted, compared with the simple Fe-S clusters that were analyzed in the previous studies. Moreover, the attachment of the hydroxyl ion (model 2) induced the distinct electronic structures when we compared with those of model 1 and the standard iron-sulfur clusters. Thus, due to the distorted geometrical structure and attachment of the hydroxyl ion, the equivalence of BSij and BSji cannot be assured in the present case. In fact, BS34 and BS43 of models 1 and 2 were definitely different in the total energy by 7.78 and 1.77 kcal/mol, respectively (here, the optimum total energy is set to 0 kcal/mol).
In this manner, we determined the optimal spin states of the proximal cluster of the MBH in the presence and absence of the hydroxyl ion, as the lowest energy states, that is, BS12 and BS34 of models 1 and 2, respectively. In the subsequent part, we describe the electronic structures of these spin states.

Functional role of OH À
In the presence of O 2 , the inactive form is induced with respect to the [NiFe] catalytic site of MBHs (i.e., the Ni-B state). For the reactivation of the catalysis, the [NiFe] active site is required to be changed to another state (i.e., the Ni-SI state) [41]. Two recovery mechanisms have been suggested up to date. First, Volbeda et al. [42] suggested that formation of a dimer enhanced the reactivation of MBHs; that is, one that is inactivated can be recovered by the other that is activated in the dimer. In this mechanism, at least two electrons should be transferred from the activated MBH to the inactivated MBH, and then the received electrons are transferred through the distal, medial, proximal clusters, and the [NiFe] active site [42].
In the second mechanism suggested by Kurkin et al. [43], the reduction of the [NiFe] active site in the presence of H 2 reactivates the inactive MBH, although the process requires a few seconds (note here that the O 2 -sensitive hydrogenases such as the standard [NiFe] hydrogenase require over 1 h to be reactivated). In this reactivation process, H 2 cleavage reaction induces the Ni-SI state from the Ni-B state of the [NiFe] active site, and four electrons should be transferred from the [NiFe] active site to the proximal and medial clusters [43]. Notably, the directions of the electron transfers (ETs) are opposing between these two mechanisms that have been suggested so far.
To describe the differences of the electronic structures of the proximal cluster in the presence (model 2) and absence (model 1) of the hydroxyl ion [31], we focus on the lowest unoccupied molecular orbital (LUMO) here, since LUMO could be closely related to the ET, which is required to occur in both reactivation mechanisms as mentioned above. In fact, comparison of the LUMOs of models 1 and 2 led us to identify the effects of the hydroxyl ion on the LUMOs: In the absence of the hydroxyl ions (model 1), the LUMO was localized on Fe4 and S Cys20 , whereas in the presence of the hydroxyl ion (model 2), the LUMO was delocalized on S Cys17 , S Cys19 , S Cys20 , N Cys20 , the hydroxyl ion, Fe1, Fe2, and Fe4 ( Figure 3A and B).
To further approach the substantial meanings of differences found in the calculated electronic structures, we also investigated the ET pathways by adopting an empirical simulation methodology, which can identify possible ET pathways by minimizing the penalties of the steps mediated by covalent bonds, hydrogen bonds, and spaces [44,45]. Based on the aforementioned two ET processes, which have been suggested so far, that is, (1) the medial to proximal clusters and (2) the [NiFe] active site to proximal cluster, we examined the validity of these two ET pathways.
As a result of the analysis, we revealed that the optimal ET pathways were significantly overlapped with the delocalized LUMOs of model 2, which means that the attachment of the hydroxyl ion to Fe1 may promote the ETs ( Figure 3C). This also means that the attachment of the hydroxyl ion creates the ET pathways in the proximal cluster by inducing the electron delocalization of the LUMO, thus forming Stereoview of the LUMOs in terms of the optimal spin states of models 1 (BS12) (A) and 2 (BS34) (B). In model 1 (i.e., Δ(OH À )), the LUMOs are localized on Fe4 and S Cys20 atoms. By contrast, in model 2 (i.e., involving OH À ), the LUMO is distributed and thus principally composed of S Cys17 , S Cys19 , S Cys20 , N Cys20 , the hydroxyl ion, Fe1, Fe2, and Fe4. To investigate the relationships between the effects of the hydroxyl ion and the ET mechanisms, the plausible ET pathways obtained by an empirical method (i.e., pathway) to search for the ET pathways (C) are compared with the distribution of the LUMO of model 2, which is identical to that shown in panel (B) (note that the directions for rendering of the structures are identical in the panels (A) and (B), while those are different between the panels (A)/(B) and (C)). The blue line with an arrow shows the ET pathway from the [NiFe] active site to the proximal cluster, and the red line with an arrow shows the ET pathway from the medial to proximal clusters. © Kim et al., [31]. the S Cys17 -HO À -S Cys19 -Fe4 segmental electronic field, which is a main component of the delocalized LUMO in the presence of the OH À .
The origin and formation mechanisms of the delocalized LUMO are a very interesting issue. However, the limitation of space here does not allow us to describe it, and so for the detailed descriptions on this issue, see text and Figure 6(a) in our report [31].
To the best of our knowledge, this is the first work to have revealed the mechanisms of creation of the ET pathways in biological macromolecular systems [31]. Note here that a tiny molecular species, that is, OH À , is a trigger to generate the ET pathways. Thus, organisms regulate the functions employing such a subtle factor but thereby dramatically change their physiological status. The present achievements could further be a solid basis toward sophisticated rational design of novel catalysts, reactions, and functional materials, through regulations of the elaborately constituted orbital-based electronic structures.

Hybrid ab initio QM/MM molecular dynamics simulation
In the last section, we presented the effects of the attachment of a hydroxyl ligand on the electronic structure of the proximal cluster in the MBH and also showed that the effects could be closely related to the ET and recovery from the inactive form of the [NiFe] active site (i.e., the O 2 -tolerance of the MBH). Thus, the tiny species dramatically changes the electronic structures of the enzymes, as discussed above.
In this section, we demonstrate that electronic structures are dynamically changed as catalytic reactions proceed. In order to investigate such dynamical properties of electronic structures in biological macromolecular systems, we built a hybrid ab initio QM/MM molecular dynamics (MD) calculation system on superparallel computers. This computational system enabled us to evaluate dynamical transitions of electronic structures in catalytic sites involving the environmental structures such as protein moieties, nucleic acid (RNA and DNA) moieties, membrane moieties, and solvent [17,22,46,47].
We introduce editing reaction of aminoacyl-tRNA synthetase (aaRS), which forms a protein family composed of twenty enzymes, divided into two classes, that is, classes I and II [48]. Here, we focus on two class I aaRSs, that is, leucyl-tRNA synthetase (LeuRS) and valyl-tRNA synthetase (ValRS). We performed hybrid ab initio QM/MM MD simulations and thereby revealed that as the reactions proceeded, dynamical rearrangements of molecular orbitals (MOs) occurred, which was critical for both covalent bond formation and cleavage. We emphasize here that such dramatic transitions of electronic structures would be characteristic in catalytic reactions of biological macromolecular systems, as also indicated in hydrogenase.

Aminoacyl-tRNA synthetases (aaRSs)
In the central dogma, which demonstrates the flow of the genetic information (e.g., from gene coded in genome DNA toward protein), the adaptation between a codon (i.e., combination of three nucleotide bases) and an amino acid (aa) is required in the protein biosynthesis (i.e., translation), while synthesis of messenger RNA (mRNA) (i.e., transcription) occurs based on the rules of base pairing (e.g. the Watson-Crick and wobble base pairs). Thereby, base sequences of genes are converted to amino acid sequences of proteins. Here, aaRSs play a critical role by correctly recognizing and attaching both specific amino acid and cognate tRNA, thereby forming aminoacyl-tRNA (i.e., aa-tRNA aa ). This crucial catalytic reaction is referred to as aminoacylation.
If aaRSs misattach noncognate amino acids (or tRNAs) to the specific tRNAs (or amino acids), then the noncognate amino acids are incorporated into the protein as an incorrect amino acid residue, which is different from the correct one coded in the gene, since the tRNA recognizes its cognate codon by making base pairs between the codon (in mRNA) and the anticodon moiety of the tRNA. It should be noted here that only aaRSs determine the relationship between amino acids and codons for biosynthesis of proteins.
Based on the 3D structures, aaRSs are classified into two classes, that is, classes I and II [49,50]. More specifically, catalytic cores of class I aaRSs contain the classical nucleotide-binding fold (i.e., the Rossmann fold). By contrast, the active sites of class II aaRSs possess an antiparallel β sheet flanked by α helices, which is the architecture completely different from the Rossmann fold. Based on the structural similarity of the catalytic and noncatalytic domains, each class is further classified into three subclasses a, b, and c [51,52].

Editing reaction of aaRSs
As mentioned above, high translational fidelity is essential in the decoding of genetic information from mRNA (i.e., base sequences) to protein (i.e., amino acid sequences). Notably, the aminoacylation reaction of aaRSs consists of two steps; that is, (1) activation of the amino acid yielding aminoacyl adenylate and (2) transfer of amino acid moiety of the aminoacyl adenylate to the 3 0 -end of the tRNA. However, misactivated amino acids and misaminoacylated tRNAs are generated occasionally, since some amino acids are structurally similar (e.g., leucine (Leu), isoleucine (Ile), and valine (Val)).
To ensure the translational fidelity, some aaRSs possess editing functions to correct such errors [53][54][55][56][57][58][59][60]. Correspondingly, two types of editing reactions are known, that is, pre-and posttransfer editing reactions, which hydrolyze a misactivated amino acid and misaminoacylated tRNA, respectively. Herein, we focus on the posttransfer editing reaction, since the catalytic mechanisms were remained to be elucidated for the last some decades.

Ab initio QM/MM MD simulation of editing reaction
For the Leu system, we constructed a structural model of LeuRS in complex with a misaminoacylated tRNA Leu (i.e., valyl-tRNA Leu ), where the 3 0 -end nucleotide (adenine 76; A76) is bound to the active site for the editing reaction in the connective polypeptide (CP) 1 domain [22] (Figure 4). Then, we performed classical MD simulation and succeeded in identification of the nucleophilic water for the editing reaction [23]. Employing this structural model, we performed hybrid ab initio QM/ MM MD simulations with the use of our QM/MM interface program [47] that connects QM and MM calculation engines (i.e., GAMESS [61] and AMBER [62], respectively).
To determine the reaction path, we employed an adiabatic mapping approach, in which hybrid ab initio QM/MM MD simulations were performed to enhance the conformational sampling, together with hybrid ab initio QM/MM geometry optimization being employed to reach the potential energy surface. This scheme enabled us to conduct more effective explorations of both conformations and electronic structures than previous schemes that employ only geometry optimizations [63]. We assumed some possible reaction pathways and performed hybrid ab initio QM/MM MD simulation for each pathway, which provided the estimation of the energy barrier depending on the reaction pathway. Thus, we determined the optimal reaction pathway and thereby elucidated the mechanism of the editing reaction ( Figure 4).
As a result of the analysis, we discovered a novel catalytic mechanism, as follows: the editing reaction was revealed to be driven by the O 30 of the ribose moiety of the 3 0 -end nucleotide A76, which acts as the general base to activate the nucleophilic water. Surprisingly, the editing of the LeuRSÁvalyl-tRNA Leu complex was revealed to be ribozymal [63,65,66].
Furthermore, we found that this ribozyme reaction was enhanced by protein, through the formation of a hydrogen bond with the catalytic core of tRNA Leu . Since the catalytic cores of the conventional protein-dependent ribozymes such as ribosome and group I intron [67] are purely composed of RNAs [64], this finding, that is, direct contributions of the protein moiety on the ribozymal reaction, is novel, and thus, we referred to the LeuRSÁvalyl-tRNA Leu complex as a "hybrid ribozyme/ protein catalyst" (Figure 4).
A very recent experimental study conducted by Dulic et al. experimentally showed that the defective mutation of the O 30 atom (i.e., 3 0 -OH of A76 was replaced with 3 0 -H) significantly reduced the activity of the Leu system ($10 4 -fold rate reduction) ( Table 1) [68], and thus, the hybrid ribozyme/protein catalyst mechanism has been reasserted by the biochemical experiments.
Can we generalize this novel, hybrid ribozyme/protein catalytic reaction mechanism of the LeuRSÁvalyl-tRNA Leu complex? Considering structural similarity of class Ia aaRSs, which involves LeuRS, valyl-tRNA synthetase (ValRS), and isoleucyl-tRNA synthetase (IleRS), they may share a common editing mechanism with LeuRS. However, an experimental conflict has still been left to be resolved as follows. While the aforementioned modification of the O3 0 reduced the editing reaction in the Leu system [63,68], the identical modification was not severe in the reduction of the editing activity with respect to the Val and Ile systems ( Table 1) [69].
To resolve this discrepancy of the experiments, we constructed a structural model of the complex of ValRS and misaminoacylated tRNA [63] and thereby suggested that the hybrid ribozyme/protein catalyst mechanism was also shared in the editing reaction of the Val system [25]. Furthermore, to explain how the variant Val system can maintain its catalytic activity ( Table 1), we constructed a structural model involving the variant tRNA Val in which the 3 0 -OH (reactive) of A76 was shown (stereoview). The crystal structure of the complex (1IVS) is colored yellow (for amino acid and RNA backbones), green (for amino acid side chains), and orange (for nucleic acids). The crystal structure of the isolated CP1 domain (1WK9) is colored light blue (for amino acid backbone) and magenta (for amino acid side chains). (B, C) Schematic representations of fundamental reaction schemes of hybrid ribozyme/protein catalyst (left) and their variant systems. The black circles (broken line) show the catalytic site, and the macromolecules involving the catalysts are colored red. The mechanisms of the editing reactions in both Leu and Val systems are revealed to be common by our recent studies: Interestingly, the editing reaction is ribozymal together with assists of protein moiety (left panels in (B-C)), which is referred to as hybrid ribozyme/protein catalysis [63]. The ribozymal factor (i.e., 3 0 -OH of A76 of tRNA) activates the nucleophilic water molecule (as represented by the red arrow), and the protein (LeuRS and ValRS) moiety promotes the catalysis (the blue arrow) [63]. Nevertheless, in the "defective" mutated systems (i.e., replacements of the aforementioned 3 0 -OH with 3 0 -H), reductions of the editing activities were experimentally revealed to be distinct. In the Leu system (B), the editing activity significantly decreased by the mutation, whereas in the Val and Ile systems, those were preserved. Our modeling studies elucidated that these were due to the absence and presence of compensation mechanisms in the Leu and Val/Ile protein moieties, respectively. Actually, in our previous study, we constructed the atomistic structural model of the Val system (i.e., the ValRSÁthreonyl-tRNA Val (misaminoacylated) complex) and showed that for the Val system with a "defective" ribozymal activator of tRNA Val , the protein (ValRS) moiety could activate the nucleophilic water molecule (the red arrow), which is referred to as the protein enzyme (note that the definition of protein enzyme described here is different from that employed by Cech [64]). As discussed in the text, this transition from the hybrid ribozyme/protein catalyst toward the protein enzyme may fill a gap found in the evolutionary transition from the RNA world to the current RNP world, which could possibly occur in primordial biological macromolecular systems. © Sakabe et al., [25].
replaced with the 3 0 -H (unreactive). Based on this analysis [25], we suggested that the editing activity of the variant Val system is maintained by an amino acid residue of ValRS acting as the possible general base (Figure 4). More specifically, Asp276 of ValRS can replace the functional role of 3 0 -OH of A76 (i.e., activation of the nucleophilic water).
In this manner, the hybrid ribozyme/protein catalysts are operated in both Leu and Val systems. Moreover, this suggested that in the "defective" Val system that contains "unreactive" ribozymal functional group (i.e., 3 0 -OH of A76 is replaced with 3 0 -H), the function of a hybrid ribozyme/protein catalyst can be transferred to a protein enzyme. This could also be related to the evolutional transition from RNA enzymes (the RNA world) to protein enzymes (assisted by RNA) (the RNP world), intermediated by hybrid ribozyme/protein catalysts.
We further suggested that the ribozymal mechanism that we discovered is common in the editing reaction of various aaRS systems beyond the classes ( Table 1) [63]. In fact, Thermus thermophilus IleRS (class Ia) [77], Pyrococcus abyssi ThrRS (class IIa) [72,74,75,78], and Enterococcus faecalis ProRS (class IIa) [71] showed the similar binding mode of the nucleophilic water in the catalytic site. Furthermore, Kumar et al. also suggested that the editing reaction of the complex of prolyl-tRNA synthetase (ProRS) and alanyl-tRNA Pro exhibited a similar mechanism, in which the 2 0 -OH group of A76 of tRNA Pro was involved in the substrate binding and the activation of the nucleophilic water [71].
These data are summarized in Figure 4 and Table 1. In this section, we further discuss the dynamical aspects of the electronic structures in the editing reaction of Leu and Val systems, investigated by hybrid ab initio QM/MM MD simulations of the LeuRSÁvalyl-tRNA Leu and ValRSÁthreonyl-tRNA Val complexes, respectively. Escherichia coli ThrRS acts as a protein enzyme (see Figure 4C).

Dynamic rearrangement of MOs in the editing reactions
For both Leu and Val systems, we suggested that the editing reactions occur in a similar manner [25,63]. Actually, in both systems, the reactions were shown to be initiated by opening of the "H-gate": The H-gate is defined by a dihedral angle, C 40 -C 30 -O 30 -HO 30 of A76, and its opening represents the rotation of the dihedral angle by $100°, which thus leads to the nucleophilic attack of the water molecule. For the Val system, employing the hybrid ab initio QM/MM calculations, we evaluated the electronic structures for the two distinct H-gate states, that is, the opened and closed states [25,63]. The resultant data showed that the LUMO was located in the C atom in both closed and opened H-gate states ( Figure 5). By contrast, the energy levels of the MOs that include the 2p orbital of oxygen atom of the nucleophilic water (O w ) (i.e., the "reactive" MO) were different depending on the two distinct states of the H-gate: In the opened H-gate state, we observed the MO as HOMOÀ6, while in the closed H-gate state, the MO was observed as HOMOÀ11, for which the energy level was much lower compared with that of the former state. In this manner, as the reaction proceeds, the energy level of the reactive MO seems to go up ( Figure 5). Thus, if the nucleophilic attack would be achieved, the energy level of the reactive MO in the nucleophilic water could be raised up to that of the HOMO, which would thus result in hybridization of the HOMO and LUMO. In fact, the similar dynamical rearrangements in the electronic structure were also observed in our hybrid ab initio QM/MM calculations of the editing reaction occurring in the LeuRSÁvalyl-tRNA Leu complex [65,66]. For the Leu system, we further investigated the overall mechanism of the editing reaction employing the hybrid ab initio QM/MM MD simulations. Based on the orbital analysis of the trajectory of the hybrid MD simulations, we found more detailed dynamical properties of the electronic structures ( Figure 6): In the initial stage of the editing reaction, the HOMO did not contain the 2p orbital of the O w atom of the nucleophilic water, even though it attacked the C atom, which seemed to be inconsistent with the frontier orbital theory (FOT) [79].
However, when H-gate was open (state 3), the nucleophilic water approached the C atom, and the energy level of an MO that most contained the 2p orbital of O w was elevated to HOMOÀ9 from the HOMOÀ14 observed in state 1. This elevation decreased the energy difference between the LUMO, which contained the reactive moiety (i.e., atomic orbitals of the carbonyl group of the substrate and the O 20 atom Figure 6. Schematic representation of the reaction states of editing mechanism of LeuRS system is shown in (A). Energy diagram of the editing (B) and molecular orbitals (C) of states 1, 4, 5, 6, and 7 are shown. © Kang et al., [66]. of A76), and the MO involving the 2p orbital of O w , from 7.0 eV to 6.2 eV. In state 4, this energy gap further decreased to 5.4 eV, and the MO involving 2p orbital of O w became the HOMO, while the LUMO remained to be localized on the C atom.
In this stage, both HOMO and LUMO were the reactive MOs responsible for the catalytic reaction, and thus, the electronic structure was fully consistent with the FOT. In fact, the OwÀC covalent bond was formed by the interaction between the 2p orbital of O w (HOMO) and the 2p orbital of C (LUMO). Here, we referred to such dynamical rearrangements of the electronic structure as the dynamical induction of the reactive MOs for the HOMO and LUMO, that is, DIRH and DIRL, respectively [65,66].
In our preliminary studies of other biological macromolecular systems, we have also observed the DIRH and DIRL mechanisms (unpublished data). Future theoretical and experimental analyses are amenable to examine the generality of this picture on dynamical rearrangements of electronic structures occurring in the reaction cycles of biological macromolecular systems. In addition, aaRSs are closely relevant to the molecular evolution and the origin of life, which should thus be considered from ab initio QM calculations. So, the present achievements are also related to those evolutional issues, although we do not describe them herein due to space limitation (for more details, see the literature [25]).

Conclusions
In this chapter, we introduced investigations employing ab initio QM calculations and hybrid ab initio QM/MM MD calculations. For the latter, a catalytic reaction site is considered at ab initio QM level, and the other parts, such as the remainder in protein structures and solvent water molecules, are considered at MM (i.e., classical) level, and thus, we can consider the entire system with reasonable computational costs, to evaluate the electronic structure of the catalytic active site. In both analyses of biological macromolecular systems, we revealed the significant reconstitutions of the electronic structures in the reaction cycles.
In the first example, we demonstrated the detailed electronic structures of a crucial functional site, the proximal cluster, in the MBH, which contains multiple transition metals as [4Fe-3S] and is closely related to the ET. We analyzed the effects of the OH À ion that was experimentally identified in the proximal cluster, to the ET mechanisms. Thereby, we revealed that the OH À ion created the ET pathways by inducing the delocalization of the LUMO of the proximal cluster.
This means that tiny molecular species (e.g., OH À ) can induce dramatic rearrangements of the electronic structure in the biological macromolecular systems, which thus generates the ET pathways. This is the first work to point out the mechanisms to create the ET pathways in biological macromolecular systems. In this manner, organisms regulate the biological functions employing such a subtle factor but thereby dramatically change their physiological status.
In the second example, we investigated dynamical changes of the electronic structures in the catalytic reaction cycles of the LeuRSÁvalyl-tRNA Leu and ValRSÁthreonyl-tRNA Val complexes, employing our hybrid ab initio QM/MM MD calculation system, which is a state-of-the-art theoretical technique to elucidate the functional mechanisms of biological macromolecular systems. As a consequence, we revealed that the dynamical geometrical changes induced the dramatic rearrangemets of the electronic structures.
Thereby, the reactive MOs, which are positioned energetically far from the Fermi levels in the initial stages of the reaction cycles, are dynamically rearranged, but those MOs become the HOMO and LUMO, as the reaction cycles proceed. Thus, this reaction stage is followed by the subsequent phase of the covalent bond formation and cleavage. The obtained picture could be found in functional mechanisms of other various biological macromolecular systems, and thus, the generality of the presented novel picture is further amenable to future theoretical and experimental works. Thereby, this picture could be considered to be a characteristic feature in biological macromolecular systems. structure-based virtual screening and Lead optimization. Frontiers in Chemistry. 2018;6(188) coli seryl-tRNA synthetase at 2.5 Å. Nature. 1990;347(6290):249-255