## Abstract

This review introduces a novel mathematical description of protein assembly. Protein assembly occurs in a substantially open non-equilibrium and non-linear kinetic system. The goal of systems biology is to make predictions about such complicated systems, but few have conducted stability analysis for given systems. Particularly, simulated dynamic behaviors have not been sufficiently verified by kinetic analysis in predicting macromolecular protein interactions and assembly. The non-linearity of protein assembly kinetics is complex, and it is very difficult to determine a model of multi-protein interactions based on numerical calculation. We studied the non-linear kinetics involved in the diffusion process of proteins consisting of two or three species of macromolecules and set a novel model in which non-linearity is given by the diffusion coefficient that depends on the protein concentration. By making the diffusion coefficient concentration-dependent, non-linearity leads to a simple system model. Protein assembly is initiated by monomeric protein interactions and regulated by cofactors such as guanidine triphosphate (GTP) or adenosine triphosphate (ATP) binding to the monomer. This cofactor concentration promotes the dynamic behavior of protein assembly and can be treated as an order parameter. Further, kinetic stability analysis in the center manifold theory (CMT) is introduced for analyzing the behavior of the system around the critical state. Although CMT has not been sufficiently applied for stability analysis of protein assembly systems, this theory predicts the dynamic behavior of the assembly system around the critical point using concentration as a cofactor. Protein assembly theory will provide a novel framework for nonlinear multi-parametric analysis.

### Keywords

- protein assembly
- center manifold theory
- tubulin
- non-linear kinetics
- non-equilibrium state
- diffusion coefficient
- oscillation

## 1. Background

Protein assembly is essential for cellular activities such as cell signaling, gene expression by transcription factor complexes, cytoskeleton formation, endocytosis, and cell motility. This reaction system is one of the complicated systems and non-linear kinetics has been applied for understanding the dynamic behavior [1, 2, 3, 4, 5, 6, 7, 8]. Among the protein assembly, tubulin and actin polymerization are well-known events that have been analyzed using numerous methods [9, 10, 11, 12, 13]. Microtubule and actin filaments consist of monomers that bind the cofactors guanidine triphosphate (GTP) or adenosine triphosphate (ATP) to acquire assembly activity [14, 15, 16]. In general, a protein interaction is controlled by an electric charge on the amino acid residue(s) of the protein, such as tyrosine, serine, and threonine, by covalently binding to or reacting with a cofactor such as ATP or GTP; subsequently, the monomer loses its interaction activity by hydrolyzing ATP or GTP into ADP or GDP or through dephosphorylation, which is mediated by the other protein’s phosphatase activity or its own enzymatic activity [17, 18].

In particular, * dynamic instability*in tubulin polymerization has been extensively investigated.

*signifies the intermittent transition between slow growth and rapid shrinkage in polymeric assemblies of microtubules [9, 10, 11, 12, 13, 19]. Further, intra-polymeric Brownian motion and fluctuation influence the structure and elasticity of tubules [20]. Zapperi and Mahadevan presented an excellent model in which the ratio of longitudinal to lateral interactions characterizes the assembly [21]. Hammele et al. presented a physical model and suggested the physical properties of the microtubules [22]. Nucleation is the rate-limiting step controlling the overall polymerization process [18]. The stable nucleus for polymerization is oligomers, and the growth of aggregates through elongation/dissociation follows. For stable growth, tubule lifespan is controlled by a GTP-cap that forms at their ends [19].*Dynamic instability

As another example of protein interactions, in the mitogen-activated protein kinase (MAPK) signaling cascade, a set of protein kinases and protein substrates construct the signaling network. The cofactor ATP/GTP transfers biological information in the reaction network to alter gene expression [6, 7, 8]. Mathematical models of this cascade have demonstrated that the system can act as an ultra-sensitive switch based on a combination of phosphorylation of protein substrates and implicit feedback, leading to multi-stability [23, 24, 25]. Recently, Ueno et al. reported that a model of MAPK signaling cascade functions as a band-path filtering system.

## 2. Protein interaction kinetics

### 2.1. Protein interaction model

Steps in protein assembly of microtubular polymerization are summarized as follows: (i) the protein achieves an interaction active state by reversibly binding to a cofactor, which provides the protein with assembly or interaction activity; the protein interaction activity decreases when a hydrolyzed inactive cofactor is bound compared to an active cofactor; (ii) the protein can hydrolyze the cofactor; (iii) the protein can exchange the inactive cofactor, such as ADP, with an active cofactor; and (iv) active cofactors are supplied continuously and externally. Thus, the interaction activity is self-limiting, in which the protein itself limits assembly activity, resulting in dynamic instability [10, 11, 12, 13, 19, 23, 26, 27, 28].

Let us consider a three-monomer model in which an active cofactor-binding protein (* X*), oligomeric protein (

*), and inactive cofactor-binding protein (*W

*) coexist. The sum of monomeric proteins is kept constant; this kinetics is attributed to a two-parametric analysis. First,*Z

*can reversibly associate with the oligomer:*X

Subsequently, * Z*is released from the oligomer:

Here, Pi signifies the released inorganic phosphate. Finally, in the presence of sufficient active cofactor, * Z*releases the inactive cofactor, ADP or GDP (

*), binds an active cofactor, ATP or GTP (*I

*), and recovers its interaction activity, returning to the protein*P

In addition, a monomeric protein has the potential to hydrolyze the cofactor by interaction:

Formula (5) represents a self-reproducing reaction, which yields non-linear kinetics in this reaction system.

### 2.2. Concentration dependence of protein diffusion

Diffusion of proteins plays an important role in protein interactions. Analysis of dilute solutions of a macromolecule requires a greater understanding of the concentration dependence of the diffusion rate because of the hydrodynamics of protein solutions involving mutual diffusion of protein molecules. One of the approaches is applicable to the study of self-diffusion in solutions [29, 30, 31, 32]. In fact, proteins interact or associate with other monomeric proteins and phosphorylate or are phosphorylated by the proteins. In dilute solution, proteins may diffuse in a free manner with sufficiently large vacant space that accounts for only a fraction of the volume of a protein molecule. These vacancies are sufficiently large to be occupied by proteins that are as large as the hydrodynamic volume. The effects of molecular shape and size, solvent, and environment such as ion intensity and pH determine the concentration dependence of the diffusion rate. Here, the diffusion rate _{0} is set as the probability of vacancy formation and does not depend on the velocity at which a monomer diffuses. The probability * P*is given by a void adjacent to the objective protein, which is sufficiently large to permit diffusion:

D = D

_{o}P.

The probability * P*(

*) of forming a vacancy of volume V in the solution is given by*V

where * V*is protein exclusion volume, c is the protein solution concentration, and β is a constant that reflects the effects of interactions with other macromolecules and shape on the probability of vacancy formation. Thus,

_{e}

*depends on the size and shape of the protein and likely other factors*D

At low concentrations of the macromolecule

Thus, the dependence of the diffusion coefficient can be described using the protein solution concentration (Figure 1).

### 2.3. Viscosity and diffusion coefficient of protein

The compatibility of Eq. (8) with the experimental data strongly suggests that the concentration dependence of the protein diffusion constant is governed by excluded volume interactions, which may be predicted by calculating equilibrium protein density fluctuations. Eq. (8) is consistent with the equation describing the viscosity * η*of concentrated protein solutions

where η_{0} is the solution viscosity at infinite dilution, η is the intrinsic viscosity of the solution, * c*is the protein solution density, and

*/*k

*is a constant that corrects for the overlap of free volume*v

*.*v

### 2.4. Diffusion-limited protein association

Because protein diffusion in the viscous cytoplasm is significantly slower than other reactions between interactions with other molecules, the protein assembly reaction is a diffusion-controlled or diffusion-limit process. In general, the diffusion rate of monomer is given by the concentration gradient that is the molar flux per unit area, * J*. This is the additional rate of monomer

*towards oligomer*X

*multiplied by the area of the spherical surface of radius*W

*of the reactive end of polymer or oligomer*R

In Fick’s first law, the flux towards * X*is proportional to the concentration gradient at radius of macromolecule

R

By substitution of Eq. (10) into Eq. (11), the rate of reaction v is given as

The rate of the diffusion-controlled reaction is equal to the average flow of * X*molecule to all

*molecules. Accordingly, the global flow of all*W

*to*X

*is 4*W

*Similarly, the flow of all*π R*D

_{X}N

_{A}XW.

*to*W

*is 4*X

*Further, using the sum of the diffusion coefficients of the two species, the diffusion coefficient is rewritten as*π RD

_{W}N

_{A}WX.

*(*D =

*)/2. Then, the addition rate of*D

_{X}+ D

_{W}

*to*X

*is given by*W

In reality, the diffusion rate of oligomer is negligible relative to that of monomer, and the addition rate is given as

and

Using the above formula, the kinetic rate of * c*and items of interaction between monomers is as follows [8]:

_{i}

where * f*(

*) represents the sum of the reaction kinetic items of*c

_{j}

*th species,*j

*, except the diffusion process. When the potential of the electric field,*c

_{j}

*(r), is considered, the flux is described using spatial coordinate*ϕ

r

In the solvent including two chemical species, the relative movement is related to the local concentration gradient of * c*and potential energy. When the gradient is described using Eq. (17), monomer

_{j}

*moves across the sphere with radius*X

*surrounding*R

*and can be described by*W

when * X*and the active site on the oligomer or polymer

*interact to assemble or elongate the polymer, which is determined by*W

*The total flux will be equivalent to the chemical reaction rate using an arbitrary kinetic coefficient*R.

k

At the steady state, flux across the sphere with a radius, or a shape parameter, * r*is constant for any values. Accordingly,

*in Eq. (19) is replaced with*R

*using Eq. (18)*r

By integration and rearrangement of Eq. (20)

and here

Because * r- > ∞*,

*(*V

*) approaches zero*r

Here, * X* = (

_{r=R}

*)*k/k

_{R}

*In calculating the above formula, fluctuations of diffusion coefficients are neglected when the concentration of*X.

*is sufficiently low and kept constant during the reaction. The kinetic coefficient is controlled by the diffusion process and the aggregation or assembly is completed through interactions between*X

*and*X

*;*W

X

_{r=R}is set to zero. Then,

*is given by Eq. (23):*k

The diffusion coefficients can be altered in proportion to the fluctuation of monomeric protein concentration [9, 10, 11, 12, 13, 14]. In the derivation of Eq. (23), the diffusion coefficient is related to the viscosity * η*by the Einstein-Stokes formula

By using the Gibbs-Duhem expression, the diffusion coefficient * D*of one macromolecule can be written as

where * T*is the temperature of the solution,

*is the Boltzmann constant, and*k

_{B}

η

_{1}is the frictional coefficient of the macromolecule in solution.

A

_{1}is the second virial coefficient,

v

_{1}is the partial specific volume of protein with molecular weight

M

_{1}, and

*is Avogadro’s number. The small letter*N

_{A}

c

_{1}denotes the concentration of the solute macromolecule. Then, dependency of the diffusion coefficient on the

*th component,*i

*, is as follows from (26):*D

_{i}

where * v*is the partial specific volume of the polymer with molecular weight

_{j}

*.*M

_{j}

Further, the diffusion coefficient is given by extending the above formula to a mixed solution of two macromolecules, * X*and

Z

where * v*and

_{X}

*are the partial specific volumes of*v

_{Z}

*and*X

*with molecular weights*Z

*and*M

_{X}

*, respectively.*M

_{Z}

*and*A

_{X}

*are the second virial coefficients.*A

_{Z}

### 2.5. Fluctuation of diffusion coefficient

Subsequently, let us consider the fluctuation of participant proteins using lowercase letters

Here, the sum of the kinetic rate of fluctuations is constant because total amount of monomeric protein is constant

The dependency of the diffusion rate refers to the high interaction activities of * X*and low interaction activities of

*. An increase in*Z

*contributes to a decrease in the diffusion coefficients*X

*and*D

_{X}

*in fluctuation items because of the higher interaction activity that reduces diffusion; in contrast, an increased*D

_{Z}

*contributes to increased diffusion coefficients because the interaction between increased*Z

*induces lower assembly activity for the interaction with monomeric proteins, resulting in increased mobility of monomeric proteins. This dependency gives the non-linearity fluctuation items in the kinetic equation. Here, all preparations for kinetics are completed.*Z

### 2.6. General theory of non-linear kinetic equation of protein assembly

According to the above simple reaction cascade, (1)-(5), the kinetic equation contains the concentrations of monomeric proteins as variables. For simplification, the equations are written as follows:

Here,

Further, for simplicity, Eqs. (34) and (36) are given by replacing the kinetic coefficients with arbitrary coefficients

Here, _{1}_{1}* W* =

k

_{1}

*,*’

k

_{4}

D

_{4}

*=*W

k

_{4}

*,*’

k

_{5}

D

_{5}=

k

_{5}

*, and*’

*=*p

k

_{3}

*At the steady state, setting the right hands of Eqs. (32)-(34) equal to zero*P.

Here, the fluctuation of diffusion coefficients are given by

By altering * X*,

*,*Z

k

_{1}

*,*’

k

_{4}

*and*’,

k

_{5}

*into*’

*,*X + x

*,*Z + z

k

_{1}

*,*’ – ax + bz

k

_{4}

*, and*’ − cx + dz

k

_{5}

*in Eqs. (36) and (37) and arranging, two dependent equations are obtained*’ − ex + fz

Accordingly, the overall behavior of the kinetics of protein assembly is given by the monomeric kinetics of * x*and

*.*z

fxz

^{(‘)},

fzx

^{(‘)},

fxx

^{(‘)}, and

fzz

^{(‘)}represent the assembly activity between

*and*X

*,*Z

*and*Z

*,*X

*themselves, and*X

*themselves.*Z

In reality,

Thus, we determined a general formula for protein assembly. Cytoskeletal protein and protein complexes in the signaling cascade can be described using this formula.

### 2.7. Linearity of cofactor supply to the assembly system

While protein assembly is a non-linear reaction involving a complicated set of reactions or assembly steps, the supply of cofactor is simply given by the linear kinetic rate items as shown in Eqs. (36, 37, 40 and 41). This means that the supply rate will be essentially be an order parameter of the assembly system and is controllable by altering the concentration of the cofactor (Figure 2). Accordingly, paramete* r p*is variable in the numerical simulation, as described below.

## 3. Calculus simulation of concentration oscillation

### 3.1. Oscillation of monomer concentration fluctuation

In actual simulation of protein assembly, numerical calculation was performed over a sufficiently long period to evaluate the trend in system behavior.

Simulation: A simulation was performed using Mathematica® version 8 (Wolfram Research, Champaign, IL, USA).

Simulation was performed with the notation in Eqs. (42) and (43), in which p is (a) 0.8, (b) 0.81, and (c) 1.00 (Figure 3).

Below is the simulation program cord using Mathematica ver. 8 when p = 0.8:

D1 = 0.27, k2 = 0.00035, a = 790, b = 650, c = 105, d = 105, e = 105, f = 105, p = 0.8, Dxx = 155, Dxz = 155, W = 1.

X = k2/D1.

Z = (k2 (D1^2 W + Dxx k2))/(D1 (D1 p - Dxz k2)).

NDSolve[{x’[t] == − (W (D1 - a X) + 2 X Dxx + Dxz Z) x[t] + (W a - Dxx + 2 c X + e Z) x[t]^2 + (p - Dxz X - b X - d X^2 - f X Z) z[t] - (Dxz + W b - e X + f Z) x[t] z[t] - (f X) z[t]^2,

z’[t] == (2 X Dxx + Dxz Z - c X^2 - e X Z) x[t] + (Dxx - 2 c X - e Z) x[t]^2 + (Dxz + 2 X d - e X + f Z) x[t] z[t] + (Dxz X - p + d X^2 + f X Z) z[t], x[0] == 0.000001, z[0] == 0.000001}, {x, z}, {t, 0, 30,000}, MaxSteps - > 50,000].

g001 = Plot[{X + x[t]} /. %, {t, 0, 200}, PlotRange - > All, PlotStyle - > {RGBColor[1, 0, 1]}, PlotRange - > ALL].

g002 = Plot[{Z + z[t]} /. %%, {t, 0, 200}, PlotRange - > All, PlotStyle - > {RGBColor[0, 1, 1]}, PlotRange - > All].

g003 = ParametricPlot[Evaluate[{X + x[t], Z + z[t]} /. %%%], {t, 0, 2000}, PlotRange - > All, AxesLabel - > {“X”, “Z”}] Show[g001, g003, AxesLabel - > {“t”, “X, Z”}].

As a result, regular oscillation with fluctuations in the amplitude and frequency can be illustrated in the plot following the above calculation. Plots on the right side that the oscillation becomes definite within a limit-cycle when p = 1.5.

### 3.2. Oscillation frequency

In a previous study [6], an interesting relationship was observed between the average frequency of simulated oscillation of the monomeric proteins and difference * p − p*. The frequency was nearly equivalent, but with irregular fluctuations, except for during the initial phase. The relationship between average frequency < f > and

_{c}

*−*p

*is given by:*p

_{c}

These formulae imply that the amplitude of monomeric protein fluctuation provides information regarding the outside alteration of the cofactor. Thus, outside alteration is transformed inside into the information of assembly.

## 4. Center manifold theory (CMT)

### 4.1. Center manifold formulae

For stability analysis of the nonlinear dynamics in protein assembly, the center manifold theory (CMT) for non-linear dynamic biological systems has been applied. Simulation is oriented to analyze the behavior around critical values of the order parameter. CMT has been applied to the Lotka-Volterra model of the predator-prey system to provide important simulation results [33, 34]. However, the CMT can be applied to the protein assembly model. For stability analysis around the critical point, Eqs. (40) and (41) were formulated. When * p*is equivalent to

*, the Jacobian matrix*p

_{c}

*for (*L

*,*x

*) is given using the linear coefficients of (*z

*,*x

*) in Eqs. (40) and (41):*z

Particularly, the function * f*(

*) represents the input and output of the cofactor that is the order parameter. Using the eigenvectors of*p

*, (*L

l

_{1},

l

_{2}), coordinate transformation into

*and*u

*is performed as follows:*v

The above formulae are subsequently set as

The center manifold around the critical point (* p* =

p

_{c}) is then given as follows:

The effect of changing * p*and

*(*ε

*=*p

p

_{c}) is analyzed using the center manifold around the critical point of the system. Subsequently

Solving Eq. (52) gives the coefficients of * a*in Eq. (50):

_{i}

a

_{3}=

a

_{7}= 0. Substituting

*in Eq. (51) given by*u

*and*ν

*into*ε

*(*f

_{v}

*,*u

*) in Eq. (47), the kinetic stability equation is given for fluctuation*v

*using the coefficients*ν

*(*n

_{i}

*= 1, …, 7) as follows:*i

Independently of the numerical values in Eq. (53)

Using this result, we have

By setting the left-hand side of Eq. (55) equivalent to zero, a Hopf-bifurcation of the given system is shown

Further approximate solutions to Eq. (56) are given as

Thereafter, the solution * cε*increases as the concentration of ATP/GTP increases. From Eq. (51),

*is formulated using an arbitrary coefficient*u

c

This result implies that the fluctuation has two different values for the amplitude of an oscillation.

## 5. Prospects of protein analysis assembly

Our aim in this review was to evaluate the association between the non-linearity in protein diffusion and assembly. Here, we reviewed studies of protein assembly or interactions [1, 2, 3, 4] and performed mathematical analysis of the model in addition to numerical simulations. Because the assumptions of the model are minimal, the simulation provides insight into assembly. The results are summarized as follows: (i) the non-linear kinetic equations including only two independent parameters may reveal dynamic behavior in the fluctuation of the monomer concentration, (ii) the increase near the critical concentration of the cofactor induces oscillations in amplitude and frequency; and (iii) center manifold analysis predicts the stability of the model system near the critical concentration, showing bifurcation with respect to the cofactor supply value. The behavior of the system shown in the simulation indicates that the concentration change information of a cofactor outside the system is transduced into another type of information, e.g., frequency of the concentration oscillation of the monomer. A small increase in the outside cofactor concentration induces an oscillation change inside the monomeric protein, which may be crucial for responding to transformations in the outside environment. Such a trajectory in the observed oscillation resembles a limit-cycle-like in the well-known two-parametric Lorenz model [13, 14].

Previous systems biology models did not focus on the diffusion process of a protein in the cytoplasm. Non-linearity in the process is critical and essential to protein assembly. Before considering a set of simultaneous kinetic equations, non-linearity in the diffusion process should be considered, as * de novo*nucleation is negligible compared with the reaction of the monomer and oligomer. The interaction between assembly-active monomer proteins attenuates the diffusion rate in a non-linear fashion because they can assemble, which inevitably yields non-linearity. As shown in this review, CMT is useful for reducing the parameters of detailed stability analysis around the critical state.

## 6. Conclusion

Protein interactions play an important role in various biological activities at the cell level. Although protein diffusion is a rate-limiting-step, cell behavior is orchestrated by protein interactions, and signaling transduction, the cytoskeleton, and cell motility are dynamically altered in processes similar to “phase transitions” in inorganic chemical reactions. This review provides a model of phase transition affected by minimal changes in cofactor concentration; but oscillation and bifurcation are inducible by the simple model. Systems biology multi-parametric analysis remains important; however, a simple model sufficiently can better illustrate oscillations in protein concentration with a limit-cycle. Outside alteration such as cofactor concentration change is transformed inside into the information of assembly. Stability analysis using the CMT is a simple method for understanding protein-interacting systems.

## Acknowledgments

This work was supported by the Ministry of Education, Culture, Sport, Science and Technology, Japan, under the project name “Synergy of Fluctuation and Structure.” Project number 2502. The funders had no role in the study design, data collection, and analysis; decision to publish; or preparation of the manuscript.

## Conflicts of interest

The authors have declared no conflicts of interest.