## 1. Introduction

Governments worldwide increasingly attach considerable importance to estimating the biomass and carbon storage of forest ecosystems in the context of global climate change. To help countries conduct national greenhouse gas inventories, forest biomass estimation and carbon stock assessment, the Intergovernmental Panel on Climate Change (IPCC) provided in 2003 carbon-accounting parameters such as biomass expansion factors (*BEF*) and root-to-shoot ratios (*RSR*) for different geographic zones [1]. However, applying these parameters to biomass estimation probably involves great uncertainty. Developing individual tree biomass models and parameters for national monitoring and assessment of the biomass and carbon storage of forest ecosystems has therefore become fundamentally important.

The earliest research on forest biomass abroad can be traced to the 1870s [2]. In recent years, biomass models for major tree species in America, Canada and some European countries have been developed or improved [3–11]. Their purpose was to assess and monitor forest biomass and carbon storage and to provide a basis for evaluating the contribution of forest ecosystems to the global carbon cycle. Studies on forest biomass in China began only in the late 1970s, when the first related articles were published [12, 13], i.e., a century after the earliest study abroad. For historical reasons, China did not participate in the International Biological Program (IBP), initiated by the International Union of Forest Research Organizations (IUFRO), during the period 1964–1974 and thus missed the golden stage in the development of forest biomass research [14].

Over nearly 40 years of forest biomass modeling in China, three stages can be distinguished: the first, toward the end of the twentieth century, focused on estimating the biomass and productivity of major forest types [13, 15–30]; the second, since the beginning of the current century, focused on assessing carbon storage in Chinese forest ecosystems [31–37]; and the third is the new stage of monitoring and assessing forest biomass and carbon storage at provincial and national levels [14, 38]. To monitor forest biomass and carbon storage in the National Forest Inventory (NFI) system, the National Forest Biomass Modeling Program has been implemented since early 2009. To date, many papers on modeling individual tree biomass have been published [39–51]. These studies classified 70 modeling populations for developing individual tree biomass models, determined the sample structure of each population and examined modeling methods including nonlinear error-in-variable simultaneous equations, mixed-effects modeling, dummy variable modeling and segmented modeling. In addition, logarithmic regression and weighted regression were analyzed [52], and the goodness-of-fit evaluation and precision analysis of biomass models were studied [53]. Based on these results, two ministerial standards on technical regulations and five ministerial standards on biomass models have been approved for application [54–60]. More ministerial standards on biomass models for other tree species are expected to be published in the near future.

The published papers and ministerial standards show that aboveground and belowground biomass models were developed separately, owing to unequal sample sizes, and that most studies were based on sample trees of only one tree species. In this chapter, the author uses measurement data on aboveground and belowground biomass from 4818 and 1626 destructively sampled trees, respectively, of eight major tree species. The main purpose is to develop an integrated individual tree model system for aboveground and belowground biomass, biomass conversion factor (*BCF*) and root-to-shoot ratio (*RSR*), using nonlinear error-in-variable simultaneous equations with a dummy variable. The system ensures that the aboveground biomass models are compatible with the stem volume and *BCF* models and that the belowground biomass models are compatible with the aboveground biomass and *RSR* models. In addition, generalized dummy-variable models of aboveground and belowground biomass for the eight tree species are established and compared, and the eight species are ranked by aboveground and belowground biomass on the basis of the species-specific parameter estimates.

## 2. Materials and methods

### 2.1 Data

Between 2009 and 2013, a total of 4818 sample trees from 31 modeling populations of eight major tree species or species groups, namely, *Picea* spp., *Abies* spp., *Betula* spp., *Quercus* spp., *Populus* spp., *Larix* spp., *Cunninghamia lanceolata* and *Pinus massoniana*, which together account for more than 60% of the forest volume in China [39], were felled for aboveground biomass measurement. For each modeling population, the sample trees were evenly distributed among ten diameter classes of 2, 4, 6, 8, 12, 16, 20, 26, 32 and more than 38 cm, with about 15 sample trees per diameter class selected as evenly as possible across height classes. For example, if three height classes were defined (low, intermediate and high), then five sample trees were selected in each height class. For each sample tree, the diameter at breast height was measured in the field. After the tree was felled, total trunk length (tree height, from ground level to the top) and live crown length were also measured. The trunk was divided into 11 sections at points corresponding to 0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9 of tree height. The base diameter of each section was measured, and the tree volume (total volume over bark) was computed using Smalian’s formula [61]: *V* = (*A*_{1} + *A*_{2})/2 × *L*, where *V* is the volume of a trunk section, *A*_{1} and *A*_{2} are the cross-sectional areas at the small and large ends of the section and *L* is the section length. The fresh weights of stem, branches and foliage were also measured, and subsamples were selected and weighed in the field [54]. Among all sample trees, about one third (1626 trees) were selected for measuring both aboveground and belowground biomass. For these trees, the whole root system was excavated; the fresh weights of the stump, coarse roots (more than 10 mm) and small roots (2–10 mm, not including fine roots less than 2 mm) were measured separately and subsamples were selected. In the laboratory, all subsamples were oven-dried at 85°C until a constant weight was reached. Each component biomass was computed from the ratio of dry weight to fresh weight, and the aboveground biomass of the tree was obtained by summing the components [54]. **Table 1** summarizes the biomass samples of the eight major tree species or groups.
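
The sectional volume computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used in the study: the function names, the example diameters and the choice to close the last section at the tree tip with a diameter of 0 are all assumptions made here.

```python
import math

def cross_section_area_m2(diameter_cm):
    """Area (m^2) of a circular cross-section with the given diameter (cm)."""
    radius_m = diameter_cm / 200.0
    return math.pi * radius_m ** 2

def smalian_stem_volume_dm3(heights_m, diameters_cm):
    """Sum the Smalian section volumes V = (A1 + A2) / 2 * L along the stem.

    heights_m    -- measurement heights above ground, ascending (m)
    diameters_cm -- over-bark diameters at those heights (cm)
    Returns the total volume in dm^3 (1 m^3 = 1000 dm^3).
    """
    if len(heights_m) != len(diameters_cm) or len(heights_m) < 2:
        raise ValueError("need two lists of equal length with at least two points")
    total_m3 = 0.0
    for i in range(len(heights_m) - 1):
        area_1 = cross_section_area_m2(diameters_cm[i])
        area_2 = cross_section_area_m2(diameters_cm[i + 1])
        length_m = heights_m[i + 1] - heights_m[i]
        total_m3 += (area_1 + area_2) / 2.0 * length_m
    return total_m3 * 1000.0

# Relative heights from the text, plus the tip (assumption: diameter 0 at the top).
RELATIVE_HEIGHTS = [0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# Hypothetical 20 m tree; these diameters are illustrative, not measured data.
tree_height_m = 20.0
example_diameters_cm = [30, 27, 25, 23, 21, 19, 17, 14, 11, 8, 4, 0]
example_heights_m = [r * tree_height_m for r in RELATIVE_HEIGHTS]
example_volume_dm3 = smalian_stem_volume_dm3(example_heights_m, example_diameters_cm)
print(f"stem volume over bark: {example_volume_dm3:.1f} dm^3")
```

Returning dm^3 keeps the result in the unit used for stem volume in the models that follow.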

### 2.2 Model construction

The general form of individual tree biomass and stem volume models is as follows [45, 62]:

$$y = \beta_0 x_1^{\beta_1} x_2^{\beta_2} \cdots x_j^{\beta_j} + \varepsilon \qquad (1)$$

where *y* is biomass (kg), *x*_{j} are predictive biometric variables, which reflect the dimensions of a tree, such as diameter at breast height *D* (cm) and tree height *H* (m), *β*_{j} are parameters and *ε* is the error term. Because the biomass data are significantly heteroscedastic, some measures should be taken to eliminate heteroscedasticity prior to parameter estimation. In this paper, weighted regression was applied and the specific weight functions were derived from the residuals of independently fitted models by ordinary least squares regression [62, 63]. Since models based on one (*D*) or two variables (*D* and *H*) have been commonly used, this paper develops both one- and two-variable models. The aboveground biomass, belowground biomass and stem volume models based on two variables can be expressed respectively as:

$$M_a = a_0 D^{a_1} H^{a_2} + \varepsilon \qquad (2)$$

$$M_b = b_0 D^{b_1} H^{b_2} + \varepsilon \qquad (3)$$

$$V = c_0 D^{c_1} H^{c_2} + \varepsilon \qquad (4)$$

where *M*_{a} and *M*_{b} are aboveground and belowground biomass (kg), respectively; *V* is stem volume (dm^{3}); *a*_{i}, *b*_{i} and *c*_{i} are parameters; and other symbols are the same as above.

#### 2.2.1 Integrated compatible model systems

The aboveground biomass is related to stem volume through the biomass conversion factor (*BCF*), which is equal to the biomass expansion factor (*BEF*) multiplied by basic wood density following the IPCC’s approach [64]. Because the *BCF* is an important parameter for forest biomass estimation [65], it is common to develop an aboveground biomass model and a *BCF* model that are both compatible with the stem volume model [45, 51]. Similarly, belowground biomass is connected with aboveground biomass through the root-to-shoot ratio (*RSR*) [66, 67]. Because the *RSR* is also an important parameter for forest biomass estimation, a belowground biomass model and an *RSR* model compatible with the aboveground biomass model are generally developed simultaneously [44]. Therefore, an integrated aboveground and belowground biomass model system can be developed using the nonlinear error-in-variable simultaneous equation approach [51, 68]. Because belowground biomass was observed for only about one third of the trees with aboveground biomass observations, a dummy variable (*x*) was required to identify them: *x* = 1 for trees with a belowground biomass observation and *x* = 0 for trees without one [69]. The system ensures compatibility among aboveground biomass, belowground biomass, stem volume, *BCF* and *RSR*. The one- and two-variable integrated systems are as follows, respectively:

where *M*_{a}, *M*_{b}, *V*, *BCF* and *RSR* are aboveground biomass, belowground biomass, stem volume, biomass conversion factor and root-to-shoot ratio, respectively, which are treated as error-in-variables; *D* and *H* are diameter at breast height and tree height, which are treated as error-free variables; *x* is a dummy variable indicating whether a belowground biomass observation is available; and *a*_{i}, *b*_{i} and *c*_{i} are parameters.
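
Compatibility can be read as two identities: *M*_{a} = *V* × *BCF* and *M*_{b} = *M*_{a} × *RSR*. The Python sketch below illustrates this under stated assumptions: the three base models take the two-variable power form described in Section 2.2, *BCF* and *RSR* are taken as the ratios *M*_{a}/*V* and *M*_{b}/*M*_{a}, and all parameter values are hypothetical placeholders rather than estimates from this chapter. It shows only why ratio models derived from the same parameters stay consistent; it does not reproduce the fitting procedure or the role of the dummy variable *x*.

```python
import math

# Hypothetical parameter values, for illustration only (not estimates from this chapter).
A = (0.06, 2.0, 0.5)  # a0, a1, a2: aboveground biomass M_a (kg)
B = (0.02, 2.2, 0.1)  # b0, b1, b2: belowground biomass M_b (kg)
C = (0.07, 1.9, 0.9)  # c0, c1, c2: stem volume V (dm^3)

def power_model(params, diameter_cm, height_m):
    """Evaluate coef * D**d_exp * H**h_exp."""
    coef, d_exp, h_exp = params
    return coef * diameter_cm ** d_exp * height_m ** h_exp

def ratio_model(numerator, denominator, diameter_cm, height_m):
    """Ratio of two power models, which is again a power model in D and H."""
    coef = numerator[0] / denominator[0]
    d_exp = numerator[1] - denominator[1]
    h_exp = numerator[2] - denominator[2]
    return coef * diameter_cm ** d_exp * height_m ** h_exp

def predict(diameter_cm, height_m):
    """Return all five quantities from one shared set of parameters."""
    return {
        "Ma": power_model(A, diameter_cm, height_m),
        "Mb": power_model(B, diameter_cm, height_m),
        "V": power_model(C, diameter_cm, height_m),
        "BCF": ratio_model(A, C, diameter_cm, height_m),  # M_a / V
        "RSR": ratio_model(B, A, diameter_cm, height_m),  # M_b / M_a
    }

tree = predict(20.0, 15.0)
print(math.isclose(tree["V"] * tree["BCF"], tree["Ma"]))   # volume -> aboveground biomass
print(math.isclose(tree["Ma"] * tree["RSR"], tree["Mb"]))  # aboveground -> belowground
```

Because the ratio models reuse the parameters of the base models, any prediction chain (volume to aboveground biomass to belowground biomass) returns the same values as the direct models.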

Various methods have been used to estimate the parameters of simultaneous equations. Parresol [63] used seemingly unrelated regression (SUR) to ensure the additivity of simultaneous biomass equations. Tang et al. [70] further developed an error-in-variable modeling approach to estimate the parameters of simultaneous equations, which has been widely used in recent years [40, 45, 49, 51]. In this study, the error-in-variable simultaneous equation approach was used to estimate the parameters of the integrated systems based on maximum likelihood estimation through ForStat software (statistical software with analytical tools for forestry as well as general statistical procedures, developed at the Chinese Academy of Forestry, Beijing, China) [68].

In addition, weighted regression was used to eliminate the heteroscedasticity commonly exhibited in biomass and volume data. The specific weight functions were derived from the residuals of the biomass or volume equations fitted by ordinary least squares (OLS) [52, 62]. For biomass conversion factor and root-to-shoot ratio modeling, OLS regression was used directly to estimate the parameters because the *BCF* and *RSR* data were mostly homoscedastic.
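
The two-step idea (fit by OLS, derive a weight function from the residuals, refit with weights) can be sketched as follows. This is a simplified illustration, not the ForStat procedure: it assumes a one-parameter model *M* = *a* × *D*^{7/3} with the exponent fixed for convenience, assumes the residual variance follows a power of *D*, and uses synthetic data. None of these choices is confirmed as the weight function used in the study.

```python
import math

EXPONENT = 7.0 / 3.0  # illustrative fixed exponent, so the fit has a closed form

def fit_scale(diameters, biomasses, weights=None):
    """(Weighted) least squares estimate of a in M = a * D**EXPONENT."""
    if weights is None:
        weights = [1.0] * len(diameters)  # plain OLS
    numerator = sum(w * d ** EXPONENT * m
                    for w, d, m in zip(weights, diameters, biomasses))
    denominator = sum(w * d ** (2 * EXPONENT)
                      for w, d in zip(weights, diameters))
    return numerator / denominator

def residual_variance_power(diameters, residuals):
    """Slope k of log(residual^2) on log(D): residual variance ~ D**k (assumed form)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(max(r * r, 1e-12)) for r in residuals]  # guard against log(0)
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def weighted_fit(diameters, biomasses):
    """Step 1: OLS fit. Step 2: weights from residuals. Step 3: weighted refit."""
    a_ols = fit_scale(diameters, biomasses)
    residuals = [m - a_ols * d ** EXPONENT for d, m in zip(diameters, biomasses)]
    k = residual_variance_power(diameters, residuals)
    weights = [d ** (-k) for d in diameters]  # weight = 1 / estimated variance
    return fit_scale(diameters, biomasses, weights), k

# Synthetic data: true a = 0.13 with errors proportional to the mean (+/-10%).
diameters = [float(d) for d in range(2, 42, 2)]
biomasses = [0.13 * d ** EXPONENT * (1.1 if i % 2 == 0 else 0.9)
             for i, d in enumerate(diameters)]
a_weighted, variance_power = weighted_fit(diameters, biomasses)
print(f"weighted estimate of a: {a_weighted:.4f}, variance ~ D^{variance_power:.2f}")
```

With errors proportional to the mean, the estimated variance power comes out close to twice the model exponent, so small and large trees contribute comparably to the weighted fit.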

#### 2.2.2 Generalized dummy variable models

The one-variable biomass equation is the most widely used model for estimating individual tree biomass [3, 7]. The power function of the one-variable aboveground biomass equation is based on the WBE (West, Brown and Enquist) theory for the origin of allometric scaling laws [71, 72]. According to the results of Zeng and Tang [73], the generalized one-variable aboveground biomass model can be expressed as:

$$M_a = a D^{7/3} \qquad (7)$$

That is, the power parameter of the allometric model is constant and equal to 7/3 (≈2.33), and only the parameter *a* depends on tree species. If a vector ***z*** of dummy variables is defined to indicate tree species, then the generalized model (7) can be expressed as:

$$M_a = (a + \boldsymbol{v}_a \cdot \boldsymbol{z}) D^{7/3} \qquad (8)$$

where *a* is the global parameter and ***v***_{a} is the vector of tree species-specific parameters. The dummy variable vector ***z*** includes seven elements, which indicate the eight tree species by the following combinations:

Species | *z*_{1} | *z*_{2} | *z*_{3} | *z*_{4} | *z*_{5} | *z*_{6} | *z*_{7} |
---|---|---|---|---|---|---|---|
*Picea* spp. | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
*Abies* spp. | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
*Betula* spp. | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
*Quercus* spp. | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
*Populus* spp. | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
*Larix* spp. | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
*C*. *lanceolata* | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
*P*. *massoniana* | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Consequently, by comparing the estimated values of the species-specific parameter vector ***v***_{a}, the differences among tree species can be analyzed.
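
The dummy coding above can be sketched as a small helper. The function names are hypothetical, and the species-specific values in the usage lines are placeholders rather than fitted estimates. The code assumes that the species coefficient is the global parameter plus the species-specific parameters multiplied element-wise by ***z***, with *P*. *massoniana* as the reference species (all zeros).

```python
SPECIES = ["Picea", "Abies", "Betula", "Quercus", "Populus", "Larix",
           "C. lanceolata", "P. massoniana"]
REFERENCE = "P. massoniana"  # coded as all zeros

def dummy_vector(species):
    """Return the seven-element vector z for one of the eight species."""
    if species not in SPECIES:
        raise ValueError(f"unknown species: {species}")
    non_reference = [s for s in SPECIES if s != REFERENCE]
    return [1 if s == species else 0 for s in non_reference]

def species_coefficient(global_param, specific_params, species):
    """Global parameter plus the dot product of specific parameters and z."""
    z = dummy_vector(species)
    return global_param + sum(v * z_k for v, z_k in zip(specific_params, z))

# Placeholder values for illustration only.
placeholder_global = 0.10
placeholder_specific = [0.01, -0.02, 0.03, 0.04, 0.00, -0.01, -0.03]
print(dummy_vector("Betula"))
print(species_coefficient(placeholder_global, placeholder_specific, "Betula"))
```

With this coding, the global parameter is the coefficient of the reference species, and each species-specific parameter is that species' deviation from it.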

### 2.3 Model evaluation

Many statistical indices could be used to evaluate individual tree biomass models [63]. According to the study results from Zeng and Tang [53], the following six statistical indices, namely, the coefficient of determination (*R*^{2}), standard error of estimate (*SEE*), mean prediction error (*MPE*), total relative error (*TRE*), average systematic error (*ASE*) and mean percent standard error (*MPSE*), were very important for assessing biomass models. In this study, the same six statistical indices were used for model evaluation [50, 51]:

$$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \qquad (9)$$

$$SEE = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - p}} \qquad (10)$$

$$TRE = \frac{\sum (y_i - \hat{y}_i)}{\sum \hat{y}_i} \times 100 \qquad (11)$$

$$ASE = \frac{1}{n} \sum \frac{y_i - \hat{y}_i}{\hat{y}_i} \times 100 \qquad (12)$$

$$MPE = t_\alpha \cdot \frac{SEE / \bar{y}}{\sqrt{n}} \times 100 \qquad (13)$$

$$MPSE = \frac{1}{n} \sum \left| \frac{y_i - \hat{y}_i}{\hat{y}_i} \right| \times 100 \qquad (14)$$

where *y*_{i} are observed values, *ŷ*_{i} are estimated values, *ȳ* is the mean of the observed values, *n* is the number of samples, *p* is the number of parameters and *t*_{α} is the *t*-value at confidence level *α* with *n* − *p* degrees of freedom.
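
As a rough illustration, the six indices can be computed as in the Python sketch below. The formulas follow the definitions commonly given for these indices in the cited biomass modeling literature, which is an assumption here. The *t*-value is passed in by the caller (the default of 1.96 is only a large-sample approximation), and the function name and example data are hypothetical.

```python
import math

def fit_statistics(observed, predicted, n_params, t_value=1.96):
    """Return R2, SEE, TRE, ASE, MPE and MPSE (the last four in percent)."""
    n = len(observed)
    if n != len(predicted) or n <= n_params:
        raise ValueError("need equal-length inputs with more samples than parameters")
    y_mean = sum(observed) / n
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    ss_residual = sum(r * r for r in residuals)
    ss_total = sum((y - y_mean) ** 2 for y in observed)
    see = math.sqrt(ss_residual / (n - n_params))
    relative_errors = [r / y_hat for r, y_hat in zip(residuals, predicted)]
    return {
        "R2": 1.0 - ss_residual / ss_total,
        "SEE": see,
        "TRE": sum(residuals) / sum(predicted) * 100.0,
        "ASE": sum(relative_errors) / n * 100.0,
        "MPE": t_value * (see / y_mean) / math.sqrt(n) * 100.0,
        "MPSE": sum(abs(e) for e in relative_errors) / n * 100.0,
    }

# Hypothetical observed and estimated biomass values (kg).
example_stats = fit_statistics([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 31.0, 39.0], n_params=2)
for name, value in example_stats.items():
    print(f"{name}: {value:.4f}")
```

*TRE* and *ASE* are signed and reveal systematic bias, whereas *MPSE* averages absolute relative errors and therefore reflects the precision of individual tree estimates.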

## 3. Results and analysis

The one- and two-variable integrated systems (Eqs. (5) and (6)) for eight tree species or groups were estimated using the error-in-variable simultaneous equation approach through ForStat (**Tables 2** and **3**). The six fitting statistics, *R*^{2}, *SEE*, *TRE*, *ASE*, *MPE* and *MPSE*, were calculated and could be used for evaluating the goodness-of-fit of the three models (**Table 4**). From the fitting results of integrated systems (Eqs. (5) and (6)), the parameter estimates of the *BCF* and *RSR* models could be obtained (**Table 5**).

Comparing the fitting statistics of the two integrated systems (Eqs. (5) and (6)) in **Table 4**, we found that, for aboveground biomass estimation, two-variable models were better than one-variable models except for *Picea*. For belowground biomass estimation, one- and two-variable models were not significantly different, and some one-variable models were even slightly better than the two-variable models, such as those for *Picea*, *Quercus*, *Larix* and *C*. *lanceolata*. Considering that tree height measurement is time consuming and that two-variable biomass models are not significantly different from one-variable models, especially for belowground biomass estimation, one-variable models are recommended for forestry practice such as the National Forest Inventory.

From **Table 2**, it was found that the estimates of parameter *a*_{1} were approximately equal to 7/3, confirming the results of an earlier study [73]. To analyze the difference among various tree species, the dummy model (8) was fitted using the aboveground biomass data of all eight species (**Table 6**).

Species | Global parameter (*a*) | Species-specific parameter (*v*_{a}) |
---|---|---|
Pi | 0.13485 | −0.01084 |
Ab | | −0.01959 |
Be | | 0.00443 |
Qu | | 0.03908 |
Po | | 0.00106 |
La | | −0.01441 |
Cl | | −0.04254 |
Pm | | 0.00000 |

According to the parameter estimates in **Table 6**, the eight tree species can be ranked by aboveground biomass estimates in descending order as *Quercus*, *Betula*, *Populus*, *P*. *massoniana*, *Picea*, *Larix*, *Abies* and *C*. *lanceolata*. That is, for trees of the same diameter, *Quercus* had the largest aboveground biomass, whereas *C*. *lanceolata* had the smallest. The aboveground biomass estimates of the dummy model (Eq. (8)) for *Quercus*, *Betula*, *Populus*, *P*. *massoniana*, *Picea*, *Larix* and *Abies* were 88%, 51%, 47%, 46%, 34%, 30% and 25% larger, respectively, than that for *C*. *lanceolata* (see **Figure 1**).
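
The ranking and percentages above can be checked directly from the values in **Table 6**. The sketch below assumes that the species coefficient is the global parameter plus the species-specific parameter and that aboveground biomass is this coefficient multiplied by *D*^{7/3}; the variable and function names are hypothetical.

```python
GLOBAL_A = 0.13485  # global parameter a from Table 6
V_A = {             # species-specific parameters from Table 6
    "Picea": -0.01084, "Abies": -0.01959, "Betula": 0.00443,
    "Quercus": 0.03908, "Populus": 0.00106, "Larix": -0.01441,
    "C. lanceolata": -0.04254, "P. massoniana": 0.0,
}

# Species coefficient: global parameter plus species-specific deviation.
coefficients = {species: GLOBAL_A + v for species, v in V_A.items()}

# Rank from largest to smallest coefficient (same order for any fixed diameter).
ranking = sorted(coefficients, key=coefficients.get, reverse=True)

# Percentage by which each species exceeds the smallest one.
smallest = coefficients[ranking[-1]]
percent_larger = {species: round((coefficients[species] / smallest - 1) * 100)
                  for species in ranking[:-1]}

def aboveground_biomass_kg(species, diameter_cm):
    """Aboveground biomass (kg) for a tree of the given species and diameter (cm)."""
    return coefficients[species] * diameter_cm ** (7.0 / 3.0)

print(ranking)
print(percent_larger)
```

Because all species share the exponent 7/3, the ratio between any two species is the ratio of their coefficients and does not depend on diameter.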

Similarly, for the one-variable belowground biomass models, the estimates of parameter *b*_{1} for the eight species were not significantly different. To analyze the differences in belowground biomass estimation among tree species, we fitted the following dummy model:

$$M_b = (b_0 + \boldsymbol{v}_b \cdot \boldsymbol{z}) D^{b_1} \qquad (15)$$

where *b*_{0} and *b*_{1} are global parameters and *v*_{b} is species-specific parameter vector. The parameter estimates of dummy model (Eq. (15)) are listed in **Table 7**.

Species | Global parameter *b*_{0} | Global parameter *b*_{1} | Species-specific parameter (*v*_{b}) |
---|---|---|---|
Pi | 0.03551 | 2.2544 | 0.00424 |
Ab | | | −0.00792 |
Be | | | 0.01437 |
Qu | | | 0.01835 |
Po | | | 0.00091 |
La | | | 0.00583 |
Cl | | | −0.00761 |
Pm | | | 0.00000 |

According to the parameter estimates in **Table 7**, the eight tree species can be ranked by belowground biomass estimates in descending order as *Quercus*, *Betula*, *Larix*, *Picea*, *Populus*, *P*. *massoniana*, *C*. *lanceolata* and *Abies*. That is, for trees of the same diameter, *Quercus* had the largest belowground biomass, whereas *Abies* had the smallest. The belowground biomass estimates of the dummy model (Eq. (15)) for *Quercus*, *Betula*, *Larix*, *Picea*, *Populus*, *P*. *massoniana* and *C*. *lanceolata* were 95%, 81%, 50%, 44%, 32%, 29% and 1% larger, respectively, than that for *Abies* (see **Figure 2**).
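
The same check can be made for belowground biomass with the values in **Table 7**. The sketch assumes that belowground biomass is (*b*_{0} + species-specific parameter) × *D*^{*b*_{1}}; the function names are hypothetical. It also shows that the percentage differences do not depend on diameter, because *b*_{1} is shared by all species.

```python
B0 = 0.03551  # global parameter b0 from Table 7
B1 = 2.2544   # global parameter b1 from Table 7
V_B = {       # species-specific parameters from Table 7
    "Picea": 0.00424, "Abies": -0.00792, "Betula": 0.01437,
    "Quercus": 0.01835, "Populus": 0.00091, "Larix": 0.00583,
    "C. lanceolata": -0.00761, "P. massoniana": 0.0,
}

def belowground_biomass_kg(species, diameter_cm):
    """Belowground biomass (kg) for a tree of the given species and diameter (cm)."""
    return (B0 + V_B[species]) * diameter_cm ** B1

def percent_larger_than(species, reference, diameter_cm):
    """Percentage by which one species exceeds another at the same diameter."""
    ratio = (belowground_biomass_kg(species, diameter_cm)
             / belowground_biomass_kg(reference, diameter_cm))
    return (ratio - 1.0) * 100.0

for diameter in (10.0, 30.0):
    rounded = {species: round(percent_larger_than(species, "Abies", diameter))
               for species in V_B if species != "Abies"}
    print(diameter, rounded)
```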

## 4. Discussion and conclusion

In this study, data on above- and belowground biomass from 4818 and 1626 sample trees, respectively, of eight major tree species in China were used to develop compatible individual tree biomass models. The models included aboveground biomass equations and *BCF* equations compatible with stem volume equations, and belowground biomass equations and *RSR* equations compatible with aboveground biomass equations. To ensure compatibility of the biomass models, nonlinear error-in-variable simultaneous equations were applied, and to handle the unequal sample sizes for above- and belowground biomass, the dummy-variable approach was used. In the technical regulation on methodology for tree biomass modeling [55], the segmented modeling approach was recommended when the biomass estimates for small trees were obviously biased [43, 46]. Furthermore, for tree species distributed across several regions, separate biomass models generally need to be developed for different regions. For example, according to the population classification for modeling single-tree biomass equations [39], five sets of biomass models would be needed for each of *Abies* and *Picea*. In this study, however, the segmented modeling approach was not used to develop separate biomass models for large and small trees, and regional differences were not taken into account. Only one set of biomass models, including one- and two-variable models, was developed for each tree species.

Data for three of the tree species, i.e., *C. lanceolata*, *P. massoniana* and *Larix* spp., had previously been used, fully or partly, to develop biomass models that were published as original papers [40–51] or ministerial standards [56, 57]. The parameter estimates and fit statistics of the aboveground biomass and volume models reported by Zeng et al. [47] are very close to those for *C. lanceolata* in this study. Likewise, the parameter estimates of the aboveground biomass and volume models of Zeng and Tang [45] are not significantly different from those for *P. massoniana* in this chapter, but the models developed here are better in terms of the goodness-of-fit statistics. Compared with the biomass models published as ministerial standards [56, 57], the models developed in this study are more general and simpler to apply in national and regional biomass estimation. The ministerial standards [56, 57] contain four sets of biomass models in total for each tree species, covering trees (dbh ≥ 5 cm) and saplings (dbh < 5 cm) in two modeling populations, whereas this study provides only one set of biomass models, which is suitable for both trees and saplings and for the whole country.

The results indicated that two-variable models were generally better than one-variable models for aboveground biomass estimation, whereas the two model systems were not significantly different for belowground biomass estimation. The mean prediction errors (*MPEs*) of the aboveground biomass models for the eight species were less than 5%, whereas the *MPEs* of the belowground biomass models were less than 10%, except for *Abies*. The models developed in this study provide a basis for estimating the biomass of the eight major tree species in China and will help fill the gap for China on the web platform GlobAllomeTree [74]. They also have the potential to support the implementation of policies and mechanisms designed to mitigate climate change (e.g., CDM and REDD+) and to calculate the costs and benefits associated with forest carbon projects. In addition, the overall modeling methodology presented in this study can be considered in any case that involves individual tree biomass modeling.