Open access peer-reviewed chapter

The Basics of Structural Equations in Medicine and Health Sciences

Written By

Ramón Reyes-Carreto, Flaviano Godinez-Jaimes and María Guzmán-Martínez

Reviewed: 15 April 2022 Published: 25 May 2022

DOI: 10.5772/intechopen.104957

From the Edited Volume

Recent Advances in Medical Statistics

Edited by Cruz Vargas-De-León

Abstract

Structural Equation Models (SEM) are very useful and have a wide range of practical applications in many fields of science; in medicine and health sciences, interest in them has increased. This chapter is divided into three sections. The first covers concepts, notation, and theoretical aspects of SEM, such as path diagrams, the measurement model, confirmatory factor analysis, structural regression, and model identification. In addition, it includes some simple examples applied to health sciences. The second section deals with the estimation and evaluation of the model. On the first topic, the methods of Maximum Likelihood (ML), Generalized Least Squares, Unweighted Least Squares, and ML with robust standard errors are addressed, as well as alternative methods for handling violations of the multivariate normality assumption. On the second topic, some goodness-of-fit statistics of the estimated model are defined, such as the chi-square statistic, Root Mean Square Error of Approximation, Tucker-Lewis Index, Comparative Fit Index, Standardized Root Mean Square Residual, and Goodness of Fit Index. The last section presents an SEM example and its implementation using the lavaan library of the R software.

Keywords

  • causal effects
  • path diagram
  • measurement model
  • confirmatory factor analysis
  • structural regression

1. Introduction

SEM is a multivariate method whose use has grown exponentially in medicine and health sciences. It is a statistical method regarded as a causal model that encompasses, among other techniques, the Linear Regression Model (LRM), Factor Analysis (FA), Confirmatory Factor Analysis (CFA), and Path Analysis. It can help the researcher test or confirm theoretical models or hypotheses and validate causal relations between latent and observed variables, or between observed variables only.

When a researcher investigates the causal relationships among a group of variables that define a factor or latent variable, the aim is to confirm (or disconfirm) that the hypothetical model is appropriate for the data analyzed.

As a result, the researcher has the following options: (a) when the hypothetical model is confirmed by the data, new elements can be added to the original model and the new structure analyzed; (b) when the hypothetical model is not appropriate for the data, the original model can be modified or a new model can be tested.

Pearl, cited by Kline [1], defines SEM as a causal method that takes as input (a) a set of qualitative causal hypotheses, based on theory or the results of empirical studies, represented by a set of equations, and (b) a set of questions about the causal relationships between the factors or latent variables of interest. Many SEM applications use data from non-experimental (observational) designs, but data from quasi-experimental or experimental designs are also analyzed.

2. Variables and path diagram in health and medicine

2.1 Causality

SEM assumes probabilistic causality, which allows changes in the outcome to occur with a probability between 0 and 1.0. The estimation of effects from the data is founded on probability distribution assumptions; thus, causality is understood as a functional relationship between two quantitative variables in which a cause changes the probability distribution of its effect. In medicine and health, causal assumptions are made through a synthesis of logic, theory, and prior knowledge; in this way, the causal relationship between observed and latent variables is hypothesized conceptually with expert clinical judgment [2].

An SEM includes observed or manifest variables, latent variables, errors or disturbances, and parameters. There are two main ways to communicate and understand the equations that an SEM represents: as simultaneous equations or as a path diagram. A path diagram is a visualization of the conceptual model, and a conceptual model is an idea of the relationship under study. Behind the ideas of causal inference are Bayesian networks and causal graphs; for example, a causal directed graph can include common causes, whether they are measured or unmeasured variables.

2.2 Observed, latent variables, disturbances, and effects

Observed variables are measured and recorded in the data (e.g. sex, age, height, weight, systolic blood pressure, diastolic blood pressure, body mass index). In a path diagram, these variables are represented by rectangles or boxes. A standardized variable has mean zero and variance one. Latent variables or latent constructs are variables that are not directly measured (e.g. depression, metabolic syndrome, obesity, and anxiety). In a path diagram, they are represented by circles or ovals. Observed or latent variables can be exogenous or endogenous. Exogenous variables are not influenced (not caused) by other variables in the model; an exogenous variable acts only as a cause of one or more variables in the model. Endogenous variables are influenced by other variables in the model, and an endogenous variable can in turn affect another endogenous variable.

Disturbances represent the unspecified causes of an endogenous variable. Each endogenous variable is assigned a disturbance, which is treated as a latent variable.

Effects can be direct, indirect, or total, and are represented by directed lines. The direct effect is the causal effect of one variable, called independent, on another, called dependent; that is, the direct influence of one variable on another. A variable can be strictly independent (exogenous) or dependent (endogenous).

An indirect effect is the causal effect of an independent variable on a dependent variable through the pathway of a third variable; it is synonymous with a mediation effect. The total effect is the sum of all direct and indirect effects of an independent variable on a dependent variable. All the effects are estimated from the sample data.
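
As a minimal sketch of how these effects combine in a linear path model, the R code below simulates a hypothetical mediation chain X → M → Y with an additional direct path X → Y. The variable names, coefficients, and sample size are assumptions chosen for illustration; they are not data from this chapter. With ordinary least squares, the indirect effect is the product of the two path coefficients, and the total effect equals the direct effect plus the indirect effect.

```r
# Hypothetical mediation model: X -> M -> Y plus a direct path X -> Y.
set.seed(123)
n <- 500
X <- rnorm(n)
M <- 0.6 * X + rnorm(n)            # path a (assumed true value 0.6)
Y <- 0.3 * X + 0.5 * M + rnorm(n)  # direct path 0.3 and path b = 0.5 (assumed)

a        <- unname(coef(lm(M ~ X))["X"])   # estimated effect of X on M
fit_y    <- lm(Y ~ X + M)
direct   <- unname(coef(fit_y)["X"])       # estimated direct effect of X on Y
b        <- unname(coef(fit_y)["M"])       # estimated effect of M on Y
indirect <- a * b                          # effect of X on Y through M
total    <- direct + indirect              # total effect of X on Y

# The total effect coincides with the slope of Y regressed on X alone.
total_check <- unname(coef(lm(Y ~ X))["X"])
```

In this linear setting the equality between `total` and `total_check` is an algebraic identity of least squares, which is why the decomposition into direct and indirect effects is exact.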

2.3 Path diagram

A path diagram is a graphical description of an SEM that includes a measurement model and a structural model. Measured or observed variables are represented by rectangles and latent variables by circles, while curved lines represent unanalyzed associations. The covariances or correlations between exogenous variables are drawn as a curved line with two arrowheads. A variance is represented by a two-headed curved arrow that leaves and re-enters the same observed or latent variable. Here, the latent variables are treated as continuous, in what we shall refer to as conventional SEM (sometimes called first-generation SEM). Hypothesized causal effects, or direct effects, on endogenous variables are represented by a line with a single arrowhead.

For the parameters of an SEM when means are not included, Kline [1] suggests defining them in words that parallel the three symbols used in Reticular Action Model (RAM) symbolism: a direct effect on an endogenous variable is represented by a line with a single arrowhead; a double-headed curved arrow that leaves and re-enters the same variable represents the variance of an exogenous variable; and a double-headed curved arrow that leaves one variable and enters another represents the covariance between them.

In medicine and health sciences, it is common to assess a latent variable by several observed variables, for example, obesity can be indirectly measured by the observed variables percentage of fat (FAT), body mass index (BMI), and abdominal circumference (AC) (Figure 1).

Figure 1.

A path diagram representing the latent variable obesity measured by three observed variables: Percentage of FAT (FAT), body mass index (BMI), and abdominal circumference (AC).

3. Measurement and structural models

In medicine and health sciences, it is common to use one of the four types of SEM that exist in the literature. Next, each one is briefly described.

Path analysis models: This type of SEM includes only observed variables, as multiple regression models (MRM) do, but it has the advantage that a variable can be both a dependent and an independent variable at the same time. In addition, there can be several dependent variables, and both indirect and direct effects can be measured (Figure 2). In the lower diagram of Figure 2, Socioeconomic Status (SES) and Disease, both exogenous variables, have direct effects on the endogenous variable obesity.

Figure 2.

The upper path diagram represents the indirect effect of FAT on LV diastolic dysfunction, and the lower path diagram shows the direct effects of socioeconomic status and of disease on obesity [3].

Confirmatory Factor Analysis (CFA) models: CFA, like the measurement model, analyzes the relations between latent and observed variables. Its emphasis is on confirming with the data a factorial structure predetermined by the researcher; that is, the researcher must specify in advance on which factor each observed variable loads, and the CFA then confirms or rejects that structure.

Structural Regression Model: This type of SEM is a regression model between latent variables. The idea is to combine the techniques of CFA and MRM while also accounting for measurement errors.

The causal relationships between latent variables are represented by directional arrows according to the hypothetical model. Typically, model fit indices are examined first, followed by hypothesis tests.

Latent growth curve model: This is a statistical technique of longitudinal analysis that estimates or explains growth over a period of time.

In this chapter, we will only address the Structural Regression Model.

3.1 Measurement model

This part of the path diagram is needed to analyze all the items or observed variables that “load” on a latent variable, their variances and errors, and the relations between the observed variables.

The measurement model quantifies the links between the latent variables and the observed variables that characterize the hypothetical model.

The latent variables are representations of the concepts of interest. Once a concept has been selected, Bollen [4] recommends four steps for the measurement process: (1) determine its meaning, (2) represent it with latent variables, (3) form measures, and (4) establish the relation between the latent variables and the measured variables.

The measurement model analyzes the relation between the measured and latent variables. This relation can be described by an equation or in a path diagram (Figure 3).

Figure 3.

A path diagram of the CFA model on Matsuda index with 11 observed variables: Percentage of FAT (FAT), body mass index (BMI), abdominal circumference (AC), arginine (ARG), glycine (GLY), leucine (LEU), phenylalanine (PHE), valine (VAL), liver ultrasound (USG), alanine aminotransferase (ALT), aspartate aminotransferase (AST); and 3 latent variables: Amino acids (AA1), fatty liver, and obesity [5].

CFA is a method for evaluating a measurement model. Kline [1], citing Bollen, suggests applying some rules to ensure the identification of a measurement model respecified as a CFA: a model with a single factor must have at least three observed variables, and when there are two or more factors or latent variables, each factor must have at least two observed variables.
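
A quick arithmetic check complements these rules. The sketch below applies the counting rule (the number of free parameters cannot exceed the number of distinct variances and covariances of the observed variables) to the single-factor obesity model of Figure 1. The parameter count assumes the common convention of fixing one loading to 1 to set the scale of the factor, which is an assumption made here rather than something stated for that figure, and the helper name `sem_df` is hypothetical. The counting rule is a necessary condition for identification, not a sufficient one.

```r
# Degrees of freedom by the counting rule: p(p + 1)/2 - t.
sem_df <- function(p, t) {
  p * (p + 1) / 2 - t
}

# Figure 1: obesity measured by FAT, BMI, and AC (p = 3 indicators).
p <- 3
free_loadings   <- p - 1  # first loading fixed to 1 (assumed convention)
error_variances <- p      # one error variance per indicator
factor_variance <- 1      # variance of the latent variable
t <- free_loadings + error_variances + factor_variance

df_obesity <- sem_df(p, t)  # 6 moments - 6 parameters
```

With three indicators the single-factor model has zero degrees of freedom (it is just-identified), so its fit cannot be tested; under the same convention a fourth indicator would leave 10 − 8 = 2 degrees of freedom.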

3.2 Structural regression model

The path diagram of a structural regression (SR) model includes the set of latent variables and their relationships. Unlike the measurement model (CFA), where all the factors or latent variables are exogenous and can be assumed to covary or be related, the causal effects between latent variables are described only in the SR model. Causal inference in latent variable modeling is more laborious than the analysis of the measurement model. In SR models, the effects between latent variables can also be direct or indirect. Similarly, the structural component can be recursive or non-recursive. A recursive SR model is one in which causation flows in a single direction, while a non-recursive structural model has causality running in both directions between some variables.

3.3 Identification of SR model

Identification of the SR model is analogous to the identification of the measurement model. However, before validating the SR model, the measurement model must be identified (i.e., valid); only then is the full SEM evaluated. Identification of the CFA alone does not guarantee identification of the SR model.

Therefore, the analysis of a full SEM must include the variances and covariances between the factors or latent variables. A full SR model is identified by the following two-step rule [4]: (1) in the first step, the researcher analyzes the measurement model as a CFA, that is, ignores the relations among the latent variables of the SR model, and determines whether the respecified model is identified. If it is, the second step is applied; (2) in the second step, the equation or equations that contain the relations among the latent variables of the SR model are analyzed to determine whether the structural part is identified, treating the latent variables as if they were observed variables. If step 1 shows that the measurement parameters are identified and step 2 shows that the parameters of the SR are also identified, the two conditions together are sufficient to identify the full SR model.

Figure 4 shows a path diagram of a complete SEM, which includes 9 observed variables and two latent variables. The objective is to analyze the relationship among central obesity (FAT), systemic inflammation (Inflammation), and left ventricular diastolic dysfunction (LV diastolic). The figure does not show the variances, disturbances, or errors.

Figure 4.

SEM to analyze relationships between adiposity, inflammatory responses, and LV diastolic dysfunction. Fasting plasma glucose (FPG); high-density lipoprotein (HDL); homeostasis model of insulin resistance (HOMA); high sensitivity C-reactive protein (hsCRP); peritoneum fat area (Peri fat); retroperitoneum fat area (retro fat); subcutaneous fat area (sub fat); triglyceride (TG). The latent variable FAT directly influences the inflammation variable and indirectly influences the observed variable LV diastolic [6].

4. Equations and model estimation

4.1 The equations

The basic goal of SEM is to generalize the CFA to assess relationships between latent variables [7]. A classic form of SEM representation is the LISREL model which involves a measurement model and a structural model. The measurement model defines the relationship between the latent variables and their indicators or observed variables, and the structural model defines the relationship between the latent variables. In this section, we will address the linear SEM model and the nonlinear case.

The measurement equations are:

$$x = \Lambda_x \xi + \delta \tag{1}$$

$$y = \Lambda_y \eta + \varepsilon \tag{2}$$

In Eq. (1), $x$ is the vector of observed exogenous variables, $\xi$ is the vector of exogenous latent variables, $\delta$ is the vector of errors, and $\Lambda_x$ is the matrix of coefficients that relates $x$ to $\xi$. In Eq. (2), $y$ is the vector of observed endogenous variables, $\eta$ is the vector of endogenous latent variables, $\varepsilon$ is the vector of errors for the endogenous variables, and $\Lambda_y$ is the matrix of coefficients relating $y$ to $\eta$. Associated with these two equations are the covariance matrices $\Theta_\delta$ and $\Theta_\varepsilon$ of the errors $\delta$ and $\varepsilon$, respectively.

In summary, the purpose of the measurement model is to analyze the relation of the latent variables in $\xi$ and $\eta$ with the observed variables in $x$ and $y$, respectively. One difficulty in formulating these equations is specifying the factor loading matrix $\Lambda$ on the basis of a priori information about the observed and latent variables considered in the study.
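
As an illustration, the first measurement equation can be written out for the obesity example of Figure 1, treating obesity as a single exogenous latent variable $\xi_1$. The loading symbols $\lambda_1, \lambda_2, \lambda_3$ and the error terms $\delta_1, \delta_2, \delta_3$ are notation introduced here for the sketch; they are not estimates reported in this chapter.

```latex
\begin{pmatrix} \mathrm{FAT} \\ \mathrm{BMI} \\ \mathrm{AC} \end{pmatrix}
=
\begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} \xi_1
+
\begin{pmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \end{pmatrix}
```

Here the loading matrix is the $3 \times 1$ column of loadings, and one of them is typically fixed (for example $\lambda_1 = 1$) to give the latent variable a scale.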

The structural equation for linear SEMs is:

$$\eta = \Gamma \xi + \zeta \tag{3}$$

The structural equation for nonlinear SEMs is:

$$\eta = B\eta + \Gamma \xi + \zeta \tag{4}$$

where $\eta$ is the vector of endogenous latent variables, $\xi$ is the vector of exogenous latent variables, and $\zeta$ is the vector of latent errors (disturbances) of the endogenous variables; $B$ is the matrix of coefficients that describe the relations among the endogenous latent variables, and $\Gamma$ is the matrix of coefficients for the linear effects of the exogenous variables on the endogenous ones. Associated with Eq. (4) are the matrices $\Phi$ and $\Psi$: the covariance matrix of the latent exogenous variables and the covariance matrix of the errors of the endogenous variables, respectively.

4.2 Assumptions and limitations

Normality: The most important assumption in SEM is that the observed variables follow a multivariate normal (MVN) distribution, particularly when the maximum likelihood (ML) method is used to estimate the model parameters. When discrete variables are used, the assumption of normality is violated. Violating or ignoring the MVN assumption for the observed variables leads to a high value of $\chi^2_M/df_M$ and affects the significance of the test. In this scenario, it is suggested to apply other methods, such as Generalized Least Squares (GLS).

As the complexity of the SEM increases, the sample size must also increase, and when the data depart from the normal distribution it is essential to increase the number of observations [1]. Non-normality can be detected by univariate tests, multivariate tests, and skewness and kurtosis statistics. Skewness and kurtosis can be assessed separately or jointly for the same variable. In the context of SEM, kurtosis is more problematic than skewness in terms of its effects on inference. If the absolute value of the skewness exceeds 2 and the kurtosis exceeds 4, the distribution is considered non-normal [8].
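
A minimal sketch of this univariate screening in base R follows. The function names are hypothetical, the data are simulated, and the code assumes that the kurtosis cut-off refers to excess kurtosis (kurtosis minus 3), which the text above does not specify.

```r
# Moment-based sample skewness and excess kurtosis
# (both are 0 for a normal distribution).
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}
excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3
}

# Flag a variable with the cut-offs quoted above (|skewness| > 2 and
# kurtosis > 4); using EXCESS kurtosis here is an assumption.
flag_non_normal <- function(x) {
  abs(skewness(x)) > 2 && excess_kurtosis(x) > 4
}

set.seed(1)
normal_like <- rnorm(1000)     # simulated, roughly symmetric
skewed      <- rexp(1000)^3    # simulated, heavily right-skewed
flags <- c(normal = flag_non_normal(normal_like),
           skewed = flag_non_normal(skewed))
```

Applied column by column to a data frame (for example with `sapply`), the same function gives a quick list of variables that deserve a closer look before ML estimation.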

No correlation between errors: The errors are assumed to be independent; that is, there is no correlation among the errors $\delta$, $\varepsilon$, and $\zeta$.

Multicollinearity: It is assumed that there is no strong relationship among the independent variables.

Linearity: It is assumed that the relations among the variables are linear.

Outliers: The presence of outliers in the data affects the significance of the model results.

Sample size: In general, the number of observations in the sample affects the fit indices in SEM. Reference [9] suggests a minimum sample size of 150; [10] suggests at least 10 times the number of parameters in the model; [11] recommends at least 200; and Hair et al., cited by Thakkar [12], provide a list of recommendations. However, if the number of observations is small, it is reasonable and advisable to use the Bayesian approach to SEM.

Limitations: Because SEM is a confirmatory statistical method, before the analysis the researcher must establish a hypothetical model and examine it in light of the sample and the latent and observed variables. In addition, the researcher must know how many parameters need to be estimated, counting variances, covariances, and path coefficients, and must know all the relationships to be specified in the model.

4.3 Estimation

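Let $\Sigma = \Sigma(\theta)$ be the covariance matrix of the model, where $\Sigma$ is the population covariance matrix of the observed variables, $\theta$ is a vector of (unknown) parameters, and $\Sigma(\theta)$ is the covariance matrix written as a function of $\theta$. The parameters are estimated by minimizing the discrepancy between the sample covariance matrix $S$ and $\Sigma(\theta)$. The estimation methods minimize different discrepancy functions $F$ between $S$ and $\Sigma(\theta)$, so that

$$F = \min_{\theta} F\big(S, \Sigma(\theta)\big) \tag{5}$$

where the matrix $\Sigma(\theta)$ is given by

$$\Sigma(\theta) = \begin{pmatrix} E(yy^T) & E(yx^T) \\ E(xy^T) & E(xx^T) \end{pmatrix} = \begin{pmatrix} \Sigma_{yy}(\theta) & \Sigma_{yx}(\theta) \\ \Sigma_{xy}(\theta) & \Sigma_{xx}(\theta) \end{pmatrix} = \begin{pmatrix} \Lambda_y C (\Gamma \Phi \Gamma^T + \Psi) C^T \Lambda_y^T + \Theta_\varepsilon & \Lambda_y C \Gamma \Phi \Lambda_x^T \\ \Lambda_x \Phi \Gamma^T C^T \Lambda_y^T & \Lambda_x \Phi \Lambda_x^T + \Theta_\delta \end{pmatrix} \tag{6}$$

Note that this matrix does not depend on the observed or latent variables but on the matrices of unknown parameters $\Theta_\delta$, $\Theta_\varepsilon$, $\Phi$, $\Psi$, $\Lambda_x$, $\Lambda_y$, $\Gamma$, and $B$, where $C = (I - B)^{-1}$.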
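
To make the structure of the model-implied covariance matrix concrete, the following R sketch assembles it block by block from the parameter matrices. The function name `implied_sigma`, the argument names, and every numeric value in the toy model are assumptions for illustration only.

```r
# Model-implied covariance matrix of (y, x), assembled block by block.
implied_sigma <- function(Ly, Lx, B, G, Phi, Psi, Te, Td) {
  C <- solve(diag(nrow(B)) - B)                         # C = (I - B)^{-1}
  cov_eta <- C %*% (G %*% Phi %*% t(G) + Psi) %*% t(C)  # covariance of eta
  Syy <- Ly %*% cov_eta %*% t(Ly) + Te
  Syx <- Ly %*% C %*% G %*% Phi %*% t(Lx)
  Sxx <- Lx %*% Phi %*% t(Lx) + Td
  rbind(cbind(Syy, Syx), cbind(t(Syx), Sxx))
}

# Hypothetical toy model: one exogenous and one endogenous latent variable,
# each measured by two indicators (all numbers are illustrative).
Ly  <- matrix(c(1, 0.8), ncol = 1)   # loadings of y on eta
Lx  <- matrix(c(1, 0.7), ncol = 1)   # loadings of x on xi
B   <- matrix(0, 1, 1)               # no effects among endogenous latents
G   <- matrix(0.5, 1, 1)             # effect of xi on eta
Phi <- matrix(1, 1, 1)               # variance of xi
Psi <- matrix(0.4, 1, 1)             # disturbance variance of eta
Te  <- diag(c(0.3, 0.3))             # error variances of y
Td  <- diag(c(0.2, 0.2))             # error variances of x

Sigma <- implied_sigma(Ly, Lx, B, G, Phi, Psi, Te, Td)
```

The result is a symmetric 4 × 4 matrix whose entries depend only on the parameter values, which is what the estimation methods compare with the sample covariance matrix.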

ML estimation: In this method, function (7) is based on the logarithm of the likelihood (the log-likelihood). The estimates are obtained by differentiating the log-likelihood with respect to the parameters, setting each derivative to zero, and solving the resulting system of equations. This procedure requires that the endogenous variables have an MVN distribution, that $S$ follows a Wishart distribution, that the observations are independent and identically distributed, and that the matrices $\Sigma$ and $S$ are positive definite.

$$F_{ML} = \log|\Sigma(\theta)| + \operatorname{tr}\big(S\Sigma(\theta)^{-1}\big) - \log|S| - \operatorname{tr}\big(SS^{-1}\big) \tag{7}$$

where $\log$ is the natural logarithm, $|\cdot|$ is the determinant, and $\operatorname{tr}$ is the trace. Note that $\operatorname{tr}(SS^{-1})$ equals the number of observed variables.

The ML estimator has, among others, the following advantages: it is asymptotically unbiased, consistent, and efficient, and the model fit statistic $T_{ML}$ is asymptotically distributed as $\chi^2$ with $df = p(p+1)/2 - t$ degrees of freedom, where $p$ is the number of observed variables and $t$ is the number of model parameters estimated.

Two other estimation methods that consider endogenous variables with MVN distributions are generalized least squares (GLS) and unweighted least squares (ULS), which are described below.

GLS estimator: This method is a member of a family known as fully weighted least squares (WLS) estimation, which is suggested when the data are considered severely non-normal; in addition, the estimator is asymptotically MVN distributed. The function to minimize is given by

$$F_{GLS} = \frac{1}{2}\operatorname{tr}\Big[\big(I - \Sigma(\theta)S^{-1}\big)^2\Big] \tag{8}$$

ULS estimator: This method minimizes the sum of squares of the differences between the sample covariance matrix and the model-implied covariance matrix. It can generate unbiased estimates but does not perform as well as the ML method [13]. The function to minimize is

$$F_{ULS} = \frac{1}{2}\operatorname{tr}\Big[\big(S - \Sigma(\theta)\big)^2\Big] \tag{9}$$

In general, the ML estimator is preferred over both GLS and ULS, especially when the number of observations is large.

The GLS estimator requires well-specified models but can do an acceptable job, in terms of theoretical and empirical fit, with small sample sizes. The WLS estimator also requires well-specified models but, in contrast to GLS and ML, it also requires large sample sizes to perform well [14].
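
The three discrepancy functions can be evaluated directly for a given pair of matrices. The sketch below is illustrative only: the function names and the 2 × 2 matrices are hypothetical, and no minimization over the parameters is performed.

```r
# ML, GLS, and ULS discrepancy functions for a sample covariance matrix S
# and a candidate model-implied covariance matrix Sigma.
f_ml <- function(S, Sigma) {
  p <- nrow(S)  # tr(S S^{-1}) equals p, the number of observed variables
  log(det(Sigma)) + sum(diag(S %*% solve(Sigma))) - log(det(S)) - p
}
f_gls <- function(S, Sigma) {
  R <- diag(nrow(S)) - Sigma %*% solve(S)
  0.5 * sum(diag(R %*% R))
}
f_uls <- function(S, Sigma) {
  R <- S - Sigma
  0.5 * sum(diag(R %*% R))
}

# Hypothetical example: the candidate model ignores the covariance of 0.5.
S     <- matrix(c(1.0, 0.5,
                  0.5, 1.0), nrow = 2)
Sigma <- diag(2)

discrepancies <- c(ML  = f_ml(S, Sigma),
                   GLS = f_gls(S, Sigma),
                   ULS = f_uls(S, Sigma))
```

All three functions return zero when the candidate matrix reproduces the sample matrix exactly and grow as the two matrices diverge; an optimizer such as `optim()` would search over the parameters for the minimum.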

4.4 Model assessment

SEM tests a hypothetical theoretical model about the relations among latent and observed variables, and the goal of model evaluation is to test the causal relationships of the model. There are several criteria for evaluating the fit of an SEM, so it is difficult to adopt a single model fit criterion. The researcher generally uses three criteria to assess the statistical significance and the substantive significance of a hypothesized model [15]:

  1. The non-significance of the chi-square test, which indicates that the proposed model fits the data.

  2. The statistical significance of the individual parameter estimates, each expressed as a t or z value and compared with a t or normal distribution.

  3. The magnitude and direction (positive or negative) of the parameter estimates.

Kline [1], Schumacker and Lomax [15], Thakkar [12], and Douglas [2] provide indices and criteria for evaluating the fit of the model. This chapter presents only some of them: one statistical test and four basic fit indices.

  1. Chi-square $\chi^2_M$ with its degrees of freedom $df_M$ and p value.

    This statistic is a function of the fitting function $F_{ML}$ in (7) and is given by

    $$\chi^2_M = (n - 1)F_{ML} \tag{10}$$

    where $n$ is the sample size and $\chi^2_M$ has a central chi-square distribution with degrees of freedom $df_M = p^* - t$, where $p^* = p(p+1)/2$ is the total number of variances and covariances, $p$ is the number of observed variables, and $t$ is the total number of free parameters. Among the problems of this statistic is that its value can be affected by the sample size, non-normality, the size of the correlations, and unique variance. To decrease the sensitivity of $\chi^2_M$ to sample size, it is common to divide the statistic by its expected value, that is, to use $\chi^2_M/df_M$, which for $df_M > 1$ is smaller than $\chi^2_M$. This statistic is used to test absolute model fit. The null hypothesis of exact fit is that there is no difference between the proposed model and the data. A large value of $\chi^2_M$ with a correspondingly small p value implies that the model does not fit the data well.

  2. Root Mean Square Error of Approximation (RMSEA) and its 90% confidence interval.

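    The RMSEA is a function of the $\chi^2_M$ statistic, defined by

    $$\mathrm{RMSEA} = \sqrt{\frac{\hat{\delta}_M}{df_M\,(n - 1)}} \tag{11}$$

    where $\hat{\delta}_M = \max(0, \chi^2_M - df_M)$ is the estimated noncentrality parameter and $\chi^2_M$ is defined in (10).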

  3. The Comparative Fit Index (CFI).

    Let $I$ denote the null (independence) model, whose statistic $\chi^2_I$ is approximately central chi-square distributed with $df_I$ degrees of freedom. The CFI can be obtained using the ML estimator. This index is given by:

    $$\mathrm{CFI} = 1 - \frac{\chi^2_M - df_M}{\chi^2_I - df_I} \tag{12}$$

  4. Goodness of Fit Index (GFI)

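    GFI is the relative amount of the variances and covariances jointly accounted for by the model. It is given by:

    $$\mathrm{GFI} = 1 - \frac{\operatorname{tr}\Big[\big(\Sigma(\theta)^{-1}S - I\big)^2\Big]}{\operatorname{tr}\Big[\big(\Sigma(\theta)^{-1}S\big)^2\Big]} \tag{13}$$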

    GFI varies from 0 to 1.0.

  5. Standardized Root Mean Square Residual (SRMR).

    SRMR is an absolute fit index and a badness-of-fit statistic obtained by standardizing the Root Mean Square Residual (RMR). It is a measure of the mean absolute covariance residual. An SRMR of 0 means an ideal model fit, and increasingly higher values indicate a worse fit [1].

    Table 1 summarizes the interpretation of the most important goodness-of-fit indices.

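| Model fit value (absolute fit indices) | Rule-of-thumb guideline: excellent | Rule-of-thumb guideline: acceptable |
|---|---|---|
| Chi-square | p > 0.05 | Smaller values |
| CFI | ≥ 0.95 | ≥ 0.90 |
| RMSEA | ≤ 0.05 | ≤ 0.08 |
| SRMR | ≤ 0.05 | ≤ 0.08 |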

Table 1.

Guidelines in SEM for select model fit statistics and indices [2].
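
The definitions of the chi-square p value, RMSEA, and CFI given above translate directly into a few lines of R. The sketch below uses made-up input values (the chi-square statistics, degrees of freedom, and sample size are hypothetical) and a hypothetical helper name, `fit_indices`. In practice lavaan reports these quantities itself, and its internal formulas may differ in detail, for instance in the use of $n$ versus $n - 1$.

```r
# Chi-square p value, chi-square/df ratio, RMSEA, and CFI from the
# statistics of the fitted model (m) and the independence model (i).
fit_indices <- function(chisq_m, df_m, chisq_i, df_i, n) {
  delta_m <- max(0, chisq_m - df_m)              # estimated noncentrality
  c(p_value = pchisq(chisq_m, df_m, lower.tail = FALSE),
    ratio   = chisq_m / df_m,
    rmsea   = sqrt(delta_m / (df_m * (n - 1))),
    cfi     = 1 - (chisq_m - df_m) / (chisq_i - df_i))
}

# Hypothetical values, for illustration only.
idx <- fit_indices(chisq_m = 60, df_m = 40, chisq_i = 540, df_i = 55, n = 201)

# Compare with the "acceptable" guidelines (CFI at least 0.90, RMSEA at most 0.08).
acceptable <- c(cfi   = idx[["cfi"]] >= 0.90,
                rmsea = idx[["rmsea"]] <= 0.08)
```

When the model chi-square is smaller than its degrees of freedom, the CFI expression as written exceeds 1; software typically truncates the index to the interval from 0 to 1.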

5. SEM example

5.1 Database

To illustrate the use of the lavaan package of the R software, data from a study carried out in a Public Maternal Hospital in the state of Guerrero, Mexico, are used. The database corresponds to a cross-sectional study of pregnant women who presented to the emergency department of the Maternal Hospital with a clinical picture compatible with an obstetric emergency [16]. Two groups of patients were formed: one group was treated from January 2009 to December 2011, the period before the implementation of a process called Red Code (Before RC), which is aimed at pregnant women in obstetric emergency situations; the other group was treated from September 2013 to December 2015, when the Red Code (RC) procedure was in place. The observed variables are the same in both cases, and the number of observations is 106 for the RC period and 230 for Before RC. The code and analysis presented below correspond to the data from the RC period; the Before RC case is handled in the same way. Since these are two different data sets, it is not possible to apply an analysis of variance. Therefore, to compare the results of the two models, only the fit indices and the factor loading and regression coefficients are compared.

SEM is based on the variance/covariance matrix of the observed variables. However, when the observed variables have very different variances, it is suggested to use the correlation matrix. The R software is available under the GNU GPL (General Public License) on the CRAN (Comprehensive R Archive Network) website, https://CRAN.R-project.org [17]. To implement SEM with the lavaan package [18], first install and load it with the instructions:

install.packages("lavaan")
library(lavaan)

In this data set, the opinions of the expert medical personnel assigned to the Maternal Hospital were used to determine the following latent and observed variables. First Hemodynamic State (FHS) is made up of the observed variables temperature (Tm1), heart rate (HR1), blood pressure (BP1), respiratory rate (BF1), and number of seizures (NC). Second Hemodynamic State (SHS) is made up of the observed variables temperature (Tm2), heart rate (HR2), blood pressure (BP2), and respiratory rate (BF2). Gyneco-obstetric background (OGH) is measured by the number of abortions (NumAb), number of cesarean sections (NumCa), weight of the pregnant woman (PW), and number of vaginal deliveries (NVD). Treatment (Treat) is formed by plasma (PLAS), platelets (PLAT), and erythrocyte concentrates (EC).

Results of the Emergency Obstetric Care (Remoc) measures the consequences of the actions carried out in the RC process: the number of sequelae (NumS), the weight of the newborn (NW) in kilograms, and the weeks of gestation (GW).

5.2 Model specification

In this example, the function sem() of the lavaan library is applied to the correlation matrix, Cor.RC, and the number of observations, N.RC. The model syntax is stored in Sm.RC, and the object fit.RC is created, where lavaan stores the results of our SEM.

# Model specification
Sm.RC <- '
# Measurement model
FHS =~ BP1 + BF1 + HR1 + Tm1 + NC
SHS =~ HR2 + Tm2 + BP2 + BF2
OGH =~ PW + NVD + NumAb + NumCa
Treat =~ PLAT + PLAS + EC
Remoc =~ NumS + NW + GW
# Structural model
FHS ~ OGH
SHS ~ FHS
Treat ~ OGH + FHS + SHS
Remoc ~ Treat + OGH + FHS + SHS
'
# Fit the model from the correlation matrix and the sample size
fit.RC <- sem(Sm.RC, sample.cov = Cor.RC, sample.nobs = N.RC)
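
The chapter does not show how `Cor.RC` and `N.RC` were created. One plausible way, sketched here under the assumption that the raw RC-period data sit in a data frame with one column per observed variable, is to compute the correlation matrix with `cor()` and the sample size with `nrow()`. The data frame `dat.RC` below is a hypothetical name and is filled with simulated placeholder values purely so that the code runs; it is not the hospital data.

```r
# Names of the 19 observed variables used in the model syntax.
vars <- c("BP1", "BF1", "HR1", "Tm1", "NC",
          "HR2", "Tm2", "BP2", "BF2",
          "PW", "NVD", "NumAb", "NumCa",
          "PLAT", "PLAS", "EC",
          "NumS", "NW", "GW")

# Placeholder data frame (simulated noise, NOT the study data).
set.seed(2022)
dat.RC <- as.data.frame(matrix(rnorm(106 * length(vars)), nrow = 106,
                               dimnames = list(NULL, vars)))

# Inputs for sem(): correlation matrix and number of observations.
Cor.RC <- cor(dat.RC[, vars])
N.RC   <- nrow(dat.RC)
```

The row and column names of the matrix need to match the variable names used in the model syntax, which `cor()` preserves from the column names of the data frame.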

5.3 Model assessment

The estimation method used in this example is maximum likelihood. To obtain the results, it is common to use the function summary(), which provides the chi-square test, fit indices of the model (RMSEA, CFI, among others), the estimated factor loadings, the regression coefficients, standard errors, z values, and p values for each estimated coefficient. Indices that summary() does not print, such as AGFI, can be requested with fitMeasures(). In this example, only the estimates for both periods are included, since we are interested in identifying the change or effect in each parameter estimate.

summary(fit.RC, fit.measures = TRUE)

Before interpreting the results of the fitted model, it is necessary to verify that the fit is adequate. Table 2 presents the chi-square statistics with their p-values and some fit indices used to evaluate the fit of the model for the Before RC and RC cases.

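| Index | Before RC | RC | Reference |
|---|---|---|---|
| Chi-square | 295.5 (p-value = 0.00) | 156.0 (p-value = 0.216) | |
| CFI | 0.803 | 0.915 | ≥ 0.90 |
| RMSEA (confidence interval) | 0.068 (0.057–0.079); p-value = 0.004 | 0.029 (0.000–0.056); p-value = 0.887 | 0.05–0.08 |
| SRMR | 0.078 | 0.087 | ≤ 0.08 |
| AGFI | 0.852 | 0.827 | |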

Table 2.

Goodness of fit indices for the two SEMs.
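
As a consistency check on Table 2, the RMSEA values can be recomputed from the chi-square statistics using the RMSEA definition of Section 4.4. The degrees of freedom are not reported in the chapter; the value used below (143) is our own count for the model of Section 5.2, assuming lavaan's default of fixing the first loading of each latent variable to 1 and no additional residual covariances. It should be read as an assumption, not as a reported result.

```r
# Counting rule for the model of Section 5.2 (our own count, not reported
# in the source).
p <- 19                        # observed variables
moments <- p * (p + 1) / 2     # 190 distinct variances and covariances
t_free <- (p - 5) +            # free loadings: one per factor fixed to 1
          p +                  # error variances of the indicators
          5 +                  # variances/disturbances of the 5 latent variables
          9                    # structural paths: 1 + 1 + 3 + 4
df_m <- moments - t_free       # 190 - 47 = 143

# RMSEA as the square root of max(0, chisq - df) / (df * (n - 1)).
rmsea <- function(chisq, df, n) sqrt(max(0, chisq - df) / (df * (n - 1)))

rmsea_rc     <- rmsea(156.0, df_m, n = 106)   # RC period
rmsea_before <- rmsea(295.5, df_m, n = 230)   # Before RC period
p_rc <- pchisq(156.0, df_m, lower.tail = FALSE)
```

Under these assumptions the recomputed RMSEA values round to 0.029 and 0.068, and the chi-square p value for the RC model is about 0.22, in line with Table 2.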

The chi-square results for the RC period are better than those for Before RC. Because the chi-square statistic is sensitive to sample size, correlation size, and non-normality, it is suggested that other fit indices be used as well. However, since there is no consensus on which goodness-of-fit index is best, several of the indices available in the lavaan library of the R software are used here. The results are as follows: (a) the CFI for RC (0.915) is greater than that for Before RC (0.803) and greater than the reference value (0.90); (b) the RMSEA for RC (0.029) is less than that for Before RC (0.068) and is even below the reference range (0.05–0.08); and (c) the SRMR for Before RC (0.078) is less than that for RC (0.087), and both values are very close to the reference value (0.08).

Finally, the AGFI values for both periods, Before RC and RC, are close to the reference value of 0.90. In summary, according to the rule-of-thumb guidelines in the SEM literature for model fit statistics and indices, cited by [1, 2], these results indicate a good fit of both models, in particular for the RC case.

5.4 Model interpretation

The interpretation for FHS is as follows: in the Before RC period, when the FHS increases by one unit, then BP1, BF1, HR1, Tm1, and NC increase by 1.0, 1.41, 1.73, 1.33, and 0.55, respectively. While for the RC period, when FHS increases by one unit, then BP1, HR1, and Tm1 increase by 1.0, 4.4, and 0.10, respectively, but BF1 and NC decrease by 1.76 and 1.98, respectively. The interpretation of the other latent variables is done in a similar way.

The observed variables with the greatest effect on each latent variable in the measurement model are as follows: (a) In FHS: BF1 in both periods (1.41 for Before RC and −1.76 for RC) and HR1 (1.73 for Before RC and 4.40 for RC). Additionally, all the factor loadings are significant for Before RC, whereas none are significant for the RC period; (b) In SHS: all factor loadings have similar effects in both periods, but they are significant only for Before RC and not significant (NS) for the RC period; (c) In OGH: the effect of NVD increased from 2.58 in the Before RC period to 30.96 in the RC period. In contrast, the effect of NumAb decreased from 0.84 for Before RC to −6.99 for RC. However, all loadings were NS in both periods; (d) In Treat: the effect of PLAS increased from 0.95 in the Before RC period to 1.48 in the RC period. For this latent variable, all factor loadings were significant in both periods; and (e) In Remoc: the effects of the observed variables NW and GW increased from the Before RC period to the RC period, from −13.85 to 5.65 and from −18.22 to 5.14, respectively; however, all factor loadings were NS.

Although most of the structural model results presented in Table 3 are not significant in either period, the following can be said: (a) when OGH increases by one unit, FHS increases by 0.11 units in the Before RC period, but decreases by 1.27 units in the RC period; (b) when FHS increases by one unit, SHS increases by 1.22 units in the Before RC period (the only significant structural coefficient), but decreases by 0.25 units in the RC period. The other results of the structural model are interpreted in a similar way for both periods.

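Measurement model

| LV | Indicator | Before RC | RC |
|---|---|---|---|
| FHS | BP1 | 1.000 | 1.000 |
| FHS | BF1 | 1.410*** | −1.755 NS |
| FHS | HR1 | 1.729*** | 4.401 NS |
| FHS | Tm1 | 1.333*** | 0.103 NS |
| FHS | NC | 0.548*** | −1.978 NS |
| SHS | HR2 | 1.000 | 1.000 |
| SHS | Tm2 | 0.491*** | 0.497 NS |
| SHS | BP2 | 0.803*** | 0.201 NS |
| SHS | BF2 | 0.563*** | 0.650 NS |
| OGH | PW | 1.000 | 1.000 |
| OGH | NVD | 2.579 NS | 30.956 NS |
| OGH | NumAb | 0.838 NS | −6.993 NS |
| OGH | NumCa | −0.978 NS | 0.381 NS |
| Treat | PLAT | 1.000 | 1.000 |
| Treat | PLAS | 0.949*** | 1.483*** |
| Treat | EC | 0.407*** | 0.531*** |
| Remoc | NumS | 1.000 | 1.000 |
| Remoc | NW | −13.848 NS | 5.654 NS |
| Remoc | GW | −18.224 NS | 5.138 NS |

Structural model

| Outcome LV | Predictor LV | Before RC | RC |
|---|---|---|---|
| FHS | OGH | 0.111 NS | −1.270 NS |
| SHS | FHS | 1.224*** | −0.248 NS |
| Treat | OGH | 1.194 NS | −8.185 NS |
| Treat | FHS | −0.182 NS | −0.174 NS |
| Treat | SHS | 0.354 NS | −0.251 NS |
| Remoc | Treat | 0.010 NS | 0.087 NS |
| Remoc | OGH | −0.042 NS | 0.478 NS |
| Remoc | FHS | −0.062 NS | −0.054 NS |
| Remoc | SHS | 0.017 NS | 0.010 NS |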

Table 3.

Estimates of the measurement model and structural model for both models.

LV: latent variable; NS: not significant.

***: P(>|z|) < 0.05, that is, statistically significant at the 0.05 level.
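The significance flags in Table 3 are based on P(>|z|), the two-sided p-value of a Wald z statistic (estimate divided by its standard error). A minimal base R sketch of that rule follows; the estimates and standard errors are hypothetical values chosen only for illustration, not results from the chapter, and `wald_p` is a hypothetical helper name.

```r
# Two-sided p-value for a Wald z statistic (estimate / standard error).
wald_p <- function(estimate, se) {
  z <- estimate / se
  2 * pnorm(-abs(z))
}

# Hypothetical estimates and standard errors, for illustration only.
est <- c(a = 1.41, b = 0.84)
se  <- c(a = 0.30, b = 0.90)

p    <- wald_p(est, se)
flag <- ifelse(p < 0.05, "***", "NS")  # same 0.05 rule as the table note

print(data.frame(estimate = est, se = se, p_value = round(p, 3), flag = flag))
```

With these illustrative inputs, the first parameter is flagged as significant and the second as NS, since a large standard error relative to the estimate yields a small |z|.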

Finally, it is worth noting that although the number of observations for the Before RC period (230) is greater than the number of observations for the RC period (106), the fit indices are better for the RC case.

5.5 Results in a path diagram

It is common and useful to display the SEM results in a path diagram, for which the semPaths function of the semPlot package can be used:

library(semPlot)
semPaths(fit_RD, "par", edge.label.cex = 1.2, fade = FALSE, style = "lisrel", layout = "tree")

Figure 5 corresponds to the RC period. It shows the final diagram of the model established in the model specification section, together with the values of the estimates. Negative effects are shown in red and positive effects in green.

Figure 5.

Path diagram of the SEM for the RC period.

6. Conclusions

In summary, it can be concluded that there was a positive effect on the health status of patients treated with the RC process compared to patients who were not treated. The results of this study can provide information for designing hospital management strategies for pregnant women with high morbidity in order to improve the quality of service; in particular, they can help the Hospital de la Madre y el Niño Guerrense in the care it provides. Finally, the proposed SEM, in addition to helping readers understand how to handle and interpret the model, can help to evaluate the effects of emergency obstetric care using observed and latent variables.

Conflict of interest

The authors declare no conflict of interest.

Abbreviations

AGFI: Adjusted Goodness of Fit Index
BMI: Body Mass Index
CFI: Comparative Fit Index
CFA: Confirmatory Factor Analysis
CRAN: Comprehensive R Archive Network
FA: Factor Analysis
GFI: Goodness of Fit Index
GLS: Generalized Least Squares
LRM: Linear Regression Model
ML: Maximum Likelihood
MRM: Multiple Regression Models
MVN: Multivariate Normal Distribution
PA: Path Analysis
RAM: Reticular Action Model
RC: Red Code
RMSEA: Root Mean Square Error of Approximation
SEM: Structural Equation Models
SR: Structural Regression
SRMR: Standardized Root Mean Square Residual
TLI: Tucker-Lewis Index
ULS: Unweighted Least Squares
LISREL: Linear Structural Relations

References

  1. Kline RB. Principles and Practice of Structural Equation Modeling. 4th ed. New York: Guilford Press; 2016
  2. Gunzler DD, Perzynski AT, Carle AC. Structural Equation Modeling for Health and Medicine. Chapman & Hall/CRC; 2021. p. 299
  3. Darbandi M, Najafi F, Pasdar Y, Mostafaei S, Rezaeian S. Factors associated with overweight and obesity in adults using structural equation model: Mediation effect of physical activity and dietary pattern. Eating and Weight Disorders-Studies on Anorexia, Bulimia and Obesity. 2020;25(6):1561-1571. DOI: 10.1007/s40519-019-00793-7
  4. Bollen KA. Structural Equations with Latent Variables. John Wiley & Sons; 1989
  5. Romero-Ibarguengoitia ME, Vadillo-Ortega F, Caballero AE, Ibarra-González H-RA, Serratos-Canales MF, et al. Family history and obesity in youth, their effect on acylcarnitine/aminoacids metabolomics and non-alcoholic fatty liver disease (NAFLD). Structural equation modeling approach. PLoS One. 2018;13(2):1-17. DOI: 10.1371/journal.pone.0193138
  6. Wu CK, Yang CY, Lin JW, Hsieh HJ, Chiu FC, Chen JJ, et al. The relationship among central obesity, systemic inflammation, and left ventricular diastolic dysfunction as determined by structural equation modeling. Obesity. 2012;20(4):730-737
  7. Lee SY. Structural Equations Modeling: A Bayesian Approach. John Wiley & Sons; 2007. p. 432
  8. Kim HY. Statistical notes for clinical researchers: Assessing normal distribution using skewness and kurtosis. Restorative Dentistry and Endodontics. 2013;38(1):52-54
  9. Bentler PM, Chou C-P. Practical issues in structural modeling. Sociological Methods & Research. 1987;16(1):78-117
  10. Jayaram J, Kannan V, Tan K. Influence of initiators on supply chain value creation. International Journal of Production Research. 2004;42(20):4377-4399
  11. Celik HE, Yilmaz V. Lisrel 9.1 ile Yapisal Esitlik Modellemesi. Ankara: Ani Yayincilik; 2013
  12. Thakkar JJ. Structural Equation Modelling. Applications for Research and Practice (with AMOS and R). Singapore: Springer; 2020. p. 124. DOI: 10.1007/978-981-15-3793-6
  13. Kaplan D, Depaoli S. Bayesian structural equation modeling. In: Handbook of Structural Equation Modeling. New York: The Guilford Press; 2012. pp. 650-673
  14. Olsson UH, Foss T, Troye SV, Howell RD. The performance of ML, GLS, and WLS estimation in structural equation modeling under conditions of misspecification and nonnormality. Structural Equation Modeling. 2000;7(4):557-595
  15. Schumacker RE, Lomax RG. A Beginner's Guide to Structural Equation Modeling. Routledge: Taylor & Francis; 2016
  16. Pérez-Castro E, Godínez-Jaimes F, Barrera-Rodríguez E, Reyes-Carreto R, López-Roque R, Vera-Leyva V. Impact of the red code process using structural equation models. In: Antoniano-Villalobos I, Mena R, Mendoza M, Naranjo L, Nieto-Barajas L, editors. Selected Contributions on Statistics and Data Science in Latin America. FNE 2018. Springer Proceedings in Mathematics & Statistics. Springer, Cham; 2018. pp. 111-125. DOI: 10.1007/978-3-030-31551-1_9
  17. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2017
  18. Rosseel Y. lavaan: An R package for structural equation modeling. Journal of Statistical Software. 2012;48(2):1-36
