Bayesian Random-Effects Meta-Analysis Models in Gene Expression Studies

Uma Siangphoe

doi:10.5772/intechopen.103124

Abstract

Random-effects meta-analysis models are commonly applied to combine effect sizes from individual gene expression studies. However, study heterogeneity is unknown and may arise from variation in sample quality and experimental conditions. High heterogeneity of effect sizes can reduce the statistical power of the models. In addition, classical random-effects meta-analysis models rely on a normal approximation, which may be inadequate in small samples, and their results may be biased toward the null value. A Bayesian approach was used to avoid the approximation and the biases. We applied a sample-quality weight to adjust for study heterogeneity in a Bayesian random-effects meta-analysis model, weighting the between-study variance by a sample quality indicator, and illustrated the approach in Alzheimer’s gene expression studies.

Keywords

Bayesian random-effects model
meta-analysis
study heterogeneity
gene expression
sample quality weight
Alzheimer’s disease

Author Information

Uma Siangphoe*
- Moderna, Cambridge, MA, USA

*Address all correspondence to: uma.siangphoe@gmail.com

1. Introduction

Advances in the development of high-throughput technologies have enabled researchers to identify and quantify a large range of gene expression differences in many diseases. The number of gene expression studies has been increasing over the past decades as a result of these technologies. Several of them address the same biological questions, even though they have been conducted under different experimental conditions. Meta-analysis of gene expression data has therefore become a widely applied approach for combining results from multiple studies to identify common expression patterns that sometimes cannot be detected in individual studies. The meta-analytic approach has been shown to increase statistical power, accuracy, and generalizability of results [1, 2, 3, 4]. The choice of meta-analysis technique depends on the type of response and the study objectives, and most analyses in microarray studies have emphasized identifying differentially expressed (DE) genes, that is, genes that distinguish the groups of samples.

Random-effects (RE) meta-analysis models are commonly applied to combine effect sizes from individual gene expression studies. However, study heterogeneity is unknown and may arise from variation in sample quality and experimental conditions, and it can decrease the statistical power of the models. To maintain power, we can increase the number of studies [5] or apply an appropriate estimation method for incorporating study heterogeneity into the models. Typically, the classical RE models assume that studies are independently and uniformly sampled from a population of studies. However, studies may be designed based on the results of previous studies, so the assumptions of independence and of an infinite population of studies may not hold. Bayesian random-effects (BRE) models have been applied to handle the uncertainty of parameters in the models. The uncertainty is incorporated through a prior distribution. The evidence provided by the observed data is described by the likelihood of the models. Multiplying the prior distribution by the likelihood function gives, up to a normalizing constant, the posterior distribution of the parameters [6, 7].

Sample quality has a substantial impact on results of gene expression studies [8, 9]. Low heterogeneity can be found in meta-analyses containing good-quality samples, while poor-quality samples can induce high heterogeneity of effect sizes. We recently evaluated the relationships between DE and heterogeneous genes in meta-analyses of Alzheimer’s gene expression data. Some overlapped DE and heterogeneous genes were detected in meta-analyses containing borderline- or poor-quality samples, while no heterogeneous genes were identified in meta-analyses containing good-quality samples [10]. The data obtained from borderline- or poor-quality samples can increase study heterogeneity and decrease the efficiency of meta-analyses [11, 12].

Small samples in gene expression studies may limit the application of classical RE models, and their results may be biased toward the null, meaning that the observed value is closer to the null hypothesis than the true value. The BRE model can be used to avoid the approximation and the biases. We introduced a meta-analytic approach that included a sample-quality weight to adjust study heterogeneity in the BRE model [13]. The gene expression data would therefore include both up-weighted good-quality samples and down-weighted borderline-quality samples. In this chapter, we first review the classical RE models, the BRE model, and the weighted BRE model in the methods section and then illustrate an application of the methods in Alzheimer’s gene expression studies. Our results are compiled in the results section, followed by the discussion and conclusions.

2. Methods

2.1 Standard random-effects model

Choi et al. [14] introduced an unbiased standardized mean difference in expression between groups for each gene g [14, 15]. The meta-analytic model for combining the effect sizes is based on a two-level hierarchical model as follows:

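$$y_{ig} = \theta_{ig} + \varepsilon_{ig}, \quad \varepsilon_{ig} \sim N(0, \sigma_{ig}^2)$$

$$\theta_{ig} = \beta_g + \delta_{ig}, \quad \delta_{ig} \sim N(0, \tau_g^2) \tag{1}$$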

where $y_{ig}$ denotes the observed difference in expression (the effect size) for gene $g$ in study $i = 1, \ldots, k$; $\theta_{ig}$ denotes the true difference in mean expression; $\sigma_{ig}^2$ denotes the within-study variability, representing sampling error conditional on the $i$th study; $\beta_g$ denotes the common effect, that is, the average measure of differential expression across the individual datasets for each gene and the parameter of interest; $\delta_{ig}$ denotes the random effects; and $\tau_g^2$ denotes the between-study variability. We estimate the common effect by weighted least squares: minimizing the weighted sum of squared errors with respect to $\beta_g$ for each gene yields

$$\hat{\beta}_g = \frac{\sum_{i=1}^{k} w_{ig}\, y_{ig}}{\sum_{i=1}^{k} w_{ig}}, \quad w_{ig} = \sigma_{ig}^{-2}. \tag{2}$$

Generally, an unbiased estimator for $\tau_g^2$ can be derived by the method of moments. Because this estimator can take negative values, its truncated version, the DerSimonian-Laird (DSL) estimator, is used instead:

$$\hat{\tau}_{DSL,g}^2 = \hat{\tau}_g^2 = \max\left\{0,\ \frac{Q_g - (k_g - 1)}{S_{1g} - S_{2g}/S_{1g}}\right\} \tag{3}$$

where $Q_g = \sum_{i=1}^{k} w_{ig}\,(y_{ig} - \hat{\beta}_g)^2$; $w_{ig} = \sigma_{ig}^{-2}$; $S_{rg} = \sum_{i=1}^{k} w_{ig}^{r}$; and $k_g$ is the number of studies for gene $g$. The DSL estimator may have a small bias, but the bias occurs where $\hat{\tau}_g^2$ is close to zero, that is, near homogeneity [16], and the bias of the DSL estimator cannot be traded off against variance reductions [17, 18]. Therefore, the DSL estimator is commonly applied when fitting random-effects models for a meta-analysis [19, 20]. In this study, we estimated $\hat{\beta}_g(\hat{\tau}_g^2)$ for each gene from the microarray data with the weight $w_{ig} = (\hat{\sigma}_{ig}^2 + \hat{\tau}_g^2)^{-1}$ by the generalized least squares method [14], which provides the minimum variance unbiased estimator for $\beta_g$, to obtain the statistic $z_g$ for each gene:

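$$\hat{\beta}_{DSL,g}(\hat{\tau}_{DSL,g}^2) = \hat{\beta}_g(\hat{\tau}_g^2) = \frac{\sum_{i=1}^{k} (\hat{\sigma}_{ig}^2 + \hat{\tau}_g^2)^{-1}\, y_{ig}}{\sum_{i=1}^{k} (\hat{\sigma}_{ig}^2 + \hat{\tau}_g^2)^{-1}}, \tag{4}$$

$$\mathrm{Var}\left(\hat{\beta}_{DSL,g}(\hat{\tau}_{DSL,g}^2)\right) = \mathrm{Var}\left(\hat{\beta}_g(\hat{\tau}_g^2)\right) = \frac{1}{\sum_{i=1}^{k} (\hat{\sigma}_{ig}^2 + \hat{\tau}_g^2)^{-1}}, \tag{5}$$

such that

$$z_g = \frac{\hat{\beta}_{DSL,g}(\hat{\tau}_{DSL,g}^2)}{\sqrt{\mathrm{Var}\left(\hat{\beta}_{DSL,g}(\hat{\tau}_{DSL,g}^2)\right)}} \sim N(0, 1). \tag{6}$$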

The standard random-effects model currently estimates the between-study variance $\tau_g^2$, or the study heterogeneity, using the DSL estimator.
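The estimation steps in Eqs. (2) to (6) can be sketched for a single gene as follows. This is a minimal NumPy illustration rather than the code used in the study; the function name `dsl_random_effects` and the example numbers are hypothetical, and the number of studies is taken to be the length of the input.

```python
import numpy as np

def dsl_random_effects(y, var_within):
    """DerSimonian-Laird random-effects summary for one gene.

    y          : per-study effect sizes y_ig (length k)
    var_within : per-study within-study variances sigma_ig^2 (length k)
    """
    y = np.asarray(y, dtype=float)
    v = np.asarray(var_within, dtype=float)
    k = y.size

    # Inverse-variance weights and the weighted least squares estimate (Eq. 2)
    w = 1.0 / v
    beta_fixed = np.sum(w * y) / np.sum(w)

    # Q statistic and the truncated moment estimator of tau^2 (Eq. 3)
    q = np.sum(w * (y - beta_fixed) ** 2)
    s1, s2 = np.sum(w), np.sum(w ** 2)
    tau2 = max(0.0, (q - (k - 1)) / (s1 - s2 / s1))

    # Random-effects weights, pooled estimate, variance, z statistic (Eqs. 4-6)
    w_re = 1.0 / (v + tau2)
    beta = np.sum(w_re * y) / np.sum(w_re)
    var_beta = 1.0 / np.sum(w_re)
    z = beta / np.sqrt(var_beta)
    return {"beta": beta, "tau2": tau2, "var_beta": var_beta, "z": z}

# Hypothetical effect sizes and variances for one gene in four studies
example = dsl_random_effects([0.8, 0.2, 1.1, 0.5], [0.10, 0.08, 0.15, 0.12])
print(example)
```

When the effect sizes are identical across studies, the Q statistic is zero and the truncation sets the estimate of $\tau_g^2$ to zero, so the sketch reduces to the inverse-variance weighted mean.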

3. Bayesian random-effects model (BRE)

In contrast to the classical RE model, the data and model parameters in the BRE model are considered to be random quantities [21]. We applied the BRE model to allow for the uncertainty of the between-study variance in this study. The model for gene g is written as

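$$y_{ig} \mid \theta_{ig} \sim N(\theta_{ig}, \sigma_{ig}^2),$$

$$\theta_{ig} \mid \beta_g, \tau_g \sim N(\beta_g, \tau_g^2),$$

$$\beta_g \sim N(0, 1000), \text{ and}$$

$$\tau_g \sim \text{uniform}(0, 1). \tag{7}$$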

The kernel of the posterior distribution can be written as

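$$p(\beta_g, \theta_{1g}, \ldots, \theta_{kg}, \tau_g^2) \propto p(\theta_g \mid y_g, \sigma_g^2)\, p(\beta_g, \tau_g^2 \mid \theta_g) \propto \prod_{i=1}^{k} p(\theta_{ig} \mid y_{ig}, \sigma_{ig}^2)\, p(\theta_{ig} \mid \beta_g, \tau_g^2)\, \pi(\beta_g)\, \pi(\tau_g^2), \tag{8}$$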

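where $y_g = (y_{1g}, \ldots, y_{kg})$, $\sigma_g^2 = (\sigma_{1g}^2, \ldots, \sigma_{kg}^2)$, and $\theta_g = (\theta_{1g}, \ldots, \theta_{kg})$ for gene $g$ across studies $i = 1, \ldots, k$. The priors $\pi(\beta_g)$ and $\pi(\tau_g^2)$ are non-informative and are given as $\beta_g \sim N(0, 1000)$ and $\tau_g \sim \text{uniform}(0, 1)$.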

The choice of prior distributions for scale parameters can affect analysis results, particularly in small samples. For scale parameters, both the distributional form and the location of the prior distribution need to be considered [22]. Uniform distributions are appropriate non-informative priors for $\tau_g^2$ [6, 13].
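To make the role of the priors concrete, the posterior of $\beta_g$ under the model in Eq. (7) can be approximated for one gene without MCMC, by integrating $\beta_g$ out analytically for each value of $\tau_g$ on a grid over (0, 1). The sketch below is an illustration under stated assumptions and not the sampler used in the study: the function name `bre_grid_posterior` is hypothetical, and 1000 is read as the prior variance of $\beta_g$ (if it is a precision-based parameterization, `prior_var_beta` would need to change accordingly).

```python
import numpy as np

def bre_grid_posterior(y, var_within, prior_var_beta=1000.0, n_grid=2000):
    """Grid approximation to the posterior of beta_g and tau_g for one gene.

    Assumes beta_g ~ N(0, prior_var_beta) with prior_var_beta a variance,
    and tau_g ~ uniform(0, 1).
    """
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(var_within, dtype=float)

    # Midpoint grid on (0, 1); the uniform prior on tau is flat over this grid
    tau = (np.arange(n_grid) + 0.5) / n_grid

    # Marginal variance of y_ig given beta and tau (shape: n_grid x k)
    v = s2[None, :] + tau[:, None] ** 2

    # Conditional posterior of beta given tau is N(m, cond_var)
    cond_var = 1.0 / (1.0 / prior_var_beta + np.sum(1.0 / v, axis=1))
    m = cond_var * np.sum(y[None, :] / v, axis=1)

    # Log marginal likelihood of y given tau, with beta integrated out
    log_ml = (
        -0.5 * np.sum(np.log(2 * np.pi * v) + (y[None, :] - m[:, None]) ** 2 / v, axis=1)
        - 0.5 * (np.log(2 * np.pi * prior_var_beta) + m ** 2 / prior_var_beta)
        + 0.5 * np.log(2 * np.pi * cond_var)
    )

    # Normalized posterior weights over the tau grid
    p_tau = np.exp(log_ml - log_ml.max())
    p_tau /= p_tau.sum()

    # Mix the conditional normal distributions of beta over tau
    beta_mean = np.sum(p_tau * m)
    beta_var = np.sum(p_tau * (cond_var + m ** 2)) - beta_mean ** 2
    tau_mean = np.sum(p_tau * tau)
    return {"beta_mean": beta_mean, "beta_sd": np.sqrt(beta_var), "tau_mean": tau_mean}

# Hypothetical effect sizes and variances for one gene in four studies
print(bre_grid_posterior([0.8, 0.2, 1.1, 0.5], [0.10, 0.08, 0.15, 0.12]))
```

Because the posterior of $\beta_g$ averages over all plausible values of $\tau_g$ instead of plugging in a single point estimate, its standard deviation is never smaller than the one obtained by fixing $\tau_g$ at zero.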

4. Sample-quality weights

The quality control (QC) criteria for identifying poor-quality samples in this study were a 3′:5′ GAPDH ratio greater than 3 and/or a percent of present calls less than 30% for Affymetrix arrays, and a detection rate less than 30% for Illumina Bead Arrays, in addition to data visualizations [8, 23]. Poor-quality samples were excluded before data preprocessing. Furthermore, the inverse of the within-study variance is considered an optimal weight for meta-analysis: the variance of the weighted mean ($\hat{\beta}_g$) is minimized when the weights are the inverse variances of the $y_{ig}$, so a high variance gives a low weight in the meta-analysis [24, 25]. In our recent study, the weight that incorporates the QC indicators, called the “zero-to-one weight”, was the most appropriate for detecting DE genes [13]. The QC indicators adjusted the within-study variance in the weight function as:

$$w_{P6} = \left(\sigma_{ig}^2\, w_{P1} + \hat{\tau}_g^2\right)^{-1}, \tag{9}$$

where $w_{P1} \in \{2 - S_{ij},\ 0.01\tilde{P}_{ij}\}$ and $\tilde{P}_{ij}$ denotes the percent of present calls of the $j$th sample in the $i$th study. A high value of $\tilde{P}_{ij}$ indicates a good-quality sample and yields a high zero-to-one weight $w_{P,ij}$, which gives more weight to the expression data.
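The QC screening rules and the zero-to-one scaling of the percent of present calls can be sketched as below. This is an illustration only: the function names and the platform labels `"affymetrix"` and `"illumina"` are hypothetical, the thresholds are the ones quoted in this section, and the weight is taken to be 0.01 times the percent of present calls.

```python
def is_poor_quality(platform, gapdh_ratio=None, pct_present=None, detection_rate=None):
    """Flag a sample as poor quality using the thresholds quoted in the text.

    Affymetrix: 3':5' GAPDH ratio > 3 and/or percent of present calls < 30.
    Illumina:   detection rate < 30.
    Percentages are expected on a 0-100 scale; missing indicators are ignored.
    """
    if platform == "affymetrix":
        bad_ratio = gapdh_ratio is not None and gapdh_ratio > 3
        bad_calls = pct_present is not None and pct_present < 30
        return bad_ratio or bad_calls
    if platform == "illumina":
        return detection_rate is not None and detection_rate < 30
    raise ValueError(f"unknown platform: {platform!r}")

def zero_to_one_weight(pct_present):
    """Rescale a percent of present calls (0-100) to a weight between 0 and 1."""
    if not 0 <= pct_present <= 100:
        raise ValueError("pct_present must lie between 0 and 100")
    return 0.01 * pct_present

# Hypothetical samples: (GAPDH ratio, percent present calls)
for ratio, calls in [(1.2, 48.0), (3.6, 41.0), (1.9, 27.5)]:
    flagged = is_poor_quality("affymetrix", gapdh_ratio=ratio, pct_present=calls)
    print(ratio, calls, flagged, zero_to_one_weight(calls))
```

Samples flagged by `is_poor_quality` would be dropped before preprocessing, while the remaining samples keep a weight that grows with their percent of present calls.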

5. Weighted between-study variance model

We adjusted the between-study variance in the BRE model (Eq. (7)) by multiplying it by the average weight over all samples in the $i$th study for gene $g$, $\bar{w}_{ig} = \sum_{j=1}^{n_{iga}+n_{igc}} w_{ijg} / (n_{iga} + n_{igc})$. The BRE weighted between-study variance model for gene $g$ is given by

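$$y_{ig} \mid \theta_{ig} \sim N(\theta_{ig}, \sigma_{ig}^2),$$

$$\theta_{ig} \mid \beta_g, \tau_g, \bar{w}_{ig} \sim N(\beta_g, \tau_g^2\, \bar{w}_{ig}),$$

$$\beta_g \sim N(0, 1000), \text{ and}$$

$$\tau_g \sim \text{uniform}(0, 1). \tag{10}$$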
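As a small illustration of how the average weight enters the model, the sketch below computes $\bar{w}_{ig}$ from per-sample weights and the marginal variance of $y_{ig}$ that follows from integrating $\theta_{ig}$ out of the two normal layers, $\sigma_{ig}^2 + \tau_g^2 \bar{w}_{ig}$. The function names and the numbers are hypothetical.

```python
import numpy as np

def average_sample_weight(sample_weights):
    """Average of the per-sample weights w_ijg over all samples in one study."""
    w = np.asarray(sample_weights, dtype=float)
    return w.sum() / w.size

def weighted_total_variance(var_within, tau, w_bar):
    """Marginal variance of y_ig given beta_g: sigma_ig^2 + tau_g^2 * w_bar_ig."""
    var_within = np.asarray(var_within, dtype=float)
    w_bar = np.asarray(w_bar, dtype=float)
    return var_within + tau ** 2 * w_bar

# Hypothetical per-sample weights for two studies
w_bar = [
    average_sample_weight([0.52, 0.47, 0.55, 0.49]),
    average_sample_weight([0.35, 0.31, 0.40]),
]
print(w_bar)
print(weighted_total_variance([0.10, 0.08], tau=0.5, w_bar=w_bar))
```

Under this form, an average weight below one scales down the between-study variance term for that study relative to the unweighted model in Eq. (7).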

We ran two chains in the Bayesian model, each with 20,000 iterations, a burn-in period of 15,000 iterations, and a thinning interval of 3, and assessed the convergence of the model using the Gelman and Rubin diagnostic [26]. The posterior mean was standardized by the posterior standard deviation, as the posterior distribution was symmetric and normal. We then applied the Benjamini and Hochberg (BH) procedure to control the false discovery rate (FDR) for multiple gene testing. The performance of several BRE models for unweighted and weighted data, of Gibbs and Metropolis-Hastings sampling algorithms, of weighted common effect and weighted between-study variance models, and of classical RE models for unweighted and weighted data was evaluated in simulation studies [10, 13]. The classical RE and BRE meta-analysis models were implemented using programs from the MAMA, R2jags, and metaDE packages in the R programming environment [27, 28, 29].
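The post-processing steps described above can be sketched as follows: a basic Gelman and Rubin potential scale reduction factor for one parameter, and BH adjustment of per-gene p-values. This is a simplified illustration, not the R code used in the study. In particular, converting the standardized posterior mean to a two-sided normal p-value is an assumption made here for the example, and all function names are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains : array of shape (m chains, n retained draws per chain)
    """
    x = np.asarray(chains, dtype=float)
    m, n = x.shape
    between = n * x.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = x.var(axis=1, ddof=1).mean()      # within-chain variance W
    var_plus = (n - 1) / n * within + between / n
    return np.sqrt(var_plus / within)

def two_sided_p(z):
    """Two-sided standard normal p-values for standardized posterior means."""
    return np.array([erfc(abs(v) / sqrt(2.0)) for v in np.atleast_1d(z)])

def benjamini_hochberg(p):
    """BH-adjusted p-values, returned in the original gene order."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out

# Hypothetical standardized posterior means for five genes
z = np.array([3.4, -0.6, 2.1, -2.9, 0.2])
adj = benjamini_hochberg(two_sided_p(z))
print(adj, adj < 0.05)
```

Values of the diagnostic close to one suggest that the chains have mixed; genes whose adjusted p-value falls below the chosen FDR level would be declared differentially expressed.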

6. Results

We reviewed publicly available gene expression data from the NCBI GEO database. Twelve series of RNA expression profiling in the GEO database were selected for initial review. Eligibility criteria for data acquisition were as follows: (1) the datasets were publicly accessible, (2) the samples were from human brain regions, (3) the series included samples from healthy controls, (4) the datasets included phenotypic data and published manuscripts describing the data, (5) the datasets contained no redundant samples, and (6) the raw or normalized intensity data were defined as gene expression levels. For each study we reviewed the minimum information about a microarray experiment (MIAME) from the GEO website, including research methods and results described in the manuscripts, and data summaries of the phenotypic data. The present study included four publicly available Alzheimer’s disease (AD) gene expression datasets of post-mortem brain samples, with the Gene Expression Omnibus accession numbers GSE1297 [30], GSE5281 [31], GSE29378 [32], and GSE48350 [33], each containing gene expression and phenotypic data. Some of these accession numbers (GSE5281, GSE13214, and GSE48350) include samples from multiple brain regions; we restricted our attention to samples acquired from the hippocampus and to AD and control subjects in each dataset. The QC criteria for identifying poor-quality samples were a 3′/5′ glyceraldehyde-3-phosphate dehydrogenase (GAPDH) ratio greater than three and/or a percent of present calls less than 30% [23]. We then conducted within-study data preprocessing, quantile normalization, and data aggregation. Our meta-analysis was therefore performed on 12,037 target genes in 131 subjects (68 AD cases and 63 controls) from the four studies using the Affymetrix and Illumina platforms (Figure 1).

We then considered five metadata sets (A to E) and first examined the strength of study heterogeneity (R²) of each metadata set as described in [10]. Metadata A, B, D, and E had a relatively high R², while metadata C had a relatively low R². In other words, metadata C contains homogeneous data, while the remaining metadata sets may contain heterogeneous data (Figure 2). The distribution of unbiased standardized mean differences of gene expression in the GSE5281 dataset, which differs from that of the other datasets, is presented in Figure 3. The percent of present calls and the 3′:5′ GAPDH ratio of the heterogeneous dataset (GSE5281) are presented in Figure 4.

Figure 1.
Study profile for meta-analysis in Alzheimer’s gene expression datasets.

Figure 2.
Barplots of the strength of study heterogeneity, measured by the random-effects coefficient of determination (R²), in the meta-analysis of Alzheimer’s gene expression datasets. The R² values of 10,000 genes were categorized into five groups. Tentatively, R² values close to 0.25, 0.50, and 0.75 indicate low, moderate, and high heterogeneity, respectively. The y-axis presents the R² and the x-axis presents the number of genes in the meta-analysis.

Figure 3.
Distribution of unbiased standardized mean difference of gene expression (x-axis) between Alzheimer’s and control groups in GSE1297, GSE5281, GSE29378, and GSE48350 datasets.

Figure 4.
Percentage of present calls and 3′/5′ GAPDH ratio of GSE5281 samples.

In this meta-analysis of the four AD gene expression datasets, 1766 DE genes were identified by the classical RE model, while 446 DE genes were identified by the weighted BRE model. Almost all the DE genes identified by the weighted BRE model were among the significant DE genes identified by the classical RE model. Figure 5 presents the heatmap of the 1766 up-regulated and down-regulated DE genes detected in the AD samples. There was no apparent trend toward more up-regulated or more down-regulated genes in the AD samples as compared to the control samples. Meanwhile, there was a trend toward more down-regulated genes in the AD samples as compared to the control samples in the heatmap of the 446 identified DE genes (Figure 6). The 446 genes could potentially be down-regulated genes that may contribute to the good classification of Alzheimer’s samples (Table 1).

Figure 5.
Heatmaps of expression patterns of 1766 differentially expressed genes in hippocampus in Alzheimer’s and control samples. The differentially expressed genes were detected by the classical random-effect meta-analysis model on the metadata of four Alzheimer’s gene expression datasets: GSE1297, GSE5281, GSE29378, and GSE48350.

Figure 6.
Heatmaps of expression patterns of 446 differentially expressed genes in hippocampus in Alzheimer’s and control samples. The differentially expressed genes were detected by the weighted Bayesian random-effects meta-analysis model on the metadata of four Alzheimer’s gene expression datasets: GSE1297, GSE5281, GSE29378, and GSE48350.

AACS, AASDHPPT, ABCA1, ACLY, ACOT7, ADAM22, ADAM23, ADARB1, AFF2, AGK, AMPH, ANGPT1, ANP32C, AP2S1, AP3B2, AP3D1, AP3M2, APBA2, APMAP, ARFGEF1, ARHGDIG, ARHGEF9, ARPC5L, ASIC2, ASNS, ASPHD1, ATAT1, ATP1A1, ATP1A3, ATP2A2, ATP2B2, ATP5B, ATP5C1, ATP5D, ATP5G1, ATP5H, ATP5L, ATP6AP1, ATP6V0B, ATP6V0E1, ATP6V1B2, ATP6V1E1, ATP6V1G2, ATP8A2, ATPIF1, ATR, ATRN, ATRNL1, ATXN7L3B, BCL2, BEX1, BEX4, BPGM, BSN, C10orf88, C12orf10, C14orf2, C16orf45, C1orf216, C2CD5, C2orf47, C5orf22, CA10, CABYR, CACNA2D3, CADPS, CALY, CAMK1, CAMK2N1, CAMKV, CAPRIN2, CCK, CDC40, CDC42EP4, CDK5, CDKN2D, CGREF1, CHGB, CHN1, CISD1, CLIP3, CLTA, CNR1, COPS3, COPS7A, COPZ2, COQ6, COX4I1, COX6C, CP, CREBBP, CRYM, CS, CUL2, CYCS, CYP4F12, DAP3, DCTN1, DDX41, DEAF1, DGUOK, DHRS11, DHRS3, DIRAS3, DLEC1, DLG2, DLGAP2, DMXL2, DNASE2, DNM1, DNM1L, DNM3, DOCK3, DOPEY1, DROSHA, DYNC1H1, DYNC1I1, ECM2, EEF1A2, EGFR, EHD3, ELF1, ELOVL4, ELOVL6, ENC1, ENO2, ENTPD2, ENTPD3, EPB41L1, EPS15, ERC2, FAM111A, FAM127A, FAM162A, FAM174B, FAM188A, FAM216A, FAM60A, FAM98A, FAR2, FGF12, FH, FHL2, FIBP, FKBP3, FMO2, FOCAD, FOXJ1, FOXO1, FRMPD4, FSD1, FXN, FYCO1, FYN, GABBR2, GABRG2, GAD, GAD2, GCC2, GLS2, GNAI2, GNG3, GNG4, GOT1, GPHN, GPI, GPRASP1, GRIA2, GRIN1, GRM1, GSTA4, GUCY1B3, GUK1, GYPC, HAGH, HARS, HERC1, HMGCR, HMP19, HN1, HNRNPUL1, HPRT1, HSPA12A, IGF1R, IMMT, IMP3, IMP4, INA, INPP5F, ITPKB, ITSN1, KAT6A, KCNN3, KCNQ2, KIAA0513, KIAA1324, KIF21B, KIFAP3, LARGE, LCMT1, LDB2, LEMD3, LGALS8, LPAR4, LPCAT4, LPIN1, LPP, LRPPRC, LRRC8B, LY6H, MAK16, MAP1A, MAP2K1, MAP2K4, MAP3K9, MAPK9, MAST3, MCF2, MCTS1, MDH1, MDH2, MICU1, MKKS, MLLT11, MOAP1, MPP1, MPPED2, MRPL15, MRPL17, MRPL35, MRPS11, MRPS17, MRPS22, MTMR11, MTSS1L, MTX2, MXI1, MYL12B, MYT1L, NAP1L2, NAP1L3, NCALD, NDN, NDRG3, NDRG4, NDUFA10, NDUFA3, NDUFA4, NDUFA8, NDUFA9, NDUFS3, NDUFS5, NDUFV2, NECAP1, NEDD8, NEFL, NEFM, NELL1, NETO2, NFIB, NIPSNAP3B, NLK, NME1, NMNAT2, NOVA1, NREP, NRGN, NRIP3, NRN1, NSF, NSG1, NUPL2, OGDHL, OPA1, ORC5, P4HTM, PAGE1, PAX6, PDCD1LG2, PEX11B, PIN1, PLCD1, PLCE1, PLCL2, PLD3, PLEC, PLEKHA4, PLK2, PLSCR4, PLXNB2, PMFBP1, PNMAL1, PNO1, PODXL2, POLB, POLRMT, POP7, PPFIA4, PPIA, PPIP5K1, PPM1H, PPME1, PPP1R13L, PPP2CA, PPP3CB, PREP, PREPL, PRKCZ, PRMT1, PRPF40A, PSD4, PSMD8, PTDSS1, PTGES2, PTPRE, PTPRR, PTRH2, PTS, PVRL3, RAB11A, RAB26, RAB27A, RAB2A, RAB6A, RAD51C, RAP1GDS1, RARS, RBFOX2, RGS17, RGS7, RHOQ, RIMBP2, RIT2, RND2, RNF123, RNF41, RNFT2, RNMT, RNPS1, RPH3A, RPP40, RPS6KC1, RUNDC3B, RWDD2A, RXRA, SCAMP2, SCG5, SCN2A, SCN3B, SDHA, SEC22A, SEC61A2, SEH1L, SEPT6, SERPINI2, SEZ6L2, SLC12A5, SLC25A11, SLC25A12, SLC25A14, SLC25A4, SLC4A1AP, SLIRP, SLITRK3, SMARCA4, SMO, SMOX, SMYD2, SNAP25, SNAP91, SNCB, SOX2, SPAG7, SPIN2A, SPINT2, SRM, SRPR, SS18L1, SSPN, STAU2, STMN2, STX6, STXBP1, SULT4A1, SUSD4, SV2B, SYDE1, SYN1, SYN2, SYNGR1, SYNJ1, SYT1, TAGLN3, TAZ, TBC1D31, TBC1D9, TBCC, TBCE, TBL1X, TBPL1, TCEA2, TCF7L2, TERF2IP, TGFBR3, THOC5, TMEM151B, TMEM160, TMEM246, TMEM59L, TMEM70, TMEM97, TNPO1, TOMM34, TOMM70A, TOR1A, TPD52, TPI1, TRAP1, TRAPPC2, TRIM37, TRIM9, TRIOBP, TSPAN13, TSPAN7, TSSC1, TUBA1B, TUBA4A, TUBB3, TUBG1, TUBG2, TXNDC9, UBE2M, UBE2S, UCHL1, UCHL3, UQCC1, UTP11L, VSNL1, WDR47, WDR7, WFDC1, XK, YWHAH, ZFP36L1, ZNF365, ZNHIT3

Table 1.

List of 446 significantly differentially expressed genes in Alzheimer’s gene expression datasets.

Note: The differentially expressed genes were detected by the weighted Bayesian random-effect meta-analysis models on the metadata of four Alzheimer’s gene expression datasets: GSE1297, GSE5281, GSE29378, and GSE48350.

We then performed gene network analysis using a publicly available web interface, GeneMANIA [34]. The 446 DE genes identified by the weighted BRE model participate in 146 significant pathways at a false discovery rate of 1%. The most highly significant pathways with more than twenty identified DE genes in each network included: cellular respiration; oxidative phosphorylation; mitochondrial protein complex; inner mitochondrial membrane protein complex; ATP metabolic process; respiratory electron transport chain; ATP synthesis coupled electron transport; electron transport chain; mitochondrial ATP synthesis coupled electron transport; mitochondrial inner membrane; energy derivation by oxidation of organic compounds; respiratory chain complex; respirasome; NADH dehydrogenase activity; NADH dehydrogenase (quinone) activity; NAD(P)H dehydrogenase (quinone) activity; mitochondrial respirasome; oxidoreductase activity, acting on NAD(P)H, quinone or similar compound as acceptor; respiratory chain complex I; NADH dehydrogenase complex; proton transmembrane transporter activity; aerobic respiration; presynapse; postsynapse; NADH dehydrogenase complex assembly; oxidoreductase complex; proton-transporting two-sector ATPase complex; mitochondrial proton-transporting ATP synthase complex; ATPase-coupled cation transmembrane transporter activity; synaptic vesicle recycling; inner mitochondrial membrane organization; and cellular response to peptide. Overall, GeneMANIA retrieved the genes with known coexpression (51.98%), consolidated pathways (25.08%), physical interactions (27.73%), colocalization (10.79%), genetic interactions (5.79%), predicted interactions (2.65%), pathway (0.86%), and shared protein domains (0.20%). More details can be found at www.genemania.org.

7. Discussion

In this study, we developed a meta-analytic approach for incorporating sample-quality information into the BRE meta-analysis model, using an efficient weight identified by a series of simulation studies [10, 13] to adjust the study heterogeneity in the model. We illustrated the weighted Bayesian approach, as compared to the classical RE model, through an application in Alzheimer’s gene expression studies. We have seen that the results of Alzheimer’s gene expression analyses varied with sample quality [13]. Variation in sample quality limited the ability of meta-analysis techniques to properly detect DE genes [35, 36]. Meanwhile, the BRE meta-analysis model allows flexibility in calculating $y_{ig}$ and its variance $\sigma_{ig}^2$, as well as study-specific adjustments [37]. We can therefore up-weight good-quality samples and down-weight borderline-quality samples in the model. The approach developed here uses sample-quality information in the meta-analysis of high-dimensional microarray studies to detect DE genes.

Additionally, the classical RE model tended to estimate $\tau_g^2$ as zero, so the variance of $\hat{\beta}_g$ was underestimated, while the BRE meta-analysis model can allow for the uncertainty of the parameter estimates in the model. The BRE model used the marginal posterior distribution of $\tau_g^2$ for estimating $\beta_g$, which does not rely on a point estimate of $\tau_g^2$. The BRE model can therefore, in turn, improve the fit of the model [38].

8. Conclusions

This meta-analytic approach with the sample-quality weight can increase the precision and accuracy of Bayesian random-effects models in gene expression meta-analysis. The performance of the weighted Bayesian random-effects model may vary depending on data features, levels of sample quality, and adjustment of parameter estimates.

References

1. Rung J, Brazma A. Reuse of public genome-wide gene expression data. Nature Reviews Genetics. 2013;14(2):89-99
2. Ramasamy A, Mondry A, Holmes CC, Altman DG. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Medicine. 2008;5(9):e184
3. Tseng GC, Ghosh D, Feingold E. Comprehensive literature review and statistical considerations for microarray meta-analysis. Nucleic Acids Research. 2012;40(9):3785-3799
4. Chang LC, Lin HM, Sibille E, Tseng GC. Meta-analysis methods for combining multiple expression profiles: Comparisons, statistical characterization and an application guideline. BMC Bioinformatics. 2013;14(1):1-15
5. Borenstein M, Hedges LV, Higgins JP, Rothstein HR. Power analysis of meta-analysis. In: Introduction to Meta-analysis. The Atrium, Southern Gate, Chichester, West Sussex, United Kingdom: John Wiley & Sons Ltd; 2021. pp. 266-276
6. Higgins J, Thompson SG, Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. Journal of the Royal Statistical Society: Series A (Statistics in Society). 2009;172(1):137-159
7. Ntzoufras I. Bayesian hierarchical models. In: Bayesian Modeling Using WinBUGS. Hoboken, New Jersey: John Wiley & Sons; 2011. pp. 305-340
8. Draghici S. Quality control. In: Statistics and Data Analysis for Microarrays Using R and Bioconductor. 2nd ed. Boca Raton, Florida: Chapman & Hall/CRC Mathematical and Computational Biology; 2016. pp. 633-689
9. Siangphoe U, Archer KJ. Gene expression in HIV-associated neurocognitive disorders: A meta-analysis. Journal of Acquired Immune Deficiency Syndromes. 2015;70(5):479-488
10. Siangphoe U, Archer KJ. Estimation of random effects and identifying heterogeneous genes in meta-analysis of gene expression studies. Briefings in Bioinformatics. 2017;18(4):602-618
11. Cheng WC, Tsai ML, Chang CW, Huang CL, Chen CR, Shu WY, et al. Microarray meta-analysis database (M(2)DB): A uniformly pre-processed, quality controlled, and manually curated human clinical microarray database. BMC Bioinformatics. 2010;11(1):1-9
12. Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, et al. Multiple-laboratory comparison of microarray platforms. Nature Methods. 2005;2(5):345-350
13. Siangphoe U, Archer KJ, Mukhopadhyay ND. Classical and Bayesian random-effects meta-analysis models with sample quality weights in gene expression studies. BMC Bioinformatics. 2019;20(1):1-5
14. Choi JK, Yu U, Kim S, et al. Combining multiple microarray studies and modeling interstudy variation. Bioinformatics. 2003;19(suppl 1):i84-i90
15. Hedges L, Olkin I. Random effects models for effect sizes. In: Statistical Methods for Meta-Analysis. Orlando, FL: Academic Press; 1985. pp. 189-203
16. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials. 1986;7(3):177-188
17. Biggerstaff B, Tweedie R. Incorporating variability in estimates of heterogeneity in the random effects model in meta-analysis. Statistics in Medicine. 1997;16(7):753-768
18. Shuster JJ. Empirical vs natural weighting in random effects meta-analysis. Statistics in Medicine. 2010;29(12):1259-1265
19. Whitehead A, Whitehead J. A general parametric approach to the meta-analysis of randomized clinical trials. Statistics in Medicine. 1991;10(11):1665-1677
20. Demidenko E, Sargent J, Onega T. Random effects coefficient of determination for mixed and meta-analysis models. Communications in Statistics-theory and Methods. 2012;41(6):953-969
21. Sutton AJ, Abrams KR. Bayesian methods in meta-analysis and evidence synthesis. Statistical Methods in Medical Research. 2001;10(4):277-303
22. Lambert PC, Sutton AJ, Burton PR, Abrams KR, Jones DR. How vague is vague? A simulation study of the impact of the use of vague prior distributions in MCMC using WinBUGS. Statistics in Medicine. 2005;24(15):2401-2428
23. Dumur CI, Nasim S, Best AM, Archer KJ, Ladd AC, Mas VR, et al. Evaluation of quality-control criteria for microarray gene expression analysis. Clinical Chemistry. 2004;50(11):1994-2002
24. Chen D, Peace KE. Fixed-effects and random-effects in meta-analysis. In: Applied Meta-Analysis Using R. Boca Raton, Florida: CRC Press; 2013. pp. 27-52
25. Kacker RN. Combining information from interlaboratory evaluations using a random effects model. Metrologia. 2004;41(3):132
26. Gelman A, Carlin JB, Stern HS, Rubin DB. Model checking and improvement. In: Bayesian Data Analysis. Boca Raton, Florida: Chapman & Hall. CRC Texts in Statistical Science; 2004. pp. 161-197
27. Ihnatova I. MAMA: Meta-Analysis of MicroArray. R Package Version 2.2.1. 2013
28. Su YS, Yajima M. R2jags: Using R to Run 'JAGS'. R package version 0.5-7. 2015. Available from: https://CRAN.R-project.org/package=R2jags
29. Wang X, Kang DD, Shen K, et al. An R package suite for microarray meta-analysis in quality control, differentially expressed gene analysis and pathway enrichment detection. Bioinformatics. 2012;28(19):2534-2536
30. Blalock EM, Geddes JW, Chen KC, et al. Incipient Alzheimer’s disease: Microarray correlation analyses reveal major transcriptional and tumor suppressor responses. Proceedings of the National Academy of Sciences. 2004;101(7):2173-2178
31. Liang WS, Dunckley T, Beach TG, et al. Gene expression profiles in anatomically and functionally distinct regions of the normal aged human brain. Physiological Genomics. 2007;28(3):311-322
32. Miller JA, Woltjer RL, Goodenbour JM, et al. Genes and pathways underlying regional and cell type changes in Alzheimer’s disease. Genome Medicine. 2013;5(5):48
33. Blair LJ, Nordhues BA, Hill SE, et al. Accelerated neurodegeneration through chaperone-mediated oligomerization of tau. The Journal of Clinical Investigation. 2013;123(10):4158-4169
34. Warde-Farley D, Donaldson SL, Comes O, et al. The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Research. 2010;38:W214-W220
35. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biology. 2004;5(10):R80
36. Eijssen LM, Jaillard M, Adriaens ME, Gaj S, de Groot PJ, Muller M, et al. User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org. Nucleic Acids Research. 2013;41(W1):W71-W76
37. Demidenko E. Meta-analysis model. In: Mixed Models: Theory and Applications with R. Hoboken, New Jersey: John Wiley & Sons; 2013. pp. 247-291
38. Bodnar O, Link A, Arendacká B, Possolo A, Elster C. Bayesian estimation in random effects meta-analysis using a non-informative prior. Statistics in Medicine. 2017;36(2):378-399


[27] 27. Ihnatova I. MAMA: Meta-Analysis of MicroArray. R Package Version 2.2.1. 2013

[28] 28. Su YS, Yajima M. R2jags: Using R to Run'JAGS'. R package version 0.5-7. Available from: CRAN. R-project. org/package= R2jags. 2015

[29] 29. Wang X, Kang DD, Shen K, et al. An R package suite for microarray meta-analysis in quality control, differentially expressed gene analysis and pathway enrichment detection. Bioinformatics. 2012;28(19):2534-2536

[30] 30. Blalock EM, Geddes JW, Chen KC, et al. Incipient Alzheimer’s disease: Microarray correlation analyses reveal major transcriptional and tumor suppressor responses. Proceedings of the National Academy of Sciences. 2004;101(7):2173-2178

[31] 31. Liang WS, Dunckley T, Beach TG, et al. Gene expression profiles in anatomically and functionally distinct regions of the normal aged human brain. Physiological Genomics. 2007;28(3):311-322

[32] 32. Miller JA, Woltjer RL, Goodenbour JM, et al. Genes and pathways underlying regional and cell type changes in Alzheimer’s disease. Genome Medicine. 2013;5(5):48

[33] 33. Blair LJ, Nordhues BA, Hill SE, et al. Accelerated neurodegeneration through chaperone-mediated oligomerization of tau. The Journal of Clinical Investigation. 2013;123(10):4158-4169

[34] 34. Warde-Farley D, Donaldson SL, Comes O, et al. The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Research. 2010;38:W214-W220

[35] 35. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biology. 2004;5(10):R80

[36] 36. Eijssen LM, Jaillard M, Adriaens ME, Gaj S, de Groot PJ, Muller M, et al. User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org. Nucleic Acids Research. 2013;41(W1):W71-W76

[37] 37. Demidenko E. Meta-analysis model. In: Mixed Models: Theory and Applications with R. Hoboken, New Jersey: John Wiley & Sons; 2013. pp. 247-291

[38] 38. Bodnar O, Link A, Arendacká B, Possolo A, Elster C. Bayesian estimation in random effects meta-analysis using a non-informative prior. Statistics in Medicine. 2017;36(2):378-399
