Post-Traumatic Stress Disorder Outcome Research: Why Moderators Should not be Neglected

Several psychotherapeutic treatments have been developed over the years for treating the symptoms of post-traumatic stress disorder (PTSD). But it still remains unclear which components of the complex treatment packages are necessary and most beneficial for PTSD symptom improvement. In PTSD outcome research, the randomized controlled trial (RCT) design has been applied in order to address this issue. However, meta-analyses have repeatedly reported considerable variation between results from individual RCTs (i.e. between-study heterogeneity). Attempts to explain such heterogeneity led to the identification of relevant moderators of treatment effects in PTSD RCTs. This study presents meta-analytic findings showing that factors that are not part of the treatment (such as the investigators’ preferences for a particular treatment or the complexity of the patients’ clinical problems) affect the outcome in PTSD RCTs. We show that considering extra-therapeutic moderators in meta-analyses of PTSD RCTs may alter the conclusions and recommendations that may be deduced. The summarized findings confirm the notion that no PTSD treatment consistently outperforms the others and strengthen the position that even non-trauma-focused treatments may be beneficial PTSD treatments.


Introduction
Over the last decades, different etiological models of post-traumatic stress disorder (PTSD) have led to the design of a number of psychotherapeutic treatments that all aim at reducing PTSD symptoms (e.g. exposure therapy [1], cognitive processing therapy [2], for an overview see [3]). Until recently, clinical guidelines and systematic reviews concluded that patients with PTSD require psychotherapeutic treatments that specifically target the trauma experience [4]. However, recent meta-analyses showed that focusing on the trauma experience may not be generally necessary for successful PTSD treatment [5][6][7].
Randomized controlled trials (RCTs) have typically been conducted in order to identify those components of complex treatment packages that critically impact symptom improvement (i.e. in placebo-controlled studies and comparative or dismantling studies, see [8]). Recently, however, the general validity of RCTs has been criticized in medical as well as in psychotherapy research, by showing that extra-therapeutic factors (such as blinding of outcome assessors or the sample size) may considerably affect the outcome in RCTs [9][10][11][12]. Accordingly, meta-analyses that attempted to explain variation between effect estimates from individual studies (the so-called between-study heterogeneity) identified a number of moderators of treatment effects in PTSD RCTs [7,13,14,15,16].
This paper summarizes meta-analytic findings, which show that in PTSD outcome research extra-therapeutic factors affect the outcome. These findings relate to two questions in the current debate in PTSD outcome research: first, 'Is there evidence that some PTSD treatments consistently outperform others?' and second, 'Is a trauma focus generally necessary for successful PTSD treatment?'. We will briefly describe the research designs that have been used in order to address the above-mentioned questions. Then, we will describe common flaws in meta-analyses and we will use examples from PTSD research in order to show how flaws in meta-analyses may lead to invalid conclusions. Finally, we will summarize how the consideration of relevant moderators may alter the conclusions that may be drawn from meta-analyses of RCTs with respect to the two highlighted research questions.

What is characteristic in PTSD treatments?
Psychotherapy outcome research aims at identifying treatment components that are critical for symptom improvement. The RCT design has been adopted from medical research and became standard in psychotherapy outcome research. This design relies on the assumption that the overall treatment effect is composed of first, the true effect of the treatment under investigation and second, effects that are due to the context of being in treatment.
Whereas the first type of components has typically been described as 'specific' or 'active' components in psychotherapy outcome research, for the second type of components a number of synonymously used terms occur in the literature, although they may have slightly different connotations [17,18]. Such terms include 'common,' 'general' or 'non-specific' factors or 'psychological placebos.' For the present review, we will follow the terminology proposed by Grünbaum ([19], p. 159). Grünbaum's definition captures the outlined dichotomy with reference to the presence or absence of a psychological (e.g. etiological) theory, which defines the content of a complex treatment, such as psychotherapy. Accordingly, treatment components will be considered characteristic if there is a theoretical model that describes how the respective component will contribute to symptom improvement. In contrast, treatment components will be considered incidental if there is no such theory-based link to symptom improvement. Thus, the components that are considered active or specific would be considered characteristic, because typically they are deduced from psychological theories, which describe how they will improve the symptoms of a particular disorder. In contrast to the concept of specificity, which has been related to uniqueness of treatment components [18], components may be considered characteristic even though they may not be a unique component of a particular treatment package. All other factors that may contribute to a treatment effect but which have not (yet) been specified within a psychological theory would be considered incidental, no matter whether they are common to all treatments, shared by some or not at all related to the treatment itself, but, for instance, rather to the patient, therapist or to the conduct of the study.
When looking at PTSD research, a number of etiological models of PTSD led to the definition of a number of characteristic components and accordingly to the development of several rival treatment packages (see [3,5,20,21]). However, the classification of PTSD treatments according to the underlying etiological model has been a challenge in previous meta-analyses [14]. An inconsistent terminology and the use of treatment labels that are not clearly defined and thus not exclusive (see [22]) lead to considerable variation in the classification of PTSD treatments across individual meta-analyses. In order to reduce some complexity and despite the differences in the foci of the underlying etiological models (e.g. focusing on cognitions vs. focusing on behavioral aspects), several treatment packages have been summarized under the umbrella term of trauma-focused treatments [23]. But again, the definition of the term remained largely unclear and led to inconsistencies with respect to which treatments were to be considered trauma-focused [22]. In contrast, treatments that clearly do not address the trauma experience or even proscribe talking about the trauma have consistently been used as psychological placebo control conditions. According to the Grünbaum definition, for the present review, we will consider treatments that provided some theory-driven link between treatment components and symptom improvement as characteristic (this includes treatments that have previously been summarized as trauma-focused), whereas the clearly non-trauma-focused interventions would be regarded as relying on incidental treatment components. However, we will not be able to resolve the inconsistencies, which appear between different meta-analyses and which are due to the imprecise terminology.

Estimating the relevance of characteristic treatment components in meta-analyses of RCTs
The RCT and meta-analyses of RCTs are considered the highest level of evidence for the efficacy of treatments [24], and different types of RCTs are employed in psychotherapy outcome research [8].
First, psychotherapeutic treatments are compared with an untreated control group, such as no-treatment or waiting list (WL) designs. The inclusion of an untreated control group in an RCT minimizes most threats to the internal validity (e.g. controls for spontaneous remission and regression to the mean). Therefore, such a design may be used for showing that a psychotherapeutic treatment is efficacious. With this study design, a number of meta-analyses demonstrated large effect sizes (ESs) for eye-movement desensitization and reprocessing (EMDR), cognitive treatments, exposure-based treatments and the combination of the latter to cognitive-behavioral treatments (CBT) (e.g. [13,14,23]), even though treatment effects may be overestimated in studies with small sample size [14]. However, with respect to the research questions highlighted in the present review, such a design does not provide an answer. First, a larger effect size in a hypothetical study comparing treatment A vs. WL, as compared to a second study comparing treatment B vs. WL, may not be interpreted as superiority of A over B, if A has not been shown to be superior to B in a comparative RCT. A number of study characteristics that may differ between the two hypothetical studies (A vs. WL and B vs. WL), such as different patient samples, different therapists, or different study methodology and design, might explain a larger effect in one comparison than in the other. Second, such a design also does not reveal which characteristic treatment components are critical for symptom improvement, because the amount of the total treatment effect that is due to the characteristic vs. incidental components cannot be disentangled. Thus, such a design cannot answer the question whether a trauma focus is necessary for successful PTSD treatment.
In order to control for the incidental effects and to evaluate the impact of characteristic treatment components, in a second type of RCTs, psychotherapeutic treatments are compared with psychological placebos. Superiority of the psychotherapeutic treatment over the psychological placebo could be specifically attributed to the characteristic treatment components, which were lacking in the placebo control. Thus, by manipulating the presence of a particular component, the incremental value of this component can be estimated. For example, the impact of prolonged exposure on PTSD symptoms was compared to present-centered therapy, which was designed as placebo control. It explicitly excluded exposure to the trauma and thus did not focus on the trauma experience [25]. In this particular study, superiority of prolonged exposure over present-centered therapy was small to moderate, and meta-analyses revealed mixed findings with a small, moderate or large superiority of specific, trauma-focused PTSD treatments over placebo control treatments [7,13,14]. While the placebo-controlled RCT in the best case allows for estimating the amount of the treatment effect that may be attributed to the characteristic vs. incidental components, there is still no information on which out of several rival treatment packages contains the most relevant components, that is which treatment should be considered the treatment of choice for a particular problem or disorder. Therefore, a third type of control treatments in RCTs may encompass treatments with established efficacy (i.e. comparing treatment A vs. treatment B). Such comparative designs are typically used for demonstrating superiority of a novel treatment compared with an established one. 
If a novel treatment consists of an amendment to an established treatment, the dismantling or add-on study design may be applied in order to demonstrate the incremental benefit of adding or removing a particular treatment component to or from a complex treatment package. Superiority of the novel treatment over an established one would then be attributed to the superior efficacy of the unique component(s) of the novel treatment. If, however, such a study demonstrated equivalence in treatment effects of the two compared treatments, symptom improvements in both treatments are most likely mediated by common or shared mechanisms [26,27]. In PTSD outcome research, for example, prolonged exposure plus cognitive restructuring was compared with exposure alone in one RCT, in order to estimate the incremental effect of adding cognitive restructuring to the established exposure treatment [28]. This particular RCT failed to demonstrate superiority of adding cognitive restructuring to an exposure treatment. Likewise, meta-analyses that summarized comparative RCTs of individual PTSD treatments found no statistically significant differences between the effects of two types of PTSD treatments (e.g. [13,14,23,29,30,31,32,33,34]). The equivalent effects of the diverse PTSD treatments have been explained by the presence of a shared mechanism in all of the successful treatments, namely that all treatments focus on the trauma experience [23].
Thus, regarding the first research question, whether a particular treatment package outperforms the others, meta-analyses of comparative RCTs mostly indicated rather similar effects of different treatment packages and thus no superiority of particular characteristic components over other characteristic components. Regarding the second research question, the results of placebo-controlled PTSD RCTs at first sight may be considered as confirming the assumption that successful PTSD treatment requires the characteristic component of focusing on the trauma experience. However, upon closer examination, a substantial amount of unexplained between-study heterogeneity indicates the presence of moderators in several of the above-mentioned meta-analyses [7,13,14,23,30,32,34], which complicates or even precludes drawing valid conclusions [35].

Bias and diversity in meta-analytically pooled estimates of treatment effects
Meta-analyses have the potential to provide a precise and valid estimate of a treatment effect on the outcome of interest, as they statistically combine the available evidence relevant for a particular research question. Accordingly, meta-analyses have been established as the top level of the hierarchy of evidence [36]. In a meta-analysis, a pooled estimate of the treatment effect is calculated using the treatment effects obtained from each of the included studies. A meta-analysis thus heavily relies on including all relevant studies, or at least a random sample of the relevant studies. This becomes more important if the total number of relevant studies is small, because each individual study may then have a larger impact on the pooled effect size estimate. Therefore, a meta-analysis should be preceded by a systematic review, which intends to identify all studies that addressed a particular research question. A systematic review should be conducted using a documented and systematic approach [37].
Conducting a meta-analysis typically increases the precision of the estimated treatment effect, because the number of patients that contribute to the pooled estimate of the treatment effect is larger in the meta-analysis than in each individual study. But estimates from individual studies may vary considerably. The presence of between-study heterogeneity in effect sizes from individual psychotherapy outcome studies suggests that the total pool of studies may be divided into subgroups of studies that show either larger or smaller treatment effects. The presence of between-study heterogeneity may hint at potential sources of bias or at genuine diversity [38]. Heterogeneity, which is commonly present in meta-analyses of psychotherapy outcome studies, does not necessarily prohibit the conduct of meta-analysis, but rather demands exploration of potential sources of variation [35].
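The pooling step and the heterogeneity statistics described above can be sketched in a few lines. This is a minimal illustration only, assuming inverse-variance weighting and the DerSimonian-Laird estimator of τ² (one common choice; the meta-analyses cited in this review may have used other estimators). The function name `pool_effects` and the example values are hypothetical and do not come from any cited study.

```python
import math


def pool_effects(effects, variances):
    """Pool study effect sizes and quantify between-study heterogeneity.

    effects   -- one effect size per study (e.g. standardized mean differences)
    variances -- the sampling variance of each effect size
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects pooled estimate: weights also include tau^2.
    w_re = [1.0 / (v + tau2) for v in variances]
    random_est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_random = math.sqrt(1.0 / sum(w_re))

    return {"fixed": fixed, "random": random_est, "se_random": se_random,
            "Q": q, "df": df, "tau2": tau2, "I2": i2}


if __name__ == "__main__":
    # Hypothetical effect sizes and variances, for illustration only.
    print(pool_effects([0.2, 0.5, 0.9, 1.2], [0.04, 0.05, 0.06, 0.08]))
```

A τ² or I² close to zero suggests that the studies estimate a similar effect; larger values indicate that the pooled estimate averages over studies that differ systematically.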
Thus, reducing unsystematic error in the data will result in more precise estimates of treatment effects, while avoiding systematic error (that is, bias and genuine diversity) will reduce heterogeneity and increase validity. Bias is therefore different from unsystematic random error and can be regarded as the opposite of validity [39]. Bias has been defined as 'any process at any stage of inference, which tends to produce results or conclusions that differ systematically from the truth' ([40], p. 60). This means that bias may lead to an overestimation or to an underestimation of the true effect. It is important to note that a particular type of bias may lead to opposite deviations from the true effect in different studies [41]. Theoretically, genuine diversity may be differentiated from bias. Nevertheless, the presence of genuine diversity in the studies, which contribute to a pooled effect-estimate in a meta-analysis, may as well reduce the validity of the pooled estimate, because genuine diversity may distort an overall pooled estimate just the same way as bias does.
The issue of bias and diversity in meta-analyses has previously been related to three typically occurring problems in meta-analyses: first, a meta-analysis may not reduce or eliminate bias that has been present in the included studies. For example, if effect estimates from a large number of methodologically flawed studies are combined with only few methodologically sound studies in a meta-analysis, the pooled effect estimate will be biased as well (the so-called garbage-in, garbage-out problem [42,43]). Second, with respect to the potentially present genuine diversity, meta-analysis may even introduce bias in estimating a treatment effect: for example, if the included studies differed regarding study characteristics that may affect the treatment effect (i.e. studies are genuinely diverse), the pooled effect estimate may be invalid (the so-called apples-and-oranges problem [42,43]). If, for example, studies with patients that fulfill diagnostic criteria of two or more mental disorders had systematically larger treatment effects than studies that included patients who fulfill diagnostic criteria of one mental disorder only, combining treatment effects from both subsets of studies would result in an invalid pooled effect estimate. Thus, genuine diversity in the studied samples, treatments, treatment providers or study methodology may reduce the validity of the pooled estimate of the treatment effect. Finally, the validity of meta-analyses may be reduced if the sample of studies that is considered for meta-analytic pooling of treatment effects is not representative of all relevant studies (the so-called file-drawer problem [43,44]). This problem is related to the difficulty of publishing studies with negative or non-significant results, especially if the study samples are small. In published articles of small-sized studies, treatment effects thus tend to be large and significant. 
If only published articles are considered for meta-analysis, the obtained effect estimates may only poorly reflect the true treatment effect. The file-drawer problem has consequently also been described as publication bias. The problems that may occur from including mainly small and underpowered studies in a meta-analysis have been summarized nicely by Cuijpers ([9], p. 2): 'If a therapy is found to be superior to an existing therapy in an underpowered trial that would rather raise doubts about the validity of the trial than trust that this new therapy is indeed more effective.' Thus, all three briefly introduced problems in meta-analyses threaten the validity of the pooled effect estimate. They differ, however, with respect to the interpretation of the estimated treatment effects (Figure 1): first, the garbage-in, garbage-out problem reflects the bias that has already been present in a subgroup of poor-quality studies. The pooled effect estimate across all studies as well as the pooled effect estimate of the poor-quality subgroup of studies will be biased. In this case, only the effect estimate of the high-quality subgroup of studies may be regarded as valid. Second, the apples-and-oranges problem reflects meaningful variation between effect estimates due to dissimilarity between studies on relevant study characteristics (i.e. genuine diversity). That is, the pooled effect estimate across all included studies may be invalid whereas the pooled effect estimates in each subgroup of studies may be valid. Third, if the file-drawer problem was present, the pooled effect estimate of published studies is expected to differ from the pooled effect estimate of unpublished studies, as study results may more likely be published if they have statistically significant results. Many of the unpublished studies may, therefore, have non-significant results. 
Thus, a meta-analysis restricted to the published studies would probably yield a larger pooled effect estimate than a meta-analysis restricted to the unpublished studies [45]. Only the inclusion of both published and unpublished studies would warrant the validity of the effect estimate. The difference between published and unpublished studies should be particularly pronounced if the study samples are small [46], which further complicates the issue. If a meta-analysis considers only published studies and non-significant results are most likely to be lacking if a study was small in scale, publication bias should be most potent in the small-scale studies and less pronounced or even not present in the large-scale studies.
In this case (i.e. when including only published studies), the pooled effect estimate including all studies as well as the pooled effect estimate restricted to small-scale studies might be biased, whereas the effect estimate in the large-scale studies will be most valid with respect to publication bias. It is important to note, however, that the presence of any of the three problems indicates only an increased probability of bias in a meta-analysis rather than being necessarily associated with bias in the meta-analytically pooled effect estimates.
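The expectation that small published studies report larger effects than large ones can be examined with a regression-based asymmetry check. The sketch below uses Egger's regression, which is a related standard technique and not the approach described in this review (which contrasts pooled estimates from small-scale and large-scale studies). In Egger's regression, each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error); an intercept far from zero hints at small-study effects. The function name and all inputs are hypothetical, and no significance test is included.

```python
def egger_intercept(effects, standard_errors):
    """Return (intercept, slope) of Egger's regression.

    The standardized effect y_i / se_i is regressed on the precision 1 / se_i
    using ordinary least squares. The slope estimates the underlying effect;
    the intercept captures funnel-plot asymmetry (small-study effects).
    """
    if len(effects) != len(standard_errors) or len(effects) < 3:
        raise ValueError("need at least three studies with matching inputs")

    x = [1.0 / se for se in standard_errors]                  # precision
    y = [e / se for e, se in zip(effects, standard_errors)]   # standardized effect

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

As noted in the text, an asymmetric pattern only indicates an increased probability of publication bias; genuine differences between small and large studies can produce the same pattern.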

Addressing between-study heterogeneity in meta-analyses of PTSD RCTs
Meta-analyses on the effectiveness of psychotherapeutic treatments for PTSD revealed between-study heterogeneity in all kinds of RCTs: those that compared psychotherapy with wait list, those that used a psychological placebo control and those that compared two types of rival PTSD psychotherapies. But not all meta-analyses reporting heterogeneous results made attempts to explore potential sources of the observed heterogeneity (e.g. several of the comparisons in Bisson and Andrew [23]).
A typical approach in meta-analyses to deal with the presence of between-study heterogeneity is to identify characteristics of the included studies that systematically differentiate studies with larger or smaller effect sizes (i.e. so-called moderators or effect modifiers [43,47]). Every characteristic of a study that is associated with the treatment effect may also act as a moderator in a meta-analysis. Two statistical approaches are used in order to identify relevant moderators: stratification of analyses by potential moderators and meta-regression analyses [35]. In stratified meta-analysis, the effect estimates of the subgroups of studies with and without a particular characteristic are contrasted. If effect estimates differ significantly in the contrasted subgroups or heterogeneity is reduced in at least one of the contrasted subgroups, the respective study characteristic may be interpreted as a relevant moderator. Meta-regression analysis provides a statistical test for the exploration of sources of heterogeneity in meta-analyses [48].
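The stratified approach can be sketched as follows for a single binary study characteristic. This is a simplified illustration under stated assumptions: effects are pooled with fixed-effect inverse-variance weights within each subgroup, and the subgroups are contrasted with a between-subgroup Q statistic, which follows a chi-square distribution with one degree of freedom when there are two subgroups. The names `fixed_pool` and `subgroup_contrast` are hypothetical; meta-regression is not covered here.

```python
import math


def fixed_pool(effects, variances):
    """Return (pooled estimate, total weight) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, total


def subgroup_contrast(effects, variances, has_characteristic):
    """Contrast pooled effects of studies with vs. without a characteristic."""
    groups = {True: ([], []), False: ([], [])}
    for y, v, flag in zip(effects, variances, has_characteristic):
        groups[bool(flag)][0].append(y)
        groups[bool(flag)][1].append(v)
    if not groups[True][0] or not groups[False][0]:
        raise ValueError("both subgroups must contain at least one study")

    est_with, w_with = fixed_pool(*groups[True])
    est_without, w_without = fixed_pool(*groups[False])
    overall, _ = fixed_pool(effects, variances)

    # Between-subgroup Q: weighted squared deviations of the subgroup
    # estimates from the overall pooled estimate.
    q_between = (w_with * (est_with - overall) ** 2
                 + w_without * (est_without - overall) ** 2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(q_between / 2.0))

    return {"with": est_with, "without": est_without,
            "Q_between": q_between, "p": p_value}
```

A small p-value suggests that the characteristic differentiates studies with larger and smaller effects; because studies are not randomized to their characteristics, such a contrast remains observational.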
The following paragraphs will give examples of different kinds of study characteristics that have been shown to moderate the pooled effect estimates in meta-analyses on PTSD treatments. We will focus on meta-analytic findings that summarized data from placebo-controlled and comparative RCTs as those two designs are informative with respect to the identification of characteristic treatment components in PTSD treatments and thus to the two highlighted research questions.

Moderators in placebo-controlled PTSD RCTs
One meta-analysis [7] summarized RCTs that compared treatments with some form of trauma focus with treatments lacking a trauma focus. The initial overall analysis showed a moderate superiority of the trauma-focused over the non-trauma-focused treatments with moderate between-study heterogeneity. On closer examination, the extent of structural equivalence (i.e. that therapists in both treatment conditions were equally trained and supervised and that the number of sessions was equivalent in both treatment conditions) substantially moderated the differences initially observed when all studies were included in the analysis. In the stratified meta-analyses, the superiority of trauma-focused treatments over the placebo controls was larger in studies without equivalence between the two treatment conditions, which was most likely due to an underestimation of the efficacy of the placebo control. Heterogeneity was considerably reduced in the stratified analyses. Furthermore, the initially observed superiority of trauma-focused treatments over placebo controls was moderated by patient characteristics: by combining several indicators of more complex clinical problems (e.g. the presence of comorbid disorders in addition to the PTSD symptoms or trauma history, as suggested by Cloitre and colleagues [49]), the moderator analysis conducted by Gerger and colleagues [7] revealed that patients with more complex clinical problems benefited equally from trauma-focused PTSD treatments and from the psychological placebo control. In contrast, in studies with less complex problems (e.g. PTSD symptoms without comorbid symptoms following a single trauma), patients benefited more from the trauma-focused treatments than from the placebo control. Again, the inclusion of the moderator reduced heterogeneity, and only a small amount of between-study heterogeneity remained unexplained. 
Importantly, in studies with less complex clinical problems and structural inequivalence, a clear superiority of trauma-focused over placebo control treatments was observed (ES = 0.93; p = 0.001), whereas this was not the case in studies with complex clinical problems and structural equivalence of the trauma-focused treatment and placebo control (ES = 0.11; p = 0.28).

Moderators in comparative PTSD RCTs
A concrete example of the presence of between-study heterogeneity, i.e. contradicting findings from individual studies that compared two rival PTSD treatments, can be found in the review by Bisson and Andrew [23]. In one of the conducted meta-analyses, out of six studies that compared CBT and EMDR, three studies reported moderate to large superiority of CBT on clinician-rated PTSD scores, while the remaining three studies reported the exact opposite effect, namely moderate to large superiority of EMDR over CBT. Overall, the meta-analysis indicated no difference between the effects of the two treatments, but a large amount of between-study heterogeneity (ES = 0.03, p = 0.92, τ² = 0.28). Without further exploration of the observed heterogeneity, no valid conclusions may be drawn from such data [35].
Several meta-analyses aimed at explaining such heterogeneity between individual study estimates in PTSD RCTs. Two meta-analyses [14,34] included different types of PTSD treatments, but found no evidence for the type of psychotherapeutic PTSD treatment to explain between-study heterogeneity. Rather, Gerger et al. [14] found evidence for the presence of publication bias with respect to the trauma-focused PTSD treatments: a meta-analysis that was restricted to large-scale studies demonstrated considerably reduced treatment effects compared to the effects found in the overall analysis or in an analysis that was restricted to small-scale studies. The between-study heterogeneity, which was very large in the initial analysis (τ² = 0.29), was considerably reduced in the analysis restricted to large-scale trials (τ² = 0.08).
One possible explanation for the striking differences in the direction of effects between two treatments as in the EMDR-CBT comparison by Bisson and Andrew [23] is the presence of researchers' preferences for one over the other treatment, the so-called researcher allegiance [50]. Accordingly, the intriguing pattern of results in the EMDR-CBT meta-analysis by Bisson and Andrew [23] could simply be explained by the fact that in one half of the studies researchers preferred CBT and in the other half researchers preferred EMDR. While, by chance, in this particular case, the distribution of researcher allegiance appears balanced across the six included studies, an unbalanced preference for one particular treatment may be more problematic. In fact, a meta-analysis on trauma-focused PTSD treatments found researcher allegiance to significantly correlate with effect-size differences between the trauma-focused PTSD treatments (r = 0.35) and to explain between-study heterogeneity [15]. Further, Munder and colleagues presented evidence for the assumption that the association between researcher allegiance and outcome was due to bias [16] and against the assumption that true differences in the effectiveness of different types of PTSD treatments explained the association between researcher allegiance and outcome [15].
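The reasoning that opposite allegiances can cancel out in the pooled estimate while inflating heterogeneity can be made concrete with a small sketch. All numbers below are invented for illustration and are not the data from Bisson and Andrew [23]; the function names are hypothetical. Effects are coded as treatment A minus treatment B, and allegiance is coded +1 if the researchers favoured A and -1 if they favoured B. Re-expressing each effect in the direction of the researchers' preferred treatment shows how much of the heterogeneity lines up with allegiance.

```python
def pooled_and_q(effects, variances):
    """Return the inverse-variance pooled estimate and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, q


def align_with_allegiance(effects, allegiance):
    """Re-express each effect as 'preferred treatment minus the other'."""
    return [y * a for y, a in zip(effects, allegiance)]


if __name__ == "__main__":
    # Hypothetical values: six studies, half favouring each treatment.
    effects = [0.6, 0.5, 0.7, -0.6, -0.5, -0.7]   # A minus B
    variances = [0.1] * 6
    allegiance = [1, 1, 1, -1, -1, -1]            # +1 favours A, -1 favours B

    print(pooled_and_q(effects, variances))
    print(pooled_and_q(align_with_allegiance(effects, allegiance), variances))
```

In this constructed example the raw effects pool to roughly zero with a large Q, whereas the allegiance-aligned effects pool to a clearly positive value with a small Q, which is the pattern one would expect if allegiance rather than the treatments themselves drove the individual study results.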
Thus, meta-analyses of comparative PTSD RCTs failed to demonstrate the superiority of particular characteristic components, but demonstrated the relevance of researcher allegiance (a factor that is incidental to the treatment) in explaining differences between individual study results. In the case of PTSD outcome studies, comparative RCTs therefore run a considerable risk of providing biased estimates of the contribution of characteristic treatment components to the entire treatment effect. Furthermore, while at first sight meta-analyses of placebo-controlled PTSD RCTs appeared to support the claim that focusing on the trauma is necessary for successful PTSD treatment, a closer examination of potential moderators of treatment effects indicated that a trauma focus might be necessary for some but not all patient samples. A thorough implementation of the assumed psychological placebo might further enhance its effectiveness and, hence, reduce the superiority of trauma-focused treatments over placebo controls. The finding of only a small and non-significant superiority of established PTSD treatments over present-centered therapy in a meta-analysis [6], as well as a recent meta-analysis on counseling treatments for PTSD [5], support the objection to a general necessity of a trauma focus in psychotherapeutic PTSD treatment.

Conclusion
Our analysis of PTSD outcome research demonstrates the presence of considerable conceptual problems in PTSD RCTs, which limit the validity of the conclusions that may be drawn from these studies when trying to identify the most beneficial treatment components. In placebo-controlled RCTs, an inappropriate implementation of the placebo control led to an overestimation of the superiority of the PTSD treatments over the placebo control, and in comparative RCTs, the presence of unbalanced researcher allegiance led to biased estimates of treatment effect differences. Besides such conceptual issues, which hamper valid conclusions from PTSD RCTs, the moderating role of patient characteristics confirms the recent conclusion that 'one size does not fit all' in PTSD treatment [49]. Thus, moderators of treatment effects in PTSD RCTs may include genuine diversity, which contributes to the apples-and-oranges problem and indicates a need for differential treatments, but may also include factors that contribute to bias, that is, the garbage-in, garbage-out problem as well as the file-drawer problem.
Future attempts to identify the most beneficial components of PTSD treatments should therefore consider not only the theory-driven characteristic components but must also further investigate how the assumed incidental factors may affect the outcome, in order to warrant the validity of conclusions from PTSD outcome research. The underlying etiological theories may need revision if moderators indicate genuine diversity, and study methodology may need to be adapted in order to ensure the validity of psychotherapy outcome studies. Neglecting extra-therapeutic moderators may threaten the validity of RCTs and meta-analyses and may result in misleading recommendations for researchers, practitioners and policymakers, who base their treatment decisions on empirical findings. On the other hand, the possibility of exploring sources of genuine diversity between RCTs when conducting a meta-analysis (i.e. conducting moderator analyses in order to explain between-study heterogeneity) may be seen as an important step towards personalized psychotherapy [51].
It is important to note, however, that moderator analyses in meta-analyses, even if they include only high-quality RCTs, should always be considered as retrospective and observational in nature, because the studies were not randomly assigned according to their characteristics (e.g. studies have neither been randomly assigned to being of high vs. low quality nor to having included patients with complex vs. non-complex problems). Thus, the results from moderator analyses in meta-analyses should be considered as hypothesis generating and would ideally be confirmed, where possible, by high-quality experimental research.