Internal consistency scores for Clinical Frailty Scale across elderly population from different geographical areas.
Geriatrics is an applied science, just as its practice is an art of medicine. As in any scientific discipline, there is a potential race for measurements. Frailty stands among the poorly defined concepts in geriatric medicine, and there are philosophical, circumstantial, and practical reasons behind this seeming clinical tragedy. This chapter addresses the reliability and validity of currently applied frailty scales and indicators across different population bases. It acknowledges the contribution of Fried’s frailty scale and describes different frailty scales and indicators tested in America, Europe, and Asia. The chapter also challenges the popular belief behind the application of Cronbach’s α coefficient of test scores for reliability assessment in clinical research. Other research gaps are highlighted, including the merging of clinical research findings in geriatrics with psychosocial aspects under the emerging field of geropsychology. Lastly, it proposes a solution for use in future studies that aim to assess the reliability of test scores in clinical and biomedical sciences.
Geriatric medicine is a relatively young sub-specialty of medicine. Unlike fields such as general internal medicine or surgery, which are known to have existed since antiquity, geriatrics gained significant popularity in orthodox medical practice only around the second half of the twentieth century. Geriatrics is
The clinical characterization of modern geriatric medicine owes much to the pioneering work of Professor Bernard Isaacs in the 1960s. Isaacs is credited in the published literature with having coined the term “
To date, considerable confusion persists in geriatric medicine regarding the measurable construct of
Atypical presentation in the elderly can be exemplified by, say, a nonagenarian lady presenting to the emergency department of a typical hospital with symptoms and signs suggestive of an acute confusional state such as delirium, caused by
Frailty is a poorly defined syndrome almost exclusively confined to the elderly population. There are dozens of descriptions given for frailty [8, 9, 10, 11, 12, 13, 14, 15], each made by its original authors for a specified framework of interest. In a pedagogic sense, none can be applied systematically, without a pinch of doubt, by any given clinician or researcher. However, of the dozens of frailty definitions available, the one proposed in 2001 by Fried and colleagues in the Cardiovascular Health Study Collaborative Research Group is the framework most widely applied by clinicians, researchers, and bio-gerontologists the world over. The scientific framework for applying the frailty syndrome discussed in this chapter takes into account the famous observation from Sir George Box’s seminal paper of 1978 that
2. Reliability of frailty models in elderly population studies
Geriatrics is to the clinical sciences what physics or chemistry is to the natural sciences: a scientific discipline. Measurement is therefore at the core of its sustainability. It is under this framework that the frailty scales available the world over to date, and those yet to come, shall be assessed. Technically, assessing the quality of a scale involves two main domains, namely,
2.1 Internal consistency assessment of frailty scales used in elderly population
There are various ways of assessing the reliability of any given scale. Some are well known in the literature, and many others are probably in the pipeline for future use. The most popular methods include test-retest reliability, split-half reliability, and internal consistency reliability. Of these, this chapter deals with internal consistency reliability, a choice that derives from its conceptual meaning as opposed to the rest. Simply stated, internal consistency refers to the extent to which a scale provides stable and consistent results under a specified condition. One rule deserves mention here: in all accounts of assessing the reliability of any given scale, the reliability score obtained is not a constant property of the scale but merely a statistic for a given set of test results. A given scale may therefore yield different scores in different elderly populations, depending on a number of factors, some known (e.g., test settings and gender) and others unknown. Thus, caution in the interpretation of test scores is highly warranted.
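To make the point about sample dependence concrete, the following is a minimal Python sketch of Cronbach's alpha, the coefficient most often reported under the label of internal consistency. The function name and both data sets are hypothetical and invented purely for illustration; they do not come from any study discussed in this chapter.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Two hypothetical samples answering the same 3-item scale (0-2 per item).
sample_a = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [2, 2, 2], [0, 1, 0]]
sample_b = [[1, 1, 1], [1, 2, 1], [2, 1, 1], [1, 1, 2], [2, 2, 1]]

alpha_a = cronbach_alpha(sample_a)
alpha_b = cronbach_alpha(sample_b)
print(f"sample A: {alpha_a:.3f}, sample B: {alpha_b:.3f}")
```

The same three items yield very different coefficients in the two made-up samples, which is the sense in which the score is a statistic of a particular set of test results rather than a fixed property of the scale.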
2.1.1 Clinical Frailty Scale
The Clinical Frailty Scale (CFS) is a clinical judgment-based tool (originally designed as an epidemiological tool) to screen for frailty and other adverse health events, as opposed to fitness, in the older population. It was derived directly from a frailty index that formed part of the original design of the first part of the Canadian Study of Health and Aging (CSHA), a study that aimed to characterize cognitive impairment and other important health issues and was designed in 1991 as a prospective 5-year follow-up of 10,263 people aged at least 65 years [18, 19]. At the time of going to press, the Clinical Frailty Scale is a 9-point scale, made public in 2007, an expansion of the original 7-point scale published in 2005. It was originally developed in the second part of the CSHA as a quick means of summarizing frailty and other physical and mental challenges of old age following clinical assessment. The conceptual framework of the Clinical Frailty Scale relies on the “fitness and frailty” model, and the scale was designed by adopting the approach of Streiner and Norman. It is, for all practical purposes, not a questionnaire but a quantified summary of an elderly person’s overall health status in relation to mortality risk. Internal consistency scores for the Clinical Frailty Scale among elderly populations across different geographical areas are provided in Table 1.
2.1.2 Edmonton Frail Scale (EFS)
The Edmonton Frail Scale, first conceptualized by Darryl Rolfson at the University of Alberta, Canada, in 1999, was presented for peer review for the first time at the Canadian Geriatrics Society in Edmonton, Canada, in 2000. Since its first appearance in press, the scale has been applied in research, educational, and clinical settings for quantitative frailty assessment among senior citizens [15, 26–33]. The Edmonton Frail Scale consists of nine domains and 11 items; the initial scale devised by Rolfson at Edmonton had 10 domains. Each component may have a score of 0, 1, or 2, signifying normal health, mild/moderate impairment, or severe impairment, respectively. The domains are general health status; cognitive status; medication use; presence of social support; incontinence; nutrition; mood; functional dependency; and functional performance. Total scores are classified as no frailty (0–3 points), pre-frailty (4–5 points), frailty (6–8 points), and severe frailty (9–17 points). The internal consistency scores of the Edmonton Frail Scale for senior citizens across different geographical settings are reported in Table 2.
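The classification of total scores described above can be sketched in a few lines of Python. The function name is hypothetical, and the cut-offs are simply those stated in this section (0–3, 4–5, 6–8, 9–17); the sketch is an illustration of the banding, not a clinical tool.

```python
def classify_efs_total(total_score):
    """Map an Edmonton Frail Scale total (0-17) to the categories given in the text."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(total_score, bool) or not isinstance(total_score, int):
        raise ValueError("total score must be an integer")
    if not 0 <= total_score <= 17:
        raise ValueError("total score must lie between 0 and 17")
    if total_score <= 3:
        return "no frailty"
    if total_score <= 5:
        return "pre-frailty"
    if total_score <= 8:
        return "frailty"
    return "severe frailty"


# Hypothetical totals, used only to show each band.
for total in (2, 5, 7, 12):
    print(total, classify_efs_total(total))
```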
2.1.3 Groningen Frailty Indicator
The Groningen Frailty Indicator (GFI) is a 15-item indicator for the assessment of frailty, developed by Professor Steverink and colleagues at the University of Groningen, The Netherlands, and first published in 2001. The internal consistency findings for the GFI are summarized in Table 3.
2.1.4 Tilburg Frailty Indicator
The Tilburg Frailty Indicator (TFI) is a self-report questionnaire for screening frail community-dwelling older people. It was originally tested and validated in an elderly community in Roosendaal, The Netherlands, on the basis of a working framework developed by a team of Dutch scientists, and was first published in 2009. The TFI is unique among frailty indicators in that it includes multiple domains of human functioning but deliberately excludes disability. It consists of two parts, namely, multimorbidity and frailty domains. The first part (designated part A) contains 10 questions on determinants of frailty in relation to disease states, while the second part deals solely with frailty aspects. The internal consistency scores of the TFI across studies from different geographical areas are given in Table 4.
|Country/region||Construct validity index||Settings|
|1. China ||Physical domain: r = −0.39–0.57 (P < 0.001)|
Psychological domain: r = −0.47–0.49 (P < 0.001)
|2. The Netherlands ||Social domain: r = −0.35–0.71 (P < 0.001)|
Physical domain: r = −0.43–0.62 (P < 0.001)
Psychological domain: r = −0.19–0.46 (P < 0.001)
Social domain: r = 0.29–0.96 (P < 0.001)
Physical domain: r = 0.31 (P < 0.001)
Psychological domain: r = 0.24 (P < 0.001)
|3. Italy ||Social domain: r = 0.25 (P < 0.001)||Community-based|
3. Validity aspects of frailty scales and indicators used in elderly population
Much as reliability may loosely be taken as synonymous with precision in measurement, validity may naturally be taken to correspond to accuracy. It must be understood that geriatrics, like other branches of clinical medicine, is essentially an applied science. The reader is therefore cautioned against a substantial error in reasoning: assuming that constructs in clinical measurement can be measured with the exactness that natural scientists achieve in measuring, say, reaction time in physics or chemistry. On this basis, all aspects of validity discussed in this chapter rest on a number of assumptions, some of which may be hard to prove even when they are considered useful in the stated models. For instance, since most frailty scales and indicators were validated as sets of items, it is assumed that those items, taken collectively, refer to a construct of frailty and that, when applied to humans in their own context, they can distinguish those who are frail from those who are not. This section deals with only one important form of validity, namely construct validity.
Construct validity is a way of measuring a disposition, character, trait, or belief such that the accuracy of the measurement can be estimated with quantifiable degrees of confidence. Put simply, it is the extent to which a test measures what it claims to quantify. Construct validity differs from the other forms of validity in applied sciences, namely criterion and content validity, in that construct validation quantifies the quality of a measuring instrument with respect to what it claims to measure. Thus, for all practical purposes, this chapter endeavors to quantify the aspects hypothesized to assess frailty, as applied to senior citizens living in different geographical communities. Such an undertaking presupposes an acceptable level of reliability beforehand; without it, the instrument may be deemed invalid in practice. The construct validity of the different frailty scales and indicators is presented below.
4. The triumph and controversies surrounding reliability and validity of frailty scales/indicators in elderly population
It is important to underscore the association between what is characterized as
First, on reliability, it is important for geriatricians, other clinicians caring for senior citizens, clinician-scientists, policy makers, and other readers to be aware that the frailty scale and indicator scores derived from the studies cited above do not, in the strict sense, measure reliability. There is no doubt that no other statistic in the published literature has been the subject of wider confusion than
where ρxx′ is the test score reliability in a population, σx is the standard deviation of a population of interest, and σy is the standard measurement error of a sample of interest.
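For orientation, one standard relation from classical test theory connects the three quantities just defined. It is given here as an assumption about the general form, using the symbols above with σy as the standard measurement error, and should not be read as a reproduction of the chapter's own expression.

```latex
\sigma_y = \sigma_x \sqrt{1 - \rho_{xx'}}
\qquad\Longleftrightarrow\qquad
\rho_{xx'} = 1 - \frac{\sigma_y^{2}}{\sigma_x^{2}}
```

With purely hypothetical values σx = 2.0 and ρxx′ = 0.84, this gives σy = 2.0 × √0.16 = 0.8.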
It is important to remind readers that applying the standard measurement error as a measure of the internal consistency of test scores assumes that each individual’s score originated from a test with the same accuracy. Details of this method of assessing internal consistency, and therefore the inherent reliability of any given set of test scores, are given in earlier publications [43–47].
In this chapter, I have refrained from committing a rather common statistical crime. It is well known that meta-analysis of findings from individual studies, customarily presented using forest plots, is an effective way of deriving an effect size as well as identifying small and statistically insignificant results. I must admit, however, that there were strong attempts to pool reliability and validity estimates from different studies here. The eventual decision not to include forest plots from meta-analysis in this chapter is based on the same philosophy that lies behind the chapter, namely,
By and large, these studies differ significantly in their designs. For instance, whereas the findings in Table 1 reflect the assessment of Cronbach’s alpha coefficient, for what was referred to as internal consistency, in studies of the Clinical Frailty Scale, the study by Rockwood and colleagues in Canada was conceived as a prospective observational study. Moreover, the study by Chong and colleagues in Singapore was designed retrospectively. It therefore follows automatically that
Apart from the design differences between the studies whose estimates were pooled as a means of assessing reliability and validity in this chapter, heterogeneity is also suspected to arise from publication bias. Quite commonly in biomedical research, studies are published only if they attain positive outcomes for the research questions designed by their investigators. I do not support this practice, as I personally believe in learning from negative results, but I find it important to remind readers of it. It is quite possible that other studies were left out, either because they failed to appear in press for whatever reason or because the author overlooked them while retrieving the information used to compile these data. There are, of course, quantitative methods for assessing heterogeneity in statistical data [39, 40, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61]. However, those techniques cannot correct what went wrong at the design stage. It was therefore futile to apply them to data that were either collected retrospectively or shaped by publication bias.
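As an illustration of the kind of quantitative heterogeneity assessment referred to above, the following Python sketch computes Cochran's Q and the I² statistic from per-study estimates and their variances. The function name and the numbers are hypothetical and are not taken from any study cited in this chapter; such statistics describe heterogeneity but do not repair problems of design or publication bias.

```python
def cochran_q_and_i2(estimates, variances):
    """Return (Q, I2 in percent) for per-study estimates with known variances."""
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 is the share of Q in excess of its degrees of freedom, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical estimates from two made-up studies with equal variances.
q, i2 = cochran_q_and_i2([0.5, 0.9], [0.01, 0.01])
print(f"Q = {q:.2f}, I2 = {i2:.1f}%")
```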
On a positive note, however, the findings from these studies do probably highlight an important construct that is related to diminished ability of various body systems, currently coined as
Lastly, and as a matter of urgent priority, geriatricians, aging research scientists, and other practitioners and decision makers in health need to consider different population bases in their future research on frailty. At present, there appears to be palpable evidence that the demographic transition has started, and is likely to mature soon, in parts of sub-Saharan Africa. For instance, it is quite evident that Tanzania, just like other sub-Saharan African countries, has its population undergoing
Likewise, on a pioneering scale, global efforts in the interplay of
I wish to convey my sincere vote of thanks to my fellow trustees of
I also wholeheartedly thank all patients and other clients in the geriatric/endocrinology clinics at Moyo Safi Hospital, Alshifa Medical and Dialysis Centre, and AB hospitals in Dar es Salaam, Tanzania, for their openness, for a very welcoming ground in which to work, and for their appreciation of my geriatric practice. A special vote of thanks goes to Dr Chuor Garang de Alier for his tireless efforts in tracing some of the full-text documents used as references in this chapter. His helping hand despite a tight schedule as a student at St. Hugh’s College in Oxford, at times via midnight calls, aimed at ensuring that all references cited have been read in context, is highly appreciated.
Conflict of interest
No conflict of interest declared in the preparation of this manuscript.