QoLMiss: Package for Repeatedly Measured Quality of Life of Cancer Patients Data

Ankita Pal; Satyajit Pradhan; Aseem Mishra; Pankaj Chaturvedi; Atanu Bhattacharjee

doi:10.5772/intechopen.1001908

Abstract

Quality of Life (QoL) has become increasingly important in cancer clinical trials. The R package QoLMiss was developed for computing the sub-domain scores of the European Organization for Research and Treatment of Cancer (EORTC) questionnaires and for linking these sub-domains to survival. The package provides scale scoring and survival outcomes for the domains obtained from QoL data. Scale scores can also be evaluated when missing data are present in repeatedly measured QoL data. The cancer-specific QLQ modules are also covered, for both scoring and survival analysis.

Keywords

quality of life
cancer
QLQ-C30
functional scales
symptom scales
global health status
hazard ratio
R package

Author Information

Ankita Pal*
- Mahamana Pandit Madan Mohan Malaviya Cancer Centre, Tata Memorial Centre, India
Satyajit Pradhan
- Mahamana Pandit Madan Mohan Malaviya Cancer Centre, Tata Memorial Centre, India
Aseem Mishra
- Mahamana Pandit Madan Mohan Malaviya Cancer Centre, Tata Memorial Centre, India
Pankaj Chaturvedi
- Centre for Cancer Epidemiology, Tata Memorial Centre, India
- Homi Bhabha National Institute, Mumbai, India
Atanu Bhattacharjee
- Homi Bhabha National Institute, Mumbai, India
- Section of Biostatistics, Centre for Cancer Epidemiology, Tata Memorial Centre, India

*Address all correspondence to: apalstat97@gmail.com

1. Introduction

Current oncology focuses not only on pharmacological treatment but also on a fuller understanding of the experiences of patients and their families. This will help in prioritizing the allocation of resources and planning and providing holistic care that will measurably affect the quality of life [1]. The therapeutic efficacy of a controlled clinical trial is most often measured in terms of patients’ survival. In cancer trials, several types of survival times are used, which include disease-free, relapse-free, local recurrence-free survival, LR recurrence-free survival, progression-free survival, disinfection-free survival, and overall survival [2].

Nearly every cancer treatment that intends a cure in some way interferes with a patient’s bodily integrity. Quality of life (QoL), or a person’s well-being, encompasses a broad range of variables describing the patient’s subjective reactions to and perceptions of their environment. As long as a treatment fails to extend patients’ lives substantially, preference is given to improving QoL [3]. Understanding the social consequences of disease is very important for any treatment protocol, as is acknowledging that medical intervention aims to increase both the length and the quality of life. For these reasons, the quality, effectiveness, and efficiency of health care are often evaluated by their impact on a patient’s QoL [4].

The statistical analysis of QoL is challenging, so a few assumptions are to be considered: (i) QoL is a subjective construct that is observed and measured indirectly; (ii) it is multidimensional, based on different characteristics of physical and psychological well-being; and (iii) it is time-dependent, reflecting a person’s experiences.

QoL in cancer is a dynamic, multidimensional concept that refers to the patient’s day-to-day life, balancing the present situation against the ideal situation at a given time [5]. It is a specific and multidimensional type of patient-reported outcome (PRO) that encompasses the patients’ social, financial, psycho-social, and physical activities [6, 7]. After completion of treatment, the QoL of cancer patients is related both directly and indirectly to health, disease, disability, and impairment. More severe symptoms have been associated with higher levels of emotional suffering and with poorer physical and social functioning and global QoL.

QoL is studied through validated questionnaires that patients fill in at different time points. A first-generation core questionnaire, the EORTC QLQ-C36, was developed in 1987 [8]. The European Organization for Research and Treatment of Cancer (EORTC) has since developed the QLQ-C30, a widely used and well-validated instrument that assesses Health-Related Quality of Life (HRQoL) in cancer patients with 30 questions, supplemented by some extra questions related to disease-specific treatment measurements.

EORTC QLQ-C30 items are scored on a 4-point response scale, ranging from "not at all" to "very much", except for the last two questions, which are scored on a 7-point response scale. Statistically, the most interesting feature of QoL evaluation is its time-dependent structure. When a patient faces the diagnosis of a fatal disease or a distorting treatment, their well-being may be affected and hence decline. In other words, the process of surviving the disease and its treatment is reflected in their future QoL. Traditional clinical trials that measure the time to a fatal event naturally take the time factor into account in treatment comparisons, and the statistical procedures used for these trials are survival analysis methods [9].

The aim of this paper is to present R package functions that work easily with the different sub-domains of the EORTC questionnaires, both for the QLQ-C30 and for the cancer-specific QLQs. Missing data are also quite common in repeatedly measured QoL data, so the package includes functions that impute missing observations and provide analytical support for working with QoL data. Missing observations are imputed with the minimum value of the corresponding question. Survival analysis is also performed for all sub-domains of the cancer-specific quality of life questionnaires.

2. Methodology

The EORTC QLQ-C30 questionnaire provides functional scales, symptom scales, and a global health status scale. There are five functional scales: Physical Functioning (PF), Role Functioning (RF), Cognitive Functioning (CF), Emotional Functioning (EF), and Social Functioning (SF). There are nine symptom scales or items: Fatigue, Nausea and Vomiting, Pain, Dyspnoea, Insomnia, Appetite Loss, Constipation, Diarrhea, and Financial Difficulties. Each multi-item scale comprises its own set of items. All scales have scores ranging from 0 to 100. A high score on a functional scale represents a healthy level of functioning, a high score on a symptom scale represents a high level of symptomatology or problems, and a high score on the global health status scale represents a high QoL. The scoring principle is the same for all domains: a linear transformation is used to standardize the raw scores [8].

The procedure for computing the domain-wise scale scores [8] is expressed in terms of the scale items $I_1, I_2, \ldots, I_n$, where $n$ is the number of items in the scale being scored. For instance, the QLQ-C30 as a whole comprises the items $I_1, I_2, \ldots, I_{30}$. The scoring manual [8] has a detailed explanation for all the domains and sub-domains.

For all scales, the raw score (RS) is the mean of the component items, given as,

$$\text{RawScore} = RS = \frac{I_1 + I_2 + \dots + I_n}{n} \tag{1}$$

For Functional Scales, the formula for calculating scores is,

$$\text{Score} = \left(1 - \frac{RS - 1}{\text{range}}\right) \times 100 \tag{2}$$

For Symptom scales/items and Global health scales/QoL, the formula for calculating scores is,

$$\text{Score} = \left(\frac{RS - 1}{\text{range}}\right) \times 100 \tag{3}$$

Range is the difference between the possible maximum and the minimum response to individual items; most items take values from 1 to 4, giving a range = 3. The QLQ-C30 has been designed so that all items on any scale take the same range of values. The exception is the global health status/QoL scale, whose two items take values from 1 to 7, giving a range = 6.
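The three formulas above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper names (raw_score, functional_score, symptom_global_score) and the toy item columns are hypothetical and are not the functions or data shipped with QoLMiss.

```r
# Hypothetical helpers illustrating the three scoring formulas;
# these are NOT the QoLMiss source code.

# Raw score: mean of the items that belong to one scale
raw_score <- function(items) {
  rowMeans(as.matrix(items))
}

# Functional scales: a high score means better functioning
functional_score <- function(rs, range = 3) {
  (1 - (rs - 1) / range) * 100
}

# Symptom scales/items and global health status/QoL
symptom_global_score <- function(rs, range = 3) {
  ((rs - 1) / range) * 100
}

# Toy responses of two patients on a three-item, 4-point scale
# (column names are placeholders, not QLQ-C30 item numbers)
toy <- data.frame(A = c(1, 4), B = c(2, 4), C = c(3, 4))
rs <- raw_score(toy)                 # raw scores 2 and 4
fs_out <- functional_score(rs)       # about 66.67 and 0
ss_out <- symptom_global_score(rs)   # about 33.33 and 100

# Two-item scale answered on a 7-point scale, so range = 6
gh <- data.frame(G1 = c(1, 7), G2 = c(2, 7))
gh_out <- symptom_global_score(raw_score(gh), range = 6)  # about 8.33 and 100
```

The same raw score therefore maps to opposite ends of the 0 to 100 scale depending on whether the scale is a functional one or a symptom one, which is why the two formulas are kept apart.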

A function was developed in R using the above formulae. The purpose of this function is to convert the item-wise values into domain-wise scores and generate a comprehensive dataset.

Separate functions were prepared for the raw score, the functional scale scores, and the global health status/QoL and symptom scale scores, and these were then collated under one single function. This function takes the entire data set as input and retains only the columns containing the responses to the 30 questions, treating them as the revised data. Within it, three nested functions calculate the domain-wise scale scores.

The first function, rs(), calculates the raw score. The functional scale score function, fs(), applies the functional scale formula given above to all the scales under this domain. The combined function ss_gs() handles the global health status/QoL and the symptom scales, since both use the same scoring formula. The three functions rs, fs, and ss_gs were combined under a single function, named qol or qol_miss depending on the type of data used. The qol function works with both complete data and data with some missing information: it first checks for missing values and, if any are found, replaces them with the minimum value before calculating the scores; if none are found, it proceeds directly to the score calculation. If, on the other hand, a patient has missing values for all the questions, that is, if one or more rows are entirely missing, the qol_miss function can be used. In that case, the rows with completely missing data are omitted and the score calculations are then performed. A flowchart of the process for calculating all the scale scores using the functions available in the QoLMiss package is shown below (Figure 1).

Figure 1.
Flowchart of the algorithm used to calculate the domain-wise scale scores from the QoL data.
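The two missing-data rules described above can be illustrated with a small R sketch. It assumes that "the minimum value" means the minimum observed value of each question, which is one reading of the text; the function names impute_min and drop_empty_rows are hypothetical and do not come from QoLMiss.

```r
# Hypothetical sketch of the two missing-data rules; not the QoLMiss code.

# Rule 1 (qol-style): treat NA, 9 and 99 as missing and replace them with
# the minimum observed value of that question (an assumed interpretation).
impute_min <- function(items, missing_codes = c(9, 99)) {
  items[] <- lapply(items, function(col) {
    col[col %in% missing_codes] <- NA
    if (all(is.na(col))) return(col)        # nothing to impute from
    col[is.na(col)] <- min(col, na.rm = TRUE)
    col
  })
  items
}

# Rule 2 (qol_miss-style): drop patients (rows) with no answers at all
drop_empty_rows <- function(items) {
  keep <- rowSums(!is.na(items)) > 0
  items[keep, , drop = FALSE]
}

# Toy data: patient 4 answered nothing; patients 1 and 2 have coded gaps
toy <- data.frame(Q1 = c(1, NA, 3, NA),
                  Q2 = c(2, 99, 4, NA),
                  Q3 = c(9, 2, 1, NA))
cleaned <- impute_min(drop_empty_rows(toy))
cleaned
```

Dropping empty rows before imputing keeps a patient with no information from being turned into a patient who answered the minimum on every question.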

The domain-wise scale scores are also calculated from the cancer-specific questionnaires for lung, head and neck, breast, ovarian, and thyroid cancer. The functions are named lc_qol for QLQ-LC13, hnc_qol for QLQ-HN35, brc_qol for QLQ-BR23, ovc_qol for QLQ-OV28, and thyc_qol for QLQ-THY34. These functions work in the same way as the qol and qol_miss functions shown in the flowchart above.

Another set of functions was prepared for determining the survival outcome for every scale score. The hazard ratio (95% CI) is calculated for each scale with the help of the function coxph() from the survival package. The hazard ratios help in understanding how the patients' survival relates to the domain-wise scale scores.

The functions prepared for determining the survival relationship are named surv_c30, surv_c30_miss, surv_lc13, surv_hn35, surv_br23, surv_ov18, and surv_thy34. The first step in all these functions is to divide the data according to the two arms, which helps in comparing the survival outcomes between the two arms. The different scales are treated as covariates, and a univariate analysis of each scale is performed using the Cox proportional hazards model. This analysis provides the hazard ratio (95% CI) for each of the scales.

The survival functions therefore take the entire dataset as input, provided the data contain columns named ‘time’, ‘event’, and ‘arm’. The ‘time’ column should contain the patients' survival times. The ‘event’ column should contain the patient's status, coded 0 if the patient is alive and 1 if death has occurred. The ‘arm’ column should contain the arm to which the patient was randomized. The data are then passed to the respective QoL function to obtain the domain-wise scale scores, which are in turn passed to coxph() from the survival package to obtain the hazard ratios (95% CI). All the survival functions therefore return a data frame containing the hazard ratio and its 95% CI for every scale. A flowchart of the analysis performed by these functions is provided below (Figure 2).

Figure 2.
Flowchart of the algorithm used to perform the survival analysis from the QoL data.
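As a rough illustration of the univariate step, the sketch below fits one Cox model per scale score with coxph() and collects the hazard ratio with its 95% confidence interval. It is a minimal sketch under stated assumptions: the helper univariate_hr and the toy data are hypothetical, the handling of the ‘arm’ column is not reproduced, and the actual surv_* functions may differ in detail.

```r
library(survival)

# Hypothetical helper: one Cox model per scale, returning HR and 95% CI.
# Expects columns named 'time' and 'event' in df, as described in the text.
univariate_hr <- function(df, scales) {
  rows <- lapply(scales, function(s) {
    fit <- coxph(as.formula(paste("Surv(time, event) ~", s)), data = df)
    ci <- unname(exp(confint(fit)))           # Wald interval on the HR scale
    c(HR = unname(exp(coef(fit))), lower = ci[1, 1], upper = ci[1, 2])
  })
  out <- as.data.frame(do.call(rbind, rows))
  rownames(out) <- scales
  out
}

# Toy data (made up for illustration): two scale scores on the 0-100 range
set.seed(1)
n <- 100
toy <- data.frame(
  time  = rexp(n, rate = 0.01),    # survival time, arbitrary unit
  event = rbinom(n, 1, 0.6),       # 1 = death, 0 = alive
  PF    = runif(n, 0, 100),
  FA    = runif(n, 0, 100)
)
hr <- univariate_hr(toy, c("PF", "FA"))
hr
```

Because each scale score runs from 0 to 100, a hazard ratio from such a model is per one-point change in the score, which is why values very close to 1 are to be expected.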

3. Simulation

The first step in all qol functions was to select only the 30 columns containing the data of the 30 questions from the complete response values.

We chose to work with simulated data and to derive and analyze the results from them. Data were simulated from the Poisson distribution with mean λ = 2.5. A random sample of size 100 was simulated for each of the 30 questions, giving a complete data set with the information of 100 patients. The only condition to be checked is that the columns of the 30 questions are named Q1, Q2, …, Q30.
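The simulation step just described might look as follows in R. This is a sketch, not the authors' script: the seed and the object name sim are arbitrary choices.

```r
set.seed(123)            # arbitrary seed, only for reproducibility
n_patients <- 100
n_items <- 30

# One Poisson(2.5) draw per patient and question
sim <- as.data.frame(
  matrix(rpois(n_patients * n_items, lambda = 2.5),
         nrow = n_patients, ncol = n_items)
)
names(sim) <- paste0("Q", seq_len(n_items))   # Q1, Q2, ..., Q30

dim(sim)
```

Raw Poisson draws are not confined to the 1 to 4 or 1 to 7 response ranges of the questionnaire; the text does not say how such values were treated, so no truncation is applied here.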

In some cases, the data contain missing information, denoted as NA. Most of the 30 questions take values from 1 to 4 and the last two take values from 1 to 7, so a missing response may also have been entered as 9 or 99. A second data set was therefore simulated that contained missing information for some patients, with values appearing as NA, 9, or 99.

It can also happen that no information is obtained for a particular patient, so that all 30 questions are recorded as NA. A third type of data was therefore simulated in which no information was available for some patients and their data are represented as NA.

The survival functions need data with three additional columns: time to event (denoted as time), status of the patient (denoted as event), and type of treatment (denoted as arm). The survival functions surv_c30, surv_c30_miss, surv_lc13, surv_hn35, surv_br23, surv_ov18, and surv_thy34 therefore take as input a dataset that contains the 30 questions together with time, event, and arm.
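The other simulated data sets described in this section can be sketched in the same way. Everything specific in the block below (the seed, the number of affected cells, the row indices, and the distributions used for time, event, and arm) is an arbitrary assumption made for illustration, since the text does not give these details.

```r
set.seed(42)                         # arbitrary seed
n <- 100
sim <- as.data.frame(matrix(rpois(n * 30, lambda = 2.5), nrow = n))
names(sim) <- paste0("Q", 1:30)

# Second data type: a few cells recorded as NA, 9 or 99
sim_partial <- as.matrix(sim)
cells <- sample(length(sim_partial), 15)          # 15 cells, arbitrary count
sim_partial[cells] <- rep(c(NA, 9, 99), each = 5)
sim_partial <- as.data.frame(sim_partial)

# Third data type: some patients with no responses at all
sim_empty <- sim
empty_ids <- c(5, 17, 60)                         # arbitrary patient rows
sim_empty[empty_ids, ] <- NA

# Extra columns expected by the survival functions
add_survival <- function(d) {
  d$time  <- round(rexp(nrow(d), rate = 1 / 365)) + 1  # assumed time scale
  d$event <- rbinom(nrow(d), 1, 0.5)                   # 1 = death, 0 = alive
  d$arm   <- sample(1:2, nrow(d), replace = TRUE)      # two treatment arms
  d
}
sim_surv <- add_survival(sim)
```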

4. Results

After the simulated data is passed to any of the following functions qol, qol_miss, lc_qol, hnc_qol, brc_qol, ovc_qol, or thyc_qol, depending on the type of data, the domain-wise scores are calculated with the help of the function fs for Functional Scale scores, and the combined function ss_gs for Global health status/QoL and Symptom Scale scores. A matrix of dimension 100 × 30 is obtained where the 30 domain-wise scale scores are for each of the 100 patients. These values are replaced in the data set with the 30 questions and returned as the final result.

Suppose some of the values entered in the data are NA, 9, or 99. In that case, the data are passed to the qol function, which checks for missing information and, if any is found, replaces these values with the minimum value of that particular question, which is generally 1. Complete data without any missing or incorrect values are thus obtained and can be used for calculating the domain-wise scale scores. The cancer-specific functions lc_qol, hnc_qol, brc_qol, ovc_qol, and thyc_qol behave in the same way as the qol function, depending on the cancer-specific QoL questionnaire data.

If no information is available for a patient, that is, if all the scale values are NA, the data are passed to the qol_miss function and this patient is removed from the data. After the required changes have been made, the domain-wise scale scores are calculated.

For performing the univariate survival analysis considering the domain-wise scale scores as the covariates, the simulated data is passed to any of the following functions surv_c30, surv_c30_miss, surv_lc13, surv_hn35, surv_br23, surv_ov18, or surv_thy34, depending on the type of data. These data will again be passed to any of the qol functions as required for obtaining the domain-wise scale scores. These outputs are passed to the coxph() function from the survival package for obtaining the hazard ratios (95% CI) for each of the domain-wise scale scores.

5. Illustration

A simulated data set was obtained from the Poisson distribution with mean λ = 2.5. The data are complete, without any missing information, so no modifications are required after passing them to the qol function. The final data frame contains the domain-wise scale scores.

Similarly, data were simulated in which some values were recorded as NA, 9, or 99. The qol function can also be used with these data. The values NA, 9, and 99 were replaced with the minimum value for that particular question. After this modification, the domain-wise scale scores were calculated and a data frame containing them was returned.

Lastly, the third type of data was simulated, in which no information was available for some patients and their responses were represented as NA. After passing these data to the qol_miss function, these patients were removed from the data frame. The function is called as qol(x), where x is the input data; here the simulated data named c30_df were passed, that is, qol(c30_df). A small part of the output is provided below.

> # Load the package and the simulated data
> library(QoLMiss)
> data("c30_df")
> # Display head of the scale scores dataframe
> head(qol(c30_df))

ID	time	event	arm	QL	PF	RF	EF	CF	SF
1	498	0	2	8.33	46.67	50.00	58.33	66.67	100.00
2	91	0	1	16.67	40.00	50.00	50.00	50.00	33.33
3	13	1	1	8.33	53.33	50.00	41.67	50.00	50.00
4	707	0	2	16.67	60.00	33.33	41.67	83.33	50.00
5	993	1	2	66.67	33.33	66.67	66.67	83.33	50.00
6	23	0	1	25.00	73.33	16.67	33.33	83.33	16.67
ID	FA	NV	PA	DY	SL	AP	CO	DI	FI
1	44.44	0.00	33.33	0.00	0.00	33.33	33.33	100.00	100.00
2	55.56	16.67	83.33	66.67	0.00	33.33	0.00	100.00	100.00
3	33.33	50.00	33.33	33.33	66.67	66.67	66.67	33.33	33.33
4	11.11	0.00	50.00	66.67	0.00	33.33	0.00	66.67	66.67
5	44.44	50.00	66.67	100.00	66.67	33.33	0.00	100.00	66.67
6	66.67	50.00	16.67	0.00	66.67	0.00	33.33	100.00	33.33

The data simulated for testing the qol and qol_miss functions also contained three more columns with the time to event (denoted as time), the status of the patient (denoted as event), and the type of treatment (denoted as arm), which help in illustrating the surv_c30, surv_c30_miss, surv_lc13, surv_hn35, surv_br23, surv_ov18, and surv_thy34 functions. The illustration uses surv_c30(x), where x is the input data; here the simulated data named c30_df were passed, that is, surv_c30(c30_df). The output obtained is provided below.

> # Load the simulated data
> data("c30_df")
> # Display the Hazard Ratios (95% CI)
> surv_c30(c30_df)

	HR	Lower 95% CI	Upper 95% CI
QL	1.030	1.030	1.020
PF	1.010	1.020	0.999
RF	1.010	1.010	1.000
EF	1.010	1.010	1.010
CF	1.010	1.010	1.000
SF	1.000	1.000	1.000
FA	0.994	0.991	0.997
NV	1.010	1.020	1.010
PA	0.986	0.992	0.980
DY	1.010	1.010	1.010
SL	1.010	1.010	1.000
AP	1.010	1.000	1.010
CO	1.000	1.000	1.000
DI	0.993	0.994	0.993
FI	0.975	0.979	0.972

6. Discussion

The application of QoL assessment is unavoidable in cancer care [10, 11]. Because there are not enough ready-to-use functions for calculating the scale scores, we prepared the QoLMiss package and its methods for working with QoL data from cancer patients. It provides domain-wise score computation quickly. Using the domain-wise scores helps identify the significant symptoms affecting patients and thereby supports the goal of understanding patients' well-being in QoL analyses of oncology clinical trials.

QoLMiss will be updated as new modules are developed by the EORTC group. The package will be extended over time with additional Cox proportional hazards analyses of QoL score data and with new imputation techniques. Our future work will also involve preparing functions for Bayesian survival analysis.

Further research can explore different missing-data imputation techniques in scenarios where data are Missing Completely at Random (MCAR), Missing at Random (MAR), or Missing Not at Random (MNAR), and then evaluate the domain-wise scores. A further aim is to use the different scale scores to analyze the quality of life of cancer patients in more depth. The QoLMiss package repository is available at https://github.com/apstat/QoLMiss-Package.

References

1. Adler NE, Page AEK. Cancer Care for the Whole Patient: Meeting Psychosocial Health Needs. Washington, DC, USA: National Academies Press/Institute of Medicine (US) Committee on Psychosocial Services to Cancer Patients/Families in a Community Setting; 2008
2. Olschewski M, Schumacher M. Statistical analysis of quality of life data in cancer clinical trials. Statistics in Medicine. 1990;9:749-763
3. Wood-Dauphinee S, Troidl H. Endpoints for clinical studies: Conventional and innovative variables. In: Troidl H, Spitzer WO, McPeek B, Mulder DS, McKneally MF, editors. Principles and Practice of Research: Strategies for Surgical Investigators. New York: Springer; 1986. pp. 53-68
4. Carr AJ, Gibson B, Robinson PG. Is quality of life determined by expectations or experience? BMJ. 2001;322(7296):1240-1243. DOI: 10.1136/bmj.322.7296.1240
5. Aaronson NK, Cull A, Kaasa S, Sprangers MAG, for the EORTC Study Group on Quality of Life. The European Organization for Research and Treatment of Cancer (EORTC) modular approach to quality of life assessment in oncology: An update. In: Spilker B, editor. Quality of Life and Pharmacoeconomics in Clinical Trials. 2nd ed. New York, NY, USA: Raven Press; 1996. pp. 179-189
6. Reale ML, De Luca E, Lombardi P, Marandino L, Zichi C, Pignataro D, et al. Quality of life analysis in lung cancer: A systematic review of phase III trials published between 2012 and 2018. Lung Cancer. 2019;139(2020):47-54
7. Jitender S, Mahajan R, Rathore V, Choudhary R. Quality of life of cancer patients. Journal of Experimental Therapeutics & Oncology. 2018;12(3):217-221
8. EORTC QLQ-C30 Scoring Manual. 3rd ed. Brussels: EORTC; 2001. ISBN: 2-9300 64-22-6
9. Cox DR, Oakes D. Analysis of Survival Data. London: Chapman and Hall; 1984
10. Nayak MG, George A, Vidyasagar MS, et al. Quality of life among cancer patients. Indian Journal of Palliative Care. 2017;23(4):445-450. DOI: 10.4103/IJPC.IJPC_82_17
11. Hassen AM, Taye G, Gizaw M, Hussien FM. Quality of life and associated factors among patients with breast cancer under chemotherapy at Tikur Anbessa specialized hospital, Addis Ababa, Ethiopia. PLoS ONE. 2019. DOI: 10.1371/journal.pone.0222629

Recent Advances in Biostatistics [Working Title]
