Introduction
Patient-centered outcomes are of increasing importance in medical research. The Affordable Care Act provided $150 million in annual funding for the Patient-Centered Outcomes Research Institute, an organization expressly committed to addressing the questions and concerns most relevant to patients when comparing treatment strategies. Although there is robust research on patient-centered outcomes among patients undergoing medical therapies, research on patient-centered outcomes after surgical procedures is lacking. In January 2015, funding from the Agency for Healthcare Research and Quality supported a stakeholder conference on patient-reported outcomes (PRO) in surgery with a major goal of prioritizing PRO research in surgery.
An area of surgical quality under increasing scrutiny is surgical complications. However, our ability to measure them through current surgical evaluation systems is limited. The utility of complication data derived purely from administrative claims has been questioned. Claims data are a poor source of information regarding symptom burden and functional impairment; however, these key aspects of health-related quality of life (HRQOL) can be reliably assessed via PRO measures.
Currently, the impact of surgical complications on patient-reported HRQOL is unknown. We, along with others, have previously published on the relationship between preoperative HRQOL domains and postoperative complications. In this study we sought to discover which aspects of HRQOL are affected by adverse surgical outcomes in a new, larger, prospective cohort. The goal of this study was to measure the impact of surgical complications on HRQOL in women undergoing gynecologic and gynecologic oncology procedures, using commonly used PRO instruments to measure overall and cancer-specific HRQOL. The primary outcome of change in HRQOL scores was measured at 1 month, or 30 days, consistent with the commonly used 30-day window for postoperative complication assessment and postoperative recovery milestones.
Materials and Methods
Study design, enrollment, and data collection
We conducted a prospective longitudinal cohort study of women enrolled in the Health Registry/Cancer Survivorship Cohort (HR/CSC) at the University of North Carolina. This study was approved by the institutional review board (study no. 13-2367) and enrolled patients from October 2013 through October 2014, with final follow-up through May 2015. Under the HR/CSC protocol, patients were identified and recruited through the University of North Carolina Health Care oncology outpatient clinics with the following eligibility criteria: age ≥18 years; North Carolina mailing address; and English or Spanish speaking. Patients unable to provide informed consent or participate in interview questionnaires were excluded. For this study, eligibility was further restricted to HR/CSC patients recruited through the gynecologic oncology clinics, with newly diagnosed gynecologic cancer and planned surgical management. Initial exclusion criteria included primary surgery completed or to be completed at an outside institution, previous chemotherapy or radiation therapy, and pregnancy. After the first 10 weeks of enrollment, the exclusion criteria were modified to retain patients with final benign pathology, which allowed inclusion of women undergoing surgery for suspicious pelvic masses/suspected ovarian cancer. The exclusion criteria were also relaxed to admit patients with prior, but not active, chemotherapy or radiation treatment, which allowed inclusion of patients with new recurrences.
Baseline interviews were conducted within 2 weeks of enrollment, prior to surgery, by trained staff using a computer-assisted telephone interview software tool specifically developed for the HR/CSC. Follow-up interviews were conducted at 1, 3, and 6 months after surgery. Patients who completed follow-up interviews within a 3-week interval around each targeted time point (from 1 week before to 2 weeks after the target date [eg, 1, 3, or 6 months postsurgery]) were included. Participants received gift cards as compensation after completion of each interview. Interview questionnaire topics included medical and social histories, and general and cancer-specific HRQOL assessments. The cohort was limited to participants who met all inclusion criteria and completed at least the baseline and 1-month interviews, as this was the primary outcome point of the analysis. As a secondary outcome, the 3- and 6-month data of this cohort were also reported.
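The 3-week interview window described above can be sketched in a few lines of Python. This is an illustration only, not the registry's software: the function name, the example dates, and the choice to treat 1 month as 30 days (in line with the 30-day equivalence stated in the Introduction) are assumptions.

```python
from datetime import date, timedelta

# Assumption: 1 "month" is counted as 30 days; the study may have used calendar months.
DAYS_PER_MONTH = 30


def in_interview_window(surgery: date, interview: date, months: int) -> bool:
    """Return True if the interview falls from 1 week before to 2 weeks after the target date."""
    target = surgery + timedelta(days=DAYS_PER_MONTH * months)
    window_start = target - timedelta(weeks=1)
    window_end = target + timedelta(weeks=2)
    return window_start <= interview <= window_end


# Made-up dates for illustration only.
surgery_day = date(2014, 1, 6)
print(in_interview_window(surgery_day, date(2014, 2, 10), months=1))
print(in_interview_window(surgery_day, date(2014, 3, 1), months=1))
```

Under these assumptions the first interview falls inside the 1-month window and the second falls outside it.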
The following validated questionnaires were included in this analysis: the Functional Assessment of Cancer Therapy (FACT)-General Population (GP), the PRO Measurement Information System (PROMIS) Global Health Short Form v1.0, PROMIS Anxiety 4-item Short Form and PROMIS Depression 4-item Short Form, and the modified Work Ability Index (WAI). The FACT-GP version 4 is a 21-item scale that measures 4 HRQOL subscales: physical, functional, emotional, and social well-being. The minimally important difference (MID) for the FACT-GP is 5-7 points, and 2-3 points for each subscale. The FACT-GP covers a wide range of patient experiences (eg, pain, energy levels, social support) and is commonly used in gynecologic oncology clinical trials (Appendix). The PROMIS global is a 10-item scale including questions about fatigue, physical function, pain, emotional distress, and social health that are summarized into measures of physical and mental health. PROMIS scales are scored using T scores (0-100 scale), which are standardized to the US general population and have a mean of 50 and SD of 10. The modified WAI includes a subset of questions from the original scale, designed to assess work ability compared to lifetime best, in relation to mental and physical demands, and sick leave. The specific question analyzed for this study was, “Assume that your ability to work at its best has a value of 10 points and 0 means that you cannot currently work at all. How many points would you give your current ability to work?” with a 0-10 scale response.
Patients’ self-reported age, race/ethnicity, marital status, and employment status were obtained from the HR/CSC baseline interview. The electronic medical record was reviewed (physician, nursing, and case management staff documentation) to abstract clinical data at the time of new patient visit (body mass index [BMI], comorbid conditions, cancer site) and during the 30-day postoperative follow-up window (surgical procedure, postoperative complications, and adjuvant treatment plan). Composite variables of major medical comorbidity and postoperative complication were created as reported in previous studies. The major comorbidity variable included notation in the record for at least 1 of these conditions: diabetes, pulmonary disease (chronic obstructive pulmonary disease, restrictive lung disease, home oxygen requirement), cardiac disease (congestive heart failure, history of myocardial infarction, coronary artery disease), immunocompromised states (HIV, chronic steroid use), and chronic kidney disease. Surgical complications were identified and graded using Clavien-Dindo criteria. All medical record abstraction data were limited to encounters at our institution; however, patients were asked about outside medical encounters at each follow-up interview session. Those who reported encounters associated with surgical complications (n = 2) were recorded as experiencing a postoperative complication. Insurance status, at the time of the new patient visit, was also abstracted from the medical record. The medical record information was then merged with the HR/CSC demographic and interview data.
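Because PROMIS T scores are standardized to a reference mean of 50 and SD of 10, a score can be read directly as a distance from the US general population mean. The short Python sketch below shows only that conversion; the function name is hypothetical, and the sketch does not reproduce PROMIS scoring itself, which relies on the instrument's own scoring tables.

```python
# Reference values stated in the text: mean 50, SD 10.
REFERENCE_MEAN = 50.0
REFERENCE_SD = 10.0


def t_score_to_sd_units(t_score: float) -> float:
    """Express a T score as SDs above (+) or below (-) the reference mean."""
    return (t_score - REFERENCE_MEAN) / REFERENCE_SD


# A T score of 40 sits 1 SD below the reference mean; 55 sits half an SD above it.
print(t_score_to_sd_units(40))
print(t_score_to_sd_units(55))
```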
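The composite comorbidity variable is an "at least 1 of" rule, which can be sketched as a set-membership check. The condition labels and function name below are hypothetical stand-ins for whatever coding the abstraction form used; only the logic (any listed condition present sets the flag) comes from the text.

```python
# Hypothetical labels for the five condition groups named in the text.
MAJOR_COMORBIDITIES = {
    "diabetes",
    "pulmonary disease",
    "cardiac disease",
    "immunocompromised state",
    "chronic kidney disease",
}


def has_major_comorbidity(recorded_conditions) -> bool:
    """Return True if at least 1 recorded condition belongs to the composite."""
    return not MAJOR_COMORBIDITIES.isdisjoint(recorded_conditions)


# Made-up records for illustration only.
print(has_major_comorbidity({"hypertension", "diabetes"}))
print(has_major_comorbidity({"hypertension"}))
```

The postoperative complication composite could follow the same pattern, with any graded complication in the 30-day window setting the flag.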
Statistical analysis
This study was designed to have 80% power to detect an effect size of 0.47 for comparing mean change in FACT-GP physical well-being scores between women with and without surgical complications at an alpha of 0.05. Change was defined for each woman as the difference in score between 1 month and baseline (ie, 1-month score – baseline score). We planned enrollment of 225 women, assuming 200 would have evaluable baseline and 1-month data and 50 would have surgical complications. Based on our actual number of evaluable women of 187 (47 of whom had complications), we had 80% power to detect an effect size of 0.49.
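The relationship between group sizes and the smallest detectable effect size can be approximated as follows. This is a rough sketch using the normal approximation for a two-sided, two-sample comparison of means, not the authors' calculation; the function name is hypothetical, and the group sizes without complications (150 planned, 140 actual) are inferred from the totals in the text. The approximation gives values near, but not identical to, the reported 0.47 and 0.49, which presumably came from dedicated power software with its own assumptions.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect_size(n1: int, n2: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest standardized mean difference detectable (normal approximation, two-sided test)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


# Planned: 200 evaluable women, 50 with complications (so 150 without).
planned = detectable_effect_size(150, 50)
# Actual: 187 evaluable women, 47 with complications (so 140 without).
actual = detectable_effect_size(140, 47)
print(round(planned, 2), round(actual, 2))
```

The sketch mainly illustrates the direction of the change: with fewer evaluable women than planned, the smallest detectable effect size becomes slightly larger.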
Summary statistics were generated using simple frequencies and percentages for categorical variables and mean with SD for continuous variables. Bivariate statistics using Student t tests for continuous data and χ² tests for categorical data were used to compare baseline characteristics between women with and without complications. Histograms of survey scale responses were constructed to assess normality. The primary outcome, change in mean score from baseline to 1 month, was defined as Δ score = 1-month score – baseline score. Mean scores and mean change in scores of the FACT-GP and subscales, PROMIS global health, and PROMIS anxiety and depression scales were compared using Student t tests accounting for unequal variance. Multivariate linear regression models were constructed and adjusted for age, BMI, and any HRQOL scores that differed significantly at baseline. All other potential covariates were explored and were to be included only if they were associated (P < .05) with both the exposure (complications) and the outcomes (change in specific HRQOL score). No other covariates met these criteria.
To describe how HRQOL domains behaved within each group, we performed responder analyses by categorizing the trend in score change (from baseline to 1 month) into 3 groups: increased, no change, and decreased. This allowed us to examine what fraction of women in each group experienced improvement in a given HRQOL domain, compared to their baseline. Scores had to change by at least the MID for each scale to be categorized as increased or decreased. For example, the MID for the FACT-GP is 5-7 points. Therefore, only a score that increased by at least 5 points was categorized as increased, and a score that changed by <5 points, in either direction, was grouped under “no change.” A Fisher exact test was used to compare proportions of each category between women with and without surgical complications. For the responder analysis that was statistically significant, the likelihood of increase/no change in score vs decrease was estimated using a multivariate logistic regression model. The question from the modified WAI was analyzed using parametric (Student t test) and nonparametric (Wilcoxon rank-sum test) approaches, given the 0-10 response scale and skewed sample sizes of the responders. For the secondary analysis, graphs of mean score over time, from baseline to 6 months, stratified by complication groups, were constructed.
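The change-score definition and the unequal-variance comparison can be sketched as below. A t test that accounts for unequal variance is commonly called the Welch t test; the sketch computes only the test statistic and its approximate degrees of freedom, since the P value needs a t distribution that the Python standard library does not provide. All scores are made-up illustrative values, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def change_scores(baseline, one_month):
    """Delta score per woman: 1-month score minus baseline score."""
    return [after - before for before, after in zip(baseline, one_month)]


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    var_a = variance(a) / len(a)  # squared standard error of mean, group a
    var_b = variance(b) / len(b)  # squared standard error of mean, group b
    t_stat = (mean(a) - mean(b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1))
    return t_stat, df


# Made-up subscale scores for illustration only (not study data).
delta_complication = change_scores([22, 20, 24, 18, 21], [16, 16, 16, 16, 16])
delta_no_complication = change_scores([20, 22, 19, 23, 21], [19, 22, 16, 24, 19])
t_stat, df = welch_t(delta_complication, delta_no_complication)
print(round(t_stat, 2), round(df, 1))
```

A negative statistic here means the illustrative complication group declined more from baseline than the comparison group.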
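The responder categorization can be sketched as a simple threshold rule. The sketch assumes the lower bound of the MID range is the cutoff (5 points for the FACT-GP, per the example in the text) and that the same cutoff applies symmetrically to decreases; function names and the sample change scores are illustrative only.

```python
from collections import Counter

CATEGORIES = ("increased", "no change", "decreased")


def classify_change(delta: float, mid: float) -> str:
    """Categorize a change score against the minimally important difference (MID)."""
    if delta >= mid:
        return "increased"
    if delta <= -mid:
        return "decreased"
    return "no change"


def responder_proportions(deltas, mid: float) -> dict:
    """Fraction of women in each responder category."""
    counts = Counter(classify_change(delta, mid) for delta in deltas)
    return {category: counts[category] / len(deltas) for category in CATEGORIES}


# Made-up FACT-GP change scores, using the lower MID bound of 5 points.
print(responder_proportions([6, -7, 2, 0], mid=5))
```

The resulting category counts per complication group are what a Fisher exact test would then compare.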
Results
Study population
Of 281 women who consented to the study, 231 (82%) completed baseline interviews. Of the 50 nonparticipants, 12 were ineligible due to protocol or withdrawal, and 38 did not reply to interview requests. Of the 231 women who completed the baseline interview, 187 completed 1-month interviews, 185 of whom had completed medical record abstraction data. These 185 comprised the final study cohort based on the a priori primary outcome measure. These 185 were 65% (185/281) of the initial enrollment cohort and 80% of those who completed baseline interviews. Subsequent follow-up rates, based on the baseline interview, were 74% (170/231) at 3 months and 75% (174/231) at 6 months for the secondary outcome measures (Figure 1).
From the 231 women who completed the baseline interview, responders and nonresponders at 1 month were assessed for differences in baseline characteristics. There was a greater proportion of ovarian cancer in nonresponders (39% vs 12%), with an associated greater proportion of debulking procedures (26% vs 10%) and composite postoperative complications (35% vs 25%). These differences were expected, as ovarian cancer patients have the highest rate of complications and readmissions among gynecologic cancers and thus may be the most difficult to follow up for serial interview assessments. Responders and nonresponders were balanced on all other characteristics (Supplemental Table 1).
Table 1 details the baseline characteristics of the main study cohort. Due to small numbers, non-white and non-black races were collapsed into an “other” category, which included Asian (n = 2, 1%), Native American (n = 3, 1.6%), and other (n = 5, 2.7%) respondents. Of the 8 Hispanic women, 1 identified as white, 1 as black, and 5 as other. There were 54 women with suspected malignancy who had benign disease on final pathology. These women were kept in the cohort, given that they had procedures, and therefore associated postoperative risks, similar to those with cancer on final pathology. The group of women who experienced a postoperative complication had a larger proportion of ovarian cancer, unemployed status (including retired individuals), laparotomy, debulking surgery, bowel surgery, and adjuvant therapy compared to the women without complications.