Background
The Pelvic Floor Distress Inventory-20 is used to evaluate symptoms and treatment effects in women with pelvic floor disorders. To interpret changes in the scores of this inventory, information is needed about what patients and clinicians perceive as the minimal important (meaningful) change. Although this change in the inventory score has been investigated previously in women who have undergone pelvic floor surgery, the results could not be generalized to women with milder symptoms (ie, lower scores) who often require only conservative treatment.
Objective
We aimed to estimate the minimal important change in the Pelvic Floor Distress Inventory-20 that was needed to demonstrate clinical improvement in women who qualify for conservative pelvic floor treatment.
Study Design
The data of 214 women aged ≥55 years were used. All participants were from 2 randomized controlled trials that compared conservative prolapse treatments in primary care in The Netherlands. The degree of prolapse was assessed with the use of the Pelvic Organ Prolapse Quantification system; participants completed the Pelvic Floor Distress Inventory-20 at baseline and at 12 months, with a global perception of improvement question at 12 months. To assess both the patient perspective and the clinical perspective, 2 anchors were assessed: (1) the global perception of improvement was considered the anchor for the patients’ perspective, and (2) the difference in the degree of prolapse was considered the anchor for the clinical perspective. Provided that the anchors were correlated by at least 0.3 to the Pelvic Floor Distress Inventory-20 change scores, we estimated the following minimal important changes: (1) the optimal cutoff-point of the receiver operating characteristics curve that discriminates between women with and without improvement in the global perception of improvement scale and (2) the mean Pelvic Floor Distress Inventory-20 change score of participants who improved 1 assessment stage. We then calculated the smallest detectable change to check whether the minimal important change was larger than the measurement error of the questionnaire.
Results
Using the global perception of improvement as the anchor, we found a minimal important change for improvement of 13.5 points (95% confidence interval, 6.2–20.9). The Pelvic Organ Prolapse Quantification change scores correlated poorly with the Pelvic Floor Distress Inventory-20 change scores and therefore could not be used as an anchor. The smallest detectable change at the group level was 5.5 points. Thus, the minimal important change was larger than the smallest detectable change at the group level.
Conclusion
In women with relatively mild pelvic floor symptoms, an improvement of 13.5 points (or a 23% reduction) in the Pelvic Floor Distress Inventory-20 score can be considered clinically relevant. This minimal important change can be used for clinical trial planning and evaluation of treatment effects in women whose condition is considered suitable for conservative treatment.
Patient-reported outcomes are commonly used to evaluate symptoms and treatment effects in research and clinical practice. The Pelvic Floor Distress Inventory-20 (PFDI-20) is a recommended questionnaire for evaluating the degree to which pelvic floor symptoms cause distress. Although it has been shown to have good validity, reliability, and responsiveness for this purpose, interpreting changes in PFDI-20 scores requires information on what patients (and/or clinicians) perceive as a meaningful difference.
To determine whether a statistically significant change is also clinically relevant, Jaeschke et al introduced the concept of the minimal clinically important difference, also termed the minimal important change (MIC). They defined the MIC as “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive costs, a change in the patient’s management.” Any change larger than the MIC can then be considered clinically relevant. The MIC can be determined from many different perspectives, including that of the patient, the clinician, the researcher, the consumer, or even of society.
Various methods have been proposed to determine the MIC. Anchor-based methods compare changes in patient-reported outcome scores with other clinical changes or results; distribution-based approaches rely on statistical distributions of the results. Although there is no gold standard for the determination of the MIC, most authorities recommend the use of anchor-based methods when applying it to the various relevant perspectives (eg, patient-rated, clinician-rated, and disease-specific measures).
Only a few studies have attempted to estimate the MIC for the PFDI-20. These were mostly conducted among women undergoing prolapse or incontinence surgery who had relatively high baseline scores (94–121 points), and all were performed in tertiary urogynecology units. They yielded anchor-based MICs for improvement that ranged from 23–45 points (24–47%), but in a highly selected patient group. By contrast, women who opt for conservative treatment generally experience less severe symptoms than women who prefer surgical treatment. For example, studies that have evaluated conservative prolapse treatments have reported average PFDI-20 baseline scores of approximately 60 points. To complicate matters further, other research has shown that the MIC depends on the baseline score: patients with more severe baseline symptoms seem to require greater improvements before they consider them clinically relevant (ie, a larger MIC) than patients with less severe initial symptoms. Consequently, the MICs established in tertiary care populations are probably unsuitable for evaluating conservative treatments in women with milder symptoms of pelvic floor dysfunction.
The aim of this study was to estimate the MIC for the PFDI-20 among women who qualified for conservative treatment of pelvic floor disorders.
Methods
Participants and setting
This analysis was conducted with data from the “Pelvic Organ Prolapse in Primary Care: Effects of Pelvic Floor Muscle Training and Pessary Treatment Study,” for which the design and primary outcomes have been published. The study comprised 2 randomized controlled trials, with participants (women aged ≥55 years with symptomatic prolapse) who were recruited from 20 primary care practices. In the first trial, pelvic floor muscle training was compared with watchful waiting in women with a prolapse above the hymen. In the second trial, pelvic floor muscle training was compared with pessary treatment in women with a prolapse at or beyond the hymen. The trials were approved by the Medical Ethics Committee of the University Medical Center Groningen (METc2009.215) and were registered in the Dutch Trial Register (www.trialregister.nl, identifier NTR 2047). All participants provided written informed consent. The present study included participants in both trials and used data that were collected at baseline and at the 12-month follow-up assessment.
Measures
Participants completed the PFDI-20 questionnaire at baseline and after 12 months of follow-up. The PFDI-20 score ranges from 0–300; higher scores indicate greater symptom distress. PFDI-20 change scores were calculated by subtracting the follow-up score from the baseline score, such that a negative change score represented worsening symptoms and a positive change score represented improving symptoms.
The degree of prolapse was measured according to the Pelvic Organ Prolapse Quantification (POP-Q) system at baseline and after 12 months. The POP-Q stage (0–4) was assessed for each compartment (anterior vaginal wall, posterior vaginal wall, and uterus or vaginal vault), with the overall POP-Q stage being equal to the POP-Q stage of the most severely prolapsed compartment. A higher POP-Q stage represented more severe prolapse. The change score for the POP-Q stage was calculated by subtracting the overall POP-Q stage at follow up from the overall POP-Q stage at baseline. This led to a POP-Q change score that ranged from –3 (all participants had at least stage 1 prolapse at baseline) to +4. We also assessed the change of prolapse using a continuous measure of anatomic support. We calculated change scores for the degree of prolapse of the anterior vaginal wall (POP-Q point Ba), the posterior vaginal wall (POP-Q point Bp), and the uterus or vaginal vault (POP-Q point C) by subtracting the follow-up value from the baseline value. This led to change scores that ranged from –2 to +3 cm for Ba, –3 to +2 cm for Bp and –7 to +4 cm for C. In each of these anatomic change scores, a negative score represented a deteriorating prolapse, zero represented no change, and a positive score represented an improving prolapse.
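The sign conventions described above can be illustrated with a minimal sketch. The function names and the example values are hypothetical and are not taken from the study data; the sketch only assumes the definitions given in the text (baseline minus follow-up, and overall stage equal to the most severely prolapsed compartment).

```python
def pfdi20_change(baseline, follow_up):
    """Baseline minus follow-up: a positive value means symptoms improved."""
    return baseline - follow_up


def overall_popq_stage(anterior, posterior, apical):
    """Overall stage is the stage of the most severely prolapsed compartment."""
    return max(anterior, posterior, apical)


def popq_stage_change(baseline_stages, follow_up_stages):
    """Overall stage at baseline minus overall stage at follow-up."""
    return overall_popq_stage(*baseline_stages) - overall_popq_stage(*follow_up_stages)


# Hypothetical participant, not study data.
print(pfdi20_change(60.0, 46.5))                # 13.5 (improvement)
print(popq_stage_change((2, 1, 1), (1, 1, 0)))  # 1 (improved by 1 stage)
```

With these conventions, a positive value indicates improvement on both the questionnaire and the anatomic measure, so a positive correlation between the two change scores is the expected direction.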
After 12 months, participants were also asked to rate their global perception of improvement (GPI) since baseline, according to the following question: “Overall, do you believe that your symptoms are much worse (–2), worse (–1), about the same (0), better (+1), or much better (+2).”
Statistical methods
The MIC was determined by receiver operating characteristic (ROC) analysis and visualized with the use of an anchor-based distribution plot. To assess both the patient and the clinical perspectives, 2 anchors were assessed for eligibility. The GPI was considered the anchor for the patients’ perspective, and the difference in the degree of prolapse on physical examination was considered the anchor for the clinical perspective. The anchors were considered suitable for further analysis only if they correlated with the PFDI-20 change score by at least 0.3 (Spearman’s ρ). Statistical analyses were performed using IBM SPSS for Windows (version 23.0; IBM Corp, Armonk, NY). Stata/SE software (version 14; Stata Corporation, College Station, TX) was used to estimate a confidence interval (CI) for the MIC based on 1000 bootstrap replicates.
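The anchor eligibility rule can be sketched in a few lines. The study used SPSS for this step; the Python version below is only an illustration, with hypothetical function names and made-up data, and it assumes that both the anchor and the change score are coded so that positive values indicate improvement.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def anchor_is_suitable(anchor, change_scores, threshold=0.3):
    """Apply the eligibility rule: correlation of at least `threshold`."""
    return spearman_rho(anchor, change_scores) >= threshold


# Made-up GPI ratings (-2..+2) and PFDI-20 change scores, not study data.
gpi = [2, 1, 1, 0, 0, 0, -1, -1]
change = [40.0, 15.0, 20.0, 5.0, 0.0, 10.0, -10.0, -5.0]
print(anchor_is_suitable(gpi, change))  # True
```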
Patients’ perspective
Provided that the GPI and the PFDI-20 change scores were correlated sufficiently, the participants were divided into 2 groups: those who reported symptom improvement (GPI categories “better” and “much better”) and those who did not (GPI categories “about the same,” “worse,” and “much worse”). The MIC, or the optimal ROC cut-off point, was defined as the value for which the sum of the proportions of false-negative and false-positive classifications ([1 − sensitivity] + [1 − specificity]) was the smallest. This value was taken to represent the PFDI-20 change score that best discriminated between participants with and without clinical improvement according to the GPI.
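The cut-off rule above, together with a bootstrap CI, can be sketched as follows. This is not the SPSS or Stata procedure used in the study. It assumes that candidate cut-offs are midpoints between adjacent observed change scores, that ties in the criterion are resolved in favor of the lowest cut-off, and that the CI is a simple percentile interval; none of these choices is stated in the text. All names and data are hypothetical.

```python
import random


def optimal_cutoff(change_scores, improved):
    """Return (cutoff, sensitivity, specificity) that minimises
    (1 - sensitivity) + (1 - specificity).

    A participant is classified as improved when change score >= cutoff.
    """
    pos = [s for s, imp in zip(change_scores, improved) if imp]
    neg = [s for s, imp in zip(change_scores, improved) if not imp]
    distinct = sorted(set(change_scores))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not pos or not neg or not candidates:
        raise ValueError("need both groups and at least 2 distinct scores")
    best = None
    for c in candidates:
        sens = sum(s >= c for s in pos) / len(pos)
        spec = sum(s < c for s in neg) / len(neg)
        misclassified = (1 - sens) + (1 - spec)
        if best is None or misclassified < best[0]:
            best = (misclassified, c, sens, spec)
    _, cutoff, sens, spec = best
    return cutoff, sens, spec


def bootstrap_ci(change_scores, improved, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the optimal cut-off (an assumption)."""
    rng = random.Random(seed)
    n = len(change_scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            cutoff, _, _ = optimal_cutoff([change_scores[i] for i in idx],
                                          [improved[i] for i in idx])
        except ValueError:
            continue  # resample lacked one of the two groups; draw again
        estimates.append(cutoff)
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Made-up change scores and improvement flags, not study data.
scores = [30, 20, 15, 12, 2, 10, 0, -5, -10]
improved = [True, True, True, True, True, False, False, False, False]
print(optimal_cutoff(scores, improved))  # (11.0, 0.8, 1.0)
print(bootstrap_ci(scores, improved))
```

Minimising the sum of the two error proportions is equivalent to maximising sensitivity + specificity − 1 (the Youden index), so either formulation selects the same cut-off.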
Clinical perspective
Provided that the POP-Q change score and the PFDI-20 change scores were correlated sufficiently, participants were divided into 2 groups: those who improved ≥1 POP-Q stages and those who did not (remained stable or deteriorated), with the MIC defined as the optimal cut-off point on the ROC curve. This value was taken as the PFDI-20 change score that allowed for the best discrimination between participants with and without improvement of ≥1 POP-Q stage.
For the continuous measure of anatomic support, provided that the Ba, Bp, and C change scores and the PFDI-20 change score were sufficiently correlated, participants were divided into 2 groups: those who improved ≥2 cm in Ba, Bp, or C, and those who did not (remained stable or deteriorated). The MIC was defined as the optimal cut-off point on the ROC curve. This value was taken as the PFDI-20 change score that allowed for the best discrimination between participants with and without improvement of ≥2 cm in the degree of prolapse.
Smallest detectable change
Not every change in a questionnaire score can be considered real, because small changes could reflect measurement error. In this context, the smallest detectable change (SDC) is the smallest change that an instrument can detect beyond measurement error. Measurement error can be reduced by measuring in groups of patients and calculating average scores. The SDC at the group level is therefore smaller than the SDC at the individual level (ie, a greater change is needed to exceed measurement error in an individual patient than in a group of patients).
We compared the estimated MIC with the SDC at both the individual and group levels using the standard error of measurement (SEM). The SDC at the individual level was calculated as 1.96 × SEM × √2; the SDC at the group level was calculated as (1.96 × SEM × √2)/√n. The SEM was derived from the intraclass correlation coefficient for agreement between the PFDI-20 scores at baseline and at follow-up evaluation. The intraclass correlation coefficient was used as a reliability parameter and was calculated as a ratio of variances. The variance components were obtained by analysis of variance and represented different sources of within and between variances, as well as residual variance. The SEM was calculated as √(σ²_observation + σ²_residual) in the subgroup of participants who reported their symptoms to be “about the same” on the GPI.
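The SDC formulas can be checked against the values reported in the Results with a short sketch. The function names are hypothetical. The SEM of 20.96 is back-calculated here from the reported individual-level SDC of 58.1 points and is not a value given in the text; taking n = 110 (the size of the “about the same” subgroup) is likewise an assumption.

```python
from math import sqrt


def sem_from_variances(var_observation, var_residual):
    """SEM for agreement: square root of observation plus residual variance."""
    return sqrt(var_observation + var_residual)


def sdc_individual(sem):
    """Smallest detectable change for a single patient."""
    return 1.96 * sem * sqrt(2)


def sdc_group(sem, n):
    """Smallest detectable change for the mean of a group of n patients."""
    return sdc_individual(sem) / sqrt(n)


# Back-calculated SEM (assumption): 58.1 / (1.96 * sqrt(2)) is about 20.96.
sem = 20.96
print(round(sdc_individual(sem), 1))  # 58.1
print(round(sdc_group(sem, 110), 1))  # 5.5
```

Under these assumptions the group-level value reproduces the reported 5.5 points, which lies below the MIC of 13.5 points, whereas the individual-level SDC of 58.1 points lies above it.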
Results
We obtained PFDI-20 scores at baseline and 12 months, plus the GPI scores at 12 months, for 214 women. Of these, 74 women were assigned randomly to watchful waiting; 110 women were assigned randomly to pelvic floor muscle training, and 39 women were assigned randomly to pessary treatment. The characteristics of the study population are shown in Table 1.
Variable | Measure |
---|---|
Median age, y (interquartile range) | 63.1 (58.7–68.4) |
Body mass index, kg/m², mean ± standard deviation | 26.7±4.2 |
Postmenopausal, n (%) | 210 (98) |
Parity, n (%) | |
No children | 5 (2) |
1 Child | 12 (6) |
2 Children | 109 (51) |
≥3 Children | 88 (41) |
Education level, n (%) | |
Primary | 10 (5) |
Lower | 74 (35) |
Intermediate | 62 (29) |
Higher | 68 (32) |
Hysterectomy, n (%) | 42 (20) |
Other pelvic floor surgery, n (%) | 6 (3) |
Pelvic Floor Distress Inventory–20 baseline score, mean ± standard deviation | 59.5±36.6 |
Pelvic organ prolapse quantification stage at baseline, n (%) | |
Stage 1 | 77 (36) |
Stage 2 | 122 (57) |
Stage 3 | 15 (7) |
Stage 4 | 0 |
The PFDI-20 change scores and GPI scores were moderately but sufficiently correlated (Spearman’s ρ, 0.35). The mean PFDI-20 change scores per GPI category are shown in Table 2, and the distribution of the PFDI-20 change scores within each GPI category is shown in Figure 1. None of the participants reported their symptoms to be “much worse.” The ROC curve analysis, which used the PFDI-20 change score cut-off points to discriminate between women with and without improvement, is shown in Figure 2. The area under the ROC curve was 0.67 (95% CI, 0.59–0.74; P<.001). The MIC for improvement, defined as the optimal ROC cut-off point, was 13.5 points (95% CI, 6.2–20.9). This cut-off point had a sensitivity of 56% and a specificity of 75%, which means that 56% of the participants who reported symptom improvement had a PFDI-20 change score of ≥13.5 points and 75% of the participants who did not report symptom improvement had a PFDI-20 change score of <13.5 points. The SDC at the individual level was 58.1 points; the SDC at the group level was 5.5 points. Figure 3 shows the anchor-based MIC distribution plot and gives a visual representation of the distributions of the PFDI-20 change scores for participants who reported symptom improvement (GPI categories “better” and “much better”) and for those who did not (GPI categories “about the same” and “worse”), combined with the MIC estimate.
Global perception of improvement | n | Mean Pelvic Floor Distress Inventory–20 change score ± standard deviation |
---|---|---|
Much better | 13 | 40.5±35.5 |
Better | 65 | 14.8±29.8 |
About the same | 110 | 6.0±22.2 |
Worse | 26 | –9.3±18.5 |
Much worse | 0 | — |