Background
Uterine fibroids are an important source of morbidity for reproductive-aged women. Despite an increasing number of alternatives, hysterectomies account for about 75% of all fibroid interventional treatments. Evidence is lacking to help women and their health care providers decide among alternatives to hysterectomy. Fibroid Interventions: Reducing Symptoms Today and Tomorrow (NCT00995878, clinicaltrials.gov) is a randomized controlled trial to compare the safety, efficacy, and economics of 2 minimally invasive alternatives to hysterectomy: uterine artery embolization and magnetic resonance imaging–guided focused ultrasound surgery. Although randomized trials provide the highest level of evidence, they have been difficult to conduct in the United States for interventional fibroid treatments. Thus, contemporaneously recruiting women declining randomization may have value as an alternative strategy for comparative effectiveness research.
Objective
We sought to compare baseline characteristics of randomized participants with nonrandomized participants meeting the same enrollment criteria and to determine whether combining the 2 cohorts in a comprehensive cohort design would be useful for analysis.
Study Design
Premenopausal women with symptomatic uterine fibroids seeking interventional therapy at 3 US academic medical centers were randomized (1:1) in 2 strata based on calculated uterine volume (<700 and ≥700 cm³) to undergo embolization or focused ultrasound surgery. Women who met the same inclusion criteria but declined randomization were offered enrollment in a parallel cohort. Both cohorts were followed up for a maximum of 36 months after treatment. The measures addressed in this report were baseline demographics, symptoms, fibroid and uterine characteristics, and scores on validated quality-of-life measures.
Results
Of 723 women screened, 57 were randomized and 49 underwent treatment (27 with focused ultrasound and 22 with embolization). Seven of the 8 women randomized but not treated were assigned to embolization. Of 34 women in the parallel cohort, 16 elected focused ultrasound and 18 elected embolization. Compared with nonrandomized participants, randomized participants had higher mean body mass index (28.7 vs 25.3 kg/m²; P = .01) and were more likely to have had a prior pregnancy (77% vs 47%; P = .003) and to have smoked >100 cigarettes in their lifetime (42% vs 12%; P = .003). Age, race, uterine volume, number of fibroids, and baseline validated measures of general and disease-specific quality of life, pain, depression, and sexual function did not differ between the groups. When we performed a comprehensive cohort analysis and analyzed by treatment arm, the only baseline difference observed was a higher median McGill Pain Score among women undergoing focused ultrasound (10.5 vs 6; P = .03); a similar but nonsignificant trend was seen in visual analog scale scores for pain (median, 39.0 vs 24.0; P = .06).
Conclusion
Using a comprehensive cohort analysis of study data could result in additional power and greater generalizability if results are adjusted for baseline differences.
Introduction
Uterine fibroids (leiomyomas) are a common and burdensome disease of reproductive-aged women, yet high-quality evidence to inform treatment decisions is lacking. The few randomized controlled trials (RCTs) comparing fibroid therapies have largely been performed outside the United States. US-based RCTs studying therapies for the related problem of heavy menstrual bleeding have often experienced recruitment challenges.
The Fibroid Interventions: Reducing Symptoms Today and Tomorrow (FIRSTT) study is an RCT (NCT00995878, clinicaltrials.gov) funded by the National Institutes of Health (NIH) to compare 2 minimally invasive therapies approved by the US Food and Drug Administration (FDA): uterine artery embolization (UAE) and magnetic resonance imaging (MRI)–guided focused ultrasound surgery (MRgFUS).
Because of slow recruitment for the RCT, women meeting the FIRSTT inclusion criteria who declined randomization were offered enrollment in an abbreviated study protocol. Prior studies have demonstrated that such a comprehensive cohort design (CCD) can be useful because outcomes do not appear to differ when both groups receive the same treatments. The current report summarizes the baseline data for this trial and tests the hypothesis that contemporaneously recruited women who meet the same enrollment criteria but decline randomization can be used in a CCD for fibroid treatment trials. Subsequent reports will use these data to compare safety, efficacy, economics, and ovarian reserve after treatment with UAE or MRgFUS.
Materials and Methods
Overview
The overview of the design of the FIRSTT study has been previously reported. Since the initial report, the University of California, San Francisco, was added in June 2013 to the initial sites: Mayo Clinic, Rochester, MN, and Duke University, Durham, NC. The institutional review board of each site approved the same study protocol.
Briefly, UAE and MRgFUS were performed according to the clinical standard of care, and treatment costs were designed to be paid by the participant’s health insurance. Insurance approval for both procedures was confirmed before randomization. Initially, funding was obtained for a 6-month study (RC1HD063312). Study visits occurred at baseline, day of study treatment, and 6 months after treatment. Telephone follow-up and review of study diaries took place 2, 4, and 6 weeks after treatment.
Protocol modifications are outlined in Supplemental Table 1. Full funding (R01HD060503) allowed for visits at 12, 24, and 36 months after treatment, which included MRI at 24 and 36 months; measurement of ovarian reserve at baseline and 12, 24, and 36 months; and economic analysis of treatments.
Enrollment was substantially limited because of lack of insurance coverage for MRgFUS, and industry funding was obtained to support treatment if insurance coverage was denied (InSightec, Tirat Carmel, Israel). Thus, women randomized to MRgFUS were allowed to proceed with treatment while insurance appeals took place. This backup payment method was closed at 1 site as of Nov. 1, 2013, and the institution committed to cover these costs. Women in both treatment arms were responsible for copayments and deductibles.
The primary efficacy end point for the 36-month study was additional intervention for symptomatic fibroids during the study period. Change in the symptom severity score (SSS) subscale of the Uterine Fibroid Symptom and Quality of Life instrument (UFS-QOL) was the key secondary efficacy end point. Safety was assessed by examining adverse events. Other specific aims of the study included assessing fibroid regression with MRI, assessing ovarian reserve by examining antimüllerian hormone levels, and conducting an economic analysis of treatment.
Study population
Full inclusion and exclusion criteria at the study outset have been reported previously. In brief, participants were premenopausal women with symptomatic fibroids and uteri <20 gestational weeks in size who were not actively trying for pregnancy. All women underwent MRI with gadolinium contrast during screening. Uterine volume was calculated using the formula for the volume of a prolate ellipsoid, and the number of fibroids with maximal diameter of ≥3 cm was recorded. Changes in enrollment criteria over time are outlined in Supplemental Table 1. None of the previously treated participants would have been excluded by these protocol changes.
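The volume calculation and the 700-cm³ stratification cutoff can be illustrated with a minimal sketch. The text does not state which diameters were measured on MRI, so the sketch assumes 3 orthogonal uterine diameters in centimeters and the usual ellipsoid formula V = (π/6) × length × width × depth; the function names and the example measurements are hypothetical.

```python
import math


def ellipsoid_volume_cm3(length_cm, width_cm, depth_cm):
    """Ellipsoid volume from 3 orthogonal diameters (assumed inputs), in cm^3."""
    return math.pi / 6 * length_cm * width_cm * depth_cm


def volume_stratum(volume_cm3, cutoff=700.0):
    """Assign the randomization stratum used in the trial: <700 or >=700 cm^3."""
    return "<700" if volume_cm3 < cutoff else ">=700"


# Hypothetical uterus measuring 12 x 10 x 9 cm.
volume = ellipsoid_volume_cm3(12, 10, 9)
print(f"{volume:.1f} cm^3 -> stratum {volume_stratum(volume)}")
```

A uterus with these illustrative diameters has a volume of about 565 cm³ and would fall in the <700 stratum.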
Enrollment and recruitment
At 2 of the 3 sites, a standard telephone screening instrument was used for prescreening, which sequentially identified exclusion criteria, including perimenopausal or postmenopausal status, actively seeking pregnancy, prior fibroid interventional therapy, current use of a gonadotropin-releasing hormone analog, and medical contraindications to either study treatment. A study gynecologist subsequently screened each participant to exclude other causes of symptoms and to assess contraindications to either therapy, and MRI results were reviewed by a study radiologist for imaging enrollment criteria.
Multiple modes of recruitment were used and were extended over time (Supplemental Table 1). Two observational cohorts were established in 2011 for women who did not enroll in the RCT. Parallel cohort 1 (PC1) consisted of women who met the RCT enrollment criteria but declined participation. PC1 participants completed all study instruments and underwent assessment of ovarian reserve but did not have follow-up MRIs and received a smaller stipend. Parallel cohort 2 (PC2) included all women undergoing fibroid therapy. The protocol for PC2 involved obtaining limited baseline data, assessing ovarian reserve, and collecting economic costs; participants received a minimal stipend. Two sites participated in PC1, and 1 participated in PC2.
Baseline measures
Baseline questionnaires were administered at the participant’s initial visit; we attempted to schedule women in the early proliferative phase. The collected validated measures included the UFS-QOL, Short Form-36, Center for Epidemiologic Studies Depression Scale, Female Sexual Function Index, McGill Pain Score (MPS), and visual analog scale for pain (VAS). Detailed information on study instruments was reported previously.
Randomization
The randomization scheme was stratified by study site and calculated uterine volume: <700 and ≥700 cm³. The randomization was performed using a World Wide Web–based application created and supported by the Division of Biomedical Statistics and Informatics at Mayo Clinic, Rochester, MN, which used a dynamic allocation approach based on the method of Pocock and Simon to achieve balance within each stratum. After randomization, treatment was typically scheduled within 10 days to minimize the chance of subject loss. Neither participants nor clinical investigators were blinded to study assignment.
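To make the dynamic allocation idea concrete, the following is a minimal sketch of minimization in the spirit of Pocock and Simon. It is not the Mayo Clinic application: the range-based imbalance measure, the equal factor weights, the 0.8 probability of choosing the best-balancing arm, and all names (`assign`, `imbalance_if_assigned`, the site labels) are assumptions made for illustration.

```python
import random

ARMS = ("UAE", "MRgFUS")


def imbalance_if_assigned(counts, factors, arm):
    """Summed arm-count range over factor levels if the participant joined `arm`."""
    total = 0
    for factor, level in factors.items():
        # Arm counts among prior participants who share this factor level.
        level_counts = {a: counts.get((factor, level, a), 0) for a in ARMS}
        level_counts[arm] += 1
        total += max(level_counts.values()) - min(level_counts.values())
    return total


def assign(counts, factors, rng, p_best=0.8):
    """Choose the arm that minimizes imbalance with probability p_best (assumed)."""
    scores = {arm: imbalance_if_assigned(counts, factors, arm) for arm in ARMS}
    best = min(scores.values())
    candidates = [arm for arm in ARMS if scores[arm] == best]
    if len(candidates) > 1:
        arm = rng.choice(candidates)  # tie: simple randomization
    elif rng.random() < p_best:
        arm = candidates[0]
    else:
        arm = next(a for a in ARMS if a != candidates[0])
    # Record the assignment under every factor level of this participant.
    for factor, level in factors.items():
        key = (factor, level, arm)
        counts[key] = counts.get(key, 0) + 1
    return arm


# Simulate 57 hypothetical participants with random site and volume stratum.
rng = random.Random(0)
counts = {}
assignments = []
for _ in range(57):
    factors = {
        "site": rng.choice(["site_1", "site_2", "site_3"]),
        "volume": rng.choice(["<700", ">=700"]),
    }
    assignments.append(assign(counts, factors, rng))
print({arm: assignments.count(arm) for arm in ARMS})
```

Keeping `p_best` below 1 preserves some unpredictability in each assignment, which matters in an unblinded trial such as this one.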
Standardized treatment and recovery protocols
UAE and MRgFUS were performed using standardized protocols, and all participants received the same discharge instructions and prescriptions.
Data safety monitoring board
The data safety monitoring board (DSMB) consisted of the study statistician (A.L.W.); Dr Bradley Van Voorhis, a nationally recognized gynecologist and fibroid expert at the University of Iowa; and Dr James Spies, a nationally recognized interventional radiologist at Georgetown University with expertise in UAE. The NIH project officer was an ex officio member of the DSMB.
Statistical considerations
An intention-to-treat analysis was conducted. The initial sample size calculations were conducted to detect differences between the 2 treatment arms for: (1) the need for additional interventional therapy for symptomatic fibroids over the course of the follow-up period, and (2) the mean (SD) reduction (compared with baseline) in UFS-QOL SSS. Without published data on MRgFUS outcomes at 36 months at that time, calculations were based on published outcomes at 24 months: (1) 20% and 37.5% of patients needing additional therapy after UAE and MRgFUS, respectively; and (2) mean SSS decreases of 40.1 (SD 25.2) and 28.1 (SD 23.6) from baseline scores for UAE and MRgFUS, respectively. The study was designed to recruit 99 women per treatment arm, which provided statistical power of 78% and 93%, respectively, to detect the anticipated differences in outcomes 1 and 2. These calculations were based on a 2-sided χ² test and t test with a type I error of 0.05. There were no a priori stopping rules.
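The stated power figures can be approximated from the planning assumptions above. The sketch below uses large-sample normal approximations to the 2-sided χ² test and t test; the software and exact method used for the original calculation are not stated, so this is an illustrative check rather than a reproduction, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a 2-sided comparison of 2 proportions (dominant tail)."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_arm)
    return Z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


def power_two_means(delta, sd1, sd2, n_per_arm, alpha=0.05):
    """Approximate power of a 2-sided comparison of 2 means (dominant tail)."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    se = sqrt((sd1 ** 2 + sd2 ** 2) / n_per_arm)
    return Z.cdf(abs(delta) / se - z_crit)


# Planning assumptions taken from the text: 99 women per arm, alpha = 0.05.
power_retreatment = power_two_proportions(0.20, 0.375, 99)
power_sss = power_two_means(40.1 - 28.1, 25.2, 23.6, 99)
print(f"additional intervention: {power_retreatment:.2f}")
print(f"SSS reduction: {power_sss:.2f}")
```

Under these approximations the two values come out at roughly 0.78 and 0.93, in line with the reported power.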
An interim analysis was conducted by the study statistician in February 2014 to assess the results and to determine whether study enrollment should be terminated early given the slow enrollment. Although too few participants had reached the 36-month point to assess the primary end point, statistically significant differences were observed between treatments for SSS over 24 months. The results were presented by blinded group assignment and reviewed by study investigators and the DSMB. The decision was made to end study enrollment as of Aug. 1, 2014, to allow all participants to have at least 12 months of follow-up within the study. In performing the interim analysis, missed follow-up visits were identified at 1 site, and enrollment was closed at that site on March 18, 2014.
Demographic and baseline characteristics were summarized using standard descriptive statistics: frequency (percentage) for categorical variables and mean (SD) or median (interquartile range [IQR]) for continuous variables. Comparisons between groups (RCT vs PC1 and MRgFUS vs UAE) were evaluated using the χ² test or Fisher exact test for categorical variables and the 2-sample t test or Wilcoxon rank sum test for continuous variables. All calculated P values were 2-sided, and P values <.05 were considered statistically significant. Analyses were performed using software (SAS, Version 9.3; SAS Institute Inc., Cary, NC).
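As an illustration of the categorical comparison, the sketch below applies a Pearson χ² test without continuity correction to the gravidity counts reported in Table 1 (RCT: 13 nulligravid, 44 gravid; PC1: 18 nulligravid, 16 gravid). The study analyses were run in SAS; this plain-Python version, including the function name `chi2_2x2`, is an assumption for illustration only.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Rows: RCT, PC1. Columns: gravidity 0, gravidity >= 1 (counts from Table 1).
stat, p_value = chi2_2x2(13, 44, 18, 16)
print(f"chi-square = {stat:.2f}, P = {p_value:.3f}")
```

The resulting P value of about .003 agrees with the value reported for gravidity.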
Results
Inclusions and exclusions
Across study sites, 723 women were prescreened and 568 (79%) were excluded. The remaining 155 women consented to study participation and were seen for an initial study visit (Figure). Of these, 57 women (37%) made up the RCT group. Uterine volume was <700 cm³ in 41 (72%) of the RCT participants. Among the 57 women, 49 (86%) underwent study treatment: 22 UAE and 27 MRgFUS (Figure). Of the 8 women randomized but not treated, 7 were assigned to the UAE group. A protocol violation occurred in 1 of these 7 participants: she was randomized to UAE but was subsequently assigned to MRgFUS at a second site before the dual randomization was discovered. None of these 8 participants underwent either study treatment. A majority of the 27 women undergoing MRgFUS had treatment funded either by industry (8; 30%) or institution (7; 26%).
In all, 38 women (25%) met all enrollment criteria but declined participation; 34 of these women participated in the PC1 observational cohort: 18 underwent UAE and 16 underwent MRgFUS (Figure). Of the women who consented, 60 (39%) did not meet the inclusion criteria for the RCT. Having an MRI finding that was an exclusion criterion was the most common reason for exclusion. Of the 60 women excluded, 38 were at the site participating in PC2, and 28 (74%) consented to be included in PC2 (Figure).
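The participant flow described above can be tallied in a few lines to confirm that the counts and percentages are internally consistent. All numbers are taken from the text; the variable names and the `pct` helper are illustrative.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


screened = 723
excluded_at_prescreen = 568
consented = screened - excluded_at_prescreen  # women seen for an initial visit

randomized = 57          # RCT group
eligible_declined = 38   # met criteria but declined (34 of them joined PC1)
ineligible = 60          # did not meet RCT inclusion criteria

# The 3 groups should account for every consented woman.
assert randomized + eligible_declined + ineligible == consented

print(f"excluded at prescreen: {pct(excluded_at_prescreen, screened)}%")
print(f"randomized: {pct(randomized, consented)}%")
print(f"eligible but declined: {pct(eligible_declined, consented)}%")
print(f"ineligible: {pct(ineligible, consented)}%")
print(f"treated among randomized: {pct(49, randomized)}%")
print(f"PC2 consent among eligible site exclusions: {pct(28, 38)}%")
```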
RCT cohort
Complete demographic and clinical information for the RCT cohort is shown in Table 1. Women enrolled in the RCT were typical of women with uterine fibroids: mean age was 44.3 (SD 4.7) years, body mass index was 28.7 (SD 5.5) kg/m², and 24% had experienced infertility. The RCT participants also tended to reflect the racial diversity of the United States, except for considerable underrepresentation of Hispanic women (5%). The majority of RCT participants were current alcohol users (82%) and regularly exercised (75%), whereas 18% were current smokers and 42% reported having smoked >100 cigarettes in their lifetime.
Characteristic or measure | Overall, N = 91 | RCT cohort, n = 57 | PC1 cohort, n = 34 | P a |
---|---|---|---|---|
Patient characteristics | ||||
Age at treatment, y b , mean (SD) | 44.5 (5.0) | 44.3 (4.7) | 44.9 (5.4) | .60 |
Race, n (%) | | | | .15 |
African American | 11 (12) | 9 (16) | 2 (6) | |
Asian | 5 (5) | 1 (2) | 4 (12) | |
Hispanic or Latina | 5 (5) | 3 (5) | 2 (6) | |
White | 65 (71) | 42 (74) | 23 (68) | |
Other | 5 (5) | 2 (4) | 3 (9) | |
BMI, kg/m², mean (SD) | 27.4 (5.9) | 28.7 (5.5) | 25.3 (6.0) | .01 |
BMI category, n (%) | | | | .01 |
Underweight | 1 (1) | 0 (0) | 1 (3) | |
Normal | 29 (32) | 14 (25) | 15 (44) | |
Overweight | 34 (37) | 22 (39) | 12 (35) | |
Obese | 27 (30) | 21 (37) | 6 (18) | |
Gravidity, n (%) | | | | .003 |
0 | 31 (34) | 13 (23) | 18 (53) | |
≥1 | 60 (66) | 44 (77) | 16 (47) | |
Parity, n (%) | | | | .12 |
0 | 44 (48) | 24 (42) | 20 (59) | |
≥1 | 47 (52) | 33 (58) | 14 (41) | |
History of infertility, n (%) | 20 (23) (n = 88) | 13 (24) (n = 55) | 7 (21) (n = 33) | .79 |
Education, n (%) | (n = 88) | (n = 56) | (n = 32) | .17 |
Some high school | 1 (1) | 1 (2) | 0 (0) | |
High school graduate | 6 (7) | 4 (7) | 2 (6) | |
Some college | 16 (18) | 11 (20) | 5 (16) | |
College graduate with 4-y degree | 25 (28) | 18 (32) | 7 (22) | |
Postgraduate education | 40 (45) | 22 (39) | 18 (56) | |
Smoked >100 cigarettes in lifetime | 28 (31) (n = 90) | 24 (42) | 4 (12) (n = 33) | .003 |
Current smoker | 11 (12) | 10 (18) | 1 (3) | .048 |
Current alcohol use | 71 (80) (n = 89) | 46 (82) (n = 56) | 25 (76) (n = 33) | .47 |
Regular exercise | 65 (73) (n = 89) | 42 (75) (n = 56) | 23 (70) (n = 33) | .59 |
Insurance status, n (%) | | | | .56 |
Commercial | 83 (91) | 53 (93) | 30 (88) | |
Government | 3 (3) | 1 (2) | 2 (6) | |
Self-pay | 5 (5) | 3 (5) | 2 (6) | |
Uterine and fibroid characteristics | ||||
Uterine volume, cm³, median (IQR) | 584 (395–756) | 563 (402–693) | 594 (368–814) | .82 |
No. of fibroids ≥3 cm, n (%) | | | | .36 |
0 | 3 (3) | 0 (0) | 3 (5) | |
1 | 44 (48) | 15 (44) | 29 (51) | |
2 | 17 (19) | 9 (27) | 8 (14) | |
3 | 14 (15) | 5 (15) | 9 (16) | |
≥4 | 13 (14) | 5 (15) | 8 (14) | |
Age at fibroid diagnosis, y, mean (SD) | 40.3 (6.9) | 39.5 (7.1) | 41.7 (6.4) | .16 |
Self-reported fibroid-related symptoms, n (%) | ||||
Presenting symptom(s) | (n = 89) | | (n = 32) | |
Bulk symptoms | 85 (96) | 55 (97) | 30 (94) | .62 |
Heavy menstrual bleeding | 73 (82) | 48 (84) | 25 (78) | .47 |
Pain or fatigue | 78 (88) | 51 (90) | 27 (84) | .52 |
Predominant symptom | (n = 89) | | (n = 32) | .21 |
Bulk symptoms | 32 (36) | 22 (39) | 10 (31) | |
Heavy menstrual bleeding | 30 (34) | 15 (26) | 15 (47) | |
Pain or fatigue | 16 (18) | 11 (19) | 5 (16) | |
>1 Symptom | 11 (12) | 9 (16) | 2 (6) | |
Baseline validated measures | ||||
UFS-QOL, mean (SD) | ||||
Symptom severity score | 53.5 (19.5) | 53.8 (19.5) | 53.1 (19.9) | .89 |
Health-related quality of life | 51.6 (20.3) | 50.8 (18.8) | 52.8 (22.9) | .67 |
Short Form-36, mean (SD) | ||||
Mental composite score | 42.9 (10.6) | 41.7 (10.7) | 45.0 (10.4) | .16 |
Physical composite score | 45.1 (9.2) | 44.8 (9.1) | 45.7 (9.6) | .69 |
MPS, total, median (IQR) | 8.0 (3.0–15.0) | 9.0 (4.0–17.0) | 7.5 (2.0–12.0) | .17 |
VAS, intensity score, median (IQR) | 28.0 (5.0–59.0) | 35.5 (7.5–60.5) | 23.0 (4.0–49.0) | .11 |
CES-D | ||||
Total score, mean (SD) | 19.0 (6.8) | 19.5 (7.2) | 18.1 (6.3) | .38 |
Met criteria for subthreshold depression symptoms c , n (%) | 55 (66) (n = 84) | 37 (71) (n = 52) | 18 (56) (n = 32) | .16 |
FSFI, full scale score, mean (SD) | 19.8 (10.2) | 19.5 (10.7) | 20.1 (9.3) | .81 |