Validation of metabolomic models for prediction of early-onset preeclampsia




Objective


We sought to perform validation studies of previously published and newly derived first-trimester metabolomic algorithms for prediction of early preeclampsia (PE).


Study Design


Nuclear magnetic resonance–based metabolomic analysis was performed on first-trimester serum from 50 women who subsequently developed early PE and from 108 first-trimester controls. Random stratification and allocation were used to divide participants into a discovery group (30 early PE and 65 controls) for generation of the biomarker model(s) and a validation group (20 early PE and 43 controls) to ensure an unbiased assessment of the predictive algorithms. Cross-validation testing of the different algorithms was performed to confirm their robustness before use. Metabolites, demographic features, clinical characteristics, and uterine Doppler pulsatility index data were evaluated. Area under the receiver operating characteristic curve (AUC), 95% confidence interval (CI), sensitivity, and specificity of the biomarker models were derived.


Results


In validation testing, the metabolite-only model had an AUC of 0.835 (95% CI, 0.769–0.941) with 75% sensitivity and 74.4% specificity; the metabolites plus uterine Doppler pulsatility index model had an AUC of 0.916 (95% CI, 0.836–0.996) with 90% sensitivity and 88.4% specificity. Predictive metabolites included arginine, which is known to be involved in vascular dilation, and 2-hydroxybutyrate, which is known to be involved in insulin resistance and impaired glucose regulation.


Conclusion


We found confirmatory evidence that first-trimester metabolomic biomarkers can predict future development of early PE.


A large prospective study recently reported a frequency of 0.46% for early-onset preeclampsia (PE) compared to 1.6% for late-onset PE. Despite its lower frequency, early PE is of paramount importance to medical practitioners because of the strong association with adverse perinatal outcomes. A population-based study from Washington State found a significantly increased adjusted odds ratio for perinatal complications including small-for-gestational-age status, fetal and neonatal death, and combined perinatal death and morbidity in early- compared to late-onset PE. A high frequency of histologic lesions consistent with placental underperfusion has been described in early PE cases and points to a pathological basis for the increased rates of adverse outcomes observed in this subgroup.


Recent metaanalyses found that early aspirin prophylaxis, ie, <16 weeks’ gestation, reduced the risk of subsequent PE by slightly >50% while reducing preterm delivery for PE by close to 90%. However, after 16 weeks, aspirin prophylaxis had significantly reduced effectiveness. Developing biomarkers for the diagnosis or prediction of PE is now a priority. Further, several national and international organizations have recommended that PE risk assessment, based largely on historical factors, be performed at initiation of prenatal care and that aspirin prophylaxis be used in appropriate high-risk cases. Metabolomics is being extensively used as a platform for biomarker discovery in complex diseases. Our group recently reported the feasibility of accurate first-trimester nuclear magnetic resonance (NMR)-based metabolomic prediction for both early and late PE. It is important that the performance of the identified biomarkers be validated to reduce the risk of overfitting and overly optimistic estimates of diagnostic accuracy. In this manuscript we report the results of a validation study to determine the diagnostic accuracy of the metabolomic biomarkers for the first-trimester prediction of early PE.


Materials and Methods


Study population


The details of patient recruitment and of specimen collection and handling have been previously published. That report consisted of 30 early PE cases and 60 healthy controls. An additional 20 early PE cases and 48 normal controls were added for the current report, resulting in a total of 50 early PE cases and 108 controls. This is part of an ongoing prospective study conducted by the Fetal Medicine Foundation, London, United Kingdom, for the first-trimester prediction of pregnancy complications including PE. The study was approved by the King’s College Hospital research ethics committee. Institutional review board approval (project no. 02-03-033) was initially obtained on March 14, 2003. An average-risk population of British women was prospectively screened from March 2003 through September 2009 for the prediction of pregnancy complications. All patients gave written consent to participate. Women with singleton pregnancies were recruited at 11+0 to 13+6 weeks’ gestation. Maternal demographics and medical history were documented.


First-trimester ultrasound assessment, including crown-rump length and uterine artery Doppler pulsatility index (UtPI), was performed. Uterine artery Doppler screening was performed using a previously published and extensively utilized protocol. To summarize, a sagittal plane of the uterus was imaged, and the cervical canal and internal os were visualized. Transducer position was adjusted by tilting from side to side and using color flow Doppler. The uterine artery on each side was identified running along the side of the uterus and cervix, and Doppler interrogation was performed at the level of the internal os. For pulsed Doppler, a 2-mm sampling gate was placed over the point of interest so that it covered the uterine vessel. The angle of Doppler insonation was <30 degrees. The Doppler pulsatility index (PI) was measured when 3 consecutive similar waveforms were obtained. Measurements were performed on the left and right uterine arteries. In the previously published study, the lower, mean, and higher UtPI were compared and the lower PI was found to have the highest screening performance. All Doppler measurements were performed by sonographers who had achieved the Certificate of Competence ( http://www.fetalmedicine.com ). This technique of uterine Doppler measurement has been validated in a large number of patients in multiple studies.


Maternal blood was obtained and transferred to the laboratory within 5 minutes of collection. Specimens were left to stand for 10-15 minutes at room temperature to allow the blood to clot and were then centrifuged at 3000 rpm for 10 minutes to separate serum from clots. The serum was aliquoted in 0.5-mL quantities into screw-cap tubes. Samples were temporarily stored in a –20°C freezer and then transferred to a –80°C freezer within 24 hours.


The early PE cases were selected at random from our database of available stored samples. Controls were from pregnancies that delivered a phenotypically normal neonate with appropriate birthweight for gestational age at term and did not develop any hypertensive disorder of pregnancy. Each control had blood collected within 3 days of assessment of the early PE case. PE was defined as proposed by the International Society for the Study of Hypertension in Pregnancy: systolic blood pressure ≥140 mm Hg or diastolic blood pressure ≥90 mm Hg on ≥2 occasions 4 hours apart after 20 weeks of gestation in previously normotensive women, with proteinuria also present in addition to the hypertension. Proteinuria was defined as a total of at least 300 mg in a 24-hour urine collection or, in the absence of a 24-hour urine collection, 2 readings of at least 2+ proteinuria on a midstream or catheterized urine specimen. Cases diagnosed with HELLP syndrome or gestational hypertension were excluded. As previously defined in our study, early PE cases were those with a diagnosis that required delivery at <34 weeks.


Metabolomic analysis


The details of the NMR-based metabolomic analyses and statistical methods have been extensively described by our group and are summarized below.


NMR-based metabolomic analysis


Prior to NMR analysis, serum samples were filtered through 3-kDa cut-off centrifuge filter units (Amicon Microcon YM-3; Sigma-Aldrich, St. Louis, MO) to remove macromolecules (primarily proteins and lipoproteins). Aliquots of each serum sample were transferred into the centrifuge filter devices and spun at 10,000 rpm for 20 minutes. The filtrates were checked visually for any evidence that the membrane was compromised; for such samples the filtration process was repeated with a different filter and the filtrate inspected again. The subsequent filtrates were collected and the volumes were recorded. If the total volume of the sample was <300 μL, an appropriate amount of 50-mmol/L monosodium phosphate buffer (pH 7) was added until the total volume of the sample was 300 μL. Any sample that required added buffer to reach 300 μL was annotated with the dilution factor, and metabolite concentrations were corrected in the subsequent analysis. After this, 35 μL of deuterium oxide and 15 μL of buffer solution containing 50 mmol/L of monosodium phosphate at pH 7; 11.667 mmol/L of disodium-2,2-dimethyl-2-silapentane-5-sulphonate; and 0.01% sodium azide in H2O were added to the sample.
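The text states that samples topped up with buffer were annotated with a dilution factor and corrected later, but it does not give the formula. The sketch below assumes the simplest reading: the factor is the 300-μL target volume divided by the recorded filtrate volume, and the measured concentration is multiplied by it. The function names are illustrative and are not taken from the study.

```python
TARGET_VOLUME_UL = 300.0  # volume each filtrate was brought up to, per the text


def dilution_factor(filtrate_volume_ul: float) -> float:
    """Return the factor by which buffer top-up diluted a filtrate.

    Assumption: a filtrate at or above the target volume was not diluted.
    """
    if filtrate_volume_ul <= 0:
        raise ValueError("filtrate volume must be positive")
    if filtrate_volume_ul >= TARGET_VOLUME_UL:
        return 1.0
    return TARGET_VOLUME_UL / filtrate_volume_ul


def corrected_concentration(measured_umol_l: float, filtrate_volume_ul: float) -> float:
    """Scale a measured concentration back to the undiluted filtrate."""
    return measured_umol_l * dilution_factor(filtrate_volume_ul)
```

Under this assumption, a 250-μL filtrate would carry a factor of 1.2, so a measured 50 μmol/L would be reported as 60 μmol/L.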


In all, 350 μL of sample was then transferred to a microcell NMR tube (Shigemi Inc, Allison Park, PA). 1H-NMR spectra were collected on a 500-MHz Inova spectrometer (Varian Inc, Palo Alto, CA) equipped with a 5-mm HCN (1H/13C/15N) Z-gradient pulsed field gradient room-temperature probe. The singlet produced by the disodium-2,2-dimethyl-2-silapentane-5-sulphonate methyl groups was used as an internal standard for chemical shift referencing (set to 0 ppm) and for quantification. All 1H-NMR spectra were processed and analyzed using a software package (Chenomx NMR Suite Professional, Version 7.6; Chenomx Inc, Edmonton, Alberta, Canada). Each serum NMR spectrum was manually fitted to an internal spectral database of pure compounds collected under identical conditions, which allowed an average of 50 compounds in each serum sample to be identified and quantified. Each spectrum was evaluated by at least 2 NMR spectroscopists to minimize errors.


Statistical analysis


Demographic and clinical data of the early PE and control groups were compared using a Student t test, χ² test, or a Fisher exact test, as appropriate.


For the comparisons of each metabolite, mean values of matched early PE and control sample populations were tested using a Student t test for metabolites exhibiting normal distributions or a Mann-Whitney U test for metabolites exhibiting nonnormal distribution. A Bonferroni corrected P value was calculated for multiple comparisons.
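As a rough illustration of the Bonferroni step, the sketch below divides the nominal significance level by the number of comparisons. With the 38 metabolites listed in Table 3 and a nominal level of .05, this gives approximately .0013. The helper names are illustrative only, and the exact software routine used by the authors is not stated.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise level at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_adjust(p_values: list[float]) -> list[float]:
    """Equivalent view: multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# 38 metabolites are compared in Table 3.
threshold = bonferroni_threshold(0.05, 38)
```

A metabolite would then be called significant after correction only if its unadjusted P value falls below `threshold`.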


Before multivariate analysis, all NMR-derived metabolite concentration data were log-scaled for normalization. Multivariate statistical analysis consisted of principal component analysis, partial least squares discriminant analysis (PLS-DA) with permutation testing and variable importance in projection plots, and stepwise logistic regression. These techniques are widely used for analyzing metabolomic data.


Metabolites with a P value < .3 (using univariate analysis) were selected for generating the logistic regression model. A k-fold cross-validation technique was used to ensure that the logistic regression models were robust.
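The k-fold procedure can be sketched as follows: the samples are shuffled once, divided into k roughly equal folds, and each fold in turn is held out while the model is fitted on the rest. This is a generic, standard-library illustration rather than the authors' implementation; the function name, the seed, and the use of 95 samples (the size of the discovery group) are assumptions made for the example.

```python
import random
from typing import Iterator


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx


# Example: 5 folds over a group of 95 samples.
splits = list(k_fold_indices(95, 5))
```

Each sample appears in exactly one held-out fold, so averaging performance across folds uses every sample once for testing.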


Two approaches were used to validate the metabolomic prediction models in an independent patient group. First, the performance of the previously published model was evaluated in the new patient group consisting of 20 early PE cases and 48 normal controls. Second, to perform additional validation of metabolomic algorithms, the entire data set (previously published plus new patients) was randomly split into a discovery (training) set (60%) and a validation (test) set (40%). Random stratification and allocation were performed so that the proportion of cases and controls in each group was similar and the groups were comparable in terms of demographics and other potentially confounding variables. The discovery (training) group was used to develop the predictive algorithm, and model optimization was achieved using the cross-validation technique, with the aim of producing a robust and maximally parsimonious biomarker model. The predictive ability of the model was then tested independently in the validation group, which consisted of cases and controls that had not been used in model generation.
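The 60/40 split can be illustrated with a minimal stratified allocation that shuffles cases and controls separately and assigns 60% of each to the discovery set. This sketch stratifies on outcome only; the text also describes balancing demographic and other confounding variables, which is not reproduced here. The function name and seed are hypothetical.

```python
import random


def stratified_split(labels: list[str], train_fraction: float = 0.6, seed: int = 0):
    """Split sample indices so each class contributes train_fraction to training."""
    rng = random.Random(seed)
    by_class: dict[str, list[int]] = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return sorted(train), sorted(test)


# 50 early PE cases and 108 controls, as in the combined data set.
labels = ["PE"] * 50 + ["control"] * 108
train_idx, test_idx = stratified_split(labels)
train_counts = {c: sum(labels[i] == c for i in train_idx) for c in ("PE", "control")}
test_counts = {c: sum(labels[i] == c for i in test_idx) for c in ("PE", "control")}
```

With these inputs the allocation reproduces the group sizes reported in the study: 30 cases and 65 controls for discovery, and 20 cases and 43 controls for validation.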


For the selection of predictor variables in our logistic regression models, Least Absolute Shrinkage and Selection Operator (LASSO) and stepwise variable selection were used, with all model components optimized via 10-fold cross-validation.


To determine the performance of each logistic regression model, the area under the receiver operating characteristic (ROC) curve (AUC) was calculated, as well as sensitivity and specificity values.
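For readers unfamiliar with these measures, the sketch below computes them from model scores. The AUC is taken as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half, which is equivalent to the area under the empirical ROC curve. The function names and the toy scores are illustrative and are not study data.

```python
def auc(case_scores: list[float], control_scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, threshold):
    """Classify a score at or above the threshold as screen positive."""
    true_pos = sum(s >= threshold for s in case_scores)
    true_neg = sum(s < threshold for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Toy predicted probabilities, for illustration only.
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.2, 0.1]
toy_auc = auc(toy_cases, toy_controls)
toy_sens, toy_spec = sensitivity_specificity(toy_cases, toy_controls, 0.5)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds.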


The MetaboAnalyst World Wide Web server was used for principal component analysis, PLS-DA, and permutation analyses. Custom programs written using the R statistical software package (R Foundation for Statistical Computing, Vienna, Austria) and STATA 12.0 (release 7.1, 2001; StataCorp, College Station, TX) were used to perform all other statistical analyses. A more detailed description of the statistical techniques is provided in the supplementary section.




Results


Table 1 compares demographic and clinical characteristics of the combined patient group. Race/ethnicity and uterine artery Doppler values were significantly different between controls and early PE cases; the difference in weight was of borderline significance ( P = .052). Table 2 separately compares the demographic and clinical characteristics between the cases and controls in both the discovery (training) and validation subsets. There were no significant differences between early PE cases and controls in either the discovery or the validation group apart from maternal race/ethnicity and UtPI values. As expected, the UtPI was generally elevated in the early PE cases compared to controls. Table 3 shows the univariate comparison of metabolite concentrations in early PE cases vs controls in the combined patient groups. Metabolite concentrations are expressed in μmol/L. The direction of change and fold change in metabolite concentrations are also provided in this table. Bonferroni correction (adjusted significance level of .0013) was utilized. PLS-DA resulted in a good separation between the early PE cases and controls ( Figure 1 ) for the combined data sets. Permutation testing demonstrated that the observed separation was statistically significant and not due to chance ( P < .001).



Table 1

Demographic and clinical characteristics of early preeclampsia and control groups (combined group)

| Parameter | Early PE | Control | P value |
|---|---|---|---|
| No. of cases | 50 | 108 | |
| Maternal age, y, mean (SD) | 31.0 (7.1) | 31.7 (5.9) | .467 |
| Racial origin, n (%) | | | .013 |
| White | 14 (28.0) | 60 (55.6) | |
| Black | 28 (56.0) | 35 (32.4) | |
| Asian | 7 (14.0) | 12 (11.1) | |
| Mixed | 1 (2.0) | 1 (0.9) | |
| Nullipara, n (%) | 23 (46.0) | 45 (41.7) | .609 |
| Weight, kg, mean (SD) | 73.6 (17.3) | 68.4 (14.6) | .052 |
| Crown-rump length, mm, mean (SD) | 62.3 (7.5) | 64.3 (8.1) | .143 |
| UtPI, MoM, mean (SD) | 1.80 (0.69) | 1.23 (0.46) | < .001 |

MoM , multiples of median for gestational age; PE , preeclampsia; UtPI , uterine artery Doppler pulsatility index.

Bahado-Singh. Metabolomic prediction of preeclampsia. Am J Obstet Gynecol 2015 .


Table 2

Demographic and other characteristics of early preeclampsia cases and controls: discovery vs validation group

| Parameter | Discovery group: early PE | Discovery group: control | Discovery group: P value | Validation group: early PE | Validation group: control | Validation group: P value |
|---|---|---|---|---|---|---|
| No. of cases | 30 | 65 | | 20 | 43 | |
| Maternal age, y, mean (SD) | 30.6 (7.0) | 31.5 (5.8) | .535 | 31.4 (7.4) | 32.1 (6.0) | .699 |
| Racial origin, n (%) | | | .42 | | | .002 |
| White | 10 (33.3) | 32 (49.2) | | 4 (20.0) | 28 (65.1) | |
| Black | 15 (50.0) | 25 (38.5) | | 13 (65.0) | 10 (23.3) | |
| Asian | 4 (13.3) | 7 (10.8) | | 3 (15.0) | 5 (11.6) | |
| Mixed | 1 (3.3) | 1 (1.5) | | | | |
| Nullipara, n (%) | 13 (43.3) | 27 (41.5) | .869 | 10 (50.0) | 18 (41.9) | .545 |
| Weight, kg, mean (SD) | 74.2 (15.8) | 69.3 (15.5) | .183 | 73.0 (19.7) | 67.0 (13.2) | .228 |
| Crown-rump length, mm, mean (SD) | 62.4 (6.8) | 64.7 (8.5) | .205 | 62.1 (8.6) | 63.7 (7.5) | .458 |
| UtPI, MoM, mean (SD) | 1.82 (0.67) | 1.25 (0.46) | < .001 | 1.77 (0.62) | 1.20 (0.48) | .003 |

Discovery and validation data sets were randomly assigned to control for confounding variables.

MoM , multiples of median for gestational age; PE , preeclampsia; UtPI , uterine artery Doppler pulsatility index.



Table 3

Univariate analysis of metabolite concentrations in combined group (concentration: μmol/L)

| Metabolite | Early PE | Control | P value | Early PE/control | Fold change |
|---|---|---|---|---|---|
| No. of cases | 50 | 108 | | | |
| 2-hydroxybutyrate | 23.21 (9.50) | 21.39 (12.29) | .313 | Up | 1.08 |
| 3-hydroxybutyrate | 29.77 (16.49) | 39.72 (59.92) | .112 | Down | –1.33 |
| 3-hydroxyisovalerate | 6.46 (3.69) | 5.02 (3.75) | .025 | Up | 1.29 |
| Acetate | 40.71 (34.29) | 50.93 (39.30) | .013 a | Down | –1.25 |
| Acetoacetate | 9.89 (7.05) | 11.78 (10.80) | .191 | Down | –1.19 |
| Acetone | 15.66 (5.03) | 21.07 (22.19) | .018 | Down | –1.35 |
| Alanine | 316.04 (91.87) | 340.90 (144.19) | .193 | Down | –1.08 |
| Arginine | 110.82 (32.10) | 108.91 (33.73) | .738 | Up | 1.02 |
| Betaine | 26.18 (7.56) | 24.12 (7.64) | .039 a | Up | 1.09 |
| Carnitine | 28.14 (6.45) | 28.98 (12.04) | .57 | Down | –1.03 |
| Choline | 24.91 (98.62) | 84.87 (218.07) | < .001 a | Down | –3.41 |
| Citrate | 86.64 (18.48) | 81.25 (17.33) | .077 | Up | 1.07 |
| Creatine | 36.68 (14.37) | 36.62 (13.75) | .979 | Up | 1.0 |
| Creatinine | 54.82 (11.54) | 55.34 (12.55) | .804 | Down | –1.01 |
| Ethanol | 30.34 (23.85) | 36.71 (31.13) | .16 | Down | –1.21 |
| Formate | 12.58 (4.84) | 15.72 (12.12) | .022 | Down | –1.25 |
| Glucose | 4397.9 (1231.4) | 4014.9 (743.5) | .046 | Up | 1.1 |
| Glutamine | 315.37 (66.84) | 315.20 (77.74) | .989 | Up | 1.0 |
| Glycerol | 168.72 (124.08) | 322.81 (314.50) | .001 a | Down | –1.91 |
| Glycine | 194.49 (60.87) | 219.21 (88.41) | .043 | Down | –1.13 |
| Isobutyrate | 6.83 (2.80) | 6.20 (2.00) | .159 | Up | 1.1 |
| Isoleucine | 46.53 (18.66) | 48.84 (18.18) | .464 | Down | –1.05 |
| Isopropanol | 7.47 (6.85) | 26.61 (75.71) | .011 a | Down | –3.56 |
| Lactate | 1259.2 (509.8) | 1302.6 (714.6) | .664 | Down | –1.03 |
| Leucine | 82.18 (32.78) | 92.99 (58.92) | .142 | Down | –1.13 |
| Malonate | 14.05 (6.74) | 16.02 (8.52) | .152 | Down | –1.14 |
| Methionine | 20.52 (5.55) | 21.60 (6.90) | .331 | Down | –1.05 |
| Methylhistidine | 42.60 (15.98) | 40.49 (16.38) | .448 | Up | 1.05 |
| Ornithine | 35.47 (12.70) | 35.42 (14.22) | .983 | Up | 1.0 |
| Phenylalanine | 63.14 (13.91) | 65.77 (36.03) | .028 a | Down | –1.04 |
| Proline | 136.25 (48.15) | 131.83 (53.32) | .619 | Up | 1.03 |
| Propylene glycol | 9.50 (4.16) | 8.46 (4.27) | .039 a | Up | 1.12 |
| Pyruvate | 70.34 (35.08) | 60.39 (27.43) | .059 a | Up | 1.16 |
| Serine | 122.62 (33.12) | 138.16 (67.37) | .054 | Down | –1.13 |
| Succinate | 5.16 (3.56) | 9.02 (11.12) | .001 | Down | –1.75 |
| Threonine | 124.98 (29.43) | 131.35 (50.76) | .322 | Down | –1.05 |
| Tyrosine | 52.50 (15.54) | 51.10 (19.75) | .659 | Up | 1.03 |
| Valine | 141.86 (45.49) | 143.89 (47.10) | .799 | Down | –1.01 |

Data presented as mean (SD) μmol/L. P values were calculated based on t test.

PE , preeclampsia.


a Calculated based on Mann-Whitney U test with nonnormal distributions. Adjusted significance level with Bonferroni correction for .05 is .0013.
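The fold-change column in Table 3 appears to follow a signed convention: when the early PE mean is higher, the value is the ratio of early PE to control; when it is lower, the value is the negative of the inverse ratio. This convention is inferred from the tabulated values rather than stated in the text, and the function name below is illustrative.

```python
def signed_fold_change(case_mean: float, control_mean: float) -> tuple[str, float]:
    """Return (direction, fold change) using the signed convention seen in Table 3."""
    if case_mean <= 0 or control_mean <= 0:
        raise ValueError("mean concentrations must be positive")
    if case_mean >= control_mean:
        return "Up", case_mean / control_mean
    # A decrease is reported as the negative of control/case.
    return "Down", -(control_mean / case_mean)


# Mean concentrations (μmol/L) taken from Table 3.
choline = signed_fold_change(24.91, 84.87)
glycerol = signed_fold_change(168.72, 322.81)
hydroxyisovalerate = signed_fold_change(6.46, 5.02)
```

Rounded to 2 decimal places, these three calls reproduce the tabulated values of –3.41, –1.91, and 1.29.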




Figure 1


Separation between PE and controls: PLS-DA

A , 2- and B , 3-dimensional score plots.

PE , preeclampsia; PLS-DA , partial least squares discriminant analysis.



The previously published metabolite plus Doppler prediction model was log (odds) = –0.008 – 0.075 acetate – 0.013 glycerol + 0.496 (3-hydroxyisovalerate) + 0.252 succinate + 0.155 crown-rump length + 8.148 UtPI multiples of median for gestational age. When tested in the new patient group (20 early PE cases and 48 normal controls), this model had an AUC of 0.79 (95% confidence interval, 0.65–0.93), sensitivity of 85%, and specificity of 65%. The previously published metabolite-only model was not significant.
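To show how such an equation is applied, the sketch below evaluates the log (odds) and converts it to a predicted probability with the logistic function. The coefficients are copied from the equation above. The dictionary keys are hypothetical names, and the scaling of the inputs (for example, whether metabolite concentrations are entered raw or log-transformed) is not stated in this section, so the sketch should not be used to score real patients.

```python
import math

INTERCEPT = -0.008
COEFFICIENTS = {
    "acetate": -0.075,
    "glycerol": -0.013,
    "3-hydroxyisovalerate": 0.496,
    "succinate": 0.252,
    "crown_rump_length": 0.155,
    "utpi_mom": 8.148,
}


def log_odds(predictors: dict[str, float]) -> float:
    """Linear predictor of the published metabolite plus Doppler model."""
    missing = COEFFICIENTS.keys() - predictors.keys()
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * predictors[name] for name in COEFFICIENTS)


def probability(predictors: dict[str, float]) -> float:
    """Logistic transform of the log odds, written to avoid overflow."""
    z = log_odds(predictors)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

A predicted probability is turned into a screen-positive or screen-negative call by comparing it with a threshold, which is what fixes the reported sensitivity and specificity.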


Using only the discovery set from the combined patient group, a series of logistic regression analyses was performed to develop biomarker models (ie, equations) for early PE prediction. Three models were developed: the first consisted of UtPI only, the second used metabolites only, and the third evaluated a combination of metabolites with clinical/demographic and Doppler data. Table 4 shows the respective logistic regression models that resulted. In the metabolite-only model the significant predictors were 2-hydroxybutyrate, 3-hydroxyisovalerate, acetone, citrate, and glycerol. The AUC, sensitivity, and specificity for the 3 models in the discovery group, together with the results of 5-fold cross-validation, are presented in Table 5 . The initial discovery model and the model after 5-fold cross-validation in the training cohort were compared and found to be similar. The associated ROC plots in the discovery group are shown in Figure 2 . The biomarker models from the discovery group were then tested on the independent validation group and their performance is shown in Table 6 . The performance in the discovery (training) and validation groups was similar, thus confirming reproducibility of the algorithms. High diagnostic accuracy was achieved with the combination of metabolites plus uterine artery Doppler. These results were also compared with the performance achieved by our previously published metabolite-only models. The ROC plots for the models in the validation group are shown in Figure 3 . The area under the curve was higher for the current model than for the previously published models.


