While I read with great interest the development of a nomogram to predict the 5-year recurrence of borderline ovarian tumors, my enthusiasm was tempered by the numerous deficiencies in the statistical analysis.
The first issue is the prediction time horizon, which was reported to be 5 years. All included patients appear to have been followed for a minimum of 5 years from diagnosis to recurrence or last follow-up, with a median of 7.9 years; it is therefore improbable that recurrence at 5 years is actually being predicted. Predicting recurrence at 5 years would require follow-up until 5 years from diagnosis, with those lost to follow-up censored at the point of last known outcome status. Survival-based methods such as Cox regression, which use all available information, should then be used in preference to logistic regression, which was used by Bendifallah et al, because logistic regression is unable to account for censored observations.
This naturally brings us to the absence of information on whether patients who were followed for <5 years, or who had a recurrence before 5 years, were omitted from the analysis. Such omissions would seem not only illogical, given the aim to predict recurrence by 5 years, but would also be likely to introduce selection bias.
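The difference between censoring and exclusion can be made concrete with a minimal sketch. The function name `status_at_horizon` and its arguments are hypothetical and are not taken from the study; the sketch only illustrates how each patient's record would be coded for a time-to-event analysis with a 5-year horizon, and why a binary outcome for logistic regression is undefined for patients lost to follow-up before that horizon.

```python
def status_at_horizon(followup_years, recurred, horizon=5.0):
    """Code one patient for analysis at a fixed prediction horizon.

    Returns (time, event, binary_outcome):
      time, event     -> inputs for a survival model such as Cox regression
      binary_outcome  -> the 0/1 outcome a logistic model would need,
                         or None when the status at the horizon is unknown
    """
    if recurred and followup_years <= horizon:
        # Recurrence observed within the horizon.
        return followup_years, 1, 1
    if followup_years >= horizon:
        # Known to be recurrence-free at the horizon; later events are
        # ignored by censoring administratively at the horizon.
        return horizon, 0, 0
    # Lost to follow-up before the horizon: censored at last contact.
    # A survival model still uses this record; logistic regression cannot.
    return followup_years, 0, None
```

Under these assumptions, a patient last seen recurrence-free at 3 years still contributes 3 years of information to a survival model, whereas a logistic model would have to drop that patient or guess the outcome.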
Bootstrapping (or cross-validation) is seen as the preferred method for internal validation to quantify and adjust for optimism. However, all variable selection procedures, including the flawed univariate screening carried out by Bendifallah et al, should be repeated within each bootstrap sample. Failure to do this, and simply assessing the final model in each bootstrap sample, as done in the study by Bendifallah et al, is flawed and does not constitute a valid assessment of internal validity.
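The point about repeating variable selection can be illustrated with a minimal sketch of bootstrap optimism correction. Everything here is an assumption made for illustration: `build_model` is a toy stand-in for the whole modelling pipeline (it simply keeps the single candidate predictor with the best discrimination), and the c-statistic is used as the performance measure. It is not the authors' procedure.

```python
import random

def c_statistic(scores, outcomes):
    """Proportion of event/non-event pairs in which the event case scores higher."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        return 0.5  # undefined without both outcome classes; treat as chance
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def build_model(rows, outcomes):
    """Toy pipeline INCLUDING selection: keep the best-discriminating column."""
    best = (0.0, 0, 1)  # (discrimination, column index, sign)
    for j in range(len(rows[0])):
        c = c_statistic([r[j] for r in rows], outcomes)
        sign = 1 if c >= 0.5 else -1
        best = max(best, (max(c, 1 - c), j, sign))
    _, j, sign = best
    return lambda r: sign * r[j]

def optimism_corrected_c(rows, outcomes, n_boot=200, seed=0):
    """Return (apparent, optimism-corrected) c-statistic."""
    rng = random.Random(seed)
    n = len(rows)
    final = build_model(rows, outcomes)
    apparent = c_statistic([final(r) for r in rows], outcomes)
    optimism = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_y = [outcomes[i] for i in idx]
        # Selection is repeated here, inside every bootstrap sample.
        model = build_model(b_rows, b_y)
        boot_c = c_statistic([model(r) for r in b_rows], b_y)
        test_c = c_statistic([model(r) for r in rows], outcomes)
        optimism += boot_c - test_c
    return apparent, apparent - optimism / n_boot
```

If `build_model` were replaced inside the loop by the already-selected final model, the optimism introduced by the selection step would go unmeasured, which is the flaw described above.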
Finally, the authors correctly concluded that external validation is required before the nomogram is considered in routine practice. Unfortunately, the authors have crucially failed to present the model in sufficient detail to allow external validation by independent investigators. A nomogram is merely a graphic presentation of the underlying logistic regression model to aid routine use. External validation of the model would require the actual regression model, including all regression coefficients and the intercept, to be reported. Given the sample size requirements of external validation studies, manually entering patient details into the nomogram to generate the predicted probabilities of recurrence is not only infeasible but also fraught with likely data entry and measurement error issues.
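To show why the coefficients and intercept are sufficient for external validation, the following minimal sketch computes a predicted probability from a logistic regression model. The function name `predicted_risk` and the predictor names used with it are hypothetical placeholders; no values from the published nomogram are implied.

```python
import math

def predicted_risk(intercept, coefficients, patient):
    """Predicted probability from a logistic regression model.

    coefficients: dict mapping predictor name -> regression coefficient
    patient:      dict mapping predictor name -> that patient's value
    """
    linear_predictor = intercept + sum(
        coefficients[name] * patient[name] for name in coefficients
    )
    # Inverse logit transforms the linear predictor into a probability.
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

With the full model reported, an independent investigator could apply such a function to every patient in a validation cohort in one pass, rather than reading each prediction off the nomogram by hand.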