The use of a comparability scoring system in reporting observational studies




Traditional statistical analyses with adjustment for confounders in observational studies assume that the medical management already provided to the comparison groups was perfectly similar. However, variations in medical management frequently exist because of differences in the circumstances of health care. We propose that, to minimize the selection bias of observational studies, the degree of similarity or dissimilarity of the comparison groups regarding the circumstances of health care should be considered. Circumstances of health care include the geographic setting, health care setting, type of health care providers, and likelihood of confounding introduced by differences in the medical management between comparison groups. We propose a comparability scoring system for circumstances of care and provide examples of the application of this system, using recent literature to assess comparability among study groups. In our examples, the presupposed statistical associations disappeared once the analyses accounted for the differences in circumstances of care. Authors of submitted manuscripts using an observational study design may consider incorporating our scoring system or an equivalent in their methods and in the reporting of their results. The comparability score should be factored into the statistical analysis so that the analysis can correct for differences in circumstances of care. The use of a comparability scoring system can provide important insights for reviewers and readers that will improve the interpretation of this type of research study.


In cohort studies, a fundamentally important criterion for avoiding selection bias is that subjects recruited to the unexposed group are chosen from the same population as those with an exposure. This criterion presupposes that, at least theoretically, the distribution of covariates is, on average, similar between those exposed and those unexposed to a treatment or intervention. An analogous argument extends to case-control studies, which often use historical controls for comparisons. Historical controls are identified from a group of subjects observed in the past, close enough in time to those exposed. The implicit assumption is that bias from temporal changes in population health or clinical practice between the groups can be avoided or minimized.


In its guidance to industry, clinical investigators, and the Food and Drug Administration, the US Department of Health and Human Services defines historical controls as “comparable patients or populations” with an adequately documented natural history of the disease or condition who, in the past, had received no treatment or had undergone standard therapeutic, diagnostic, or prophylactic management. However, no definition of “comparable patients or populations” was provided, and the term has remained murky; as a result, the ideal control group for comparative observational studies remains poorly characterized.


Comparability or similarity between the control and study groups is extremely important to ensure valid results. However, perfect or almost perfect similarity in the baseline characteristics and medical management can be expected only in randomized controlled trials. In observational studies, which represent real-life scenarios, it may be impossible to achieve an ideal or perfect control group. In such instances, any differences in the baseline characteristics or medical management between the 2 groups are usually handled with appropriate statistical analytic techniques (ie, adjustment for confounders). However, such statistical analyses can take into consideration only those confounders that are collected and reported. Unfortunately, some of the important confounding variables may be related to differences in the medical management that are impossible to uncover unless the circumstances of health care are known for the 2 comparison groups.


It is unusual for investigators to report complete data about the circumstances of health care for the groups studied. Important circumstances of health care include the geographic setting, health care setting, type of health care providers, and likelihood of confounding introduced by differences in the medical management between the 2 groups. The circumstances of care should be taken into consideration in determining comparability (or degree of similarity) between the groups being compared.


We hypothesized that some of the inherent selection biases of observational studies can be minimized if certain criteria (regarding the circumstances of health care) are used to determine comparability regarding the already provided medical care of the comparison groups. However, there are multiple types of circumstances of care along with different degrees of similarity or dissimilarity within each circumstance of care, thus leading to different degrees of comparability that are difficult to untangle.


The use of a comparability scoring system can provide a reasonable assessment of group similarities but, perhaps more importantly, can provide a discrete variable that can be used during data analysis. The comparability score can be incorporated in logistic regression models to adjust for differences in the circumstances of care.
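As a rough illustration of what such an adjustment could look like, the model below enters the total comparability score as an ordinary covariate. This is only a sketch under stated assumptions: the text does not specify the model form, and the symbols are ours (Y is a binary outcome, E the exposure indicator, S the total comparability score for the comparison, and X_k the other measured confounders).

```latex
% Hypothetical model form; the symbols Y, E, S, X_k are illustrative assumptions.
\[
  \operatorname{logit}\,\Pr(Y = 1 \mid E, S, X)
    = \beta_0 + \beta_E E + \beta_S S + \sum_{k} \gamma_k X_k
\]
```

Under this form, exp(β_E) would be read as the odds ratio for the exposure after adjustment for the comparability score and the other covariates. Entering the individual criterion scores separately, rather than their total, would be an equally plausible choice.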


The aim of this commentary is to propose certain comparability criteria, as they relate to circumstances of care, through the development of a scoring system that can be used to determine the degree of comparability or similarity between the study and control groups (Table 1).



Table 1

Comparability scoring criteria

| Criteria | Scoring category | Comparability score | Definition |
| --- | --- | --- | --- |
| Geographic setting | Same | 2 | Comparison groups are from the same city |
| | Similar | 1 | Comparison groups are from the same state (US studies); same county or province (Canadian studies); or same country (non-US/Canadian studies) |
| | Dissimilar | 0 | Comparison groups are from different states (US studies); different counties or provinces (Canadian studies); or different countries (non-US/Canadian studies) |
| Health care setting | Same | 2 | Comparison groups are from the same health care facility (ie, same hospitals) and same health system (eg, Kaiser Permanente or HCA system [US studies]) or same national health system (non-US studies) |
| | Similar | 1 | Comparison groups are from the same setting with respect to inpatient vs outpatient, ED, nursing home, and public vs private settings |
| | Dissimilar | 0 | Comparison groups are from different settings with respect to inpatient vs outpatient, ED, nursing home, and public vs private settings |
| Health care providers | Same | 2 | Health care providers of the comparison groups are the same individuals or same groups of individuals |
| | Similar | 1 | Health care providers of the comparison groups are not the same individuals or same groups of individuals, but they have similar credentials with respect to training (ie, MD, RN, etc) |
| | Dissimilar | 0 | Health care providers of the comparison groups have different credentials with respect to training |
| Confounding interventions impact | Not likely | 2 | There is no reasonable possibility of confounding medical interventions in the comparison groups |
| | Likely | 1 | Confounding medical interventions cannot be excluded in the comparison groups |
| | Very likely | 0 | Confounding medical interventions are very likely in the comparison groups |
| Time interval | | 2 | Midyear interval between the historical control and study group less than 3 years |
| | | 1 | Midyear interval between the historical control and study group 3-6 years |
| | | 0 | Midyear interval between the historical control and study group 7 years or longer |
| Consensus statements impact | Not likely | 2 | There have not been any major public announcements or official consensus statement(s) to influence management between the historical control and study group |
| | Likely | 1 | There have not been any major public announcements or official consensus statement(s), but other publications may have influenced management between the historical control and study group |
| | Very likely | 0 | There have been major public announcements and/or official consensus statement(s), which may have influenced management between the historical control and study group |
| Total score | | | |

ED, emergency department; HCA, Hospital Corporation of America.

Vintzileos. Comparability criteria in observational studies. Am J Obstet Gynecol 2014.
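The scoring in Table 1 amounts to a simple sum of per-criterion points, which can be sketched in a few lines of Python. The class and function names below are hypothetical, introduced only for illustration. Treating the two historical-control criteria (time interval and consensus statements) as optional, and reporting the attainable maximum alongside the total, is our reading of the table rather than something the text states explicitly.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Points per scoring category, as listed in Table 1.
THREE_LEVEL = {"same": 2, "similar": 1, "dissimilar": 0}
LIKELIHOOD = {"not likely": 2, "likely": 1, "very likely": 0}


@dataclass
class CircumstancesOfCare:
    """Hypothetical container for one pair of comparison groups."""
    geographic_setting: str         # "same" | "similar" | "dissimilar"
    health_care_setting: str        # "same" | "similar" | "dissimilar"
    health_care_providers: str      # "same" | "similar" | "dissimilar"
    confounding_interventions: str  # "not likely" | "likely" | "very likely"
    # Assumption: these two apply only when a historical control group is used.
    time_interval_score: Optional[int] = None   # 0, 1, or 2
    consensus_statements: Optional[str] = None  # "not likely" | "likely" | "very likely"


def comparability_score(c: CircumstancesOfCare) -> Tuple[int, int]:
    """Return (total score, maximum attainable score) for the criteria that apply."""
    scores = [
        THREE_LEVEL[c.geographic_setting],
        THREE_LEVEL[c.health_care_setting],
        THREE_LEVEL[c.health_care_providers],
        LIKELIHOOD[c.confounding_interventions],
    ]
    historical = (c.time_interval_score, c.consensus_statements)
    if any(value is not None for value in historical):
        if any(value is None for value in historical):
            raise ValueError("historical controls need both time interval and consensus criteria")
        if c.time_interval_score not in (0, 1, 2):
            raise ValueError("time_interval_score must be 0, 1, or 2")
        scores.append(c.time_interval_score)
        scores.append(LIKELIHOOD[c.consensus_statements])
    # Every criterion is worth at most 2 points.
    return sum(scores), 2 * len(scores)


# Made-up example: a study with a historical control group.
historical_example = CircumstancesOfCare(
    "same", "same", "similar", "likely",
    time_interval_score=1, consensus_statements="very likely",
)
# Made-up example: a study with a concurrent control group.
concurrent_example = CircumstancesOfCare("same", "similar", "dissimilar", "not likely")

print(comparability_score(historical_example))  # (7, 12)
print(comparability_score(concurrent_example))  # (5, 8)
```

Returning the maximum together with the total keeps scores from studies with and without historical controls from being compared on different scales by accident.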


Comparability scoring criteria


Ideally, comparative analyses should take into consideration the following criteria regarding circumstances of health care. The definition of the criteria is shown in Table 1.


Geographic setting


It is known that the circumstances of health care delivery may vary according to the cultural or geographic setting. The geographic setting can be the same, similar, or dissimilar between the comparison groups.


Health care setting/health care system


Variations in health care may be expected according to the health care setting or health care system. The health care setting defines the facility and its available resources, in which care was provided, and it can be the same, similar, or dissimilar. The practices in university vs community hospitals are expected to be similar.


Type of health care providers


Practices vary among practitioners with different specialty training or credentials. Health care providers can be the same, similar, or dissimilar.


The change under investigation (diagnostic or therapeutic) should ideally be the only major difference in the management between the control and study groups


This is an extremely important requirement because in the presence of more than 1 intervention, it may be difficult or even impossible to determine the impact of each intervention separately. A determination should be made regarding the possibility of confounding intervention or interventions that often lead to the well-recognized issue called confounding by indication.


If a historical control group is used, the time interval between the historic control group and the study group should be short so that the possibility for confounding by practice changes between the 2 time periods is minimized


The time interval is defined as the time that elapses between the midyear of the historical control group’s study period and the midyear of the study group’s study period. This midyear definition was chosen to compensate for differences in the duration of the study periods of the 2 groups. Time intervals less than 3 years are ideal because no major practice changes are usually expected to occur within this short interval. The threshold of 3 years was chosen based on our practice experience. Intervals between 3 and 6 years are reasonable (although not ideal), and intervals of 7 years or greater are likely to be associated with some confounding by practice changes.
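The midyear rule above can be made concrete with a short Python sketch. The function names are hypothetical, and study periods are assumed to be given as (first year, last year) pairs. One further assumption: the text does not say how to score an interval that falls between 6 and 7 years (possible when a midyear is fractional), so this sketch scores anything from 3 up to, but not including, 7 years as 1.

```python
from typing import Tuple

Period = Tuple[int, int]  # (first calendar year, last calendar year) of a study period


def midyear(period: Period) -> float:
    """Midpoint of a study period, eg (2000, 2004) -> 2002.0."""
    first_year, last_year = period
    return (first_year + last_year) / 2


def time_interval_score(control_period: Period, study_period: Period) -> int:
    """Score the midyear interval between historical control and study group."""
    interval = abs(midyear(study_period) - midyear(control_period))
    if interval < 3:
        return 2  # less than 3 years: ideal
    if interval < 7:
        return 1  # 3-6 years (assumption: values between 6 and 7 also land here)
    return 0      # 7 years or longer


# Made-up example: control group 2000-2004 (midyear 2002), study group 2006-2008 (midyear 2007).
print(time_interval_score((2000, 2004), (2006, 2008)))  # interval of 5 years -> 1
```

Using midyears rather than start or end years means a long control period and a short study period are compared from their centers, which is the compensation for unequal durations that the definition describes.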


If a historical control group is used, there should be no consensus statements from official professional organizations that may have influenced management and therefore the comparability of the control and study groups


Public announcements or statements from highly influential sources such as the National Institutes of Health (NIH), state health departments, or professional societies are very likely to change practice. Examples of influential statements that most likely have changed practice are the 1994 NIH statement on the administration of antenatal corticosteroids and the 2003 American College of Obstetricians and Gynecologists (ACOG) practice bulletin statement regarding the administration of 17-alpha-hydroxyprogesterone caproate in women with prior preterm delivery. Less influential statements are those that are less likely to alter practice (eg, ACOG practice bulletin recommendations based on levels of evidence B or C). A determination should be made regarding the possibility of confounding practices because of a consensus statement or statements.


