Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns

Mark J Bolland, Greg D Gamble, Alison Avenell, David J Cooper, Andrew Grey

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used.

STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two control sets and two sets with publication integrity concerns. We also compared calculated and reported baseline p-values.
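As an illustration of the kind of comparison described, the following is a minimal sketch, not the authors' implementation. It assumes that, under proper randomization, the count for a given category in each trial arm follows a binomial distribution with a shared proportion estimated by pooling both arms. The function names and the example counts are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k events in n participants with event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pooled_proportion(x1, n1, x2, n2):
    """Assumed shared event probability: events pooled across both arms."""
    return (x1 + x2) / (n1 + n2)


def expected_abs_diff_dist(n1, n2, p):
    """Exact distribution of |X1 - X2| for independent X1 ~ Bin(n1, p), X2 ~ Bin(n2, p).

    Returns a dict mapping each absolute between-group difference in
    frequency counts to its expected probability.
    """
    pmf1 = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
    pmf2 = [binom_pmf(k, n2, p) for k in range(n2 + 1)]
    dist = {}
    for a, pa in enumerate(pmf1):
        for b, pb in enumerate(pmf2):
            d = abs(a - b)
            dist[d] = dist.get(d, 0.0) + pa * pb
    return dist


# Hypothetical baseline variable: 20/50 vs 22/50 participants in a category.
p_hat = pooled_proportion(20, 50, 22, 50)
dist = expected_abs_diff_dist(50, 50, p_hat)
prob_diff_1_or_2 = dist[1] + dist[2]
prob_diff_over_4 = sum(prob for d, prob in dist.items() if d > 4)
```

Summing such expected probabilities over every baseline categorical variable in a body of trials gives the expected number of variables in each difference band, which can then be set against the observed numbers.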

RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and for between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value in trials with publication integrity concerns.
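The p-value check can be sketched in a similar way. The abstract does not state which test was used to calculate p-values, so this example assumes a two-sided Fisher's exact test on a 2x2 table; the function names are hypothetical, and the 0.1 threshold is the one quoted above.

```python
from math import comb


def fisher_exact_two_sided(x1, n1, x2, n2):
    """Two-sided Fisher's exact p-value for x1/n1 events in one arm vs x2/n2 in the other.

    Assumption: this test stands in for whatever test a trial report used.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    events = x1 + x2
    denom = comb(n1 + n2, events)

    def prob(a):
        # Hypergeometric probability of a events falling in arm 1.
        return comb(n1, a) * comb(n2, events - a) / denom

    lowest = max(0, events - n2)
    highest = min(n1, events)
    p_obs = prob(x1)
    tol = 1 + 1e-7  # tolerance for floating-point ties
    total = sum(prob(a) for a in range(lowest, highest + 1) if prob(a) <= p_obs * tol)
    return min(1.0, total)


def p_value_discrepant(reported_p, x1, n1, x2, n2, threshold=0.1):
    """True if the reported p-value differs from the calculated one by more than threshold."""
    calculated = fisher_exact_two_sided(x1, n1, x2, n2)
    return abs(reported_p - calculated) > threshold
```

For a report that used a chi-square test instead, the calculated value would differ somewhat, so a discrepancy flagged this way is a prompt for closer inspection rather than evidence of a problem on its own.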

CONCLUSION: Comparing the observed and expected distributions and reported and calculated p-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.

Original language: English
Pages (from-to): 117-124
Number of pages: 7
Journal: Journal of Clinical Epidemiology
Early online date: 27 Dec 2022
Publication status: Published - Feb 2023

Bibliographical note

Funding: This research received no specific funding. MB is a recipient of an HRC Clinical Practitioners Fellowship. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The authors are independent of the HRC. The HRC had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.


  • statistical methods
  • research integrity
  • categorical variables
  • p-values
  • data integrity
  • randomization
  • fabricated data

