Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns

Mark J Bolland; Greg D Gamble; Alison Avenell; David J Cooper; Andrew Grey

doi:10.1016/j.jclinepi.2022.12.018

The University of Auckland

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used.

STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two controls, and two with publication integrity concerns. We also compared baseline calculated and reported p-values.

RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value in trials with publication integrity concerns.

CONCLUSION: Comparing the observed and expected distributions and reported and calculated p-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.
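The two checks described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and the abstract does not specify their exact procedure. The sketch assumes that both trial arms share a common event probability (estimated here by the pooled proportion), that the arms are independent binomial samples, and that a two-sided Fisher exact test stands in for the recalculated p-value. All function names and the example counts are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k events among n participants."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_abs_difference_pmf(n1: int, n2: int, p: float) -> list[float]:
    """P(|X1 - X2| = d) for independent X1 ~ Bin(n1, p) and X2 ~ Bin(n2, p).

    Index d of the returned list is the expected probability that the two
    arms differ by exactly d in their frequency counts.
    """
    pmf1 = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
    pmf2 = [binom_pmf(k, n2, p) for k in range(n2 + 1)]
    out = [0.0] * (max(n1, n2) + 1)
    for a, pa in enumerate(pmf1):
        for b, pb in enumerate(pmf2):
            out[abs(a - b)] += pa * pb
    return out


def fisher_exact_two_sided(a: int, n1: int, b: int, n2: int) -> float:
    """Two-sided Fisher exact p-value for a/n1 versus b/n2.

    Assumption: the abstract does not say which test produced the
    calculated p-values; Fisher's exact test is used here as one option.
    """
    events = a + b
    denom = comb(n1 + n2, events)

    def table_prob(x: int) -> float:
        # Hypergeometric probability of x events in arm 1, margins fixed.
        return comb(n1, x) * comb(n2, events - x) / denom

    lo, hi = max(0, events - n2), min(n1, events)
    p_obs = table_prob(a)
    # Sum every table that is no more likely than the observed one.
    p = sum(
        table_prob(x)
        for x in range(lo, hi + 1)
        if table_prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p)


def p_value_discrepant(reported: float, calculated: float,
                       threshold: float = 0.1) -> bool:
    """Flag a reported p-value that differs from the calculated one by more
    than the threshold (0.1 is the cut-off quoted in the abstract)."""
    return abs(reported - calculated) > threshold


# Hypothetical baseline variable: 12/50 versus 14/50 participants.
n1, n2, a, b = 50, 50, 12, 14
pooled = (a + b) / (n1 + n2)
pmf = expected_abs_difference_pmf(n1, n2, pooled)
print("Expected P(difference of 1 or 2):", pmf[1] + pmf[2])
print("Expected P(difference > 4):", sum(pmf[5:]))
print("Calculated p-value:", fisher_exact_two_sided(a, n1, b, n2))
```

Summing such expected probabilities over every baseline variable in a body of trials gives expected counts per difference size, which can then be compared with the observed counts.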

Original language	English
Pages (from-to)	117-124
Number of pages	7
Journal	Journal of Clinical Epidemiology
Volume	154
Early online date	27 Dec 2022
DOIs	https://doi.org/10.1016/j.jclinepi.2022.12.018
Publication status	Published - Feb 2023

Bibliographical note

Funding: This research received no specific funding. MB is a recipient of an HRC Clinical Practitioners Fellowship. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The authors are independent of the HRC. The HRC had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Keywords

statistical methods
research integrity
categorical variables
p-values
data integrity
randomization
fabricated data

Access to Document

10.1016/j.jclinepi.2022.12.018

Cite this

@article{473ea27dea444bbb8e64bd602ebfe69b,

title = "Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns",

abstract = "OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used. STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two controls, and two with publication integrity concerns. We also compared baseline calculated and reported p-values. RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value in trials with publication integrity concerns. CONCLUSION: Comparing the observed and expected distributions and reported and calculated p-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.",

keywords = "statistical methods, research integrity, categorical variables, p-values, data integrity, randomization, fabricated data",

author = "Bolland, {Mark J} and Gamble, {Greg D} and Alison Avenell and Cooper, {David J} and Andrew Grey",

note = "Funding: This research received no specific funding. MB is a recipient of an HRC Clinical Practitioners Fellowship. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The authors are independent of the HRC. The HRC had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.",

year = "2023",

month = feb,

doi = "10.1016/j.jclinepi.2022.12.018",

language = "English",

volume = "154",

pages = "117--124",

journal = "Journal of Clinical Epidemiology",

issn = "0895-4356",

publisher = "Elsevier USA",

}

TY - JOUR

T1 - Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns

AU - Bolland, Mark J

AU - Gamble, Greg D

AU - Avenell, Alison

AU - Cooper, David J

AU - Grey, Andrew

N1 - Funding: This research received no specific funding. MB is a recipient of an HRC Clinical Practitioners Fellowship. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The authors are independent of the HRC. The HRC had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

PY - 2023/2

Y1 - 2023/2

N2 - OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used. STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two controls, and two with publication integrity concerns. We also compared baseline calculated and reported p-values. RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value in trials with publication integrity concerns. CONCLUSION: Comparing the observed and expected distributions and reported and calculated p-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.

AB - OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used. STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two controls, and two with publication integrity concerns. We also compared baseline calculated and reported p-values. RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value in trials with publication integrity concerns. CONCLUSION: Comparing the observed and expected distributions and reported and calculated p-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.

KW - statistical methods

KW - research integrity

KW - categorical variables

KW - p-values

KW - data integrity

KW - randomization

KW - fabricated data

U2 - 10.1016/j.jclinepi.2022.12.018

DO - 10.1016/j.jclinepi.2022.12.018

M3 - Article

C2 - 36584733

SN - 0895-4356

VL - 154

SP - 117

EP - 124

JO - Journal of Clinical Epidemiology

JF - Journal of Clinical Epidemiology

ER -
