Baseline P value distributions in randomized trials were uniform for continuous but not categorical variables

Mark J. Bolland; Greg D. Gamble; Alison Avenell; Andrew Grey; Thomas Lumley

doi:10.1016/j.jclinepi.2019.05.006

Mark J. Bolland (Corresponding Author), Greg D. Gamble, Alison Avenell, Andrew Grey, Thomas Lumley

The University of Auckland

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

OBJECTIVE: Comparing observed and expected distributions of baseline variables in randomized controlled trials (RCTs) has been used to investigate possible research misconduct, although the validity of this approach has been questioned. We explored this technique and introduced a novel metric to compare P values from baseline variables between treatment arms.

STUDY DESIGN AND SETTING: We compared observed with expected distributions of baseline P values using a one-way chi-square test and by comparing the area under the curve (AUC) of the cumulative distribution function in 13 RCTs conducted by our group, two groups of RCTs known to contain fabricated data, and simulations.
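The two comparisons described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' SAS code: the choice of ten equal-width bins, the function names `decile_chi_square` and `ecdf_auc`, and the simulated inputs are all hypothetical. It relies on the fact that the area under an empirical cumulative distribution function on [0, 1] equals 1 minus the mean of the P values, so a uniform distribution gives an expected AUC of 0.5.

```python
import random

# Upper 5% critical value of the chi-square distribution with 9 degrees of freedom
CRITICAL_DF9_05 = 16.919


def decile_chi_square(p_values, bins=10):
    """One-way chi-square statistic for P values grouped into equal-width bins."""
    counts = [0] * bins
    for p in p_values:
        # min() keeps p == 1.0 inside the last bin
        counts[min(int(p * bins), bins - 1)] += 1
    expected = len(p_values) / bins  # uniform P values fill every bin equally
    return sum((c - expected) ** 2 / expected for c in counts)


def ecdf_auc(p_values):
    """Area under the empirical CDF on [0, 1], which equals 1 - mean(p)."""
    return 1.0 - sum(p_values) / len(p_values)


rng = random.Random(0)
# Hypothetical inputs: P values drawn from a uniform distribution, and
# P values artificially clustered between 0.4 and 0.6.
uniform_p = [rng.random() for _ in range(2000)]
clustered_p = [0.4 + 0.2 * rng.random() for _ in range(2000)]

for label, p_values in (("uniform", uniform_p), ("clustered", clustered_p)):
    stat = decile_chi_square(p_values)
    auc_diff = ecdf_auc(p_values) - 0.5
    print(label, round(stat, 2), stat > CRITICAL_DF9_05, round(auc_diff, 3))
```

The clustered example shows why the two measures complement each other: its chi-square statistic is far above the critical value, while its AUC stays close to 0.5 because the clustering is symmetric around 0.5.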

RESULTS: In our 13 RCTs, the distribution of P values from baseline continuous variables was consistent with the expected theoretical uniform distribution (P = 0.19, difference from expected AUC -0.03, 95% confidence interval [-0.04, 0.04]). For categorical variables, the P value distribution was not uniform. The distributions of P values from RCTs with fabricated data were highly unusual and not consistent with the uniform distribution for continuous variables, nor with the expected distribution for categorical variables, nor with the distribution of P values in genuine RCTs.
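One well-known reason why P values from categorical variables need not be uniform is that count data can produce only a finite set of P values. The sketch below illustrates this with a two-sided Fisher exact test on a 2x2 table. It is an assumption that this matches the authors' analysis: the abstract does not name the test used for categorical variables, and the helper names `fisher_exact_two_sided` and `attainable_p_values` are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first arm
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    tail = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, tail)


def attainable_p_values(n1, n2, events):
    """All distinct P values possible for two arms of size n1 and n2
    when `events` participants in total have the baseline characteristic."""
    lo, hi = max(0, events - n2), min(n1, events)
    values = set()
    for x in range(lo, hi + 1):
        p = fisher_exact_two_sided(x, n1 - x, events - x, n2 - (events - x))
        values.add(round(p, 10))
    return sorted(values)


# Hypothetical trial: 20 participants per arm, 6 with the characteristic.
print(attainable_p_values(20, 20, 6))
```

With so few attainable values, the P values cannot spread evenly over [0, 1], so a comparison against the uniform distribution is not appropriate for such variables.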

CONCLUSIONS: Assessing baseline P values in groups of RCTs can identify highly unusual distributions that might raise or reinforce concerns about randomization and data integrity.

Original language	English
Pages (from-to)	67-76
Number of pages	10
Journal	Journal of Clinical Epidemiology
Volume	112
Early online date	21 May 2019
DOIs	https://doi.org/10.1016/j.jclinepi.2019.05.006
Publication status	Published - Aug 2019

Bibliographical note

Funding: No specific funding was received for this study. M.B. receives salary support from the Health Research Council of New Zealand. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The funders had no role in the study design; collection, analysis, and interpretation of the data; writing of the report; and in the decision to submit the paper for publication.

Data access: The specific SAS code used to perform the analyses can be obtained by contacting the lead author (M.B.) by email.

Keywords

Statistical methods
Research integrity
Fabricated data
Data integrity
P values
Randomization
STATISTICS
NORMAL POSTMENOPAUSAL WOMEN
HEALTHY
DENSITY
CALCIUM SUPPLEMENTATION
ZOLEDRONATE
THERAPY
BONE

Access to Document

10.1016/j.jclinepi.2019.05.006 (Licence: Unspecified)

Cite this

@article{3183d36be91d4a159f64088aed5fc2db,

title = "Baseline P value distributions in randomized trials were uniform for continuous but not categorical variables",

abstract = "OBJECTIVE: Comparing observed and expected distributions of baseline variables in randomized controlled trials (RCTs) has been used to investigate possible research misconduct, although the validity of this approach has been questioned. We explored this technique and introduced a novel metric to compare P values from baseline variables between treatment arms. STUDY DESIGN AND SETTING: We compared observed with expected distributions of baseline P values using a one-way chi-square test and by comparing the area under the curve (AUC) of the cumulative distribution function in 13 RCTs conducted by our group, two groups of RCTs known to contain fabricated data, and simulations. RESULTS: In our 13 RCTs, the distribution of P values from baseline continuous variables was consistent with the expected theoretical uniform distribution (P = 0.19, difference from expected AUC -0.03, 95% confidence interval [-0.04, 0.04]). For categorical variables, the P value distribution was not uniform. The distributions of P values from RCTs with fabricated data were highly unusual and not consistent with the uniform distribution for continuous variables, nor with the expected distribution for categorical variables, nor with the distribution of P values in genuine RCTs. CONCLUSIONS: Assessing baseline P values in groups of RCTs can identify highly unusual distributions that might raise or reinforce concerns about randomization and data integrity.",

keywords = "Statistical methods, Research integrity, Fabricated data, Data integrity, P values, Randomization, STATISTICS, NORMAL POSTMENOPAUSAL WOMEN, HEALTHY, DENSITY, CALCIUM SUPPLEMENTATION, ZOLEDRONATE, THERAPY, BONE",

author = "Bolland, {Mark J.} and Gamble, {Greg D.} and Alison Avenell and Andrew Grey and Thomas Lumley",

note = "Funding: No specific funding was received for this study. M.B. receives salary support from the Health Research Council of New Zealand. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The funders had no role in the study design; collection, analysis, and interpretation of the data; writing of the report; and in the decision to submit the paper for publication. Data access: The specific SAS code used to perform the analyses can be obtained by contacting the lead author (M.B.) by email.",

year = "2019",

month = aug,

doi = "10.1016/j.jclinepi.2019.05.006",

language = "English",

volume = "112",

pages = "67--76",

journal = "Journal of Clinical Epidemiology",

issn = "0895-4356",

publisher = "Elsevier USA",

}

TY - JOUR

T1 - Baseline P value distributions in randomized trials were uniform for continuous but not categorical variables

AU - Bolland, Mark J.

AU - Gamble, Greg D.

AU - Avenell, Alison

AU - Grey, Andrew

AU - Lumley, Thomas

N1 - Funding: No specific funding was received for this study. M.B. receives salary support from the Health Research Council of New Zealand. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The funders had no role in the study design; collection, analysis, and interpretation of the data; writing of the report; and in the decision to submit the paper for publication. Data access: The specific SAS code used to perform the analyses can be obtained by contacting the lead author (M.B.) by email.

PY - 2019/8

Y1 - 2019/8

N2 - OBJECTIVE: Comparing observed and expected distributions of baseline variables in randomized controlled trials (RCTs) has been used to investigate possible research misconduct, although the validity of this approach has been questioned. We explored this technique and introduced a novel metric to compare P values from baseline variables between treatment arms. STUDY DESIGN AND SETTING: We compared observed with expected distributions of baseline P values using a one-way chi-square test and by comparing the area under the curve (AUC) of the cumulative distribution function in 13 RCTs conducted by our group, two groups of RCTs known to contain fabricated data, and simulations. RESULTS: In our 13 RCTs, the distribution of P values from baseline continuous variables was consistent with the expected theoretical uniform distribution (P = 0.19, difference from expected AUC -0.03, 95% confidence interval [-0.04, 0.04]). For categorical variables, the P value distribution was not uniform. The distributions of P values from RCTs with fabricated data were highly unusual and not consistent with the uniform distribution for continuous variables, nor with the expected distribution for categorical variables, nor with the distribution of P values in genuine RCTs. CONCLUSIONS: Assessing baseline P values in groups of RCTs can identify highly unusual distributions that might raise or reinforce concerns about randomization and data integrity.

AB - OBJECTIVE: Comparing observed and expected distributions of baseline variables in randomized controlled trials (RCTs) has been used to investigate possible research misconduct, although the validity of this approach has been questioned. We explored this technique and introduced a novel metric to compare P values from baseline variables between treatment arms. STUDY DESIGN AND SETTING: We compared observed with expected distributions of baseline P values using a one-way chi-square test and by comparing the area under the curve (AUC) of the cumulative distribution function in 13 RCTs conducted by our group, two groups of RCTs known to contain fabricated data, and simulations. RESULTS: In our 13 RCTs, the distribution of P values from baseline continuous variables was consistent with the expected theoretical uniform distribution (P = 0.19, difference from expected AUC -0.03, 95% confidence interval [-0.04, 0.04]). For categorical variables, the P value distribution was not uniform. The distributions of P values from RCTs with fabricated data were highly unusual and not consistent with the uniform distribution for continuous variables, nor with the expected distribution for categorical variables, nor with the distribution of P values in genuine RCTs. CONCLUSIONS: Assessing baseline P values in groups of RCTs can identify highly unusual distributions that might raise or reinforce concerns about randomization and data integrity.

KW - Statistical methods

KW - Research integrity

KW - Fabricated data

KW - Data integrity

KW - P values

KW - Randomization

KW - STATISTICS

KW - NORMAL POSTMENOPAUSAL WOMEN

KW - HEALTHY

KW - DENSITY

KW - CALCIUM SUPPLEMENTATION

KW - ZOLEDRONATE

KW - THERAPY

KW - BONE

UR - http://www.mendeley.com/research/baseline-pvalue-distributions-randomised-trials-were-uniform-continuous-not-categorical-variables

U2 - 10.1016/j.jclinepi.2019.05.006

DO - 10.1016/j.jclinepi.2019.05.006

M3 - Article

C2 - 31125614

SN - 0895-4356

VL - 112

SP - 67

EP - 76

JO - Journal of Clinical Epidemiology

JF - Journal of Clinical Epidemiology

ER -
