Confidence intervals for reporting results of clinical trials

An evaluation of whether a treatment effect varies across subgroups (i.e., a treatment-by-subgroup interaction) should precede any subgroup-specific analyses. This evaluation is typically conducted via statistical tests for interaction.

Only if the treatment effect varies across subgroups should specific subgroup analyses be undertaken. For example, there may be interest in evaluating whether a treatment effect is similar for men vs. women. If the treatment effect varies by gender, then subgroup analyses may be undertaken. However, if the treatment effect is similar, then there is no reason to conduct subgroup analyses within each gender.

When evaluating whether the treatment effect varies across subgroups, it is important to clarify the metric, since an interaction can exist on one scale but not another. Suppose, for example, that in each treatment group the response rate increases with age: there may be no interaction (heterogeneity of treatment effects) on the relative risk scale but an interaction on the absolute scale. The reporting of subgroup analyses in the literature has generally been poor. Furthermore, the results of subgroup analyses are often over-interpreted.
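The scale dependence described above can be sketched with a few hypothetical numbers (the age groups and rates below are illustrative assumptions, not data from any trial):

```python
# Illustrative (hypothetical) response rates by age group: the relative
# risk is constant across ages, yet the absolute risk difference is not.
control = {"<50": 0.10, "50-64": 0.20, "65+": 0.30}   # assumed control rates
treated = {age: 2 * p for age, p in control.items()}  # treatment doubles each rate

relative_risks = {age: treated[age] / control[age] for age in control}
risk_differences = {age: treated[age] - control[age] for age in control}

print(relative_risks)    # RR is 2.0 in every age group: no interaction on this scale
print(risk_differences)  # differences grow with age: interaction on the absolute scale
```

On the relative risk scale the treatment effect is identical in every age group, while the absolute risk difference triples from the youngest to the oldest group, illustrating why the metric must be stated.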

First, there is often low power to detect effects within subgroups. This is because clinical trials are generally powered to detect overall treatment effects and not necessarily effects within particular subgroups, where sample sizes are obviously smaller. For example, consider a trial that compares a new therapy vs. a control therapy, with results examined separately by gender. Suppose that among men, 32 of 40 (80%) in the new therapy arm responded vs. 40% in the control arm, a statistically significant difference. Suppose that among women, 4 of 10 (40%) in the new therapy arm responded vs. 20% in the control arm, a nonsignificant difference. Does this imply that the treatment is effective in males but not females? Note that the relative risk in each gender is 2. It is only the smaller sample size that leads to the nonsignificant result in women. Note also that conducting these subgroup analyses does not address the question of whether the treatment effect varies by gender.
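A rough sketch of the arithmetic behind this example, using a two-sided two-proportion z-test (a normal approximation; the control-arm counts of 16/40 and 2/10 below are hypothetical values chosen so that the relative risk is 2 in each gender, consistent with the text):

```python
from math import sqrt, erfc

def two_proportion_z(r1, n1, r2, n2):
    """Two-sided z-test comparing two independent proportions (normal approximation)."""
    p1, p2 = r1 / n1, r2 / n2
    pooled = (r1 + r2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return erfc(abs(z) / sqrt(2))  # two-sided p-value

# Hypothetical control-arm counts chosen so the relative risk is 2 in each gender.
p_men   = two_proportion_z(32, 40, 16, 40)   # 80% vs 40%, n = 40 per arm
p_women = two_proportion_z(4, 10, 2, 10)     # 40% vs 20%, n = 10 per arm

print(round(p_men, 4), round(p_women, 4))
```

Although the relative risk is 2 in both genders, only the male subgroup yields a significant p-value; the female result is nonsignificant purely because of the smaller sample size.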

When subgroup analyses are conducted, they should be reported regardless of statistical significance. A forest plot is an effective method for reporting the results of subgroup analyses. The number of subgroup analyses conducted should be transparent so that results can be interpreted within the appropriate context. Subgroup analyses should generally be considered exploratory rather than confirmatory. A common mistake of clinical researchers is to interpret a significant statistical test of association as evidence of causation.

Causation is a much stronger concept than association. There are no formal statistical tests for causation, only for association. Although criteria for determining causation are not universal, a conclusion of causation often requires ruling out other possible causes, temporality (demonstrating that the cause precedes the effect), strong association, consistency (repeatability), specificity (causes result in a single effect), biological gradient (monotone dose response), biological plausibility, coherence (consistency with other knowledge), and experimental evidence.

Clinical trials try to address the causation issue through the use of randomization and the intention-to-treat (ITT) principle. However, even in randomized clinical trials, replication of trial results via other randomized trials is usually needed. This is particularly true for evaluating causes other than the randomized treatment. A more common concern is concluding causation between a nonrandomized factor and a trial outcome. Researchers should be very careful about concluding causation without randomization.

Appropriate reporting of clinical trial results is crucial for scientific advancement. Selective reporting is very common and can result in sub-optimal patient care. A common problem in medical research is the under-reporting of negative evidence.

If trial results are negative, researchers often elect not to publish these results, perhaps in part because medical journals do not consider the results exciting enough to publish. However, if several trials are conducted to evaluate the effectiveness of a new intervention, and only one trial is positive and furthermore is the only trial that is published, then the medical community is left with a distorted view of the evidence of effectiveness of the new intervention.

For these reasons, negative evidence should be reported with equal vigor. When reporting the results of clinical trials, it is important to report measures of variation, such as confidence intervals, along with point estimates of the treatment effect. Reporting both relative risk and absolute risk measures, of adverse events for example, is helpful for interpreting the impact of the events. Creative and interpretable data presentation helps to convey the overall message from the trial data.

Reporting both benefits and risks categorized by severity provides a more complete picture of the effect of a therapy. Providing reference rates for context also aids interpretation. Researchers can consult the Consolidated Standards of Reporting Trials (CONSORT) Statement, which encompasses various initiatives to alleviate the problems arising from inadequate reporting of randomized controlled trials. It offers a standard way for authors to prepare reports of trial findings, facilitating their complete and transparent reporting, and aiding their critical appraisal and interpretation.

It comprises a 25-item checklist and a flow diagram and is considered an evolving document. The checklist items focus on reporting how the trial was designed, analyzed, and interpreted; the flow diagram displays the progress of all participants through the trial. A p-value is the probability of observing data as or more extreme than that observed if the null hypothesis is true. In other words, it is the probability of the data given that a hypothesis is true.
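This definition can be made concrete with a small Monte Carlo sketch (the coin-tossing data below are hypothetical; the simulation merely illustrates "probability of data as or more extreme than observed, given the null"):

```python
import random

random.seed(0)

# Hypothetical data: 60 heads in 100 tosses. Null hypothesis: the coin is fair.
observed_heads, n = 60, 100

# Estimate the two-sided p-value: the probability, under the null, of a result
# at least as extreme as the one observed (|heads - 50| >= 10).
sims = 20_000
extreme = sum(
    abs(sum(random.random() < 0.5 for _ in range(n)) - n / 2) >= abs(observed_heads - n / 2)
    for _ in range(sims)
)
p_value = extreme / sims
print(p_value)  # close to the exact binomial value of about 0.057
```

Note that the probability is computed over hypothetical repetitions of the data under the null hypothesis, not over hypotheses, which is exactly the distinction drawn in the next paragraph.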

Traditional frequentist statisticians view a hypothesis as a fixed fact (i.e., it is either true or false), so it cannot be assigned a probability. However, an alternative statistical approach, Bayesian statistics, allows calculation of the probability of a hypothesis being true given the data.

This approach can be more intuitive or appealing to researchers as they wish to know if a particular hypothesis is true. The disadvantage of this approach is that it requires additional assumptions and researchers generally try to move towards fewer assumptions so that results are robust.

Bayesian approaches are based on the idea that unknown quantities (e.g., the true treatment effect) are described by probability distributions. These assumed distributions (called prior distributions in Bayesian terms) often incorporate prior beliefs about the hypothesis. Historical data can be used to help construct the prior distribution. This might be an attractive approach when sound prior knowledge based on reliable data is available.
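As a minimal sketch of this idea, consider a conjugate beta-binomial update for an unknown response rate (all numbers below are hypothetical):

```python
# The response rate is the unknown quantity; a Beta prior combined with
# binomial data yields a Beta posterior (conjugacy).

prior_a, prior_b = 1, 1          # Beta(1, 1): a flat prior expressing no strong prior belief
responders, failures = 8, 2      # hypothetical trial data: 8 of 10 patients respond

post_a = prior_a + responders    # posterior is Beta(a + successes, b + failures)
post_b = prior_b + failures

posterior_mean = post_a / (post_a + post_b)
print(posterior_mean)  # 0.75: the flat prior pulls the raw rate of 0.8 slightly toward 0.5
```

An informative prior built from historical data (e.g., a Beta with larger parameters centered on a previously observed rate) would pull the posterior further toward the prior belief, which is exactly where the extra assumptions enter.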

Use of Bayesian statistics has become more common in the design of clinical trials for devices.

Indian J Urol. Lawrence Flechner and Timothy Y. Tseng. Conflict of Interest: None declared. This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Abstract Objectives: With the increasing emphasis on evidence-based medicine, the urology literature has seen a rapid growth in the number of high-quality randomized controlled trials along with increased statistical rigor in the reporting of study results.

Materials and Methods: The meaning and appropriate interpretation of these statistical measures is reviewed through the use of a clinical scenario. Results: The reader will be better able to understand such statistical measures and apply them to the critical appraisal of the literature. Keywords: Confidence intervals, evidence-based medicine, number needed to treat, statistical significance.


The number needed to treat for benefit (NNTB) values are shown to the left and number needed to treat for harm (NNTH) values on the right, as it has become more usual to show beneficial effects on the left. The valuable concept of the number needed to treat was introduced about 10 years ago. Confidence intervals are usually quoted for the results of clinical trials, and this is widely recommended. Here confidence intervals have either been omitted or reported incompletely. In this paper I have shown how to produce sensible confidence intervals for the number needed to treat in all cases, both for numerical summary and graphical display.

These should be quoted whenever a number needed to treat value is presented. Conflicts of interest: None. BMJ. Douglas G Altman, professor of statistics in medicine.

Summary points: The number needed to treat is a useful way of reporting results of randomised clinical trials. When the difference between the two treatments is not statistically significant, the confidence interval for the number needed to treat is difficult to describe. Sensible confidence intervals can always be constructed for the number needed to treat. Confidence intervals should be quoted whenever a number needed to treat value is given.

Figure 1. Rethinking the NNT scale: the number needed to treat is calculated by taking the reciprocal of the absolute risk reduction. Figure 2. Number needed to treat in meta-analysis: in meta-analyses it is desirable to show graphically the results of all the trials with their confidence intervals.
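The calculation just described, together with the confidence interval obtained by inverting the limits of the absolute risk reduction's confidence interval, can be sketched as follows (the event counts below are hypothetical; when the ARR interval crosses zero, the inverted limits instead span from NNTB through infinity to NNTH):

```python
from math import sqrt

def nnt_with_ci(events_c, n_c, events_t, n_t, z=1.96):
    """NNT as the reciprocal of the absolute risk reduction (ARR), with a 95% CI
    obtained by inverting the limits of the ARR confidence interval."""
    p_c, p_t = events_c / n_c, events_t / n_t
    arr = p_c - p_t
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    lo, hi = arr - z * se, arr + z * se
    # Taking reciprocals swaps the limits: the upper ARR limit gives the lower NNT limit.
    return 1 / arr, 1 / hi, 1 / lo

# Hypothetical trial: 30/100 events on control vs. 15/100 on treatment (ARR = 0.15).
nnt, ci_low, ci_high = nnt_with_ci(30, 100, 15, 100)
print(round(nnt, 1), round(ci_low, 1), round(ci_high, 1))
```

Here the ARR interval excludes zero, so both NNT limits are finite and positive; the asymmetry of the interval around the point estimate is expected, since the reciprocal transformation is nonlinear.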

Figure 3. Figure 4. Funding: None.
