The medical review article revisited: has the science improved?
McAlister FA, Clark HD, van Walraven C, Straus SE, Lawson FNE, Moher D, Mulrow CD. Annals of Internal Medicine 1999; 131:947-951.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: Concerns about the quality of conclusions and recommendations in review articles have led to proposals of methodological criteria for conducting and evaluating such reports.
Objective: To evaluate recently published reviews to determine if the quality of such articles has improved as a result of the proposed improvements.
Design: Explicit methodological criteria were used to rate recent review articles. The reviews were also rated on the scientific basis of their treatment recommendations.
Data collection and analysis: 6 general medicine journals (3 high impact journals: New England Journal of Medicine, Annals of Internal Medicine and JAMA; and 3 lower impact journals: BMJ, American Journal of Medicine and Journal of Internal Medicine) were hand-searched for review articles published in 1996. Review articles were rated using 10 criteria for methodological rigor (previously validated) and 5 criteria for the scientific basis of their treatment recommendations (developed for this study). The proportion of reviews from high impact journals and from lower impact journals that met each criterion was compared.
Main Results: A total of 158 review articles were identified and rated. Only 2 reviews met all 10 methodological criteria. Nineteen review articles were described as a "meta-analysis", "systematic review" or "overview", and a higher proportion of these met the methodological criteria. Review articles published in high impact journals did not meet the methodological criteria more often than those published in lower impact journals. Only 44 of the review articles described how evidence was located. Of the 111 review articles in which treatment recommendations were made, a median of 3 therapies were discussed. The benefits and harms of the treatments were discussed in one third of these review articles.
Conclusions: Although there have been improvements since the use of methodological criteria for review articles was proposed, there is a wide variation in the methodological quality of review articles. Few review articles fulfil methodological criteria of rigor, and concern remains about the validity of their conclusions and recommendations.
COMMENTARY
Prepared by Peter Langhorne
More than 10 years ago Cindy Mulrow published an influential article pointing out the poor scientific quality of the conventional medical review article, which highlighted the need for systematic reviews.1 In this article, Mulrow and colleagues revisit the question of the scientific quality of medical review articles from 6 core general medical journals published in 1996. The study is carefully conducted using several strategies to minimise bias.
The authors conclude that the scientific quality of medical review articles, when judged against a range of criteria, has improved since the 1980s but remains very variable. Their conclusions complement other studies highlighting the higher methodological quality of systematic reviews compared with conventional narrative reviews. The main implications for Cochrane reviewers are that the many painful hours taken up with rigorous review methods are worthwhile and likely to result in a substantially less biased review article. For users of evidence it highlights the importance of accessing high quality systematic reviews.
Some of the limitations of this article also carry implications for future methodological research. Firstly, there is the concern that judging the quality of an article using published information may be misleading, i.e. your judgement is based on what was reported which may not be what was actually done. It would be interesting to establish to what extent this is the case. Secondly, and crucially, we lack really firm empirical evidence that reviews which have used explicit systematic methods yield results which are closer to the truth. In particular we do not know which aspects of review methodology are essential to ensure a high validity, and those which are not necessary and could be abandoned. However, in the interim it seems much safer to conclude that systematic reviews are the gold standard and to ask proponents of narrative reviews to prove that they are not hopelessly invalid.
Reference
1. Mulrow CD. The medical review article: state of the scene. Annals of Internal Medicine 1987; 106:485-488.
Identification of randomized controlled trials from the emergency medicine literature
Langham J, Thompson E, Rowan K. Annals of Emergency Medicine 1999; 34:25-34.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: Handsearching of journals is a key element in the process of identifying randomised trials for consideration for Cochrane reviews.
Objectives: Two studies were undertaken: one to compare motives for active participation in handsearching of the literature by emergency medicine professionals, and the other to compare handsearching with MEDLINE searching of emergency medicine journals.
Design: A letter was sent to members of the British Association for Emergency Medicine (BAEM) and the Society for Academic Emergency Medicine (SAEM) seeking their help in handsearching journals for the Cochrane Collaboration. Prioritized journals from the emergency medicine literature were handsearched by the volunteers recruited in this way. A comprehensive MEDLINE search was done for each journal in 1996. The searching covered the period 1948-1995 where possible.
Data collection and analysis: The proportion of the members of BAEM and SAEM who expressed an interest and who went on to handsearch journals was collected and attempts were made to investigate their motivation for doing so. The yields of randomised trials from handsearching and MEDLINE searching were compared.
Main results: The response rate from members of BAEM and SAEM (10.1% and 1.8%, respectively) was low and there were insufficient data to investigate motivation. Sixty-two journals were prioritised, 18 of which were indexed by MEDLINE, and handsearching was completed for 14 of these. A total of 710 reports of randomised trials were identified by a combination of handsearching and the MEDLINE search. Both methods identified 365 (51%) of these reports; handsearching revealed an additional 227 (32%) that were not identified by MEDLINE searching, and MEDLINE searching found 118 (17%) that were not identified by handsearching.
Conclusions: Writing to members of relevant professional organisations is not recommended as a way to recruit handsearchers. A combination of handsearching and MEDLINE searching was needed to identify reports of randomised trials.
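The overlap figures reported in this abstract can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function name `search_yield` and the "found" labels are assumptions introduced here, and every percentage uses the combined set of reports found by either method as its denominator, not the true (unknown) number of trials in the journals.

```python
def search_yield(both, hand_only, medline_only):
    """Summarise how two search methods overlap, given counts of reports."""
    total = both + hand_only + medline_only
    return {
        "total": total,
        "both_pct": 100 * both / total,
        "hand_only_pct": 100 * hand_only / total,
        "medline_only_pct": 100 * medline_only / total,
        # Share of the combined set that each method finds when used alone
        "hand_found_pct": 100 * (both + hand_only) / total,
        "medline_found_pct": 100 * (both + medline_only) / total,
    }


# Counts reported in the abstract above
result = search_yield(both=365, hand_only=227, medline_only=118)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

On these counts neither method alone retrieves the whole combined set, which is the arithmetic behind the conclusion that both are needed.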
COMMENTARY
Prepared by Sally Hopewell
The authors provide a useful comparison of handsearching versus MEDLINE searching to identify reports of randomised controlled trials. Their work highlights two important issues: firstly, the high proportion of journals not indexed by MEDLINE (in particular non-English language journals), and secondly, that even when a journal is indexed by MEDLINE, many of the conference abstracts and supplements it contains are not. The authors demonstrate that handsearching alone should not be considered a "gold standard", and that a combination of both MEDLINE and handsearching can produce the highest yield. This approach, drawing on additional sources such as EMBASE, underpins the creation of the Cochrane Controlled Trials Register. This register continues to be the best single source of information on published trials for systematic reviews.1 However, the findings of comparisons such as Langham et al need to be interpreted cautiously, because the National Library of Medicine and the Cochrane Collaboration have recently been working to ensure that randomised controlled trials not coded as such in MEDLINE are retagged. This work has led to incremental improvements in MEDLINE over the last 4 years.2
Early findings of an ongoing project at the UK Cochrane Centre comparing handsearching versus MEDLINE searching using the Publication Type terms RANDOMIZED CONTROLLED TRIAL and CONTROLLED CLINICAL TRIAL appear to support Langham et al’s conclusions. A total of 715 reports of RCTs were found in 22 English language specialised healthcare journals by using a combination of both handsearching and MEDLINE. Of these, 314 (44%) were identified by MEDLINE and handsearching, 369 (52%) were identified only by handsearching, and 32 (4%) were identified only by MEDLINE. Handsearching appears to more than double the yield of the MEDLINE search, but for two very different reasons. Firstly, some trials were published in journals before 1991, when the MEDLINE indexing term RANDOMIZED CONTROLLED TRIAL was not available, and secondly, some trials were not indexed by MEDLINE because they were conference abstracts or published in supplements.3
These studies add to a growing body of evidence that will be combined in a Cochrane review of methodology. This evidence shows the continuing importance of handsearching as a means of identifying randomised controlled trials. However, it also reveals that the introduction of specific index terms for randomised trials in MEDLINE has led to improvements in the identification of trials in that database. The relative merits of handsearching versus MEDLINE searching need to be considered carefully by those embarking on the searching of journals published in recent years.
References
1. Egger M, Smith GD. Bias in location and selection of studies. BMJ 1998; 316:61-66.
2. Lefebvre C, Clarke M. Identifying randomised trials. In: Egger M, Davey Smith G, Altman D. Systematic reviews in health care: meta-analysis in context. 2000; In press.
3. Hopewell S, Clarke M, Lusher A, Westby M, Lefebvre C. A comparison of handsearching versus MEDLINE searching to identify reports of randomised controlled trials. 3rd Symposium on Systematic Reviews. 2000; In press.
The hazards of scoring the quality of clinical trials for meta-analysis
Juni P, Witschi A, Bloch R, Egger M. JAMA 1999; 282:1054-1060.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: Although it is widely recommended that clinical trials undergo some type of quality review, the number and variety of quality assessment scales that exist make it unclear how to achieve the best assessment.
Objectives: To determine whether the type of quality assessment scale used affects the conclusions of meta-analytic studies.
Design: A meta-analysis of 17 trials comparing low-molecular-weight heparin (LMWH) with standard heparin for prevention of postoperative thrombosis was carried out using 25 different scales to identify high quality trials.
Data collection and analysis: The association between treatment effect and summary scores and the association with 3 key domains (concealment of treatment allocation, blinding of outcome assessment, and handling of withdrawals) were examined in regression models. The main outcome measure was the pooled relative risks of deep vein thrombosis with LMWH versus standard heparin in high quality versus low quality trials as determined by the 25 quality scales.
Main results: The pooled relative risks from high-quality trials ranged from 0.63 (95% confidence interval (CI), 0.44-0.90) to 0.90 (95% CI, 0.67-1.21) compared to 0.52 (95% CI, 0.24-1.09) to 1.13 (95% CI, 0.70-1.82) for low quality trials. For 6 scales, the relative risks of high quality trials were close to unity, indicating that LMWH was not significantly superior to standard heparin, whereas low quality trials showed better protection with LMWH (p < 0.05). Seven scales showed the opposite: high quality trials showed an effect, whereas low quality trials did not. For the remaining 12 scales, effect estimates were similar in the 2 quality strata. In regression analysis, summary quality scores were not significantly associated with treatment effects. There was no significant association of treatment effects with allocation concealment and handling of withdrawals. However, open outcome assessment did influence effect size with the effect of LMWH, on average, being exaggerated by 35% (95% CI, 1%-57%; p = 0.046) compared to trials with blinded outcome assessment.
Conclusions: These data indicate that the use of summary scores to identify trials of high quality is problematic. Relevant methodological aspects of the trials should be assessed individually and their influence on effect sizes explored.
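One way to read the "exaggerated by 35%" result is as a ratio of relative risks of 0.65 between trials with open and blinded outcome assessment. That reading is an assumption made for this sketch, not something the abstract states, and the blinded relative risk used below is a hypothetical value chosen only to show the arithmetic.

```python
def open_assessment_rr(blinded_rr, exaggeration_pct):
    """Relative risk expected in open trials, given the RR in blinded trials
    and a percentage exaggeration read as a ratio of relative risks."""
    ratio_of_rr = 1 - exaggeration_pct / 100  # e.g. 35% -> 0.65
    return blinded_rr * ratio_of_rr


hypothetical_blinded_rr = 0.90  # illustrative value, not taken from the paper

# Lower confidence limit, point estimate and upper confidence limit
# of the exaggeration reported in the abstract
for pct in (1, 35, 57):
    rr = open_assessment_rr(hypothetical_blinded_rr, pct)
    print(f"exaggeration {pct}%: expected open-assessment RR {rr:.3f}")
```

Under this reading, a modest effect in blinded trials would appear considerably larger in trials with open outcome assessment, although the wide confidence interval leaves the size of the distortion uncertain.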
COMMENTARY
Prepared by Mike Clarke
It is important that reviewers try to make some judgements about the quality of studies they might include in their review. There seems little point in including studies that are of such low quality that their results are too unreliable to contribute anything meaningful to the review. However, the judgement on what is a high or low quality study can be very difficult, and many scales and checklists have been developed in the hope that these would help with the process for randomised trials.1
Juni et al have shown that this is not necessarily so. The 25 scales they used were inconsistent in categorising trials into high and low quality. Some produced meta-analyses with higher effect estimates for "high quality" trials, others produced higher estimates for "low quality" trials, and others produced similar results for the two categories. This range of results indicates the instability of using these scales. In keeping with the guidance in the Cochrane Reviewers’ Handbook, it seems preferable for reviewers to focus on specific key domains that seem to relate to the quality, and hence the reliability, of the trial. These are concealment of allocation, handling of withdrawals or dropouts in the analysis of the trial, and blinding of outcome assessment. If reviewers investigate these factors, it should help them to judge the relative quality of trials. If they report these factors it will also help readers of their review to make their own judgements about trial quality and its possible influence on the results of the review.
Reference
1. Moher D, Jadad A, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomised controlled trials. Controlled Clinical Trials. 1995; 16:62-73.
Random-effects meta-analyses are not always conservative
Poole C, Greenland S. American Journal of Epidemiology 1999; 150:469-475.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: In meta-analyses, effect estimates based on a random-effects model are generally believed to be more conservative than those based on a fixed-effect model because they are believed to have higher estimated variances (and therefore wider confidence intervals).
Objective: To compare random-effect summary estimates with fixed-effect estimates.
Design: Re-analysis of the meta-analyses of water chlorination and specific cancers.1
Data collection and analysis: Random-effect summaries, fixed-effect summaries and homogeneity p-values were computed for each of the 12 cancer sites.
Main Results: There was considerable unexplained variation between the results. The fixed-effect and random-effect summaries agreed for only the few cancer sites where the results of the contributing studies were fairly homogeneous. The homogeneity p-values for many of the other sites indicate that the calculation of any summary effect estimate might be unreliable. In such circumstances, the fixed-effect summary had a narrower confidence interval and was closer to the null hypothesis, suggesting that it was more conservative than the random-effect summary.
Conclusions: The discussion of when to use fixed-effect and when to use random-effect summaries should be replaced by a discussion of whether summary effects should be computed at all when the studies are not methodologically comparable, when their results are discernibly heterogeneous, or when there is evidence of publication bias.
Reference
1. Morris R, Audet A, Angelillo I et al. Chlorination, chlorination by-products and cancer: a meta-analysis. American Journal of Public Health 1992; 82:955-963. (Erratum in American Journal of Public Health 1993; 83:1257).
COMMENTARY
Prepared by Jesse Berlin
This article makes a number of valid arguments regarding fixed versus random-effect meta-analysis. The emphasis is on heterogeneity of study results and the implications of heterogeneity for the interpretation of summaries of the data. While the authors’ focus is on analyses of epidemiologic studies, which one might argue are likely to be more heterogeneous than randomised trials with respect to design, study populations, and outcome, the important points the authors make are also relevant to Cochrane reviews of randomised trials.
The example used by the authors is a previously published systematic review of studies of the association between water chlorination and cancer. In particular, they focus on studies of cancer of the rectum. There is strong evidence of heterogeneity of results among these studies. Poole and Greenland suggest that, rather than simply use a random-effects summary of the data, as was done by the authors of the original systematic review, one should explore potential sources of bias as an explanation for the heterogeneity. For example, when Poole and Greenland separate studies in which the water source (surface or ground water) was held constant by design, they find weaker associations in these studies than in studies in which the comparisons were likely to be between chlorinated surface water and unchlorinated ground water.
Publication bias is offered as another possible problem with studies of water chlorination. They argue, as others have, that under publication bias, small studies show larger effects, and such studies are given relatively more weight by random-effects analyses than by fixed-effects analyses.
The basic argument offered by Poole and Greenland is that, in fact, there may be situations in which random-effects summaries are actually less conservative than fixed-effects summaries. The message for Cochrane reviewers seems to be that it is imperative to perform both fixed-effect and random-effect meta-analyses. When the two results disagree, it is almost certainly because of heterogeneity of findings across studies. In some situations, one may wish to consider not producing a single summary estimate, because the heterogeneity would make such a summary difficult to interpret. If a meta-analysis is to be retained in the review, one goal for it should be to explore the reasons for the heterogeneity.
The Cochrane Library may offer a unique opportunity to study this phenomenon further. For example, by using a large number of existing meta-analyses, one might consider relating measures of heterogeneity and of publication bias to the differences between fixed and random-effects summary estimates. This might produce empirical evidence of how often publication bias may, in fact, be exaggerated by the use of random-effects summaries.
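The weighting argument in this commentary can be made concrete with a small numerical sketch. The Python below compares a fixed-effect summary with a random-effects summary on hypothetical data (two large studies near the null and two small studies with large effects). It uses the DerSimonian-Laird moment estimator for the between-study variance, which is a common choice but is not named in the article, and all function names and numbers are assumptions made for illustration.

```python
import math


def pool(log_effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate; tau2 = 0 gives the fixed-effect summary."""
    weights = [1 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, log_effects)) / total
    se = math.sqrt(1 / total)
    return estimate, se, [w / total for w in weights]


def dersimonian_laird_tau2(log_effects, variances):
    """Method-of-moments estimate of the between-study variance."""
    w = [1 / v for v in variances]
    fixed, _, _ = pool(log_effects, variances)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(log_effects) - 1)) / c)


# Hypothetical log relative risks and their variances (not from the article)
log_rr = [0.05, 0.10, 0.80, 0.90]
var = [0.01, 0.01, 0.16, 0.16]

tau2 = dersimonian_laird_tau2(log_rr, var)
fixed, fixed_se, fixed_w = pool(log_rr, var)
rand, rand_se, rand_w = pool(log_rr, var, tau2)

for label, est, se in (("fixed", fixed, fixed_se), ("random", rand, rand_se)):
    lo, hi = est - 1.96 * se, est + 1.96 * se
    print(f"{label}: RR {math.exp(est):.2f} ({math.exp(lo):.2f} to {math.exp(hi):.2f})")
print(f"share of weight on the two small studies: "
      f"fixed {sum(fixed_w[2:]):.2f}, random {sum(rand_w[2:]):.2f}")
```

With data of this shape the random-effects summary has the wider interval but sits further from the null, because the small studies carry a larger share of the weight. That is the sense in which a random-effects summary need not be the more conservative one.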
The effects of information framing on the practices of physicians
McGettigan P, Sly K, O’Connell D, Hill S, Henry D. Journal of General Internal Medicine 1999; 14:633-642.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: The presentation format of clinical trial results, or the "frame", may influence perceptions about the worth of a treatment. The extent and consistency of this influence are unclear.
Objective: To systematically review published literature on the effects of information framing on the practices of physicians.
Design: Specific study inclusion criteria do not appear to have been pre-defined and were determined following a preliminary review of potentially relevant studies.
Data collection and analysis: Relevant studies were retrieved using bibliographic databases and electronic searches. Information was extracted in relation to study design, frame type, parameter assessment, assessment scale, clinical setting, intervention, results, and factors modifying the frame effect. The authors intended to carry out a meta-analysis of compatible data sets, but this was not possible because of inter-study variability.
Main results: Twelve articles reported randomised trials investigating the effect of framing on doctors’ opinions or intended practices. Methodological shortcomings were numerous. Seven papers investigated the effect of presenting clinical trial results in terms of relative risk reduction, absolute risk reduction or the number needed to treat; gain/loss (positive/negative) terms were assessed in four papers; verbal/numeric terms in one. In simple clinical scenarios, doctors viewed results expressed in relative risk reduction or gain terms most positively. Factors that reduced the impact of framing included the risk of causing harm, pre-existing prejudices about treatments, the type of decision, the therapeutic yield, clinical experience and cost. No study investigated the effect of framing on actual clinical practice.
Conclusions: While a framing effect may exist, particularly when results are presented in terms of proportional or absolute measures of gain and loss, it appears highly susceptible to modification, and even neutralisation, by other factors that influence doctors’ decision making. Its effects on clinical practice are unknown.
COMMENTARY
Prepared by Paul Glasziou
As with a humorous story, the impact of a systematic review will depend on both the quality of the content and on the methods of presentation. For several decades psychologists such as Kahneman and Tversky have shown that decisions are influenced by the way data is framed, e.g. as improved survival or decreased mortality.1 This gain versus loss framing has been well-documented in psychology and, for some medical settings, in four of the papers included in this review. In general we prefer an intervention framed in terms of gain rather than loss. However the level of risk, the type of health decision, the level of experience and the costs of interventions may modify this effect. This modification makes generalising the size of the effect difficult for any specific setting.
More recently, researchers have studied the framing effects of absolute versus relative risk. Beginning with the classic results of Forrow in 1992,2 seven studies have documented the effects – with six of these demonstrating that relative risk reductions generated greater enthusiasm than either absolute risk reductions or number needed to treat. However in two of the studies this effect was attenuated when multiple outcomes were used, e.g. both cardiovascular mortality and total mortality.
This review is important as it demonstrates that the method of presentation may be as important as the evidence itself. However, although the effects are reasonably consistent in their direction, their importance in real clinical practice is unknown – no study assessed the effects of framing on clinicians’ behaviour, but only on stated intentions. This is important as some of the studies suggest that the effects are attenuated when the complexity of the scenarios, such as the number of outcomes, is increased to something closer to clinical practice. Given that we cannot predict the impact in any single situation, it would seem prudent to always express the results of systematic reviews in at least two complementary frames, e.g. as both relative and absolute risk reduction.
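The three frames discussed here are simple transformations of the same two event rates. The Python sketch below, using hypothetical risks and an illustrative function name `frames`, shows how one relative risk reduction can correspond to very different absolute risk reductions and numbers needed to treat.

```python
import math


def frames(control_risk, treated_risk):
    """Express one treatment effect in three complementary frames."""
    arr = control_risk - treated_risk   # absolute risk reduction
    rrr = arr / control_risk            # relative risk reduction
    # Number needed to treat, rounded up; the inner round() guards
    # against floating-point noise just above a whole number.
    nnt = math.ceil(round(1 / arr, 9)) if arr > 0 else None
    return {"RRR": rrr, "ARR": arr, "NNT": nnt}


# Hypothetical trials: same relative effect, different baseline risk
high_risk = frames(control_risk=0.20, treated_risk=0.15)
low_risk = frames(control_risk=0.02, treated_risk=0.015)

for label, f in (("high baseline risk", high_risk), ("low baseline risk", low_risk)):
    print(f"{label}: RRR {f['RRR']:.0%}, ARR {f['ARR']:.1%}, NNT {f['NNT']}")
```

Reporting only the relative risk reduction would make these two hypothetical results look identical, which is one reason for presenting at least two complementary frames.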
References
1. Kahneman D, Tversky A. Choices, values and frames. American Psychologist. 1984; 39:341-350.
2. Forrow L, Taylor WC, Arnold RM. Absolutely relative: how research results are summarized can affect treatment decisions. American Journal of Medicine. 1992; 92:121-124.
An introduction to Bayesian methods in health technology assessment
Spiegelhalter D, Myles J, Jones D, Abrams K. BMJ 1999; 319:508-512.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: Bayesian methods in health technology assessment can be defined as the explicit, quantitative use of external evidence in the design, monitoring, analysis, interpretation and reporting of health technology.
Objectives: To review current thinking on the value of the Bayesian approach in health technology assessment.
Design: To review available literature, discuss the main techniques that have been used and to provide recommendations for future work.
Data collection and analysis: Searches of MEDLINE and EMBASE identified 300 potentially relevant papers. Explicit methods on how these were assessed and then analysed are not discussed.
Main results: Examples from published studies are used to demonstrate the approach. The following themes which emerge from the literature are discussed: (1) the philosophy of the Bayesian approach and Bayes theorem over traditional statistical methods; (2) quantifying the use of Bayesian methods when there is no good evidence for prior beliefs and subjective judgement needs to be relied upon; (3) multiple subgroup analysis; (4) applying Bayesian methods to non-randomised studies; and (5) the use of Bayesian methods in formal decision making analysis.
Conclusions: Health technology assessment has been slow to adopt Bayesian methods. Possible reasons for this include a reluctance to use prior opinions and unfamiliarity with the approach, mathematical complexity, a lack of software, and beliefs of the health care establishment. Practical steps are required to overcome these difficulties.
COMMENTARY
Prepared by Julian Higgins
We undertake systematic reviews because we appreciate that we need more than the result of a single randomised trial to make a judgement about a healthcare intervention. Analysing the totality of relevant trials goes some way towards addressing this problem, but sometimes there is good reason to include further evidence, for instance from indirect treatment comparisons, empirical research or even subjective beliefs. Bayesian statistics provides a formal means of achieving this. This article provides an extensive and useful introduction to its use in health technology assessment (HTA).
The authors propose a definition of Bayesian methods in HTA as "the explicit quantitative use of external evidence in the design, monitoring, analysis and reporting". They go on to outline the Bayesian approach using case studies as examples, and report briefly on a review of Bayesian examples in the HTA literature. Meta-analysis fares well in this review, contributing the only textbook citation and a number of examples of the combination of studies with either similar or different designs. The authors also discuss Bayesian approaches to tackling some problems familiar to us in systematic reviews, such as multiple subgroup analyses, interpreting freak results and deciding whether future trials are appropriate.
Bayesian methods are not without their problems and are not easy to implement. Cochrane reviewers interested in applying them would need the help of a Bayesian-sympathetic statistician. However, they do offer flexibility and provide an alternative to anyone experiencing problems understanding p-values and confidence intervals. The article appeals for case studies in the application of Bayesian statistics, and for developments in methods, reporting standards and software. Thus, many Cochrane reviewers and methodologists have the opportunity to contribute to research in this interesting field.
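For readers unfamiliar with the approach, the simplest case of using external evidence explicitly is a normal prior combined with a normal likelihood. The Python sketch below is a minimal illustration under that assumption. The prior, the trial result and the function names are all hypothetical, and it is not a reconstruction of any analysis in the article.

```python
import math


def normal_update(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and SD for a normal prior combined with a normal likelihood."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / data_se ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * data_mean) / post_precision
    return post_mean, math.sqrt(1 / post_precision)


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


# All numbers are hypothetical and on the log odds ratio scale
prior_mean, prior_sd = 0.0, 0.35     # sceptical prior centred on no effect
trial_mean, trial_se = -0.50, 0.25   # trial result suggesting benefit

post_mean, post_sd = normal_update(prior_mean, prior_sd, trial_mean, trial_se)
prob_benefit = normal_cdf(0.0, post_mean, post_sd)  # P(log odds ratio < 0)

print(f"posterior odds ratio: {math.exp(post_mean):.2f}")
print(f"probability of any benefit: {prob_benefit:.2f}")
```

The posterior mean lies between the prior and the trial estimate, weighted by their precisions, and the result is a direct probability statement about the effect. This is the kind of output that may be easier to interpret than a p-value.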
The role of expectancies in the placebo effect and their use in the delivery of health care: a systematic review
Crow R, Gage H, Hampson S, Hart J, Kimber A, Thomas H. Health Technology Assessment 1999; 3:1-96.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: A review commissioned by the HTA programme to enhance understanding of the role of expectancies in the placebo effect and their use in the delivery of health care.
Objective: To assess the nature and extent of the placebo effect and to consider how it may be harnessed within the UK National Health Service to improve quality of care and cost effectiveness.
Design: A systematic review of literature on the placebo effect.
Data collection and analysis: The search stage sought to identify studies examining the placebo effect when confined to the expectancy mechanism. The main outcome measures were treatment-related expectancy and patient-related self-efficacy expectations. Heterogeneity of the outcomes assessed meant that no formal meta-analysis was carried out. Many of the studies were also of weak methodological quality, owing to small sample sizes and a lack of detail on research design.
Main results: Papers were classified into three clinical areas, depending on the type of expectancy they addressed. The three clinical areas are the preparation for medical procedures, management of illness and medical treatment. A narrative review of the studies in each category was conducted. (1) Preparation for medical procedures (25 studies) - prior medical preparation, and other interventions that train patients to cope with procedures and manage their consequences, were effective. (2) Management of illness (40 studies) – patients who have undergone training in self management skills showed improvements in health outcomes, as did patients who were interactive in their medical encounter. (3) Medical treatment (20 studies) - the majority of studies provided evidence that positive outcome expectancy enhanced the effects of medical treatment.
Conclusions: The hypothesis that expectancies are a mechanism for placebo effects received support across a range of clinical areas in a variety of studies.
COMMENTARY
Prepared by Andrew Herxheimer
The concepts developed in this review significantly advance the understanding of placebo effects. Two different kinds of patients’ expectations mediate placebo effects: "treatment-related outcome expectations" and "patient-related self-efficacy expectations". The distinction enables us to separate two essential tasks: firstly, to inform patients and prepare them for what is likely to happen to them and secondly, to teach and train them to manage their disease or cope with the treatment. The review is well done and is the first real attempt to consider what is worth doing in clinical practice now (a lot) and what research is needed next (even more). The review is dry and dense, but don’t be put off - it’s important.
The lessons are relevant to most Cochrane reviews. Reviewers need to ask what patients in ‘their’ trials were expecting, how well attuned they were to the clinical setting and how far they felt in control. Since most trial reports are silent on these questions, reviewers should raise them in their discussion and conclusions. Users of reviews and trialists must also start to take them seriously.
Patients’ expectations are largely unexplored territory in the Cochrane Collaboration. The subject is ripe for cultivation by a Methods Group, perhaps together with the authors of this review and/or the Effective Practice and Organisation of Care Group.
Meta-analysis of observational studies in epidemiology: a proposal for reporting
Stroup D, Berlin J, Morton S, Olkin I et al. JAMA 2000; 283:2008-2012.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: Although meta-analyses restricted to randomised trials are the most common, the number of published meta-analyses of observational studies has increased substantially in the past four decades. Thus, a clearer understanding of the advantages and limitations of statistical syntheses of observational data is needed.
Objectives: A workshop was held in April 1997 in Atlanta, USA, to examine the reporting of meta-analyses of observational studies and to make recommendations to aid authors, reviewers, editors, and readers.
Design: Twenty-seven workshop participants were selected by a steering committee, based on expertise in clinical practice, trials, statistics, epidemiology, social sciences, and biomedical editing.
Data collection and analysis: A systematic review of the published literature on the conduct and reporting of meta-analyses of observational studies was carried out using MEDLINE, Educational Research Information Center (ERIC), PsycLIT, and the Current Index to Statistics. In addition, reference lists were examined and experts in the field were contacted. The 32 articles retrieved were used to generate the workshop agenda and participants were assigned to small-group discussions on the subjects of bias, searching and abstracting, heterogeneity, study categorisation, and statistical methods.
Main results: From the material presented at the workshop, the authors developed a checklist of recommendations for reporting meta-analyses of observational studies in epidemiology. The checklist covers the reporting of background, search strategy, methods, results, discussion, and conclusion.
Conclusions: Use of the checklist should improve the usefulness of meta-analyses for authors, reviewers, editors, readers, and decision makers.
COMMENTARY
Prepared by Mike Clarke
Randomised trials usually provide the strongest evidence of the relative effectiveness of different interventions, and the search for randomised trials and the development of methods to bring them together in systematic reviews have been key components of the work of the Cochrane Collaboration during its early years. This work is likely to continue for some time and the majority of Cochrane reviews concentrate on randomised trials. However, randomised trials are not always possible. Reviewers sometimes need to decide whether or not to combine non-randomised observational studies, such as case-control and cohort studies, and studies with historical controls. The recent registration of the Non-randomised Studies Methods Group (see page 26) will help provide guidance on how these types of study might be dealt with by Cochrane reviewers.
This recent report contains the recommendations of the MOOSE (Meta-analysis of Observational Studies in Epidemiology) group on how reviews of observational studies should be reported, and provides useful guidance in the interim. The group sets out specific issues that should be addressed in reviews of non-randomised studies and, where possible, cites the evidence on which each recommendation is based. Several members of the MOOSE group are also members of the Statistics Methods Group (see page 27) and the authors of the paper have committed themselves to working with the Cochrane Collaboration on the promotion of their recommendations. Cochrane reviewers should consider these recommendations when preparing reviews of observational studies. Two more items that they might wish to consider are the provision of sufficient information for the reader to identify different versions of the same review unambiguously, and the inclusion of some discussion that sets the results of their review in the context of other, related research.
Variability in meta-analytic results concerning the value of cholesterol reduction in coronary heart disease
Katerndahl DA, Lawler W.
American Journal of Epidemiology 1999; 149:429-441.
STRUCTURED ABSTRACT
Prepared by the Empirical Methodological Studies Methods Group
Background: Despite official support for the efficacy of cholesterol reduction, considerable controversy exists, and meta-analyses of this topic have produced conflicting results.
Objectives: To assess the variability of meta-analyses evaluating the cardiovascular value of cholesterol reduction, and to attempt to explain the variability between study results.
Design: A review of meta-analyses of studies assessing the value of cholesterol reduction in coronary heart disease.
Data collection and analysis: Meta-analyses were identified using MEDLINE, The Cochrane Library and by citation tracking. Meta-analyses were included if they investigated the relationship between cholesterol reduction and total mortality, cardiovascular mortality, or nonfatal cardiovascular disease. The odds ratios for total mortality, cardiovascular mortality, and nonfatal cardiovascular disease were extracted, as were data on the authors’ encoded methodological variables, publication variables, and information on the investigators' backgrounds. It is not clear if these variables were predefined.
Results: Twenty-three meta-analyses were reviewed; 15 of these concluded that cholesterol reduction was beneficial. The overall summary odds ratios for total mortality were heterogeneous and generally failed to support the value of cholesterol reduction. However, odds ratios for cardiovascular mortality and for nonfatal cardiovascular disease were more homogeneous and did support the value of cholesterol reduction. Odds ratios were also found to be dependent on the study inclusion criteria and investigator variables.
Conclusions: Meta-analyses which were of higher methodological quality tended to report odds ratios that were more beneficial than those reported in lower quality meta-analyses. The benefit of cholesterol reduction was associated with the study inclusion/exclusion criteria and publication variables.
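As an illustration only, the kind of heterogeneity among summary odds ratios described in the Results could be quantified along the following lines. This Python sketch is not the method used by Katerndahl and Lawler. It assumes that each meta-analysis reports an odds ratio with a 95% confidence interval, pools the log odds ratios with fixed-effect inverse-variance weights, and reports Cochran's Q and the I-squared statistic. The function name and all input values are hypothetical.

```python
import math


def pool_odds_ratios(estimates):
    """Pool odds ratios and measure their heterogeneity.

    estimates: list of (odds_ratio, ci_lower, ci_upper) tuples, where the
    interval is assumed to be a 95% confidence interval (an assumption of
    this sketch, not something taken from the paper).
    Returns (pooled_odds_ratio, cochran_q, i_squared_percent).
    """
    z = 1.96  # normal quantile for a 95% interval
    log_ors = []
    weights = []
    for odds_ratio, lower, upper in estimates:
        # Recover the standard error of the log odds ratio from the CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        log_ors.append(math.log(odds_ratio))
        weights.append(1.0 / se ** 2)  # inverse-variance weight

    total_weight = sum(weights)
    pooled_log = sum(w * y for w, y in zip(weights, log_ors)) / total_weight

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled_log) ** 2 for w, y in zip(weights, log_ors))
    df = len(estimates) - 1

    # I-squared: share of variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled_log), q, i_squared


# Hypothetical summary odds ratios, invented purely for illustration.
example = [(0.90, 0.75, 1.08), (1.05, 0.88, 1.25), (0.70, 0.55, 0.89)]
pooled, q, i2 = pool_odds_ratios(example)
print(f"pooled OR = {pooled:.2f}, Q = {q:.2f}, I^2 = {i2:.0f}%")
```

A large Q relative to its degrees of freedom, or a high I-squared, would correspond to what the abstract calls heterogeneous results. A random-effects model would be a common alternative when heterogeneity is present.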
COMMENTARY
Prepared by Matthias Egger
This 'meta-meta-analysis' of studies of cholesterol lowering in coronary heart disease examined to what extent conflicting meta-analytical results are explained by different methodologies or bias. Investigator bias, the systematic deviation from the truth as a result of investigators' opinions or feelings, was of particular interest.
Twenty-three meta-analyses published from 1971 to 1994 were analysed. A number of interesting findings emerged. For example, seeking studies from the pharmaceutical industry was associated with more beneficial overall effects of cholesterol lowering interventions. Previous research has shown that the pharmaceutical industry discourages the publication of negative studies that it has funded. Katerndahl and Lawler's findings indicate that selective provision of studies (published and unpublished) may also play a role and that this can affect the conclusions of published meta-analyses. Interestingly, only 4 of the 10 meta-analyses in British journals were supportive of a beneficial effect of cholesterol lowering, compared with 11 of the 13 meta-analyses in journals from other countries. As the authors acknowledge, this analysis was, however, post hoc.
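To make the comparison of 4 of 10 with 11 of 13 concrete, the following Python sketch computes the two proportions, the odds ratio, and a two-sided Fisher exact p-value from those counts. It is an illustration, not an analysis reported by Katerndahl and Lawler. The function name is hypothetical, and the choice of Fisher's exact test is an assumption made here because the counts are small. Any p-value from such a post hoc comparison should be read with caution.

```python
from math import comb


def compare_proportions(a, n1, b, n2):
    """Compare a/n1 with b/n2 in a 2x2 table.

    Returns (proportion_1, proportion_2, odds_ratio, fisher_two_sided_p).
    Assumes no cell of the table is zero, so the odds ratio is defined.
    """
    c, d = n1 - a, n2 - b  # non-supportive counts in each group
    odds_ratio = (a * d) / (c * b)

    # Fisher's exact test: with the margins fixed, the count in the first
    # cell follows a hypergeometric distribution.
    supportive_total = a + b
    total = n1 + n2

    def ways(k):
        # Number of tables with k supportive reviews in group 1.
        return comb(n1, k) * comb(n2, supportive_total - k)

    k_min = max(0, supportive_total - n2)
    k_max = min(n1, supportive_total)
    observed = ways(a)

    # Two-sided p: sum over tables that are no more likely than the
    # observed one. Integer counts avoid floating-point ties.
    extreme = sum(ways(k) for k in range(k_min, k_max + 1) if ways(k) <= observed)
    p_value = extreme / comb(total, supportive_total)
    return a / n1, b / n2, odds_ratio, p_value


# Counts taken from the commentary: British journals versus other journals.
p_british, p_other, odds_ratio, p_value = compare_proportions(4, 10, 11, 13)
print(f"British: {p_british:.0%}, other: {p_other:.0%}")
print(f"odds ratio = {odds_ratio:.2f}, Fisher two-sided p = {p_value:.3f}")
```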
There are other important limitations of this meta-epidemiological study. The period covered, 1971 to 1994, saw important improvements in cholesterol lowering interventions and in particular the advent of the powerful statin drugs. It would be surprising, therefore, if there was no variation between the results of meta-analyses performed at different times during this period. I disagree with the authors' assessment that such variation necessarily raises "questions about the reliability and validity" of meta-analysis. Several of the meta-analyses included by Katerndahl and Lawler represent initial analyses and later updates by the same authors. The ease with which systematic reviews and meta-analyses can be updated in the light of new evidence is an important strength, which is, of course, explicitly incorporated in the conduct of Cochrane reviews.