Cochrane has launched a study to test whether artificial intelligence (AI) tools can support or enhance evidence synthesis. The initiative will use an innovative platform study design and will adhere to robust and responsible criteria, upholding Cochrane’s rigorous standards for evidence.
Cochrane remains committed to advancing responsible AI in evidence synthesis. AI technologies are advancing rapidly, with a growing number of studies and preprints exploring whether they can replicate elements of a Cochrane review in a fraction of the time. Many existing evaluations have methodological limitations and rely on studies conducted by tool developers themselves. Cochrane seeks to bring a higher level of rigour to this rapidly evolving field through independent, prospective, reproducible methods. With the application of AI, we know that speed is only one factor: our reviews are trusted because they follow a transparent, independent and methodologically rigorous process.
To understand the potential of AI tools, we are embarking on a study to evaluate how they could support or enhance key stages of the evidence synthesis lifecycle, including literature screening and data extraction. This study is being led by the AI Methods Group.
A robust process
In November 2025, we welcomed proposals from developers of AI tools that could support or enhance key stages of the evidence synthesis lifecycle. We received 48 proposals and the project team have completed a detailed internal assessment to shortlist these.
Submissions by AI tool developers were screened for:
- alignment with RAISE principles
- alignment with Cochrane's mission, vision and values
- maturity (we considered only tools already released and evaluated on screening and data extraction)
- affordability
- compliance with data protection and copyright standards
Tools were then ranked by self-nominated members of the AI Methods Group, with reference to:
- existing evaluation evidence bases
- evidence provided to substantiate performance claims
- scope of functionality (screening and data extraction)
- level of human oversight during automation
- dependency on user-defined prompts
This process, closely aligned with RAISE principles, resulted in a shortlist of two tools. Five additional tools are on a reserve list and could potentially be included as part of the innovative study design.
An innovative approach
The study will use an innovative platform design. This is an adaptive and efficient approach that allows multiple interventions (i.e., different AI tools and the dual-human standard) to be evaluated simultaneously under a single protocol. The protocol will define criteria we expect the AI tools to meet, including performance metric and uncertainty thresholds, and the platform study will allow new tools to be added and ineffective ones removed over time.
“The rapid advancements in AI tools for evidence synthesis require innovative methodological approaches to evaluate their effectiveness. Our novel adaptive platform study using a common study-within-a-review protocol offers the flexibility to select the most suitable tools for the Cochrane workflow,” said Gerald Gartlehner, Principal investigator, Danube University Krems, Austria.
The AI tools will be tested across approximately 15 Cochrane review updates, with performance compared against traditional methods through the work of author teams.
Next steps
Author teams involved in the study are being finalized, with two teams ready to test the protocol before it is shared publicly. We look forward to testing this protocol with Matteo Bruschettini (Lund University, Sweden) and colleagues, working on “Strategies for cessation of caffeine administration in preterm infants”, and with Glen Hazelwood (University of Calgary, Canada) and colleagues, working on “Disease‐modifying anti‐rheumatic drugs for rheumatoid arthritis”. The wider project team has also been assembled, and we are aiming to complete the platform study in the second half of 2026, with the results being written up after this milestone.
The team includes:
- Principal investigator: Gerald Gartlehner (Danube University Krems, Austria)
- Core project team: Rachel Craven (Cochrane, UK), Ella Flemyng (Cochrane, UK), Ruth Foxlee (Cochrane, UK), Tom Nissen (Cochrane, UK), Krishna Kishore Pandalaneni (Columbia University, USA)
- Research group: Susan Banda (Kamuzu University of Health Sciences, Malawi), Max Callaghan (Potsdam Institute for Climate Impact Research, Germany), Jo-Ana Chase (Cochrane, UK), Andreea Dobrescu (University for Continuing Education Krems, Austria), Sean Gardner (Cochrane, UK), Ursula Griebler (University for Continuing Education Krems, Austria), Pawel Jemiolo (AGH University of Krakow, Poland), Afroditi Kanellopoulou (Cochrane, UK), Amin Sharifan (University for Continuing Education Krems, Austria), Noosheen Rajabzadeh Tahmasebi (University of Freiburg, Germany)
- Data adjudication and monitoring committee: Matteo Bruschettini (Lund University, Sweden), Andreea Dobrescu (University for Continuing Education Krems, Austria), Bartosz Helfer (University for Continuing Education Krems, Austria), Larisa Pinte (“Carol Davila” University of Medicine and Pharmacy, Romania)
- Advisory group: Angelika Eisele-Metzger (University Medical Center Freiburg, Germany), Biljana Macura (Stockholm Environment Institute, Sweden), Joerg J. Meerpohl (University of Freiburg, Germany), Jan C. Minx (Potsdam Institute for Climate Impact Research, Germany), Anna Noel-Storr (Cochrane, UK), Kylie Porritt (University of Adelaide, Australia), James Thomas (University College London, UK)
Part of this work was supported by the Wellcome Trust (grant number 323143/Z/24/Z).