Which models exist for prediction of future disease outcomes in people with multiple sclerosis?

Why is it important to study multiple sclerosis?

Multiple sclerosis (MS) is a chronic disease of the brain, spinal cord, and nerves. Millions of people worldwide live with this disease, but how it presents and progresses can vary greatly from person to person. Although MS cannot be cured, different treatments are available that can help reduce symptoms and slow the worsening of the disease. These treatments work differently, and some have more severe side effects than others. Understanding the severity of an individual's MS is therefore important to patients and medical professionals.

Why are prognostic models important in the context of multiple sclerosis?

Prognostic models help patients and medical professionals understand how sick an individual is and will become. This understanding can support patients in making life and treatment choices. Prognostic models can also help medical professionals decide how best to treat an individual, better understand the disease, or develop treatments. Prognostic models for MS might combine a range of different pieces of information about an individual to predict how their MS will continue to develop. Important pieces of information to include in a prognostic model could be, for example, personal characteristics (such as age, sex, and body mass index), behaviour (such as whether the person smokes), and information about their MS (such as how long they have had the disease). Other clinical features or measurements may also be important.

What did we want to find out?

We wanted to find all prognostic models that combine multiple pieces of information to predict how MS will continue to develop and worsen in adults.

What did we do?

We used different techniques to search for all studies describing prognostic models for MS that combine multiple pieces of information. We were interested in studies showing how these prognostic models were developed, as well as studies evaluating how well they actually work in practice. Once we found all relevant studies, we summarised them and evaluated how well they were conducted and how well they reported their results.

What did we find?

We found 57 studies that described prognostic models combining multiple pieces of information to predict how MS will continue to develop and worsen in adults. These studies described the development of 75 different prognostic models. There were 15 instances in which the performance of specific prognostic models was evaluated.

We found that prognostic models focus on different outcomes: 41% looked at disease progression, 8% at relapses, 18% at moving from a first attack to definite MS, and 28% at moving from the early stages of MS to progressive MS. The models we found differed from one another in many ways. For example, the patients used to develop the models differed greatly in terms of the treatments they received, as did the pieces of information used to predict the course of MS. We also found that prognostic models have changed over time, reflecting changes in how MS is diagnosed and treated, new measurement techniques, and new modelling approaches. Finally, using these models requires information about the individual that can only be collected by a medical specialist, often with specialist equipment, neither of which may be available in many clinics and hospitals.

What are the limitations of the evidence?

We found problems with most studies, meaning that we may not be able to trust their results. Common problems involved the data and statistical methods used. In addition, many studies report results that may turn out very differently if the prognostic models are applied to a new group of people with MS. We also found that the studies did a poor job of describing their methods and reporting their findings.

What does this mean?

The studies we found show that the evidence on prognostic models for predicting how MS will continue to develop and worsen in adults is not yet well developed. New research is needed that focuses on developing prognostic models and evaluating their performance using methods recommended in guidelines. This research should also describe its methods and results well, so that other researchers and medical professionals can use the models in research and clinical practice.

Authors' conclusions: 

The current evidence is not sufficient to recommend the use of any of the published prognostic prediction models for people with MS in routine clinical practice, due to the lack of independent external validations. The MS prognostic research community should adhere to current reporting and methodological guidelines and conduct many more state-of-the-art external validation studies of existing or newly developed models.

Abstract: 

Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system that affects millions of people worldwide. The disease course varies greatly across individuals and many disease-modifying treatments with different safety and efficacy profiles have been developed recently. Prognostic models evaluated and shown to be valid in different settings have the potential to support people with MS and their physicians during the decision-making process for treatment or disease/life management, allow stratified and more precise interpretation of interventional trials, and provide insights into disease mechanisms. Many researchers have turned to prognostic models to help predict clinical outcomes in people with MS; however, to our knowledge, no widely accepted prognostic model for MS is being used in clinical practice yet.


Objectives: 

To identify and summarise multivariable prognostic models, and their validation studies, for quantifying the risk of clinical disease progression, worsening, and activity in adults with MS.

Search strategy: 

We searched MEDLINE, Embase, and the Cochrane Database of Systematic Reviews from January 1996 until July 2021. We also screened the reference lists of included studies and relevant reviews, and references citing the included studies.

Selection criteria: 

We included all statistically developed multivariable prognostic models aiming to predict clinical disease progression, worsening, and activity, as measured by disability, relapse, conversion to definite MS, conversion to progressive MS, or a composite of these in adult individuals with MS. We also included any studies evaluating the performance of (i.e. validating) these models. There were no restrictions based on language, data source, timing of prognostication, or timing of outcome.

Data collection and analysis: 

Pairs of review authors independently screened titles/abstracts and full texts, extracted data using a piloted form based on the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS), assessed risk of bias using the Prediction Model Risk Of Bias Assessment Tool (PROBAST), and assessed reporting deficiencies based on the checklist items in Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD). The characteristics of the included models and their validations are described narratively. We planned to meta-analyse the discrimination and calibration of models with at least three external validations outside the model development study, but no model met this criterion. We summarised between-study heterogeneity narratively, but again could not perform the planned meta-regression.

Main results: 

We included 57 studies, from which we identified 75 model developments, 15 external validations corresponding to only 12 (16%) of the models, and six author-reported validations. Only two models were externally validated multiple times. None of the identified external validations were performed by researchers independent of those who developed the model. The outcome was related to disease progression in 39 (41%), relapses in 8 (8%), conversion to definite MS in 17 (18%), and conversion to progressive MS in 27 (28%) of the 96 models or validations. The disease- and treatment-related characteristics of included participants, and the definitions of the considered predictors and outcomes, were highly heterogeneous amongst the studies. Based on publication year, we observed over time an increase in the percentage of participants on treatment, diversification of the diagnostic criteria used, increased consideration of biomarkers or treatment as predictors, and increased use of machine learning methods.

Usability and reproducibility

All identified models contained at least one predictor requiring the skills of a medical specialist for measurement or assessment. Most of the models (44; 59%) contained predictors that require specialist equipment likely to be absent from primary care or standard hospital settings. Over half (52%) of the developed models were not accompanied by model coefficients, tools, or instructions, which hinders their application, independent validation, or reproduction. The data used in model development were made publicly available in only two studies and reported to be available on request in six.

Risk of bias

We rated all but one of the model developments or validations as having high overall risk of bias. The main reason for this was the statistical methods used for the development or evaluation of the prognostic models: we rated all but two of the included model developments or validations as having high risk of bias in the analysis domain. Neither the externally validated model developments nor their external validations had low risk of bias. There were concerns about the applicability of the models to our research question in over one-third (38%) of the models or their validations.

Reporting deficiencies

Reporting was poor overall and there was no observable increase in the quality of reporting over time. The items that were unclearly reported or not reported at all for most of the included models or validations were related to sample size justification, blinding of outcome assessors, details of the full model or how to obtain predictions from it, amount of missing data, and treatments received by the participants. Reporting of preferred model performance measures of discrimination and calibration was suboptimal.