Laboratory blood, urine tests and liver biopsy used for the diagnosis of Wilson's disease in children and adults

Why is improving Wilson's disease diagnosis important?

Wilson's disease is an inherited disease that leads to a build-up of copper in affected parts of the body. Diagnosis usually occurs in children or young adults, but has been seen in adults over 60 years of age. Copper build-up begins in the liver and progresses over time to affect the brain. The challenge for doctors is that liver disease in Wilson's disease has non-specific features, and standard liver blood tests may be normal even with advanced scarring of the liver (cirrhosis). Early diagnosis allows earlier treatment. However, other causes of chronic liver disease may produce false-positive results and, depending on the cut-off values used for testing, may lead to further unnecessary testing. Conversely, false-negative results may arise when a single-test strategy is used for diagnosis, possibly delaying treatment.

What is the aim and what was included in this review?

We aimed to examine the accuracy of three commonly used diagnostic tests to correctly identify Wilson's disease. These tests are: caeruloplasmin (a protein that carries copper in blood); copper in the urine; and copper in the liver. Initial evaluation usually involves checking an individual's eyes for signs of Wilson's disease and a blood test for caeruloplasmin, as this is the most widely accessible biochemical test for Wilson's disease. However, the pathway to diagnosing Wilson's disease is highly variable. Follow-up testing depends on results of initial testing, plus the ability to access relevant tests and the likelihood with which the doctor believes the individual has Wilson's disease.

What are the main results in the review?

We found eight studies (5699 participants), of whom 1009 were diagnosed with Wilson's disease. One study assessed all three biochemical tests, three assessed caeruloplasmin, one assessed 24-hour urinary copper, two assessed hepatic copper and one assessed both urine and hepatic copper.

Four studies evaluated adults and children, three evaluated children and adolescents and one evaluated adults. The clinical presentation of Wilson's disease also varied: six studies evaluated individuals with both liver and neurological symptoms of Wilson's disease in addition to individuals who had not yet developed symptoms; and two studies evaluated individuals with liver symptoms only.

The ability of the three tests evaluated to detect those with Wilson's disease (termed sensitivity) was variable (50% to 94.4%); the ability to detect those without disease (termed specificity) was also variable (52.2% to 98.3%). No single test was capable of diagnosing Wilson's disease in isolation. There was also not enough evidence to determine the accuracy of the tests within different age groups or Wilson's disease subgroups (e.g. those with liver or neurological symptoms).
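As a minimal illustration of how sensitivity and specificity are calculated, the sketch below uses made-up counts, not data from any study in this review. The function names and the numbers are assumptions chosen only to show the arithmetic.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of people with the disease whom the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of people without the disease whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not taken from the review).
tp, fn = 80, 20    # 100 people who have Wilson's disease
tn, fp = 180, 20   # 200 people who do not have Wilson's disease

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.1%}")  # 80.0%
print(f"specificity = {spec:.1%}")  # 90.0%
```

In this invented example the test misses 20 of 100 people with the disease (false negatives) and wrongly flags 20 of 200 people without it (false positives).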

How reliable are the results of the studies in this review?

Since there is no gold standard test for diagnosing Wilson's disease, we selected a clinical and laboratory standard (the Leipzig criteria) to determine the diagnosis of the disease. Results of this review suggest that part of the variability in test sensitivity and specificity at the cut-offs in the Leipzig criteria is likely to be influenced by the method used to undertake the diagnostic tests. However, there were some problems with how the included studies were conducted. These problems may make the caeruloplasmin, urine copper or liver copper tests appear more accurate than they are, by increasing the number of positive results (sensitivity).

What are the implications of this review?

Limited evidence from the included studies supports the use of multiple-index testing as outlined in the Leipzig criteria. The diagnostic thresholds used in these criteria will vary with the laboratory test, with the method used to conduct the laboratory test, and with the individuals in the included studies (who varied by age, ethnicity and clinical presentation of disease). These factors should therefore be taken into account when interpreting the results. High sensitivity (true-positive rate) for each of the laboratory tests is possible at particular cut-off values; however, when used in isolation, each laboratory test may give false-positive or false-negative results. Limitations in study design may exaggerate test accuracy.

How up-to-date is this review?

The authors searched for and used studies published up to 29 May 2019.

Authors' conclusions: 

The cut-offs used for caeruloplasmin, 24-hour urinary copper and hepatic copper for diagnosing Wilson's disease are method-dependent and require validation in the population in which such index tests are going to be used. Binary cut-offs and the use of single-test strategies to rule Wilson's disease in or out are not supported by the evidence in this review. There is insufficient evidence to inform testing in specific subgroups defined by age, ethnicity or clinical presentation.


Background: 

Wilson's disease, first described by Samuel Wilson in 1912, is an autosomal recessive metabolic disorder resulting from mutations in the ATP7B gene. The disease develops as a consequence of copper accumulating in affected tissues.

There is no gold standard for the diagnosis of Wilson's disease. Diagnosis is often delayed because of the non-specific clinical features and the need for a combination of clinical and laboratory tests. This delay may in turn affect clinical outcome and has implications for the diagnosis of other family members. The Leipzig criteria were established to help standardise diagnosis and management. However, it should be emphasised that these criteria date from 2003, and many of them have not been formally evaluated; this review examines the evidence behind biochemical testing for Wilson's disease.


Objectives: 

To determine the diagnostic accuracy of three biochemical tests at specified cut-off levels for Wilson's disease. The index tests covered by this Cochrane Review are caeruloplasmin, 24-hour urinary copper and hepatic copper content. These tests were evaluated in those with suspected Wilson's disease and appropriate controls (either healthy individuals or those with chronic liver disease other than Wilson's disease). In the absence of a gold standard for diagnosing Wilson's disease, we have used the Leipzig criteria as a clinical reference standard.

To investigate whether index tests should be performed in all individuals who have been recommended for testing for Wilson's disease, or whether these tests should be limited to subgroups of individuals.

Search strategy: 

We identified studies by extensive searching of sources including the Cochrane Central Register of Controlled Trials (CENTRAL), PubMed, Embase, the Web of Science and clinical trial registries (29 May 2019).

Date of the most recent search of the Cochrane Cystic Fibrosis and Genetic Disorders Inborn Errors of Metabolism Register: 29 May 2019.

Selection criteria: 

We included prospective and retrospective cohort studies that assessed the diagnostic accuracy of an index test using the Leipzig criteria as a clinical reference standard for the diagnosis of Wilson's disease.

Data collection and analysis: 

Two review authors independently reviewed and extracted data and assessed the methodological quality of each included study using the QUADAS-2 tool. We had planned to undertake meta-analyses of the sensitivity and specificity at relevant cut-offs for each of the biochemical tests for Wilson's disease; however, due to differences in the methods used for each biochemical index test, it was not possible to combine the results in meta-analyses and hence these are described narratively.

Main results: 

Eight studies, involving 5699 participants (of whom 1009 were diagnosed with Wilson's disease), were eligible for inclusion in the review. Three studies involved children only, one adults only and the four remaining studies involved both children and adults. Two evaluated participants with hepatic signs and six evaluated participants with a combination of hepatic and neurological signs and symptoms of Wilson's disease, as well as pre-symptomatic individuals. The studies were of variable methodological quality, with high risk of bias for participant selection and the reference standard used being of greatest methodological concern. Key differences between studies include differences in assay methodology, different cut-off values for diagnostic thresholds, and different age and ethnicity groups. Concerns around study design imply that diagnostic accuracy figures may not transfer to populations outside of the relevant study.

Index test: caeruloplasmin

Five studies evaluated various thresholds of caeruloplasmin (4281 participants, of whom 541 had Wilson's disease). For caeruloplasmin, the cut-off of 0.2 g/L used in the Leipzig criteria achieved a sensitivity of 77.1% to 99%, with a variable specificity of 55.9% to 82.8%. Using the Leipzig criteria cut-off of 0.1 g/L seemed to lower the sensitivity overall (65% to 78.9%), while increasing the specificity (96.6% to 100%).
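The trade-off between the two cut-offs can be sketched with a small example. The caeruloplasmin values below are invented for illustration and are not data from the included studies. The function name is hypothetical, and the sketch assumes the test is called positive when the value falls below the cut-off.

```python
def accuracy_at_cutoff(with_wd, without_wd, cutoff):
    """Return (sensitivity, specificity) for a test that is positive below cutoff."""
    true_pos = sum(value < cutoff for value in with_wd)
    true_neg = sum(value >= cutoff for value in without_wd)
    return true_pos / len(with_wd), true_neg / len(without_wd)


# Made-up caeruloplasmin values in g/L (illustration only).
with_wd = [0.03, 0.05, 0.08, 0.09, 0.12, 0.15, 0.18, 0.22]
without_wd = [0.15, 0.19, 0.21, 0.24, 0.26, 0.28, 0.30, 0.35]

for cutoff in (0.2, 0.1):
    sens, spec = accuracy_at_cutoff(with_wd, without_wd, cutoff)
    print(f"cut-off {cutoff} g/L: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

With these invented values, lowering the cut-off from 0.2 g/L to 0.1 g/L flags fewer people overall, so sensitivity falls while specificity rises. This is the same direction of change as reported above.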

Index test: hepatic copper

Four studies evaluated various thresholds of hepatic copper (1150 participants, of whom 367 had Wilson's disease). The hepatic copper cut-off of 4 μmol/g used in the Leipzig criteria achieved a sensitivity of 65.7% to 94.4%, with a variable specificity of 52.2% to 98.6%.

Index test: 24-hour urinary copper

Three studies evaluated various thresholds of 24-hour urinary copper (268 participants, of whom 101 had Wilson's disease). For 24-hour urinary copper, a cut-off of 0.64 to 1.6 μmol/24 hours used in the Leipzig criteria achieved a variable sensitivity of 50.0% to 80.0%, with a specificity of 75.6% to 98.3%.