Physical Tests for Differentiating Painful Cervical Radiculopathy

Introduction

Patients with radiating neck and arm pain are frequently encountered in physiotherapy practice, and we have a crucial role to play in establishing an accurate diagnosis. Because radiating pain can arise from several pathologies, differentiating painful cervical radiculopathy from other causes, such as somatic referred pain, is essential: both the prognosis and the management strategies differ. Cervical radiculopathy is a condition in which compression or inflammation of a cervical nerve root leads to a conduction block in the nerve, causing sensory changes such as paresthesia, weakness when motor fibers are involved, and diminished reflexes. In a 2018 systematic review, Thoomes et al. assessed the diagnostic accuracy of physical examination tests for painful cervical radiculopathy using literature up to 2016, but the evidence was of low quality. Now that almost 10 years have passed since that search, the current review set out to determine whether the evidence base has become stronger. The main goal of this systematic review is to evaluate the clinical utility of physical examination tests in differentiating painful cervical radiculopathy from other sources of radiating arm pain, such as somatic referred pain, in patients visiting primary and secondary care.

Methods

A literature search was performed in six electronic databases, combining results from an original review (search dates up to March 2016) and an updated search (March 2016 to June 5th, 2025). The PICOS format was used to identify eligible studies:

Participants (P): Patients suspected of having cervical radiculopathy, with the diagnosis made clinically by a medical specialist and/or confirmed with medical imaging (MRI or CT).
Index Test (I): Physical examination tests used to identify cervical radiculopathy, for which diagnostic accuracy was assessed.
Comparator/Reference Standard (C): The results of the index test(s) were compared with a reference standard consisting of either (1) diagnostic imaging (MRI, CT, or myelography) or (2) surgical findings.
Outcome (O): Studies reporting diagnostic accuracy outcomes like sensitivity, specificity, positive predictive value, or negative predictive value were included.
Setting (S): Primary and secondary care cross-sectional studies were eligible.

Studies using electromyography (EMG) as the sole reference standard were excluded. Case-control designs including healthy controls were also excluded.

Data Analysis

Sensitivity, specificity, positive likelihood ratio (LR+), and negative likelihood ratio (LR-) were calculated. To enhance clinical utility, Positive Predictive Values (PPV) and Negative Predictive Values (NPV) were calculated across four potential pre-test probabilities (5%, 15%, 30%, and 50%). Fagan nomograms were also generated to visually demonstrate the shift in probability following a test.
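
As a rough illustration of the calculations described above, the sketch below derives likelihood ratios from a sensitivity/specificity pair and computes PPV and NPV at the four pre-test probabilities. The function names are illustrative and are not taken from the review. The input values are the pooled ULNT 1 estimates reported in the Results, so the derived LRs may differ slightly from the pooled LRs, which the review reports separately.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-); both inputs must lie strictly between 0 and 1."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def predictive_values(sensitivity, specificity, pretest):
    """Return (PPV, NPV) for a given pre-test probability (Bayes' theorem)."""
    true_pos = sensitivity * pretest
    false_pos = (1 - specificity) * (1 - pretest)
    false_neg = (1 - sensitivity) * pretest
    true_neg = specificity * (1 - pretest)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative input: pooled ULNT 1 sensitivity and specificity from the Results.
sens, spec = 0.70, 0.71
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")

# The four pre-test probabilities used in the review.
for pretest in (0.05, 0.15, 0.30, 0.50):
    ppv, npv = predictive_values(sens, spec, pretest)
    print(f"pre-test {pretest:.0%}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

The same test yields very different predictive values depending on the pre-test probability, which is why the review reports them for several scenarios rather than as a single figure.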

Results

Of more than 1300 studies retrieved, 8 were eligible for inclusion. Compared to the previous review, three new studies were included. Diagnostic accuracy was investigated for the following tests:

Spurling’s test

Five studies included Spurling’s test, but all used slightly different methods of performing it, leading to difficulties in interpreting the results.

Specificity: Low certainty evidence of high specificity, ranging from 0.84 to 1.00 (95% CI range: 0.56-1.00).
Sensitivity: Very low certainty evidence, ranging from 0.38 to 0.98 (95% CI range: 0.22-0.99).
The certainty of evidence for the LR+ and LR- was very low.

Upper Limb Neurodynamic Tension (ULNT) tests

Three studies investigating the ULNT 1 (median nerve bias) were pooled and gave very low certainty evidence:

Pooled Sensitivity: 0.70 (95% CI 0.60-0.79). (Low certainty evidence).
Pooled Specificity: 0.71 (95% CI 0.63-0.79). (Low certainty evidence).
LR+: 2.45 (95% CI 1.79-3.36).
LR-: 0.42 (95% CI 0.30-0.59).

Two studies were pooled for the combination of all four ULNT tests, with a positive result defined as at least one positive test, again with very low certainty of evidence:

Pooled Sensitivity: 0.97 (95% CI 0.88-0.99). This is classified as high sensitivity.
Pooled Specificity: 0.51 (95% CI 0.40-0.62). This is classified as low specificity.
LR+: 1.99 (95% CI 1.57-2.52).
LR-: 0.06 (95% CI 0.02-0.25)

In the discussion, the authors point to one large study of the four-ULNT combination, which reported an almost infinite LR+ when all four tests were positive. With three of the four ULNTs positive, this study reported an LR+ of 12.89, which helps rule the condition in. With the criterion of at least one positive test out of four, the LR- was 0.08, so a fully negative cluster (no positive ULNT) helps rule cervical radiculopathy out.

Shoulder Abduction sign

Two studies were pooled and gave very low certainty evidence:

Pooled Sensitivity: 0.49 (95% CI 0.39-0.60). This is classified as low sensitivity.
Pooled Specificity: 0.76 (95% CI 0.66-0.84). This is classified as moderate specificity.
LR+: 2.08 (95% CI 1.32-3.27).
LR-: 0.66 (95% CI 0.52-0.85)
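
To compare what the pooled likelihood ratios mean in practice, the sketch below converts them into post-test probabilities for positive and negative results. The LRs are the pooled values listed above. The 30% pre-test probability is one of the review's four scenarios, chosen here only as an example, and the function name is illustrative.

```python
def post_test_probability(pretest, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pretest / (1 - pretest)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pooled (LR+, LR-) values as reported in the Results above.
POOLED_LRS = {
    "ULNT 1 (median nerve)": (2.45, 0.42),
    "Four ULNTs, >=1 positive": (1.99, 0.06),
    "Shoulder abduction sign": (2.08, 0.66),
}

PRETEST = 0.30  # example scenario, not a recommendation

for name, (lr_pos, lr_neg) in POOLED_LRS.items():
    after_pos = post_test_probability(PRETEST, lr_pos)
    after_neg = post_test_probability(PRETEST, lr_neg)
    print(f"{name:26s} positive: {after_pos:.0%}   negative: {after_neg:.0%}")
```

With these inputs, only a negative result on the four-ULNT combination shifts the probability substantially (to below 5%), while positive results on any of the three leave it close to 50%. All of these figures carry the very low certainty of the underlying estimates.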

Arm Squeeze test

Evidence from only one study gave very low certainty evidence:

High sensitivity: 0.97 (95% CI 0.93-0.98)
High specificity: 0.97 (95% CI 0.95-0.98)

Traction test

Evidence from only one study gave very low certainty evidence:

Low sensitivity: 0.33 (95% CI 0.13-0.61)
High specificity: 0.97 (95% CI 0.83-0.99)

Neck Tornado test

Evidence from only one study gave very low certainty evidence:

High sensitivity: 0.85 (95% CI 0.74-0.93)
High specificity: 0.87 (95% CI 0.76-0.94)
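
Only sensitivity and specificity are listed here for the three single-study tests. As a rough aid, the likelihood ratios implied by those point estimates can be derived as sketched below. These are derived values, not figures reported in this summary, and they ignore the confidence intervals entirely, so they inherit the very low certainty of the single studies behind them.

```python
# (sensitivity, specificity) point estimates for the single-study tests above.
SINGLE_STUDY_TESTS = {
    "Arm Squeeze test": (0.97, 0.97),
    "Traction test": (0.33, 0.97),
    "Neck Tornado test": (0.85, 0.87),
}


def derived_lrs(sensitivity, specificity):
    """Return (LR+, LR-) implied by point estimates strictly between 0 and 1."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


for name, (sens, spec) in SINGLE_STUDY_TESTS.items():
    lr_pos, lr_neg = derived_lrs(sens, spec)
    print(f"{name:18s} LR+ = {lr_pos:5.1f}   LR- = {lr_neg:.2f}")
```

The derived values show why the Traction test is of interest only for ruling in: its implied LR+ is about 11, while its LR- of about 0.69 barely changes the probability after a negative result.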

Questions and thoughts

Although the risk of bias of the three newly included studies was lower than that of the five studies identified in the 2018 review, the evidence base remained of very low quality. How should we use the findings of this review in our own practice, then?

We can use these findings as a current “best evidence” synthesis to guide our clinical decision-making, but not as diagnostic proof. These tests are adjuncts to thorough history taking and a neurological examination of sensory, motor, and reflex changes. They can help in differentiating painful cervical radiculopathy from other causes of radiating neck and arm pain, but they cannot be used on their own to definitively diagnose or rule out cervical radiculopathy.

Rather, the tests can be used to support or refute the hypothesis you formulated during history taking. Let’s say a patient comes in with neck and arm pain. Here are two examples:

Example 1

This patient reports a nagging, diffuse ache in the trapezius and shoulder blade, with an occasional tingling down the lateral aspect of the arm. No weakness or specific sensory loss. Symptoms are aggravated mostly by prolonged sitting and general neck positions, but not by specific end-range neck movements (like combined extension/lateral flexion).

Because the pain is vague rather than consistently “lancinating” or “electric”, the sensory complaint is non-dermatomal, and there is no motor loss, you judge the pre-test probability of painful cervical radiculopathy to be low. You work in primary care and assume a pre-test probability of 20%. You therefore choose a test with high sensitivity, so that a negative result lets you rule out painful cervical radiculopathy with more confidence.

You use the combination of four ULNTs, and all are negative. With its high sensitivity (0.97) and low LR- (0.06), this combination is the best available test for ruling out painful cervical radiculopathy (SnNOut: a negative result on a highly sensitive test helps rule the condition out). On the nomogram, the post-test probability drops to almost zero, so your examination points toward somatic referred pain or a very mild, non-compressive nerve irritation.
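
The nomogram reading in Example 1 can be checked by hand. The sketch below walks through the odds conversion step by step, using the 20% pre-test probability assumed in the example and the pooled LR- of 0.06 for the four-ULNT combination. The variable and function names are illustrative.

```python
def probability_to_odds(probability):
    return probability / (1 - probability)


def odds_to_probability(odds):
    return odds / (1 + odds)


pretest_probability = 0.20   # assumed by the clinician in Example 1
lr_negative = 0.06           # pooled LR- for the four-ULNT combination

pre_odds = probability_to_odds(pretest_probability)    # 0.25
post_odds = pre_odds * lr_negative                     # 0.015
post_probability = odds_to_probability(post_odds)      # about 0.015

print(f"Post-test probability after a negative cluster: {post_probability:.1%}")
```

Under these assumptions the probability falls from 20% to about 1.5%, which matches the “almost zero” reading on the nomogram.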


Example 2

This patient reports a recent onset of “shocking” or “electric” pain radiating in a specific, narrow strip (dermatomal pattern) down the forearm and hand. She complains of clumsiness or a slight feeling of weakness (though objective weakness is not confirmed yet). The symptoms are easily aggravated by putting the head back and to the side, and often feel worse first thing in the morning.

You assume that, because the quality and distribution of pain strongly suggest a direct nerve root issue (radicular pain) and the weakness is a high-risk factor for underlying cervical radiculopathy (conduction block), the pre-test probability of a painful radiculopathy is 30%.

Here, you would choose a test with high specificity. Spurling’s test has a high reported specificity, although no single pooled value was provided. Your examination reveals a positive Spurling’s test. Because painful cervical radiculopathy is becoming more likely, you conduct a neurological examination. You find weakness in the C6 myotome, a sensory deficit in the C6 dermatome, and a diminished biceps brachii reflex, which raises your suspicion further. Three of the four ULNTs are positive, and you know that one large study in this review found an LR+ of more than 12 for this result. When you enter the data on the nomogram, you find a post-test probability of around 85%. You can now confidently refer the patient back to their general practitioner or specialist.
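
How much a positive result is worth depends heavily on where you start. The sketch below applies the LR+ of 12.89 (three of four ULNTs positive, from the single large study mentioned in the review) to the four pre-test probabilities used in the review. It considers this one test result only and does not attempt to combine it with Spurling’s test or the neurological findings, since that would require assumptions about how the tests depend on each other.

```python
def post_test_probability(pretest, likelihood_ratio):
    """Pre-test probability -> odds -> apply LR -> back to probability."""
    pre_odds = pretest / (1 - pretest)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


LR_POSITIVE_3_OF_4 = 12.89  # single large study; very low certainty

results = {
    pretest: post_test_probability(pretest, LR_POSITIVE_3_OF_4)
    for pretest in (0.05, 0.15, 0.30, 0.50)
}

for pretest, post in results.items():
    print(f"pre-test {pretest:.0%} -> post-test {post:.0%}")
```

At the 30% pre-test probability assumed in Example 2 this gives a post-test probability in the mid-80s, whereas the same positive result in a patient starting at 5% would still leave the diagnosis less likely than not.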

Talk nerdy to me

The most critical limitation of this study is the small number of studies available for each index test, restricting the evidence base. This led the researchers to use fixed-effect models instead of random-effect models, which limits the generalizability to other settings, populations or different test executions.

Ideally, a random-effects model is preferred because it better reflects clinical reality: it assumes that the true sensitivity, for example, differs depending on where the study was conducted.

In a primary care clinic (where patients have mild cases), the true sensitivity might be 80%, while the true sensitivity in a secondary care surgical clinic (where patients have severe cases) might be 95%.
Taking this variation into account, the random-effects model calculates a global average (e.g., 87.5%) and also estimates how much this “true sensitivity” varies between the different types of clinics. This makes the findings more generalizable: the average, read together with the estimated between-clinic variation, tells you what to expect in a new clinic, because the model accounted for real-world variation.

However, the systematic review was forced to use a fixed-effect model because the data were sparse: only a few studies were available for each test. The fixed-effect model assumes that there is one single true sensitivity across all studies and that any difference between studies is due only to random error.

It is forced to assume that the true sensitivity in the Primary Care Clinic must be the same as in the Surgical Clinic. It calculates a weighted average without trying to estimate the real-world variation between clinics.
Because the model ignored the known differences between the patient populations (e.g., secondary care vs. primary care) or differences in how the test was performed, the resulting pooled sensitivity is not generalizable.
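
The difference between the two models can be made concrete with a small sketch. It pools two hypothetical clinics (80/100 and 95/100 true positives, invented to mirror the 80% and 95% example above) using inverse-variance weighting on the logit scale, with a DerSimonian-Laird estimate of between-study variance for the random-effects version. This is a simplified univariate approach chosen for illustration. It is an assumption, not the model the review used, and diagnostic meta-analyses often use more elaborate bivariate methods.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


# Hypothetical (true positives, false negatives) per clinic; not real data.
STUDIES = [(80, 20), (95, 5)]


def pool_sensitivity(studies):
    """Fixed-effect and DerSimonian-Laird random-effects pooling on the logit scale."""
    y = [logit(tp / (tp + fn)) for tp, fn in studies]   # logit sensitivity
    v = [1 / tp + 1 / fn for tp, fn in studies]         # approximate variance
    w = [1 / vi for vi in v]                            # fixed-effect weights

    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se_fixed = math.sqrt(1 / sum(w))

    # Between-study variance (tau^2), truncated at zero.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c)

    w_random = [1 / (vi + tau2) for vi in v]
    mu_random = sum(wi * yi for wi, yi in zip(w_random, y)) / sum(w_random)
    se_random = math.sqrt(1 / sum(w_random))

    return {
        "fixed": inv_logit(mu_fixed),
        "random": inv_logit(mu_random),
        "se_fixed": se_fixed,
        "se_random": se_random,
        "tau2": tau2,
    }


result = pool_sensitivity(STUDIES)
for key, value in result.items():
    print(f"{key:10s} {value:.3f}")
```

In this toy example the random-effects standard error is several times larger than the fixed-effect one, because the model acknowledges that the two clinics differ. With only two or three studies, however, the between-study variance itself is estimated very imprecisely, which is the usual reason for falling back on a fixed-effect model when data are sparse.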

The certainty of the evidence was very low for all outcomes of all tests, primarily due to methodological shortcomings (risk of bias), broad confidence intervals (imprecision), and clinical heterogeneity. This implies that no solid conclusions can be drawn based on the available literature. All included studies were performed in secondary health care settings, which limits the applicability of the findings to primary care, as secondary care patients may have more serious complaints.

Take-home messages

The evidence on the diagnostic accuracy of physical examination tests for painful cervical radiculopathy is sparse, and the certainty of the evidence is very low for all outcomes. However, clinicians may use the outcome of Spurling’s test and the combined four Upper Limb Neurodynamic Tests (ULNTs) as an adjunct to clinical reasoning. A best evidence synthesis suggests that a positive Spurling’s test combined with a positive four-test ULNT cluster increases the likelihood of painful cervical radiculopathy. The criteria for a positive cluster vary: requiring at least one positive test out of four ULNTs is the most sensitive criterion (good for ruling out painful cervical radiculopathy), while requiring four out of four positive ULNTs is the most specific. A negative cluster together with a negative Spurling’s test makes painful cervical radiculopathy less likely. These findings are limited by the small number of studies, meaning the pooled estimates are valid only for the specific populations and tests studied in this review. They cannot be reliably generalized to other settings, such as primary care, as all studies were performed in secondary health care settings. The very low certainty of the current evidence underscores the urgent need for methodologically rigorous studies that can more definitively establish the value of physical tests in differentiating painful cervical radiculopathy.

Reference

Thoomes EJ, Arvanitidis M, van Geest S, van der Windt DA, Verhagen AP, de Graaf M, Kuijper B, Scholten-Peeters GGM, Vleggeert-Lankamp CL, Falla D. Diagnostic accuracy of physical examination tests for painful cervical radiculopathy: update of a systematic review and meta-analysis. BMC Musculoskelet Disord. 2026 Feb 13. doi: 10.1186/s12891-026-09551-0. Epub ahead of print. PMID: 41680685.

Ellen Vandyck

Research Manager
