Max van der Velden
Find out whether HIT is superior to MIT
Find out whether HIT is even effective at all
Get a critical view of a pragmatic study
Exercise therapy is frequently used by physical therapists in the treatment of chronic non-specific low back pain (CNSLBP). Current low- to moderate-intensity rehabilitation could provide an insufficient stimulus for these patients, who can present with physical deconditioning. High-intensity training (HIT) could improve outcomes.
Recruited participants were between 25 and 60 years of age and had localized pain below the costal margin and above the inferior gluteal folds, with or without nociceptive referred leg pain. Patients with known sinister pathologies, structural deformities, and/or a history of spinal surgery were excluded. Subjects were randomized into an experimental HIT group and a moderate-intensity training (MIT) group. Both groups received a 12-week exercise program consisting of 24 individual sessions. Exercises and the number of sets were identical in both groups; only the intensity differed.
Both groups received cardiorespiratory training on a cycle ergometer. The HIT group trained with intervals at 100% VO2 max, while the MIT group trained continuously at 60%. Both groups received progression criteria.
General resistance exercises were the following: vertical traction, leg curl, chest press, leg press, arm curl, and leg extension.
Core exercises were: glute bridge, glute clam, lying diagonal back extension, adapted knee plank, adapted knee side plank, elastic band shoulder retraction with a hip hinge.
The HIT group trained at 80% of their 1RM for 12 reps, and the MIT group at 60% of their 1RM for 15 reps. Three sets were performed for each exercise. The load was progressed when participants were able to perform more than the prescribed number of reps in two consecutive sessions.
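The progression rule above can be sketched in a few lines of Python. This is only an illustration, not the authors' protocol: the function name is hypothetical, and it assumes one rep count is logged per exercise per session (the article does not say how reps were tallied across the three sets). The size of the load increase is not reported here, so the sketch only answers whether the load should go up.

```python
from typing import Sequence


def should_progress_load(reps_per_session: Sequence[int], prescribed_reps: int) -> bool:
    """Return True when the two most recent sessions both exceeded the prescribed reps.

    Assumption: one logged rep count per session for a given exercise.
    """
    if len(reps_per_session) < 2:
        # A single session can never satisfy "two consecutive sessions".
        return False
    last_two = reps_per_session[-2:]
    return all(reps > prescribed_reps for reps in last_two)


# HIT prescription from the text: 12 reps per set.
print(should_progress_load([12, 13, 14], prescribed_reps=12))  # last two sessions both above 12
print(should_progress_load([13, 12], prescribed_reps=12))      # most recent session only matched 12
```

For the MIT group, the same check would be run with `prescribed_reps=15`.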
The primary outcome was disability, measured using the Modified Oswestry Disability Index (MODI). Secondary outcomes were pain intensity (NRPS), function (PSFS), exercise capacity (VO2 max), and muscle strength (abdominal flexion and back extension force output using an isokinetic dynamometer).
A general linear model was used to evaluate differences in each measure.
Thirty-eight participants were included (69% women), with a mean age of 44.1 ± 9.8 years. No baseline differences were found between groups except for trunk extension force output, which was higher in the HIT group. Mean session adherence was 22.3/24 and did not differ between groups. Three participants (one HIT, two MIT) dropped out due to sickness; one of them (MIT) was still analyzed halfway through the protocol.
MODI improved by 14.6% (absolute reduction) in the HIT group and by 6.2% in the MIT group. This difference was statistically significant, but of arguable clinical significance.
First things first: this was a good trial that addressed its question with decent methodology. Exercise parameter reporting isn’t something we see as much as we should (i.e. always). Both groups improved from baseline; however, as the authors noted, there was no control group. Luckily, given the chronic nature of the participants’ complaints and the abundance of research on the natural history of chronic low back pain, one wasn’t really needed. Although some would argue that the participants need co-modalities like education, it could be a good thing these were left out. This way, the point estimates we’re seeing are not “muddied” by other interventions.
The external validity of the study is rather high. You could immediately implement it in your private practice. Although some expensive gym equipment is used, one could argue that about the same results would be achieved with free weights, although that requires another study. Looking at the exercise program, we’re seeing lots of strict machine strengthening exercises. Would results be superior if the authors had implemented compound exercises, such as a squat, a deadlift, or work on a Roman chair? Maybe; we don’t really know.
Another point to be made is that the subjects had generally low disability (22.8 and 18.8/100 MODI). Are we seeing the potential for a floor effect here?
The authors conclude that greater improvements were found for HIT compared to MIT for disability (MODI) and exercise capacity (VO2 max), although one cannot be certain about that. One of the issues is the noted between-group difference of 8.6% on the MODI. An argument could be made that this does not reach the threshold for clinical relevance. HIT could be implemented, but its superiority is questionable at this time. As to exercise capacity, the study was simply not powered to make conclusive statements about this, or any other secondary measure.
At the end of the day, this was a much-needed study. HIT seems safe and perhaps non-inferior to MIT. Larger studies with robust methodology might give some clarity.
From a methodological standpoint, some alterations could be made for the future. It’s important to calculate your study power a priori, meaning up front. Since there’s an overwhelming amount of research on low back pain with identical primary outcomes, the researchers could’ve easily done this. The researchers themselves even conducted a feasibility study with a similar protocol, published one year before. The study had enough power to detect a difference of 10 points on the MODI (100 total points). However, the noted post-intervention between-group difference was 8.6%, which falls short of that 10-point mark. The researchers follow up with post hoc power calculations for specific outcome measures, which are mathematically redundant: observed power is fully determined by the p-value that is already reported.
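To make the idea of a priori power concrete, here is a minimal sketch of a textbook sample-size calculation for comparing two group means, using the normal approximation n = 2 * ((z_alpha + z_beta) * SD / delta)^2 per group. This is not the calculation the authors ran. The function name and the standard deviation used in the example are purely hypothetical placeholders; only the 10-point and 8.6-point differences come from the text above.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided, two-sample comparison.

    delta: smallest between-group difference worth detecting (e.g. MODI points)
    sd:    assumed common standard deviation of the outcome
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# HYPOTHETICAL standard deviation, for illustration only (not taken from the study).
ASSUMED_SD = 10.0

print(n_per_group(delta=10.0, sd=ASSUMED_SD))  # the difference the study was powered for
print(n_per_group(delta=8.6, sd=ASSUMED_SD))   # a smaller true difference needs more people
```

The point of the sketch is the direction of the effect: the smaller the difference you want to detect, the more participants you need, which is why the target difference has to be fixed before the trial starts.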
There’s quite a list of secondary outcome measures. Notice that when one calculates study power, it’s for one outcome measure at one time point. All others are merely suggestive. Low power, which is obviously the case for the secondary outcome measures in this study, raises the risk of false-negative results and, indirectly through the multiple comparisons problem, of false-positive results. Since the study was powered for the MODI (10-point difference), conclusive statements outside of this measure can be ignored. However, they do offer suggestions for further research. When authors measure multiple outcomes, correcting for false positives should be a priority. This was not done here, as in many clinical trials. A simple Bonferroni correction, which would limit some of these errors, would result in a p-value threshold of around 0.00714, which in turn would mean every between-group difference would lose statistical significance.
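The Bonferroni threshold mentioned above is simply the overall alpha divided by the number of comparisons. A minimal sketch follows, assuming the conventional alpha of 0.05 and seven comparisons; both numbers are inferred from the quoted threshold (0.05 / 7 ≈ 0.00714) and are not stated explicitly in the text. The function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-test significance threshold after a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def is_significant(p_value: float, alpha: float, n_comparisons: int) -> bool:
    """True if a single p-value still clears the corrected threshold."""
    return p_value <= bonferroni_threshold(alpha, n_comparisons)


# Assumed inputs: alpha = 0.05 and 7 comparisons (inferred, see lead-in).
print(round(bonferroni_threshold(0.05, 7), 5))  # 0.00714
```

Any between-group p-value larger than this threshold, even one below the usual 0.05, would no longer count as significant after the correction.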
Nonetheless, HIT might well be feasible for CNSLBP, but larger trials are much needed.