Discussion in 'Health News and Research unrelated to ME/CFS' started by Andy, Sep 6, 2020.
Open access, https://physoc.onlinelibrary.wiley.com/doi/full/10.14814/phy2.14564
I wonder how they report the patients who are too sick from PEM to turn up on the second day.
Notably, there was no difference in work rate at the Gas Exchange Threshold (VT1) in the GWS patients, suggesting that CFS and GWS are different!
The authors talk about poorer test-retest validity for the work rate at VT1, but they are ignoring two facts:
(1) the test-retest validity in healthy participants was high;
(2) the mean work rate in GWS patients increased on the second day (though not statistically significantly). If you hypothesise that the work rate will decrease in the patient group, then you apply a one-tailed analysis, and this would alter the SRD statistic.
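For what it's worth, the one-tailed point can be sketched numerically. This takes the paper's reported two-sided SRD of 23.73 W at face value and simply swaps in a one-sided critical value; the formula SRD = z × SEM × √2 is the standard one, which may not be exactly what the authors computed.

```python
from statistics import NormalDist

# Paper's reported two-sided SRD for work rate at the GET (watts)
srd_two_sided = 23.73

z_two = NormalDist().inv_cdf(0.975)  # ~1.960 (two-tailed 95%)
z_one = NormalDist().inv_cdf(0.95)   # ~1.645 (one-tailed 95%)

# SRD = z * SEM * sqrt(2); back out SEM * sqrt(2) from the two-sided figure
sem_sqrt2 = srd_two_sided / z_two

# A one-tailed "meaningful reduction" threshold is noticeably smaller
srd_one_sided = z_one * sem_sqrt2
print(round(srd_one_sided, 1))  # → 19.9
```

So under a one-tailed analysis the threshold for a "real" reduction drops from about 23.7 W to about 19.9 W, with no change to the underlying measurement error.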
As an aside, the GWS patients had a significantly greater respiratory frequency on the second day, despite the work rate and VO2 being the same. I point this out for those who incorrectly believe that the Gas Exchange Threshold (ventilatory threshold 1) can be significantly shifted by an alteration in voluntary breathing rate (or who mistakenly think that the first ventilatory threshold has anything to do with a feeling of being out of breath...).
This comment rather misses the point, given that the drop in WR at the GET/VT1 is the key finding. Specifically, their study actually demonstrated that this measure does have high (but not perfect) retest validity after 24 hours in healthy participants. A "valid" maximal effort on the second day isn't needed to measure this drop.
Given the lack of significant drops in WR at the GET/VT1 in healthy participants in all the other studies, this implies that the retest validity is high, even without performing specialised statistics. If the goal is clinical usefulness in differentiating this condition from other fatiguing conditions, then the key measure is specificity for a (pre-defined) threshold. Similar to what Nelson et al. proposed.
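To illustrate the specificity idea: if the day-2 change in WR at the GET were roughly Gaussian in controls, a pre-defined drop threshold directly implies a specificity. The mean and SD below are purely illustrative assumptions, not values from any of the studies.

```python
from statistics import NormalDist

# Illustrative (assumed) distribution of day-2 change in WR at GET for controls
controls = NormalDist(mu=0.0, sigma=9.0)  # watts

def specificity(drop_threshold: float) -> float:
    """Fraction of controls NOT flagged, where 'flagged' means a drop
    of at least `drop_threshold` watts (i.e. change <= -drop_threshold)."""
    return 1.0 - controls.cdf(-drop_threshold)

print(round(specificity(18.0), 3))  # → 0.977
```

The point is that once the threshold is fixed in advance, specificity (and sensitivity, given patient data) follows directly, which is what makes the Nelson et al. framing clinically interpretable.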
No, this conclusion is a non sequitur! Based on that statement, I don't think these authors understand the physiological determinants of VO2Max. Peripheral fatigue does not cause a drop in VO2Max! Only a significant reduction in blood volume, a lower peak heart rate, or a significant increase in obstruction of the lungs will cause a drop in VO2Max on the second day. And PEM is not simply post-exertional fatigue.
From the last author, a recent publication:
"Post-exertional malaise in veterans with gulf war illness"
The latter point suggests a heterogeneous group and may go some way in explaining their finding of a low Intraclass Correlation Coefficient for WR at the GET in the GWI group in the CPET study.
Also, I don't fully understand why they used the "smallest real difference" approach, or why they use the figure for the GWI group rather than the controls. Given such a small sample size, I don't understand how they calculated it. It is supposed to be based on the 95% CI of the standard error of the difference scores. This approach seems to make non-obvious assumptions about the distributions and the specificity threshold...
For the "valid effort" GWI group, the mean difference score for the WR at GET was 0.63 (95% CI of −4.09, 5.35).
The standard deviation of this, assuming a Gaussian distribution, is ~9.3, with a standard error of the mean of ~2.4. So the SRD of 23.73 seems rather large (it is about 2.5 standard deviations, i.e. over 99% specificity). But if the goal is to differentiate patients from controls, then it is the SRD for the controls that is relevant, not the SRD for the patients (and the mean difference data were not provided for the controls), so...
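The back-of-the-envelope arithmetic above can be checked directly. The mean difference and CI are from the paper; n = 15 for the "valid effort" group is my assumption from the thread.

```python
import math

mean_diff = 0.63                 # mean day-2 change in WR at GET (watts)
ci_low, ci_high = -4.09, 5.35    # reported 95% CI
n = 15                           # assumed size of the 'valid effort' GWI group

half_width = (ci_high - ci_low) / 2      # 4.72
sem = half_width / 1.96                  # SE of the mean difference, ~2.41
sd = sem * math.sqrt(n)                  # implied SD of difference scores, ~9.33

# How many SDs of the difference scores is the reported SRD of 23.73?
print(round(sem, 2), round(sd, 2), round(23.73 / sd, 2))  # → 2.41 9.33 2.54
```

So the reported SRD sits about 2.5 implied SDs from zero, which is the basis for the "seems rather large" comment.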
What do you reckon was going on with the failure to meet valid peak effort @Snow Leopard? It looked like lots of the participants failed, both GWS and controls. I wouldn't have thought that was normal.
I've commented before on both controls and CFS patients failing to reach a true maximal effort - low peak heart rates, for example.
I've had a conversation with Max Nelson about this, he believes that it can often be a result of insufficient (or inconsistent across participants) encouragement to reach a true VO2Max.
From personal experience, I think it can be difficult for participants suffering from unusually high fatigue to reach a true peak, because it requires greater effort (a greater proportional recruitment of motor units) for them to achieve a true VO2Max than for a healthy person. This is also reflected in the rate of change of scores on the Borg RPE scale.
I'd also like to point out that VO2Max is maximal only in the sense of oxygen consumption (which is always limited by either the lungs or the amount of blood the heart can pump to the muscles), not motor unit recruitment. The power output at VO2Max can be around 20-25% of what can be achieved in a short 8-second burst, and less than 50% of what can be achieved in a 30-second Wingate test.
The Workwell Methods paper said this about maximal effort - interestingly, this list does not include the 85% of age-predicted maximum described above.
A high RER is not enough. RPE could be biased due to reporting issues (unfamiliarity with exercise, a high baseline RPE), and a high peak blood lactate could occur before VO2Max if the person has a mitochondrial disease.
I agree with the effort criteria of the above study - RER > 1.1 and a plateau are the key criteria. The 85% of predicted maximum heart rate is something of a fudge, for when the participant fails to reach the other two criteria and the researchers are willing to consider it close enough.
Though I wouldn't call the criteria "valid effort", I'd call it a "valid VO2 peak" as there are other *valid* reasons why a participant may fail to reach a true max.
This quote is from the 2018 Nielsen 48 and 72 hour 2xCPET study (Hodges)
Maybe that is why the Workwell people don't use heart rate as one of the criteria for maximal effort?
I just thought the percentages of people who got to an RER of 1.1 seemed really low across the board in this Falvo study - 50% to slightly more than 60% for both GWS patients and controls. When I did the 2xCPET, getting to the point of RER = 1.1 did not feel as though it took an enormous effort.
It works the other way too. On the first CPET my heart rate peaked at 110% of age-predicted maximum. I reached a new VO2Peak around 2.5 minutes after reaching an RER of 1.1, an RPE of 18, and 85% of my age-predicted HR. If I had stopped after reaching two of the three criteria, I would not have reached my true VO2Max.
What did it feel like? The hardest exercise I had ever done in my life, I had never felt my heart beat so hard or so fast. All I was thinking about at the time was "what does a true VO2Max feel like, when should I stop"? When I started to feel dizzy due to oxygen levels dropping in my brain, I knew I had reached a true maximum and had to stop. (and I used to enjoy racing up mountains on my bike as a child before I became ill)
I suspect those who have little experience at doing very hard exercising would not push themselves to quite this point (hence why studies always refer to it as a VO2Peak, not VO2Max as they don't know if this is a true maximum).
It's all sounding a bit random. When I was told to stop I was certainly not thinking 'this is the hardest exercise I have ever done in my life', not even close.
Maybe the ventilatory threshold is where we should be looking for differences, given the VO2 peak is assessed on such moveable criteria?
Yes, exactly. I've been saying that since Nielsen 2018!!!!
I don't much care whether the patients reached their true peak, so long as the exercise was sufficiently hard to trigger the physiological effects leading to the reduction in WR at the VT1 on the second day.
On the second day, I don't even care if the participants barely make it past the VT1.
I have an alternative protocol in mind that doesn't actually require patients to reach a true VO2Max on either day, though that is not to say the exercise will be easy (it will still take a similar amount of time too).
It takes me a while. And then I mostly forget. And so it takes a bit longer.
Revisiting the "smallest real difference" calculations, they seem to have made several faulty statistical assumptions.
Firstly, the 1.96 in the formula (SRD = 1.96 × SEM × √2) refers to the fact that 95% of the area of a Gaussian distribution lies within 1.96 standard deviations of the mean. This assumes that the error has a Gaussian distribution, which requires a sufficiently large sample size (not the N=15 or N=8 in this study!).
In the online stats textbook, it states that this assumption is reasonable for N > 100, but with small sample sizes the t distribution is leptokurtic (it has heavier tails).
Hence we cannot conclude much from the calculation, due to the low sample sizes.
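To put numbers on this: replacing the large-sample 1.96 with a t critical value at these sample sizes widens the multiplier appreciably. Using df = n − 1 is one reasonable choice here; the correct degrees of freedom would depend on how the SEM was actually estimated. (This sketch assumes scipy is available.)

```python
from scipy.stats import t

# Two-sided 95% critical values for the small samples in the study,
# versus the Gaussian value of 1.96 used in the SRD formula
for n in (8, 15):
    t_crit = t.ppf(0.975, df=n - 1)
    print(n, round(t_crit, 3))
# → 8 2.365
# → 15 2.145
```

So at N=8 the appropriate multiplier is roughly 2.36 rather than 1.96, i.e. the Gaussian assumption understates the uncertainty by about 20%.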
The second flaw is that the "smallest real difference" calculation is agnostic as to whether there is a rise or a fall between the two tests. We are only interested in reductions on the second test, so the real question is how much of a reduction is meaningful, not merely how much of a change. If there is a test-retest bias towards the upside (in healthy controls or rested patients, due to participants becoming acclimatised to the test), this increase will actually widen the SRD value, even though it is only reductions we are interested in. This is obviously undesirable in a statistical approach.
Lastly, it makes no sense to perform this analysis for ME/CFS patients on two consecutive-day tests - it only makes sense for controls, or for patients who have had sufficient rest days between the two tests.
This is why the sensitivity/specificity approach by Nelson et al. is much more justified than this "smallest real difference" approach.
I note that Davenport et al. attempted to replicate the SRD approach, but suggested flaws of this approach in the discussion: