Use of EEfRT in the NIH study: Deep phenotyping of PI-ME/CFS, 2024, Walitt et al

Thanks @bobbler for your useful analysis.

Here's a bit more:

The Effort-Expenditure for Rewards Task (EEfRT)



So, yes, Method in the Walitt et al study

For each choice, the participant chooses one of two tasks:
  • the easy task, involving 30 button presses in seven seconds with the dominant index finger
  • the hard task, involving 98 button presses in 21 seconds with the non-dominant little finger

Each choice of task has a reward of a specified value and a probability of winning it if the task is completed. It sounds as if the probabilities are kept the same within a choice, but vary between choices. It isn't clear to me if, in the Walitt et al study, the probabilities tended to increase towards the end, or if participants were told that they would (both of which seem to be the case with the test validation study that bobbler linked).

So, the first choice might look like (and I'm just making this up)
Choice 1a: Easy task reward $2.00; probability of winning it if easy task completed - 25%
Choice 1b: Hard task reward $3.50; probability of winning it if hard task completed - 25%
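To put rough numbers on that made-up example, here is a minimal sketch in Python of the expected value of each option. It assumes the task is completed and that reward and win probability are the only inputs. The function name is mine, not anything from the paper, and the figures are the invented ones above.

```python
def expected_value(reward_dollars, win_probability):
    """Expected payout of one trial, assuming the task is completed."""
    return reward_dollars * win_probability

# Invented example values from the post above, not from the study.
easy = expected_value(2.00, 0.25)
hard = expected_value(3.50, 0.25)

print(f"easy: ${easy:.3f}, hard: ${hard:.3f}, difference: ${hard - easy:.3f}")
```

This ignores the cost of the effort itself, and the fact that only two trials are paid out at the end, so the real incentive per trial is smaller still.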

Once Choice 1 is done and the participant finds out if they won the reward, the participant goes on to other choices, until the 15 minutes are up.
At the end, two rewards are chosen randomly, and the participant receives the sum of those two rewards.

From bobbler's posts, it appears that the test in the Walitt et al study differed from the standard method in crucial ways, including the outcome measured i.e.
  • ratio of hard tasks to easy tasks, or
  • a measure that takes into account variation in the propensity to choose the hard tasks based on the probability of receiving the reward
It is not at all clear to me yet if or how probabilities and rewards varied between tasks, if participants were told about the variation and how any such variation was accounted for in the analysis.


Results
[Attached image not shown]

We aren't shown the actual data in Figures 3a-c, just confidence intervals. I think that's a problem. The statistics used to produce the confidence intervals could be worth looking at. If the probability of choosing a hard task is much lower (and it was especially towards the end of the trial), there should be less data in Fig 3c, and I would have thought that less data would make the confidence intervals bigger. But, they are not.
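To illustrate why I'd expect wider intervals with less data, here is a minimal sketch using the textbook normal-approximation 95% interval for a mean. The spread and sample sizes are hypothetical, and I don't know that this is how the paper's intervals were calculated.

```python
import math

def ci_half_width(std_dev, n, z=1.96):
    """Half-width of a normal-approximation 95% CI for a mean."""
    return z * std_dev / math.sqrt(n)

# Hypothetical numbers: same spread, different amounts of data.
for n in (100, 25, 4):
    print(n, round(ci_half_width(1.0, n), 3))
```

With the same spread, a quarter of the data doubles the width of the interval. If the bands in Figure 3 come from a model fitted across all trials rather than from per-trial means, they would not necessarily behave like this, which is another reason to want the underlying data.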

Figure 3a - Probability of choosing the hard task
The probability of choosing the hard task was a bit lower in the ME/CFS group than in the healthy group, throughout the trial. It wasn't actually that big of a difference. The difference didn't change over time, which I think the authors concluded meant that there wasn't more fatigue in the ME/CFS group. But actually, if your dominant index finger was getting tired, you might choose an occasional hard task using another finger, to give the dominant index finger a break for a while.

Figure 3b - Button press rate for easy tasks

What they mean there is that people with ME/CFS got slower with the button pressing for the easy tasks over time, but were still fast enough to reliably complete it. That is what the authors see as 'pacing'. And that approach makes good sense, because the participants were not rewarded on the number of tasks that were completed - however many they did, the computer would only pick two rewards at random to pay out on. The only reason you might want to get through the easy tasks fast is to get to later tasks that might have a higher reward.

(I think it's questionable how big an incentive something like $5 would be to a participant with ME/CFS who probably wanted to make sure that they could get through all of the studies, many of which probably seemed a lot more important than making choices about button pushing. I assume it was clear to the participants that this wasn't meant to be a test of fine motor control. So, would you really go all out, concentrating and pressing buttons for 15 minutes, in that context, after you had completed the two tasks needed to qualify for some reward, knowing that this was a government study and so it was very unlikely that later rewards would be a lot more than what you had already secured? There was a lot less on the line for the healthy participants, and so they probably felt freer to fully engage in the game. In terms of effort preference then, I think the people with ME/CFS may well have been expressing a very reasonable effort preference, given the circumstances. )

It is possible that fatigue played a part in the decline in button pushing rate and the substantial difference in button pushing rate between the two groups seen in later easy tasks - I don't see how the authors can rule that out.

Figure 3c - Button press rate for hard tasks
Most people undertaking the hard tasks pressed the button fast enough to complete them. Both groups showed a very slight increase in press rate with experience. But Figure 3a tells us that the people with ME/CFS were unlikely to choose to do the hard task, especially towards the end, so I expect that there wasn't much data behind that finding.

(See my note about the confidence intervals above - I think they are generalised in a way that conceals the paucity of data at different trial numbers. It would have been better to show the button press rates as plotted data points, with the x axis being time. Maybe such a chart can be constructed using the source data.)

****
Frankly, I find this task a joke in terms of providing insight into the supposed pathological psychology of people with ME/CFS. I feel appalled that conclusions are being made about the 'effort preference' of people with ME/CFS on the basis of this 15 minute study with 15 ME/CFS participants.


Aha thanks @Hutan for the helping hand (before I tie myself in knots) and apologies if I missed the link here of the button-pressing stuff with the point about the fatigue checks relating to the following from Treadway et al (2009):

"Effects of fatigue during the EEfRT
An important requirement for the EEfRT is that it measure individual differences in motivation for rewards, rather than individual differences in ability or fatigue. The task was specifically designed to require a meaningful difference in effort between hard and easy-task choices while still being simple enough to ensure that all subjects were capable of completing either task, and that subjects would not reach a point of exhaustion. Two manipulation checks were used to ensure that neither ability nor fatigue shaped our results. First, we examined the completion rate across all trials for each subject, and found that all subjects completed between 96%-100% of trials. This suggests that all subjects were readily able to complete both the hard and easy tasks throughout the experiment. As a second manipulation check, we used trial number as an additional covariate in each of our GEE models."

So yes, we can't be sure either way on the first manipulation check, as those with ME/CFS could have paced their way around any completion issues.

And we have not been provided with either reassurance or sufficiently granular data regarding the second check?
 
If some of the questions are not asked, or not regarded as important, in other diseases of energy limitation, would that be part of the reason for considering them technically illegitimate?

I am thinking more in the terms bobbler is raising, that the measurements of effort choices in this NIH study aren't being taken in a context that can be interpreted as meaning anything at all. The experiments on effort preference are pretty complicated and obscure. They involve what are more or less gambling choices. People vary a lot in how they approach such choices and I think in the study context it is pretty hard to know what they would mean. But it looks as if the study does not even bring in the weighting factors that would make the choices interpretable. Someone else's sophisticated method has been half-used - producing something meaningless.
 
"People vary a lot in how they approach such choices and I think in the study context it is pretty hard to know what they would mean. But it looks as if the study does not even bring in the weighting factors that would make the choices interpretable."
Yes! For them to conclude like they do, risk has to be viewed as a constant and solely tied to the actual probability of getting the reward. It would have made just as much sense to assume that the two groups will seek to take the same risk, and that the difference in what the authors call ‘effort preference’ in reality mirrors the real risk (adding in probability of avoiding PEM etc.). If they were versed in ME literature, the latter would actually be a much better model.
 
"I am thinking more in the terms bobbler is raising, that the measurements of effort choices in this NIH study aren't being taken in a context that can be interpreted as meaning anything at all."

Bobbler has done an excellent job highlighting the changes. Personally, I can imagine as a healthy person not having an interest in straining my non-dominant little finger to get a couple of bucks. It seems like a ridiculous test that could be interpreted any number of ways. Did they control for arthritis in everyone's pinkies? I mean, there are any number of reasons people might or might not want to rush to do a harder task.
 
For me, the fundamental problem with the “Effort Choice” task, and the interpretation of the easy task/hard task ratio in this paper, is that you could have predicted the outcome before the study. The groups do not come into this task on an equal footing.

Normal volunteers come in with nothing to lose and play a trivial financial game – a bit of fun.

Patients come in with ME/CFS and the following:

· they have learned from their own experience that regardless of how satisfying it might be to complete a task quickly, or how enticing a reward, they often have to forego these short-term satisfactions and rewards in order to manage their condition, in order to get more, smaller rewards long-term, in terms of being able to do a certain amount of activity rather than being crashed

· they are counselled by doctors and other health professionals to pace their activities, to be the tortoise and not the hare – the clinicians from whose practices these patients were drawn all, to my knowledge, advise their patients to pace

· the game might have been viewed by some ME/CFS participants as a test of whether they were pacing or whether they were somewhat reckless even when given a choice, such that the “right answer” was to choose quite a few easy tasks or achieve a balance of easy and hard tasks

· they are playing this game at some point in a week of testing which would be gruelling at any level of ME/CFS, and where their ability to pace is extremely limited – most of the time (perhaps all of the time except in this task), they just have to do the exact same as the healthy volunteers. To choose some easy tasks would have been wise, in order to increase their chances of being able to complete the week in some sort of decent shape.


For all of those reasons, I would have expected patients to choose more easy tasks than controls, regardless of whether they thought they could do the tasks in the moment or not.

So the test does not show that they thought they couldn’t do the tasks but actually they could. It shows that people with ME/CFS are coming in with a different reward and penalty system than healthy volunteers. Or more accurately, it shows nothing.

Patients are balancing these trifling rewards of $2-$8.42 with the penalty of PEM if they get it wrong, during a week when researchers are repeatedly and explicitly trying to trigger PEM. For controls, there’s no penalty.
 
a) Twitter user Zedsrigil (Sunsweptforest) wrote a good thread on the use of the EEfRT, making clear how ridiculous it is.
Twitter.

Friends. Welcome to my ZedTalk(TM). Today I want to speak about “effort preference,” because that is the termite-infested foundation on which the Paper-Which-Shall-Not-Be-Named is constructed. Effort preference in this paper is defined as “how much effort a person subjectively wants to exert.” The authors warn us that, “as fatigue develops, failure can occur because of...an unfavorable preference.”

But how do they measure this remarkable concept, so key to their ground-breaking analysis? They use the Effort-Expenditure for Rewards Task (EEfRT), an instrument designed to explore effort-based decision making in people suffering from Major Depressive Disorder (MDD), anhedonic subtype (Treadway et al 2009). The anhedonic subtype of MDD is characterised by “aberrant motivation and reward responsivity”--ie, a lack of “willingness to expend effort for rewards” due to “decreased motivation for and sensitivity to rewarding experiences.” Anhedonia is used to distinguish MDD from other psychiatric disorders.

But wait! Surely ME isn’t...MDD? No indeed. In fact, the authors of The Paper carefully (rigorously!) screened out participants who had additional diagnoses that could confuse the results of the study.

This is not unlike taking an instrument designed to measure Fear of Swimming--as a subset of Fear of Water--and asking people who’ve been pre-screened and found *not* to suffer from Fear of Water to complete it...then using the results to claim they suffer from Fear of Swimming whilst also deliberately ignoring the fact that these people are all wearing casts bc they have broken limbs, so their behavior during the exercise will reflect the fact that they cannot currently swim.

It is patently, flagrantly absurd.

To be very clear, the authors selected people to study who are very sick and then deliberately applied an inappropriate tool that assumes *low motivation* and *lack of appreciation of rewards* leads to less effort. Results of the activity are interpreted accordingly. (Amusing in context of ME, where ppl are more likely to behave like the Black Knight--continuing to push thru even as their life blood drains away--than a couch potato.) [Monty Python's Quest for the Holy Grail clip here]

Anyway, the whole paper is constructed around this, and it’s actually totally irrelevant to ME.

The authors even run regressions on the EEfRT results and attempt to correlate all the biophysical findings with this very particular key trait of MDD, piling a statistical disaster on top of the dumpsterfire they made of research design.

The way they discuss the findings also tacitly implies some causality (suggesting that effort preference for non-effort leads to deconditioning over time leads to, you guessed it, disability), although they are careful to say they can’t make causal claims.

As anything other than a joke or a teaching tool (perfect for undergrad methods classes!), this paper is an outrage.

So, outrage thread to follow.




Thread Reader App: Thread by @sunsweptforest


b) I'm sorry if I'm repeating comments, but physical disability is not anhedonia FFS. (What an incredibly cruel notion.) Why is this study using a tool that was specifically designed for a subcriterion of depression, to "test the relationship between anhedonia and putative reward 'wanting' in humans", and why is it giving its results such a prominent place, woven throughout the paper? (I know, Walitt believes ME is hysteria etc., etc., but who let him?) Link to Treadway et al presenting the EEfRT.

You do not take this illness seriously at all if you are still preferring to focus on behaviour and "motivation" as the key things that should be studied to solve the problem. You also demonstrate that you have not listened at all to the colossal megamass of people with lived experience who told you that these things are not the problem.

Also: I don't see how you can claim to have found "effort preference" from the EEfRT. I can see how it might observe anhedonia in people with major depressive disorder, where you're not really motivated to push a bit harder for an extra dollar because of the depression, but that's not the same as securely measuring a "preference" of effort in non-depressed people, let alone disabled ones. (And don't get me started on that it justifies then proclaiming that "preference" the core issue.)

ME patients' lives are hard, and they have to make tough choices and accept painful limitations on their activities daily. You're not measuring any "effort preference", "how much effort a person subjectively wants to exert", in their daily lives from whether a couple of people are inclined to push a button a tad faster for $3.23 instead of $2.05. Nor can you claim that they have "avoided" the harder task which would have gotten them $3.23.

The paper describes "effort preference" as "the decision to avoid the harder task when decision-making is unsupervised and reward values and probabilities of receiving a reward are standardized".

So this paper made up the term "effort preference" and describes it as the decision to avoid harder tasks when no one is looking. It's incredibly disablist and offensive to coin such a term to describe people with a serious disabling physical illness.

(It's also a weird explanation as it's negative-based, in that it means "avoidance of hard things"; "effort preference" as a term in itself should be neutral, no? And then theoretically you could have a high or low effort preference. But hey, when you're busy putting the glittery new words on your same-old presentation of ME as activity avoidance I guess one might overlook that.)
 
Walitt et al said:
"Effort preference....was estimated using the Proportion of Hard-Task Choices (PHTC) metric."

Walitt et al said:
The primary measure of the EEfRT task is Proportion of Hard Task Choices (effort preference). This behavioral measure is the ratio of the number of times the hard task was selected compared to the number of times the easy task was selected. This metric is used to estimate effort preference...
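Note that the quoted description calls the measure a "proportion" but then describes it as hard choices compared to easy choices, which reads more like a ratio. A minimal sketch of the two readings, with made-up counts and function names of my own (I don't know which one the authors actually computed):

```python
def proportion_hard(n_hard, n_easy):
    """Hard choices as a share of all choices made."""
    return n_hard / (n_hard + n_easy)

def ratio_hard_to_easy(n_hard, n_easy):
    """Hard choices relative to easy choices."""
    return n_hard / n_easy

# Hypothetical participant: 15 hard choices and 30 easy choices.
n_hard, n_easy = 15, 30
print(proportion_hard(n_hard, n_easy), ratio_hard_to_easy(n_hard, n_easy))
```

The two give different numbers for the same participant, so it matters which one is meant. Neither takes the reward value or win probability of each trial into account.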

Again, the EEfRT focusses on anhedonia in people with major depression. And for that it uses "the proportion of hard task choices" as a metric. The original study found that "the proportion of hard-task choices correlated inversely with Chapman Anhedonia score" and the Beck Depression Inventory (BDI) melancholy items.

So they took a tool that was not designed to measure why disabled people are not functioning at a level of healthy people, and applied its metric to the group it was not intended to be used on.

Walitt et al said:
Together these findings suggest that effort preference, not fatigue, is the defining motor behavior of this illness.
Walitt et al said:
Fatigue is defined by effort preferences
'Because some ME patients weren't particularly motivated to push a button a bit faster for a couple of dollars we are now declaring that the defining movement of this illness [by which I guess they mean the defining behaviour leading to disability] is that patients are not motivated to do more difficult things.'

(Linguistically, can an illness have "motor behavior"?)


There might be interesting data to be had from this paper, but there are little effort preference wees all over it and I'm appalled at the people who agreed to present their work alongside this. They might be doing good and capable work, but apparently they were also willing to whitewash this disablist and low quality crap by giving it an impression of respectability via letting it lift along with them - they can't be ignorant of it causing serious harm.
 
I haven't been able yet to read the whole paper or study the effort preference nonsense, so I may be repeating what others have said.

Choosing the easier option would likely have been necessary for me after a few minutes, as I would not be able to continue high speed repetitive finger presses for the whole 20 minutes.

Muscle fatiguability is real and prevents me handwriting more than a cheque, and slows down my typing speed and accuracy if I don't take breaks every few minutes. It's not just about pacing to avoid future crashing, it's immediate and increasing debilitation.
 
"Choosing the easier option would likely have been necessary for me after a few minutes, as I would not be able to continue high speed repetitive finger presses for the whole 20 minutes."

I understand the EEfRT should be set up to avoid such limitations, as it is meant to measure purely motivation in MDD patients, but I was wondering how they would have done so in Walitt et al.

Treadway et al said:
Effects of fatigue during the EEfRT

An important requirement for the EEfRT is that it measure individual differences in motivation for rewards, rather than individual differences in ability or fatigue. The task was specifically designed to require a meaningful difference in effort between hard and easy-task choices while still being simple enough to ensure that all subjects were capable of completing either task, and that subjects would not reach a point of exhaustion.

(In Treadway et al participants would also play four practice trials before starting to play the game)
 
"I understand the EEfRT should be set up to avoid such limitations, as it is meant to measure purely motivation in MDD patients, but I was wondering how they would have done so in Walitt et al.
(In Treadway et al participants would also play four practice trials before starting to play the game)"

But as I understand it the test is designed to avoid fatiguing physically healthy people, not ones with abnormally debilitating physical and cognitive fatiguability.
 
"But as I understand it the test is designed to avoid fatiguing physically healthy people, not ones with abnormally debilitating physical and cognitive fatiguability."

Indeed. Using EEfRT as saying something significant about non-depressed ME patients seems bull to me (it was never meant to do that), and what Walitt et al did with it is IMO reprehensible, but if we go along with it for a minute: how did they ensure that the result was not the effect of physical or cognitive fatiguability? I can't see it addressed in their methods section.

If you reverse it and wanted to study depression-caused motivation issues in ME patients, then you'd have to figure out a way around that too.
 
Re: effort preference; I'm not able to make a cogent argument for this but for anyone looking to come up with rebuttal points, I wonder if evoked potentials offer a useful contrast. The common set-up for EP testing is a scalp electrode measure of brain activity while the testee makes a mechanical response (pushing a button) to simple images displayed on a screen. The measure is of cognitive delay in various forms rather than physical strength, but a delay in making an exertion has an impact on the level of exertion in a timed test. That is, cognitive dysfunction, quite separate from any motivation issues, may play a role in reduced exertion.
 
Does the EEfRT test mean anything when used on PwME vs healthy volunteers?

Thanks to @bobbler for digging out so much useful information. I'm afraid I bailed out after reading ten pages of printed posts, and haven't reached the end of this ever-growing thread. My thoughts so far:

Comparing Hard vs Easy tasks
I always find it helps to understand the test properly first, ideally by doing it myself. That's not possible, so here is an attempted summary:

Hard task: Press a button 98 times in 21 seconds with the little finger of the non-dominant hand. + 9 seconds to read screen information (=resting the finger) = 30 seconds in total

Easy task: Press a button 30 times in 7 seconds with the index finger of the dominant hand + 8 seconds of screen information = 15 seconds in total.

Participants complete as many trials (individual tasks) as they can in 15 minutes.

(I think that if you press the button faster, e.g. 30 times in five seconds on the easy task, you reach the end of the task and can start the next one (so you can complete more tasks that way))

Let’s compare the extremes of always selecting hard tasks vs always selecting easy ones.

Hard task (little finger): 30 trials, 2,940 button presses, 10½ minutes pressing / 4½ minutes ‘resting’.
Easy task (index finger): 60 trials, 1,800 button presses, 7 minutes pressing / 8 minutes ‘resting’.

The hard-task-only option takes 63% more presses, all using the non-dominant little finger, and comes with a lot less rest (non-pressing) time (less than a third vs more than half) than sticking to easy tasks.
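For anyone who wants to check the arithmetic, here is a minimal sketch of the two extremes. It assumes a 15-minute session, fixed trial lengths of 30 and 15 seconds, and that nobody finishes a trial early. The function and variable names are mine.

```python
SESSION_SECONDS = 15 * 60  # 15-minute session

def always_choose(presses_per_trial, pressing_seconds, trial_seconds):
    """Totals for a participant who picks the same task on every trial."""
    trials = SESSION_SECONDS // trial_seconds
    pressing = trials * pressing_seconds
    resting = SESSION_SECONDS - pressing
    return {
        "trials": trials,
        "presses": trials * presses_per_trial,
        "pressing_min": pressing / 60,
        "resting_min": resting / 60,
        "rest_share": resting / SESSION_SECONDS,
    }

hard = always_choose(presses_per_trial=98, pressing_seconds=21, trial_seconds=30)
easy = always_choose(presses_per_trial=30, pressing_seconds=7, trial_seconds=15)
extra_presses = hard["presses"] / easy["presses"] - 1

print(hard)
print(easy)
print(f"hard-only needs {extra_presses:.0%} more presses than easy-only")
```

This is only the two extremes. Real participants mix hard and easy tasks, so their totals fall somewhere in between.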

If I understand correctly, the lowest reward on offer is $2 and the highest (which depends on an element of luck) is $8.42 i.e. an extra $6.42 for the best possible performance. This is on top of a baseline payment to take part, so it's not a huge incentive.

The test was developed and validated on undergraduates – not on sick people or even older healthy volunteers. The developers validated that even the hard task didn’t cause exhaustion to their undergraduate subjects.

The paper presents no evidence that this is not a fatiguing task for people with ME. The paper makes a leap by comparing with the hand grip test, but presents neither subjective measures of fatigue nor measures of muscle or motor fatigue for the hard/easy test. Nor do they explore if the modest rewards were a meaningful incentive, given that this is a CHOICE test (unlike the CPET where everyone understands the whole point is to exercise to exhaustion).

Another way of looking at this is the test offers a reward in return for the cost of the minor exertion of button pressing. But that relies on an assumption that button-pressing is not exhausting for PwME. If it is, the cost of any trial, and especially cumulative trials, is higher for patients than for healthy volunteers.

So it does not appear to be a fair comparison.
 
My head is spinning and I'm going to bow out of the EEfRT discussion. Here's what I think I've learnt from others' helpful posts (and quotes from the underlying papers):

1. EEfRT was developed to probe anhedonia (an inability or relative inability to experience pleasure, presumably the rough opposite of hedonism) by measuring "reward motivation". They did this to better understand depression, where anhedonia is a core concept:
Of the putative psychopathological endophenotypes in major depressive disorder (MDD), the anhedonic subtype is particularly well supported. Anhedonia is generally assumed to reflect aberrant motivation and reward responsivity. However, research has been limited by a lack of objective measures of reward motivation

They developed the test on an apparently-healthy cohort of undergraduates, picking a selection that had a range of anhedonia scores. Crucially, EEfRT was tested to ensure that it did not exhaust the undergrad sample (shown by a 96% to 100% completion rate for trials).

2. The test rests on choices. It's also designed to be complex to make it hard to gauge what the best strategy is, though I think most people would conclude that at least sometimes harder choices are a better call. The rewards are small. Having a choice, with small rewards for probably marginal gain would probably bias those with a fatiguing illness to choose the easier task more often.

3. Walitt, 2024 chose to repurpose EEfRT as a measure of "effort preference". There was no explanation for doing so.

Not only did the paper not justify the repurposing of a test designed to measure reward motivation, it also failed to check that the test did not exhaust subjects who were chosen because they had a fatiguing illness.

In the absence of any evidence to the contrary, we can assume that the test of reward motivation is harder for fatigued people than for non-fatigued people. The whole point of the test is to cause fatigue (effort), to allow testing of how motivatable subjects were.

4. They tested a tiny cohort (16 patients, 17 HV) and found only a modest and unsurprising increase in the selection of easy tasks by patients.

5. They then compared the proportion of hard choices from this test with an entirely separate hand grip test.

Walitt et al found... (correction: there was a strong correlation for ME/CFS, none for HV)
They found NO correlation (p=0.89) for HV between the proportion of hard tasks in EEfRT and the time to failure in the separate hand grip test. This isn't surprising, since a measure of reward motivation would not be expected to correlate with hand grip fatigability, a test with no reward.

However, they found a relatively high correlation (r=0.57) for PwME. The correlation might be because, for PwME, both tests are a test of fatigability/exhaustability.

6. The authors concluded that since apples didn't behave the same way as oranges, the oranges had an effort preference issue - measured by a test of reward motivation.

Time to go back to bed.
 
Thank you so much @Simon M for clarifying a very confusing situation. Would it be OK for your analysis to be used by those writing to people at NIH, and possibly writing to the journal that published it, to complain about the publication? I think it would be very helpful to have it spelled out so clearly.
I would like Nath to be asked how such misuse of a test and invention of a false hypothesis came about, and why no one insisted that the whole invented and unevidenced premise of effort preference be removed from what is meant to be a biomedical science paper.
 
A couple of points I left off my 'final' post:
@bobbler quoted Treadway saying that men were more likely than women to choose hard tasks. The HVs had a slightly higher male ratio - I wonder if the study allowed for that? The marginal 0.04 p value for hard choice ratio being lower for ME folk might disappear.

@bobbler also said Treadway (EEfRT source) said EEfRT's main effect was visible when taking into account the probability of a win given to each person before they made a choice. The win prob applies to both easy and hard choices, so if the probability is 88% or 50% people would be more likely to choose a hard task than an easy one. What I'm not clear on is if Treadway specified what primary measure of EEfRT should be used: Walitt 2024 said it was the proportion of hard tasks (presumably independent of probability).
 
There's at least that other, far more likely and rational explanation, that happens to be the opposite of what is alluded to in the paper's text: that the opposite of choosing the low effort is happening, that we want to make the effortful choice, but our brain, knowing better, is telling us that we shouldn't. Similar to someone trying to lift a car, and our brain knowing better is saying "uh, don't do that".
Yes, and we don't know that the patients were able to do the same amount of hard tasks as the HVs, because they didn't do them!
 