S4ME: Submission to the public review on common data elements for ME/CFS: Problems with the Chalder Fatigue Questionnaire

Andy · Jan 23, 2018

Science for ME are pleased to announce that we have today submitted the following critique of the Chalder Fatigue Questionnaire to the NIH/CDC review. All credit for this submission should go to the authors; we are very grateful for all their hard work.

We have replicated the submission below; to view it in its original format, please see the attached PDF file.

--------------------------------------------------------------------------------

Submission to the public review on common data elements for ME/CFS: Problems with the Chalder Fatigue Questionnaire

Wilshire, C.E., McPhee, G., and the Science for ME CFQ working group

The Chalder Fatigue Questionnaire (CFQ; Chalder et al., 1993) is among the scales being proposed to provide common data elements (CDEs) on fatigue for future NIH- and CDC-funded studies of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS).

Because the CFQ was used in the PACE trial it has received close scrutiny from patients and researchers who have been critical of the trial (e.g., Wilshire et al., 2016). Some of those same individuals were involved in the drafting of the present submission.

The Chalder Fatigue Scale

Many of the problems with the scale are obvious upon inspection, and so it is important to examine the scale. The complete scale, in its final 11-item form, is reproduced below (bolding is ours).

We would like to know more about any problems you have had with feeling tired, weak or lacking in energy in the last month. Please answer ALL the questions by ticking the answer which applies to you most closely. If you have been feeling tired for a long while, then compare yourself to how you felt when you were last well.

Do you have problems with tiredness?
Do you need to rest more?
Do you feel sleepy or drowsy?
Do you have problems starting things?
Do you lack energy?
Do you have less strength in your muscles?
Do you feel weak?
Do you have difficulties concentrating?
Do you make slips of the tongue when speaking?
Do you find it more difficult to find the right word?
How is your memory?

[Response options: less than usual; no more than usual; more than usual; much more than usual.]

The scale items can be scored ‘bimodally’ or with ‘Likert’ scoring, as shown below. The scores for each item are then summed to produce an overall score.

Problems with the scale

1. Few items appear clearly related to fatigue

Only three of the eleven items on the scale (#1, #2 and #5) appear to be clearly related to fatigue. For the rest, the scale assumes that memory problems, speech errors, sleepiness/drowsiness, muscle weakness and so on are indicators of fatigue, and that the more such symptoms a patient reports, the greater their overall fatigue. These assumptions are untested and their basis is unclear.

The item on ‘problems starting things’ is particularly puzzling. It appears to be probing for lassitude, a common symptom in depression. Indeed, similar items appear in several depression scales, such as the Montgomery-Asberg Depression Scale (Montgomery & Asberg, 1979). The relationship of lassitude to fatigue outside the context of depression is unknown.

Chalder et al. (1993) defined ‘caseness’ as a bimodal score of 4 or more on the CFQ, which means that a patient could be defined as a fatigue case if their only symptoms were difficulties in concentrating, making slips of the tongue, word-finding, and having memory problems. This appears to be entirely inappropriate, since it is unclear whether any of these symptoms are effective at discriminating between those with fatigue and other types of complaints (for example, mild cognitive impairment).
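The caseness problem can be illustrated with a minimal sketch. It assumes the conventional bimodal mapping of the four response options to 0, 0, 1, 1 (the text above does not spell the mapping out), and the respondent is hypothetical.

```python
# Assumption: conventional bimodal mapping (0, 0, 1, 1) of the four response options.
BIMODAL = {
    "less than usual": 0,
    "no more than usual": 0,
    "more than usual": 1,
    "much more than usual": 1,
}
CASENESS_CUTOFF = 4  # "a bimodal score of 4 or more" (Chalder et al., 1993)


def bimodal_score(responses):
    """Count the items endorsed as worse than usual."""
    return sum(BIMODAL[r] for r in responses)


def is_case(responses):
    """Apply the caseness threshold to the bimodal total."""
    return bimodal_score(responses) >= CASENESS_CUTOFF


# Hypothetical respondent: items 1-7 unchanged, only the four
# concentration, speech, word-finding and memory items (8-11) endorsed.
responses = ["no more than usual"] * 7 + ["more than usual"] * 4

print(bimodal_score(responses), is_case(responses))  # 4 True
```

Under these assumptions the respondent is classed as a fatigue case without endorsing any of the tiredness, rest or energy items.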

The lack of obvious or validated relevance to fatigue of the majority of items on the scale would, on its own, appear to make the CFQ unfit for purpose as a fatigue scale.

2. Focus on change in fatigue rather than intensity

The CFQ asks patients who have been feeling tired for a long while to rate their fatigue compared to when they were last well. It does not take ‘no fatigue’ as its baseline.

For ME/CFS patients – who, by definition, must have been ill for some time in order to achieve a diagnosis – this means remembering how it felt to be well. Patients may have been unwell for anything from several months to several decades and their recollection may well not be accurate.

An added source of confusion is that respondents are told to compare themselves to ‘when [they] were last well’, but the response options ask whether respondents are having problems ‘less/more than usual’. ‘Usual’ to a patient with a chronic illness such as ME/CFS is clearly not the same as ‘when [they] were last well’, and this conflicting wording is likely to lead to response errors.

The fact that respondents can mark each fatigue problem as occurring ‘less than usual’ is also problematic. It is unclear how anyone could feel less tired than when they were well, and therefore unclear what a respondent means when they select this option. Confusingly, a score of zero on the ‘Likert’ scoring of the CFQ is therefore not the base-point of the scale; a patient who scores 11/33 is no more fatigued than when they were last well, not one who scores 0/33. This makes interpretation of the scale difficult.
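The baseline arithmetic can be checked with a short sketch, using the 0 to 3 ‘Likert’ item values described under point 4 below. The three respondents are hypothetical and chosen only to show the end points and the ‘unchanged’ score.

```python
# 'Likert' item values as described in the submission (0-3 per item, 11 items).
LIKERT = {
    "less than usual": 0,
    "no more than usual": 1,
    "more than usual": 2,
    "much more than usual": 3,
}
N_ITEMS = 11


def likert_score(responses):
    """Sum the 0-3 item values into a total out of 33."""
    return sum(LIKERT[r] for r in responses)


# Hypothetical respondents answering every item the same way.
print(likert_score(["less than usual"] * N_ITEMS))       # 0
print(likert_score(["no more than usual"] * N_ITEMS))    # 11
print(likert_score(["much more than usual"] * N_ITEMS))  # 33
```

A respondent reporting no change from when they were last well therefore sits a third of the way up the scale, not at zero.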

3. Arbitrary weighting of physical and mental components

Chalder et al. (1993) report a principal components analysis indicating that the scale has two major components – mental and physical fatigue. They combine these into a single score in the CFQ but the weighting of these components appears arbitrary, and is based simply on the number of questions of the two types in the questionnaire.

Even putting aside concerns about the validity (particularly of some of the ‘mental’ fatigue items), the consequence of combining mental and physical fatigue questions is that the scale is not necessarily monotonic, as an improvement in one form of fatigue could be accompanied by a worsening of the other type.

4. Incompatibility of scoring schemes

There are two alternative scoring methods. The ‘bimodal’ method assigns a 0 or 1 to each response, depending upon whether the complaint is present or absent (maximum score 11). The ‘Likert’ method rates each response from 0–3. The minimum score of 0 is given only for ‘less than usual’ (paradoxically less fatigue than before illness). A response of ‘no more than usual’ scores a higher 1, even though it indicates full recovery. Scores of 2 and 3 are given for ‘more than’ and ‘much more than’ respectively (maximum score 33).

The relationship between the two scoring schemes is far from transparent. One of them counts the number of symptoms, the other weights the intensity of the symptoms (and confusingly, gives extra credit for being even better than before the illness). Indeed, these two methods can generate contradictory findings: in the PACE trial, in 23 cases, fatigue scores decreased during the course of the trial based on one scoring method, but actually increased based on the other method. 1

1 Dataset available at https://sites.google.com/site/pacefoir/pace-ipd_foia-qmul-2014-f73.xlsx?attredirects=0, ‘readme’ file https://sites.google.com/site/pacefoir/pace-ipd-readme.txt?attredirects=0
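How the two schemes can move in opposite directions can be shown with a minimal sketch. The respondent below is invented for illustration and is not taken from the PACE dataset; the bimodal mapping of 0, 0, 1, 1 is the conventional one and is an assumption here, since the text describes it only as present/absent.

```python
LIKERT = {
    "less than usual": 0,
    "no more than usual": 1,
    "more than usual": 2,
    "much more than usual": 3,
}
# Assumption: conventional bimodal mapping (0, 0, 1, 1).
BIMODAL = {
    "less than usual": 0,
    "no more than usual": 0,
    "more than usual": 1,
    "much more than usual": 1,
}


def score(responses, mapping):
    """Sum item values under the given scoring scheme."""
    return sum(mapping[r] for r in responses)


# Hypothetical respondent (not PACE trial data).
before = ["more than usual"] * 11
after = ["much more than usual"] * 7 + ["no more than usual"] * 4

for label, responses in (("before", before), ("after", after)):
    print(label, "bimodal:", score(responses, BIMODAL),
          "Likert:", score(responses, LIKERT))
# before bimodal: 11 Likert: 22
# after bimodal: 7 Likert: 25
```

In this example the bimodal score falls from 11 to 7 while the ‘Likert’ score rises from 22 to 25, so the same pair of questionnaires reads as improvement under one scheme and deterioration under the other.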

5. Failure to directly measure fatigue intensity

In the table on p.3 of the Fatigue Subgroup Materials section of the CDE Public Review document (NINDS/CDC, 2017), the CFQ is described as an index of ‘fatigue intensity’. As noted above, the bimodal scoring method simply yields a count of symptoms on a present/absent basis, while the ‘Likert’ version blends the number of symptoms with their intensity in a manner that is impossible to interpret from the total score.

6. Ceiling effect

Kindlon (2010) has pointed out that findings reported by Morriss et al. (1998) indicate that ceiling effects are likely when the CFQ is used. These investigators applied the questionnaire to 136 CFS patients in an outpatient clinic, and reported near-maximal scoring on six physical fatigue-scale items from the questionnaire, irrespective of which scoring method is used.

Clearly, it is important to know whether ME/CFS patients are experiencing worsening fatigue – or even harm – in response to an intervention. It is also important to know whether fatigue correlates with a potential biomarker. The CFQ’s ceiling effect is therefore a problem.
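The ceiling problem can be sketched as follows. The helper `report_worsening` is a hypothetical device for illustration: it moves every item one response option up, stopping at the top option, and the respondents are invented.

```python
OPTIONS = [
    "less than usual",
    "no more than usual",
    "more than usual",
    "much more than usual",
]


def likert_score(responses):
    """'Likert' total: each option scores its position (0-3) in OPTIONS."""
    return sum(OPTIONS.index(r) for r in responses)


def report_worsening(responses):
    """Hypothetical helper: move each item one option up, capped at the top."""
    top = len(OPTIONS) - 1
    return [OPTIONS[min(OPTIONS.index(r) + 1, top)] for r in responses]


# Hypothetical respondent already at the maximum on every item.
at_ceiling = ["much more than usual"] * 11
print(likert_score(at_ceiling), likert_score(report_worsening(at_ceiling)))  # 33 33

# Hypothetical respondent with room to move on every item.
below_ceiling = ["more than usual"] * 11
print(likert_score(below_ceiling), likert_score(report_worsening(below_ceiling)))  # 22 33
```

For the respondent already at the maximum, further deterioration leaves the total unchanged, so the scale cannot register it.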

Conclusions

We have here identified a number of serious problems with the CFQ, and note that the Fatigue Subgroup Draft Recommendations document also summarises some problems with it (p.33, our bolding):

Scoring:

‘This instrument can be scored in two ways: Bimodal and Likert scoring. It appears that the choice of scoring method may result in significant differences in interpretation of outcomes. (Rebecca Goldin. Sense About Science USA. March 21, 2016 http://www.senseaboutscienceusa.org/pace-research-sparked-patientrebellion-challenged-medicine/). This will need to be further researched.’

‘Thresholds have been reported for both methods. (Bimodal: Case (>4) vs. non-case (<4) Mean score = 9.14 (SD 2.73) and 3.27 (SD 3.21) for Community sample. Mean “Likert” score 24.4 (SD 5.8) and 14.2 (SD 4.6)). However, the study referenced for these thresholds in the Chalder instrument required patients to meet either Oxford or Fukuda. As NIH’s ME/CFS Pathways to Prevention report noted, Oxford could have selected patients with other fatiguing conditions. Thus, it is difficult to know if these thresholds apply to ME/CFS cohorts. Further research is needed.’ (typo in first line of quote was corrected by us)

We are pleased to see these problems acknowledged, but concerned to see a call for further research on a questionnaire which appears unfit for purpose, and which is unlikely to become so with even major modification.

We would much prefer to see a questionnaire developed from the ground up: one that begins with researchers conducting a narrative interview, and then identifies items worth including on the basis of their ability to discriminate severely fatigued individuals from healthy ones. Perhaps one already exists and is being considered – we do not know the wider literature – but it is clearly not the CFQ.

We are pleased also to see (p.6 of the document) that the Fatigue Subgroup is aware that a challenge in assessing fatigue in ME/CFS is not only symptom variability, but also that symptoms are exertion-dependent. It is perfectly possible for a patient who is very severely disabled by ME/CFS to experience little fatigue most of the time because they are pacing themselves and restricting their activities to remain below their fatigue-triggering threshold.

We are grateful for the opportunity to contribute to the development of common data elements for our disease and will follow the work on this with great interest.

Wilshire, C.E., McPhee, G., and the Science for ME CFQ working group, January 22, 2018

References

Chalder T, Berelowitz G, Pawlikowska T, Watts L, Wessely S, Wright D, Wallace EP. Development of a fatigue scale. J Psychosom Res. 1993;37(2):147-53.

Kindlon T. Data on the level of maximal scoring (on the Chalder Fatigue Scale) would be useful. BMJ 2010;340:c1777. Available at: http://www.bmj.com/rapid-response/2...scoring-chalder-fatigue-scale-would-be-useful

Montgomery SA, Asberg M. A new depression scale designed to be sensitive to change. British Journal of Psychiatry. 1979;134:382–89.

Morriss RK, Wearden AJ, Mullis R. Exploring the validity of the Chalder Fatigue scale in chronic fatigue syndrome. J Psychosom Res. 1998 Nov;45(5):411-7.

NINDS/CDC. Public Review Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) Common Data Elements (CDE) Fatigue Subgroup Materials. 2017. Available at: https://www.commondataelements.nind...atigue_Subgroup_CDE_Draft_Recommendations.pdf

Wilshire C, Kindlon T, Matthees A & McGrath S. Can patients with chronic fatigue syndrome really recover after graded exercise or cognitive behavioural therapy? A critical commentary and preliminary re-analysis of the PACE trial. Fatigue: Biomedicine, Health & Behavior 2017;5.

ETA: Link, usable by members and non-members, to a PDF version of the submission, https://www.s4me.info/docs/CFQ-Critique-S4me.pdf

Simone · Jan 24, 2018

An excellent critique of the CFQ. Thank you to everyone who participated!

petrichor · Jan 24, 2018

That seems like quite a good critique of the CFQ. I feel like any questionnaire that focuses on the subjective feelings of fatigue is going to be relatively flawed, because many people with ME/CFS don't know how to compare their feelings of fatigue with anything. I might call the kind of fatigue that prevents me from doing anything in a day apart from some basic tasks a moderate fatigue, whereas others might call it severe or mild. Plenty of healthy people would call a certain fatigue level severe, which I would call mild.

I also often convince myself that I'm doing fairly well, or I'm not that fatigued, or that I'm not feeling that bad, or that I have the capacity to do certain activities more than usual, but if I actually systematically go through the activities I've been able to do, day by day, over the past few weeks (for example), it doesn't usually show the same thing.

So, especially in the case of ME/CFS, I would think constructing questionnaires like this is especially difficult.

Cheshire · Jan 24, 2018

Great work! Thanks to @Graham @Carolyn Wilshire and everyone involved.

I am really pleased to see that S4ME is starting to be a real platform for advocacy!

Hoopoe · Jan 24, 2018

Thank you so much for the hard work.

Trish · Jan 24, 2018

Excellent work. Huge thanks to all involved. This is so important, as the CFQ is nonsensical. I have huge difficulty in understanding how anyone could ever have invented such a daft instrument.

Hoopoe · Jan 24, 2018

Andy said:
The fact that respondents can mark each fatigue problem as occurring ‘less than usual’ is also problematic. It is unclear how anyone could feel less tired than when they were well, and therefore unclear what a respondent means when they select this option

Something like this just invites biased self reporting, doesn't it? I remember reading that CBT made MS patients less fatigued than healthy people according to this scale in one study. More likely, this was just an artefact of biased self reporting.

Valentijn · Jan 24, 2018

petrichor said:
I feel like any questionnaire that focuses on the subjective feelings of fatigue is going to be relatively flawed, because, many people with ME/CFS don't know how to compare their feelings of fatigue with anything.

I think the bigger problem is that many patients wouldn't describe their symptoms as fatigue in the first place.

strategist said:
I remember reading that CBT made MS patients less fatigued than healthy people according to this scale in one study. More likely, this was just an artefact of biased self reporting.

Hrm ... perhaps the Likert score of 0 can be more effectively used to measure bias instead of fatigue?

arewenearlythereyet · Jan 24, 2018

This looks good. It would have been even better to have recommended some examples of "what good looks like" to illustrate some of the design failings even more.

E.g instead of saying do you have a problem with tiredness, it should say something like

Please rate how tired you feel on a typical day between 10am and 4pm on the sliding scale below

Exhausted .........alert

I think some examples would be good by way of illustration for the layman.

Another point I think could be emphasised more is the effect of questionnaire design biases. Some examples off the top of my head would be:

1). Using sliding scales instead of fixed scales (e.g. rating scales with an even-numbered end point, such as 1-10, have been shown to suffer from end point avoidance, so 1-9 is better for this; sliding scales are even better)
2). Testing and analysis of the wording used in the question phrasing, to show that wording had been considered to avoid questionnaire bias (there seem to be a lot of leading questions used rather than neutral questions)
3). Using open questions rather than specific questions is never a good thing for quantitative data collection
4). Repeating the same question in different guises also causes bias from questionnaire fatigue
5). The screened target has been tested with the questionnaire vs a control group to give an indication of the effectiveness of the scale used.

I think if it is a fatigue scale, fatigue is a word that may need clarifying vs other synonyms. I understand people's frustration with the word, but I suspect that it may mean different things to different people. You can get around this by clarifying what you mean, as per the example above. At the end of the day you need a design that pulls out severity vs a "normal" control group.

guest001 · Jan 24, 2018

Thank you. This is very helpful. Chalder has insinuated her peculiar scale into so many BPS projects that to highlight its flaws and limitations also brings into question the validity of results where papers have used her scale as part of the protocol. Not just confined to PACE. Happy days...

Woolie · Jan 24, 2018

petrichor said:
That seems like quite a good critique of the CFQ. I feel like any questionnaire that focuses on the subjective feelings of fatigue is going to be relatively flawed, because many people with ME/CFS don't know how to compare their feelings of fatigue with anything. I might call the kind of fatigue that prevents me from doing anything in a day apart from some basic tasks a moderate fatigue, whereas others might call it severe or mild. Plenty of healthy people would call a certain fatigue level severe, which I would call mild.

I also often convince myself that I'm doing fairly well, or I'm not that fatigued, or that I'm not feeling that bad, or that I have the capacity to do certain activities more than usual, but if I actually systematically go through the activities I've been able to do, day by day, over the past few weeks (for example), it doesn't usually show the same thing.

So, especially in the case of ME/CFS, I would think constructing questionnaires like this is especially difficult.

Your symptoms are quite unusual, @petrichor. For most of us, our activity is not primarily limited by current feelings of fatigue; it's limited by the flare-ups we get the next day if we overdo it. You have to predict exactly how much is enough - for that particular day - to avoid PEM.

Sly Saint · Jan 24, 2018

this is way over my brain capacity but this paper appears to be relevant (although it does conflate chronic fatigue with chronic fatigue syndrome ):
"
Psychometric properties of the Chalder Fatigue Scale revisited: an exploratory structural equation modeling approach

Abstract
Objective
Previous validation studies of the Chalder Fatigue Scale (CFS) suffer methodological shortcomings. The present study aimed to re-evaluate its psychometric properties using exploratory structural equation modeling (ESEM).

Methods
A Chinese sample of 1259 community-dwelling residents completed the 11-item Chinese CFS and a variety of health measures (anxiety, depression, exhaustion, sleep disturbance, and quality of life). In addition to traditional confirmatory factor analysis, ESEM was performed to assess the fit of two- and three-factor models using robust maximum likelihood estimation and oblique geomin rotation. Convergent validity of the CFS was examined via associations with five covariates (gender, age, exercise, perceived health, and life event) and the health measures in the ESEM model.

Results
The ESEM models displayed a superior fit to confirmatory factor models. The three-factor ESEM model showed a satisfactory model fit to the data but not for the two-factor model. The three factors were physical fatigue (three items, α = .800), low energy (four items, α = .821), and mental fatigue (four items, α = .861). The factors exhibited convergent validity with the model covariates and health measures."

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4529874/

or is it just a good example of 'with [fancy] statistics you can prove anything can be made to fit' (?)

arewenearlythereyet · Jan 24, 2018

Sly Saint said:
this is way over my brain capacity but this paper appears to be relevant (although it does conflate chronic fatigue with chronic fatigue syndrome ):
"
Psychometric properties of the Chalder Fatigue Scale revisited: an exploratory structural equation modeling approach

Abstract
Objective
Previous validation studies of the Chalder Fatigue Scale (CFS) suffer methodological shortcomings. The present study aimed to re-evaluate its psychometric properties using exploratory structural equation modeling (ESEM).

Methods
A Chinese sample of 1259 community-dwelling residents completed the 11-item Chinese CFS and a variety of health measures (anxiety, depression, exhaustion, sleep disturbance, and quality of life). In addition to traditional confirmatory factor analysis, ESEM was performed to assess the fit of two- and three-factor models using robust maximum likelihood estimation and oblique geomin rotation. Convergent validity of the CFS was examined via associations with five covariates (gender, age, exercise, perceived health, and life event) and the health measures in the ESEM model.

Results
The ESEM models displayed a superior fit to confirmatory factor models. The three-factor ESEM model showed a satisfactory model fit to the data but not for the two-factor model. The three factors were physical fatigue (three items, α = .800), low energy (four items, α = .821), and mental fatigue (four items, α = .861). The factors exhibited convergent validity with the model covariates and health measures."

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4529874/

or is it just a good example of 'with [fancy] statistics you can prove anything can be made to fit' (?)

You also have to banish plain English apparently ....big word syndrome ...at least they didn't add too many "ise's" at the end of perfectly good words I suppose.

petrichor · Jan 24, 2018

Woolie said:
Your symptoms are quite unusual, @petrichor. For most of us, our activity is not primarily limited by current feelings of fatigue; it's limited by the flare-ups we get the next day if we overdo it. You have to predict exactly how much is enough - for that particular day - to avoid PEM.

I didn't say that my activity is primarily limited by my primary feelings of fatigue, but that doesn't mean that I don't experience feelings of fatigue, and the things this questionnaire covers.

That's a very big extrapolation to jump to, and it's completely untrue. I assure you, I know my symptoms far better than you know them.

Barry · Jan 24, 2018

That is blindingly good folks!

Andy · Jan 24, 2018

Barry said:
That is blindingly good folks!

We quite like them too

unicorn7 · Jan 24, 2018

Wow, this seems very good!

I didn't know what the Chalder fatigue scale was, but now I've read it, I'm pretty shocked. How does this measure anything? Why are they talking about sleepiness/tiredness? I feel exactly like the time I was reading about GETSET-Julie for the first time, I can hardly believe this is serious...

Andy · Jan 24, 2018

petrichor said:
I didn't say that my activity is primarily limited by my primary feelings of fatigue, but that doesn't mean that I don't experience feelings of fatigue, and the things this questionnaire covers.

That's a very big extrapolation to jump to, and it's completely untrue. I assure you, I know my symptoms far better than you know them.

Huh? Woolie isn't saying that you don't experience fatigue, why are you claiming that she is? And in your post you say

petrichor said:
I might call the kind of fatigue that prevents me from doing anything in a day apart from some basic tasks a moderate fatigue, whereas others might call it severe or mild. Plenty of healthy people would call a certain fatigue level severe, which I would call mild.

I also often convince myself that I'm doing fairly well, or I'm not that fatigued, or that I'm not feeling that bad, or that I have the capacity to do certain activities more than usual, but if I actually systematically go through the activities I've been able to do, day by day, over the past few weeks (for example), it doesn't usually show the same thing.

which IS different to what most patients describe. Reading your post, it does come across as you saying that it's fatigue limiting your activities, not the concern of triggering PEM, so I can understand why Woolie says what she does. So no big extrapolation to jump to, your post leads us to believe it to be the case. Now with your later post you explain that isn't the case, which helps us understand your situation - and of course nobody here is claiming to know your symptoms better than you, all we can go on is the information that you give us in your posts.

Barry · Jan 24, 2018

I think it originates out of a BPS fold where fatigue is seen as a symptom of various psychiatric disorders, and is a perceived sensation so can only be measured subjectively, which is home ground for psychiatrists anyway. It would very likely therefore have been conceived with the underlying presumption that the fatigue being queried has psychological origins. Dropping a psychological bias into the questions was probably inevitable therefore, maybe even desirable from their perspective.

Then comes the knotty problem of fatigue originating from physical conditions. But so what? ... Psychiatrists design and run the trials; they believe the fatigue is of mental origin not physical; fatigue is still only ever measurable subjectively; and it's called Chronic Fatigue Syndrome anyway (lots of back patting).

What could possibly go wrong? Ans: The internet age, and democracies where even though the media may be hobbled and running scared, the truth is steadily finding its way through.

Also, over the past few weeks it has finally clicked with me why the name Chronic Fatigue Syndrome is such a misnomer and such a disaster; I've been reading it for ages but not really understood. The "fatigue" word hands ME patients to psychiatrists on a plate! Fatigue can only ever be measured subjectively, which is the psychiatrists' natural habitat. And because fatigue is also associated with genuine mental conditions, I think the BPS psychiatrists think they own any condition with the word 'fatigue' in its name. And their scientific skills seem inversely proportional to their arrogance.

Are there any clues back in the history of all this, whether the CFQ was developed for assessing fatigue in patients with any particular sorts of medical conditions? Mental conditions for instance?

Graham · Jan 24, 2018

Intelligence is a purely mental phenomenon that can only be measured subjectively, but a lack of it certainly can lead to some pretty awful physical consequences.

I wouldn't be bothered by my "fatigue" if it didn't have measurable consequences.
