When thinking about PROMs for ME/CFS, I think the following part of the 'Response Shift' para above is particularly interesting to bear in mind when looking 'big picture' at the format/approach:
Lack of group-level change could also be attributed to how patients internalize and respond to PROMs.
Individuals’ ratings reflect their frame of reference, whether comparing their health to people with the same diagnosis or others without their diagnosis, or whether their expectations are based on their past health or their ideal health.
A study of physical and mental HRQOL in MS demonstrated these appraisal processes explained a large amount of variation in individual scores, and concluded HRQOL should not be assessed without measuring appraisal in MS patients.
By which I mean that instead of thinking 'questionnaire', with familiar questions and a large 'load' that are then weighted and calculated, something that forces or checks/helps an 'overall judgement' based on 'thinking afresh' and providing new 'points of reference' might be more useful/accurate. It might be that people are quite good at measuring their own 'new threshold of energy' or pain or exhaustion as an 'overall', better than a questionnaire attempting to use a battery to weight component parts (where the weighting might vary across individuals), but just need a method that provides 'checks and balances' on that 'overall' judgement to account for these 'response shifts/adapting'.
There is also perhaps an argument that the nature and situational context of ME/CFS make this worse. With SPMS there are perhaps a few 'objective' signs/hints/pieces of info that might keep people more 'in reality check', whereas those with ME/CFS don't have those check-ins and measures to 'calibrate' against, so might be even worse at judging.
I'm not keen on the 'multi-informant' idea [suggested by Lisbeth Utens in her letter in the comment above] for ME/CFS for a multitude of reasons, including dynamic issues and our vulnerability in the workplace, family etc, and the power dynamic it could involve or affect (unless 'volunteered'/chosen by the patient). But maybe there are lateral versions that could be inspired by this, which could use some of these suggestions to aid accuracy?
I don't know whether this is a useful piece to muse on, particularly thinking about the situation where someone thinks they are 'doing really well spinning the plates/riding the fine line on the threshold of doability'. And whether, when I was doing it, there were things slowly dropping off that the right questions might have pointed out, or if it really was all great until the adrenaline stopped and the cumulative effect of just a bit too much each day hit.
EDIT: the other scenario to accommodate is a crash of various sorts where we might not yet have considered (for various reasons, including being a novice to the illness and inexperienced with deteriorations) that it could be more than a bad few weeks of ‘still getting over that cold’. But those dips could perhaps be captured a bit as well in detailed scales that maybe have an a, b, c (almost functioning like a half measure to indicate ‘I think I’m still moderate but…’). Anyway, I don’t know why the same questions/text/tests that you might ask someone to do to ‘help’ them think afresh each time about what their level is on an overall scale wouldn’t also help here.
Now I know this next bit is more related to PROMs ideas for ME/CFS and might belong more in another thread, but taking the lessons above re: frames of reference etc...
Is there a way to avoid a 'computed score' that goes without a 'sense-check' that it feels about right, whilst also avoiding the 'expectations based on past or ideal health' type issues above? A lot of the things that feel relevant to ME/CFS admittedly might be things that apps nowadays tell people to help them calibrate anyway. But I don't know how good they are at getting people to realise they are 'at a new level' rather than 'in a crash'. Or - without research into that - how relevant that distinction is anyway, or whether we just kid ourselves.
By which I'm thinking that the 'output' would be to ask a pwme to perhaps just select one very specific increment on what could be quite a large scale with a lot of specifics, vs questions about all the component symptoms or areas of ME with a total then being computed from that. I'm also trying to think how introducing these could actually work or translate, and what would be valid or relevant. And having the frames of reference spelled out/made clear, with tools to help with that, seems to be one thing the above hints at trying to work on.
This would indeed make selecting exactly the right box a lot of work. However, people would narrow down their approx zone quite quickly (we normally don't change that much) by defined limitations. And maybe they could feel confident in their precise selection more easily by finding more exact comparators to their own situation - ie fresh frames of reference. Rather than negotiating with what 'mostly homebound' does and doesn't encompass, it could almost have market-research-style 'profiles' that people can drop down to check as they narrow in on where they think they fit, e.g. 'Jenny: still does her full-time job, but from home, and cannot schedule more than 2hrs of conversation in a day/10hrs in a week, but can keep set hours, wash, dress, and watch films/socialise low key at the weekend' vs 'Jilly: works full-time with 3 days in the office, with the day a mixture of meetings and quiet work, and can only do this, and showers only in preparation for those days'. These should be worked up from/with actual patients in their terms that mean something, not guesses/cliches from HCPs.
Or you could add other simple home-based tasks to triangulate an assessment - knowing some would know they can't do that at all without needing to attempt it, and for some it could just be a case of self-monitoring what the categorisation means they probably already do. Maybe steps or reaction tests, but the difficulty there is good day/bad day/what you've done in the days leading up, I guess, so it would need instructions.
But if the issue is comparing to past or ideal health, then could something work directly from a list of things people might include (I guess 'meaningful items')? Did you struggle to cook that meal/watch that film in the past month? Or: have a conversation with someone close to you, on a good day, about a news item while sat upright, and time it, asking them to flag when your 'performance' starts to decline.
For those less severe, when looking at PEM and threshold, it could be seeing what happens if they spend a weekend resting and then have their first day at work, or do an activity they already have to do but which tends to cause PEM or fatiguability (agreed as appropriate to level): on which day their symptoms then hit, and how much. I imagine that if it was a clinic PROM then most people could actually negotiate with their employer if needed to ensure they could complete this. These things might then help to inform/calibrate someone's own self-judgement.
Anyway, I just think it is an interesting point about frames of reference. We are all aware that we assume our own norms and adapt naturally as things get worse, without noticing we've been adapting for a year and now have to admit we actually can't really do a food shop in the supermarket anymore. Hence direct questions (eg 'have you actually been able to complete a grocery shop in the last 6 months?'), or noting that you can now only shower for x amount of time, are better wake-up calls and could be used as some sort of 'self-calculator', if we are thinking about energy threshold.
But, importantly, these could be in a format that provides an output as a visual representation showing eg 'time able to shower', 'time able to talk seriously', 'time able to small talk', and so on. So that even if someone has helped them up to that point, they can see at a glance/feel whether it seems about right. OR, if they are doing it themselves, they can look at the 'overall', which might be represented visually as activity/energy in a week/fortnight, and think 'I'm missing something' or 'no, that's more than I can do'.
Again, it's a way of aiding people to work out which one, but then being able to check for themselves that the 'overall' adds up, which hopefully works towards removing those response shifts due to adapting to deterioration.
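To make the 'check the overall adds up' idea a bit more concrete, here is a very rough Python sketch. Everything in it is my own assumption for illustration: the activity names, the numbers, the `ActivityEstimate` and `weekly_minutes` names, and the 30-minute block size are all made up and not from any validated PROM. It just turns per-activity answers into a weekly total plus a crude text 'picture' that someone could glance at and sense-check.

```python
import math
from dataclasses import dataclass


@dataclass
class ActivityEstimate:
    name: str                   # hypothetical label, e.g. "shower"
    minutes_per_session: float  # self-reported, not measured
    sessions_per_week: int


def weekly_minutes(items):
    """Return (minutes per week for each activity, overall weekly total)."""
    per_activity = {
        a.name: a.minutes_per_session * a.sessions_per_week for a in items
    }
    return per_activity, sum(per_activity.values())


def text_bars(per_activity, minutes_per_block=30):
    """Crude text 'visual': one '#' per started block of minutes."""
    lines = []
    for name, minutes in per_activity.items():
        blocks = math.ceil(minutes / minutes_per_block)
        lines.append(f"{name:<15}{'#' * blocks} {minutes:.0f} min")
    return lines


if __name__ == "__main__":
    # Made-up answers purely to show the shape of the output.
    answers = [
        ActivityEstimate("shower", 10, 3),
        ActivityEstimate("serious talk", 20, 4),
        ActivityEstimate("small talk", 15, 7),
    ]
    per_activity, total = weekly_minutes(answers)
    print("\n".join(text_bars(per_activity)))
    print(f"overall: {total:.0f} min/week")
```

The point of showing the total next to the parts is only so the person can react with 'I'm missing something' or 'no, that's more than I can do'. The sum is not meant to be a score in its own right.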
And to keep things fresh or accessible (by providing choices in tasks), researchers could have groups of patients (of different severities, in case some of these tasks do vary by severity) group such 'self-calculator' tasks by similar 'strenuousness/energy level/payback', so that the tasks rotate rather than being a familiar battery tempting those filling it in to think they know the answer.
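As a rough sketch of what rotating tasks within a group could look like (Python, and again purely hypothetical: the group name, the task wording, and the `pick_task`/`rotation` helpers are placeholders I've invented, and in practice the groupings would have to come from patients, not from me):

```python
import random

# Placeholder group: which tasks really are comparable in
# 'strenuousness/payback' is exactly what patients would need to decide.
TASK_GROUPS = {
    "group_a": [
        "cook a simple meal",
        "watch a film",
        "talk about a news item",
    ],
}


def pick_task(group, recent, rng, groups=TASK_GROUPS):
    """Pick a task from `group`, avoiding recently used ones where possible."""
    candidates = [t for t in groups[group] if t not in recent]
    if not candidates:  # everything was used recently, so allow repeats
        candidates = list(groups[group])
    return rng.choice(candidates)


def rotation(group, rounds, rng, memory=2):
    """Return `rounds` tasks, remembering the last `memory` picks."""
    recent, picked = [], []
    for _ in range(rounds):
        task = pick_task(group, recent, rng)
        picked.append(task)
        recent = (recent + [task])[-memory:]
    return picked


if __name__ == "__main__":
    for task in rotation("group_a", 6, random.Random()):
        print(task)
```

With three tasks and a memory of two, the same task never comes up twice in a row, which is the 'not a familiar battery' bit. Whether swapping tasks like this keeps results comparable is an open question the sketch doesn't answer.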
Of course... it's all very hard and I haven't perfected anything at all here. These are just incomplete 'bits' of thoughts, and we've lots of different measures to try and look at, eg PEM is different to 'threshold'/severity measures (have you actually got worse) and so on. It is just sparking my imagination that maybe, if some of these sources of error have been highlighted, there could be method-type ways to help control them/build in some triangulation. And of course it could be useful to compare these things with eg the objective patterns and, if people use apps etc, test whether those are getting there.