
Analyzing ordinal data with metric models: What could possibly go wrong?, Liddell et al. 2018

Discussion in 'Research methodology news and research' started by ME/CFS Skeptic, Nov 26, 2021.

  1. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,507
    Location:
    Belgium
    Abstract
    We surveyed all articles in the Journal of Personality and Social Psychology (JPSP), Psychological Science (PS), and the Journal of Experimental Psychology: General (JEP:G) that mentioned the term “Likert,” and found that 100% of the articles that analyzed ordinal data did so using a metric model. We present novel evidence that analyzing ordinal data as if they were metric can systematically lead to errors. We demonstrate false alarms (i.e., detecting an effect where none exists, Type I errors) and failures to detect effects (i.e., loss of power, Type II errors). We demonstrate systematic inversions of effects, for which treating ordinal data as metric indicates the opposite ordering of means than the true ordering of means. We show the same problems — false alarms, misses, and inversions — for interactions in factorial designs and for trend analyses in regression. We demonstrate that averaging across multiple ordinal measurements does not solve or even ameliorate these problems. A central contribution is a graphical explanation of how and when the misrepresentations occur. Moreover, we point out that there is no sure-fire way to detect these problems by treating the ordinal values as metric, and instead we advocate use of ordered-probit models (or similar) because they will better describe the data. Finally, although frequentist approaches to some ordered-probit models are available, we use Bayesian methods because of their flexibility in specifying models and their richness and accuracy in providing parameter estimates. An R script is provided for running an analysis that compares ordered-probit and metric models.

    Full article: https://www.sciencedirect.com/science/article/abs/pii/S0022103117307746
     
  2. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,507
    Location:
    Belgium
    A bit of background: This is mostly about statistics but it may be relevant to many ME/CFS studies.

    In many psychological studies, including many studies on ME/CFS, researchers use a Likert-type questionnaire.

    A Likert item typically refers to a question for which the response is indicated on a discrete ordered scale ranging from one qualitative endpoint to another qualitative endpoint. For example, patients are given a treatment and then have to indicate how their health changed with 5 possible options: "much deteriorated", "deteriorated", "stayed the same", "improved", "much improved".

    These data are usually analyzed by giving each response a number from 1 to 5 and then treating that number as if it were a metric measurement, like a count or a height. The problem is that this assumes the distance between adjacent responses is exactly the same, which isn't necessarily the case. There is clearly an order, in that "much deteriorated" is worse than "deteriorated" etc., but the distances between the responses can vary.

    This paper argues that ignoring this problem can lead to misleading results in both directions: falsely suggesting there is a difference when there is none, or falsely suggesting there is none when there is one. There is a solution to this, namely using an ordinal model (such as the ordered-probit model the authors advocate) that models the cumulative probabilities of the responses. Unfortunately, this is rarely done. The authors looked at 68 articles in psychological journals and found that "every one treated the ordinal data as metric and used a metric model."
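
    To make the problem concrete, here is a minimal sketch in Python (an illustration only, not the R script from the paper). It assumes an ordered-probit setup: each answer comes from an unobserved normal variable that is cut into five categories at fixed thresholds. The threshold values, group means and standard deviations below are made up for illustration.

```python
from statistics import NormalDist

# Hypothetical cut points on the latent scale that separate responses 1..5.
THRESHOLDS = [1.5, 2.5, 3.5, 4.5]


def category_probs(mu, sigma, thresholds=THRESHOLDS):
    """Probability of each ordered response under an ordered-probit model."""
    latent = NormalDist(mu, sigma)
    # Cumulative probability of answering at or below each category.
    cumulative = [latent.cdf(t) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def metric_mean(probs):
    """Mean obtained by treating the labels 1..5 as if they were metric."""
    return sum(k * p for k, p in enumerate(probs, start=1))


# Groups A and B share the same latent mean; only the latent spread differs.
probs_a = category_probs(mu=1.0, sigma=1.0)
probs_b = category_probs(mu=1.0, sigma=2.5)
# Group C has a HIGHER latent mean than group B, but a smaller spread.
probs_c = category_probs(mu=1.2, sigma=1.0)

metric_a = metric_mean(probs_a)
metric_b = metric_mean(probs_b)
metric_c = metric_mean(probs_c)

print(f"A (latent mean 1.0, sd 1.0): metric mean {metric_a:.2f}")
print(f"B (latent mean 1.0, sd 2.5): metric mean {metric_b:.2f}")
print(f"C (latent mean 1.2, sd 1.0): metric mean {metric_c:.2f}")
```

    Under these assumed numbers, groups A and B have the same latent mean, yet B gets a clearly higher average on the 1 to 5 scale simply because its latent spread is larger and most answers pile up at the lowest category. Group C has a higher latent mean than B but a lower average score, which is the kind of inversion the abstract describes.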
     
  3. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,507
    Location:
    Belgium
    @Lucibee has previously highlighted this issue in regard to the Chalder Fatigue Scale but I think it applies to many other scales of this format as well.
     
    ola_cohn, Michelle, Amw66 and 12 others like this.
  4. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,385
    This does not surprise me in the slightest. I've thought since first encountering this notion, that it is dangerously naive to assume it valid to simplistically map arbitrarily scaled ordinal data onto an analogue scale. Especially when there is no telling what the real intervals are, let alone assuming those intervals to be the same for everyone. As the paper's title says, "What could possibly go wrong?".
     
    Amw66, Mithriel, MSEsperanza and 6 others like this.
  5. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    3,507
    Location:
    Belgium
    It would be interesting if we could reanalyze the data with an appropriate ordinal regression model and see if the results differ.

    With the Chalder Fatigue Scale there is the additional difficulty that it combines several of these Likert items into one total. I think that to model these in an appropriate way we would need the responses of each participant to each individual question. And the Chalder Fatigue Scale data that is currently available is, to my knowledge, just the summed total score.
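
    A rough sketch of why the total score alone is not enough: many different item-level answer patterns collapse onto the same sum. The Python below counts them, assuming eleven items each scored 0 to 3 (the scoring range is an assumption for illustration, and the function name is hypothetical).

```python
def patterns_per_total(n_items=11, max_score=3):
    """Count how many item-level answer patterns produce each total score.

    Assumes every item is scored 0..max_score and the total is a plain sum.
    """
    counts = [1]  # with zero items there is exactly one way to reach total 0
    for _ in range(n_items):
        new_counts = [0] * (len(counts) + max_score)
        for total, ways in enumerate(counts):
            for score in range(max_score + 1):
                new_counts[total + score] += ways
        counts = new_counts
    return counts


counts = patterns_per_total()

# Print a few totals to show how many distinct patterns share each sum.
for total in (0, 11, 16, 33):
    print(f"total {total:2d}: {counts[total]} possible answer patterns")
```

    Only the extreme totals correspond to a single answer pattern. Any total in between could have come from a very large number of different combinations of item responses, so the per-item data an ordinal model would need cannot be recovered from the sum.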
     
  6. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,445
    Location:
    London, UK
    It is good to see a detailed analysis of this.
    It is a good example of the old adage 'lies, damned lies and statistics'. Statistics so easily creates make-believe data.
     
    Ash, MSEsperanza, Sean and 8 others like this.
  7. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,445
    Location:
    London, UK
    Perhaps it should be called the Simon fatigue scale since Trudie Chalder has disowned it and said it was really Simon's.
     
    Ash, MSEsperanza, Sean and 8 others like this.
  8. Ariel

    Ariel Senior Member (Voting Rights)

    Messages:
    1,057
    Location:
    UK
    I did not know this. Why did she feel the need to do this?
     
    Ash, MSEsperanza, Sean and 4 others like this.
  9. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,445
    Location:
    London, UK
    I cannot remember where I read this but apparently she was suggesting that this was really Simon's idea and that she just made up a bit of a scale for him or something. So maybe disowned is a bit strong. Maybe she was just too modest to acknowledge her mastery of method.
     
    Ash, Sean, Caroline Struthers and 6 others like this.
  10. Hutan

    Hutan Moderator Staff Member

    Messages:
    26,839
    Location:
    Aotearoa New Zealand
    There was a recent study, I think an Italian one on Long Covid, that reported the results of a general health survey question by question. It added a lot of richness to the results, highlighting which specific areas differed from controls. As well as the Likert scale issue being addressed, I'd like to see more researchers publishing the detail of their survey instrument results. It's a way of getting around the problem of adding up results from a range of quite separate questions and assuming the resulting sum means something.

    For example, if respondents are scoring 'I feel lonely' very highly in a survey assessing neuroticism, it might indicate that there is an external problem that can be fixed or at least recognised (e.g. the stigma of the disease, help needed to access patient support groups...), rather than simply labelling them with a problematic personality characteristic.

    Or, for example, the pattern of responses to questions by people with ME/CFS in a depression survey compared to those who have depression may help to suggest that some of the questions aren't measuring depression, but are instead measuring physical incapacity.
     
    Snow Leopard, Michelle, Amw66 and 8 others like this.
  11. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,385
    I think I have read somewhere, or maybe it was in an interview of his, how he describes coming up with it in an afternoon, or something very akin to that. It seemed a very ad hoc and - dare I say it - subjective process.
     
    Ash, MSEsperanza, Sean and 5 others like this.
  12. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,445
    Location:
    London, UK
    An after lunch fatigue scale then?
     
    Ash, MSEsperanza, Sean and 6 others like this.
  13. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,385
    This is the only reference I can find to that, but thought there was a bit more somewhere:
    https://meassociation.org.uk/2012/0...-fatigue-syndrome-journal-article-march-2012/

    SW definitely claims ownership of it though.
     
    Michelle, Lucibee, Ash and 7 others like this.
  14. chrisb

    chrisb Senior Member (Voting Rights)

    Messages:
    4,602
    Michelle, Ash, MSEsperanza and 3 others like this.
  15. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,385
  16. Trish

    Trish Moderator Staff Member

    Messages:
    52,218
    Location:
    UK
  17. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,445
    Location:
    London, UK
    Yes that was it.

    And I think the reason it has been used so much is because its very short, it’s easy to administer, it includes items related to physical and mental fatigue, and its got eleven items

    Really rigorous stuff then.
     
  18. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,385
    Exactly. All the right ingredients a scientific measure could ever possibly want.
     
  19. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    13,445
    Location:
    London, UK
    Was that why Brian Hughes said something about scoring 11 for nostalgia?

    Seems it was.

    I should go back to sleep.
     
    Peter Trewhitt, Ash, FMMM1 and 4 others like this.
  20. Sean

    Sean Moderator Staff Member

    Messages:
    7,159
    Location:
    Australia
    And I think the reason it has been used so much is because its very short, it’s easy to administer, it includes items related to physical and mental fatigue, and its got eleven items.

    Accuracy and relevance seem to be missing from that list.
     
