3 Matching Annotations
  1. Jul 2018
    1. On 2018 Jan 21, Tom Kindlon commented:

      References:

      1 Chalder T, Berelowitz G, Pawlikowska T, Watts L, Wessely S, Wright D, Wallace EP: Development of a Fatigue Scale. J Psychosom Res 1993, 37:147-153.

      2 Neu D, Hoffmann G, Moutrier R, Verbanck P, Linkowski P, Le Bon O. Are patients with chronic fatigue syndrome just 'tired' or also 'sleepy'? J Sleep Res. 2008 Dec;17(4):427-31. doi: 10.1111/j.1365-2869.2008.00679.x.

      3 Goudsmit EM, Stouten B, Howes S. Fatigue in myalgic encephalomyelitis. Bulletin of the IACFS/ME. 2008;16(3):3-10. https://web.archive.org/web/20140719090603/http://www.iacfsme.org/BULLETINFALL2008/Fall08GoudsmitFatigueinMyalgicEnceph/tabid/292/Default.aspx

      4 Goldsmith LP, Dunn G, Bentall RP, Lewis SW, Wearden AJ. Correction: Therapist Effects and the Impact of Early Therapeutic Alliance on Symptomatic Outcome in Chronic Fatigue Syndrome. PLoS One. 2016 Jun 1;11(6):e0157199. doi: 10.1371/journal.pone.0157199. eCollection 2016. https://doi.org/10.1371/journal.pone.0157199.s001 (CSV form: https://www.mediafire.com/file/rvh3brmgoaznude/Goldsmith+2015+FINE+data+journal.csv )

      5 A dataset from the PACE trial. Released following a freedom of information request. https://sites.google.com/site/pacefoir/pace-ipd_foia-qmul-2014-f73.xlsx Readme file: https://sites.google.com/site/pacefoir/pace-ipd-readme.txt.

      6 Morriss RK, Wearden AJ, Mullis R. Exploring the validity of the Chalder Fatigue scale in chronic fatigue syndrome. J Psychosom Res. 1998 Nov;45(5):411-7.

      7 Loge JH, Ekeberg O, Kaasa S. Fatigue in the general Norwegian population: normative data and associations. J Psychosom Res 1998; 45: 53-65.

      8 Van Kessel K, Moss-Morris R, Willoughby, Chalder T, Johnson MH, Robinson E. A randomized controlled trial of cognitive behavior therapy for multiple sclerosis fatigue. Psychosom Med. 2008;70:205-213.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    2. On 2018 Jan 21, Tom Kindlon commented:

      My feedback on content of Common Data Elements (Fatigue) - Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) CDE Public Review

      I submitted this to the NIH and thought I would also post it here.

      Suggested Change: Don't use the Chalder fatigue questionnaire

      Rationale: The Chalder Fatigue questionnaire has two separate scoring systems, bimodal (0-11) and Likert (0-33) [1]. Some of the issues raised below are more significant with one system rather than the other.

      (i) Doubts about the validity of two of the items in the questionnaire as means to measure fatigue:

      The item “Do you have problems starting things” seems as though it could relate more to motivation or some other issue rather than fatigue specifically.

      The item “Do you feel sleepy or drowsy” relates more to sleepiness than fatigue. Sleepiness and fatigue are not necessarily the same thing [2].

      Most studies that used the Chalder fatigue scale do not give details of scores on individual items but one study [3] reported the following in participants with ME: “Focusing on the individual items revealed that 86.8% of the questions making up the physical fatigue subscale received near maximal or maximum scores. The items which received the greatest number of low scores were question 3 (‘do you feel sleepy or drowsy’) and question 4 (‘do you have problems starting things’).”

      (ii) Ceiling effects are a significant issue when the Chalder fatigue questionnaire is used with patients with ME and CFS, particularly with bimodal scoring:

      A study of those with ME [3] found that “Fifty per cent of the patients recorded the maximum score using the bimodal method and 77% recorded the two highest scores [i.e. either 10 or 11].” In the FINE and PACE trials, 76% (147/193) and 65% (417/640) of CFS participants, respectively, reported the highest score (11) at baseline using bimodal scoring [4,5].

      With regard to Likert scoring, a study of those with ME found some evidence of a ceiling effect in those who were severely affected (more details were not reported, but the average score for those severely affected was 30.55, SD 2.66). In the FINE and PACE trials, 29.1% (57/196) and 14.5% (93/640) of the participants with CFS, respectively, scored the maximum of 33 at baseline.

      There is also a 14-item version of the instrument with three extra items. A study of 136 individuals with CFS looking at Likert scoring found there was near-maximal scoring on 6 of the 8 physical fatigue items [6].

      The authors of the ME study [3] noted with regard to bimodal scoring that there was a “marked overlap between those who rated themselves as moderately or severely ill. These findings are indications of a low ceiling.” This could lead to the questionnaire failing to detect patients moving from being severely to moderately affected and vice versa.

      Furthermore, if patients are already at a ceiling score at the start of the intervention, the questionnaire cannot detect their getting worse. This could mean that evidence of harm would not be recorded. Also, this phenomenon could affect measures of efficacy: if a certain percentage of patients improved and the same percentage worsened to a similar level, this could show up as an average improvement because the scores for those who got worse would not change if they were already at the ceiling level.

      This could also make interventions that caused a significant number of deteriorations seem better than those that caused fewer. For example, consider a scenario in which one intervention caused a certain percentage of patients to improve while the same percentage, who began at the maximum score, worsened by the same amount. If another intervention caused half the number of patients to both improve and worsen, the average numerical improvement for the first intervention would be twice that of the second, even though rationally the scores should be the same.
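      The scenario above can be sketched numerically. This is only an illustration under stated assumptions: the cohort size (100), the numbers improving and worsening (20 and 10), the size of the true change (6 points), and the simplification that every patient starts at the Likert maximum of 33 are all hypothetical values chosen for the example, not figures from any trial.

```python
MAX_SCORE = 33  # top of the Likert range given in the comment


def mean_recorded_change(n_patients, n_improve, n_worsen, true_change):
    """Mean change in the recorded score (negative = apparent improvement).

    Hypothetical setup: every patient starts at MAX_SCORE. n_improve patients
    drop by true_change; n_worsen patients truly worsen by true_change, but
    the questionnaire cannot record anything above MAX_SCORE.
    """
    baseline = [MAX_SCORE] * n_patients
    true_final = (
        [MAX_SCORE - true_change] * n_improve
        + [MAX_SCORE + true_change] * n_worsen
        + [MAX_SCORE] * (n_patients - n_improve - n_worsen)
    )
    # The ceiling clips every worsened score back to the maximum.
    recorded_final = [min(score, MAX_SCORE) for score in true_final]
    changes = [after - before for before, after in zip(baseline, recorded_final)]
    return sum(changes) / n_patients


# Intervention A: 20 improve and 20 worsen. Intervention B: 10 and 10.
intervention_a = mean_recorded_change(100, 20, 20, 6)
intervention_b = mean_recorded_change(100, 10, 10, 6)
print(intervention_a, intervention_b)  # -1.2 -0.6
```

      In both hypothetical interventions the true average change is zero, yet the recorded improvement for the first is twice that of the second, simply because it affected more patients in both directions.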

      (iii) Discussion of the ability of respondents to mark symptoms as occurring “less than usual”:

      The fact that participants can rate their fatigue symptoms as occurring “less than usual” can lead to some odd results with Likert scoring of the Chalder scale (it is not an issue with its bimodal scoring). People who have no fatigue problems should generally score 11/33, indicating that they had problems ‘no more than usual’. And, indeed, a study in Norway found that those in the category “No disease/current health problem” had a mean score of 11.2 [7].
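      A minimal sketch of the two scoring systems may make the 11/33 figure concrete. The comment quotes only the “less than usual” and “no more than usual” options; the other two option labels and the bimodal weights of 0, 0, 1, 1 used here are assumptions, chosen to be consistent with the 0-11 and 0-33 ranges stated above.

```python
N_ITEMS = 11  # items in the standard version of the questionnaire

# Per-item weights. Only the first two labels are quoted in the comment;
# the remaining labels and the bimodal weights are assumptions.
LIKERT_WEIGHTS = {
    "less than usual": 0,
    "no more than usual": 1,
    "more than usual": 2,
    "much more than usual": 3,
}
BIMODAL_WEIGHTS = {
    "less than usual": 0,
    "no more than usual": 0,
    "more than usual": 1,
    "much more than usual": 1,
}


def total_score(responses, weights):
    """Sum the per-item weights for one completed questionnaire."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(weights[response] for response in responses)


no_change = ["no more than usual"] * N_ITEMS
all_less = ["less than usual"] * N_ITEMS

print(total_score(no_change, LIKERT_WEIGHTS))   # 11
print(total_score(all_less, LIKERT_WEIGHTS))    # 0
print(total_score(no_change, BIMODAL_WEIGHTS))  # 0
print(total_score(all_less, BIMODAL_WEIGHTS))   # 0
```

      Under these assumed weights, bimodal scoring treats “less than usual” and “no more than usual” identically, which is why the issue arises only with Likert scoring.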

      However, a study found that people with "multiple sclerosis fatigue" after an intervention reported an average fatigue score of 7.80 – that is, lower than 11; this score also showed lower fatigue than that of a healthy, nonfatigued comparison group in the study [8]. It is very unlikely to be true that patients with multiple sclerosis fatigue at baseline ended the study with lower fatigue than healthy people. Scores of less than 11 were also reported by those with CFS in the FINE and PACE trials [4,5].

      I will now explore further how pooling scores below 11 with other scores can give odd results. Say 75% of participants gave a Likert score of 4 and 25% gave a score of 24. This would give an average score of 9, which is better than the score of 11 that healthy people report. However, it is likely that people who scored 4 on the scale were confused by the peculiar option on the Chalder questionnaire that allows them to rate themselves as having fewer problems with fatigue than when they were last well (choosing that option is the only way to get a score below 11). If they really meant to say that they had no more fatigue than when they were last well, then their score should really have been similar to that of the average healthy person, at 11.2. Substituting this score for 4 in this example would give an average score for the group of 14.4, which is worse than what healthy people score. The latter is, I believe, a better representation of what the average fatigue score for the group would be: that is, if a significant percentage still had significant fatigue, then the overall fatigue level should be worse on average than that of a healthy group, not better. This shows that the ability to have better scores than healthy people doesn’t just affect the validity of individual scores; it also affects the validity of overall mean scores.
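      The arithmetic in this example can be checked with a short weighted-mean calculation. The helper name `pooled_mean` is introduced here purely for illustration; the proportions and scores are the ones given in the paragraph above.

```python
def pooled_mean(groups):
    """Weighted mean for a list of (proportion, score) pairs."""
    total_weight = sum(proportion for proportion, _ in groups)
    weighted_sum = sum(proportion * score for proportion, score in groups)
    return weighted_sum / total_weight


# As reported: 75% score 4 and 25% score 24.
reported = pooled_mean([(0.75, 4), (0.25, 24)])

# Adjusted: the 75% are given the healthy-population mean of 11.2 instead of 4.
adjusted = pooled_mean([(0.75, 11.2), (0.25, 24)])

print(round(reported, 1), round(adjusted, 1))  # 9.0 14.4
```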

      [The references for this comment are in my reply below]


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
