Idea: Web app to compile all ME/CFS study test results

I downloaded every abstract from PubMed matching the search term "chronic fatigue syndrome" (in quotes). I wrote a script that sends each abstract, one at a time, to the Claude API and asks whether it is original research on ME/CFS. Here's the prompt:

You are tasked with analyzing a title and abstract from PubMed to determine if it reports original research testing something specifically in chronic fatigue syndrome (CFS), myalgic encephalomyelitis (ME), or ME/CFS. Here is the information you will be analyzing:

<title>
{{TITLE}}
</title>

<abstract>
{{ABSTRACT}}
</abstract>

To analyze the title and abstract, consider the following criteria:
1. Does it describe a new study or experiment?
2. Does it mention testing or measuring something specific?
3. Does it specifically involve chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS?
4. Are there comparisons between groups (e.g., patients vs. controls)?
5. Does it report new findings or results?

Based on these criteria, you should provide one of three decisions:

- YES: If the abstract clearly indicates original research testing something specifically in CFS/ME/ME-CFS.
- NO: If it describes a review, meta-analysis, other non-original research, or does not specifically focus on CFS/ME/ME-CFS.
- MAYBE: If there is ambiguity, insufficient information, or if the abstract is empty and the title does not provide enough clues.

Provide your response in the following format:

<analysis>
<decision>[YES/NO/MAYBE]</decision>
<explanation>
[Your explanation here, limited to two sentences at most]
</explanation>
</analysis>

Here are three examples of correct responses:

Example 1:
<analysis>
<decision>YES</decision>
<explanation>
The abstract indicates that the study tested levels of serotonin and other chemicals in ME/CFS patients and healthy controls. This is original research as it involves a specific measurement in ME/CFS and includes a comparison group.
</explanation>
</analysis>

Example 2:
<analysis>
<decision>NO</decision>
<explanation>
This abstract describes a systematic review and meta-analysis of existing studies on the effectiveness of cognitive behavioral therapy for chronic fatigue syndrome. It does not report new, original research, but rather synthesizes and analyzes previously published data.
</explanation>
</analysis>

Example 3:
<analysis>
<decision>MAYBE</decision>
<explanation>
The abstract is missing, and the title alone does not provide sufficient information to determine if this is original research on CFS/ME/ME-CFS. More details would be needed to make a definitive decision.
</explanation>
</analysis>

Please analyze the provided title and abstract and give your response using the specified format.
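
For anyone wanting to reproduce this, here is a minimal sketch of filling in the prompt and reading the reply, assuming the model sticks to the requested format. The function names and the fallback to MAYBE for malformed replies are my own assumptions, not the original script, and the actual API call is left out.

```python
import re

VALID_DECISIONS = {"YES", "NO", "MAYBE"}


def fill_prompt(template: str, title: str, abstract: str) -> str:
    """Substitute the {{TITLE}} and {{ABSTRACT}} placeholders in the prompt."""
    return template.replace("{{TITLE}}", title).replace("{{ABSTRACT}}", abstract)


def parse_analysis(reply: str) -> tuple[str, str]:
    """Pull the decision and explanation out of an <analysis> reply.

    Assumption: anything missing or unexpected is treated as MAYBE so it
    gets looked at by hand instead of being silently counted as YES or NO.
    """
    decision_match = re.search(r"<decision>\s*(.*?)\s*</decision>", reply, re.DOTALL)
    explanation_match = re.search(
        r"<explanation>\s*(.*?)\s*</explanation>", reply, re.DOTALL
    )
    decision = decision_match.group(1).upper() if decision_match else "MAYBE"
    if decision not in VALID_DECISIONS:
        decision = "MAYBE"
    explanation = explanation_match.group(1) if explanation_match else ""
    return decision, explanation
```

Treating unparseable replies as MAYBE keeps them in the pile that needs a manual check.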

I've sent 121 abstracts so far as a test. It's a bit expensive: there are about 8,400 abstracts, and it'll cost about $40-50 to get responses for all of them. There's another Claude model that is 10 times cheaper, but the answers I was getting from it weren't making much sense. This model seems pretty good at making decisions. A lot of the cost comes from the length of the prompt above, so I might have to figure out a way to shorten it without losing accuracy.
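
A rough sketch of the cost arithmetic, to show how much the prompt length matters. Only the 8,400 figure comes from the post. The token counts and the per-million-token prices below are placeholder assumptions, so check them against real token counts and the current pricing page before relying on the totals.

```python
def estimate_cost(n_requests, input_tokens, output_tokens,
                  usd_per_million_input, usd_per_million_output):
    """Total cost in USD for n_requests calls of the given size."""
    per_request = (input_tokens * usd_per_million_input
                   + output_tokens * usd_per_million_output) / 1_000_000
    return n_requests * per_request


# All of these are assumed values for illustration only.
PROMPT_TOKENS = 700     # fixed instructions and examples
ABSTRACT_TOKENS = 400   # typical title plus abstract
REPLY_TOKENS = 100      # decision plus two-sentence explanation
INPUT_PRICE = 3.0       # USD per million input tokens (assumed)
OUTPUT_PRICE = 15.0     # USD per million output tokens (assumed)

total = estimate_cost(8_400, PROMPT_TOKENS + ABSTRACT_TOKENS, REPLY_TOKENS,
                      INPUT_PRICE, OUTPUT_PRICE)
# Same run with the fixed prompt cut in half.
shorter = estimate_cost(8_400, PROMPT_TOKENS // 2 + ABSTRACT_TOKENS, REPLY_TOKENS,
                        INPUT_PRICE, OUTPUT_PRICE)
print(f"full prompt: ${total:.2f}, half-length prompt: ${shorter:.2f}")
```

Under these assumptions the fixed prompt is a large share of every request, which is why shortening it is the obvious lever.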

Anyway, here are the results of the first few. I attached a text file with all the abstract responses I've gotten so far.

NO: Towards an understanding of physical activity-induced post-exertional malaise: Insights into microvascular alterations and immunometabolic interactions in post-COVID condition and myalgic encephalomyelitis/chronic fatigue syndrome.
* This abstract describes a review of current evidence on post-exertional malaise in Post-COVID Condition and ME/CFS, rather than reporting original research. While it focuses on ME/CFS, it does not present new experimental findings or comparisons between groups.

NO: Experiences Using Nonpharmacological Interventions for Chronic Fatigue: A Focus Group Study of Long-Term Survivors of Young Adult Cancers With Fatigue.
* This study focuses on cancer-related fatigue in young adult cancer survivors, not specifically on chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS. While it is original research, it does not meet the criteria of testing something specifically in CFS/ME/ME-CFS.

MAYBE: Correction: Socioeconomic determinants of myalgic encephalomyelitis/chronic fatigue syndrome in Norway: a registry study.
* The title suggests a registry study on socioeconomic determinants of ME/CFS in Norway, which could be original research. However, as there is no abstract available and the title indicates this is a correction, there is insufficient information to determine if this meets all criteria for original research testing something specific in ME/CFS.

NO: Ear abnormalities in chronic fatigue syndrome (CFS), fibromyalgia (FM), Coronavirus-19 infectious disease (COVID) and long-COVID syndrome (PCS), sick-building syndrome (SBS), post-orthostatic tachycardia syndrome (PoTS), and autoimmune/inflammatory syndrome induced by adjuvants (ASIA): A systematic review.
* This abstract describes a systematic review of literature regarding ear abnormalities in various conditions including CFS, rather than reporting original research. While it focuses partly on CFS, it is a review of existing studies from 1990 to 2024 and does not present new experimental findings.

NO: Microglia and Sleep Disorders.
* This abstract describes a book chapter or review article discussing the involvement of microglia in various sleep disorders, including ME/CFS, rather than reporting original research. While it mentions ME/CFS, it does not describe a specific experiment or new findings related to the condition.

MAYBE: A Multimodal Magnetic Resonance Imaging Study on Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Feasibility and Clinical Correlation.
* The title suggests a multimodal MRI study on ME/CFS, which could indicate original research. However, the abstract is missing, so there is insufficient information to determine if this study meets all the criteria for original research testing something specific in ME/CFS.

MAYBE: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Impact on Quality of Life (QoL) of Persons with ME/CFS.
* The title suggests a focus on ME/CFS and quality of life, but without an abstract, it's unclear if this is original research or a review. More information would be needed to determine if the study involves new measurements, comparisons, or reports original findings.

NO: Long COVID Is Not a Functional Neurologic Disorder.
* This abstract presents a perspective on why long COVID should not be considered a functional neurologic disorder, referencing ME/CFS but not focusing on it specifically. It does not describe new original research or experiments, but rather discusses existing evidence and provides a clinical and scientific rationale.

YES: Patient experiences of remote consulting with chronic fatigue syndrome/myalgic encephalomyelitis and fibromyalgia: a qualitative study.
* This abstract describes a new qualitative study exploring the experiences of people with CFS/ME and fibromyalgia when consulting remotely in primary care. The study involves semi-structured interviews with patients, focusing specifically on CFS/ME, and reports new findings about their experiences with remote consulting.

NO: Long COVID diagnostic with differentiation from chronic lyme disease using machine learning and cytokine hubs.
* This abstract describes original research on developing a diagnostic tool for long COVID using machine learning and cytokine analysis, with a comparison to chronic Lyme disease. However, it does not specifically focus on or mention chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS.

NO: Pursuit-worthy research in health: Three examples and a suggestion.
* This abstract describes a philosophical argument for community participation in research, using chronic fatigue syndrome as one of three examples. It does not report original research or new findings specifically testing something in CFS/ME/ME-CFS, but rather presents a conceptual framework for research approaches.

YES: Dysregulation of tetrahydrobiopterin metabolism in myalgic encephalomyelitis/chronic fatigue syndrome by pentose phosphate pathway.
* The abstract describes a new study investigating the molecular mechanism of biopterin biogenesis in ME/CFS patients with orthostatic intolerance. It reports original research findings on the role of the pentose phosphate pathway in regulating biopterin levels specifically in ME/CFS, including gene screening, enzyme kinetic studies, and a novel cell culture model.

NO: What Makes It Tick: Exploring the Mechanisms of Post-treatment Lyme Disease Syndrome.
* The title and abstract focus on Post-treatment Lyme Disease Syndrome (PTLDS) and do not specifically mention or study chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS. While it describes PTLDS symptoms, it appears to be an overview or exploration of mechanisms rather than reporting original research or specific testing.

NO: From human herpes virus-6 reactivation to autoimmune reactivity against tight junctions and neuronal antigens, to inflammation, depression, and chronic fatigue syndrome due to Long COVID.
* While this study does involve original research and compares Long COVID patients with healthy controls, it does not specifically focus on chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS as the primary condition being studied. The research primarily investigates autoimmune responses and inflammation in Long COVID patients, with chronic fatigue syndrome mentioned only as a symptom of Long COVID.

NO: Evidence of aberrant anti-epstein-barr virus antibody response, though no viral reactivation, in people with post-stroke fatigue.
* While this study does describe original research testing Epstein-Barr virus reactivation and antibody response in relation to fatigue, it specifically focuses on post-stroke fatigue, not CFS/ME/ME-CFS. The study mentions ME/CFS only as a comparison point, but does not directly involve CFS/ME/ME-CFS patients in its research.

YES: Fast Targeted Metabolomics for Analyzing Metabolic Diversity of Bacterial Indole Derivatives in ME/CFS Gut Microbiome.
* The abstract describes a new study that developed a fast targeted metabolomics method to analyze metabolic diversity of bacterial indole derivatives in ME/CFS gut microbiome. It specifically mentions testing metabolic differences in indole derivatives among samples, indicating original research focused on ME/CFS.

NO: New-onset sarcoidosis in a patient with long COVID.
* The abstract describes a potential link between long COVID and sarcoidosis, but does not mention chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS. It does not report original research or specific testing, instead discussing general observations and potential implications for future research.

NO: Nonpelvic comorbid symptoms of 45 patients with pain of pelvic venous origin, before and after treatment.
* While the study mentions ME/CFS symptoms, it primarily focuses on patients with venous-origin chronic pelvic pain (VO-CPP) and their treatment outcomes. The research does not specifically test or measure anything in CFS/ME/ME-CFS patients, but rather observes overlapping symptoms in VO-CPP patients.

NO: Methylsulfinyl Hexyl Isothiocyanate (6-MSITC) from Wasabi Is a Promising Candidate for the Treatment of Cancer, Alzheimer's Disease, and Obesity.
* The title and partial abstract do not mention chronic fatigue syndrome, myalgic encephalomyelitis, or ME/CFS. Instead, they focus on a compound from wasabi as a potential treatment for cancer, Alzheimer's disease, and obesity.

NO: Hypothesis: 2 Major Environmental and Pharmaceutical Factors-Acetaminophen Exposure and Gastrointestinal Overgrowth of Clostridia Bacteria Induced By Ingestion of Glyphosate-Contaminated Foods-Dysregulate the Developmental Protein Sonic Hedgehog and Are Major Causes of Autism.
* This abstract presents a hypothesis about potential causes of autism spectrum disorder, not chronic fatigue syndrome or myalgic encephalomyelitis. It does not describe original research or experiments, but rather proposes a theoretical link between environmental factors and autism.

YES: Absence of BOLD adaptation in chronic fatigue syndrome revealed by task functional MRI.
* This abstract describes an original research study that specifically tested BOLD signal changes during cognitive tasks in ME/CFS patients compared to healthy controls. The study reports new findings on the absence of BOLD adaptation in ME/CFS, providing insights into potential neurophysiological mechanisms of the condition.

NO: Understanding the kynurenine pathway: A narrative review on its impact across chronic pain conditions.
* This abstract describes a narrative review of existing research on the kynurenine pathway across various chronic pain conditions, including chronic fatigue syndrome. It does not report new, original research specifically on CFS/ME/ME-CFS, but rather summarizes current knowledge on the topic.

NO: Case-control study on post-COVID-19 conditions reveals severe acute infection and chronic pulmonary disease as potential risk factors.
* While this study does describe original research on long COVID and mentions chronic fatigue syndrome as a potential risk factor, it does not specifically focus on testing or measuring something in CFS/ME/ME-CFS patients. The primary focus is on long COVID risk factors and phenotypes in a general COVID-19 patient population.

YES: Comorbidity and sex differences in functional disorders and internalizing disorders.
* The abstract describes an original study analyzing comorbidity rates between functional disorders (including ME/CFS) and internalizing disorders using self-reported diagnostic criteria. It reports new findings on comorbidity and sex differences in a large population-based cohort, specifically including ME/CFS as one of the conditions studied.

YES: Systemic antibody responses against gut microbiota flagellins implicate shared and divergent immune reactivity in Crohn's disease and chronic fatigue syndrome.
* The abstract describes a new study comparing antibody responses against gut microbiota flagellins in both Crohn's disease and ME/CFS patients versus healthy controls. It reports original findings on the differences in antibody binding patterns between these groups, specifically involving ME/CFS patients.

NO: Clinical and Diagnostic Features of Post-Acute COVID-19 Vaccination Syndrome (PACVS).
* While the study involves original research on Post-acute COVID-19 vaccination syndrome (PACVS), it is not specifically focused on CFS/ME/ME-CFS. Although ME/CFS is mentioned as a potential fit for some participants, the primary focus of the study is on characterizing PACVS, not on testing something specifically in CFS/ME/ME-CFS patients.

NO: Stress-Related Chronic Fatigue Syndrome: A Case Report with a Positive Response to Alpha-Methyl-P-Tyrosine (AMPT) Treatment.
* This abstract describes a case report of a single patient with chronic fatigue syndrome treated with alpha-methyl-p-tyrosine (AMPT). While it focuses on CFS and reports on treatment outcomes, a single case report does not constitute original research testing something specifically in CFS/ME/ME-CFS across a broader population.

YES: Plasma Neurofilament Light Chain: A Potential Biomarker for Neurological Dysfunction in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome.
* The abstract describes a new study that measured plasma neurofilament light chain (NfL) levels in ME/CFS patients compared to healthy controls. It reports original research testing a specific biomarker in ME/CFS, including comparisons between groups and new findings.

NO: Widespread Myalgia and Chronic Fatigue: Phagocytes from Macrophagic Myofasciitis Patients Exposed to Aluminum Oxyhydroxide-Adjuvanted Vaccine Exhibit Specific Inflammatory, Autophagic, and Mitochondrial Responses.
* While the study involves original research on macrophagic myofasciitis (MMF) patients who exhibit symptoms similar to ME/CFS, it does not specifically test or measure anything in CFS/ME/ME-CFS patients. The research focuses on MMF phagocytes and their responses to aluminum adjuvants, with ME/CFS mentioned as a comparison for symptoms.

NO: Research progress on pathogenesis of chronic fatigue syndrome and treatment of traditional Chinese and Western medicine.
* This abstract describes a comprehensive review of recent advancements in understanding chronic fatigue syndrome, including its etiology, pathophysiology, and treatment approaches. It does not report original research or new experimental findings, but rather synthesizes existing literature on CFS.

YES: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome After SARS-CoV-2 Infection.
* This abstract describes a prospective cohort study that specifically evaluates the prevalence of ME/CFS-like illness following SARS-CoV-2 infection. The study compares COVID-19 positive and negative groups, reports new findings on ME/CFS-like illness prevalence, and follows participants for up to 12 months.
 


Interesting, are you using Sonnet or Opus? If you don’t mind rate limits, maybe it’s worth seeing what Gemini comes up with; I’ve been doing some simple experiments (unrelated to this) using it. Gemini 1.5 Flash may not be powerful enough, and with its low limits 1.5 Pro would take too long, but 1.0 Pro may fit the bill if it produces good enough results and you can schedule the work over a week.
https://ai.google.dev/pricing

And perhaps an obvious or silly question, but have you tried any of the local models to see if they’re up to the task? It depends on what hardware you’ve got access to. I’ve only experimented with small (2B) models for basic tasks, which wouldn’t be up to this, but maybe the larger models would be, if you can run them?
 

This was with Sonnet. Opus would cost about 5 times more.

Interesting, looks like for the free tier of 1.0 Pro, I can do all 8,400 in about 10 hours. I'm not too hopeful it'll be much better than Claude Haiku, which was pretty bad, but I'll try with the same 121 abstracts.
 
And perhaps an obvious or silly question, but have you tried any of the local models to see if they’re up to the task? It depends on what hardware you’ve got access to. I’ve only experimented with small (2B) models for basic tasks, which wouldn’t be up to this, but maybe the larger models would be, if you can run them?

I don't have a GPU on my computer. I once tried the biggest model that could comfortably run on my laptop, and it was both pretty bad at understanding compared to the biggest ones and incredibly slow.
 
but 1.0 Pro may fit the bill if it produces good enough results and you can schedule the work over a week.
Oh I wasn't looking carefully at the rate limits. There's also a daily limit of 1500 requests, so it would be closer to a week. But it's also a lot cheaper if I just want to pay and do it quickly. Somewhere around $5-10 for all of them, which isn't that bad.
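
The rate-limit arithmetic can be sketched like this. The 8,400 abstracts and the 1,500 requests per day are from the posts. The 15 requests per minute is an assumed per-minute limit that happens to reproduce the earlier "about 10 hours" estimate.

```python
import math


def days_needed(n_requests: int, per_day: int) -> int:
    """Whole days needed when capped by a daily request limit."""
    return math.ceil(n_requests / per_day)


def hours_needed(n_requests: int, per_minute: int) -> float:
    """Hours needed when capped only by a per-minute request limit."""
    return n_requests / per_minute / 60


N_ABSTRACTS = 8_400
DAILY_LIMIT = 1_500       # from the post
PER_MINUTE_LIMIT = 15     # assumed, not stated in the thread

print(f"per-minute limit only: {hours_needed(N_ABSTRACTS, PER_MINUTE_LIMIT):.1f} hours")
print(f"with the daily limit:  {days_needed(N_ABSTRACTS, DAILY_LIMIT)} days")
```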

Anyway, I ran the same 121 abstracts through Gemini Pro 1.0. It disagreed with Claude on 28 of them, 9 of which were a flip from YES to NO or vice versa. For the other 19, one of the models said MAYBE.

Gemini said this for one: "YES: The study is a meta-analysis of six randomized controlled trials, pooling data to investigate whether cognitive behavioral therapy (CBT) effectiveness is moderated by depressive symptoms in patients with ME/CFS. This analysis is original research and focuses specifically on ME/CFS."

Even though the prompt explicitly says: "NO: If it describes a review, meta-analysis, other non-original research, or does not specifically focus on CFS/ME/ME-CFS."

I think it might have to be a better model, or there will be a lot of mistakes like this.

Here are the first five that didn't match, along with each model's explanation. There's a character limit in these posts, so all 28 are in an attached text file.

1: Patient experiences of remote consulting with chronic fatigue syndrome/myalgic encephalomyelitis and fibromyalgia: a qualitative study.
Claude: YES
* This abstract describes a new qualitative study exploring the experiences of people with CFS/ME and fibromyalgia when consulting remotely in primary care. The study involves semi-structured interviews with patients, focusing specifically on CFS/ME, and reports new findings about their experiences with remote consulting.

Gemini: NO
* This study focuses on patient experiences with remote consulting specifically in CFS/ME and fibromyalgia, but it does not report any original research testing or measuring something specific. It is a qualitative study exploring opinions and experiences.

------

2: Case-control study on post-COVID-19 conditions reveals severe acute infection and chronic pulmonary disease as potential risk factors.
Claude: NO
* While this study does describe original research on long COVID and mentions chronic fatigue syndrome as a potential risk factor, it does not specifically focus on testing or measuring something in CFS/ME/ME-CFS patients. The primary focus is on long COVID risk factors and phenotypes in a general COVID-19 patient population.

Gemini: MAYBE
* While the abstract mentions "chronic fatigue syndrome (CFS)" as a risk factor for long COVID, it doesn't describe any specific research conducted *on* CFS itself. The focus is on long COVID and its risk factors.

------

3: Comorbidity and sex differences in functional disorders and internalizing disorders.
Claude: YES
* The abstract describes an original study analyzing comorbidity rates between functional disorders (including ME/CFS) and internalizing disorders using self-reported diagnostic criteria. It reports new findings on comorbidity and sex differences in a large population-based cohort, specifically including ME/CFS as one of the conditions studied.

Gemini: MAYBE
* While the abstract mentions ME/CFS as part of a group of functional disorders (FDs), it doesn't specify any particular test or measurement specifically focused on ME/CFS. The research focuses on comorbidity between FDs and internalizing disorders (IDs), not specific testing within ME/CFS.

------

4: Clinical and Diagnostic Features of Post-Acute COVID-19 Vaccination Syndrome (PACVS).
Claude: NO
* While the study involves original research on Post-acute COVID-19 vaccination syndrome (PACVS), it is not specifically focused on CFS/ME/ME-CFS. Although ME/CFS is mentioned as a potential fit for some participants, the primary focus of the study is on characterizing PACVS, not on testing something specifically in CFS/ME/ME-CFS patients.

Gemini: MAYBE
* While the abstract mentions ME/CFS and compares PACVS patients to ME/CFS criteria, it doesn't explicitly describe a new experiment or measurement specific to ME/CFS. The focus is on characterizing PACVS, not directly investigating ME/CFS.

------

5: Stress-Related Chronic Fatigue Syndrome: A Case Report with a Positive Response to Alpha-Methyl-P-Tyrosine (AMPT) Treatment.
Claude: NO
* This abstract describes a case report of a single patient with chronic fatigue syndrome treated with alpha-methyl-p-tyrosine (AMPT). While it focuses on CFS and reports on treatment outcomes, a single case report does not constitute original research testing something specifically in CFS/ME/ME-CFS across a broader population.

Gemini: MAYBE
* The abstract describes a case report of a single patient with a complex medical history, including CFS, and their response to AMPT treatment. While this involves a patient with CFS, it does not provide clear evidence of a controlled study specifically testing AMPT in CFS.
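
Tallying mismatches like these is easy to script. This is a small sketch, assuming each model's decisions are stored in a dict keyed by title. The short keys are my own abbreviations of the five titles above, and the function name is hypothetical.

```python
def compare_decisions(a: dict[str, str], b: dict[str, str]) -> dict[str, int]:
    """Count disagreements between two models on the items both have judged."""
    shared = a.keys() & b.keys()
    mismatches = [key for key in shared if a[key] != b[key]]
    # A "flip" is a direct YES/NO contradiction; the rest involve a MAYBE.
    flips = [key for key in mismatches if {a[key], b[key]} == {"YES", "NO"}]
    return {
        "compared": len(shared),
        "mismatches": len(mismatches),
        "flips": len(flips),
        "involving_maybe": len(mismatches) - len(flips),
    }


# Decisions for the five mismatches listed above.
claude = {
    "remote consulting": "YES",
    "post-COVID case-control": "NO",
    "comorbidity and sex differences": "YES",
    "PACVS": "NO",
    "AMPT case report": "NO",
}
gemini = {
    "remote consulting": "NO",
    "post-COVID case-control": "MAYBE",
    "comorbidity and sex differences": "MAYBE",
    "PACVS": "MAYBE",
    "AMPT case report": "MAYBE",
}

print(compare_decisions(claude, gemini))
```

Separating flips from MAYBE disagreements matters because only the flips are outright contradictions.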
 

Oh, I can filter that list of studies down on PubMed to just clinical trials. Does that include everything that would have tested something? Not sure. But it comes down from 8,426 to 516. Much better.

My plan is to try the embedding approach. After Claude tells me which of these match the criteria, as above, I'll try to get the full text of each one somehow, send that to Claude, and ask it to list every test and result in detail. Depending on how long the papers are and how many there are, that step might still turn out to be very expensive. We'll see.
 
Oops, I had used Gemini 1.5 Flash previously, not 1.0 Pro. I just ran it again with 1.0 Pro, and it's worse: 37 mismatches out of 120. Here are just a few in case anyone is interested, but I'm just going to use Claude Sonnet.

1: Correction: Socioeconomic determinants of myalgic encephalomyelitis/chronic fatigue syndrome in Norway: a registry study.
https://pubmed.ncbi.nlm.nih.gov/39232735
Claude: MAYBE
* The title suggests a registry study on socioeconomic determinants of ME/CFS in Norway, which could be original research. However, as there is no abstract available and the title indicates this is a correction, there is insufficient information to determine if this meets all criteria for original research testing something specific in ME/CFS.

Gemini: YES
* The title indicates a study on "socioeconomic determinants" of ME/CFS, suggesting it is original research. While the abstract is unavailable, the title specifically mentions ME/CFS, implying the research focuses on this condition.

------

2: A Multimodal Magnetic Resonance Imaging Study on Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Feasibility and Clinical Correlation.
https://pubmed.ncbi.nlm.nih.gov/39202651
Claude: MAYBE
* The title suggests a multimodal MRI study on ME/CFS, which could indicate original research. However, the abstract is missing, so there is insufficient information to determine if this study meets all the criteria for original research testing something specific in ME/CFS.

Gemini: YES
* The title and abstract mention a multimodal magnetic resonance imaging study specifically for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS), indicating original research on this condition.

------

3: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Impact on Quality of Life (QoL) of Persons with ME/CFS.
https://pubmed.ncbi.nlm.nih.gov/39202496
Claude: MAYBE
* The title suggests a focus on ME/CFS and quality of life, but without an abstract, it's unclear if this is original research or a review. More information would be needed to determine if the study involves new measurements, comparisons, or reports original findings.

Gemini: NO
* The abstract is missing, and the title only mentions the impact of ME/CFS on quality of life, not any specific testing or research.

------

4: Patient experiences of remote consulting with chronic fatigue syndrome/myalgic encephalomyelitis and fibromyalgia: a qualitative study.
https://pubmed.ncbi.nlm.nih.gov/39191480
Claude: YES
* This abstract describes a new qualitative study exploring the experiences of people with CFS/ME and fibromyalgia when consulting remotely in primary care. The study involves semi-structured interviews with patients, focusing specifically on CFS/ME, and reports new findings about their experiences with remote consulting.

Gemini: NO
* The study involves qualitative interviews with CFS/ME and fibromyalgia patients to explore their experiences with remote consulting. It does not describe an original experiment or test something specific in these conditions.

------

5: Nonpelvic comorbid symptoms of 45 patients with pain of pelvic venous origin, before and after treatment.
https://pubmed.ncbi.nlm.nih.gov/39126670
Claude: NO
* While the study mentions ME/CFS symptoms, it primarily focuses on patients with venous-origin chronic pelvic pain (VO-CPP) and their treatment outcomes. The research does not specifically test or measure anything in CFS/ME/ME-CFS patients, but rather observes overlapping symptoms in VO-CPP patients.

Gemini: MAYBE
* The abstract mentions that "overlapping individual symptoms characteristic of fibromyalgia and myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)" were present in 64% to 82% of patients. However, it does not clearly state whether this is original research or if the patients with ME/CFS symptoms were specifically tested or measured in any way.

------

Edit: Oh right, I need to filter for more than clinical trials to get tests like serotonin levels. But no filter seems to include the deep phenotyping study. I'll probably have to filter backwards by downloading them all and eliminating the ones that are tagged as reviews, commentaries, etc.

Edit 2: No, this doesn't seem as straightforward as I had hoped. A couple main issues:

1. I don't think this will work for interventions, or at least not for getting a nice binary result for a heatmap of interventions showing whether something improved or got worse, since there can be multiple outcomes per intervention. Maybe this would be okay just for observational studies.

2. It's expensive to give it full studies, and it's not very good at following instructions perfectly if the text is really long. It was about 5 cents for just the methods and results sections of a random study. If I'm doing 1000 studies, that's $50, more if there are significantly longer ones.

I may have burned out my brain for a while too. So I'll leave this alone for now, I think. I still think there might be something cool if I could get a bunch of data like this:

<test>SF-36 Physical Function score</test><result>increased</result>
<test>xanthine metabolism compounds in urine samples</test><result>increased</result>

And make a map where it groups similar items together (e.g. serotonin would be closer to dopamine than to symptom questionnaire) and makes items that are increased one color, like blue dots, and items that are decreased red dots. If you see a lot of blue dots clumped together in one spot, or red dots clumped together in one spot, you can zoom in and see that many somewhat similar tests have gotten the same result.
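
As a small sketch of the first step, here is one way to turn `<test>`/`<result>` lines like the two above into records with a dot color. The grey fallback for anything other than "increased" or "decreased" is my assumption, and the grouping of similar tests (for example with embeddings) is not attempted here.

```python
import re

PAIR_RE = re.compile(r"<test>(.*?)</test>\s*<result>(.*?)</result>", re.DOTALL)

# Colors as described in the post; "grey" for other results is an assumption.
COLORS = {"increased": "blue", "decreased": "red"}


def parse_findings(text: str) -> list[dict[str, str]]:
    """Extract (test, result) pairs and attach the dot color for the map."""
    findings = []
    for test, result in PAIR_RE.findall(text):
        direction = result.strip().lower()
        findings.append({
            "test": test.strip(),
            "result": direction,
            "color": COLORS.get(direction, "grey"),
        })
    return findings


sample = (
    "<test>SF-36 Physical Function score</test><result>increased</result>\n"
    "<test>xanthine metabolism compounds in urine samples</test><result>increased</result>\n"
)
for finding in parse_findings(sample):
    print(finding)
```<result>Decreased</result>")[0]["color"] == "red"
assert parse_findings("<test>x</test><result>unchanged</result>")[0]["color"] == "grey"
</test>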

I'm not even sure this would work as well as I hope. Anyway, maybe a project for the future or for someone else.
 
On cost: you don't actually have to use the provided APIs. When ChatGPT first came out I wrote a quick Python script to interact with it using browser instrumentation (Selenium/ChromeDriver with a few modifications) - much cheaper!

Also occurs to me that the hallucination risk might be reduced by using ensemble (e.g. consensus of multiple LLMs) or cross-verification (where one LLM evaluates the output of another LLM for correctness) methods.
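
A minimal sketch of the consensus idea, assuming each model returns one of YES, NO, or MAYBE for the same abstract. The rule used here (accept a label only if more than half the models agree, otherwise fall back to MAYBE) is one possible choice, not an established method.

```python
from collections import Counter


def consensus(decisions: list[str], min_agreement: float = 0.5) -> str:
    """Return the majority label, or MAYBE when no label has a clear majority."""
    if not decisions:
        return "MAYBE"
    label, count = Counter(decisions).most_common(1)[0]
    return label if count / len(decisions) > min_agreement else "MAYBE"


print(consensus(["YES", "YES", "NO"]))    # two of three models agree
print(consensus(["YES", "NO"]))           # split vote, flagged for review
```

Anything that comes back as MAYBE would then go to a manual check or to a stronger model.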

Lots of interesting ideas on this thread. I've no energy to take on a project like this but hope someone picks it up and runs with it.
 
It seems that the PubMed format has a field PT ("publication type") that indicates whether the article is a review: https://pubmed.ncbi.nlm.nih.gov/help/#pt (the list: https://pubmed.ncbi.nlm.nih.gov/help/#publication-types), so you don't have to use an LLM for that. It would also answer your question 1.
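
A rough sketch of filtering on the PT field, assuming the records were saved from PubMed in the "PubMed" text format (a four-character tag, then "- ", with continuation lines indented by six spaces). The sample records are made up with fake PMIDs, and the set of excluded publication types is only an illustrative choice.

```python
def parse_records(text: str) -> list[dict[str, list[str]]]:
    """Parse PubMed-format text into one dict of tag -> values per record."""
    records, current, last_tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():                      # blank line ends a record
            if current:
                records.append(current)
            current, last_tag = {}, None
        elif line[:4].strip() and line[4:6] == "- ":
            last_tag = line[:4].strip()
            current.setdefault(last_tag, []).append(line[6:].strip())
        elif last_tag and line.startswith("      "):
            current[last_tag][-1] += " " + line.strip()   # wrapped value
    if current:
        records.append(current)
    return records


# Illustrative choice of publication types to drop.
EXCLUDED_TYPES = {"Review", "Systematic Review", "Meta-Analysis",
                  "Editorial", "Comment", "Letter", "Published Erratum"}


def is_candidate(record: dict[str, list[str]]) -> bool:
    """Keep a record only if none of its PT values is an excluded type."""
    return not EXCLUDED_TYPES & set(record.get("PT", []))


SAMPLE = """
PMID- 00000001
TI  - A made-up narrative review title that wraps
      onto a second line
PT  - Journal Article
PT  - Review

PMID- 00000002
TI  - A made-up case-control study title
PT  - Journal Article
"""

kept = [r["PMID"][0] for r in parse_records(SAMPLE) if is_candidate(r)]
print(kept)
```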

If you really want to make something that works, you will have to label some data manually. This is absolutely necessary if you want to know how well your extraction system works. Incidentally, you could use the labelled data to train a classifier using one of the transformer models specifically trained on medical data (for example Med-BERT, Clinical BERT, etc.). The advantages of classifiers are that they don't rely on text generation, which is prone to hallucinations no matter what; their performance is quantifiable (as opposed to using an LLM with no manually labelled data); and finally, they're cheaper than querying an LLM.
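
To make the "quantifiable" point concrete, here is a minimal sketch of scoring any system's decisions (LLM or classifier) against a hand-labelled set. The example labels are made up purely to show the calculation.

```python
def precision_recall(predicted: list[str], gold: list[str],
                     positive: str = "YES") -> tuple[float, float]:
    """Precision and recall for one label, comparing predictions to manual labels."""
    pairs = list(zip(predicted, gold))
    tp = sum(p == positive and g == positive for p, g in pairs)
    fp = sum(p == positive and g != positive for p, g in pairs)
    fn = sum(p != positive and g == positive for p, g in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels for four abstracts.
gold = ["YES", "NO", "YES", "NO"]
predicted = ["YES", "YES", "NO", "NO"]

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For this task recall on YES is probably the number to watch, since a missed study never makes it into the later steps.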

As for measurements, you probably want to train a dedicated NER model, for the same reasons as above. Of course, you could always try an LLM and ask it to extract the relevant information into JSON format, as long as you have some manually labelled data in hand to evaluate the LLM outputs.
 
I decided I'm going to try to do a variation of this idea. The plan is to make a wiki (using the same software as Wikipedia/MEpedia) about the research of chronic illnesses (including ME/CFS, multiple sclerosis, depression, cancer, etc). It won't just be a normal encyclopedia format like MEpedia, though. Each page will be dedicated to a single research paper, and the goal would be to include as many papers as possible about these illnesses.

Then on top of that structure, I will try to incorporate a MediaWiki extension called Semantic MediaWiki. This allows turning a wiki into something more like a database for making queries.

What that means is that I can add "parameters" (there's a chance I'm mixing up the terminology they use, I'm just learning about the extension) to every page which include information about that study. For example, a page could have "Condition:ME/CFS" and "Criteria:IOM". Then it becomes simple to search for every study that used IOM criteria.

But the main goal is to have each study include parameters for findings. For example, "ventricular brain lactate increased" or "plasma lactate decreased". Then a search for "ME/CFS" and "lactate" would bring up both of these studies. A search for "plasma lactate" would only bring up the second. This way it would be trivial to look up all studies that tested lactate. Or all studies that tested a specific drug, and so on. A simple text search for "lactate", like you could do on MEpedia, would list all these studies, but would also bring up irrelevant pages - anything that mentioned that word.
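
The difference between a structured query and a plain text search can be shown with a tiny plain-Python stand-in. This is not Semantic MediaWiki syntax. The class and function names are hypothetical, "Study A" and "Study B" mirror the two lactate examples above, and "Study C" is an invented page that only mentions the word in its text.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    measure: str      # e.g. "plasma lactate"
    direction: str    # e.g. "increased" or "decreased"


@dataclass
class StudyPage:
    title: str
    condition: str
    findings: list[Finding] = field(default_factory=list)
    text: str = ""


def query(pages: list[StudyPage], condition: str, measure: str) -> list[str]:
    """Structured search: match on the condition and on recorded findings only."""
    measure = measure.lower()
    return [p.title for p in pages
            if p.condition == condition
            and any(measure in f.measure.lower() for f in p.findings)]


def text_search(pages: list[StudyPage], word: str) -> list[str]:
    """Plain text search: matches any page where the word appears anywhere."""
    word = word.lower()
    return [p.title for p in pages
            if word in p.text.lower()
            or any(word in f.measure.lower() for f in p.findings)]


pages = [
    StudyPage("Study A", "ME/CFS", [Finding("ventricular brain lactate", "increased")]),
    StudyPage("Study B", "ME/CFS", [Finding("plasma lactate", "decreased")]),
    StudyPage("Study C", "ME/CFS", text="Lactate is mentioned in passing."),
]

print(query(pages, "ME/CFS", "lactate"))
print(query(pages, "ME/CFS", "plasma lactate"))
print(text_search(pages, "lactate"))
```

The structured query skips Study C because lactate is not one of its recorded findings, which is the kind of noise a plain text search can't avoid.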

Also, with this tool, the data could be formatted in various other ways. As another example, one could do a search for "lactate" and view a timeline chart showing when every study on lactate was done. This would use a date parameter from the pages.

Other than the parameter information for using it as a database, the pages can also include text information, like regular wiki pages, relevant to the specific study. For example, there could be links to all online discussions about the specific paper (e.g. links to threads on S4ME, Phoenix Rising, Reddit, PubPeer), quoted interesting bits of the study, or "meta" information about the study (like links to pages about the institutions or researchers that performed the study).

No plans to incorporate AI: when I tested it before, it was too expensive and too inaccurate at pulling information out of long papers. It'll just be whoever wants to contribute, and hopefully over time it can grow to include a lot of papers.

The information would be entered in a structured form. Here is an example of a wiki, MitoPedia, that uses the extension I plan to use. This one actually also has pages for individual research papers. The topic of the wiki is "mitochondrial and chloroplast physiology".

Screenshot of MitoPedia data entry:
upload_2024-10-30_20-32-37.png
 