Can Large Language Models (LLMs) like ChatGPT be used to produce useful information?

That's probably because 90% of texts and papers on ME/CFS talk about weak data and all these possible connections that have no sound basis.

I wonder if you would get better results if you call it illnessX and describe it yourself with what are considered the most robust findings. You could also ask it to be very skeptical, to check if a connection makes sense and that you want to avoid making an error at all costs.
 
I don't know how accurate those two AI statements above are (1, 2), but they seem to me clear and concise, which is always a big plus, if for no other reason than that it makes it much easier to spot the limitations and flaws.

It demonstrates what I have long thought is the main potential benefit of AI (at least in its current form and level): efficiency gains. How long would it have taken a human, or even a team of humans, to produce that?

It also shows that the prompts AI is given are very important (e.g. ME/CFS v. ME). That is where the human skill component of the equation matters.
 
That's probably because 90% of texts and papers on ME/CFS talk about weak data and all these possible connections that have no sound basis.

I wonder if you would get better results if you call it illnessX and describe it yourself with what are considered the most robust findings. You could also ask it to be very skeptical, to check if a connection makes sense and that you want to avoid making an error at all costs.

Yes, I got much better results after insisting that it limit its answers to only the most robust findings, be highly skeptical, and avoid overinterpretation. This was in response to a request for a hypothesis about type 1 diabetes and some other randomly selected genes.

It gave me this:

Got it — let’s strip this right back to what can actually be said with high confidence from the literature and your GWAS list, without speculative functional chains.

Here’s a skeptical, minimal interpretation:

1. Which genes have established, reproducible links to immune function or autoimmunity?
  • ETV6 – Strong evidence from hematology: encodes a transcriptional repressor essential for hematopoiesis; germline and somatic variants cause altered blood cell development and immune phenotypes. No direct, replicated T1D link yet, but immune relevance is well-established.
  • THRB – Encodes thyroid hormone receptor β. Thyroid hormones affect immune system function, and autoimmune thyroid disease often co-occurs with T1D, but there is no direct mechanistic evidence tying THRB variants to β-cell autoimmunity.

2. Which genes have plausible cellular roles but little to no direct autoimmune/T1D evidence?
  • MTCL1 – Microtubule crosslinking factor; required for microtubule organization in polarized cells. Known in neuronal and epithelial contexts; β-cell or immune-specific roles not well validated in humans.
  • TBC1D19 – Predicted Rab GTPase-activating protein; general role in vesicle trafficking is plausible, but no confirmed function in immune cells or β cells in the context of autoimmunity.
  • STYXL1 – Catalytically inactive phosphatase; some evidence for involvement in stress responses in cell models, but no robust immune or diabetes link.
  • ERICD, ZBTB40-IT1 – Long noncoding RNAs; regulatory potential is possible, but functional and disease-relevant roles are unconfirmed.
  • OR1F12 – Olfactory receptor pseudogene or low-expression GPCR; no validated function outside olfactory tissue in the context of immunity or diabetes.

3. Mechanistic inferences we can make with caution
From this set, the only well-supported connection to an immune-mediated disease pathway is via ETV6 (hematopoietic regulation) and, more indirectly, THRB (thyroid autoimmunity co-morbidity patterns). The rest are largely uncharacterized in T1D or immune biology, and functional connections would need to be demonstrated from scratch.

4. Minimal working hypothesis
Certain loci in the GWAS may influence T1D risk through effects on immune cell development or function (ETV6), while others may reflect novel, as-yet-uncharacterized mechanisms. Most genes here lack direct experimental evidence in β-cell biology or immunity, so their relevance remains speculative.


This seems a lot more useful, but maybe still overoptimistic.
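
For anyone who wants to experiment with the same kind of constraint outside the chat window, here is a rough sketch of how the "be skeptical, robust findings only" instruction could be sent as a system prompt through the OpenAI Python client. The model name and the exact wording are just placeholders I'd try, not a recipe:

```python
# Rough sketch: the "be skeptical, robust findings only" instruction as a system prompt.
# Assumes the official openai Python package and an OPENAI_API_KEY environment variable;
# the model name and the prompt wording are illustrative only.
from openai import OpenAI

client = OpenAI()

SKEPTICAL_PROMPT = (
    "You are a highly skeptical reviewer. Limit your answer to findings that are "
    "robust and replicated, flag anything speculative as speculative, and prefer "
    "saying 'unknown' to overinterpreting weak data."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": SKEPTICAL_PROMPT},
        {
            "role": "user",
            "content": (
                "Given these GWAS hits for type 1 diabetes: ETV6, THRB, MTCL1, TBC1D19, "
                "STYXL1, ERICD, ZBTB40-IT1, OR1F12 - which links are well established "
                "and which are unproven?"
            ),
        },
    ],
)

print(response.choices[0].message.content)
```

The same idea works in the ordinary chat window; the only point of putting it in the system slot is that the skeptical framing then sits above the question and applies to every follow-up turn rather than having to be repeated.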
 
But here it makes sense for objective findings. But we need to be quite careful.
Yes, of course we need to be careful, just as we wouldn't blindly accept something a human medic or scientist said; I think most pwME are beyond that from our experience. But we could miss useful insights by being too cynical.

The training data comes from all sorts of sources, but it also directly includes what we've been writing about for years. I'd be shocked if this forum isn't in the training data; I know something like 27,000 words from my blog were in the GPT-3.5 model. It also knows about BPS papers. It's a couple of years since it has presented anything like that to me as if it were valid knowledge, but we should be aware that it's in there.

Anyway this is probably a different discussion thread.
 
I suppose you could say they’re a bit like seagulls. Good at regurgitation but you wouldn’t depend on them for reliable deduction.

Well-crafted prompts definitely help. Out of the box, their system prompts can give them a tendency to be sycophantic, although there are some recent examples of Gemini being more like Marvin the Paranoid Android.

I’ve found them incredibly useful tools for many things, but as @Yann04 says, not for reasoning, critically assessing information, or anything speculative. None of the LLMs seem capable of knowing when they’re wrong. And I do worry when I see how some (individuals or companies and governments) use them.
 
Derya Unutmaz, immunology professor at The Jackson Laboratory, said GPT-5 "has become top expert and true scientific collaborator, capable of true insight" after it "proposed the exact experiment [...] that had taken us weeks to design" and "suggested a mechanism that finally explained these results"

Full post on x:



He also did an interview in March that may be relevant:

Especially starting at (25:28) "AI's Role in Hypothesis Generation and Data Analysis"

I think getting use out of LLMs for science depends on two things:

1. How knowledgeable the user is about the field they're trying to advance
2. How good the user is at prompting

You have to handle the LLMs in the right way, notice when they're BSing, and adjust your prompt accordingly, or else you risk getting AI slop.

Some people forget that you can have an extended, naturalistic conversation. You can ask it followup questions, criticize its answers, and even explicitly tell it in plain English "stop using this thinking style, use this one instead" or "you keep making this category of mistake. Avoid that." and it will do its best to do what you want. You can experiment and do things like tell it to emulate a specific person. It's a mistake to treat it as a single "question-answer" machine.
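
To make that concrete, here is a rough sketch of what an extended exchange looks like through the API, assuming the openai Python package; the model name, the question and the corrective instruction are only examples. The conversation is just a message list you keep appending to, so "criticising its answer" is nothing more than another user turn:

```python
# Rough sketch of an extended conversation: corrections are ordinary follow-up turns.
# Assumes the openai Python package and OPENAI_API_KEY; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder

messages = [
    {"role": "user", "content": "Summarise the most robust findings on ME/CFS."},
]
first = client.chat.completions.create(model=MODEL, messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# Criticise the answer and steer the style in plain English, then ask again.
messages.append({
    "role": "user",
    "content": (
        "You keep presenting unreplicated single studies as established facts. "
        "Stop doing that: label every claim as replicated, preliminary or speculative, "
        "then answer again."
    ),
})
second = client.chat.completions.create(model=MODEL, messages=messages)
print(second.choices[0].message.content)
```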
 
I believe the reason LLMs are useless here is because they repeat the speculations of others with no ability to discern sense from nonsense, and mix in knowledge of how things are connected in the body as well, with little ability to figure out what's actually relevant. If among these speculations were ideas that generally worked well, one would expect the accumulation of evidence supporting them.

LLMs now have reasoning capabilities that we can definitely use. There are numerous cases where LLMs have found solutions to individuals' medical problems. I see no reason why LLMs shouldn't be able to put the pieces of the puzzle together for us (if that hasn't happened already, that is, and no one cared).
 
LLMs now have reasoning capabilities that we can definitely use. There are numerous cases where LLMs have found solutions to individuals' medical problems.

I don't see any reason to extrapolate from personal diagnosis, which is a simple pattern recognition exercise, to evaluating hypotheses that have to be based on a mass of incomplete information, guesstimates of probabilities and complicated dynamic modelling.

The sad reality is that the LLMs seem every time to come out with exactly the stock answer you would expect to get from a group of PhD students who had a course of lectures based on trendy dogma.
 
@Jonathan Edwards I guess we should investigate whether we have actionable results or not, whoever or whatever outputs these hypotheses is irrelevant. I hope we can agree on that.

Please have a look at what Derya Unutmaz (Professor, biomedical scientist, human immunologist, aging & cancer immunotherapy) thinks about the reasoning capabilities of AI. As I do not have the knowledge myself, a dialogue between experts holding opposite views would be quite enlightening for the rest of us.

https://x.com/DeryaTR_
 
@Jonathan Edwards I guess we should investigate whether we have actionable results or not, whoever or whatever outputs these hypotheses is irrelevant. I hope we can agree on that.

Please have a look at what Derya Unutmaz (Professor, biomedical scientist, human immunologist, aging & cancer immunotherapy) thinks about the reasoning capabilities of AI. As I do not have the knowledge myself, a dialogue between experts holding opposite views would be quite enlightening for the rest of us.

https://x.com/DeryaTR_
Link without having to sign in
 
I agree. I find the LLM to be quite the opposite of what one probably wants to be doing very generally: it predicts the most likely sequence of events, whereas one wants to predict the, in some sense, least likely sequence of events that can still be made sense of contextually. Arguably that should fit quite well into the realm of reinforcement-type machinery, but I think for such an approach you'd need some sort of proof verification, which still seems years outside the possible realm in medicine (arguably in mathematics and computer science one might be quite close already).
 
Regarding AI: what happens when the disease doesn’t follow the known rules? I don’t understand how an AI model would be able to find the solution in that case. Maybe others have more insight?
That was what my previous post was about. Arguably you could still get there if you allow for enough exploration, but you'd need some way to know whether the millions of things you end up with are sensible, and that currently seems to me well outside the scope of what is possible in medicine.
 
I guess we should investigate whether we have actionable results or not, whoever or whatever outputs these hypotheses is irrelevant.

Of course. I am simply pointing out that so far AI has failed to go beyond a rehash of whatever is prominent in media and reviews. And as we know, it is empty vessels that make the most noise.

I have seen some of Dr Unutmaz's comments on AI. They are mostly about what is promised, not what has been delivered, as far as I can see. My way of doing science may be messy and maverick, but it repeatedly got me results where others had got nowhere.
 
one wants to predict the in some sense least likely sequence of events that can still be contextually made sense of.

Yes, I think this is something like it. If putting together all we know about the usual rules leads to a putative mechanism, it is likely to be wrong, because if it followed the rules everyone would have it. One has to hold in the back of one's head a series of leaps of faith into the near impossible and look for a way that would make them stick on the rare occasions they operated.

You also have to be very clear about what you want your hypothesis to predict. My impression is that AI cannot avoid being primed with all the false assumptions that you find in textbook and review explanations of things. The biggest problem with ME/CFS is that people have thought hypotheses should predict all sorts of things that we don't actually have any reason to think occur.
 