Artificial intelligence in medicine and science

Influence of believed AI involvement on the perception of digital medical advice (2024)
Reis, Moritz; Reis, Florian; Kunde, Wilfried

Large language models offer novel opportunities to seek digital medical advice. While previous research primarily addressed the performance of such artificial intelligence (AI)-based tools, public perception of these advancements received little attention.

In two preregistered studies (n = 2,280), we presented participants with scenarios of patients obtaining medical advice. All participants received identical information, but we manipulated the putative source of this advice (‘AI’, ‘human physician’, ‘human + AI’). ‘AI’- and ‘human + AI’-labeled advice was evaluated as significantly less reliable and less empathetic compared with ‘human’-labeled advice. Moreover, participants indicated lower willingness to follow the advice when AI was believed to be involved in advice generation.

Our findings point toward an anti-AI bias when receiving digital medical advice, even when AI is supposedly supervised by physicians. Given the tremendous potential of AI for medicine, elucidating ways to counteract this bias should be an important objective of future research.

Link | PDF (Nature Medicine) [Open Access]
 
These results truly don't reflect my experience; the last thing I'd say is that human doctors showed more empathy than the simulated empathy of large language models. I also don't think they reflect the experience shared on patient groups by most ME/POTS/Lyme/MCAS etc. sufferers. The problem with this study, I believe, is that the participating doctors were aware they were part of a study and were being monitored, so their behavior doesn't reflect the typical behavior of a medical practitioner working with the absolute lack of accountability of everyday practice. I'm also pretty sure that neglected diseases weren't included as possible diagnoses.
 
Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said - Associated Press

Tech behemoth OpenAI has touted its artificial intelligence-powered transcription tool Whisper as having near “human level robustness and accuracy.”

But Whisper has a major flaw: It is prone to making up chunks of text or even entire sentences, according to interviews with more than a dozen software engineers, developers and academic researchers. Those experts said some of the invented text — known in the industry as hallucinations — can include racial commentary, violent rhetoric and even imagined medical treatments.

Experts said that such fabrications are problematic because Whisper is being used in a slew of industries worldwide to translate and transcribe interviews, generate text in popular consumer technologies and create subtitles for videos.

More concerning, they said, is a rush by medical centers to utilize Whisper-based tools to transcribe patients’ consultations with doctors, despite OpenAI’s warnings that the tool should not be used in “high-risk domains.”

https://apnews.com/article/ai-artif...lth-business-90020cdf5fa16c79ca2e5b6c4c9bbb14
 
:cry: it truly learned from the professionals. It's so beautiful.

Joking aside, yikes on going ahead with it before it's ready. It will be ready soon. Jumping the gun here is a good way to turn the work culture against something that will soon be superior in all cases. Especially if they explicitly say not to use it in such cases.

Health care systems have been essentially hostile towards telemedicine and any means to provide better access. In most cases they don't even have patient portals with ticketing and case management, even though all of that is mature and tried-and-tested in many other industries. But they go right ahead with technology that hardly anyone else uses yet. Very odd people making bizarre choices.
 
Health care systems have been essentially hostile towards telemedicine and any means to provide better access.
I need a retinal exam, and have to travel 300 km for that (and 300 back). I read an article about how teleophthalmology is working so well in my town. It wasn't offered to me, and they say their camera isn't good enough, and referral time is really long. So, despite the glowing reviews, it's really not available. Even with a lesser quality camera, it should be good enough for an expert to judge whether a more detailed exam is necessary. The government would probably save money by setting up some regional adequate-quality cameras, rather than paying for specialists' offices and staff. Yes I'm annoyed about it.
 
As this is the last day of the year, I wanted to share the yearly update for 2024 regarding a software framework I have been using to identify biological targets of high predictive value. The framework uses machine learning, natural language processing and network analysis algorithms to identify hotspots of research related to ME/CFS. Disclaimer: I own a patent for its methodology.

To date, no patient organisation has decided to fund it and no researchers have decided to use it. I sent the first email in 2015 to a number of researchers, mentioning the liver, endoplasmic reticulum stress and bile acid metabolism disruption (hint: bile acids are required for lipid absorption), all of which have been identified years later. If @Chris Ponting 's paper is accepted, then "Liver Disease" will be yet another row in the table I am sharing below.
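
For anyone curious what "network analysis to identify hotspots" can look like in practice, here is a minimal, purely illustrative sketch in plain Python; it is emphatically not the patented framework, and the abstracts, term list and scoring below are all made up. The idea is simply to count how often terms of interest co-occur in the literature and rank them by that.

```python
from collections import Counter
from itertools import combinations

# Hypothetical abstract snippets (placeholders, not real citations).
abstracts = [
    "endoplasmic reticulum stress and bile acid metabolism in liver disease",
    "bile acid signalling, lipid absorption and the liver in ME/CFS",
    "endoplasmic reticulum stress markers in ME/CFS patients",
]

# Illustrative list of terms of interest (a real system would extract these
# with NLP-based entity recognition, not a hand-written list).
terms = ["endoplasmic reticulum stress", "bile acid", "liver",
         "lipid absorption", "ME/CFS"]

# Build a term co-occurrence network: an edge is counted whenever two terms
# appear in the same abstract.
edges = Counter()
for text in abstracts:
    present = [t for t in terms if t in text]
    for a, b in combinations(sorted(present), 2):
        edges[(a, b)] += 1

# Rank terms by weighted degree (sum of co-occurrence counts) as a crude
# "hotspot" score.
score = Counter()
for (a, b), weight in edges.items():
    score[a] += weight
    score[b] += weight

for term, s in score.most_common():
    print(f"{term}: {s}")
```

In a real pipeline the extraction, weighting and ranking steps would of course be far more sophisticated; this only illustrates the general shape of the approach.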

Some questions

1) Is this "cherry picking" or not? The people to ask -and I am grateful for your scepticism- would be @Simon M, @Murph, @forestglip, @Yann04 and, of course, @ME/CFS Skeptic. How can we evaluate whether this system has indeed been outperforming human researchers?

2) Why has no one decided to evaluate and subsequently use the system, despite the fact that targets flagged repeatedly -ever since 2015- have later been identified?


The table looks as follows. It shows that the framework has been able to identify targets earlier by a median of 6 years. If Liver Disease is added, the median value of 6 years is not affected. Of course, I can provide references for all the information shown below.



Wishing you all a Happy New Year:

[Table screenshot: Screenshot 2024-12-31 at 13.40.10.png]
 
Is this "cherry picking" or not ? The people to ask -and I am grateful for your scepticism- would be @Simon M , @Murph @forestglip @Yann04 and -of course- @ME/CFS Skeptic .
I'm not quite sure how your system works, but regarding cherry picking it depends on how many findings it predicts, for example whether there are predictions that were not confirmed or were perhaps contradicted by subsequent research. I suppose you would have to publish all the predictions beforehand somewhere much like a pre-registration to make it verifiable.

I also think that many findings highlighted in ME/CFS research are false positives, either from random sampling error or from selection bias in the study.
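
To put a rough number on that point, here is a back-of-the-envelope calculation with made-up inputs (none of these figures come from any ME/CFS study): even before selection bias, a low prior probability of true hypotheses combined with modest power means a large share of "significant" findings will be false positives.

```python
# Hypothetical numbers, purely for illustration.
prior_true = 0.10   # fraction of tested hypotheses that are actually true
power = 0.50        # chance a true effect reaches significance
alpha = 0.05        # chance a null effect reaches significance anyway

true_positives = prior_true * power
false_positives = (1 - prior_true) * alpha
ppv = true_positives / (true_positives + false_positives)

print(f"Share of 'significant' findings that are real: {ppv:.0%}")
# -> roughly 53%, i.e. almost half of the highlighted findings would be
#    false positives, before even considering selection bias.
```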
 
I suppose you would have to publish all the predictions beforehand somewhere much like a pre-registration to make it verifiable.

I was also going to say this.

And also, now that the predictions have "come true", maybe present the data in a format that includes all the details: how many targets were predicted in total, the details of how your system found each one, and the details of the studies that confirmed it, instead of a simple list. Maybe write something up in a research paper format.
 
Study Warns of Risks from Medical Misinformation in Large Language Models

A study published in Nature is drawing attention to the risks of large language models (LLMs) accidentally spreading medical misinformation. Researchers found that even small amounts of false information in training datasets could lead to harmful outputs that are nearly impossible to distinguish from accurate ones during standard testing.

To address this issue, the research team proposed using biomedical knowledge graphs to verify and flag problematic outputs, emphasizing the need for transparency and oversight in LLM development—particularly for healthcare applications.
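
The paper should be consulted for the actual method; purely as a sketch of the general idea, and with entirely hypothetical data, the screening could look like checking each relation claimed in a model's output against a set of curated triples and flagging anything unsupported:

```python
# Minimal sketch of knowledge-graph screening of LLM output (hypothetical data).
# A curated graph of accepted biomedical relations, stored as triples.
knowledge_graph = {
    ("metformin", "treats", "type 2 diabetes"),
    ("aspirin", "increases_risk_of", "gastrointestinal bleeding"),
}

# Relations extracted from a model's answer (in practice this extraction step
# would itself be an NLP component; here they are hard-coded for illustration).
claimed_relations = [
    ("metformin", "treats", "type 2 diabetes"),
    ("vitamin C", "cures", "sepsis"),   # not supported by the graph
]

for triple in claimed_relations:
    status = "supported" if triple in knowledge_graph else "FLAG: not in knowledge graph"
    print(triple, "->", status)
```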

The Problem with LLM Training Data
LLMs, like GPT-4 and LLaMA, are trained on massive datasets sourced from the open Internet, where information quality varies widely. While automated filters can catch overtly offensive content, more subtle misinformation often slips through, especially when it appears credible. This makes these models vulnerable to "data poisoning," a tactic where bad actors intentionally introduce false information into training data.
 
trained on massive datasets sourced from the open Internet, where information quality varies widely
This makes these models vulnerable to "data poisoning," a tactic where bad actors intentionally introduce false information into training data.
If only this problem were unique to datasets from the open Internet...

Because the psychobehavioral garbage that is so beloved in medicine is just as much misinformation as the junk out in the fringes of the conspiracy crowds, and just as detached from reality.
 
Could forward this to your favorite researcher.

Announcing Trusted Tester access to the AI co-scientist system

We are excited by the early promise of the AI co-scientist system and believe it is important to evaluate its strengths and limitations in science and biomedicine more broadly. To facilitate this responsibly we will be enabling access to the system for research organizations through a Trusted Tester Program. We encourage interested research organizations around the world to consider joining this program here.
 
This is an important moment, I believe. This is something I have been working on, and it involves the use of Ensemble Learning for Large Language Models (LLMs), which we can call "EnsembleLLMs".

What does this mean? Essentially, a single prompt is passed to many LLMs -for example, to find causal factors for ME/CFS- and each LLM then responds according to its own reasoning process.

The next step involves finding where the given answers agree and disagree; novelty detection is then applied, which can find interesting details that are relevant to the given question (= prompt). In the following snapshot, Grok 3 mentioned something that no other LLM mentioned regarding 5 alpha reductase activity. This is considered novelty:

[Screenshot: grok35ar.jpeg]


In the next step, each LLM is asked to perform a reality check on the answers given by the other LLMs. In the example shown above, all the other LLMs will be asked whether the metabolites mentioned are indeed 5 alpha reductase metabolites.
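
As a concrete, heavily simplified sketch of the pipeline described above (the answers are hard-coded placeholders rather than real API calls, and the "concept extraction" is deliberately naive), the fan-out and novelty-detection steps could look something like this:

```python
from collections import Counter

# Placeholder answers standing in for real LLM API calls (hypothetical content).
answers = {
    "LLM_A": {"cortisol", "5 alpha reductase", "bile acids"},
    "LLM_B": {"cortisol", "bile acids"},
    "LLM_C": {"cortisol", "oxidative stress"},
    "LLM_D": {"cortisol", "bile acids", "oxidative stress"},
}

# Count how many models mention each concept.
mention_counts = Counter(c for concepts in answers.values() for c in concepts)

consensus = [c for c, n in mention_counts.items() if n == len(answers)]
novelty = [c for c, n in mention_counts.items() if n == 1]

print("Consensus concepts:", consensus)   # mentioned by every model
print("Novelty candidates:", novelty)     # mentioned by exactly one model
# The novelty candidate (here '5 alpha reductase') is what gets sent back to
# the other models for a reality check in the next step.
```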

Needless to say, I strongly believe that very soon we will have the answers we have been waiting on for decades.
 
In the next step, each LLM is asked to perform a reality check on the answers given by the other LLMs. In the example shown above, all the other LLMs will be asked whether the metabolites mentioned are indeed 5 alpha reductase metabolites.
How do you check the validity of the origins of the findings? How can we know that it isn’t just making up relationships that don’t exist?
 
How do you check the validity of the origins of the findings? How can we know that it isn’t just making up relationships that don’t exist?

Good question. Let's assume that four LLMs, named A, B, C and D, are given the same prompt. It basically has to do with the rate of agreement: B, C and D will evaluate the facts given by A (e.g. that metabolite 1 and metabolite 2 are both 5 alpha reductase metabolites). If all three LLMs agree that this is indeed true, then we have 100% agreement. We can then filter responses using an agreement-rate cutoff (all concepts discussed with less than 100% agreement should be evaluated by humans). I am not suggesting that humans need not evaluate responses with a 100% agreement rate... for the time being, that is.
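
A minimal sketch of that filtering step, with hypothetical verdicts standing in for real model calls:

```python
# Claim made by LLM A, and the verdicts of the other models when asked to verify
# it (True = "yes, these are indeed 5 alpha reductase metabolites"). All hypothetical.
claim = "metabolite 1 and metabolite 2 are 5 alpha reductase metabolites"
verdicts = {"LLM_B": True, "LLM_C": True, "LLM_D": False}

agreement_rate = sum(verdicts.values()) / len(verdicts)

AGREEMENT_CUTOFF = 1.0  # anything below 100% goes to a human reviewer
if agreement_rate < AGREEMENT_CUTOFF:
    print(f"'{claim}': {agreement_rate:.0%} agreement -> route to human evaluation")
else:
    print(f"'{claim}': unanimous agreement -> keep (still worth human review for now)")
```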

These models will get better and better. No turning back.
 
@mariovitali thank you for the example.

I understand the point about comparing responses and responses to responses in order to do some kind of quality assurance. But I’ve got two questions:
  1. How does inter-LLM agreement signify truth? How can it ever signify truth? If my understanding is correct, LLMs simply give you the answer they believe 'sounds' right; they have no way of assessing whether it is right.
  2. How do we know that what it's talking about is relevant? Saying that X=Y does not mean that either has to be relevant for Z. Where did X and Y even come from?
 