Artificial intelligence in medicine and science

AI is set to be used in NHS hospitals for the first time
AI will be used in NHS hospitals for the first time after the health regulator approved its deployment in radiology departments.

Nine artificial intelligence platforms have been given the green light in draft guidance from the National Institute for Health and Care Excellence (Nice).

The AI software can relieve 'severe pressure' on radiographers by drawing outlines of healthy organs on scans of patients with lung, prostate or colorectal cancer.

At the moment, radiographers outline healthy organs on a CT or MRI scan by hand.

Nice found the AI platforms 'generally produce similar quality contours' to those drawn manually, with most needing only 'minor edits'.

Using the technology could be quicker, giving clinicians more time to spend with patients or focus on more complex cases.

It is the first piece of Nice guidance to recommend the use of AI to aid clinicians in their role.
Source: Daily Mail Online
 
Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study
https://www.jmir.org/2023/1/e48659

Methods
We entered all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared its accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management, based on patient age, gender, and case acuity. Accuracy was measured as the proportion of correct responses to the questions posed within the clinical vignettes, as judged by human scorers. We further conducted linear regression to assess the factors contributing to ChatGPT's performance on clinical tasks.

Results
ChatGPT achieved an overall accuracy of 71.7% (95% CI 69.3%-74.1%) across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis with an accuracy of 76.9% (95% CI 67.8%-86.1%) and the lowest performance in generating an initial differential diagnosis with an accuracy of 60.3% (95% CI 54.2%-66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β=–15.8%; P<.001) and clinical management (β=–7.4%; P=.02) question types.
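For anyone wanting to sanity-check numbers like these, here is a minimal sketch of how a proportion accuracy and its 95% CI are computed. The correct/total counts below are invented to land near the paper's 71.7% headline figure; they are not the study's actual data.

```python
# Minimal sketch: accuracy as a proportion of correct answers, with a 95%
# Wald confidence interval. The counts are illustrative, not the study's data.
import math

def accuracy_ci(correct: int, total: int, z: float = 1.96):
    """Return (accuracy, ci_low, ci_high) for a proportion of correct answers."""
    p = correct / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of a proportion
    return p, p - z * se, p + z * se

acc, lo, hi = accuracy_ci(correct=1025, total=1430)  # invented counts
print(f"accuracy {acc:.1%} (95% CI {lo:.1%}-{hi:.1%})")
# -> accuracy 71.7% (95% CI 69.3%-74.0%)
```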
 
A documentary not on AI in medicine but on AI in general; very disturbing.
iHuman
A Storyville documentary. Artificial intelligence now permeates every aspect of our lives, but only a handful of people have any control over its influence on our world.

With unique access to some of the most powerful pioneers of the AI revolution, iHuman asks whether we know the limits of what artificial intelligence is capable of and its true impact.

https://www.bbc.co.uk/iplayer/episode/m001q1zl/storyville-ihuman
 
A new medical large language model (LLM) was released today as open source. It claims to beat some non-specialist models like GPT-3.5, which itself still performs quite well. The open-source release makes it far easier for researchers to use the model and improve on it.

Clinical Camel: An Open Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding
https://arxiv.org/abs/2305.12031

We present Clinical Camel, an open large language model (LLM) explicitly tailored for clinical research. Fine-tuned from LLaMA-2 using QLoRA, Clinical Camel achieves state-of-the-art performance across medical benchmarks among openly available medical LLMs. Leveraging efficient single-GPU training, Clinical Camel surpasses GPT-3.5 in five-shot evaluations on all assessed benchmarks, including 64.3% on the USMLE Sample Exam (compared to 58.5% for GPT-3.5), 77.9% on PubMedQA (compared to 60.2%), 60.7% on MedQA (compared to 53.6%), and 54.2% on MedMCQA (compared to 51.0%). In addition to these benchmarks, Clinical Camel demonstrates its broader capabilities, such as synthesizing plausible clinical notes. This work introduces dialogue-based knowledge encoding, a novel method to synthesize conversational data from dense medical texts. While benchmark results are encouraging, extensive and rigorous human evaluation across diverse clinical scenarios is imperative to ascertain safety before implementation. By openly sharing Clinical Camel, we hope to foster transparent and collaborative research, working towards the safe integration of LLMs within the healthcare domain. Significant challenges concerning reliability, bias, and the potential for outdated knowledge persist. Nonetheless, the transparency provided by an open approach reinforces the scientific rigor essential for future clinical applications.
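For context, the QLoRA recipe the abstract mentions looks roughly like the sketch below (Hugging Face transformers + peft). The hyperparameters, target modules and checkpoint name are illustrative assumptions, not values from the paper.

```python
# Rough sketch of QLoRA fine-tuning from LLaMA-2, as described in the abstract.
# Hyperparameters, target modules and the base checkpoint are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb = BitsAndBytesConfig(              # 4-bit base weights: the "Q" in QLoRA
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf", quantization_config=bnb, device_map="auto"
)
base = prepare_model_for_kbit_training(base)

lora = LoraConfig(                     # small trainable adapters: the "LoRA"
    r=16, lora_alpha=32, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()     # only a tiny fraction of weights train
```

Only the adapter weights are updated while the base model stays frozen in 4-bit, which is what makes single-GPU training of a 70B model feasible.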

The model: https://huggingface.co/wanglab/ClinicalCamel-70B. It's open source, but viewing it requires a login on the platform.
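Trying the released checkpoint locally might look like the untested sketch below. The prompt is made up, you need to be logged in to Hugging Face first, and a 70B model needs serious GPU memory even in 4-bit.

```python
# Untested sketch: load the released checkpoint and generate a completion.
# Requires a Hugging Face login (the repo is gated) and ample GPU memory.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "wanglab/ClinicalCamel-70B"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

prompt = "List key differential diagnoses for acute chest pain."  # made-up prompt
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
print(tok.decode(out[0], skip_special_tokens=True))
```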
 
Artificial intelligence for breast cancer detection in screening mammography in Sweden: a prospective, population-based, paired-reader, non-inferiority study
https://www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00153-X/fulltext

Findings
From April 1, 2021, to June 9, 2022, 58 344 women aged 40–74 years underwent regular mammography screening, of whom 55 581 were included in the study. 269 (0·5%) women were diagnosed with screen-detected breast cancer based on an initial positive read: double reading by one radiologist plus AI was non-inferior for cancer detection compared with double reading by two radiologists (261 [0·5%] vs 250 [0·4%] detected cases; relative proportion 1·04 [95% CI 1·00–1·09]). Single reading by AI (246 [0·4%] vs 250 [0·4%] detected cases; relative proportion 0·98 [0·93–1·04]) and triple reading by two radiologists plus AI (269 [0·5%] vs 250 [0·4%] detected cases; relative proportion 1·08 [1·04–1·11]) were also non-inferior to double reading by two radiologists.

Interpretation
Replacing one radiologist with AI for independent reading of screening mammograms resulted in a 4% higher non-inferior cancer detection rate compared with radiologist double reading. Our study suggests that AI in the study setting has potential for controlled implementation, which would include risk management and real-world follow-up of performance.

TL;DR:
When #AI was added to radiologists for reading mammograms in >55,000 women
  • 4% more cancers were detected
  • 50% less radiologist reading time
  • 6-55% fewer false positives
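To make the non-inferiority logic above concrete, here is a toy restatement using the detection counts and 95% CIs quoted from the paper. The 0.85 margin is an illustrative assumption; the study defines its own margin.

```python
# Toy restatement of the non-inferiority comparison, using detection counts
# and 95% CIs quoted from the paper. The margin is an illustrative assumption.
REFERENCE_DETECTED = 250  # double reading by two radiologists

ARMS = {
    # strategy: (detected cases, CI low, CI high) vs the reference
    "one radiologist + AI":  (261, 1.00, 1.09),
    "AI alone":              (246, 0.93, 1.04),
    "two radiologists + AI": (269, 1.04, 1.11),
}
MARGIN = 0.85  # assumed non-inferiority margin (illustrative, not the study's)

for strategy, (detected, ci_low, ci_high) in ARMS.items():
    rp = detected / REFERENCE_DETECTED  # relative detection proportion
    verdict = "non-inferior" if ci_low >= MARGIN else "not shown non-inferior"
    print(f"{strategy}: RP {rp:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}) -> {verdict}")
```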
 
Also probably somewhere around 50% lower cost, which means more women would be able to afford screening, and there would be more slots available since radiologist time is reduced. The fewer false positives would also free up resources used in follow-ups, reducing strain on the system. The negative effects seem mainly to fall on the radiologist training industry and existing radiologists, but that issue will affect many parts of the economy, so it's a global issue.
 
Google DeepMind AI speeds up search for disease genes

Google's AI firm DeepMind has used artificial intelligence to identify changes in human DNA that might cause diseases.
The researchers believe they have pinpointed 89% of all the key mutations.
The development is expected to speed up diagnosis and help in the search for better treatments.
A leading independent scientist told BBC News that the work was "a big step forward".

Prof Ewan Birney, deputy director general of the European Molecular Biology Laboratory, said: "It will help clinical researchers prioritise where to look to find areas that could cause disease."

The technique works by checking the order of the components in human DNA strands.

All living organisms are built from DNA. It is made from four chemical building blocks called adenine (A), cytosine (C), guanine (G) and thymine (T). In humans, when an embryo is developing, the order of these letters is read to produce proteins, which are the building blocks of the cells and tissues that make up various parts of the body.

But if the letters are in the wrong order - perhaps because of an inherited disorder - the body's cells and tissues aren't made properly - and this can lead to disease.

Currently, genetic disease hunters have fairly limited knowledge of which areas of human DNA can lead to disease. They have classified only 0.1% of letter changes, or mutations, as either benign or disease-causing.

Google DeepMind's Pushmeet Kohli said that the new model pushed that percentage up to 89%.

Currently, researchers have to search for potentially disease-causing regions across billions of chemical building blocks that make up DNA. That has now changed, according to Mr Kohli.

"Researchers can now focus their efforts on the new areas that they were not aware of and that we have highlighted as potentially disease-causing," he said.

Last year Google DeepMind's AI worked out the shape of nearly all proteins in the human body.

The new system, called AlphaMissense, can tell if the letters in the DNA will produce the correct shape. If not, it is listed as potentially disease-causing.
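The underlying idea can be illustrated in a few lines: a single-letter (missense) change swaps one amino acid for another, and the model assigns that swap a pathogenicity score. The score and thresholds below are illustrative placeholders, not AlphaMissense's published calibration.

```python
# Toy illustration of the missense idea: one DNA letter changes, one amino
# acid changes, and a model scores how likely that change is to be harmful.
# The score and cutoffs are illustrative, not AlphaMissense's calibration.
CODON_TABLE = {"GAG": "Glu", "GTG": "Val"}  # tiny excerpt of the codon table

def classify(score: float) -> str:
    """Map a 0-1 pathogenicity score to a label (illustrative cutoffs)."""
    if score < 0.34:
        return "likely benign"
    if score > 0.56:
        return "likely pathogenic"
    return "ambiguous"

# A real-world example: the sickle-cell mutation in the beta-globin gene is a
# single-letter change, GAG -> GTG, swapping glutamate for valine.
ref, alt = "GAG", "GTG"
print(f"{CODON_TABLE[ref]} -> {CODON_TABLE[alt]}: {classify(0.97)}")  # score invented
```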


https://www.bbc.co.uk/news/science-environment-66847977
 
The AI findings don't "feel right" for ME's cause. Another study I heard about today was a re-analysis of existing studies which revealed that peroxisomes were abnormal in ME. To me that seems a much better fit for explaining ME.
 
The AI findings don't "feel right" for ME's cause. Another study I heard about today was a re-analysis of existing studies which revealed that peroxisomes were abnormal in ME. To me that seems a much better fit for explaining ME.
Sorry, I do not understand what "feel right" stands for and how "feelings" can help us solve a complex multisystemic syndrome such as ME/CFS.

As it appears, peroxisome dysfunction, bile acid metabolism and phospholipid metabolism disruption, endoplasmic reticulum stress, and potential liver involvement are all concepts that were identified using AI methods years before conventional research got to them. And this was the point of this post.
 
I agree. Feelings have nothing to do with biomedical science.
Okay, bad wording. My brain weighs the evidence for and against the claims based on whatever is stored in my neurons from past reading, and arrives at a positive or negative judgement. Intuition is a valid method of judging things when there's inadequate data for stronger forms of judgement. If you have 500 hypotheses with apparently equal evidence, choosing the one that "feels right" is likely to be more successful than choosing one at random. There might be some factor we're not aware of that our neurons are factoring in.
 
I think intuition is only as good as the knowledge and information on which that intuition is based. It may work when deciding which colour to wear, but it's a poor substitute for sound scientific understanding, data and evidence for deciding what's going on inside our cells.
 
but it's a poor substitute for sound scientific understanding, data and evidence for deciding what's going on inside our cells.
Sure, but there is no sound scientific understanding, data or evidence to base decisions on yet. Intuition may not be reliable, but if the alternative is pure random choice, intuition at least has some chance of being better. The brain might factor in some long-forgotten bit of textbook knowledge, or some insight based on something that the conscious mind considers irrelevant.

I'm not going to give all hypotheses ("ME is caused by 5G!!!") equal weight. In the absence of sound scientific understanding, data or evidence, I'll weigh them on my existing knowledge, personal observations, and maybe the opinions and comments by people who also lack sound scientific understanding, data or evidence, but who at least have expertise in the subject.
 
Scientists excited by AI tool that grades severity of rare cancer

Artificial intelligence is nearly twice as good at grading the aggressiveness of a rare form of cancer from scans as the current method, a study suggests.

By recognising details invisible to the naked eye, AI was 82% accurate, compared with 44% for lab analysis.

Researchers from the Royal Marsden Hospital and Institute of Cancer Research say it could improve treatment and benefit thousands every year.

They are also excited by its potential for spotting other cancers early.

AI is already showing huge promise for diagnosing breast cancers and reducing treatment times.

Computers can be fed huge amounts of information and trained to identify the patterns in it to make predictions, solve problems and even learn from their own mistakes.


https://www.bbc.co.uk/news/health-67264350
 
ChatGPT generates fake data set to support scientific hypothesis
https://www.nature.com/articles/d41586-023-03635-w


Oh, you don't even need AI to do that. And the problem with fake clinical trial conclusions is not fake data, nor even fake analysis of real data; it's the human need for desired conclusions to be true even when the data contradict them, and a culture that encourages everything up to and including outright fraud.
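Indeed, a few lines of ordinary code are enough to manufacture a "significant" result. A toy illustration, where every number is fabricated:

```python
# Toy illustration: fabricating a dataset that "shows" treatment A beating
# treatment B. Every number here is invented; that is the point.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treatment_a = rng.normal(loc=0.80, scale=0.10, size=150)  # fabricated outcomes
treatment_b = rng.normal(loc=0.70, scale=0.10, size=150)  # fabricated, made worse

t, p = stats.ttest_ind(treatment_a, treatment_b)
print(f"t = {t:.2f}, p = {p:.2e}")  # a highly 'significant' invented effect
```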

Elisabeth Bik, mentioned in the article, has received threats over her work debunking fraudulent research. The culture of academia is as hostile to being policed as police forces are; it's actually disturbing.

In a paper published in JAMA Ophthalmology on 9 November, the authors used GPT-4 — the latest version of the large language model on which ChatGPT runs — paired with Advanced Data Analysis (ADA), a model that incorporates the programming language Python and can perform statistical analysis and create data visualizations. The AI-generated data compared the outcomes of two surgical procedures and indicated — wrongly — that one treatment is better than the other.
The solution here is not less AI; it's to have an actual quality-control process for published research, which the current combination of a for-profit publishing industry and a quantity-over-quality culture makes impossible. This is not a technological problem, it's a human one, including the near indifference to researchers doing what most other researchers do, even when it's wrong, and the resulting lack of consequences.

And then of course you have the whole edifice of so-called evidence-based medicine, where supporting data don't even matter at all and entire fantasy conclusions are commonly forced onto the world despite being massive failures. Again, a cultural problem.
 