Review Low-Dose Naltrexone: What is the Evidence? A Narrative Review, 2026, Gouda

Dolphin

Senior Member (Voting Rights)

Low-Dose Naltrexone: What is the Evidence? A Narrative Review


Advances in Therapy

Abstract

Naltrexone is prescribed off-label at low doses, typically 0.5–6.0 mg, for a variety of therapeutic indications. This review evaluates the clinical evidence for low-dose naltrexone (LDN).

A literature search was conducted in February 2026 across PubMed, Embase and CINAHL for studies published from 1989 to 2026.

Title and abstract searches for “low dose naltrexone” identified peer-reviewed English-language studies using doses of ≤ 12.5 mg in humans.

A total of 105 studies were reviewed, including 15 randomised controlled trials (RCTs) in chronic pain, autoimmune and neuroimmune disorders, gastrointestinal disease, dermatological conditions, post-infectious syndromes, mental health and oncology.

Across these fields, early positive findings from uncontrolled studies were rarely replicated in placebo-controlled trials. Most available evidence consists of case reports and small feasibility studies that are prone to publication bias and rely heavily on subjective outcomes.

LDN is generally safe, inexpensive and well tolerated, with most studies using a daily dose of 4.5 mg.

Although these features contribute to its appeal, current evidence does not support routine clinical use.

LDN may have a pragmatic role in treatment-resistant cases where standard therapies have failed, provided its experimental status and uncertain efficacy are clearly explained.

Larger, well-designed RCTs with objective endpoints, along with N-of-1 approaches to identify potential responders, are needed to clarify its true clinical value.
 
This looks to be a very useful contribution, not least in the collation of relevant studies.

It seems that the evidence base does not support the use of LDN in ME/CFS and similar conditions. It's the usual situation of promising results in small open-label trials not being borne out when larger blinded trials are done.


Of the 105 studies reviewed, 15 were randomised controlled trials (RCTs) spanning seven therapeutic areas. The remaining evidence consisted primarily of case reports, small case series and observational or retrospective studies. Once-daily dosing was most common, and a daily dose of 4.5 mg was the most frequently reported, often achieved through gradual titration.

The strongest controlled evidence was found in chronic pain, although early positive findings have not been reproduced in larger, more recent trials. Small RCTs in autoimmune and gastrointestinal disorders show possible benefit, although findings remain inconsistent. All other therapeutic areas, including neuroimmune, dermatological, post-infectious, mental health, oncological and other investigational uses, are supported predominantly by uncontrolled or anecdotal evidence. The few placebo-controlled trials conducted in these areas have not demonstrated clear clinical benefit.

Across all indications, LDN appears well tolerated, with adverse effects that are mild, transient and most often limited to insomnia, vivid dreams, nausea and dizziness. As studies vary widely in quality and design, findings are presented descriptively by indication, with evaluation reserved for the discussion.

Trials for fibromyalgia, Gulf War illness, post-COVID conditions and CFS are discussed.


In a retrospective analysis of a Veterans Affairs post-COVID clinic cohort, records for patients taking LDN (1.5–4.5 mg/day) were compared with patients receiving physical therapy (n=50). LDN was linked to an approximate five-fold higher likelihood of documented improvement in at least one symptom (fatigue, pain, brain fog or dyspnoea) [67].
I thought that Veterans Affairs study was interesting. People with post-COVID conditions were 5 times more likely to have a documented improvement in at least one symptom when taking LDN, which is probably ineffective, than when receiving physical therapy. We'd have to look at the details of that study, but, assuming some large blinded studies of LDN in Long Covid find a null result, then it probably provides some useful evidence that physical therapy is not helpful.
 
Discussion
The evidence base for LDN is broad but lacks methodological strength. Of the 105 studies reviewed, only 15 were RCTs, which spanned seven conditions and recruited between 10 and 99 participants. The remaining literature consists predominantly of case reports, cohort studies and retrospective surveys. These lower-quality designs frequently report favourable outcomes, while negative or inconclusive findings are rarely published, leaving the field highly vulnerable to publication bias.

Consequently, perceptions of LDN’s efficacy are likely inflated by selective reporting, with early enthusiasm often contradicted by more rigorous study designs [122]. At present, no indication has sufficient high-quality evidence to support routine clinical use of LDN.
"no [condition] has sufficient high-quality evidence to support routine clinical use of LDN"
Much the same could be said about pretty much every therapy promoted for ME/CFS and similar conditions. This is what we need clinicians who prescribe unevidenced therapies to people with ME/CFS to understand.

The clinicians might think 'it's clearly safe, what harm can it do?'. Actually, lots. A willingness to prescribe unevidenced treatments outside of well-conducted trials condemns patients to years of taking a range of almost certainly ineffective therapies, hoping that they will eventually find the one that 'works for them'. When clinicians and patient charities suggest woefully inadequate trial results are useful evidence, when they don't differentiate between open label trials and blinded trials, it has a negative effect on patients becoming scientifically literate consumers of health services. It makes them less likely to be able to properly evaluate the many therapies that will be suggested to them, some of which are not at all safe.
 
Safety
LDN is generally well tolerated, with reported adverse effects typically mild and transient, including vivid dreams, insomnia, nausea, dizziness and mild gastrointestinal upset [17,18, 22, 67, 68]. Its favourable safety profile likely contributes to widespread off-label use, particularly among patients who have not responded to or cannot tolerate standard therapies.
Studies specifically examining the safety of LDN are sparse, and the literature does not currently address its use in high-risk groups. This is an area worth further investigation, given the concerns reported with conventional naltrexone doses (50 mg) in pregnancy and people with hepatic and renal impairment, and the absence of evidence on whether these risks extend to low doses [1].
I thought this was a good point. Yes, LDN is probably safe for most people, but the evidence base is lacking when it comes to pregnancy and people who have liver or kidney problems.
 
The most reliable data therefore suggest that, if LDN has therapeutic effects, they are likely modest, inconsistent and patient specific.
There is more discussion about why the anecdotes and lower quality trial results don't hold when good trials are done.

Several factors likely contribute to the gap between anecdotal reports and controlled findings.
They also rely heavily on subjective patient-reported outcomes, which are particularly vulnerable to placebo effects in conditions with fluctuating symptoms or strong patient expectations.
It's fantastic to see published literature mentioning this issue, and applying it to their analysis of the evidence-base. (Unfortunately, lots of people think the placebo effect is some useful magical effect, where the body fixes itself through belief alone, so that reason is not as convincing as it should be.)

Other problems mentioned are negative results not being published, small sample sizes, short durations, poor participant characterisation, and co-administration with other medications.


Although case reports are susceptible to selective reporting, they do document genuine individual responses. These accounts indicate that LDN may offer meaningful benefit to some individuals, even when this is not consistently replicated in RCTs. This raises the possibility that LDN functions as an individualised therapy with variable and unpredictable responses rather than a uniformly effective treatment. Despite being considered the highest-quality study design, conventional parallel-group RCTs may therefore be poorly suited to detecting benefit in small responder subgroups.
In this context, N-of-1 RCTs offer a promising alternative, preserving randomisation and blinding, while directly quantifying within-patient treatment effects [123, 124]. Aggregation of data from a series of N-of-1 trials is possible with appropriate statistical techniques. Such designs are particularly suited to identifying which individual patients may benefit from LDN and therefore warrant consideration in future research to determine genuine therapeutic value beyond population-level averages.
Damn, the paper was going extremely well, until they got to this bit. "Although case reports are susceptible to selective reporting, they do document genuine individual responses."

No!! Case reports without relevant biomarker data and sound knowledge of the therapeutic mechanism are, at best, evidence of a 'genuine improvement', not a 'genuine response'. The authors of this review have already acknowledged the problems with conditions with fluctuating symptoms, short duration evaluation periods and the impact of patient expectations that make case reports unreliable. But, crucially, a case report with only subjective outcomes is not enough to know that a reported improvement is caused by the treatment.

The authors call for N-of-1 RCTs, where an individual is given the treatment or a placebo without them knowing which. And, they suggest that these N-of-1 RCTs can be combined to give some useful data. But, surely, the aggregation of N-of-1 RCTs is pretty much a normal RCT?

I think what they should have been calling for here, is detailed characterisation of participants and sufficient sample sizes, so that post-trial stratification studies can be done to identify possible responder groups (that can be investigated in later studies).
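To make the aggregation point concrete, here is a minimal sketch with entirely made-up numbers: each patient's blinded drug/placebo cycles are collapsed into a within-patient effect, and the per-patient effects are then pooled, which is essentially what the analysis of a small parallel-group RCT does anyway (a mixed-effects model would be the more rigorous choice):

```python
import math
import statistics

# Hypothetical: each patient completes 3 blinded drug and 3 placebo
# cycles; scores are symptom ratings (lower = better).
patients = {
    "P1": {"drug": [4, 5, 4], "placebo": [6, 6, 5]},
    "P2": {"drug": [7, 6, 7], "placebo": [7, 7, 6]},
    "P3": {"drug": [5, 5, 6], "placebo": [5, 6, 6]},
}

# Within-patient effect: mean placebo score minus mean drug score,
# so a positive value means the patient scored better on the drug.
effects = []
for pid, data in patients.items():
    effect = statistics.mean(data["placebo"]) - statistics.mean(data["drug"])
    effects.append(effect)
    print(f"{pid}: within-patient effect = {effect:.2f}")

# Simple aggregation: a one-sample t statistic on the per-patient
# effects, i.e. the same shape of calculation as a normal small RCT.
n = len(effects)
mean_eff = statistics.mean(effects)
se = statistics.stdev(effects) / math.sqrt(n)
t = mean_eff / se
print(f"pooled effect = {mean_eff:.2f}, t = {t:.2f} (df = {n - 1})")
```

With only a handful of patients the pooled t statistic is nowhere near significance even when one patient looks like a clear responder, which is the commenter's point: aggregated N-of-1 data runs into the same power problems as any small trial.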

Future research should focus on adequately powered, placebo-controlled trials with objective endpoints in the most promising indications, alongside N-of-1 methodologies to identify which patients derive benefit and refine treatment selection.
Perhaps I'm wrong, but I think the authors sort of lost the plot at the end here with their support for N=1 evidence. Perhaps if you have the patient cycling on and off a treatment multiple times, you could get to some certainty about whether something works for an individual, but even a couple of cycles of treatment and placebo won't produce proof that a treatment works, certainly not with the marginal benefits that most of these therapies are reported to produce.

It's like some clinician (a reviewer or supervisor?) came along at the end and said 'but, we can't take away people's hope. These therapies don't work for everyone, but I'm convinced they do work for some people', and got the author team to water down their conclusions.

Until such evidence emerges, LDN should be considered a low-risk experimental therapy with insufficiently validated clinical benefit. Nonetheless, the use of LDN may be a genuinely reasonable option in treatment-resistant cases.
 
Here are the affiliations of the authors, from the paper:

A. H. K. Gouda (*) · K. J. Steadman
School of Pharmacy and Pharmaceutical Sciences,
The University of Queensland, 20 Cornwall Street,
Woolloongabba, QLD 4102, Australia
e-mail: a.gouda@uq.edu.au

N. E. C. Aitcheson
Metro South Pain Rehabilitation Centre, Metro
South Hospital and Health Service, Woolloongabba,
QLD 4102, Australia


I'm going to be disappointed if we see LDN turn up in the forthcoming Australian ME/CFS Clinical Guidelines justified only on the current evidence.
 
current evidence does not support routine clinical use.
I can agree with that. It was a miraculous treatment for my ME aches, but that's not the standard response for PWME, so I wouldn't recommend it be prescribed to everyone. However, I'm very glad that there weren't any official regulations preventing me from trying it.

Should there be a category of "This treatment has been effective for a small number of people with these specific symptoms--and useless or harmful for the majority of people who tried it--so you can approve a trial prescription if the patient requests it and lacks the markers for likelihood of harm."?
 
Should there be a category of "This treatment has been effective for a small number of people with these specific symptoms--and useless or harmful for the majority of people who tried it--so you can approve a trial prescription if the patient requests it and lacks the markers for likelihood of harm."?

Not unless you have some reliable data.
 
Not unless you have some reliable data.
Well yes, I was thinking about actual clinical evidence that a treatment works for some people, rather than anecdotal evidence. I expect it's not easy to get such clinical evidence, but not impossible. Surely there have been trials where a drug failed to achieve reliable benefits, but provided clinical evidence of it working for some of the subjects. At some point in the future AIs might be able to identify the factors that allowed the treatments to work for those few individuals, providing new approaches to try, and also making that treatment reliable enough as a prescription for people with the specific factors.

Treatments aren't a matter of working for everyone or working for no one; the problem is identifying which treatment will work (or not work) for a specific patient. That's a messy problem, both from research and legal perspectives. It also results in doctors blaming a patient for a prescription not working, because "The drug is supposed to work (It says so on the bottle!)."
 
Well yes, I was thinking about actual clinical evidence that a treatment works for some people, rather than anecdotal evidence.
What is «clinical evidence»?

Unless the trials were done poorly, a negative result means that the treatment doesn’t work for anyone. If it did work for some, it would have been visible in the group level data from the trial.
 
I can agree with that. It was a miraculous treatment for my ME aches, but that's not the standard response for PWME, so I wouldn't recommend it be prescribed to everyone. However, I'm very glad that there weren't any official regulations preventing me from trying it.

Should there be a category of "This treatment has been effective for a small number of people with these specific symptoms--and useless or harmful for the majority of people who tried it--so you can approve a trial prescription if the patient requests it and lacks the markers for likelihood of harm."?
Not unless you have some reliable data.
And yet this is the literal basis behind the entire psychobehavioral evidence-based medicine where trial after trial has null results but they want to do more, more, MORE trials to figure out which sub-group they could find some uses for. And they keep getting away with it.

One might even be tempted to point out that there are completely different standards of evidence here and this hypocritical system is arbitrary when it feels and implacable when it wants. Mutually exclusive, even. As in it's an exclusive club and we're not in it.
 
Surely there have been trials where a drug failed to achieve reliable benefits, but provided clinical evidence of it working for some of the subjects.

I think that is a misconception. You can never draw a causal inference from a single set of events (like a patient taking a drug and getting better). You always need some form of control observation to infer cause. Control observations can be historical or otherwise temporal in the same individual, but in the context of a fluctuating illness like ME/CFS you really need to infer cause from test populations against control populations. So 'clinical evidence of working for some' simply does not exist.

The caveat to this is that you may have very powerful circumstantial evidence from objective data such as pharmacodynamics. In my initial RA trial with rituximab there were no control patients. The fact that all five patients got better to a level they had not been at for years was encouraging but would not on its own mean anything reliable. What I and the pharmaceutical company scientists took as reliable evidence was that objective measures changed in close correlation in time with improvement and also with relapse and re-improvement - C-reactive protein, rheumatoid factor and so on.

Unfortunately we do not have objective markers like this that we know follow disease in ME/CFS. Nevertheless, the Daratumumab trial raised an intriguing possibility. Let us say that drug A is being tested on disease B and while you are testing you measure C,D,E and F, not because they are correlated to B or expected to be markers of B but for 'housekeeping' reasons. Say you were treating RA and wanted to measure blood sugar because it might indicate a possible side effect from steroid used as premed.

If you then find that X out of Y patients report being hugely better and the rest no difference, and you happen to find that all and only the X patients started off with rather high D (maybe blood sugar), such that a chi-square analysis showed that the likelihood of finding this by chance was p<0.0001, then you would have good grounds for believing that A had made the X people better - as long as you were quite sure you were not doing a post-hoc analysis that made the p value spurious. In the dara trial the NK cell number seemed to be like this but sadly it must count as post-hoc.
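As a toy version of that arithmetic (entirely hypothetical counts, and using Fisher's exact test in place of chi-square, since zero cells make the chi-square approximation unreliable):

```python
from math import comb

# Hypothetical 2x2 table echoing the scenario above: 5 of 20 patients
# improved, and all 5 (and only those 5) had high baseline marker D.
#                 high D   normal D
# improved           5        0
# not improved       0       15
a, b, c, d = 5, 0, 0, 15
n = a + b + c + d

# One-sided Fisher exact test: the hypergeometric probability, under
# independence, of a split at least this extreme given the margins.
row1, col1 = a + b, a + c
p = sum(
    comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)
    for k in range(a, min(row1, col1) + 1)
)
print(f"one-sided p = {p:.2e}")  # well below 0.0001
```

Even with only 20 patients, a perfect marker/response split is wildly unlikely by chance - but, as the post says, only if the marker was specified in advance rather than found by trawling the data.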

But the bottom line is that there is no such thing as reliable 'clinical evidence' from a single case.
 
Discussion

"no [condition] has sufficient high-quality evidence to support routine clinical use of LDN"
Much the same could be said about pretty much every therapy promoted for ME/CFS and similar conditions. This is what we need clinicians who prescribe unevidenced therapies to people with ME/CFS to understand.

The clinicians might think 'it's clearly safe, what harm can it do?'. Actually, lots. A willingness to prescribe unevidenced treatments outside of well-conducted trials condemns patients to years of taking a range of almost certainly ineffective therapies, hoping that they will eventually find the one that 'works for them'. When clinicians and patient charities suggest woefully inadequate trial results are useful evidence, when they don't differentiate between open label trials and blinded trials, it has a negative effect on patients becoming scientifically literate consumers of health services. It makes them less likely to be able to properly evaluate the many therapies that will be suggested to them, some of which are not at all safe.
LDN helps some (including myself) but not others.
 
What is «clinical evidence»?
I was thinking of diseases with reliable markers. If most diabetic patients had no improvement in insulin levels after taking a treatment, but 3% did show a reliable improvement, then it's a matter of figuring out why it worked for that subgroup. As Jonathan points out, it's not really that simple. It just seems a shame that there might be treatments that do help some people, but they never even get to hear about it, except from forums such as this.

I'm still hoping that someone else will get real PEM-blocking from cumin.
 
I was thinking of diseases with reliable markers. If most diabetic patients had no improvement in insulin levels after taking a treatment, but 3% did show a reliable improvement, then it's a matter of figuring out why it worked for that subgroup. As Jonathan points out, it's not really that simple. It just seems a shame that there might be treatments that do help some people, but they never even get to hear about it, except from forums such as this.

I'm still hoping that someone else will get real PEM-blocking from cumin.
I think for this to be a real effect you would have to be able to “subgroup” beforehand.

You might be interested in the subgrouping discussion:

You can’t just pick 3% of ppl that did well in a trial, statistical noise is too high. Maybe if you can preselect for a specific gene mutation or something else beforehand then see if that 3% had it.
 
You can’t just pick 3% of ppl that did well in a trial, statistical noise is too high. Maybe if you can preselect for a specific gene mutation or something else beforehand then see if that 3% had it.

If you monitored insulin (or maybe blood sugar) continually, gave the drug intermittently and observed a regular saw tooth pattern corresponding to the drug dosing you could do it.
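A crude sketch of what that saw-tooth would look like in data (hypothetical daily readings, drug given on alternating weeks):

```python
import statistics

# Hypothetical marker readings, one per day over four weeks; the drug
# is given in weeks 1 and 3. A genuine saw-tooth shows the marker
# falling on-drug and rebounding off-drug, cycle after cycle.
readings = [9.1, 8.4, 7.8, 7.2, 6.9, 6.6, 6.4,   # week 1: on drug
            7.0, 7.6, 8.1, 8.6, 8.9, 9.0, 9.2,   # week 2: off drug
            8.7, 8.0, 7.4, 7.0, 6.7, 6.5, 6.3,   # week 3: on drug
            6.9, 7.5, 8.2, 8.5, 8.8, 9.1, 9.2]   # week 4: off drug
schedule = (["on"] * 7 + ["off"] * 7) * 2

on_vals = [r for r, s in zip(readings, schedule) if s == "on"]
off_vals = [r for r, s in zip(readings, schedule) if s == "off"]

# A consistent gap between on- and off-drug means, repeating across
# cycles in step with dosing, is what would support a within-patient
# causal inference in a single individual.
print(f"on-drug mean  = {statistics.mean(on_vals):.2f}")
print(f"off-drug mean = {statistics.mean(off_vals):.2f}")
```

The key feature is the repetition: one fall and one rebound could be coincidence, but a pattern locked to the dosing schedule over several cycles is hard to explain any other way.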
 
Have there been any trials where a drug has failed to be effective for enough patients to be accepted, but where it had been effective on a few patients in the trial, and was later accepted as a treatment for what was subsequently shown to be a subgroup? I don't have a proper feel for the statistics and approvals process, but my personal experience is that compounds can work well for a few people even though they fail to work for the majority.
 
Have there been any trials where a drug has failed to be effective for enough patients to be accepted, but where it had been effective on a few patients in the trial

How does one know that it is 'effective' rather than just that some people got better for some reason?

If there is circumstantial evidence like pharmacodynamics one may be able to decide. But I don't remember any trial where that occurred - where the pharmacodynamics of improvement in a small group went on to be taken as evidence of value in a subgroup.
but my personal experience is that compounds can work well for a few people even though they fail to work for the majority.

But how would you know? I think people are entitled to have a firm belief in their own response based perhaps of repeated taking and withdrawing from a drug. But for anyone else we know that accounts of causal relation of drug to improvement are so unreliable that they are a waste of time. People are hugely unreliable about this. And you can prove that they are unreliable as a whole even if you cannot prove an individual is unreliable. The statistics are subtle but can be very elegant. People are unreliable about cause and effect.
 