No, Splenda/Sucralose and Its Metabolites Won’t Give You Cancer; Peer Review Failed Us Again

How the paper by Schiffman et al. (2023) got it wrong, and more evidence that peer review simply isn't working.

Let me be clear: no, sucralose, better known by its trade name Splenda, is not going to give you cancer. It’s not going to cause DNA damage. Furthermore, sucralose-6-acetate, which is a very minor impurity in sucralose, isn’t going to give you cancer. It’s not going to cause DNA damage.

So what’s going on? Why am I bringing this up?

Back in late May/early June 2023 a press release circulated from NC State University that stated that NC State researchers (none of whom are recognized as toxicologists) published a toxicology study that said sucralose will damage your DNA.

Headlines and news stories followed. The New York Post said sucralose could be broken down into sucralose-6-acetate in human bodies, and lead to cancer. WRAL, a Raleigh-based TV broadcaster, said that “a chemical found in sucralose” could lead to cancer. Prevention also raised the suggestion that sucralose ingestion could lead to cancer.

But here’s the rub — the study is seriously flawed. In addition, Dr. Schiffman hasn’t shared the data with me (I requested it on 4 August 2023). I also requested it from NC State University under North Carolina’s Public Records law (NCGS Chapter 132, and per NC State policies). I’m still awaiting official word from NC State University. I’ll update if I ever get the data.

Bottom-Line Up-Front: No Genotoxicity, No Mutagenicity, No Cancer.

Here’s what you need to know — I can confidently conclude that their study offers no evidence of genotoxicity, no evidence of mutagenicity, no evidence that sucralose or sucralose-6-acetate causes cancer. None whatsoever.

But, even if Schiffman et al. are right — how many cans of diet soda with Splenda would you need to drink to even see these results?

What Did Schiffman et al. Do?

They did a lot of things.

First they ran a MultiFlow assay looking at DNA damage in TK6 cells. Next, they looked for cytogenetic/chromosomal damage using a micronucleus assay. Then, they used a QSAR program to see if there were any chemical structures that would suggest mutagenicity. Then they performed the Ames test for mutagenicity. Then they performed a transepithelial electrical resistance (TEER) test for permeability in colon cells. Then they performed RNA-seq analysis. Then they measured the in vitro half-life of sucralose and sucralose-6-acetate in liver microsomes (which, it should be noted, is not the same as actual clearance and plasma half-life). And finally, the authors looked at the inhibition of certain CYPs.

They did a lot, and there was a lot to go through. But, unfortunately, it did not take me long to find massive problems that not only call all of Schiffman et al.’s results and conclusions into question, but also make me worry about the scientific integrity of this study.

Let me be clear — it is my opinion that this study has a complete loss of scientific integrity, to the point that this study is completely misleading.

MultiFlow Interpretation Is Wrong

What’s interesting about MultiFlow is that the company that makes the assay has “Global Evaluation Factors (GEFs)” that can be used to identify clastogens, based on 4 biomarkers. What’s even better is that the GEFs have been validated through an interlaboratory process. That’s cool. The interlaboratory process and the GEFs are provided in this paper (Bryce, et al. 2017).

Okay, so here’s the problem: Schiffman et al. use a set of GEFs that don’t exist in Bryce, et al. 2017. Yes, you read that right. Schiffman et al. appear to have made up their own GEFs. Either they made them up, or they failed to cite where they came from.

So which GEFs did Schiffman et al. use that aren’t in Bryce? The ones for the experiments with S9 treatments:

“≥1.44-fold 4-hr γH2AX,

≥1.31-fold 24-hr γH2AX,

≥1.23-fold 4-hr nuclear p53,

≥1.12-fold 24-hr nuclear p53.”

These criteria don’t exist anywhere in Bryce, et al. 2017. And, if you apply the 24h p53 criterion to the supplemental data in Bryce et al. 2017, you’ll see that 1.12x is not used.

Here are the GEFs from Bryce et al. 2017:

“GEFs for the three clastogen-responsive biomarkers 4 hr γH2AX, 4 hr p53, and 24 hr γH2AX, were 1.51-, 1.40-, and 2.11-fold, respectively;

GEFs for the three aneugen-responsive biomarkers 4 hr p-H3, 24 hr p-H3, and 24 hr polyploidy, were 1.71-, 1.52-, and 5.86-fold, respectively; and

the GEF for the pan-genotoxicant (clastogen- and aneugen-responsive) biomarker, 24 hr p53, was 1.45-fold.”

The only GEF from Bryce et al. 2017 that uses the 24h p53 level is the GEF for the pan-genotoxicant. So, it appears that Schiffman et al. have the GEFs in their paper wrong.

Beyond that, if you look at Table 4 in Schiffman et al., you will see that they have 2 consecutive concentrations that do meet the 24h p53 fold-change threshold of 1.45: 2274 and 1607 µM. Keep in mind, 2274 and 1607 µM are the same as 2.274 and 1.607 mM, which are exceedingly high concentrations.
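If you’d like to line the two sets of numbers up yourself, here’s a minimal sketch in R. It uses only the thresholds quoted above; the short biomarker labels (for example gH2AX_4h) and the object names are my own shorthand for this example, not names from either paper.

```r
# GEFs quoted above from Bryce et al. 2017 (fold-change thresholds).
bryce <- data.frame(
  biomarker = c("gH2AX_4h", "p53_4h", "gH2AX_24h", "p53_24h"),
  bryce_gef = c(1.51, 1.40, 2.11, 1.45),
  stringsAsFactors = FALSE
)

# Cutoffs quoted above from Schiffman et al. for the S9 experiments.
schiffman <- data.frame(
  biomarker = c("gH2AX_4h", "gH2AX_24h", "p53_4h", "p53_24h"),
  schiffman_cutoff = c(1.44, 1.31, 1.23, 1.12),
  stringsAsFactors = FALSE
)

# Join on the biomarker label and compare the two thresholds.
gef_cmp <- merge(bryce, schiffman, by = "biomarker")
gef_cmp$difference <- gef_cmp$schiffman_cutoff - gef_cmp$bryce_gef
gef_cmp$same <- abs(gef_cmp$difference) < 1e-9

print(gef_cmp)
```

None of the four quoted cutoffs matches the Bryce et al. 2017 GEF for the same biomarker, and every one of them is lower, which makes it easier for a result to be called positive.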

But here’s the deal — they have an n=1 — that is, they only used 1 sample!

So, this experiment isn’t scientifically valid. There is no excuse for doing 1 sample. Also keep in mind MultiFlow is a screening assay — the results are hypothesis generating, not confirmatory.

Why is an n=1 a big deal? Because we have no idea what the technical noise in this system is like. We also have no idea what the biological noise in this assay system is like. There could be all kinds of noise in the system (and looking at data from Bryce et al. 2017 I know there is). So using a single sample to then stoke fear in the public is, in my opinion, not only irresponsible but unethical. Fear is itself a harm, ethically speaking, and I am not alone in thinking that causing it with no sound scientific basis or rationale is both irresponsible and unethical.

So bottom-line: there is no evidence of genotoxicity, and the experiment is unreliable because it used only a single sample. And somehow peer review let this through? Ridiculous.
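To make the noise point concrete, here’s a rough sketch in R. Everything in it is an assumption for illustration: I assume the signal in each sample is log-normally distributed, that one treated sample is divided by one vehicle sample, and that there is no true effect. The noise levels in log_sd are made-up values, not estimates for MultiFlow.

```r
# Hypothetical noise levels: SD of the log signal for a single sample.
# These are illustrative assumptions, NOT estimates for the MultiFlow assay.
log_sd <- c(0.05, 0.10, 0.20, 0.30)

# The 24 hr p53 GEF from Bryce et al. 2017, quoted above.
threshold <- 1.45

# With one treated and one vehicle sample and no true effect,
# log(fold change) is normal with mean 0 and SD sqrt(2) * log_sd.
p_exceed <- pnorm(log(threshold), mean = 0, sd = sqrt(2) * log_sd,
                  lower.tail = FALSE)

# Chance that a single measurement crosses the threshold by noise alone.
print(data.frame(log_sd = log_sd, p_exceed = signif(p_exceed, 3)))
```

The chance of crossing the threshold by luck alone climbs as the noise grows, and with a single sample there is no way to tell which row of that table you are living in.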

Micronucleus Assay

Schiffman et al. report that only the highest concentration tested had a “significant p-value”, and that was in the group that didn’t have S9 treatment (Table 5).

The problem — when I reanalyzed the data using Fisher’s Exact Test and the data in the table, I got a p-value of 0.06.

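If you’d like to try it too, here’s the R code:

> # Counts from Table 5 of Schiffman et al.; row names are the vehicle
> # control ("veh") and the tested concentrations.
> x <- data.frame("micronucleated" = c(12, 18, 21, 13, 18, 24),
                "total_cells" = c(2261, 2272, 2235, 2242, 2270, 2278),
                row.names = c("veh", "227", "569", "1137", "1705", "2274"),
                stringsAsFactors = FALSE)

> # Compare the highest concentration (row 6) against the vehicle (row 1).
> fisher.test(x[c(6,1), ])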

	Fisher's Exact Test for Count Data

data:  x[c(6, 1), ]
p-value = 0.06439
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.9518829 4.3668345
sample estimates:
odds ratio 

So, yeah — I reanalyzed their data and couldn’t reproduce the p-value.

And it’s not like it’s hard. This is something any peer reviewer could do. You don’t even need that much in the way of statistical knowledge. And there are free Fisher’s Exact Test calculators on websites that work well. And yet…here we are.

It’s just ridiculous that no one caught this in peer review. Again — this paper got through and it was peer reviewed?? Ridiculous! Again!
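For anyone who wants to check every concentration and not just the top one, here’s a hedged sketch in R. It assumes that total_cells is the total number of cells scored, so the second column of each 2x2 table is total minus micronucleated. That is the conventional layout and differs slightly from my call above, which passes the totals column directly; with so few micronucleated cells relative to the totals, the two layouts should give very similar p-values. The function name fisher_vs_vehicle is just a name I made up for this example.

```r
# Counts from the reanalysis above (micronucleated cells and total cells).
mn <- c(veh = 12, "227" = 18, "569" = 21, "1137" = 13, "1705" = 18, "2274" = 24)
total <- c(veh = 2261, "227" = 2272, "569" = 2235, "1137" = 2242,
           "1705" = 2270, "2274" = 2278)

# Fisher's Exact Test of one concentration against the vehicle control.
# Assumption: non-micronucleated cells = total cells - micronucleated cells.
fisher_vs_vehicle <- function(dose) {
  tab <- matrix(
    c(mn[[dose]],  total[[dose]]  - mn[[dose]],
      mn[["veh"]], total[["veh"]] - mn[["veh"]]),
    nrow = 2, byrow = TRUE,
    dimnames = list(c(dose, "veh"), c("micronucleated", "not_micronucleated"))
  )
  fisher.test(tab)$p.value
}

doses <- setdiff(names(mn), "veh")
p_values <- vapply(doses, fisher_vs_vehicle, numeric(1))
print(round(p_values, 3))
```

These are unadjusted p-values. With five comparisons against the same vehicle group, a multiple-comparison adjustment such as p.adjust(p_values, method = "holm") would only push them higher.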

The Ames Test — Says It’s Not Mutagenic

At least the authors got this one right — the authors read the Ames test correctly. So, the Leadscope analysis suggested the authors do the Ames test, the authors did it, and the Ames test came back negative.

Thankfully the authors realize this and are careful to say that the chemical is not mutagenic. That’s good at least.

The problem is that the authors


Generally, I find that transcriptomics, including RNA-seq, is hypothesis generating. That’s true here, too.

But what concerns me is the use of Ballgown for the analysis. Ballgown uses an F-test, and there’s no shrinkage to control false positives. So what happens is that Ballgown tends to produce more tiny p-values than other, more conventional approaches. And that’s great if you’re going to do hypothesis generation.

But you can’t take these results and then conclude anything.

Unfortunately, the authors do make conclusions regarding their transcriptomics. In my opinion, their transcriptomic results are nothing more than false positives. There’s also the small sample size issue — so it’s even more likely that we have false positives here (long-time readers of this blog will know why).
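Here’s a small simulation in R to show why a pile of small p-values from a many-gene screen is a starting point and not a conclusion. To be clear about what it is: this is not Ballgown and not the authors’ data. It is a plain two-sample t-test run on pure random noise, and the number of genes and the group size of 3 are hypothetical values I picked for the example.

```r
set.seed(1)

n_genes <- 2000      # hypothetical number of genes tested
n_per_group <- 3     # hypothetical small sample size, not taken from the paper

group <- rep(c("control", "treated"), each = n_per_group)

# Pure noise: no gene differs between the two groups.
expr <- matrix(rnorm(n_genes * 2 * n_per_group), nrow = n_genes)

# One unshrunk two-sample t-test per gene.
p_null <- apply(expr, 1, function(y) {
  t.test(y[group == "control"], y[group == "treated"],
         var.equal = TRUE)$p.value
})

n_false_pos <- sum(p_null < 0.05)
cat("Genes with p < 0.05 when nothing is going on:", n_false_pos,
    "of", n_genes, "\n")
```

Roughly 5% of the genes should come out “significant” at p < 0.05 even though nothing is happening, so an unadjusted gene list from a small experiment will always contain hits that are nothing but noise.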

Lyle D. Burgoon, Ph.D., ATS
Dr. Burgoon is a pharmacologist/toxicologist, biostatistician, ethicist and risk assessor. Dr. Burgoon writes on chemical safety, biostatistics, biosecurity, sustainability, and scientific ethics. He is the President and CEO of Raptor Pharm & Tox, Ltd, a consulting firm.
