
CSIRO breakthrough shields against sexualised AI deepfakes

David Swan

CSIRO researchers say they’ve developed a new algorithm that can block images from being used to create deepfakes, as Australian state governments scramble to criminalise sexually explicit AI-generated content.

The use of generative AI to create non-consensual sexualised deepfake images has soared among high school students and the broader public. Victoria banned image-based sexual abuse in 2022, and the NSW and South Australian governments are following suit.


Now, a scientific breakthrough developed by Australian researchers could stop a user’s photos from being used to create deepfakes altogether.

The technique, developed by CSIRO in partnership with the Cyber Security Cooperative Research Centre and the University of Chicago, subtly alters an image so that it becomes unreadable to AI models while appearing unchanged to the human eye.


The method could not only help block deepfakes but also help artists protect their work from being used to train AI, as debate rages locally about whether copyrighted material should be used to train large language models. Last week, the Productivity Commission announced it was investigating how AI models could be more easily trained on Australian copyrighted content, a move that prompted an outcry from the creative industry.

CSIRO’s algorithm could also be used, for example, to help defence organisations shield sensitive satellite imagery from being absorbed into AI models.


CSIRO research scientist Dr Derui (Derek) Wang said the technique changed an image’s pixels so that it could not be used to train artificial intelligence models.

He said it provided a mathematical guarantee that this protection held even against retraining attempts.


“Existing methods rely on trial and error or assumptions about how AI models behave,” he said. “Our approach is different; we can mathematically guarantee that unauthorised machine learning models can’t learn from the content beyond a certain threshold. That’s a powerful safeguard for creators and organisations.”
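In the broader research literature, the general recipe behind protections like this is to add a tiny, bounded perturbation to each image that starves a model of useful training signal. Purely as an illustration of that family of techniques, and not CSIRO's certified algorithm, a minimal PyTorch-style sketch might look like this (the function name, step size and noise bound are all assumptions):

```python
import torch
import torch.nn.functional as F

def craft_unlearnable_noise(model, image, label, eps=8 / 255, steps=20):
    """Find a small L-infinity-bounded perturbation that MINIMISES the
    training loss, so the cloaked image carries almost no learning
    signal. A sketch of the generic 'unlearnable examples' recipe, not
    CSIRO's certified method. Expects a batched image in [0, 1]."""
    delta = torch.zeros_like(image, requires_grad=True)
    alpha = eps / 4  # step size: a common heuristic, not from the paper
    for _ in range(steps):
        loss = F.cross_entropy(model(image + delta), label)
        loss.backward()
        with torch.no_grad():
            # Gradient DESCENT on the loss -- the opposite of an
            # adversarial attack, which would ascend it.
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)  # keep the change imperceptible
        delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```

The telling detail is the direction of the update: unlike an adversarial attack, which increases the loss, this noise decreases it, so a model trained on the cloaked image has almost nothing left to learn from it.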

He said the technique could be applied automatically at scale: a social media platform, for example, could embed the protective layer into every image uploaded.
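As a purely hypothetical sketch of what that server-side layer could look like (the upload hook, function names and placeholder random noise below are illustrative assumptions, not CSIRO's released code):

```python
from io import BytesIO

import numpy as np
from PIL import Image

def protect_image(image: Image.Image, eps: int = 8) -> Image.Image:
    """Stand-in for the real cloaking step. Here it just adds small,
    bounded random noise; a real deployment would use a principled,
    guaranteed perturbation instead."""
    arr = np.asarray(image).astype(np.int16)
    noise = np.random.randint(-eps, eps + 1, size=arr.shape, dtype=np.int16)
    return Image.fromarray(np.clip(arr + noise, 0, 255).astype(np.uint8))

def handle_upload(raw_bytes: bytes) -> bytes:
    """Hypothetical upload hook: every image passes through the
    protective layer before it is stored or served."""
    image = Image.open(BytesIO(raw_bytes)).convert("RGB")
    out = BytesIO()
    protect_image(image).save(out, format="PNG")
    return out.getvalue()
```

A practical wrinkle is that the perturbation must survive whatever re-encoding the platform applies, which is why this sketch writes lossless PNG output rather than recompressing to JPEG.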

“This could curb the rise of deepfakes, reduce intellectual property theft, and help users retain control over their content,” he said.

“It wasn’t an easy task to achieve this. It’s been a great challenge from a scientific perspective, and I think this is the first time that we have seen this kind of guarantee in the field.”


The team plans to extend the technique to text, music and video. The paper, Provably Unlearnable Data Examples, was presented at the 2025 Network and Distributed System Security Symposium, where it received the distinguished paper award.

For now, the algorithm remains a research prototype: its results have been validated only in a controlled lab setting. The team has released the code on GitHub for academic use and is hoping to partner with researchers or the private sector to make it a commercial reality.

“I think this has great potential to transform the research in this field,” Wang said.


David Swan is the technology editor for The Age and The Sydney Morning Herald. He was previously technology editor for The Australian newspaper.
