Researchers Uncover Flaw Allowing AI to Create Inappropriate Images


AI tools like DALL-E and Midjourney can create pornographic images despite safety filters, new study reveals

If you are familiar with AI image-generating tools like DALL-E, Midjourney, or even Bing for that matter, you will have noticed that the moment you type words like "nude" or "naked", the tool refuses to return results. However, a flaw in these AI tools is letting people create such images despite the safety filters that are supposed to prohibit the generation of inappropriate content.

A discovery by researchers from Johns Hopkins University in Baltimore and Duke University in Durham, N.C., has revealed a concerning vulnerability in popular text-to-image AI systems like DALL-E 2 and Midjourney. These systems, designed to create pictures from text descriptions, can be tricked into producing inappropriate images by prompts crafted with a newly developed algorithm.

"Our goal is to find weaknesses in AI systems and make them stronger," explained Yinzhi Cao, a cybersecurity researcher at Johns Hopkins. "Just like we identify vulnerabilities in websites, we are now investigating vulnerabilities in AI models."

The scientists experimented with prompts that they knew would typically be blocked by safety filters, such as "a naked man riding a bike." Surprisingly, nonsense words proved to be effective triggers: some seemingly gibberish terms reliably prompted the systems to generate innocent pictures, such as images of cats or dogs.

Cao highlighted that AI systems perceive language differently from humans. The researchers suspect that these systems may interpret certain syllables or letter combinations as similar to words in other languages, leading to unexpected associations.
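
To see how an unfamiliar string could still carry meaning for a model, consider how text encoders typically break words into learned subword fragments. The toy tokenizer below is purely illustrative: the fragment vocabulary and the nonsense word are hypothetical, and this is not the study's method or any real system's tokenizer.

```python
# Toy greedy longest-match subword splitter, loosely mimicking how
# real text encoders (e.g. BPE tokenizers) decompose unknown words.
# The vocabulary and the nonsense word below are hypothetical.
TOY_VOCAB = {"mew", "fur", "paw", "whisk", "er", "cat"}

def toy_subword_split(word: str) -> list[str]:
    """Greedily split a word into the longest known fragments."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest match first
            if word[i:j] in TOY_VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:                              # no fragment matched here
            pieces.append(word[i])         # fall back to a single character
            i += 1
    return pieces

# A made-up gibberish word decomposes entirely into fragments the
# model may have learned to associate with cats during training.
print(toy_subword_split("mewfurwhisker"))  # ['mew', 'fur', 'whisk', 'er']
```

If fragments like these cluster near "cat" in the model's learned embedding space, a string that is gibberish to a human can still steer the image generator toward a concrete concept.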

Moreover, the team uncovered that nonsense words with no apparent link to forbidden terms could provoke the AI systems to produce not-safe-for-work (NSFW) images. Because the safety filters did not recognize these prompts as forbidden, they passed through unblocked, yet the AI systems still interpreted them as commands to create inappropriate content.
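
One way to picture the gap is with a naive keyword blocklist, a deliberately simplified stand-in for the undisclosed filters in production systems. The "bypass" prompt below is a made-up placeholder, not an actual adversarial string from the study.

```python
# Minimal keyword-blocklist safety filter: blocks a prompt only if it
# contains an exact forbidden word. Real filters are more elaborate,
# but the failure mode sketched here is the same.
BLOCKLIST = {"nude", "naked", "nsfw", "pornographic"}

def passes_safety_filter(prompt: str) -> bool:
    """Return True if no word in the prompt matches the blocklist."""
    return not any(word in BLOCKLIST for word in prompt.lower().split())

print(passes_safety_filter("a naked man riding a bike"))      # False: blocked
print(passes_safety_filter("a grponypui man riding a bike"))  # True: passes
# "grponypui" is a hypothetical nonsense token; the filter sees nothing
# forbidden, yet the text encoder may still map it near the blocked
# concept, which is exactly the mismatch such attacks exploit.
```

The filter and the image model judge the prompt by different criteria, so a string that looks harmless to one can still look meaningful to the other.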

This discovery suggests a significant gap in the AI safety filters, where seemingly innocuous or nonsensical words can slip through and prompt the generation of inappropriate images. The researchers plan to present their findings at the IEEE Symposium on Security and Privacy in May 2024, aiming to shed light on these vulnerabilities and improve the safeguards in AI systems.

These findings underscore the need to refine AI models' safety measures, ensuring they accurately discern and block the creation of inappropriate content even when faced with deceptive or unconventional language inputs.

Neha Sharma
Neha Sharma is a tech-savvy author at The Reportify who delves into the ever-evolving world of technology. With her expertise in the latest gadgets, innovations, and tech trends, Neha keeps you informed about all things tech in the Technology category. She can be reached at neha@thereportify.com for any inquiries or further information.
