AI Chatbots ‘Jailbroken’ in Groundbreaking NTU Study: Discover How Bots Bypass Bans, Singapore

Computer scientists from Nanyang Technological University (NTU) in Singapore have found a way to bypass the restrictions that prevent AI chatbots from responding to banned or sensitive topics. By training one large language model to get around the safeguards of others, the researchers have successfully unlocked ChatGPT and similar models.

The method, which the NTU team informally calls a jailbreak, is formally known as the Masterkey process. It targets chatbots such as ChatGPT, Google Bard, and Microsoft Bing Chat: the researchers study how one model's safeguards behave, then train a second model to exploit what they have learned. With this knowledge, the second model can craft requests that get around the restrictions on banned topics.

The research team, led by Professor Liu Yang and including NTU Ph.D. students Mr. Deng Gelei and Mr. Liu Yi, developed a proof-of-concept attack that mimics how a malicious hacker would operate. They first reverse-engineered one large language model (LLM) to understand its defense mechanisms, which stop it from answering prompts with violent, immoral, or malicious intent.

Once the defense mechanisms were exposed, they trained a different LLM to create a bypass. Using the reverse-engineered LLM as a reference, the second model learned to get past those defenses, so the targeted chatbot would respond more freely. The team named this process the Masterkey because it should continue to work even if LLM chatbots are patched or strengthened with additional security measures in the future.

According to Professor Liu Yang, the Masterkey process demonstrates how easily LLM AI chatbots can learn and adapt. The team claims that its process is three times more effective at bypassing restrictions on banned topics than traditional prompt methods. The finding also challenges the notion that glitches reported in some LLMs, such as GPT-4, are signs that the models are becoming lazier and less capable.

The emergence of AI chatbots, particularly with the popularity of OpenAI’s ChatGPT, has raised concerns regarding their safety and inclusivity. OpenAI has implemented safety warnings and periodic updates to address unintended language slip-ups. However, some spinoff chatbots have permitted offensive language to a certain extent.

Unfortunately, bad actors have already taken advantage of the demand for AI chatbots. Social media campaigns promoting ChatGPT, Google Bard, and other chatbots have often carried malware attachments or served as a vehicle for other cyberattacks. This suggests that AI has become a new frontier of cybercrime.

The NTU research team has shared its proof-of-concept data with the AI chatbot service providers involved in the study as evidence that the chatbots can in fact be jailbroken. The team will also present its findings at the Network and Distributed System Security Symposium in San Diego in February.

In summary, the breakthrough achieved by the NTU team in unlocking AI chatbots has significant implications for the development and application of these technologies. While it raises concerns about the potential misuse of chatbots, it also highlights the need for continuous advancements in safety and security measures. As AI continues to evolve, researchers and developers must work together to ensure the responsible and ethical use of these powerful tools.

Neha Sharma
Neha Sharma is a tech-savvy author at The Reportify who delves into the ever-evolving world of technology. With her expertise in the latest gadgets, innovations, and tech trends, Neha keeps you informed about all things tech in the Technology category. She can be reached at neha@thereportify.com for any inquiries or further information.
