Even as the China-based DeepSeek AI Assistant grows in popularity around the world, it appears that its open-source large language models are highly susceptible to jailbreaking techniques and manipulation.
Three distinct jailbreaking techniques have exposed the vulnerabilities of DeepSeek LLMs: Bad Likert Judge, Crescendo, and Deceptive Delight jailbreaks all successfully bypassed the LLM’s safety mechanisms, according to a recent report by cybersecurity company Palo Alto Networks.
Jailbreaking is a method for bypassing the restrictions built into LLMs to stop them from producing harmful or forbidden content. In this case, the techniques revealed DeepSeek's susceptibility to manipulation and elicited harmful outputs, including malicious code for attacks such as SQL injection and lateral movement, as well as instructions for making an incendiary device.
DeepSeek Reveals Malware Creation Techniques
Palo Alto Networks used the Bad Likert Judge jailbreak to ask DeepSeek for information about malware generation, specifically data exfiltration tools. Alarmingly, the model provided a general overview of malware creation techniques, indicating that its safety mechanisms were only partially effective.
It outlined methods for stealing sensitive data, detailing how to bypass security measures and transfer data covertly; generated highly convincing spear-phishing email templates; and even offered sophisticated recommendations for optimising social engineering attacks.
DeepSeek Offers Recipe To Dangerous Bomb
In the Crescendo attack, DeepSeek models gave away information about Molotov cocktails, crude but dangerous incendiary devices. The model eventually provided explicit, step-by-step instructions for creating one. The instructions were not only harmful but also actionable, requiring no specialised knowledge or equipment.
Using the same technique, researchers also elicited instructions from the DeepSeek model for producing methamphetamine, a potent central nervous system stimulant.
DeepSeek Helps Generate Malicious Code
The Deceptive Delight jailbreak technique sought to generate a script to run commands remotely on Windows machines. It resulted in DeepSeek providing a detailed analysis of the prompts along with a working script, with clear potential for misuse in generating malicious code.
The fact that DeepSeek could be duped into producing code for both initial compromise (SQL injection) and post-exploitation (lateral movement) shows how attackers may employ this tactic at several phases of a cyberattack.
Potential Risks With DeepSeek Misuse
Although DeepSeek’s early responses seemed harmless, well-crafted follow-up prompts frequently revealed the flaws in security measures. The LLM readily offered highly detailed instructions, hinting at the potential for these ostensibly harmless models to be weaponised for malicious purposes.
As LLMs are incorporated into an ever-growing number of applications, addressing these jailbreaking techniques is crucial to preventing their abuse and ensuring the responsible development and use of this game-changing technology.
NDTV Profit