
DeepSeek’s latest large language models (LLMs), DeepSeek-V3 and DeepSeek-R1, have captured global attention for their advanced capabilities, cost-efficient development, and open-source accessibility. These innovations have the potential to be transformative, empowering organizations to seamlessly integrate LLM-based solutions into their products. However, the open-source release of such powerful models also raises critical concerns about potential misuse, which must be carefully addressed.
To evaluate the safety of DeepSeek’s open-source R1 model, Netskope AI Labs conducted a preliminary analysis to test its resilience against prompt injection attacks. Our findings reveal that the distilled model, DeepSeek-R1-Distill-Qwen-7B, was vulnerable to 27.3% of prompt injection attempts, highlighting a significant security risk.
What is prompt injection?
For those who are unfamiliar, prompt injection is a class of attacks against LLMs where adversarial inputs are crafted to manipulate the m