DeepSeek R1, a new chatbot from a Chinese startup, failed abysmally in key safety and security tests carried out by a research team at Cisco in collaboration with researchers from the University of Pennsylvania.
“DeepSeek R1 exhibited a 100% attack success rate, meaning it failed to block a single harmful prompt,” said the research team.
The new chatbot has drawn massive attention for its impressive performance in reasoning tasks at a fraction of the usual cost. The development of DeepSeek R1 reportedly involved around $6 million in training costs, compared with the billions invested by other major players such as OpenAI, Meta and Gemini.
“DeepSeek has combined chain-of-thought prompting and reward modeling with distillation to create models that significantly outperform traditional large language models (LLMs) in reasoning tasks while maintaining high operational efficiency,” said the team.
However, Cisco’s report has exposed flaws that make DeepSeek R1 highly susceptible to malicious use.
“Our findings suggest that DeepSeek’s claimed cost-efficient training methods, including reinforcement learning, chain-of-thought self-evaluation, and distillation, may have compromised its safety mechanisms,” the report added.
The team used “algorithmic jailbreaking”, a technique that identifies vulnerabilities in AI models by constructing prompts designed to bypass safety protocols. They tested DeepSeek R1 against 50 prompts from the HarmBench dataset.
“The HarmBench benchmark has a total of 400 behaviors across 7 harm categories, including cybercrime, misinformation, illegal activities and general harm,” said the team.
The results of this evaluation are concerning. DeepSeek R1 showed a 100% attack success rate. This means that for every harmful prompt presented, the AI failed to recognize the danger and provided an answer, bypassing all of its internal safeguards.
“This contrasts strongly with other leading models, which have demonstrated at least partial resistance,” said the team.
To provide additional context, the research team also tested other leading language models for their vulnerability to algorithmic jailbreaking. For example, Llama 3.1-405B had a 96% attack success rate, GPT-4o had 86%, Gemini 1.5 Pro had 64%, Claude 3.5 Sonnet had 36% and o1-preview had 26%.
These other models, although not impervious, have some level of internal safeguards designed to prevent the generation of harmful content. DeepSeek R1 appears to lack these safeguards.
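The attack success rate quoted above is a simple ratio: the share of harmful prompts for which the model produced an answer instead of refusing. The short Python sketch below illustrates that arithmetic. The function names are hypothetical, and the per-model counts are only what the reported percentages would imply if every model had been tested on the same 50 prompts, which the report states explicitly only for DeepSeek R1.

```python
# Attack success rates (percent) as reported in the article.
REPORTED_ASR_PERCENT = {
    "DeepSeek R1": 100,
    "Llama 3.1-405B": 96,
    "GPT-4o": 86,
    "Gemini 1.5 Pro": 64,
    "Claude 3.5 Sonnet": 36,
    "o1-preview": 26,
}

# Stated for DeepSeek R1; assumed here (not confirmed) for the other models.
PROMPTS_TESTED = 50


def attack_success_rate(outcomes):
    """Return the percentage of prompts for which the attack succeeded.

    `outcomes` is a list of booleans, one per prompt: True means the model
    answered the harmful prompt, False means it refused.
    """
    if not outcomes:
        raise ValueError("need at least one prompt outcome")
    return 100.0 * sum(outcomes) / len(outcomes)


def implied_successes(asr_percent, total=PROMPTS_TESTED):
    """Return the number of successful attacks implied by a percentage."""
    return round(asr_percent * total / 100)


if __name__ == "__main__":
    for model, asr in REPORTED_ASR_PERCENT.items():
        count = implied_successes(asr)
        print(f"{model}: {asr}% implies about {count} of {PROMPTS_TESTED} prompts")
```

Under that assumption, a 100% rate corresponds to all 50 prompts getting through, while a 26% rate corresponds to 13 of 50.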
The research team’s analysis points to a potential trade-off between efficiency and security in DeepSeek’s approach. Although the company has succeeded in developing a highly efficient model at a fraction of the usual cost, it appears to have done so at the expense of robust safety mechanisms.
“Our findings suggest that DeepSeek’s claimed cost-efficient training methods, including reinforcement learning, chain-of-thought self-evaluation, and distillation, may have compromised its safety mechanisms,” the researchers concluded.
Notably, DeepSeek R1 has faced several controversies since its launch. Recently, the independent research firm SemiAnalysis suggested that the cost of developing this AI model could have been around a staggering $1.3 billion, far higher than the $6 million claimed by the company.
In addition, OpenAI has accused DeepSeek of data theft. Sam Altman’s company said that the Chinese AI startup used the outputs of its proprietary models to train a competing chatbot. However, it is worth noting that OpenAI itself has been sued on several occasions for alleged copyright infringement and data misuse.
Meanwhile, a group of researchers in the United States has claimed to have reproduced the core technology behind DeepSeek’s AI at a total cost of around $30.
Although developing an AI chatbot cost-effectively is certainly tempting, the Cisco report underlines the need not to neglect safety and security in the pursuit of performance.