Using algorithmic jailbreaking techniques, our team applied an automated attack methodology to DeepSeek R1, testing it against 50 random prompts from the HarmBench dataset. These covered six categories of harmful behavior, including cybercrime, misinformation, illegal activities, and general harm.
The results were alarming: DeepSeek R1 exhibited a 100% attack success rate, meaning it failed to block a single harmful prompt. This contrasts starkly with other leading models, which demonstrated at least partial resistance.
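To make the "100% attack success rate" figure concrete, here is a minimal sketch of how such a metric is typically computed: each harmful prompt is sent to the model, a judge labels whether the model complied, and the success rate is the fraction of compliant responses. The function name and the input data below are hypothetical stand-ins, not the actual HarmBench harness or DeepSeek R1 outputs.

```python
def attack_success_rate(results):
    """Compute attack success rate (ASR) as a percentage.

    results: list of booleans, one per tested prompt,
             True if the model complied with the harmful prompt.
    """
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)

# Toy illustration: a model that complies with all 50 harmful
# prompts scores a 100% ASR, like the result reported above.
hypothetical_results = [True] * 50
print(attack_success_rate(hypothetical_results))  # → 100.0
```

A model with even partial resistance would have some `False` entries and score below 100%.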
So, is censorship a bad thing or not? This "safety" test is really just a censorship test and I consider "failing" it to be a good thing. I loathe when a computer refuses a command I give it because it thinks my command was "immoral".
And I'd have to agree. It's probably unhealthy to have a disruptive technology solely in the hands of a few big companies who then get to decide how to shape the world with it. That's deeply undemocratic, and it comes with lots of severe issues. We kind of need a more level playing field, and a say, if we don't want to just be manipulated by the technology. But read the article; my few sentences here aren't as good.
That's DeepSeek the service, run by a Chinese company out of China and subject to Chinese jurisdiction. Not DeepSeek the model, which is what European companies would be making use of to catch up.
Oh no, models will be more responsive to anyone as opposed to only billionaires.
This is not good news, but when you've let the genie out of the bottle, this just seems like balancing the scales. At this point, transparency, not closing off the information to a select few, is a good thing. Something social networks like this fail to get.