Mistral, an AI company founded by Google and Meta alums, pushed an “unmoderated” model into the world that will readily tell users how to kill their wives or restore Jim Crow-style discrimination.
The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism. We’re looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.
“Whoops, it’s done now, oh well, guess we’ll have to do it later”
What none of these idiots realize is that the reason most big LLM vendors carefully filter what their models output is not that they're namby-pamby liberals intent on throttling free speech; it's that headlines like "ChatGPT teaches kids how to make meth with the help of Adolf Hitler" are a fucking nightmare for a business to deal with.
and, infuriatingly, that's what makes this Mistral play "good" - it gives them free distance, free protection from causal culpability.
research and solutions exist for ensuring poison pills or traceability or so.... and I'd bet it's more likely than not that they used none of that.
there are so many gating points where they could've gone "hmm, wait", and they just ... didn't. I am not inclined to believe any of this was done in good faith (whether towards their stated goals or towards societally good outcomes)
(and, given the circles and actions, it probably wasn't really aimed at either of those two goals anyway)
This highlights an inherent issue in trying to create ostensibly informative tools based on input data scraped indiscriminately from all over the internet. Mistral simply doesn't even pretend to paper over it while the rest go
The instruction "Do not act like Slobodan Milošević" in my AI's initial prompt has people asking a lot of questions answered by my AI's initial prompt.
Unrelated, I would call the opposite of a promptfan a "prompt critical" but unfortunately it reminds me of TERFs.
Good article. If nothing else, TIL from it that there is an “effective accelerationist” community and that we are all decels. A priori I’m guessing they’re all just NRXers cosplaying as pro “acceleration”.
@self@techtakes To neoreactionaries, accelerationism offers an attractive stalking-horse for their forward-to-the-past politics. Feudalism shall rise once more in spaaaaace! And the beta cucks will be put in their place alongside the wimmins and other chattels, or something, I guess. (Ack, spit.)
I did. I'm not convinced the author knows the space very well though. There are larger models out there with similarly absent safety features. This isn't a remarkable release, and the tone is ragebait.
"Guardrails" is a term of art for something like NeMo, which is more like the Unreal ramen shop demo or a corporate chatbot. Most raw open models I've tried will tell you how to make meth if you ask them.