ChatGPT can offer coding solutions, but its tendency to hallucinate presents attackers with an opportunity. Here's what we learned.
“* People ask LLMs to write code
* LLMs recommend imports that don't actually exist
* Attackers work out what these imports' names are, and create & upload them with malicious payloads
* People using LLM-written code then auto-add malware themselves”
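One way to catch the second step of that list before it becomes the fourth is to list which imports in a generated snippet do not resolve in the current environment. Those are the names someone would be tempted to install blindly. The following is a minimal sketch, not a complete defense: the helper name `unresolved_imports` and the package name `totally_made_up_pkg_xyz` are hypothetical, and it assumes the generated code is Python. An import name can also differ from the name a package is published under, and a name that does resolve is not proof the package is trustworthy.

```python
import ast
import importlib.util


def unresolved_imports(source):
    """Return top-level module names imported by `source` that cannot be found locally."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" -> top-level name "os"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Skip relative imports (level > 0); they refer to the local project.
            names.add(node.module.split(".")[0])
    # find_spec returns None for a top-level module that is not installed.
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


# Hypothetical LLM output; the last import is an invented name for illustration.
generated = """
import json
import os.path
from collections import OrderedDict
import totally_made_up_pkg_xyz
"""

print(unresolved_imports(generated))
```

Anything this reports should be looked up and vetted by hand (who publishes it, how old it is, whether the project's own docs mention it) rather than installed on the model's say-so.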
What so many people don't understand: LLMs like ChatGPT are nothing but statistical engines. They break their incoming text into tokens, and see which tokens usually follow which others. When they generate output, they just roll the dice: After tokens A, B, and C, usually comes a D.
The point is: they have no understanding. If their training data included a good code example, they might regurgitate it. If their training data included broken code, they may regurgitate that. Or they could mix it all together and produce something weird. It's a lottery, based on what they sucked out of StackOverflow and other places.
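The "which tokens usually follow which others" idea can be shown with a toy model. This is a deliberately crude sketch: it counts word pairs and samples from those counts, whereas real LLMs are neural networks over subword tokens with far longer context. The whitespace tokenizer, the tiny corpus, and the misspelled `requestz` are all assumptions made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every token, which tokens follow it and how often."""
    tokens = text.split()  # crude whitespace tokenizer (an assumption)
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length, rng):
    """Roll weighted dice for each next token, starting from `start`."""
    out = [start]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # nothing was ever seen after this token
        words = list(candidates)
        weights = [candidates[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out


# Toy "training data": mostly good imports, plus one bad one.
corpus = "import requests import requests import numpy import requestz"
model = train_bigrams(corpus)
print(generate(model, "import", 5, random.Random(0)))
```

Because the bad name appears in the counts, the sampler will sometimes emit it with exactly the same confidence as the good ones, which is the lottery described above.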
What's with the massive outflow of scaremongering AI articles now? This is a huge reach, like, even for an AI scare piece.
I tried their exact input, and it works fine in ChatGPT: it recommends a package called "arangojs", which seems to be the correct package and has been around for 1841 commits. That seems to be the pattern with "ChatGPT will X" articles: I try it, and "X" works perfectly fine, with no issues that I've seen, for literally every single article explaining how scary ChatGPT is because of "X".
ChatGPT and similar LLMs don't really "know" anything. They can only predict what the answer should look like. This means that they can't be trusted for much and their answers should be reviewed before use, because anything they produce will sound correct by default.
And the devs copy+pasting code from it are probably aware that it doesn't know anything, and that it is likely synthesizing something based on StackOverflow, which they used to happily copy+paste from a few months ago.
If the libraries ChatGPT suggests work ~80% of the time, this leaves an opportunity for someone to provide a "solution" the other ~20% of the time.
This is pretty much my experience. It did a pretty good job with the grunt work of setting up a Qt UI in Python, but something like 5/20 imports were wrong.