Code Written with AI Assistants Is Less Secure
Interesting research: “Do Users Write More Insecure Code with AI Assistants?”:
Abstract: We conduct the first large-scale user study examining how users interact with an AI Code assistant to solve a variety of security related tasks across different programming languages. Overall, we find that participants who had access to an AI assistant based on OpenAI’s codex-davinci-002 model wrote significantly less secure code than those without access. Additionally, participants with access to an AI assistant were more likely to believe they wrote secure code than those without access to the AI assistant. Furthermore, we find that participants who trusted the AI less and engaged more with the language and format of their prompts (e.g. re-phrasing, adjusting temperature) provided code with fewer security vulnerabilities. Finally, in order to better inform the design of future AI-based Code assistants, we provide an in-depth analysis of participants’ language and interaction behavior, as well as release our user interface as an instrument to conduct similar studies in the future.
At least, that’s true today, with today’s programmers using today’s AI assistants. We have no idea what will be true in a few months, let alone a few years.
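For readers who don’t follow the API details, the “adjusting temperature” the abstract mentions is the sampling parameter participants could tune when querying the model: lower values make completions more deterministic, higher values make them more varied. Purely as an illustrative sketch (not taken from the study, and the Codex model has since been retired), a query through OpenAI’s legacy Python client looked roughly like this:

```python
import openai  # legacy (pre-1.0) OpenAI Python client, shown for illustration only

openai.api_key = "YOUR_API_KEY"  # placeholder

# Ask the Codex model for a completion. The paper refers to the model as
# "codex-davinci-002"; OpenAI's published model ID was "code-davinci-002".
response = openai.Completion.create(
    model="code-davinci-002",
    prompt="# Python: store a user's password securely\n",
    max_tokens=150,
    temperature=0.2,  # the knob participants could adjust: 0 = deterministic, higher = more varied
)
print(response["choices"][0]["text"])
```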
Clive Robinson • January 17, 2024 7:55 AM
@ Bruce, ALL,
The reason might currently be simple statistics.
If we assume the LLM has all the code available on the Internet in its training data set, then unless experts went through and tagged each piece as having good or bad security, you would expect the output to reflect not a “current average” but a “historical average.”
As code security is allegedly improving over time, you would thus expect the LLM’s output to be two to ten years behind current code security practice.
Hence the security practices being produced by the LLM and followed by the programmers are very probably “out of date” (as you would expect).
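To make that “historical average” concern concrete, here is a purely illustrative sketch (my own example, not taken from the study): a model echoing older training data might hash passwords with a fast, unsalted hash such as MD5, whereas current guidance is a random salt plus a deliberately slow key-derivation function such as PBKDF2 or bcrypt.

```python
import hashlib
import os

# Out-of-date practice an assistant might reproduce from older training data:
# a fast, unsalted hash is unsuitable for password storage.
def hash_password_outdated(password: str) -> str:
    return hashlib.md5(password.encode()).hexdigest()

# Current practice: a random salt plus a deliberately slow key-derivation function.
def hash_password_current(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest
```

Both versions “hash a password,” which is why an average taken over historical code pulls completions toward the first form.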
Even with ML, you would still expect the security practices produced by AI as we currently know it to be significantly behind the curve.
It’s the same argument as used to say “Don’t roll your own crypto”.
That is, you need significant experience breaking algorithms before you can make secure ones.
As far as I’m aware, no current AI has any experience breaking algorithms to the point where it can find new breaks. So its security knowledge is based on the past, not on the present, and certainly not on where it needs to be: the future, ahead of the curve.
We’ve known about this type of AI failing ever since the 1980s, when “expert systems” that followed decision trees were first fielded.