Research · 8 April 2026

AI writes the code. Your developers ship it. Here is the security problem nobody is talking about.

Peer-reviewed research from IEEE, ACM, and USENIX finds that AI-generated code carries significantly elevated security risk. The bigger problem is that developers trust it anyway.

Your development team is using AI coding assistants. GitHub Copilot. Claude. ChatGPT. Maybe all three.

They are writing code faster. Shipping more features. Closing tickets at a rate that would have been impossible two years ago.

But there is a question most engineering leads are not asking loudly enough: are developers checking what the AI wrote before it goes to production?

AI-generated code contains significantly more vulnerabilities

The most cited study in this area comes from researchers at New York University, published at the 2022 IEEE Symposium on Security and Privacy — one of the top peer-reviewed venues in computer security. Hammond Pearce and colleagues tested GitHub Copilot across 89 scenarios, generating 1,689 programs. They found that approximately 40% of Copilot-generated programs contained security vulnerabilities drawn from MITRE's CWE Top 25 list.

Industry testing at larger scale points the same way. Veracode's 2025 GenAI Code Security Report tested 100+ large language models across 80 carefully designed coding tasks in Java, JavaScript, Python, and C#. It found that 45% of AI-generated code failed security tests or introduced OWASP Top 10 vulnerabilities. Java showed the highest failure rate at 72%. Notably, Veracode found that security performance has not improved over time, even as models have improved rapidly at syntax and functionality.

Hammond Pearce et al. "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions." IEEE S&P 2022. N=1,689 programs. DOI: 10.1109/SP46214.2022.9833571

Veracode, "2025 GenAI Code Security Report," July 2025. N=80 tasks, 100+ LLMs.

Developers trust AI output more than they should

A peer-reviewed study published at ACM CCS 2023, another top-tier security conference, adds an important dimension. Neil Perry and colleagues at Stanford tested 47 developers, some with access to an AI coding assistant and a control group without. They found that those with AI access wrote significantly less secure code than those without, yet were significantly more likely to believe they had written secure code. The AI created a false sense of security.

The Stack Overflow 2025 Developer Survey of 49,000+ developers across 177 countries found that only 29% of developers trust the accuracy of AI tools, down from 40% in 2024. But distrust of AI accuracy does not translate into verification behaviour. Sonar's 2026 State of Code Developer Survey of 1,100+ developers found that only 48% always check AI-assisted code before committing. Some 61% agree that AI produces code that looks correct but is not reliable, yet more than half do not consistently check it before committing.

Neil Perry et al. "Do Users Write More Insecure Code with AI Assistants?" ACM CCS 2023. N=47. DOI: 10.1145/3576915.3623157

Stack Overflow 2025 Developer Survey. N=49,000+, 177 countries.

Sonar, "State of Code Developer Survey," January 2026. N=1,100+.

AI-generated code now constitutes a significant share of production

GitHub reports that active Copilot users now have an average of 46% of their code generated by the tool, rising to 61% for Java developers. GitHub also reports that 90% of Fortune 100 companies use Copilot. AI-generated code is not a future scenario: it is already close to half of what active Copilot users write, and a majority for Java developers.

Apiiro's 2025 analysis of tens of thousands of repositories across Fortune 50 enterprises found that AI-assisted developers produced 3-4 times more commits but generated 10 times more security findings. Secrets exposure — hardcoded API keys, passwords, and credentials in source code — increased by 40%. AI-assisted commits were merged into production 4 times faster, bypassing review cycles.

GitHub Octoverse and Copilot official data.

Apiiro, "4x Velocity, 10x Vulnerabilities," September 2025. Fortune 50 enterprise repo analysis. (Industry research)
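To make the secrets-exposure finding concrete, here is a minimal sketch of the kind of check that flags hardcoded credentials before a commit is merged. The names `SECRET_PATTERNS` and `find_hardcoded_secrets` are hypothetical, and the two patterns are illustrative assumptions rather than a complete rule set; dedicated secret scanners cover far more formats and use entropy checks as well.

```python
import re

# Illustrative patterns only (assumption): real scanners ship hundreds of rules.
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase
    # alphanumeric characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A credential-like name assigned a quoted literal of 8+ characters.
    "hardcoded_credential": re.compile(
        r"""\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*["'][^"']{8,}["']""",
        re.IGNORECASE,
    ),
}


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each line that looks like it embeds a secret."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


if __name__ == "__main__":
    sample = (
        "import os\n"
        'API_KEY = "sk_live_1234567890abcdef"\n'      # literal secret: flagged
        'password = os.environ["DB_PASSWORD"]\n'      # read from environment: not flagged
    )
    for line_number, name in find_hardcoded_secrets(sample):
        print(f"line {line_number}: possible {name}")
```

A check like this is cheap to run in a pre-commit hook or CI step, which matters when AI-assisted commits are being merged faster than reviewers can read them. It is a safety net, not a substitute for review.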

Security incidents linked to AI-generated code are now being documented

A 2026 survey of 450 CISOs, developers, and AppSec engineers by Aikido Security found that 1 in 5 organisations had experienced a major breach directly linked to AI-generated code. Separately, researchers at Georgia Tech's Systems Software and Security Lab have been tracking CVEs attributable to AI-generated code since May 2025. By March 2026 they had confirmed 74 CVEs — with the actual number estimated at 5-10 times higher due to metadata stripping in code repositories.

Aikido Security, "State of AI in Security and Development 2026." N=450. (Industry research)

Georgia Tech SSLab, "Vibe Security Radar," ongoing from May 2025.

The capability question for engineering leads

The risk is not that developers are using AI coding assistants. The risk is that they are using them without the judgment to know when to trust the output, when to review it carefully, and when not to use AI at all.

Catching a hallucinated library that does not exist. Identifying a SQL injection vulnerability in AI-generated query code. Knowing that authentication and session management are precisely the areas where AI output requires security specialist review before it ships.
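As an illustration of the SQL injection case, the sketch below contrasts a query built by string interpolation, a pattern a reviewer should reject, with the parameterised form. It uses Python's standard `sqlite3` module; the `users` table, the function names, and the `make_demo_db` helper are hypothetical and exist only for this example.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list[tuple]:
    # Vulnerable: user input is interpolated directly into the SQL text,
    # so a crafted value can change the structure of the query.
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list[tuple]:
    # Parameterised: the value is bound separately from the SQL text,
    # so it is always treated as data, never as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two users (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"
    print("unsafe:", find_user_unsafe(conn, payload))  # returns every row
    print("safe:  ", find_user_safe(conn, payload))    # returns no rows
```

Both functions look plausible at a glance and both pass a happy-path test with an ordinary username, which is why this class of flaw tends to survive a quick skim of AI-generated code.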

These are not instincts that come automatically with AI tool access. They are skills that can be measured, developed, and tracked.

Probe Learning's Developer vertical assesses exactly this — how developers perform when reviewing AI-generated code in realistic scenarios. Not a knowledge quiz. Real code. Real decisions. Instant results.

If your engineering team is shipping AI-generated code — and they almost certainly are — the question is whether they have the judgment to catch what the AI gets wrong. Book a demo at probelearning.com to see what that assessment looks like.