~/today's vibe
45% of AI-Generated Code Ships with OWASP Vulnerabilities

Author: 오늘의 바이브

They gave 100 AI models the same coding tasks. Nearly half submitted code riddled with security holes. When a developer asks "build me a login feature," AI confidently delivers code vulnerable to SQL injection. Because the word "security" never appeared in the prompt.


Veracode's Alarming Experiment

In their 2025 GenAI Code Security Report, Veracode tested over 100 large language models across 80 real-world coding tasks. The results were grim. 45% of cases produced code containing OWASP Top 10 vulnerabilities.

OWASP Top 10 is the definitive list of the most common and critical web application security vulnerabilities. It includes injection, authentication failures, sensitive data exposure, and more. Every security professional knows this list. It is the absolute baseline.

The problem is that AI does not know this baseline. More precisely, AI does not consider security unless explicitly asked. When a developer requests something "fast" or "simple," AI skips validation code. Speed and security are separate requirements to an AI.

Failure Rates by Programming Language

Not all languages are equally dangerous. Veracode's report broke down vulnerability rates by language. The results defied expectations.

Language      Vulnerability Rate
Java          Over 70%
Python        38-45%
C#            38-45%
JavaScript    38-45%

Java topped the list at over 70%. The irony is brutal: the language most used in enterprise environments is the one AI handles worst. Java's complex security model and framework dependencies seem to confuse AI models.

Python, C#, and JavaScript clustered in the 38-45% range. That looks better by comparison, but it still means roughly two in five outputs have a security hole. In production code, those odds are catastrophic.


XSS and Log Injection: 86% and 88% Failure Rates

Veracode also measured defense success rates against specific vulnerability types. The results were even bleaker.

In Cross-Site Scripting (XSS, CWE-80) tests, 86% of AI-generated code samples failed to defend against the attack. XSS lets attackers inject malicious scripts into web pages. It enables session hijacking, phishing, and malware distribution.
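The defense those failing samples omitted is ordinary output encoding. As a minimal sketch in Python (the function name and HTML wrapper are illustrative), the stdlib's html.escape neutralizes the payload before it reaches the page:

```python
import html

def render_comment(user_input: str) -> str:
    """Escape user-supplied text before embedding it in HTML.

    Without escaping, input like <script>...</script> executes
    in the victim's browser -- the classic XSS pattern (CWE-80).
    """
    return "<p>" + html.escape(user_input) + "</p>"

# A hostile payload is rendered inert:
payload = '<script>alert("xss")</script>'
print(render_comment(payload))
# <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

The vulnerable version AI tends to emit is the same function without the html.escape call.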

For Log Injection (CWE-117), 88% of samples exposed vulnerabilities. Log injection lets attackers manipulate log files to erase their tracks or insert false information. It neutralizes forensic analysis and poisons audit trails.
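The fix here is equally mechanical: strip or escape line breaks before user input touches a log line. A sketch, with a hypothetical sanitize_for_log helper (not from the report):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("auth")

def sanitize_for_log(value: str) -> str:
    """Neutralize CR/LF so attacker input cannot forge extra log lines
    (CWE-117). Newlines become visible escape sequences instead."""
    return value.replace("\r", "\\r").replace("\n", "\\n")

def record_login(username: str) -> None:
    # Always sanitize before interpolating user input into a log message.
    log.info("login attempt for user=%s", sanitize_for_log(username))

# Without sanitization, this input would inject a fake "admin login ok" line:
record_login("bob\nINFO admin login ok")
```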

Both of these are perennial OWASP Top 10 entries. They are the first things taught in security training. Yet AI fails to prevent them nearly nine times out of ten. To AI, security is "nonexistent unless requested."

The Dark Side of Vibe Coding

Veracode CTO Jens Wessling nailed the core issue: "The trend of 'vibe coding,' where developers rely on AI without explicitly defining security requirements, is dangerous."

Vibe Coding is the development approach coined by Andrej Karpathy. The developer conveys a rough intent, and AI handles the implementation details. Code gets written from a vibe. Productivity explodes.

But security does not work on vibes. Input validation, output encoding, and authorization checks require explicit requirements. "Build me a login feature" does not include SQL injection defense. "Accept user input and save it to the database" does not include XSS filtering.
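The gap is concrete. A minimal sketch using Python's stdlib sqlite3 (table and column names are illustrative): the vibe-coded version interpolates input into the SQL string, while the safe version binds it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_unsafe(name: str, password: str) -> bool:
    # What "build me a login feature" often yields: string-built SQL.
    # Input like ' OR '1'='1 makes this succeed with any password.
    query = f"SELECT 1 FROM users WHERE name = '{name}' AND password = '{password}'"
    return conn.execute(query).fetchone() is not None

def login_safe(name: str, password: str) -> bool:
    # Parameter binding: input stays data, never becomes executable SQL.
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None

print(login_unsafe("alice", "' OR '1'='1"))  # True -- authentication bypassed
print(login_safe("alice", "' OR '1'='1"))    # False -- injection rejected
```

The two functions are one line apart, which is exactly why the requirement has to be stated: nothing about "login feature" forces the second form.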

Vibe coding's core value of "rapid prototyping" fundamentally clashes with security's core principle of "never trust any input." When developers fail to recognize this tension, vulnerabilities stop being tech debt and become ticking time bombs.

AI Code Reviewers Miss Security Too

Some developers might counter: "Can't we just have another AI review the generated code?" Veracode's research crushes that hope too.

AI code review tools share the same limitations. Without an explicit request for security-focused review, AI reviewers focus on readability, performance, and style. You have to ask "does this code have security vulnerabilities?" before it even considers security.

The deeper problem is that developers do not know what to ask. For someone unfamiliar with XSS, CSRF, SSRF, and IDOR, "request a security review" is an abstract concept. If you do not know what specifically to check for, you cannot ask AI the right questions.

Ultimately, AI is a tool, not a security expert. Tools operate at the user's level of knowledge. When a developer without security knowledge uses AI, the output lacks security knowledge too.


Veracode's Five Recommended Countermeasures

Veracode's report laid out specific countermeasures for organizations.

First, integrate AI tools for real-time security remediation. Security verification must happen the moment code is generated. IDE plugins that scan AI-generated code in real time already exist. Tools like Snyk and Semgrep that work alongside GitHub Copilot are prime examples.

Second, catch flaws early through static analysis. SAST (Static Application Security Testing) tools must be integrated into CI/CD pipelines. Vulnerabilities need to be caught before code is pushed to the repository. Finding them after production deployment increases remediation costs tenfold or more.

Third, embed security into agentic workflows. When AI agents generate code, security guardrails should be applied automatically. Rules like "all user input must be validated" and "SQL queries must be parameterized" should be baked into system prompts.
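One hedged sketch of what "baked into system prompts" can mean in practice: a wrapper that prepends fixed security rules to every code-generation request. The rule text and function names here are illustrative, not taken from the report.

```python
SECURITY_GUARDRAILS = """\
Non-negotiable rules for all generated code:
1. Validate and constrain all user input (type, length, allowlist).
2. Use parameterized queries; never concatenate input into SQL.
3. HTML-encode all output rendered into web pages.
4. Strip CR/LF from any value written to logs.
"""

def build_messages(task: str) -> list[dict]:
    """Attach the guardrails as the system prompt for every task,
    so security applies even when the user's request never mentions it."""
    return [
        {"role": "system", "content": SECURITY_GUARDRAILS},
        {"role": "user", "content": task},
    ]

messages = build_messages("build me a login feature")
```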

Fourth, leverage Software Composition Analysis (SCA). Scan the libraries and packages that AI recommends for vulnerabilities. AI frequently suggests packages with known CVEs without hesitation. SCA tools automatically detect these dependency vulnerabilities.

Fifth, implement automated detection and blocking of malicious packages. Attacks exploiting AI "hallucinations" -- where models generate nonexistent package names -- are on the rise. Attackers publish malicious code under package names that AI commonly hallucinates. Automated verification before package installation is essential.
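A minimal sketch of that verification step, under stated assumptions: gate installs on a reviewed allowlist (here a plain set; a real pipeline would consult a lockfile or internal registry) so an unreviewed or hallucinated name fails closed. The package names below are illustrative.

```python
# Hypothetical pre-install gate: only names vetted by the team pass.
APPROVED_PACKAGES = {"requests", "flask", "sqlalchemy", "cryptography"}

def vet_install(package: str) -> bool:
    """Fail closed: an unreviewed (possibly hallucinated) package name
    is rejected before any install command runs."""
    return package.lower() in APPROVED_PACKAGES

for name in ["requests", "flask-auth-helperz"]:
    status = "ok" if vet_install(name) else "BLOCKED (not on allowlist)"
    print(f"{name}: {status}")
```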

3 Things Developers Should Do Right Now

Organizational responses matter, but there are things individual developers can do immediately.

First, specify security requirements in your prompts. Instead of "build me a login feature," ask for "a login feature that defends against SQL injection and XSS." Instead of "build a file upload feature," ask for "a file upload feature with file type validation and size limits."
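For the second prompt, the code you should expect back looks roughly like this sketch (the extension allowlist and the 5 MB cap are illustrative choices, not requirements from the report):

```python
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # illustrative 5 MB cap

def validate_upload(filename: str, size_bytes: int) -> None:
    """Reject uploads failing the type or size checks the prompt asked for."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"file type not allowed: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds size limit")

validate_upload("report.pdf", 100_000)   # passes silently
```

The vaguer prompt gets you the function with neither check; the specific prompt makes both checks part of the acceptance criteria.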

Second, study the OWASP Top 10. At minimum, know what these ten vulnerabilities are: injection, broken authentication, sensitive data exposure, XML external entities, broken access control, security misconfiguration, XSS, insecure deserialization, using components with known vulnerabilities, and insufficient logging and monitoring. (These are the 2017 category names; the 2021 edition regroups them, but the underlying risks are the same.) Without this knowledge, you cannot ask AI the right questions.

Third, do not blindly trust AI-generated code. AI is confidently wrong. Code that is syntactically perfect with helpful comments can be riddled with security holes. Authentication, authorization, and encryption-related code must always be manually reviewed.


45% Is the Average. The Worst Is Far Worse

Veracode's 45% figure is the average across over 100 models. The worst performers likely showed far higher vulnerability rates. The report did not publish model-by-model rankings, but some open-source models were almost certainly more dangerous than others.

These tests were also conducted under conditions where "security was not explicitly requested" -- identical to real-world vibe coding environments. Requesting security would improve success rates, but the whole point of vibe coding is "it should just figure things out without being told." That very premise conflicts with security.

A consensus is already forming in the developer community that "AI-generated code is for prototypes only." But in practice, prototypes ship to production far too often. Especially in startups and teams with aggressive release cycles.

In the AI Era, Security Is Still a Human Responsibility

The conclusion is clear. AI does not bear responsibility for security. The security of AI-generated code remains squarely on developers and organizations. No matter how advanced AI becomes, this principle will not change.

The lesson from 45% is simple. Use AI, but own security yourself. Include security requirements in your prompts. Validate generated code with SAST tools. Manually review authentication, authorization, and encryption code.

Even if AI boosts productivity tenfold, a single security incident erases all those gains. The average cost of a data breach exceeds $5 million. The time saved by vibe coding does not outweigh the time spent responding to a security incident.

AI coding tools are genuinely revolutionary. But every revolution comes with a price. That price must not be security.

