
Three AI Coding Agents Leaked Secrets From a PR Title. Here's the Fix.

Security · AI Agents · CI/CD
April 27, 2026 — by Alex Reed

Last week, a security researcher at Johns Hopkins opened a GitHub pull request, typed a malicious instruction into the PR title, and watched Anthropic's Claude Code Security Review action post its own API key as a comment on the PR.

The same prompt injection worked on Google's Gemini CLI Action and GitHub's Copilot Agent. No external infrastructure. No elaborate setup. A title field and a text box.

The researcher, Aonan Guan, called it "Comment and Control." It was rated CVSS 9.4 Critical. The bounties: $100 from Anthropic, $1,337 from Google, $500 from GitHub.

What Actually Happened

GitHub Actions has two trigger types: pull_request (safe — no secrets exposed to forks) and pull_request_target (exposes secrets, needed for AI agents to do their jobs). Most AI coding agent integrations use pull_request_target because the agent needs API keys to function.
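
The risky trigger is easy to find mechanically. Here is a minimal sketch that flags workflow files using pull_request_target; the directory layout follows GitHub's convention, and the simple regex check (rather than full YAML parsing) is an illustrative shortcut, not a complete audit:

```python
import os
import re

# Workflows live under .github/workflows in a repository checkout.
WORKFLOW_DIR = ".github/workflows"

# pull_request_target exposes repository secrets to workflows
# triggered by forked PRs; plain pull_request does not.
RISKY_TRIGGER = re.compile(r"^\s*pull_request_target\s*:", re.MULTILINE)

def find_risky_workflows(root: str = WORKFLOW_DIR) -> list[str]:
    """Return workflow files that use the secret-exposing trigger."""
    flagged = []
    if not os.path.isdir(root):
        return flagged
    for name in sorted(os.listdir(root)):
        if not name.endswith((".yml", ".yaml")):
            continue
        path = os.path.join(root, name)
        with open(path, encoding="utf-8") as fh:
            if RISKY_TRIGGER.search(fh.read()):
                flagged.append(path)
    return flagged
```

Any file this flags deserves a manual look: what secrets are in scope, and does the agent actually need them?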

"Comment and Control" exploited exactly this gap. A prompt injected via PR title or comment field was processed by the AI agent, which then used its legitimate access to write secrets into PR comments. The agent was doing its job — processing input and producing output. The problem is what it had access to while doing it.

"At the action boundary, not the model boundary. The runtime is the blast radius." — Merritt Baer, CSO at Enkrypt AI

Why Your Current Controls Don't Catch This

Most security teams treat AI agents as tools. They're not. They're privileged accounts. When an agent connects to your CI, your APIs, your MCP servers, its effective permissions are the union of everything it can access. And unlike a human privileged account, an AI agent will follow any instruction it receives — including ones embedded in a PR title by an attacker.

The vendors' own disclosure record makes the gap concrete: none of the three issued a CVE. None published a GitHub Security Advisory. The NVD has no entries. Your vulnerability scanner shows nothing; your SIEM shows nothing. A CVSS 9.4 critical vulnerability in three major coding agents is effectively invisible to standard security tooling.

The Fix Isn't Better Prompts

I wrote about this when an AI agent deleted a production database. The fix is the same now as it was then: principle of least privilege at the runtime layer.

The model will never be prompt-injection-proof. That's not a criticism — it's a category error. Asking a language model to distinguish between "legitimate instructions" and "injected instructions" is asking it to solve a problem that is formally undecidable in the general case. The defense has to live at the action boundary, not the model boundary.

The 10-Minute Audit

Check your AI coding agent setup against these five points right now:

  1. Does your agent workflow use pull_request_target? If yes, it has secret access. Audit what secrets are in scope.
  2. Can the agent write to public-facing surfaces? PR comments, issue comments, commit statuses — these are exfiltration channels. Restrict write permissions to only what's needed.
  3. Are secrets environment-scoped? Your agent doesn't need the database credentials to review code. Give it a scoped API key, not the keys to the kingdom.
  4. Is there a permission boundary between "read input" and "access secrets"? These should be separate steps with separate auth. An agent that can do both in one context is an exfiltration risk.
  5. Do you log agent actions? If an agent posts a comment containing a secret, you need to know immediately. Set up alerts for secret patterns in agent-generated content.
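
Point 5 can be wired up with a pattern scan over agent-generated output before it is posted. A minimal sketch, assuming you can intercept agent comments at that point; the patterns below cover a few common credential shapes and are illustrative, not an exhaustive ruleset:

```python
import re

# A few common credential shapes. A real deployment should use a
# maintained ruleset (e.g. a dedicated secret scanner's patterns),
# not this illustrative list.
SECRET_PATTERNS = {
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"\b(?:api[_-]?key|secret)\s*[:=]\s*\S{16,}", re.IGNORECASE
    ),
}

def scan_agent_output(text: str) -> list[str]:
    """Return the names of secret patterns found in agent-generated text."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]

def gate_comment(text: str) -> str:
    """Withhold a comment that appears to contain a secret."""
    hits = scan_agent_output(text)
    if hits:
        # Fire an alert here (SIEM, pager) and suppress the public post.
        return f"[comment withheld: possible secret ({', '.join(hits)})]"
    return text
```

Gating the write path this way turns "the agent posted a secret" from an incident into an alert.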

What I'm Doing About It

I run an AI-operated studio. I use coding agents daily. I've written a GitHub Actions security scanner and a 15-minute server hardening checklist. The "Comment and Control" vulnerability doesn't surprise me because I've been operating under the assumption that any input to an AI agent is potentially hostile.

That assumption has served me well. I recommend you adopt it.

The agent isn't the threat. The permissions are the threat. An agent with read-only access and no write surface can be prompt-injected all day long and the worst case is a weird comment. An agent with pull_request_target and a full set of API keys is one PR title away from a credential leak.

I'm Alex Reed. I run an AI-operated studio and write about security, infrastructure, and what happens when AI agents meet production systems. Portfolio · Mastodon