If you strip away the hype, a standard Large Language Model (LLM) is essentially a brain in a jar. It can think, calculate, and write poetry, but it cannot touch anything. It cannot send an email, query a live database, or reboot a server.
Agentic AI breaks the glass. It gives the brain hands.


These “hands” are Tools—APIs, scripts, and connectors that allow the AI to interact with the digital world. While this connectivity is what makes agents useful, it is also what makes them dangerous. When you give an AI the ability to execute code or call APIs, you are effectively giving a non-deterministic, hallucination-prone entity a shell on your infrastructure.


In this installment of our deep dive into the OWASP Top 10 for Agentic Applications 2026, we are looking at the execution layer: ASI02: Tool Misuse and Exploitation and ASI05: Unexpected Code Execution (RCE).

ASI02: When Legitimate Tools Do Dirty Work

The vulnerability known as Tool Misuse (ASI02) is tricky because, often, nothing is technically “broken.” The API works. The credentials are valid. The agent is simply using a legitimate tool in a way that causes massive damage.

This is the agentic evolution of “Excessive Agency.” It arises when an agent operates within its authorized privileges but applies a tool unsafely due to prompt injection, ambiguity, or poor logic.

The “Over-Privileged” Hammer

Consider a Customer Service Agent designed to help users track orders. To do this, it needs access to your internal API.

  • The Mistake: You give the agent the admin API key because it’s easier than creating a custom role.
  • The Exploit: An attacker asks, “Please refund my last 50 orders.” The agent, wanting to be helpful and having the tool to issue refunds, executes the command.
  • The Fix: This is a failure of Least Agency. The agent should only have had a read_only scope for order history.
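
A minimal sketch of that least-agency scoping in Python, assuming a simple hand-rolled tool registry (the Tool class, function names, and invoke() dispatcher are illustrative, not any particular framework’s API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    scope: str      # e.g. "read_only" or "admin"
    fn: Callable

def get_order_history(customer_id: str) -> list:
    """Hypothetical read-only lookup against the orders API."""
    ...

# The customer-service agent is registered with ONLY the read-only tool.
# A refund tool may exist elsewhere, but it never enters this toolbox.
AGENT_TOOLS = [Tool("get_order_history", "read_only", get_order_history)]

def invoke(tool_name: str, **kwargs):
    """Dispatch a tool call, refusing anything outside the agent's toolbox."""
    for tool in AGENT_TOOLS:
        if tool.name == tool_name:
            return tool.fn(**kwargs)
    raise PermissionError(f"Tool {tool_name!r} is not granted to this agent")

# invoke("issue_refund", order_id="42") -> PermissionError, no matter how
# persuasively the attacker prompts the LLM.
```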

The Typosquatting Trap

One of the more sophisticated attacks highlighted in the 2026 report is Tool Name Impersonation.

Agents often select tools based on semantic similarity or simple string matching. An attacker might register a malicious tool or function named report in a shared environment. If the legitimate financial tool is named report_finance, the agent—operating on loose logic—might resolve the request to the attacker’s report tool instead. This misrouting allows the attacker to capture sensitive input data intended for the finance system.
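
One mitigation is to resolve tool names only through a pinned, exact-match registry of fully qualified identifiers, never through fuzzy or semantic matching. A sketch (names and endpoints are illustrative):

```python
# Pinned registry: fully qualified names mapped to trusted endpoints.
TRUSTED_TOOLS = {
    "finance.report_finance": "https://internal.example.com/finance/report",
}

def resolve_tool(requested: str) -> str:
    # Exact match only: a bare "report" does not fuzzily resolve to
    # "finance.report_finance", and an attacker-registered "report"
    # simply does not exist in the pinned registry.
    if requested not in TRUSTED_TOOLS:
        raise LookupError(f"Unknown or untrusted tool: {requested!r}")
    return TRUSTED_TOOLS[requested]
```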

Loop Amplification and DoS

Agents are persistent. If a tool fails, they try again. And again. And again.
Without safeguards, a confused agent can enter a Loop Amplification state. Imagine an agent trying to “summarize database logs.” It reads the logs and writes a summary; the summary creates a new log entry, which triggers a new read. This infinite feedback loop can spike your API bills (the “Wallet Denial of Service”) or crash your database.
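
The bluntest safeguard is a hard iteration ceiling. A minimal sketch, assuming a hypothetical run_step() that performs one agent iteration and returns True when the task is finished:

```python
MAX_STEPS = 10  # hard ceiling on iterations per task

def run_step(task: str) -> bool:
    """Hypothetical: execute one reasoning/tool step; True when done."""
    ...

def run_agent(task: str) -> None:
    for _ in range(MAX_STEPS):
        if run_step(task):
            return
    # The agent never converged: fail loudly instead of looping forever
    # against your API (and your wallet).
    raise RuntimeError(f"Agent exceeded {MAX_STEPS} steps; aborting task")
```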

ASI05: The Rise of “Vibe Coding” and RCE

If ASI02 is about misusing existing tools, ASI05: Unexpected Code Execution is about the agent building its own tools—often with disastrous results.

We are entering the era of “Vibe Coding,” where agents write and execute scripts on the fly to solve problems. Instead of relying on a pre-built calculator tool, an agent might write a Python script to calculate a mortgage rate and run it immediately.

The Danger of eval()

The most terrifying function in programming is eval(), which executes a string of text as code. In agentic systems, eval() and its relatives (exec(), os.system()) are often the engine behind the agent’s code interpreter and dynamic reasoning steps.

If an attacker can inject malicious text into the prompt, and the agent passes that text into an eval() function or a code interpreter, the attacker gains Remote Code Execution (RCE).

  • Scenario: An agent is tasked with organizing files. An attacker names a file “test.txt && rm -rf /”.
  • Result: When the agent builds a shell command to move that file, the shell parses the “&&” and executes the delete command, wiping the filesystem.
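
The fix is to keep attacker-controlled strings away from shell parsing entirely. A sketch in Python (the archive path is illustrative):

```python
import shlex
import subprocess

filename = "test.txt && rm -rf /"   # attacker-controlled file name

# UNSAFE: interpolating the name into a shell string lets the shell
# parse "&&" and run the destructive second command:
#   subprocess.run(f"mv {filename} /archive/", shell=True)

# SAFER: pass argv as a list with the default shell=False; the whole
# string is treated as one literal file name, never as shell syntax.
subprocess.run(["mv", filename, "/archive/"])

# If a shell is truly unavoidable, quote every untrusted token first.
quoted = shlex.quote(filename)      # -> "'test.txt && rm -rf /'"
```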

Escaping the Sandbox

Many developers assume that running agent-generated code in a Docker container is safe. However, ASI05 warns that this is insufficient without strict controls.

Agents often require legitimate access to the network (to fetch data) or the filesystem (to save reports). Attackers can abuse these “allowlists.” For example, an agent trying to patch a server might be tricked into downloading a malicious package from a public repository because it “hallucinated” that the package was a necessary dependency. This escalates a simple task into a full supply-chain compromise.
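
One countermeasure is to force all agent egress through an explicit host allowlist. A sketch, assuming network access is funneled through a single fetch() helper (the hosts are illustrative):

```python
import urllib.request
from urllib.parse import urlparse

ALLOWED_HOSTS = {"internal.example.com", "files.example.com"}

def fetch(url: str) -> bytes:
    host = urlparse(url).hostname
    if host not in ALLOWED_HOSTS:
        # A hallucinated "dependency" on a public package index or a
        # random CDN is refused here, before it can become a
        # supply-chain compromise.
        raise PermissionError(f"Egress to {host!r} is not allowlisted")
    return urllib.request.urlopen(url).read()
```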

The Defense: Semantic Firewalls and Intent Gates

So, how do we arm our agents without arming our enemies? The OWASP 2026 guidelines propose a shift from static permissions to dynamic Policy Enforcement Middleware.

1. The “Intent Gate”

You cannot trust the agent’s output. Before any tool is actually invoked, the request must pass through an Intent Gate.


This is a middleware layer that acts as a Policy Enforcement Point (PEP). It validates the agent’s proposed action against the user’s original request.

  • User Request: “Check my balance.”
  • Agent Action: “Transfer $500.”
  • Intent Gate: BLOCK. The action does not semantically match the request.
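
A minimal sketch of such a gate; classify_intent() stands in for whatever classifier you choose (rules, embeddings, or a separate guard model), and the tool names are illustrative:

```python
READ_TOOLS = {"get_balance", "get_order_history"}
WRITE_TOOLS = {"transfer_funds", "issue_refund"}

def classify_intent(user_request: str) -> str:
    """Hypothetical: map the user's original request to 'read' or 'write'."""
    ...

def intent_gate(user_request: str, tool_name: str) -> None:
    intent = classify_intent(user_request)
    if tool_name in WRITE_TOOLS and intent != "write":
        # "Check my balance" must never authorize transfer_funds.
        raise PermissionError(
            f"{tool_name!r} does not match user intent {intent!r}")
```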

2. Semantic Firewalls

Traditional firewalls block IP addresses. Semantic Firewalls block concepts.
They validate the semantics of the tool call. If an agent tries to call a tool with ambiguous arguments or attempts to chain a “Public Search” tool directly into an “Internal Database” update, the semantic firewall detects the logical violation and drops the connection.
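
One way to express such a rule is taint tracking across the tool chain: data that entered through an untrusted source may never flow into a privileged sink. A sketch with illustrative tool names:

```python
UNTRUSTED_SOURCES = {"public_search", "fetch_url"}
PRIVILEGED_SINKS = {"internal_db_update", "send_email"}

def check_chain(call_sequence: list[str]) -> None:
    tainted = False
    for tool in call_sequence:
        if tool in UNTRUSTED_SOURCES:
            tainted = True                 # untrusted data is now in play
        elif tool in PRIVILEGED_SINKS and tainted:
            raise PermissionError(
                f"Blocked: untrusted data would flow into {tool!r}")

# check_chain(["public_search", "internal_db_update"])  # raises PermissionError
```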

3. Adaptive Tool Budgeting

To prevent the “Loop Amplification” scenario, implement Adaptive Tool Budgeting.


Assign agents a “budget” for API calls, tokens, or compute costs per session. If an agent tries to call the Salesforce API 50 times in one minute, the system should throttle or revoke its access automatically, on the assumption that it is either broken or compromised.
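
A sketch of a per-session budget with a sliding one-minute window (the limits are illustrative):

```python
import time

class ToolBudget:
    """Throttle a session's tool calls; revoke access when exhausted."""

    def __init__(self, max_calls: int = 50, window_s: float = 60.0):
        self.max_calls = max_calls
        self.window_s = window_s
        self._calls: list[float] = []

    def charge(self) -> None:
        now = time.monotonic()
        # Keep only the calls inside the sliding window.
        self._calls = [t for t in self._calls if now - t < self.window_s]
        if len(self._calls) >= self.max_calls:
            # Fifty calls in a minute means broken or compromised:
            # revoke rather than retry.
            raise RuntimeError("Tool budget exhausted; access revoked")
        self._calls.append(now)
```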

4. Ban eval in Production

This is a hard rule from the OWASP guide: Ban eval in production agents.
If your agent needs to run code, use secure, isolated interpreters (like WASM sandboxes or specialized runtimes like mcp-run-python) that have no access to the host file system or internal network. Never run agent-generated code as root.
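
And when the agent only needs to parse data rather than execute logic, a narrow substitute for eval() in Python is ast.literal_eval, which accepts plain literals and rejects anything containing calls or attribute access:

```python
import ast

payload = "__import__('os').system('rm -rf /')"   # injected "data"

try:
    ast.literal_eval(payload)
except ValueError:
    print("Rejected: not a plain literal")        # the injection dies here

print(ast.literal_eval("{'rate': 6.5, 'term': 30}"))  # real data parses fine
```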

Conclusion

The transition from “Chat” to “Agent” is the transition from “Read-Only” to “Read-Write.”


ASI02 and ASI05 represent the risks of that write access. By implementing strict Action-Level Authentication, sandboxing, and intent verification, you can ensure your agent’s hands remain helpful, not harmful.