How to Stop Your AI Agent from Deploying Vulnerable Software

AI agents are writing Dockerfiles, suggesting package versions, and calling deployment APIs. What they are not doing, by default, is checking whether the software they're about to deploy has a known remote code execution vulnerability with an active exploit in the wild.

This post shows you how to fix that with a small, testable integration.

The problem

Most vulnerability checking is designed for humans. NVD returns CVSS scores, prose advisories, and CPE strings. CISA KEV returns a catalog of CVE IDs with vendor and product names. Neither is structured for a system that needs to make a binary deploy/block decision in under 100ms.

An AI agent calling NVD directly has two problems. First, the API requires interpreting structured data across multiple response fields to derive a risk conclusion, which is work the agent shouldn't be doing mid-task. Second, and worse, the agent might not call it at all. If your agent's system prompt doesn't explicitly enforce a security check, the LLM is free to skip it.

The right architecture is a pre-deployment tool call that returns a deterministic verdict the agent cannot misinterpret.

What a structured safety signal looks like

Query log4j 2.14.1, a version affected by Log4Shell, against the Attestd API:

bash

curl "https://api.attestd.io/v1/check?product=log4j&version=2.14.1" \
  -H "Authorization: Bearer $ATTESTD_KEY"

Response:

json

{
  "product": "log4j",
  "version": "2.14.1",
  "supported": true,
  "risk_state": "critical",
  "risk_factors": [
    "active_exploitation",
    "remote_code_execution",
    "no_authentication_required",
    "internet_exposed_service",
    "patch_available"
  ],
  "actively_exploited": true,
  "remote_exploitable": true,
  "authentication_required": false,
  "patch_available": true,
  "fixed_version": "2.17.1",
  "confidence": 0.94,
  "cve_ids": ["CVE-2021-44228", "CVE-2021-45046", "CVE-2021-45105"],
  "last_updated": "2026-02-23T18:21:30Z"
}

Now query a safe version — nginx 1.27.4:

json

{
  "product": "nginx",
  "version": "1.27.4",
  "supported": true,
  "risk_state": "none",
  "risk_factors": [],
  "actively_exploited": false,
  "remote_exploitable": false,
  "authentication_required": false,
  "patch_available": false,
  "fixed_version": null,
  "confidence": 0.9,
  "cve_ids": []
}

The risk_state field is the decision surface: critical, high, elevated, low, or none. Your agent reads one field and routes accordingly.
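
As a minimal sketch of that routing, assuming only the response shape shown above: the helper name `route_on_risk_state` and the BLOCK/WARN/ALLOW labels are illustrative and not part of the Attestd API. Anything that fails to parse, is marked unsupported, or carries a value outside the five listed states is treated as a block.

```python
import json

# Mapping from the five documented risk_state values to an illustrative action.
# The action labels are an assumption made for this sketch.
ACTIONS = {
    "critical": "BLOCK",
    "high": "BLOCK",
    "elevated": "WARN",
    "low": "ALLOW",
    "none": "ALLOW",
}

def route_on_risk_state(body: str) -> str:
    """Return an action for a raw JSON response body; fail closed on anything unexpected."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "BLOCK"
    if not isinstance(payload, dict) or payload.get("supported") is not True:
        return "BLOCK"
    state = payload.get("risk_state")
    if not isinstance(state, str):
        return "BLOCK"
    return ACTIONS.get(state, "BLOCK")

print(route_on_risk_state('{"supported": true, "risk_state": "critical"}'))  # BLOCK
print(route_on_risk_state('{"supported": true, "risk_state": "none"}'))      # ALLOW
```

Keeping the mapping in a plain dict means the policy is reviewable in one place, and an unrecognized state can never fall through to an allow.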

Install the SDK

bash

pip install attestd

Get an API key at api.attestd.io/portal/login. The free tier includes 1,000 calls per month.

A standalone deployment guard

Start here — no framework required. This is the core logic everything else builds on.

python

import os
import attestd
from attestd import AttestdUnsupportedProductError

BLOCK_STATES = {"critical", "high"}
WARN_STATES  = {"elevated"}

def check_deployment_safety(product: str, version: str) -> dict:
    """
    Returns a dict with keys:
      - allowed (bool)         whether the deployment should proceed
      - risk_state (str)       attestd classification
      - reason (str)           human-readable explanation
      - fixed_version (str)    recommended upgrade target, or None
      - cve_ids (list)         CVEs contributing to the classification
    """
    client = attestd.Client(api_key=os.environ["ATTESTD_KEY"])

    try:
        result = client.check(product, version)
    except AttestdUnsupportedProductError:
        # Product not in coverage — treat as unknown, not safe
        return {
            "allowed": False,
            "risk_state": "unknown",
            "reason": (
                f"{product} is not in Attestd's coverage list. "
                "Block by default and review manually."
            ),
            "fixed_version": None,
            "cve_ids": [],
        }

    allowed = result.risk_state not in BLOCK_STATES

    if result.risk_state in BLOCK_STATES:
        reason = (
            f"{product} {version} is {result.risk_state} risk. "
            f"Actively exploited: {result.actively_exploited}. "
        )
        if result.fixed_version:
            reason += f"Upgrade to {result.fixed_version}."
    elif result.risk_state in WARN_STATES:
        reason = (
            f"{product} {version} has elevated risk. "
            "Review risk_factors before deploying to a public endpoint."
        )
    else:
        reason = f"{product} {version} has no significant known vulnerabilities."

    return {
        "allowed": allowed,
        "risk_state": result.risk_state,
        "reason": reason,
        "fixed_version": result.fixed_version,
        "cve_ids": result.cve_ids,
    }

Test it:

python

# Should block — Log4Shell
check = check_deployment_safety("log4j", "2.14.1")
assert check["allowed"] is False
assert check["risk_state"] == "critical"
print(check["reason"])
# log4j 2.14.1 is critical risk. Actively exploited: True. Upgrade to 2.17.1.

# Should allow — current nginx
check = check_deployment_safety("nginx", "1.27.4")
assert check["allowed"] is True
assert check["risk_state"] == "none"

LangChain tool integration

The same function becomes a tool the agent can call before any deployment step. The docstring matters here — the LLM uses it to decide when to call the tool.

python

import os
import attestd
from attestd import AttestdUnsupportedProductError
from langchain_core.tools import tool

@tool
def check_deployment_safety(product_and_version: str) -> str:
    """
    Check whether a specific software version is safe to deploy.

    Call this tool before deploying any software component, installing a
    package into a production environment, or recommending a version to use.

    Input must be formatted as "product:version". For example:
      - "nginx:1.20.0"
      - "postgresql:14.0"
      - "log4j:2.14.1"

    Supported products: nginx, postgresql, redis, openssh, log4j,
    apache-httpd, openssl, curl.

    Returns a deployment verdict with risk level, CVE references, and an
    upgrade recommendation if one is available.
    """
    try:
        parts = product_and_version.strip().split(":", 1)
        if len(parts) != 2:
            return (
                "Invalid input format. Use 'product:version', "
                "for example 'nginx:1.20.0'."
            )
        product, version = parts[0].strip(), parts[1].strip()

        client = attestd.Client(api_key=os.environ["ATTESTD_KEY"])
        result = client.check(product, version)

    except AttestdUnsupportedProductError:
        return (
            f"BLOCK: {product} is not in Attestd's coverage list. "
            "Do not deploy without manual security review."
        )
    except Exception as exc:
        # Safety-first: a check failure is not a green light
        return f"BLOCK: Deployment safety check failed ({exc}). Do not proceed."

    verdict = "ALLOW" if result.risk_state in ("low", "none") else (
        "WARN"  if result.risk_state == "elevated" else "BLOCK"
    )

    lines = [
        f"{verdict}: {product} {version}",
        f"Risk level: {result.risk_state}",
        f"Actively exploited: {result.actively_exploited}",
    ]

    if result.risk_factors:
        lines.append(f"Risk factors: {', '.join(result.risk_factors)}")

    if result.cve_ids:
        shown = result.cve_ids[:4]
        tail  = f" (+{len(result.cve_ids) - 4} more)" if len(result.cve_ids) > 4 else ""
        lines.append(f"CVEs: {', '.join(shown)}{tail}")

    if result.fixed_version and verdict == "BLOCK":
        lines.append(f"Recommended upgrade: {result.fixed_version}")

    return "\n".join(lines)

Wire it into an agent:

python

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain import hub

llm      = ChatOpenAI(model="gpt-4o", temperature=0)
tools    = [check_deployment_safety]
prompt   = hub.pull("hwchase17/react")
agent    = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

response = executor.invoke({
    "input": (
        "I need to deploy a web server. The current image uses nginx 1.20.0. "
        "Is it safe to deploy to a public-facing endpoint?"
    )
})

print(response["output"])

The agent will call check_deployment_safety("nginx:1.20.0") and receive:

text

BLOCK: nginx 1.20.0
Risk level: high
Actively exploited: False
Risk factors: remote_code_execution, no_authentication_required, internet_exposed_service, patch_available
CVEs: CVE-2021-23017, CVE-2022-41741, CVE-2022-41742
Recommended upgrade: 1.20.1

It will then typically tell the user not to proceed and recommend an upgrade path, without extra prompting from you.
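
The walkthrough above still relies on the model choosing to call the tool. One way to make the check unconditional is to run it inside the deployment step itself, so no deploy can happen without a verdict. The following is a sketch under stated assumptions: `guarded_deploy`, `DeploymentBlocked`, and the two stand-in functions are hypothetical names, and the checker is any callable that returns a dict with `allowed` and `reason` keys, like the standalone guard earlier in this post.

```python
from typing import Callable

# Hypothetical types for this sketch: a checker that returns a dict with
# "allowed" and "reason" keys, and a deploy function from your own tooling.
SafetyCheck = Callable[[str, str], dict]
DeployFn = Callable[[str, str], str]

class DeploymentBlocked(Exception):
    """Raised when the safety check does not allow the deployment."""

def guarded_deploy(product: str, version: str,
                   check: SafetyCheck, deploy: DeployFn) -> str:
    """Run the safety check unconditionally, then deploy only if allowed."""
    try:
        verdict = check(product, version)
    except Exception as exc:
        # A failed check is not a green light.
        raise DeploymentBlocked(f"safety check failed: {exc}") from exc
    if verdict.get("allowed") is not True:
        raise DeploymentBlocked(verdict.get("reason", "blocked by policy"))
    return deploy(product, version)

# Usage with stand-in functions (no network calls).
def fake_check(product: str, version: str) -> dict:
    blocked = (product, version) == ("log4j", "2.14.1")
    return {"allowed": not blocked, "reason": "critical risk" if blocked else "ok"}

def fake_deploy(product: str, version: str) -> str:
    return f"deployed {product} {version}"

print(guarded_deploy("nginx", "1.27.4", fake_check, fake_deploy))  # deployed nginx 1.27.4
```

Exposing only `guarded_deploy` to the agent, rather than the raw deploy function, keeps the enforcement in code instead of in the prompt.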

Async pattern for production agents

If your agent framework runs async (LangGraph, AutoGen, most production setups), use AsyncClient:

python

import asyncio
import os
import attestd
from attestd import AttestdUnsupportedProductError

async def gate_deployment(product: str, version: str) -> bool:
    """Returns True if deployment is allowed, False if it should be blocked."""
    async with attestd.AsyncClient(api_key=os.environ["ATTESTD_KEY"]) as client:
        try:
            result = await client.check(product, version)
            return result.risk_state not in ("critical", "high")
        except AttestdUnsupportedProductError:
            return False  # unknown is not safe

async def deploy_stack(components: list[tuple[str, str]]) -> None:
    """
    components: list of (product, version) tuples
    e.g. [("nginx", "1.27.4"), ("postgresql", "15.2"), ("redis", "7.2.0")]
    """
    checks = await asyncio.gather(
        *[gate_deployment(p, v) for p, v in components],
        return_exceptions=False,
    )

    blocked = [
        f"{p}:{v}" for (p, v), allowed in zip(components, checks)
        if not allowed
    ]

    if blocked:
        raise RuntimeError(
            f"Deployment blocked. Unsafe components: {', '.join(blocked)}"
        )

    # ... proceed with actual deployment

asyncio.run(deploy_stack([
    ("nginx",      "1.27.4"),
    ("postgresql", "15.2"),
    ("redis",      "7.2.0"),
]))

Because asyncio.gather runs the checks concurrently, checking ten components takes roughly as long as the slowest single check rather than the sum of all ten.
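
The gate above only catches the unsupported-product error, so a timeout or transport error in one check would propagate out of asyncio.gather and abort the whole run. If you would rather record such a component as blocked and keep going, a fail-closed wrapper with a per-check timeout and a concurrency cap is one option. This is a sketch, not SDK behavior: `gate_all` and `check_one` are hypothetical names, `check_one` stands in for whatever coroutine performs the real check, and the timeout and limit values are arbitrary.

```python
import asyncio
from typing import Awaitable, Callable

CheckFn = Callable[[str, str], Awaitable[bool]]  # hypothetical: True means allowed

async def gate_all(components: list[tuple[str, str]], check_one: CheckFn,
                   timeout: float = 2.0, max_concurrent: int = 5) -> list[str]:
    """Return the 'product:version' strings that must be blocked."""
    limit = asyncio.Semaphore(max_concurrent)

    async def guarded(product: str, version: str) -> bool:
        async with limit:
            try:
                return await asyncio.wait_for(check_one(product, version), timeout)
            except Exception:
                # Timeouts and transport errors count as "not allowed".
                return False

    results = await asyncio.gather(*(guarded(p, v) for p, v in components))
    return [f"{p}:{v}" for (p, v), ok in zip(components, results) if not ok]

# Stand-in checker for the demo: one safe, one unsafe, one that hangs.
async def fake_check(product: str, version: str) -> bool:
    if product == "slow":
        await asyncio.sleep(10)
    return product != "log4j"

blocked = asyncio.run(gate_all(
    [("nginx", "1.27.4"), ("log4j", "2.14.1"), ("slow", "1.0")],
    fake_check, timeout=0.05,
))
print(blocked)  # ['log4j:2.14.1', 'slow:1.0']
```

The semaphore keeps a large component list from opening an unbounded number of connections at once, which also helps stay within a monthly call budget.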

Testing without hitting the live API

The attestd.testing module provides mock transports and ready-made response fixtures. Inject a transport at client construction — no monkeypatching, no HTTP interception library required.

python

import pytest
import attestd
from attestd import AttestdUnsupportedProductError
from attestd.testing import (
    MockTransport,
    MockAsyncTransport,
    NGINX_SAFE,
    NGINX_VULNERABLE,
    LOG4J_CRITICAL,
    UNSUPPORTED,
)

# ── sync tests ────────────────────────────────────────────────────────────────

def test_critical_risk_blocks_deployment():
    transport = MockTransport(200, LOG4J_CRITICAL)
    client    = attestd.Client(api_key="test", transport=transport)

    result = client.check("log4j", "2.14.1")
    check  = build_verdict(result)          # your application logic

    assert check["allowed"] is False
    assert check["risk_state"] == "critical"


def test_safe_version_allows_deployment():
    transport = MockTransport(200, NGINX_SAFE)
    client    = attestd.Client(api_key="test", transport=transport)

    result = client.check("nginx", "1.27.4")
    check  = build_verdict(result)

    assert check["allowed"] is True


def test_unsupported_product_blocks_by_default():
    transport = MockTransport(200, UNSUPPORTED)
    client    = attestd.Client(api_key="test", transport=transport)

    with pytest.raises(AttestdUnsupportedProductError):
        client.check("unknownproduct", "1.0.0")


# ── async tests ───────────────────────────────────────────────────────────────

@pytest.mark.asyncio
async def test_async_gate_blocks_vulnerable():
    transport = MockAsyncTransport(200, NGINX_VULNERABLE)
    client    = attestd.AsyncClient(api_key="test", transport=transport)

    result  = await client.check("nginx", "1.20.0")
    allowed = result.risk_state not in ("critical", "high")

    assert allowed is False
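
The sync tests call `build_verdict`, which stands for your own application logic and is not defined in this post. One possible shape, mirroring the standalone guard from earlier, is a pure function over the result object. In this sketch `FakeResult` is a stand-in for the SDK's result type and exists only so the example runs without the SDK; the attribute names are the ones used in the examples above.

```python
from dataclasses import dataclass, field
from typing import Optional

BLOCK_STATES = {"critical", "high"}

def build_verdict(result) -> dict:
    """Turn a check result into a deploy decision. Pure: no network, no client."""
    return {
        "allowed": result.risk_state not in BLOCK_STATES,
        "risk_state": result.risk_state,
        "fixed_version": result.fixed_version,
        "cve_ids": list(result.cve_ids),
    }

# Stand-in for the SDK's result object, used only to exercise the function here.
@dataclass
class FakeResult:
    risk_state: str
    fixed_version: Optional[str] = None
    cve_ids: list = field(default_factory=list)

print(build_verdict(FakeResult("critical", "2.17.1", ["CVE-2021-44228"]))["allowed"])  # False
print(build_verdict(FakeResult("none"))["allowed"])                                    # True
```

Separating the decision from the client call is what lets the tests above inject a mock transport and assert on the verdict alone.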

NGINX_SAFE, NGINX_VULNERABLE, LOG4J_CRITICAL, and UNSUPPORTED are plain dicts. Merge them with | (Python 3.9+) to test specific field combinations without building a full response from scratch:

python

# Test the "actively exploited with no fix available" branch
body      = NGINX_VULNERABLE | {"actively_exploited": True, "fixed_version": None}
transport = MockTransport(200, body)
client    = attestd.Client(api_key="test", transport=transport)

If you need to test retry logic, SequentialMockTransport returns responses from a list in order:

python

from attestd.testing import SequentialMockTransport

def test_retries_on_503():
    transport = SequentialMockTransport([
        (503, {}),           # first attempt — server error
        (200, NGINX_SAFE),   # second attempt — success
    ])
    client = attestd.Client(api_key="test", transport=transport, max_retries=1)
    result = client.check("nginx", "1.27.4")
    assert result.risk_state == "none"
    assert transport.call_count == 2

Handling policy decisions at the edges

A few edge cases that will come up in production:

supported: false is not the same as safe. When a product isn't in Attestd's coverage list, the SDK raises AttestdUnsupportedProductError. This means Attestd has no data — not that the software has no vulnerabilities. The correct default is to block and require manual review. A vulnerability check that silently passes unknown components is worse than no check at all.

What to do with elevated and low. A strict policy blocks on critical and high. A more permissive policy allows elevated with a warning logged to your audit trail. low is generally safe to deploy but worth recording. Whatever you decide, make the threshold explicit in code — not in a prompt.
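
One way to make the threshold explicit is a small policy object with a severity ordering. This is a sketch, assuming the five risk_state values listed earlier are ordered from none (lowest) to critical (highest); the `Policy` class, its field names, and the logger name are hypothetical.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("deploy_policy")  # hypothetical logger name

# Assumed severity order of the five risk_state values, lowest first.
SEVERITY = ["none", "low", "elevated", "high", "critical"]

@dataclass(frozen=True)
class Policy:
    block_at: str = "high"  # block this state and anything more severe
    audit_at: str = "low"   # log this state and anything more severe

    def decide(self, risk_state: str) -> bool:
        """Return True if deployment may proceed under this policy."""
        if risk_state not in SEVERITY:
            return False  # unrecognized state: fail closed
        rank = SEVERITY.index(risk_state)
        if rank >= SEVERITY.index(self.block_at):
            return False
        if rank >= SEVERITY.index(self.audit_at):
            log.warning("deploying with risk_state=%s", risk_state)
        return True

default = Policy()                      # blocks critical and high, logs elevated and low
stricter = Policy(block_at="elevated")  # also blocks elevated

print(stricter.decide("elevated"), default.decide("elevated"))  # False True
```

Because the thresholds are constructor arguments, different environments (staging versus production, say) can share one code path with different policies.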

The confidence field. Values below 0.6 indicate the synthesis pipeline fell back to CVSS-derived fields rather than LLM-extracted facts. The risk classification is still correct (worst-case is used), but the supporting detail is less precise. For automated blocking decisions, treat confidence as informational — don't relax your block threshold based on it.
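
A short sketch of treating confidence as informational: the helper below attaches a flag for your audit trail and leaves the decision untouched. The function name and the `low_confidence` key are illustrative; the 0.6 threshold is the one described above.

```python
LOW_CONFIDENCE = 0.6  # threshold described in the paragraph above

def annotate_confidence(verdict: dict, confidence: float) -> dict:
    """Return a copy of the verdict with a low_confidence flag; 'allowed' is never changed."""
    return {
        **verdict,
        "confidence": confidence,
        "low_confidence": confidence < LOW_CONFIDENCE,
    }

verdict = {"allowed": False, "risk_state": "high"}
print(annotate_confidence(verdict, 0.45))
# {'allowed': False, 'risk_state': 'high', 'confidence': 0.45, 'low_confidence': True}
```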

Stale data. Every response includes an X-Attestd-Knowledge-Age header. The API caches synthesis results in memory for up to an hour; the underlying NVD/KEV data is refreshed every 6 hours. For most CI/CD gates this is fine. If you're running a high-stakes deployment system and need freshness guarantees, check the header and alert if age exceeds your threshold.
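
A freshness check might look like the sketch below. It makes an assumption that this post does not confirm: that the X-Attestd-Knowledge-Age value is a whole number of seconds. Check the SDK reference for the actual format and for how the client exposes response headers; the function names here are hypothetical. A missing or unparseable header is treated as stale.

```python
from typing import Mapping, Optional

HEADER = "X-Attestd-Knowledge-Age"

def knowledge_age_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Parse the age header. ASSUMPTION: the value is a whole number of seconds."""
    for name, value in headers.items():
        if name.lower() == HEADER.lower():  # header names are case-insensitive
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None

def is_stale(headers: Mapping[str, str], max_age_seconds: int) -> bool:
    """True if the data is older than the threshold or the age cannot be determined."""
    age = knowledge_age_seconds(headers)
    return age is None or age > max_age_seconds

print(is_stale({"x-attestd-knowledge-age": "7200"}, max_age_seconds=6 * 3600))  # False
print(is_stale({}, max_age_seconds=6 * 3600))                                   # True
```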

What this doesn't replace

Attestd checks a specific product + version against known CVE data. It does not:

Scan your dependency tree for transitive vulnerabilities
Detect misconfigurations or secrets in your deployments
Replace a full SCA tool for application-layer dependencies (npm, PyPI, Maven)

The use case it solves precisely is infrastructure-layer components: the web server, database, reverse proxy, SSH daemon, and TLS library that your AI agent might suggest deploying or upgrading. For those, this check is the fastest and most automatable approach available.

Summary

python

# The entire integration: two imports, one check, one gate (assert is stripped under python -O)
import os
import attestd

client = attestd.Client(api_key=os.environ["ATTESTD_KEY"])
result = client.check("nginx", "1.20.0")
if result.risk_state in ("critical", "high"):
    raise SystemExit(f"Blocked: {result.risk_state}")

Get an API key at api.attestd.io/portal/login. The free tier covers 1,000 checks per month — enough to gate a reasonable CI/CD pipeline without paying anything.

The full SDK reference, including AsyncClient, error types, and the testing module, is at attestd.io/docs/sdk-reference.