Python guardrails for AI agent payments
Three Python patterns for guarding AI agent payments — decorators, middleware, and explicit gates. Trade-offs on testability, mocking, and integration shape.
If your Python agent can call make_payment, you need a guardrail in front of it. The interesting question isn't whether — it's what shape that guardrail takes in your codebase.
What are Python guardrails for AI payments?
Python guardrails for AI payments are the code that sits between an LLM's tool call and the payment rail. They evaluate a proposed transaction against policy, route to human approval when required, and write an audit record before any money moves. In Python agent codebases, guardrails show up in three shapes: a decorator on the tool function, middleware on the agent's tool dispatcher, or an explicit gate function the tool calls itself.
All three solve the same problem. They differ in where the policy boundary lives, how easy the code is to test, and how cleanly they compose with frameworks like LangGraph and CrewAI.
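The three responsibilities above (evaluate, route to approval, write an audit record) can be sketched without any library. This is a minimal stand-in, not the paygraph implementation: SketchPolicy, SketchEngine, and Decision are hypothetical names, and the thresholds simply mirror the values used later in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    blocked: bool = False
    requires_approval: bool = False
    reason: str = ""


@dataclass
class SketchPolicy:
    # Hypothetical policy fields, named after the ones used in this article.
    max_per_transaction_usd: float = 500
    require_approval_above_usd: float = 100


@dataclass
class SketchEngine:
    policy: SketchPolicy
    audit_log: list = field(default_factory=list)

    def evaluate(self, tool_name: str, kwargs: dict) -> Decision:
        amount = kwargs["amount_usd"]
        if amount > self.policy.max_per_transaction_usd:
            decision = Decision(blocked=True, reason="over per-transaction cap")
        elif amount > self.policy.require_approval_above_usd:
            decision = Decision(requires_approval=True, reason="needs human approval")
        else:
            decision = Decision(reason="within policy")
        # The audit record is written during evaluation, before any money can move.
        self.audit_log.append((tool_name, dict(kwargs), decision))
        return decision


engine = SketchEngine(SketchPolicy())
small = engine.evaluate("make_payment", {"amount_usd": 40})
medium = engine.evaluate("make_payment", {"amount_usd": 250})
large = engine.evaluate("make_payment", {"amount_usd": 900})
```

Each of the three patterns below differs only in where a call like evaluate is placed, not in what it decides.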
How does the decorator pattern work?
The decorator pattern wraps the tool function itself. The policy check runs every time the function is called, regardless of who calls it.

    from paygraph import PolicyEngine, Policy

    policy = Policy(
        max_per_transaction_usd=500,
        daily_cap_usd=2000,
        require_approval_above_usd=100,
    )
    engine = PolicyEngine(policy)

    @engine.guarded_tool
    def make_payment(amount_usd: float, vendor: str, category: str):
        # stripe_client is assumed to be configured elsewhere in the application.
        return stripe_client.charges.create(
            # Amount in cents; round() avoids float truncation (int(19.99 * 100) is 1998).
            amount=round(amount_usd * 100),
            source=vendor,
        )

The guardrail is now a property of the function. You cannot accidentally route around it by calling make_payment from a different code path. That's the headline benefit: the policy boundary is glued to the dangerous primitive, not to the caller.
The cost is testability. To unit-test the wrapped function, you either patch the engine or build a permissive Policy for the test. Mocking make_payment.__wrapped__ works but couples your tests to the decorator's internals. For most teams the trade is worth it — one decorator, one policy boundary, no missed paths.
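To make the testability trade-off concrete, here is a rough sketch of how a guarding decorator can be built with functools.wraps. StubEngine and its max_usd parameter are hypothetical stand-ins, not the paygraph API; the point is only to show where __wrapped__ comes from and why calling it skips the check.

```python
import functools


class StubEngine:
    """Hypothetical stand-in for a policy engine with a single fixed cap."""

    def __init__(self, max_usd: float):
        self.max_usd = max_usd

    def guarded_tool(self, func):
        @functools.wraps(func)  # copies metadata and sets wrapper.__wrapped__ = func
        def wrapper(amount_usd: float, *args, **kwargs):
            if amount_usd > self.max_usd:
                raise PermissionError(f"{amount_usd} exceeds cap {self.max_usd}")
            return func(amount_usd, *args, **kwargs)

        return wrapper


engine = StubEngine(max_usd=500)


@engine.guarded_tool
def make_payment(amount_usd: float, vendor: str):
    # Placeholder for the real payment-rail call.
    return {"status": "charged", "amount_usd": amount_usd, "vendor": vendor}


def try_payment(amount_usd: float) -> str:
    """Return the payment status, or "blocked" if the guard raised."""
    try:
        return make_payment(amount_usd, "acme")["status"]
    except PermissionError:
        return "blocked"
```

In this sketch, try_payment(900) is blocked, while make_payment.__wrapped__(900, "acme") reaches the undecorated function with no check at all. That is why tests built on __wrapped__ exercise the payment logic but say nothing about the guardrail.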
How does the middleware pattern work?
Middleware moves the guardrail up to the agent's dispatcher. Every tool call the agent makes passes through a single function that checks policy before invoking the tool.

    from paygraph import PolicyEngine, Policy

    engine = PolicyEngine(Policy(max_per_transaction_usd=500))

    def policy_middleware(tool_name: str, kwargs: dict, next_):
        # Evaluate policy before the tool runs.
        decision = engine.evaluate(tool_name, kwargs)
        if decision.blocked:
            raise PermissionError(decision.reason)
        if decision.requires_approval:
            engine.request_approval(decision)
            return {"status": "pending_approval"}
        return next_(kwargs)

    # Agent stands in for your framework's agent or dispatcher class.
    agent = Agent(tools=[make_payment, send_email], middleware=[policy_middleware])

Middleware is the right answer when you have many tools and a single dispatcher, such as LangGraph nodes, CrewAI tool wrappers, or a custom ToolRouter. One place to enforce, one place to test. The same wrapping shows up inside a LangGraph state machine where the dispatcher is the state graph itself.
The risk is coverage. If a tool gets called outside the dispatcher — a background job, a retry handler, a developer poking at the REPL — the middleware never runs. Decorators don't have this gap. Middleware does.
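The coverage gap is easy to reproduce with a toy dispatcher. The ToolRouter below is a hypothetical sketch, not a LangGraph or CrewAI class; it only assumes the same (tool_name, kwargs, next_) middleware signature used in this section.

```python
from typing import Callable


class ToolRouter:
    """Hypothetical dispatcher: calls made through dispatch() run the middleware chain."""

    def __init__(self, tools: dict, middleware: list):
        self.tools = tools
        self.middleware = middleware

    def dispatch(self, tool_name: str, kwargs: dict):
        # Innermost handler invokes the tool itself.
        def call_tool(kw: dict):
            return self.tools[tool_name](**kw)

        handler = call_tool
        # Wrap from last to first so the first middleware in the list runs outermost.
        for mw in reversed(self.middleware):
            handler = self._bind(mw, tool_name, handler)
        return handler(kwargs)

    @staticmethod
    def _bind(mw: Callable, tool_name: str, next_: Callable) -> Callable:
        # Bind through arguments so each layer keeps its own next_ handler.
        return lambda kw: mw(tool_name, kw, next_)


def cap_middleware(tool_name: str, kwargs: dict, next_: Callable):
    if tool_name == "make_payment" and kwargs["amount_usd"] > 500:
        raise PermissionError("over per-transaction cap")
    return next_(kwargs)


def make_payment(amount_usd: float, vendor: str):
    # Placeholder for the real payment-rail call.
    return {"status": "charged", "amount_usd": amount_usd}


router = ToolRouter({"make_payment": make_payment}, [cap_middleware])
```

Here router.dispatch("make_payment", {"amount_usd": 900, "vendor": "acme"}) raises PermissionError, but a direct make_payment(900, "acme") succeeds because nothing on that path ever consults the middleware.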
How does the explicit gate pattern work?
The explicit gate pattern puts the policy check inside the tool function as a normal function call. No magic, no decorator, no framework hooks.

    from paygraph import PolicyEngine, Policy

    engine = PolicyEngine(Policy(max_per_transaction_usd=500))

    def make_payment(amount_usd: float, vendor: str, category: str):
        decision = engine.evaluate(
            "make_payment",
            {"amount_usd": amount_usd, "vendor": vendor, "category": category},
        )
        if decision.blocked:
            raise PermissionError(decision.reason)
        if decision.requires_approval:
            return engine.await_approval(decision)
        # stripe_client is assumed to be configured elsewhere in the application.
        return stripe_client.charges.create(
            # Amount in cents; round() avoids float truncation (int(19.99 * 100) is 1998).
            amount=round(amount_usd * 100),
            source=vendor,
        )

The gate is just code. It mocks like any other function call. You can unit-test the tool in isolation by substituting a fake engine. There's no metaprogramming for a new engineer to reverse-engineer.
The cost is discipline. Every tool author has to remember to write the gate. Miss one, and the agent has a hole. In practice, teams that pick this pattern pair it with a lint rule or a code review checklist.
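A lint rule for forgotten gates can be approximated with the standard-library ast module. The sketch below rests on two assumptions that are not from any real linter: payment tools are top-level functions whose names end in _payment, and the gate is always spelled engine.evaluate(...). It only detects that the call is present, not that its result is acted on.

```python
import ast

# Example module source: one gated tool and one that forgot the gate.
SOURCE = '''
def make_payment(amount_usd, vendor):
    decision = engine.evaluate("make_payment", {"amount_usd": amount_usd})
    return charge(amount_usd, vendor)

def refund_payment(amount_usd, vendor):
    return refund(amount_usd, vendor)
'''


def calls_engine_evaluate(func: ast.FunctionDef) -> bool:
    """Return True if the function body contains a call to engine.evaluate(...)."""
    for node in ast.walk(func):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "evaluate"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "engine"
        ):
            return True
    return False


def ungated_tools(source: str, suffix: str = "_payment") -> list[str]:
    """List top-level functions matching the naming convention that lack a gate call."""
    tree = ast.parse(source)
    return [
        node.name
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
        and node.name.endswith(suffix)
        and not calls_engine_evaluate(node)
    ]
```

Run over SOURCE, ungated_tools reports refund_payment. A check like this could run in CI next to the code review checklist, with the naming convention adjusted to however your codebase marks payment-capable tools.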
Which pattern should you pick?
The choice depends on how many tools you have, how your agent dispatches them, and how much testing infrastructure you want to keep simple.
| | Decorator | Middleware | Explicit gate |
|---|---|---|---|
| Policy boundary | On the tool | On the dispatcher | Inside the tool body |
| Coverage of side paths | Full | Dispatcher only | Per-tool, manual |
| Unit-test ergonomics | Patch engine | Patch middleware | Inject fake engine |
| Mocking complexity | Medium | Low | Low |
| Best fit | One or two payment tools | Many tools, single dispatcher | Small codebase, strict typing |
| Risk | Decorator opacity in stack traces | Side-path bypass | Forgotten gate |
A useful rule: if your agent has one make_payment tool and a dozen read-only tools, decorate the payment tool. If your agent has five payment-capable tools all routed through one node, use middleware. If your team distrusts metaprogramming and runs strict mypy, write explicit gates and lint for them.
You can also combine. A decorator on the payment primitive plus middleware on the dispatcher gives you defense in depth — the decorator catches the side paths the middleware misses, and the middleware gives you per-agent context the decorator doesn't see. The same policy-evaluation step runs in both, so the audit log stays single-sourced.
Testing the guardrail itself
Whichever pattern you pick, the guardrail's tests should cover four cases at minimum:
- A transaction below all thresholds executes and returns the real response.
- A transaction above max_per_transaction_usd raises and logs the block.
- A transaction above require_approval_above_usd pauses and emits an approval request.
- A transaction in a blocked category raises before the rail is touched.
In a decorator setup, you test by calling the decorated function directly with a stub PolicyEngine. In middleware, you call the middleware function with a fake next_. In an explicit gate, you call the tool with the engine injected as an argument or via a module-level fixture. The fourth case — category block — is the one most teams forget. It's also the one prompt injection most often targets.
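The four cases can be sketched against an explicit gate with the engine injected as an argument. Everything here is a hypothetical test double: FakeEngine, FakeRail, the gift_cards blocked category, and the 100 and 500 thresholds are illustrative assumptions, not paygraph behavior.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    blocked: bool = False
    requires_approval: bool = False
    reason: str = ""


class FakeEngine:
    """Test double that records blocks and approval requests."""

    def __init__(self):
        self.blocks_logged = []
        self.approval_requests = []

    def evaluate(self, tool_name, kwargs):
        if kwargs["category"] == "gift_cards":  # assumed blocked category
            decision = Decision(blocked=True, reason="blocked category")
        elif kwargs["amount_usd"] > 500:
            decision = Decision(blocked=True, reason="over max_per_transaction_usd")
        elif kwargs["amount_usd"] > 100:
            decision = Decision(requires_approval=True, reason="approval required")
        else:
            return Decision()
        if decision.blocked:
            self.blocks_logged.append(decision.reason)
        return decision

    def request_approval(self, decision):
        self.approval_requests.append(decision)


class FakeRail:
    """Stands in for the payment rail and records every charge attempt."""

    def __init__(self):
        self.charges = []

    def charge(self, amount_usd, vendor):
        self.charges.append((amount_usd, vendor))
        return {"status": "charged"}


def make_payment(engine, rail, amount_usd, vendor, category):
    decision = engine.evaluate(
        "make_payment",
        {"amount_usd": amount_usd, "vendor": vendor, "category": category},
    )
    if decision.blocked:
        raise PermissionError(decision.reason)
    if decision.requires_approval:
        engine.request_approval(decision)
        return {"status": "pending_approval"}
    return rail.charge(amount_usd, vendor)


def run_case(amount_usd, category):
    """Run one payment against fresh fakes and return the outcome plus the fakes."""
    engine, rail = FakeEngine(), FakeRail()
    try:
        result = make_payment(engine, rail, amount_usd, "acme", category)
    except PermissionError as exc:
        result = {"status": "blocked", "reason": str(exc)}
    return result, engine, rail
```

Each case then reduces to one run_case call plus assertions on the result, on engine.blocks_logged or engine.approval_requests, and on rail.charges being empty whenever the payment should not have reached the rail.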
Where to start
- GitHub: github.com/paygraph-ai/paygraph. MIT-licensed Python SDK with @guarded_tool, middleware adapters, and an explicit engine.evaluate() API.
- Docs: docs.paygraph.dev. Pattern guides for decorators, LangGraph middleware, CrewAI tool wrappers, and bare-Python gates.
- Discord: discord.gg/PPVZWSMdEm. Ask which pattern fits your stack before you commit to one.
Pick the pattern that matches how your agent already dispatches tools, then make an unguarded payment call the one thing your code review never lets through.