By the PayGraph Team

AI agent audit logs: what they are and what they must capture

An AI agent audit log is the immutable record of what your agent tried, what policy allowed, who approved it, and what happened. Here's what every row must contain.

Agents that spend money without an audit log are agents you cannot defend in a review, a dispute, or a deposition. This post defines the AI agent audit log, lists the six fields every row must contain, and explains how retention and immutability work in practice.

What is an AI agent audit log?

An AI agent audit log is an append-only, immutable record of every action an autonomous agent attempted, the policy decision that was made, any human approval involved, and the final outcome. It's the compliance and forensics layer that sits underneath policy-controlled spending — without it, you have rules but no receipts.

A standard payment log records what moved. An AI agent audit log records what the agent wanted to move, what was permitted, why, and by whom. That gap — between intent and execution — is where every interesting incident lives.

How is it different from a normal transaction log?

A Stripe or ledger transaction log answers one question: what payments occurred? That's necessary but insufficient for agents, because most of what an agent does never reaches the payment rail. It gets blocked, deferred for approval, or retried. Those events are the ones you need when something goes wrong.

| | Payment transaction log | AI agent audit log |
|---|---|---|
| Records executed payments | Yes | Yes |
| Records blocked attempts | No | Yes |
| Records policy reason for decision | No | Yes |
| Records approver identity | Rarely | Yes |
| Captures the model's stated intent | No | Yes |
| Immutable by default | Varies | Required |
| Retention horizon | 1–7 years | 7+ years, often indefinite |

If your auditor asks "why did the agent try to wire $40,000 to a vendor not on the allowlist at 2am?", a payment log says nothing. The AI agent audit log tells the whole story.

What six fields must every audit row contain?

Every row in a PayGraph audit log carries six mandatory fields. Miss one and the record is incomplete for forensic or SOC 2 purposes.

  1. Who: the agent identity, the session or run ID, and the model version. "Claude-3.5-Sonnet, agent=procurement-bot, run=r_8f2a" is useful. "The agent" is not.
  2. What: the tool name, the full argument payload, and the dollar amount if applicable. Capture arguments before and after any policy-side mutation.
  3. When: a UTC timestamp with millisecond precision and a monotonic sequence number within the run.
  4. Why: the model's stated reason for the action, pulled from the reasoning trace or tool-call rationale. This is what makes the log forensically useful when prompt injection is suspected.
  5. Approved-by: policy:<policy_rule_id> if auto-approved by rule, human:<user_id> if a person approved, or denied:<policy_rule_id> if blocked. Never leave this null.
  6. Outcome: one of executed, denied, pending_approval, approval_timeout, or downstream_failure, with the downstream transaction ID when executed.

A minimal row, serialized:

{
    "who": {
        "agent_id": "procurement-bot",
        "run_id": "r_8f2a1c",
        "model": "claude-3-5-sonnet-20241022"
    },
    "what": {
        "tool": "make_payment",
        "args": {"amount_usd": 487.00, "vendor": "aws", "category": "software"}
    },
    "when": "2026-04-20T14:03:11.482Z",
    "why": "Monthly AWS invoice received via email, matches expected amount.",
    "approved_by": "policy:rule_software_under_500",
    "outcome": {"status": "executed", "txn_id": "pi_3OxY2kL1a"}
}

This is the row your compliance team, your incident responder, and your on-call engineer all need at 3am. One schema, six fields, no exceptions.
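
One way to hold the "no exceptions" line is to validate each row before it is written. The sketch below is a minimal illustration in Python, not the PayGraph SDK's API: the function name `validate_row`, the error strings, and the exact checks are assumptions built on the field descriptions and the sample row above.

```python
from datetime import datetime

REQUIRED_FIELDS = ("who", "what", "when", "why", "approved_by", "outcome")
OUTCOME_STATUSES = {
    "executed", "denied", "pending_approval",
    "approval_timeout", "downstream_failure",
}
APPROVER_KINDS = {"policy", "human", "denied"}


def validate_row(row):
    """Return a list of problems; an empty list means the row is complete."""
    # Any absent, null, or empty top-level field makes the row incomplete.
    missing = [name for name in REQUIRED_FIELDS if not row.get(name)]
    if missing:
        return [f"missing field: {name}" for name in missing]

    problems = []
    for key in ("agent_id", "run_id", "model"):
        if not row["who"].get(key):
            problems.append(f"who.{key} is empty")
    if not row["what"].get("tool"):
        problems.append("what.tool is empty")

    # Assumed wire format: ISO 8601, UTC ("Z" suffix), fractional seconds.
    try:
        datetime.strptime(row["when"], "%Y-%m-%dT%H:%M:%S.%fZ")
    except (TypeError, ValueError):
        problems.append("when is not a UTC timestamp with fractional seconds")

    # "human:" and "denied:" need an ID after the colon; bare "policy" is
    # tolerated, although the sample row carries a rule ID as well.
    kind, _, ref = str(row["approved_by"]).partition(":")
    if kind not in APPROVER_KINDS or (kind != "policy" and not ref):
        problems.append("approved_by has an unknown or incomplete value")

    outcome = row["outcome"]
    if outcome.get("status") not in OUTCOME_STATUSES:
        problems.append("outcome.status is not a known status")
    elif outcome["status"] == "executed" and not outcome.get("txn_id"):
        problems.append("outcome.txn_id is required when status is executed")
    return problems
```

Passing the sample row above should produce an empty list. A writer built this way would refuse to persist any row for which the list is non-empty, so incomplete records never reach the store.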

Why must the log be immutable?

An audit log that can be edited is not an audit log. It's a note. Immutability matters for three reasons.

First, legal defensibility. If you face a dispute or a regulator, a mutable log carries little evidentiary weight. You need write-once storage or cryptographic chaining that proves records weren't altered after the fact.

Second, insider risk. The person who has motive to edit the log is often the person with access to edit it. Removing that possibility by construction is the only sound design.

Third, debugging trust. When engineers trust the log, they fix the real bug. When they suspect the log, they chase ghosts.

PayGraph writes audit records to an append-only store with per-row SHA-256 chaining — each row includes the hash of the previous row, so tampering anywhere in the chain invalidates everything after it. Export targets include S3 Object Lock, Postgres with revoked UPDATE/DELETE grants, and any WORM-compliant bucket.
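
The chaining idea can be sketched in a few lines of Python. This is an illustration of per-row SHA-256 chaining in general, not PayGraph's actual implementation: the all-zero genesis value, the canonical JSON encoding, and the names `row_hash`, `append_row`, and `first_invalid_index` are assumptions made for the example.

```python
import hashlib
import json

# Assumed placeholder standing in for "the previous hash" of the first row.
GENESIS_HASH = "0" * 64


def row_hash(row, prev_hash):
    """SHA-256 over the previous hash plus a canonical encoding of the row."""
    # Sorted keys and fixed separators make the encoding deterministic.
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_row(chain, row):
    """Append a row, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {"row": row, "prev_hash": prev_hash, "hash": row_hash(row, prev_hash)}
    chain.append(entry)
    return entry


def first_invalid_index(chain):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(chain):
        expected = row_hash(entry["row"], prev_hash)
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None
```

Editing any stored row changes its recomputed hash, so verification fails at that entry. Rewriting that entry's hash to match only moves the failure to the next entry, whose `prev_hash` no longer lines up. Note that a chain like this detects tampering; preventing it still depends on the write-once storage underneath.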

How long should audit logs be retained?

Retention depends on what your agent touches, but the floor is higher than most teams assume.

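  • SOC 2: the criteria do not prescribe a fixed retention period, but evidence must cover the full audit window. Treat 1 year as the practical minimum, with 7 years recommended for Type 2 history.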
  • PCI DSS (Requirement 10.5.1 in v4.0, formerly 10.7 in v3.2.1): 1 year minimum, with the most recent 3 months immediately available.
  • SOX (if applicable): 7 years for financial controls evidence.
  • GDPR / CCPA: retention must be bounded and justified — indefinite retention of personal data in audit logs is itself a violation.
  • Internal incident review: 2 years covers nearly all post-mortems.

The practical default for most PayGraph deployments is 7 years, with PII redacted at ingest. Store the fact of the transaction forever; store the personal details only as long as you need them. For related guidance on what flows into the log in the first place, see our write-up on policy-controlled spending architecture.
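
As a rough sketch of "redact at ingest, retain for 7 years", the Python below strips personal details from a row's arguments before it is stored and computes the date after which the row may be expired. The set of PII keys, the `[REDACTED]` marker, and the function names are hypothetical; a real deployment would drive them from its own data classification and the retention settings documented by PayGraph.

```python
import copy
from datetime import datetime, timezone

RETENTION_YEARS = 7  # the practical default discussed above
# Hypothetical argument names treated as personal data in this example.
PII_ARG_KEYS = {"email", "card_number", "billing_address"}


def redact_pii(row):
    """Return a copy of the row with PII argument values masked."""
    redacted = copy.deepcopy(row)  # never mutate the caller's row
    args = redacted["what"]["args"]
    for key in PII_ARG_KEYS:
        if key in args:
            args[key] = "[REDACTED]"
    return redacted


def retain_until(when, years=RETENTION_YEARS):
    """Return the UTC datetime after which the row may be expired."""
    ts = datetime.strptime(when, "%Y-%m-%dT%H:%M:%S.%fZ")
    ts = ts.replace(tzinfo=timezone.utc)
    try:
        return ts.replace(year=ts.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return ts.replace(year=ts.year + years, day=28)
```

Redaction has to run before the row is hashed and appended. Once a row is part of an immutable chain, removing personal data from it later would break verification, which is why the masking belongs at ingest.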

Where to start

  • GitHub: github.com/paygraph-ai/paygraph — MIT-licensed SDK with the audit log schema, hash chaining, and WORM export adapters built in.
  • Docs: docs.paygraph.dev — full audit row schema, retention configuration, and SOC 2 evidence export guide.
  • Discord: discord.gg/PPVZWSMdEm — compare notes with other teams on compliance reviews and incident forensics.

If your agent can move money and you can't produce these six fields for every action it took last Tuesday, the audit log is the first thing to ship this week.