What Can an AI System Prove?

Governance Review: Approved · Claims Reviewed: 17 · Unsupported: 0 · 2026-06-08

An AI system can prove only what it can produce a verifiable, tamper-evident record of. The question "what can this system prove?" is different from "what does this system claim?" A system can claim to be governed, auditable, and compliant. What it can prove is determined by the records it actually generates, the tamper-evidence mechanisms protecting those records, and the reconstruction capability those records provide.

Definition

Claims vs. evidence

"Prove," in this context, means: produce a verifiable record that enables an independent party to confirm a fact without relying on the system's own assertion.

A claim

What the system states about itself — "this action was authorized," "this workflow is governed," "this output was reviewed."

Evidence

What the system can produce for independent verification — a tamper-evident record that an auditor who did not witness the event can examine and confirm.

A system can make claims that accurately describe its intended behavior while having no evidence infrastructure that allows those claims to be verified. An accurate claim and a verifiable claim are not the same thing.

What an AI system can potentially prove

· That a specific action occurred at a specific time (if logged with a tamper-evident record)
· That a specific evaluation was performed before an action executed (if a gate decision record exists)
· That a specific policy version was applied to an evaluation (if the policy was versioned and recorded)
· That a specific identity authorized an action (if the authorization context was captured)
· That a record has not been modified since creation (if a tamper-evidence mechanism was applied)
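
One way to make "has not been modified since creation" checkable is a hash chain, in which each record's digest commits to the record before it. The following is a minimal Python sketch under stated assumptions: the record fields, the function names (append_record, verify_log), and the GENESIS placeholder are hypothetical and are not taken from any Auditome or ASE implementation.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder digest preceding the first record


def record_digest(body: dict, prev_digest: str) -> str:
    """Hash the record body together with the previous record's digest."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def append_record(log: list, body: dict) -> dict:
    """Append a record whose digest commits to the record before it."""
    prev = log[-1]["digest"] if log else GENESIS
    entry = {"body": body, "prev": prev, "digest": record_digest(body, prev)}
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every digest; an edited, reordered, or mid-chain-deleted record fails.

    Removal of the most recent records is NOT detected here; that needs a
    head digest held outside the system.
    """
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["digest"] != record_digest(entry["body"], prev):
            return False
        prev = entry["digest"]
    return True


log = []
append_record(log, {"action": "send_email", "at": "2026-06-08T10:00:00Z",
                    "policy_version": "v3", "authorized_by": "user:alice"})
append_record(log, {"action": "update_record", "at": "2026-06-08T10:00:01Z",
                    "policy_version": "v3", "authorized_by": "user:alice"})
intact_before = verify_log(log)

log[0]["body"]["authorized_by"] = "user:mallory"  # simulate a later edit
intact_after = verify_log(log)
```

A chain like this only shows that the records are internally consistent. A system that controls the entire log could rebuild it from scratch, so the latest digest would need to be held by a party outside the system for the check to count as independent verification. The sketch leaves that step out.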

What an AI system typically cannot prove

· That no unauthorized actions occurred: negative proof requires complete, tamper-evident records with no gaps
· That model outputs accurately represented model reasoning: internal model state is not auditable
· That the system was not manipulated between two audited events: gaps in the audit trail are not evidence that nothing happened
· That a specific causal chain produced an outcome in a non-deterministic system: probabilistic systems cannot produce deterministic execution records in the general case

Why It Matters

Where governance failures hide

The gap between what a system claims and what it can prove is where governance failures hide. A system that claims to be governed but cannot produce evidence of governance decisions is, from an audit perspective, ungoverned — regardless of the governance infrastructure described in its documentation.

For AI systems, this gap is structurally wider than in conventional software:

Model reasoning is not observable

A model's internal state during inference is not recorded in a form that enables reconstruction. What is observable is input, output, and tool invocations. The reasoning between those is not auditable.

Actions can occur faster than human review

In agentic systems executing sequences of actions, the time between intent and execution may be milliseconds. The record of what occurred is the only mechanism for review.

The same input may produce different outputs

In non-deterministic systems, re-running a prior event does not reliably reproduce it, so replay cannot be used to verify the record. The record must stand on its own.

Governance documentation and enforcement can diverge

A system can have detailed governance policies and no enforcement of them. The documentation is real; the governance is not.

Common Failure Modes

How assertion substitutes for evidence

1. Documentation as proof

Governance policies, architecture diagrams, and process descriptions are cited as evidence that the system is governed. Documentation describes the intended system. An audit based on documentation is an audit of claims, not of evidence.

2. Self-attestation

The system is evaluated by reviewing its own claims about itself — its logs, its reports, its assertions. A system that controls its own records can produce records consistent with any claim about its behavior. Self-attestation is not independent verification.

3. Incomplete evidence chain

Records exist for some events but not others. A chain of evidence is only as strong as its weakest link — a gap in the record means the events in the gap are not in evidence, regardless of what the records around the gap contain.

4. Non-tamper-evident records

Records can be modified after the fact, with modifications undetectable. Any audit of those records is an audit of the current state of the records — which may differ from the original state of the events.

5. Negative proof attempted

The system claims to prove that something did not happen. Negative proofs require complete visibility with tamper-evident records and no gaps. Without these conditions, the absence of a record is not evidence that the event did not occur — it may be evidence that the event was not recorded.

6. Capability conflated with enforcement

A system that can produce audit records and can evaluate policies is described as governed. Whether these capabilities are exercised — consistently, in all cases — determines what the system can prove. A capability unused in a specific instance leaves that instance without evidence.
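
The difference between having a capability and exercising it can be checked by comparing executed actions against gate decision records. The sketch below is illustrative only: the function name unevidenced_actions, the action_id field, and the idea that the list of executed actions comes from a source outside the governed system (for example, a downstream system's own log) are all assumptions.

```python
def unevidenced_actions(executed_action_ids: list, gate_decisions: list) -> list:
    """Return executed actions that have no matching gate decision record."""
    evaluated = {decision["action_id"] for decision in gate_decisions}
    return [action_id for action_id in executed_action_ids
            if action_id not in evaluated]


# Hypothetical data: four actions executed, three gate decisions recorded.
executed = ["a1", "a2", "a3", "a4"]
decisions = [
    {"action_id": "a1", "decision": "allow"},
    {"action_id": "a2", "decision": "allow"},
    {"action_id": "a4", "decision": "deny"},
]

missing = unevidenced_actions(executed, decisions)
coverage = (len(executed) - len(missing)) / len(executed)
```

In this example one action executed without a recorded evaluation, so the system can prove governance for three of the four actions and for nothing more, whatever its gate is capable of.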

Evidence Requirements

What proof requires

For a system to prove a specific claim, it must produce:

· A tamper-evident record of the event: not just a timestamp, but a verifiable integrity mechanism
· Sufficient context to establish that the record refers to the specific event, not a similar one
· A mechanism for the verifying party to confirm tamper-evidence independently of the system that generated it
· A complete chain from the event to its authorization: this action occurred, at this time, because this evaluation approved it, under this policy version, authorized by this identity
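
The chain described in the last requirement can be treated as a record with one field per link, where any empty field marks a point at which proof falls back to assertion. This is a sketch under assumptions: the EvidenceRecord type, its field names, and the missing_links helper are hypothetical and only mirror the links named above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class EvidenceRecord:
    action_id: Optional[str]       # this action occurred
    occurred_at: Optional[str]     # at this time
    evaluation_id: Optional[str]   # because this evaluation approved it
    policy_version: Optional[str]  # under this policy version
    authorized_by: Optional[str]   # authorized by this identity
    record_digest: Optional[str]   # integrity value a verifier can recheck


def missing_links(record: EvidenceRecord) -> list:
    """List the links of the chain that the record does not establish."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


complete = EvidenceRecord("act-17", "2026-06-08T10:00:00Z", "eval-9",
                          "v3", "user:alice", "ab12cd34")
partial = EvidenceRecord("act-18", "2026-06-08T10:00:02Z", "eval-10",
                         None, None, "ef56ab78")

complete_gaps = missing_links(complete)
partial_gaps = missing_links(partial)
```

The second record shows that an action occurred and was evaluated, but not under which policy version or on whose authority. A presence check like this does not confirm that the values are true. It only shows which parts of the claim have no record behind them.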

What cannot be proved without complete records

· The absence of an event: requires a complete, tamper-evident record of everything that did happen, with no gaps
· Model reasoning: what the model produced is recordable; what the model considered is not
· Events that occurred before the audit trail was implemented: evidence cannot be created retroactively for events that were never recorded under governance requirements
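
The conditions for a negative claim can be written as a small decision rule. The sketch below assumes, hypothetically, that every event is assigned a consecutive sequence number when it executes and that the integrity of the record chain has been checked separately. The names find_gaps and absence_claim_status are illustrative.

```python
def find_gaps(sequence_numbers: list) -> list:
    """Return (first_missing, last_missing) ranges between recorded numbers."""
    ordered = sorted(set(sequence_numbers))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > 1:
            gaps.append((earlier + 1, later - 1))
    return gaps


def absence_claim_status(chain_intact: bool, gaps: list, matching_records: int) -> str:
    """Classify the claim 'event X did not occur' against the available records."""
    if matching_records > 0:
        return "refuted"      # a record of the event exists
    if not chain_intact or gaps:
        return "unprovable"   # the event may have occurred unrecorded
    return "supported"        # complete, intact record with no match


recorded = [1, 2, 3, 6, 7, 10]
gaps = find_gaps(recorded)
status_with_gaps = absence_claim_status(True, gaps, 0)
status_complete = absence_claim_status(True, [], 0)
status_found = absence_claim_status(True, [], 1)
```

Even the "supported" result depends on the assumption that no event can execute without receiving a sequence number. If that assumption cannot itself be evidenced, the negative claim remains an assertion.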

Governance Considerations

Evaluating systems on evidence, not description

Systems should be evaluated on what they can prove, not what they claim. The practical implication for governance review:

"Ask for evidence, not description"

"How does your system govern AI actions?" invites description. "Produce the record of the last five governance evaluations" requires evidence. The gap between the description and the evidence is the governance gap.

Distinguish evidenced from documented governance

Both may be present. Neither implies the other. Documentation without evidence is a compliance claim without a record.

The provability assessment is itself subject to the evidence standard

An assessment that says "this system can prove X" must be based on a review of the system's actual record production — not on review of the system's descriptions of its record production.

Related Concepts

A-01 AI audit trail: The primary mechanism by which a system produces provable evidence of its behavior

A-04 AI visibility vs. governance: Both claims are subject to the evidence standard

A-06 Authorization boundary: Provability is bounded by the scope within which evidence was generated

B-01 Execution governance gate: A gate produces a verifiable record that makes the governance decision provable

A-10 Evidence requirements for AI systems: Extends this concept to specific evidence types by decision class

Auditome Perspective

"What can this system prove?" is the primary evaluation frame ASE was built around. ASE evaluates a system's evidence posture across six dimensions:

Traceability: Can actions be traced to their authorization?
Authority: Can the authorizing entity for each action be identified?
Auditability: Can records be reviewed independently of the system being audited?
Evidence support: Are governance claims backed by records that meet the evidence standard?
Fail-closed behavior: When the system cannot evaluate an action, does it default to halt or proceed?
Governance consistency: Are governance controls applied uniformly, or only in monitored cases?
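
The fail-closed question can be made concrete with a small gate function. This Python sketch is illustrative only and does not describe how ASE evaluates systems: the gate function, the evaluator callables, and the decision record fields are hypothetical names chosen for the example.

```python
from typing import Callable


def gate(action: dict, evaluate: Callable[[dict], bool]) -> dict:
    """Return a decision record; any evaluation failure results in a halt."""
    try:
        allowed = evaluate(action)
        if not isinstance(allowed, bool):
            raise TypeError("evaluator returned a non-boolean result")
        decision = "allow" if allowed else "deny"
        reason = "policy_evaluated"
    except Exception as exc:
        # Fail closed: an evaluator that cannot answer never yields "allow".
        decision = "deny"
        reason = "evaluation_failed:" + type(exc).__name__
    return {"action": action.get("name"), "decision": decision, "reason": reason}


def unreachable_policy_service(action: dict) -> bool:
    raise TimeoutError("policy service unreachable")


def allow_reads_only(action: dict) -> bool:
    return action.get("name", "").startswith("read_")


halted = gate({"name": "delete_records"}, unreachable_policy_service)
permitted = gate({"name": "read_report"}, allow_reads_only)
refused = gate({"name": "delete_records"}, allow_reads_only)
```

The gate returns a decision record in every branch, including the failure branch. A system that halts without recording why it halted is fail-closed in behavior but still cannot prove it.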

ASE does not produce a compliance certification. It produces an assessment of what a system can and cannot prove across these dimensions. That assessment is itself subject to the same evidence standard: ASE's findings are based on what the system produced, not what the system claimed.

References

1. NIST SP 800-53 Rev. 5, AU (Audit and Accountability) control family. Establishes what auditable events must be recorded and what integrity properties those records must have.
2. NIST SP 800-92, Guide to Computer Security Log Management. Context for record completeness and integrity requirements.
3. The epistemological distinction between assertion and evidence, and the framework for evaluating provability in AI systems, is an Auditome design position. The underlying distinction is foundational in audit theory.
4. NIST AI RMF (AI 100-1), Artificial Intelligence Risk Management Framework. Context for AI governance evaluation practices.

Audit Record

article_id: A-02
version: 1.0
status: approved
review_date: 2026-06-08
claims_reviewed: 17
unsupported_claims: 0
sha256: f4c4c12c5a77b9ca…

This article passed ASE review, claim validation, and evidence review before publication. Claims are dispositioned as supported by cited literature, Auditome design positions, or verifiable logical consequences of stated definitions.
