How It Works
Architecture
Axiomatic is composed of three layers:
CLI (user interface)
└── Core (agent loop, tools, caching)
    └── LLM Provider (Anthropic, OpenAI)

The CLI parses commands, loads configuration, discovers test files, and renders output. It delegates all analysis work to the Core.
The Core orchestrates the agent loop. For each test, it constructs a system prompt from the condition, provides the agent with tools, and manages the conversation until the agent submits a verdict.
The LLM Provider layer handles API communication with the configured provider (Anthropic or OpenAI), including authentication, request formatting, and response parsing.
The Agent Loop
When Axiomatic runs a test, the following sequence occurs:
- System prompt -- the core builds a system prompt that includes the condition text, the file scope (on glob), and instructions for how to explore the codebase and submit findings.
- Tool use -- the agent calls tools to read files, search for patterns, and navigate the project structure. Each tool call returns results that the agent uses to build its understanding.
- Iteration -- the agent continues calling tools and reasoning until it has gathered enough evidence. Most tests complete in 3 to 10 tool-call rounds.
- Verdict -- the agent calls submit_verdict with its conclusion: pass or fail, a confidence score, and a list of violations (if any) with file paths and line numbers.
You can watch the agent's reasoning in real time with axm run --verbose.
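The sequence above can be sketched as a small loop. This is a minimal illustration, not Axiomatic's implementation: `run_agent_loop`, `ToolCall`, `Verdict`, and the scripted model are hypothetical names, and a real run would ask the LLM provider for each next call instead of replaying a fixed script.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Verdict:
    passed: bool
    confidence: float
    violations: list = field(default_factory=list)


def run_agent_loop(next_call, tools, max_rounds=10):
    """Execute tool calls until the agent submits a verdict."""
    history = []  # (call, result) pairs handed back to the model each round
    for _ in range(max_rounds):
        call = next_call(history)
        if call.name == "submit_verdict":
            return Verdict(**call.args)
        result = tools[call.name](**call.args)
        history.append((call, result))
    raise RuntimeError("agent did not submit a verdict")


# Hypothetical stand-in for the LLM: replays a fixed script of tool calls.
def scripted_model(history):
    script = [
        ToolCall("glob", {"pattern": "src/db/**/*.ts"}),
        ToolCall("read_file", {"path": "src/db/users.ts"}),
        ToolCall("submit_verdict", {"passed": True, "confidence": 0.9}),
    ]
    return script[len(history)]


# Hypothetical tool implementations returning canned results.
fake_tools = {
    "glob": lambda pattern: ["src/db/users.ts"],
    "read_file": lambda path: "db.query('SELECT * FROM users WHERE id = $1', [id])",
}

verdict = run_agent_loop(scripted_model, fake_tools)
print(verdict)
```

The `max_rounds` cap is an assumption added so the sketch always terminates; the source only says that most tests finish in 3 to 10 rounds.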
Agent Tools
The agent has access to these sandboxed, read-only tools during analysis:
| Tool | Description |
|---|---|
| read_file | Read the contents of a file by path |
| glob | Find files matching a glob pattern |
| grep | Search file contents using regex patterns |
| list_dir | List the contents of a directory |
| tree | Show a recursive directory tree |
| submit_verdict | Submit the final pass/fail verdict with evidence |
The agent cannot modify files -- it has read-only access. It uses these tools strategically: typically starting with glob or tree to understand the project structure, then grep to find patterns, and read_file to examine specific code in detail.
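To make "sandboxed, read-only" concrete, here is a rough sketch of how three of these tools could be confined to a project root. The `ReadOnlyTools` class and its path check are assumptions for illustration; the source does not describe how Axiomatic enforces its sandbox.

```python
import re
import tempfile
from pathlib import Path


class ReadOnlyTools:
    """Hypothetical read-only tool set confined to one project root."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _resolve(self, relative):
        # Reject any path that escapes the project root (for example "../x").
        path = (self.root / relative).resolve()
        if not path.is_relative_to(self.root):  # requires Python 3.9+
            raise PermissionError(f"{relative} is outside the project root")
        return path

    def read_file(self, path):
        return self._resolve(path).read_text(encoding="utf-8")

    def glob(self, pattern):
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in self.root.glob(pattern)
            if p.is_file()
        )

    def grep(self, regex, pattern="**/*"):
        compiled = re.compile(regex)
        hits = []
        for rel in self.glob(pattern):
            lines = self.read_file(rel).splitlines()
            for number, line in enumerate(lines, start=1):
                if compiled.search(line):
                    hits.append((rel, number, line.strip()))
        return hits


# Build a tiny throwaway project to exercise the tools.
project = Path(tempfile.mkdtemp())
(project / "src" / "api").mkdir(parents=True)
(project / "src" / "api" / "users.ts").write_text(
    "const input = req.body;\neval(input);\n", encoding="utf-8"
)

tools = ReadOnlyTools(project)
print(tools.glob("src/**/*.ts"))
print(tools.grep(r"\beval\("))
```

The class exposes no write methods, so the only way to touch the file system is through reads that have passed the root check.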
How the on Field Guides the Agent
The on globs serve as entry points. The agent is told to start its investigation with these files, but it can read any file in the repository from there. This means a test scoped to src/api/**/*.ts can still follow imports to check how a utility function in src/lib/auth.ts works.
Agent Memory
The agent maintains persistent notes about your codebase across runs. This is the key to Axiomatic's cost efficiency.
How It Works
First run: The agent thoroughly explores your codebase, reading files and documenting architectural patterns, package structure, conventions, and key file locations.
Subsequent runs (no changes): The agent retrieves cached notes before starting. Since relevant files have not changed, it bypasses exploration and focuses directly on evaluation. These runs are significantly cheaper.
Subsequent runs (some changes): When files change, notes that reference those files are automatically invalidated. The agent re-explores the affected areas and updates its notes, but reuses notes for unchanged areas.
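One way to picture the invalidation rule is to store, with each note, a content hash of every file it references. The sketch below assumes that design; `Note`, `still_valid`, and `partition_notes` are hypothetical names, and the actual note format in .axiomatic/ is not documented here.

```python
import hashlib
from dataclasses import dataclass


def content_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Note:
    text: str
    file_hashes: dict  # path -> content hash seen when the note was written


def still_valid(note, current_files):
    """A note survives only if every file it references is unchanged."""
    return all(
        path in current_files and content_hash(current_files[path]) == seen
        for path, seen in note.file_hashes.items()
    )


def partition_notes(notes, current_files):
    """Split notes into those to reuse and those to re-explore."""
    fresh = [n for n in notes if still_valid(n, current_files)]
    stale = [n for n in notes if not still_valid(n, current_files)]
    return fresh, stale


files_v1 = {
    "src/lib/auth.ts": "export function check() {}",
    "src/api/users.ts": "import { check } from '../lib/auth'",
}
notes = [
    Note("auth helpers live in src/lib/auth.ts",
         {"src/lib/auth.ts": content_hash(files_v1["src/lib/auth.ts"])}),
    Note("users endpoint calls check()",
         {"src/api/users.ts": content_hash(files_v1["src/api/users.ts"])}),
]

# Second run: only src/api/users.ts has changed.
files_v2 = {**files_v1, "src/api/users.ts": "// rewritten"}
fresh, stale = partition_notes(notes, files_v2)
print([n.text for n in fresh])
print([n.text for n in stale])
```

Under this scheme the agent would re-explore only the areas behind the stale notes and reuse the fresh ones as they are.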
What Gets Recorded
The agent stores two categories of notes:
- Codebase-level observations -- architectural patterns, package structure, conventions, and key file locations
- Per-test observations -- specific file contents, function signatures, and implementation details relevant to individual tests
Storage
Agent memory notes are stored alongside cached results in .axiomatic/. These files are:
- Auto-generated -- you never need to create or edit them manually
- Safe to gitignore -- they are machine-specific and rebuild automatically
Caching
Axiomatic caches test results in a local SQLite database at .axiomatic/cache.db to avoid redundant LLM calls.
Cache Keys
Each cache entry is keyed on:
- A hash of the test condition and configuration (provider, model, severity)
- Content hashes of all files matching the on glob pattern
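As an illustration of the two key components, the sketch below derives a single key from the test definition and the contents of the scoped files. The function name `cache_key`, the use of SHA-256, and the JSON layout are assumptions; the source only states which inputs feed the key.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def cache_key(condition, config, scoped_files):
    """Combine the test definition and scoped file contents into one key.

    scoped_files maps each path matched by the on glob to its bytes.
    """
    test_hash = sha256_hex(
        json.dumps({"condition": condition, "config": config},
                   sort_keys=True).encode("utf-8")
    )
    file_hashes = {path: sha256_hex(content)
                   for path, content in scoped_files.items()}
    payload = json.dumps({"test": test_hash, "files": file_hashes},
                         sort_keys=True)
    return sha256_hex(payload.encode("utf-8"))


condition = "All SQL queries use parameterized queries, never string concatenation."
# Illustrative config values, not a documented schema.
config = {"provider": "anthropic", "model": "claude-opus-4-20250514",
          "severity": "error"}
files = {"src/db/users.ts": b"db.query('SELECT 1')"}

base = cache_key(condition, config, files)
same = cache_key(condition, dict(config), dict(files))
edited = cache_key(condition, config, {**files, "src/db/users.ts": b"changed"})
added = cache_key(condition, config, {**files, "src/db/orders.ts": b""})
other_model = cache_key(condition, {**config, "model": "some-other-model"}, files)

print(base == same)
print(len({base, edited, added, other_model}))
```

Because file paths are part of the hashed payload, adding or deleting a matching file changes the key just as editing one does.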
Invalidation
The cache is automatically invalidated when:
- The condition text changes
- Any file matching the on glob is modified, added, or deleted
- The provider or model configuration changes
You can manually bypass the cache:
axm run --no-cache

Storage
The cache database is lightweight (typically under 1 MB) and should be added to .gitignore. The axm init command does this automatically.
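A lookup against a SQLite-backed cache might look roughly like the following. The table name, columns, and `get_or_run` helper are hypothetical, and the sketch uses an in-memory database rather than .axiomatic/cache.db so that it leaves no files behind.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "key TEXT PRIMARY KEY, verdict TEXT NOT NULL)"
    )
    return db


def get_or_run(db, key, run_test, use_cache=True):
    """Return (verdict, served_from_cache)."""
    if use_cache:
        row = db.execute(
            "SELECT verdict FROM results WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0]), True
    verdict = run_test()  # the expensive LLM-backed analysis
    db.execute(
        "INSERT OR REPLACE INTO results (key, verdict) VALUES (?, ?)",
        (key, json.dumps(verdict)),
    )
    db.commit()
    return verdict, False


calls = []


def fake_test():
    # Stand-in for a real test run; records how often it was invoked.
    calls.append(1)
    return {"passed": True}


db = open_cache()
first = get_or_run(db, "abc", fake_test)
second = get_or_run(db, "abc", fake_test)
forced = get_or_run(db, "abc", fake_test, use_cache=False)
print(first, second, forced, len(calls))
```

Passing `use_cache=False` models the effect of a cache bypass: the stored result is ignored, the test runs again, and the fresh verdict overwrites the old entry.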
Cost Management
LLM API calls have associated costs. Axiomatic provides several mechanisms to manage spending.
Token Budgets
Each test run has an implicit token budget based on the model. If the agent approaches the budget limit, it submits a verdict with the evidence gathered so far rather than continuing exploration.
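The budget behavior can be sketched as a loop that stops exploring once the next step would eat into a reserve kept for the final verdict. The `reserve` parameter and the per-step token costs are assumptions made for this example; the source does not say how the limit is computed.

```python
def explore_with_budget(steps, budget, reserve):
    """Consume exploration steps until the next one would exceed the budget.

    steps yields (evidence, token_cost) pairs. Returns the evidence gathered,
    the tokens used, and whether exploration was cut short.
    """
    used = 0
    evidence = []
    for item, cost in steps:
        if used + cost > budget - reserve:
            return evidence, used, True
        used += cost
        evidence.append(item)
    return evidence, used, False


steps = [("read src/db/users.ts", 400),
         ("grep for query(", 400),
         ("read src/db/orders.ts", 400)]

gathered, used, cut_short = explore_with_budget(steps, budget=1000, reserve=200)
print(gathered, used, cut_short)
```

In this run the third step is skipped, so the verdict would be based on the first two pieces of evidence only.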
Model Selection
Choose models based on the importance of the test:
| Model | Cost per test | Best for |
|---|---|---|
| Claude Haiku | $0.01--0.05 | Most tests, fast iteration |
| Claude Sonnet | $0.05--0.20 | Standard accuracy tests |
| Claude Opus | $0.20+ | Critical security and architecture audits |
You can set model overrides per test:
# Use the best model for security-critical tests
condition: >
  All SQL queries use parameterized queries, never string concatenation.
on:
  - "src/db/**/*.ts"
severity: error
model: claude-opus-4-20250514

Caching
The cache is the most effective cost control. Tests that pass against unchanged code are served from cache at zero cost. Keep caching enabled and use reasonable TTLs.
Scoping with on
Narrow the on glob to limit how many files the agent needs to examine. Scoping to src/api/**/*.ts is cheaper than scanning the entire src/ tree.
Axiomatic vs. Other Tools
Axiomatic fills a specific gap in the testing ecosystem.
Linters (ESLint, golangci-lint, Pylint)
Linters operate on syntactic patterns within individual files. They are fast and catch surface-level issues, but cannot reason about behavior across files. A linter can flag every eval() call, but it cannot confirm that user input is always sanitized before it reaches eval() when the sanitization happens in another module. Axiomatic is designed to check that kind of cross-file property.
Semgrep
Semgrep performs structural AST pattern matching -- syntactically intelligent text searching. Axiomatic operates at a higher abstraction level, reasoning about behavioral intent regardless of implementation variations.
ArchUnit / ArchUnitNET
ArchUnit is the closest comparable tool, but it is limited to Java/C# and to import graph rules. Axiomatic works with any language and handles broader behavioral and security properties described in plain English.
Unit Tests
Unit tests validate individual function behavior; Axiomatic verifies cross-cutting properties spanning packages and files. They are complementary.
Code Review
Axiomatic automates the repeatable, mechanical aspects of code review -- verifying consistent application of known architectural invariants. Human review remains essential for nuanced design decisions.
When to Use Which
| Property | Best tool |
|---|---|
| Code formatting | prettier, gofmt, black |
| Known code patterns | Semgrep, linter rules |
| Type correctness | Compiler / type checker |
| Single function behavior | Unit tests |
| Import graph rules (Java/C#) | ArchUnit |
| Cross-cutting behavioral properties | Axiomatic |