tests/
Put test files in tests/. They are standard Bun test files: bun:test provides the runner
and Axon() from @axon/test provides the harness. The harness boots the complete agent runtime
against your actual axon.config.ts, so tests exercise your real configuration rather than a stub.
tests/
├── tools.test.ts
├── scripts.test.ts
└── behavior.test.ts
The harness
Axon() boots a full agent runtime scoped to a test session. It auto-discovers your
axon.config.ts by walking up from the current working directory — no path needed.
Destructure what you need. Call stop() when the suite finishes — the harness warns on
process exit if you forget.
import { describe, it, expect, afterAll } from "bun:test"
import { Axon } from "@axon/test"
import { Mock } from "@axon/engines"
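// Boot once at module scope. `await` is a syntax error inside a non-async
// describe() callback, so the harness starts before the suite is declared.
const { axon, bus, stop } = await Axon({ config: { engine: Mock() } })
describe("my agent", () => {
  afterAll(() => stop())
  it("responds to a request", async () => {
    const result = await axon.request("hello")
    expect(result.entries.length).toBeGreaterThan(0)
  })
})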
axon is the same handle available in scripts and routes. Same API, no special test mode.
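The auto-discovery described above can be pictured as a walk from the starting directory toward the filesystem root. The sketch below is an illustration under assumptions, not Axon's actual lookup code: findConfigUp and the injected exists predicate are hypothetical names, and paths are treated as POSIX-style strings so the example has no dependencies.

```typescript
// Hypothetical sketch of "walk up from cwd until axon.config.ts is found".
// `exists` is injected so the function stays pure; real code might pass
// a wrapper around a filesystem check instead.
function findConfigUp(
  startDir: string,
  exists: (path: string) => boolean,
  fileName = "axon.config.ts",
): string | null {
  // Split into segments, dropping empty ones from leading or doubled slashes.
  const segments = startDir.split("/").filter((s) => s.length > 0)
  // Try the deepest directory first, then each parent, ending at the root.
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = "/" + segments.slice(0, depth).join("/")
    const candidate = dir === "/" ? `/${fileName}` : `${dir}/${fileName}`
    if (exists(candidate)) return candidate
  }
  return null
}

// Usage: a fake filesystem containing a single config file.
const files = new Set(["/home/me/agent/axon.config.ts"])
const found = findConfigUp("/home/me/agent/tests/unit", (p) => files.has(p))
```

Running a test from a nested directory such as tests/unit would, under this model, still resolve the config two levels up.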
The Mock engine
Real inference in tests is slow, nondeterministic, and costs money. Mock() replaces the
inference step with a deterministic responder — the full agent loop still runs. Tool calls
execute, stop conditions fire, threads accumulate context. Everything behaves as in
production, just without the model.
Three forms:
Echo — reflects the user message back. Useful for testing routing and tool execution without caring about content:
engine: Mock()
Map — pattern-matched fixed responses. Patterns are matched as case-insensitive substrings against the user message:
import { Mock, air } from "@axon/engines"
engine: Mock({
  "summarise": air.text("Here is the summary."),
  "run tests": air.typescript("tools.ci.runTests()"),
})
Function — full control. Receives the raw engine request, returns an AIR block:
engine: Mock(async (req) => {
  // Use the content of the most recent message, or an empty string if there is none
  const last = req.messages.at(-1)?.content ?? ""
  return air.text(`Echo: ${last}`)
})
Mock responses must be valid AIR blocks, not plain strings. Use the air helpers:
| Helper | Behaviour |
|---|---|
| air.text(content) | Renders as an agent message |
| air.typescript(code) | Executes in the capsule |
| air.done | Terminates the loop with no output |
See Mock engine for the full reference.
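To make the Map form's matching rule concrete, here is a minimal, dependency-free sketch of case-insensitive substring matching. matchPattern is a hypothetical helper, not part of @axon/engines. First-match-wins ordering and the undefined result for no match are both assumptions, since this section does not specify either.

```typescript
// Hypothetical illustration of case-insensitive substring matching.
// Assumption: patterns are tried in insertion order and the first hit wins.
function matchPattern<T>(patterns: Record<string, T>, message: string): T | undefined {
  const haystack = message.toLowerCase()
  for (const [pattern, response] of Object.entries(patterns)) {
    if (haystack.includes(pattern.toLowerCase())) return response
  }
  // Assumption: no match yields undefined; the real Mock may fall back differently.
  return undefined
}

// Plain strings stand in for AIR blocks to keep the sketch self-contained.
const responses = {
  "summarise": "summary-response",
  "run tests": "ci-response",
}
const hit = matchPattern(responses, "Please SUMMARISE this thread")
const miss = matchPattern(responses, "deploy the service")
```

When writing a Map mock, it may be worth keeping patterns distinctive enough that only one can match a given prompt, so the test does not depend on match order.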
Asserting on results
axon.request() returns a result with the full entry log for that call:
const result = await axon.request("run the test suite")
// Text content of the final reply
expect(result.text).toContain("passed")
// All entries (messages, tool calls, tool results, lifecycle events)
expect(result.entries.length).toBeGreaterThan(0)
// Filter to tool calls only
const toolCalls = result.entries.filter(e => e.type === "tool:call")
expect(toolCalls).toHaveLength(1)
expect(toolCalls[0].name).toBe("ci.runTests")
Entry types you'll use most:
| Type | What it is |
|---|---|
| text | Agent message content |
| tool:call | The agent invoking a tool |
| tool:result | The return value from the tool |
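When a test filters entries repeatedly, a small typed helper can keep assertions short. The sketch below declares its own LogEntry union. Only type and name appear in the example above, so the other fields are assumptions for illustration, and toolCallsOf is a hypothetical helper rather than something exported by @axon/test.

```typescript
// Assumed entry shapes, modelled loosely on the three types listed above.
type ToolCall = { type: "tool:call"; name: string }
type LogEntry =
  | { type: "text"; content: string }
  | ToolCall
  | { type: "tool:result"; name: string; value: unknown }

// Narrow an entry log to tool calls only, preserving order.
function toolCallsOf(entries: LogEntry[]): ToolCall[] {
  return entries.filter((e): e is ToolCall => e.type === "tool:call")
}

// A hand-written log standing in for result.entries.
const log: LogEntry[] = [
  { type: "tool:call", name: "ci.runTests" },
  { type: "tool:result", name: "ci.runTests", value: { passed: 12 } },
  { type: "text", content: "All 12 tests passed." },
]
const callNames = toolCallsOf(log).map((c) => c.name)
```

The type predicate lets the compiler know that name exists on every returned element, which avoids casts in the assertions that follow.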
Testing with named threads
Pass thread to scope a request to an isolated conversation history:
const r1 = await axon.request({ prompt: "start a plan", thread: "planning" })
const r2 = await axon.request({ prompt: "add a step", thread: "planning" })
// The agent on "planning" has context from r1 when producing r2
expect(r2.text).not.toBe("")
// A different thread has no context from "planning"
const r3 = await axon.request({ prompt: "what plan?", thread: "other" })
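The isolation described above can be modelled as one history per thread name. This is a conceptual sketch only: ThreadStore is a hypothetical class, and how Axon actually stores thread context is not described in this section.

```typescript
// Conceptual model: each thread name maps to its own message history.
class ThreadStore {
  private histories = new Map<string, string[]>()

  // Append a prompt and return a copy of the context visible on that thread.
  request(thread: string, prompt: string): string[] {
    const threadHistory = this.histories.get(thread) ?? []
    threadHistory.push(prompt)
    this.histories.set(thread, threadHistory)
    return [...threadHistory]
  }
}

const store = new ThreadStore()
store.request("planning", "start a plan")
const planningContext = store.request("planning", "add a step")
const otherContext = store.request("other", "what plan?")
```

In this model the second "planning" request sees both prompts, while the "other" thread sees only its own.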
Observing the event bus
bus gives you access to every event the runtime emits — tool invocations, engine calls,
session lifecycle, and more. Use it when you need to assert on things that aren't in the
result entries.
const { axon, bus, stop } = await Axon({ config: { engine: Mock() } })
await axon.request("deploy the service")
const events = bus.history()
const engineCalls = events.filter(e => e.type === "engine:call")
expect(engineCalls.length).toBeGreaterThan(0)
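If bus.history() returns events in emission order (an assumption; this section does not state the ordering), a test can also check that certain event types occurred in sequence. typesInOrder is a hypothetical helper, and every event type below other than engine:call is invented for illustration.

```typescript
// Minimal event shape; only `type` is taken from the example above.
type BusEvent = { type: string }

// True if `expected` appears as a subsequence of the event types,
// i.e. in the same relative order, with other events allowed in between.
function typesInOrder(events: BusEvent[], expected: string[]): boolean {
  let next = 0
  for (const event of events) {
    if (next < expected.length && event.type === expected[next]) next++
  }
  return next === expected.length
}

// Hand-written history standing in for bus.history().
const recorded: BusEvent[] = [
  { type: "session:start" }, // hypothetical event name
  { type: "engine:call" },
  { type: "tool:call" }, // hypothetical on the bus
  { type: "engine:call" },
]
const ordered = typesInOrder(recorded, ["engine:call", "tool:call"])
const reversed = typesInOrder(recorded, ["tool:call", "session:start"])
```

A subsequence check tolerates unrelated events between the ones you care about, which tends to make bus assertions less brittle than comparing the full history.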
Hermetic sessions
The harness writes session data to a temporary directory, not your agent's data/sessions/.
Tests never pollute your local session history. Each Axon() call gets its own isolated
session.
To point sessions somewhere specific:
const { axon, stop } = await Axon({ sessionsDir: "/tmp/my-test-sessions" })
Config overrides
config merges over your axon.config.ts. Override only what the test needs:
// Swap the engine, leave everything else (tools, prompts, policy) as-is
const { axon, stop } = await Axon({ config: { engine: Mock() } })
// Alternatively, in a separate test file or scope: tighten policy for a specific test
const { axon, stop } = await Axon({
  config: {
    engine: Mock(),
    policy: { maxTurns: 1 },
  },
})
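The text says overrides merge over axon.config.ts but does not say whether nested keys such as policy are merged or replaced wholesale. The sketch below assumes a recursive merge of plain objects purely for illustration. mergeConfig and the maxToolCalls field are hypothetical, and the real harness may behave differently, so check the behaviour before relying on untouched nested keys.

```typescript
type Plain = { [key: string]: unknown }

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Assumed semantics: the override wins for scalars and arrays,
// while nested objects are merged key by key.
function mergeConfig(base: Plain, override: Plain): Plain {
  const out: Plain = { ...base }
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key]
    out[key] = isPlainObject(existing) && isPlainObject(value)
      ? mergeConfig(existing, value)
      : value
  }
  return out
}

// Strings stand in for engine objects to keep the sketch self-contained.
const baseConfig = {
  engine: "real-engine",
  policy: { maxTurns: 10, maxToolCalls: 5 }, // maxToolCalls is hypothetical
}
const merged = mergeConfig(baseConfig, { engine: "mock", policy: { maxTurns: 1 } })
```

Under these assumed semantics the override replaces engine and maxTurns, leaves maxToolCalls in place, and does not mutate the base object.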
Running tests
bun test
bun test tests/tools.test.ts
bun test --watch
Tests run against your local agent. The capsule boots, tools load, and the runtime is available for the duration of the test suite.