Testing
An agent has three things worth testing: the tools it calls, the prompts it loads, and the flows it runs when hooks fire. The model is not one of them — it's nondeterministic and not yours to test.
This is the key mental shift. You are not testing "does the agent say the right thing." You are testing "does the agent call the right tool, does the prompt render correctly, does a message trigger a reply." Those are deterministic. They have correct answers.
What to test
Tools — your tool functions are plain TypeScript. Test them directly via
axon.tools.*. Assert on return shapes and guard conditions. Don't test the external
services they call — test the logic you control.
Prompts — render them with axon.prompt() and assert on content. Catches template
regressions: broken interpolation, missing conditional sections, stale variable names.
Hook flows — fire hooks via axon.hooks.callHook() and assert on the outcome. This
is the primary integration test: the module emits an event, your plugin handles it, the
agent runs, a reply comes back.
The harness
Axon() from @axon/test boots your full agent runtime against your actual
axon.config.ts. Same capsule, same tools, same prompts, same hooks. The only thing you
swap is the engine.
import { Axon } from "@axon/test"
import { Mock } from "@axon/engines"
const { axon, stop } = await Axon({ config: { engine: Mock() } })
Mock() replaces the inference step with a deterministic local responder. The full agent
loop still runs — tool calls execute, threads accumulate context, stop conditions fire.
Everything behaves as in production, just without a model call.
The two contexts
Testing an agent — you own the axon.config.ts and the full agent source. Boot
with Axon(), test your tools, prompts, and scripts directly, fire hooks and assert on
replies. Your test files live in tests/ at the agent root.
Testing a module — modules have no standalone runtime. You run your tests from inside an agent that has the module installed. The harness is the same; what you're asserting on is different: that your tool namespace appeared, your prompts are renderable, your hooks trigger the right behavior in the host agent.
Where to go
- Agent tests — full reference for testing agents
- Module tests — full reference for testing modules