Storage Layout
All run artifacts are written under a single runs/ directory (configurable via --runs-dir). Each run produces a top-level run_NNN.json file and a run_NNN/ subdirectory:
run_NNN.json at the top level is the full session object — it includes the complete event log, final status, and an embedded copy of the diagnosis. The run_NNN/ subdirectory holds the two canonical artifacts that the dashboard reads directly: session.json (raw evidence from the OpenClaw plugin) and diagnosis.json (the structured output of critiqor finalize).
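As a rough illustration of the layout described above, the following sketch builds the three artifact paths for a given run. The helper name `run_paths` and the dictionary keys are hypothetical and not part of Critiqor; only the file and directory names come from this page.

```python
from pathlib import Path


def run_paths(runs_dir: str, run_id: str) -> dict[str, Path]:
    """Return the artifact paths for one run (hypothetical helper).

    Only the file names are taken from the documentation; the function
    and the dictionary keys are illustrative.
    """
    root = Path(runs_dir)
    return {
        # Full session object: event log, final status, embedded diagnosis.
        "record": root / f"{run_id}.json",
        # Raw evidence written by the OpenClaw plugin.
        "session": root / run_id / "session.json",
        # Structured output of critiqor finalize.
        "diagnosis": root / run_id / "diagnosis.json",
    }


if __name__ == "__main__":
    for name, path in run_paths("runs", "run_001").items():
        print(f"{name}: {path.as_posix()}")
```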
Session File (session.json)
runs/<run_id>/session.json is written by the OpenClaw plugin as events occur during agent execution. Its top-level structure is:
| Field | Description |
|---|---|
| schema_version | Schema identifier (critiqor.session.v1) |
| run_id | The run this evidence belongs to |
| events[] | Ordered list of runtime events (tool calls, outputs, memory events, retries, errors, state transitions) |
| metrics{} | Aggregate counters — total events broken down by event type and source layer |
session.json is never modified. The original plugin evidence is preserved exactly as captured, even if you re-run finalization with updated logic. This means your raw observations are always auditable and reproducible.
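A minimal sketch of reading session.json in a script, assuming only the four top-level fields listed in the table above. The function names are hypothetical, and the code makes no assumption about the shape of individual events or of the metrics object.

```python
import json
from pathlib import Path

EXPECTED_SCHEMA = "critiqor.session.v1"


def load_session(path: str) -> dict:
    """Load session.json read-only and check its schema identifier."""
    session = json.loads(Path(path).read_text(encoding="utf-8"))
    found = session.get("schema_version")
    if found != EXPECTED_SCHEMA:
        raise ValueError(f"unexpected schema_version: {found!r}")
    return session


def summarize_session(session: dict) -> dict:
    """Summarize the top-level fields without interpreting event contents."""
    return {
        "run_id": session.get("run_id"),
        "event_count": len(session.get("events", [])),
        "metrics": session.get("metrics", {}),
    }
```

The file is only ever opened for reading here, which matches the guarantee that the captured evidence is never modified.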
Diagnosis File (diagnosis.json)
runs/<run_id>/diagnosis.json is written by critiqor finalize after processing session.json. It contains the full structured output of the diagnosis engine:
| Field | Description |
|---|---|
| run_id | The run this diagnosis belongs to |
| executive_summary | Trust score, readiness level, primary diagnosis, recommended action |
| primary_diagnosis | Highest-impact failure cause: type, severity, causal chain, description |
| failure_causes | All detected failure patterns with severity and evidence links |
| causal_graph | Event nodes and causal edges (causes / precedes relationships) |
| evidence_panel | Full audit panel: tool calls, outputs, memory events, retries, errors |
| cost_analysis | Token and call cost breakdown for the session |
While session.json is immutable, diagnosis.json can be regenerated at any time by re-running the finalization logic against the original evidence, without re-running the agent session itself.
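When consuming diagnosis.json from a script, it can help to confirm that the documented top-level fields are present before reading them. The sketch below assumes only the field names from the table above; the helper name is hypothetical, and the nested structure of each field is deliberately left uninspected because this page does not specify it.

```python
import json
from pathlib import Path

# Top-level fields as listed in the table above.
DIAGNOSIS_FIELDS = (
    "run_id",
    "executive_summary",
    "primary_diagnosis",
    "failure_causes",
    "causal_graph",
    "evidence_panel",
    "cost_analysis",
)


def missing_diagnosis_fields(diagnosis: dict) -> list[str]:
    """Return the documented top-level fields absent from a diagnosis."""
    return [name for name in DIAGNOSIS_FIELDS if name not in diagnosis]


def load_diagnosis(path: str) -> dict:
    """Load diagnosis.json and fail loudly if a documented field is missing."""
    diagnosis = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = missing_diagnosis_fields(diagnosis)
    if missing:
        raise ValueError(f"diagnosis is missing fields: {missing}")
    return diagnosis
```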
Run IDs
Run IDs follow the format run_001, run_002, etc.: sequential integers zero-padded to three digits. The next run ID is determined at session creation by scanning runs/run_*.json for the highest existing number and incrementing it by one. The sequence is therefore monotonically increasing within a given runs/ directory, so run IDs also give a chronological ordering.
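The scan-and-increment rule described above can be sketched as follows. This is an illustration rather than Critiqor's actual implementation: the function name is hypothetical, and starting at run_001 for an empty directory is an assumption based on the example IDs.

```python
import re
from pathlib import Path

# Matches top-level run records such as run_007.json, capturing the number.
_RUN_FILE = re.compile(r"^run_(\d+)\.json$")


def next_run_id(runs_dir: str) -> str:
    """Return the next run ID by scanning runs_dir for run_*.json files."""
    highest = 0  # Assumption: an empty directory yields run_001.
    for path in Path(runs_dir).glob("run_*.json"):
        match = _RUN_FILE.match(path.name)
        if match:
            highest = max(highest, int(match.group(1)))
    # Zero-pad to three digits; larger numbers simply use more digits.
    return f"run_{highest + 1:03d}"
```

Taking the maximum rather than counting files keeps the sequence increasing even when earlier run files have been deleted.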
While a session is active, runs/active_session.json tracks the current run ID and status (MONITORING or FINALIZING). This file is removed once finalization completes, so its presence always means a session is in progress.
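Because the presence of runs/active_session.json signals a session in progress, a script can test for it before starting other work. The sketch below is a hypothetical helper: it returns the parsed file, or None when no session is active, and makes no assumption about the key names stored inside the file.

```python
import json
from pathlib import Path
from typing import Optional


def active_session(runs_dir: str) -> Optional[dict]:
    """Return the parsed active_session.json, or None if no session is active."""
    path = Path(runs_dir) / "active_session.json"
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        # The file is removed once finalization completes.
        return None
```

Catching FileNotFoundError, instead of checking for existence first, avoids a race if finalization removes the file between the check and the read.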
Listing Runs
Use critiqor runs to list completed runs. See the critiqor runs reference for all available flags.
This command reads every runs/run_*/diagnosis.json and prints a one-line summary for each completed run, ordered most-recent first:
- Trust score — the 0–100 reliability score assigned by the diagnosis engine
- Tool call count — total tool calls observed in the session
- Hallucination Risk — derived from trust score: Low (>=75), Review (60–74), High (less than 60)
- Primary issue — the top failure type detected, or No Major Issue if none
A run appears in this list only after its diagnosis.json has been written by critiqor finalize.
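The Hallucination Risk thresholds listed above map directly onto a small function. The function name is hypothetical; the band boundaries are the ones stated in the list.

```python
def hallucination_risk(trust_score: float) -> str:
    """Map a 0-100 trust score to the risk label used in the run listing."""
    if trust_score >= 75:
        return "Low"
    if trust_score >= 60:
        return "Review"
    return "High"


if __name__ == "__main__":
    for score in (92, 75, 74, 60, 59):
        print(score, hallucination_risk(score))
```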
To use a non-default runs directory, pass --runs-dir to critiqor runs.
Switching Between Runs
Use critiqor dashboard to open a specific run in the dashboard. See the critiqor dashboard reference for all available flags.
If a dashboard server is already running for the same runs/ directory, Critiqor detects it via the runs/.critiqor_dashboard/server.json record and navigates directly to the requested run — no server restart required. If no server is running, a new one is started.
A non-default runs directory (via --runs-dir) or port can also be specified when invoking critiqor dashboard.
Run History in the Dashboard
The Run History section of the dashboard lists all completed runs alongside their trust scores. Clicking any entry loads that run’s diagnosis without restarting the server. Runs are listed in run_id order, which is also chronological, so you can track reliability trends across successive sessions.
Working With Run Artifacts Directly
Because all run artifacts are plain JSON files, you can inspect them without the dashboard. The runs/<run_id>.json top-level record also contains status (COMPLETED, ABORTED, etc.), timestamps.created_at, timestamps.finalized_at, and the full event_log, which makes it useful for scripting or CI integrations.
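For scripting or CI use, a sketch along these lines could pull the fields mentioned above out of the top-level run record. It assumes that timestamps.created_at and timestamps.finalized_at denote keys nested under a timestamps object and that event_log is a list; the function name and the returned keys are hypothetical.

```python
import json
from pathlib import Path


def run_record_summary(runs_dir: str, run_id: str) -> dict:
    """Read runs/<run_id>.json and return a few fields useful in CI."""
    record_path = Path(runs_dir) / f"{run_id}.json"
    record = json.loads(record_path.read_text(encoding="utf-8"))
    # Assumption: timestamps is a nested object holding both times.
    timestamps = record.get("timestamps", {})
    return {
        "status": record.get("status"),
        "created_at": timestamps.get("created_at"),
        "finalized_at": timestamps.get("finalized_at"),
        # Assumption: event_log is a JSON array of events.
        "event_count": len(record.get("event_log", [])),
    }
```

A CI job could, for example, fail the build whenever the returned status is not COMPLETED.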