Run Management: Store, Review, and Compare Agent Runs

Every time Critiqor observes an OpenClaw agent session, it writes a permanent record — the run. Runs are the foundation of Critiqor’s reliability tracking: they store the raw event evidence your agent produced, the structured diagnosis generated from that evidence, and the metadata needed to revisit and compare behaviour over time. Once a run is finalized it is never modified, so your audit trail stays intact even as Critiqor’s diagnosis logic evolves.

Storage Layout

All run artifacts are written under a single runs/ directory (configurable via --runs-dir). The layout is:

runs/
  active_session.json        # present only while monitoring is active
  run_001.json               # complete session record (status, events, diagnosis)
  run_002.json
  run_001/
    session.json             # raw plugin evidence (events[], metrics{})
    diagnosis.json           # structured diagnosis (trust score, failures, causal graph)
  run_002/
    session.json
    diagnosis.json
  .critiqor_dashboard/
    server.json              # dashboard server record (host, port, pid)

Each run_NNN.json at the top level is the full session object — it includes the complete event log, final status, and an embedded copy of the diagnosis. The run_NNN/ subdirectory holds the two canonical artifacts that the dashboard reads directly: session.json (raw evidence from the OpenClaw plugin) and diagnosis.json (the structured output of critiqor finalize).
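The layout above can be walked with a few lines of Python. This is a minimal sketch, not part of Critiqor itself: the helper name list_run_artifacts is hypothetical, and it relies only on the file and directory names shown in the layout.

```python
import re
from pathlib import Path

# Matches top-level run records such as run_001.json (and nothing else).
RUN_RECORD = re.compile(r"^run_(\d+)\.json$")


def list_run_artifacts(runs_dir):
    """Map each run ID to its top-level record and the artifacts present.

    Hypothetical helper: it only inspects the directory layout described
    above and does not parse any file contents.
    """
    root = Path(runs_dir)
    runs = {}
    for path in sorted(root.glob("run_*.json")):
        if not RUN_RECORD.match(path.name):
            continue
        run_id = path.stem  # e.g. "run_001"
        run_subdir = root / run_id
        runs[run_id] = {
            "record": path,
            "session": (run_subdir / "session.json").is_file(),
            "diagnosis": (run_subdir / "diagnosis.json").is_file(),
        }
    return runs
```

A run whose "diagnosis" entry is False has evidence on disk but no diagnosis.json yet.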

Session File (session.json)

runs/<run_id>/session.json is written by the OpenClaw plugin as events occur during agent execution. Its top-level structure is:

Field	Description
`schema_version`	Schema identifier (`critiqor.session.v1`)
`run_id`	The run this evidence belongs to
`events[]`	Ordered list of runtime events (tool calls, outputs, memory events, retries, errors, state transitions)
`metrics{}`	Aggregate counters — total events broken down by event type and source layer

Once the monitoring session ends, session.json is never modified. The original plugin evidence is preserved exactly as captured, even if you re-run finalization with updated logic. This means your raw observations are always auditable and reproducible.
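Because the evidence file is plain JSON, you can tally its events yourself. The sketch below assumes each entry in events[] is an object carrying its event type under a key named "type"; that key name is an assumption, so check it against a real session.json and pass a different type_key if needed.

```python
import json
from collections import Counter
from pathlib import Path


def count_events_by_type(session_path, type_key="type"):
    """Count the events in a session.json file, grouped by event type.

    Assumption: each event is a JSON object whose type is stored under
    `type_key`. Events without that key are counted as "unknown".
    """
    session = json.loads(Path(session_path).read_text(encoding="utf-8"))
    events = session.get("events", [])
    return Counter(event.get(type_key, "unknown") for event in events)
```

The function only reads the file, so it is compatible with the guarantee that session.json is never modified after the session ends.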

Diagnosis File (diagnosis.json)

runs/<run_id>/diagnosis.json is written by critiqor finalize after processing session.json. It contains the full structured output of the diagnosis engine:

Field	Description
`run_id`	The run this diagnosis belongs to
`executive_summary`	Trust score, readiness level, primary diagnosis, recommended action
`primary_diagnosis`	Highest-impact failure cause: type, severity, causal chain, description
`failure_causes`	All detected failure patterns with severity and evidence links
`causal_graph`	Event nodes and causal edges (causes / precedes relationships)
`evidence_panel`	Full audit panel: tool calls, outputs, memory events, retries, errors
`cost_analysis`	Token and call cost breakdown for the session

Because session.json is immutable, diagnosis.json can be regenerated at any time by re-running the finalization logic against the original evidence — without re-running the agent session itself.
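If you script against diagnosis.json, a quick structural check can catch a partial or outdated file. This sketch uses only the top-level field names from the table above; the helper name missing_diagnosis_fields is hypothetical, and nothing is assumed about what each field contains.

```python
import json
from pathlib import Path

# Top-level fields documented for diagnosis.json.
EXPECTED_FIELDS = (
    "run_id",
    "executive_summary",
    "primary_diagnosis",
    "failure_causes",
    "causal_graph",
    "evidence_panel",
    "cost_analysis",
)


def missing_diagnosis_fields(diagnosis_path):
    """Return the documented top-level fields absent from a diagnosis file."""
    diagnosis = json.loads(Path(diagnosis_path).read_text(encoding="utf-8"))
    return [field for field in EXPECTED_FIELDS if field not in diagnosis]
```

An empty list means every documented field is present.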

Run IDs

Run IDs follow the format run_001, run_002, and so on: sequential integers zero-padded to three digits. The next run ID is determined at session creation by scanning runs/run_*.json for the highest existing number and incrementing it by one. The sequence is guaranteed to be monotonically increasing within a given runs/ directory, so run ID order is also chronological order.

While a session is active, runs/active_session.json tracks the current run ID and status (MONITORING or FINALIZING). This file is removed once finalization completes, so its presence always means a session is in progress.
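The ID allocation rule (scan runs/run_*.json, take the highest number, add one) can be sketched as follows. This illustrates the documented rule and is not Critiqor's actual implementation; the function name next_run_id is hypothetical.

```python
import re
from pathlib import Path

# Only files named exactly run_<digits>.json count toward the sequence.
RUN_FILE = re.compile(r"^run_(\d+)\.json$")


def next_run_id(runs_dir):
    """Return the next run ID: highest existing number plus one, zero-padded."""
    highest = 0
    for path in Path(runs_dir).glob("run_*.json"):
        match = RUN_FILE.match(path.name)
        if match:
            highest = max(highest, int(match.group(1)))
    return f"run_{highest + 1:03d}"
```

Because the rule takes the maximum rather than filling gaps, a missing earlier record does not cause an ID to be reused while a higher-numbered record still exists.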

Listing Runs

critiqor runs

See the CLI reference for critiqor runs for all available flags. This command reads every runs/run_*/diagnosis.json and prints a one-line summary for each completed run, ordered most-recent first:

Available Runs

run_002 | Trust: 47 | 31 Tool Calls | Hallucination Risk: High | Infinite Tool Loop
run_001 | Trust: 82 | 14 Tool Calls | Hallucination Risk: Low | No Major Issue

Each line shows:

Trust score: the 0–100 reliability score assigned by the diagnosis engine
Tool call count: total tool calls observed in the session
Hallucination Risk: derived from the trust score as Low (75 or above), Review (60–74), or High (below 60)
Primary issue: the top failure type detected, or No Major Issue if none
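The Hallucination Risk thresholds translate directly into code. The sketch below mirrors the documented bands; the function name hallucination_risk is hypothetical.

```python
def hallucination_risk(trust_score):
    """Map a 0-100 trust score to the documented Hallucination Risk label."""
    if trust_score >= 75:
        return "Low"
    if trust_score >= 60:
        return "Review"
    return "High"
```

With the sample output above, a trust score of 82 maps to Low and a score of 47 maps to High.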

If no completed runs are found, Critiqor will remind you to run critiqor finalize. To use a non-default runs directory:

critiqor runs --runs-dir /path/to/runs

Switching Between Runs

# View the latest finalized run (default)
critiqor dashboard

# View a specific run by ID
critiqor dashboard run_001

See the CLI reference for critiqor dashboard for all available flags. If a dashboard server is already running for the same runs/ directory, Critiqor detects it via the runs/.critiqor_dashboard/server.json record and navigates directly to the requested run — no server restart required. If no server is running, a new one is started. To specify a non-default runs directory or port:

critiqor dashboard run_001 --runs /path/to/runs --port 3000
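If you want to find a running dashboard from a script, the server record can be read directly. This is a sketch under assumptions: it treats host and port as the key names in server.json (the layout lists host, port, and pid) and assumes the dashboard is served over plain HTTP. It does not verify that the recorded pid is still alive.

```python
import json
from pathlib import Path


def dashboard_url(runs_dir):
    """Return the dashboard base URL from the server record, or None.

    Assumptions: server.json stores its address under "host" and "port",
    and the dashboard is reachable over http.
    """
    record_path = Path(runs_dir) / ".critiqor_dashboard" / "server.json"
    if not record_path.is_file():
        return None
    record = json.loads(record_path.read_text(encoding="utf-8"))
    return f"http://{record['host']}:{record['port']}"
```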

Run History in the Dashboard

The Run History section of the dashboard lists all completed runs alongside their trust scores. Clicking any entry loads that run’s diagnosis without restarting the server. Runs are listed in run_id order, which is also chronological, so you can track reliability trends across successive sessions.

Working With Run Artifacts Directly

Because all run artifacts are plain JSON files, you can inspect them without the dashboard:

# View the raw event evidence for run_003
cat runs/run_003/session.json

# View the structured diagnosis
cat runs/run_003/diagnosis.json

# Check the top-level session record (status, timestamps, embedded diagnosis)
cat runs/run_003.json

The runs/<run_id>.json top-level record also contains status (COMPLETED, ABORTED, etc.), timestamps.created_at, timestamps.finalized_at, and the full event_log — useful for scripting or CI integrations.
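For scripting or CI, the fields named above can be pulled into a small summary. This is a minimal sketch: the helper name summarize_run_record is hypothetical, and it assumes event_log is a JSON array.

```python
import json
from pathlib import Path


def summarize_run_record(runs_dir, run_id):
    """Extract status, timestamps, and event count from runs/<run_id>.json.

    Assumption: event_log is a list; missing fields are reported as None
    (or 0 for the event count) rather than raising.
    """
    record_path = Path(runs_dir) / f"{run_id}.json"
    record = json.loads(record_path.read_text(encoding="utf-8"))
    timestamps = record.get("timestamps", {})
    return {
        "run_id": run_id,
        "status": record.get("status"),
        "created_at": timestamps.get("created_at"),
        "finalized_at": timestamps.get("finalized_at"),
        "event_count": len(record.get("event_log", [])),
    }
```

A CI step could, for example, fail the build when the returned status is anything other than COMPLETED.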
