Compare commits

...

154 Commits

Author SHA1 Message Date
joshp123
3254bae4ca Build: add runtime build (openclaw#17636) thanks @joshp123 2026-02-15 18:26:25 -08:00
joshp123
77d162fc7f Build: add runtime build 2026-02-15 18:26:25 -08:00
Peter Steinberger
dc9808a674 refactor(gateway): dedupe transcript tail preview 2026-02-16 02:21:39 +00:00
Peter Steinberger
60ad2c2e96 refactor(device-pairing): share token update context 2026-02-16 02:19:53 +00:00
Peter Steinberger
a7cbce1b3d refactor(security): tighten sandbox bind validation 2026-02-16 03:19:50 +01:00
Peter Steinberger
a74251d415 refactor(agents): dedupe fast tool stubs 2026-02-16 02:17:45 +00:00
Peter Steinberger
cbc3de6c97 docs(changelog): fix conflict marker 2026-02-16 03:15:57 +01:00
Peter Steinberger
01b1e350b2 docs: note Control UI XSS fix 2026-02-16 03:15:57 +01:00
Peter Steinberger
3b4096e02e fix(ui): load Control UI bootstrap config via JSON endpoint 2026-02-16 03:15:57 +01:00
Peter Steinberger
adc818db4a fix(gateway): serve Control UI bootstrap config and lock down CSP 2026-02-16 03:15:57 +01:00
Peter Steinberger
568fd337be refactor(web-fetch): dedupe firecrawl fallback 2026-02-16 02:15:02 +00:00
Peter Steinberger
d9ca051a1d refactor(auto-reply): share slash parsing for config/debug 2026-02-16 02:11:12 +00:00
Peter Steinberger
1b6704ef53 docs: update sandbox bind mount guidance 2026-02-16 03:05:16 +01:00
Peter Steinberger
887b209db4 fix(security): harden sandbox docker config validation 2026-02-16 03:04:06 +01:00
Peter Steinberger
d4bdcda324 refactor(nodes-cli): share node.invoke param builder 2026-02-16 02:03:15 +00:00
Peter Steinberger
966957fc66 refactor(nodes-cli): share pending pairing table 2026-02-16 01:58:30 +00:00
Peter Steinberger
555eb3f62c refactor(discord): share member access state 2026-02-16 01:55:40 +00:00
Peter Steinberger
93b9f1ec5f docs(changelog): note prompt path injection hardening 2026-02-16 02:53:40 +01:00
Peter Steinberger
6254e96acf fix(security): harden prompt path sanitization 2026-02-16 02:53:40 +01:00
Peter Steinberger
19f53543d2 refactor(utils): share chunkItems helper 2026-02-16 01:52:30 +00:00
Peter Steinberger
618008b483 refactor(channels): share directory allowFrom parsing 2026-02-16 01:49:16 +00:00
Peter Steinberger
31d1ed351f refactor(channels): dedupe status account bits 2026-02-16 01:47:52 +00:00
Peter Steinberger
22c1210a16 refactor(auto-reply): share directive level resolution 2026-02-16 01:45:51 +00:00
Peter Steinberger
273d70741f refactor(supervisor): share env normalization 2026-02-16 01:41:35 +00:00
Peter Steinberger
07be14c02d refactor(gateway): dedupe chat session abort flow 2026-02-16 01:39:39 +00:00
Peter Steinberger
5b2cb8ba11 refactor(cron): dedupe finished event emit 2026-02-16 01:37:03 +00:00
Peter Steinberger
1d7b2bc9c8 refactor(slack): dedupe slash reply delivery 2026-02-16 01:35:46 +00:00
Peter Steinberger
a881bd41eb refactor(outbound): dedupe plugin outbound context 2026-02-16 01:35:46 +00:00
Onur
cd44a0d01e fix: codex and similar processes keep dying on pty, solved by refactoring process spawning (#14257)
* exec: clean up PTY resources on timeout and exit

* cli: harden resume cleanup and watchdog stalled runs

* cli: productionize PTY and resume reliability paths

* docs: add PTY process supervision architecture plan

* docs: rewrite PTY supervision plan as pre-rewrite baseline

* docs: switch PTY supervision plan to one-go execution

* docs: add one-line root cause to PTY supervision plan

* docs: add OS contracts and test matrix to PTY supervision plan

* docs: define process-supervisor package placement and scope

* docs: tie supervisor plan to existing CI lanes

* docs: place PTY supervisor plan under src/process

* refactor(process): route exec and cli runs through supervisor

* docs(process): refresh PTY supervision plan

* wip

* fix(process): harden supervisor timeout and PTY termination

* fix(process): harden supervisor adapters env and wait handling

* ci: avoid failing formal conformance on comment permissions

* test(ui): fix cron request mock argument typing

* fix(ui): remove leftover conflict marker

* fix: supervise PTY processes (#14257) (openclaw#14257) (thanks @onutc)
2026-02-16 02:32:05 +01:00
Peter Steinberger
a73e7786e7 refactor(cron): share runnable job filter 2026-02-16 01:29:01 +00:00
Peter Steinberger
2679089e9e refactor(cron): dedupe next-run recompute loop 2026-02-16 01:27:40 +00:00
Peter Steinberger
c95a61aa9d refactor(cron): dedupe read-only load flow 2026-02-16 01:26:37 +00:00
Peter Steinberger
73a97ee255 refactor(gateway): share node invoke error handling 2026-02-16 01:25:06 +00:00
Peter Steinberger
b1dca644bc refactor(gateway): share restart request parsing 2026-02-16 01:21:54 +00:00
Peter Steinberger
b743e652c0 refactor(gateway): reuse shared validators + baseHash 2026-02-16 01:19:01 +00:00
Peter Steinberger
71cee673b2 fix(gateway): satisfy server-method lint 2026-02-16 01:15:31 +00:00
Peter Steinberger
dc5d234848 refactor(gateway): share server-method param validation 2026-02-16 01:13:52 +00:00
Peter Steinberger
a5cbd036de refactor(gateway): dedupe wizard param validation 2026-02-16 01:08:36 +00:00
Peter Steinberger
260a514467 refactor(slack): share channel config entry type 2026-02-16 01:06:18 +00:00
Peter Steinberger
067509fa44 refactor(onboarding): dedupe WhatsApp owner allowlist 2026-02-16 01:05:27 +00:00
Peter Steinberger
e84b20a527 refactor(status-issues): share enabled/configured account gate 2026-02-16 01:03:02 +00:00
Peter Steinberger
4aaafe5322 refactor(net): share hostname normalization 2026-02-16 01:01:22 +00:00
Peter Steinberger
d5ee766afe refactor(outbound): dedupe channel handler params 2026-02-16 00:59:42 +00:00
Peter Steinberger
00c91c3678 refactor(outbound): dedupe queued delivery param types 2026-02-16 00:57:28 +00:00
Peter Steinberger
4ab25a2889 refactor(outbound): reuse message gateway call 2026-02-16 00:56:28 +00:00
Advait Paliwal
14fb2c05b1 Gateway/Control UI: preserve partial output on abort (#15026)
* Gateway/Control UI: preserve partial output on abort

* fix: finalize abort partial handling and tests (#15026) (thanks @advaitpaliwal)

---------

Co-authored-by: Tyler Yust <TYTYYUST@YAHOO.COM>
2026-02-15 16:55:28 -08:00
Peter Steinberger
57d5a8df86 refactor(outbound): dedupe directory list call 2026-02-16 00:54:37 +00:00
Peter Steinberger
b6871d9c0f refactor(outbound): dedupe attachment hydration 2026-02-16 00:52:27 +00:00
Peter Steinberger
f03ea76db3 fix(slack): tighten threadId type to satisfy lint 2026-02-16 00:49:00 +00:00
Peter Steinberger
753491ab80 refactor(slack): dedupe outbound send flow 2026-02-16 00:48:32 +00:00
Peter Steinberger
d00adfe98c refactor(signal): dedupe outbound media limit resolve 2026-02-16 00:47:19 +00:00
Peter Steinberger
2b2c3a071b refactor(imessage): dedupe outbound media limit resolve 2026-02-16 00:46:18 +00:00
Peter Steinberger
f8fbeb52b0 refactor(protocol): dedupe cron/config schemas 2026-02-16 00:46:11 +00:00
Peter Steinberger
cb46ea037f refactor(models): dedupe set default model updates 2026-02-16 00:43:15 +00:00
Peter Steinberger
dece9e8b07 refactor(update): share package.json readers 2026-02-16 00:41:28 +00:00
Peter Steinberger
32221e194a refactor(probe): share withTimeout 2026-02-16 00:39:11 +00:00
Peter Steinberger
5ecc364d55 fix(daemon): drop unused formatGatewayServiceDescription import 2026-02-16 00:37:19 +00:00
Peter Steinberger
0dbc51aa55 refactor(daemon): share service description resolve 2026-02-16 00:36:43 +00:00
Peter Steinberger
58cf37ceeb refactor(memory): reuse batch utils in gemini 2026-02-16 00:34:10 +00:00
Peter Steinberger
652318e56a refactor(media): share http error handling 2026-02-16 00:32:16 +00:00
Peter Steinberger
d8691ff4ec refactor(memory): share sync progress helpers 2026-02-16 00:29:01 +00:00
Peter Steinberger
8251f7c235 refactor(memory): dedupe batch helpers 2026-02-16 00:26:03 +00:00
Peter Steinberger
ae1880acf6 refactor(frontmatter): share openclaw manifest parsing 2026-02-16 00:23:33 +00:00
Peter Steinberger
fddf8a6f4a perf(test): fold pi extensions runtime registry tests into agents suite 2026-02-16 00:22:36 +00:00
Peter Steinberger
412c1d0af1 perf(test): fold logger import side-effects test into diagnostic suite 2026-02-16 00:21:30 +00:00
Peter Steinberger
166cf6a3e0 fix(web_fetch): cap response body before parsing 2026-02-16 01:21:11 +01:00
Peter Steinberger
fd3d452f1f fix(ci): fix ui cron test mock signature 2026-02-16 00:19:34 +00:00
Peter Steinberger
fdd0e78d1b perf(test): fold exec approvals socket defaults into main suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
60ce38d216 perf(test): drop redundant line signature unit test 2026-02-16 00:18:27 +00:00
Peter Steinberger
acb2a1ce37 perf(test): fold discord voice hardening into web media suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
ba3a0e7adb perf(test): fold gateway server utils into misc suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
3a7b1b36b6 perf(test): consolidate shared utility suites 2026-02-16 00:18:27 +00:00
Peter Steinberger
3830a4b58e perf(test): fold acp session store assertions into mapper suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
6288c51774 perf(test): fold secret equality assertions into audit extra suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
a508c34731 perf(test): fold signal daemon log parsing into probe suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
5baa08ed13 perf(test): fold model-default assertions into command utils suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
55fd88e967 perf(test): consolidate utils parsing helpers 2026-02-16 00:18:27 +00:00
Peter Steinberger
725f63f724 perf(test): fold restart recovery helper into spawn utils suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
c82dc02b4d perf(test): fold tui command parsing into tui suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
2cf060f774 perf(test): consolidate media-understanding misc suites 2026-02-16 00:18:27 +00:00
Peter Steinberger
5529473af9 perf(test): fold browser server-context helper into utils suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
5e3b211d93 perf(test): fold gmail watcher assertions into hooks install suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
3fd40fc5a3 perf(test): fold media constants assertions into mime suite 2026-02-16 00:18:27 +00:00
Peter Steinberger
f934725ccd perf(test): consolidate channel misc suites 2026-02-16 00:18:27 +00:00
Peter Steinberger
5709b30700 perf(test): consolidate config misc suites 2026-02-16 00:18:27 +00:00
Peter Steinberger
2d5004cee4 perf(test): consolidate CLI utility tests 2026-02-16 00:18:27 +00:00
Peter Steinberger
1287abe0b5 perf(test): consolidate browser utility tests 2026-02-16 00:18:27 +00:00
Peter Steinberger
a91bcd2cf4 fix(test): avoid fake-timers hang in gateway lock 2026-02-16 00:18:27 +00:00
Peter Steinberger
67bfe8fb80 perf(test): cut gateway unit suite overhead 2026-02-16 00:18:26 +00:00
Peter Steinberger
be4a490c23 refactor(test): fix update-cli env restore 2026-02-16 00:16:57 +00:00
Peter Steinberger
e9ed5febc5 refactor(test): dedupe token exchange env cleanup 2026-02-16 00:16:00 +00:00
Peter Steinberger
72baa58edd refactor(test): fix copilot env restore 2026-02-16 00:15:20 +00:00
Peter Steinberger
76015aab23 refactor(test): dedupe copilot env restores 2026-02-16 00:14:48 +00:00
Advait Paliwal
115cfb4430 gateway: add cron finished-run webhook (#14535)
* gateway: add cron finished webhook delivery

* config: allow cron webhook in runtime schema

* cron: require notify flag for webhook posts

* ui/docs: add cron notify toggle and webhook docs

* fix: harden cron webhook auth and fill notify coverage (#14535) (thanks @advaitpaliwal)

---------

Co-authored-by: Tyler Yust <TYTYYUST@YAHOO.COM>
2026-02-15 16:14:17 -08:00
Peter Steinberger
ab000bc411 refactor(test): dedupe qianfan env restore 2026-02-16 00:13:01 +00:00
Peter Steinberger
e3a93d6705 refactor(test): dedupe safe-bins mocks 2026-02-16 00:12:23 +00:00
Peter Steinberger
7857096d29 refactor(test): reuse env snapshot in model scan 2026-02-16 00:08:35 +00:00
Peter Steinberger
cedd520f25 refactor(test): simplify state dir env helpers 2026-02-16 00:08:00 +00:00
cpojer
4bdb857eca chore: Use proper pnpm caching in one CI step. 2026-02-16 09:07:09 +09:00
Peter Steinberger
997b9ad232 refactor(test): dedupe provider api key env restore 2026-02-16 00:05:02 +00:00
Peter Steinberger
e075a33ca3 refactor(test): simplify oauth/profile env restore 2026-02-16 00:03:54 +00:00
cpojer
c07036e813 chore: Update deps. 2026-02-16 09:03:29 +09:00
Shakker
b562aa6625 fix(gateway): keep boot sessions ephemeral without remapping main 2026-02-16 00:03:21 +00:00
Shakker
fe73878dfc fix(gateway): preserve session mapping across gateway restarts 2026-02-16 00:03:21 +00:00
Peter Steinberger
ee2fa5f411 refactor(test): reuse env snapshots in unit suites 2026-02-16 00:02:32 +00:00
Peter Steinberger
07dea4c6cc refactor(test): dedupe auth choice env cleanup 2026-02-15 23:59:28 +00:00
Peter Steinberger
7bb0b7d1fc refactor(test): simplify config io env snapshot 2026-02-15 23:58:06 +00:00
Peter Steinberger
a90e007d50 refactor(test): reuse env snapshot in gateway ws harness 2026-02-15 23:56:57 +00:00
Peter Steinberger
94e84e6f75 refactor(test): clean up gateway tool env restore 2026-02-15 23:56:06 +00:00
Peter Steinberger
e9c8540e21 refactor(test): simplify model auth env restore 2026-02-15 23:55:11 +00:00
Peter Steinberger
961ca61b0e refactor(test): dedupe onboard auth env cleanup 2026-02-15 23:53:55 +00:00
Peter Steinberger
f809ff5e55 refactor(test): reuse env snapshot helper 2026-02-15 23:51:24 +00:00
Peter Steinberger
d27a763eec refactor(test): reuse env helper in temp home harness 2026-02-15 23:42:20 +00:00
Peter Steinberger
abd009b092 refactor(test): dedupe openresponses server setup 2026-02-15 23:34:52 +00:00
Peter Steinberger
f0e373b82e refactor(test): simplify state dir env restore 2026-02-15 23:34:02 +00:00
Peter Steinberger
35ab521e07 refactor(test): simplify voicewake env cleanup 2026-02-15 23:34:02 +00:00
Peter Steinberger
d8d9d3724f docs(agents): add GHSA patch/publish notes 2026-02-16 00:31:51 +01:00
Peter Steinberger
e3445f59c9 docs(changelog): note inter-session provenance security fix 2026-02-16 00:31:51 +01:00
Peter Steinberger
a68ed3f64c refactor(test): reuse env snapshots in gateway call tests 2026-02-15 23:22:58 +00:00
Peter Steinberger
31980bcaf1 refactor(test): dedupe gateway env restores 2026-02-15 23:18:16 +00:00
Peter Steinberger
70f86e326d refactor(test): reuse shared env snapshots 2026-02-15 23:15:07 +00:00
Peter Steinberger
bed0e07620 fix(cli): clear plugin manifest cache after install 2026-02-15 23:14:42 +00:00
Peter Steinberger
632b71c7f8 fix(test): avoid inheriting process.env in nix config e2e 2026-02-15 23:14:42 +00:00
Peter Steinberger
eef13235ad fix(test): make sessions_spawn e2e harness ordering stable 2026-02-15 23:14:42 +00:00
Peter Steinberger
89155aa6c6 fix(test): load sessions_spawn harness before tools 2026-02-15 23:14:42 +00:00
Peter Steinberger
bbcbabab74 fix(ci): repair e2e mocks and tool schemas 2026-02-15 23:14:42 +00:00
Peter Steinberger
0e2d8b8a1e perf(test): consolidate channel action suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
c5288300a1 perf(test): consolidate reply flow suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
a7f6c95675 perf(test): consolidate slack monitor suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
74294a4653 perf(test): consolidate web auto-reply suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
c59a472ca2 perf(test): consolidate memory tool e2e suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
722bfaa9c9 perf(test): consolidate reply plumbing/state suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
37086d0c3e perf(test): consolidate sessions tool e2e suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
a1c50b4ee3 perf(test): consolidate channel plugin suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
d75cd40787 perf(test): consolidate reply utility suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
34b088ede6 perf(test): consolidate infra outbound suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
36b5f0c9a8 perf(test): consolidate gateway server-methods suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
704c8ed530 perf(test): consolidate sessions config suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
2158b09b9d perf(test): consolidate discord monitor utils 2026-02-15 23:14:42 +00:00
Peter Steinberger
ed276d3e50 perf(test): consolidate inbound reply suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
53ec78319d perf(test): consolidate session suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
51709c63fe perf(test): consolidate model selection suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
f8925b7588 perf(test): consolidate reply commands suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
023091ded3 perf(test): consolidate slack tool-result suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
ce922915ab perf(test): consolidate telegram send suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
f749365b1c perf(test): consolidate telegram create bot suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
4fc72226fa perf(test): speed up slack slash suite 2026-02-15 23:14:42 +00:00
Peter Steinberger
def74465eb perf(test): consolidate runReplyAgent suites 2026-02-15 23:14:42 +00:00
Peter Steinberger
a91553c7cf perf(slack): consolidate slash tests 2026-02-15 23:14:42 +00:00
Peter Steinberger
65ea200c31 refactor(test): share env var helpers 2026-02-15 23:12:57 +00:00
Peter Steinberger
0b56472cf5 refactor(test): dedupe ios/android gateway client id tests 2026-02-15 23:07:50 +00:00
Peter Steinberger
8ba16a894f refactor(test): reuse withGatewayServer in auth/http suites 2026-02-15 23:06:34 +00:00
Peter Steinberger
99909f7bc7 refactor(test): share gateway server start helper 2026-02-15 23:02:27 +00:00
Peter Steinberger
1b455b6d9f refactor(test): dedupe gateway hooks server setup 2026-02-15 22:43:27 +00:00
444 changed files with 22934 additions and 20245 deletions

View File

@@ -108,6 +108,7 @@ jobs:
- name: Comment on PR (informational)
if: steps.drift.outputs.drift == 'true'
continue-on-error: true
uses: actions/github-script@v7
with:
script: |

View File

@@ -33,19 +33,17 @@ jobs:
- name: Checkout CLI
uses: actions/checkout@v4
- name: Setup pnpm (corepack retry)
run: |
set -euo pipefail
corepack enable
for attempt in 1 2 3; do
if corepack prepare pnpm@10.23.0 --activate; then
pnpm -v
exit 0
fi
echo "corepack prepare failed (attempt $attempt/3). Retrying..."
sleep $((attempt * 10))
done
exit 1
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22.x
check-latest: true
- name: Setup pnpm + cache store
uses: ./.github/actions/setup-pnpm-store-cache
with:
pnpm-version: "10.23.0"
cache-key-suffix: "node22"
- name: Install pnpm deps (minimal)
run: pnpm install --ignore-scripts --frozen-lockfile

View File

@@ -119,6 +119,19 @@
- Never commit or publish real phone numbers, videos, or live configuration values. Use obviously fake placeholders in docs, tests, and examples.
- Release flow: always read `docs/reference/RELEASING.md` and `docs/platforms/mac/release.md` before any release work; do not ask routine questions once those docs answer them.
## GHSA (Repo Advisory) Patch/Publish
- Fetch: `gh api /repos/openclaw/openclaw/security-advisories/<GHSA>`
- Latest npm: `npm view openclaw version --userconfig "$(mktemp)"`
- Private fork PRs must be closed:
`fork=$(gh api /repos/openclaw/openclaw/security-advisories/<GHSA> | jq -r .private_fork.full_name)`
`gh pr list -R "$fork" --state open` (must be empty)
- Description newline footgun: write Markdown via heredoc to `/tmp/ghsa.desc.md` (no `"\\n"` strings)
- Build patch JSON via jq: `jq -n --rawfile desc /tmp/ghsa.desc.md '{summary,severity,description:$desc,vulnerabilities:[...]}' > /tmp/ghsa.patch.json`
- Patch + publish: `gh api -X PATCH /repos/openclaw/openclaw/security-advisories/<GHSA> --input /tmp/ghsa.patch.json` (publish = include `"state":"published"`; no `/publish` endpoint)
- If publish fails (HTTP 422): missing `severity`/`description`/`vulnerabilities[]`, or private fork has open PRs
- Verify: re-fetch; ensure `state=published`, `published_at` set; `jq -r .description | rg '\\\\n'` returns nothing
## Troubleshooting
- Rebrand/migration issues or legacy config/service warnings: run `openclaw doctor` (see `docs/gateway/doctor.md`).

View File

@@ -6,6 +6,8 @@ Docs: https://docs.openclaw.ai
### Changes
- Build: add `pnpm build:runtime` for packagers/runtime builds to skip plugin-sdk declaration generation when types are not needed. (#17636) Thanks @joshp123.
- Cron/Gateway: add finished-run webhook delivery toggle (`notify`) and dedicated webhook auth token support (`cron.webhookToken`) for outbound cron webhook posts. (#14535) Thanks @advaitpaliwal.
- Plugins: expose `llm_input` and `llm_output` hook payloads so extensions can observe prompt/input context and model output usage details. (#16724) Thanks @SecondThread.
- Subagents: nested sub-agents (sub-sub-agents) with configurable depth. Set `agents.defaults.subagents.maxSpawnDepth: 2` to allow sub-agents to spawn their own children. Includes `maxChildrenPerAgent` limit (default 5), depth-aware tool policy, and proper announce chain routing. (#14447) Thanks @tyler6204.
- Discord: components v2 UI + embeds passthrough + exec approval UX refinements (CV2 containers, button layout, Discord-forwarding skip). Thanks @thewilloftheshadow.
@@ -14,6 +16,8 @@ Docs: https://docs.openclaw.ai
### Fixes
- Sandbox/Security: block dangerous sandbox Docker config (bind mounts, host networking, unconfined seccomp/apparmor) to prevent container escape via config injection. Thanks @aether-ai-agent.
- Control UI: prevent stored XSS via assistant name/avatar by removing inline script injection, serving bootstrap config as JSON, and enforcing `script-src 'self'`. Thanks @Adam55A-code.
- Web UI/Agents: hide `BOOTSTRAP.md` in the Agents Files list after onboarding is completed, avoiding confusing missing-file warnings for completed workspaces. (#17491) Thanks @gumadeiras.
- Telegram: omit `message_thread_id` for DM sends/draft previews and keep forum-topic handling (`id=1` general omitted, non-general kept), preventing DM failures with `400 Bad Request: message thread not found`. (#10942) Thanks @garnetlyx.
- Subagents/Models: preserve `agents.defaults.model.fallbacks` when subagent sessions carry a model override, so subagent runs fail over to configured fallback models instead of retrying only the overridden primary model.
@@ -28,8 +32,10 @@ Docs: https://docs.openclaw.ai
- Gateway/Send: return an actionable error when `send` targets internal-only `webchat`, guiding callers to use `chat.send` or a deliverable channel. (#15703) Thanks @rodrigouroz.
- Gateway/Agent: reject malformed `agent:`-prefixed session keys (for example, `agent:main`) in `agent` and `agent.identity.get` instead of silently resolving them to the default agent, preventing accidental cross-session routing. (#15707) Thanks @rodrigouroz.
- Gateway/Security: redact sensitive session/path details from `status` responses for non-admin clients; full details remain available to `operator.admin`. (#8590) Thanks @fr33d3m0n.
- Web Fetch/Security: cap downloaded response body size before HTML parsing to prevent memory exhaustion from oversized or deeply nested pages. Thanks @xuemian168.
- Agents: return an explicit timeout error reply when an embedded run times out before producing any payloads, preventing silent dropped turns during slow cache-refresh transitions. (#16659) Thanks @liaosvcaf and @vignesh07.
- Agents/OpenAI: force `store=true` for direct OpenAI Responses/Codex runs to preserve multi-turn server-side conversation state, while leaving proxy/non-OpenAI endpoints unchanged. (#16803) Thanks @mark9232 and @vignesh07.
- Agents/Security: sanitize workspace paths before embedding into LLM prompts (strip Unicode control/format chars) to prevent instruction injection via malicious directory names. Thanks @aether-ai-agent.
- Agents/Context: apply configured model `contextWindow` overrides after provider discovery so `lookupContextTokens()` honors operator config values (including discovery-failure paths). (#17404) Thanks @michaelbship and @vignesh07.
- CLI/Build: make legacy daemon CLI compatibility shim generation tolerant of minimal tsdown daemon export sets, while preserving restart/register compatibility aliases and surfacing explicit errors for unavailable legacy daemon commands. Thanks @vignesh07.
- Telegram: replace inbound `<media:audio>` placeholder with successful preflight voice transcript in message body context, preventing placeholder-only prompt bodies for mention-gated voice messages. (#16789) Thanks @Limitless2023.
@@ -106,6 +112,7 @@ Docs: https://docs.openclaw.ai
- Tools/Write/Edit: normalize structured text-block arguments for `content`/`oldText`/`newText` before filesystem edits, preventing JSON-like file corruption and false “exact text not found” misses from block-form params. (#16778) Thanks @danielpipernz.
- Ollama/Agents: avoid forcing `<final>` tag enforcement for Ollama models, which could suppress all output as `(no output)`. (#16191) Thanks @Glucksberg.
- Plugins: suppress false duplicate plugin id warnings when the same extension is discovered via multiple paths (config/workspace/global vs bundled), while still warning on genuine duplicates. (#16222) Thanks @shadril238.
- Agents/Process: supervise PTY/child process lifecycles with explicit ownership, cancellation, timeouts, and deterministic cleanup, preventing Codex/Pi PTY sessions from dying or stalling on resume. (#14257) Thanks @onutc.
- Skills: watch `SKILL.md` only when refreshing skills snapshot to avoid file-descriptor exhaustion in large data trees. (#11325) Thanks @household-bard.
- Memory/QMD: make `memory status` read-only by skipping QMD boot update/embed side effects for status-only manager checks.
- Memory/QMD: keep original QMD failures when builtin fallback initialization fails (for example missing embedding API keys), instead of replacing them with fallback init errors.
@@ -207,6 +214,7 @@ Docs: https://docs.openclaw.ai
- Docs/Hooks: update hooks documentation URLs to the new `/automation/hooks` location. (#16165) Thanks @nicholascyh.
- Security/Audit: warn when `gateway.tools.allow` re-enables default-denied tools over HTTP `POST /tools/invoke`, since this can increase RCE blast radius if the gateway is reachable.
- Security/Plugins/Hooks: harden npm-based installs by restricting specs to registry packages only, passing `--ignore-scripts` to `npm pack`, and cleaning up temp install directories.
- Security/Sessions: preserve inter-session input provenance for routed prompts so delegated/internal sessions are not treated as direct external user instructions. Thanks @anbecker.
- Feishu: stop persistent Typing reaction on NO_REPLY/suppressed runs by wiring reply-dispatcher cleanup to remove typing indicators. (#15464) Thanks @arosstale.
- Agents: strip leading empty lines from `sanitizeUserFacingText` output and normalize whitespace-only outputs to empty text. (#16158) Thanks @mcinteerj.
- BlueBubbles: gracefully degrade when Private API is disabled by filtering private-only actions, skipping private-only reactions/reply effects, and avoiding private reply markers so non-private flows remain usable. (#16002) Thanks @L-U-C-K-Y.
@@ -337,6 +345,7 @@ Docs: https://docs.openclaw.ai
- Configure/Gateway: reject literal `"undefined"`/`"null"` token input and validate gateway password prompt values to avoid invalid password-mode configs. (#13767) Thanks @omair445.
- Gateway: handle async `EPIPE` on stdout/stderr during shutdown. (#13414) Thanks @keshav55.
- Gateway/Control UI: resolve missing dashboard assets when `openclaw` is installed globally via symlink-based Node managers (nvm/fnm/n/Homebrew). (#14919) Thanks @aynorica.
- Gateway/Control UI: keep partial assistant output visible when runs are aborted, and persist aborted partials to session transcripts for follow-up context.
- Cron: use requested `agentId` for isolated job auth resolution. (#13983) Thanks @0xRaini.
- Cron: prevent cron jobs from skipping execution when `nextRunAtMs` advances. (#14068) Thanks @WalterSumbon.
- Cron: pass `agentId` to `runHeartbeatOnce` for main-session jobs. (#14140) Thanks @ishikawa-pro.

View File

@@ -2087,6 +2087,7 @@ public struct CronJob: Codable, Sendable {
public let name: String
public let description: String?
public let enabled: Bool
public let notify: Bool?
public let deleteafterrun: Bool?
public let createdatms: Int
public let updatedatms: Int
@@ -2103,6 +2104,7 @@ public struct CronJob: Codable, Sendable {
name: String,
description: String?,
enabled: Bool,
notify: Bool?,
deleteafterrun: Bool?,
createdatms: Int,
updatedatms: Int,
@@ -2118,6 +2120,7 @@ public struct CronJob: Codable, Sendable {
self.name = name
self.description = description
self.enabled = enabled
self.notify = notify
self.deleteafterrun = deleteafterrun
self.createdatms = createdatms
self.updatedatms = updatedatms
@@ -2134,6 +2137,7 @@ public struct CronJob: Codable, Sendable {
case name
case description
case enabled
case notify
case deleteafterrun = "deleteAfterRun"
case createdatms = "createdAtMs"
case updatedatms = "updatedAtMs"
@@ -2167,6 +2171,7 @@ public struct CronAddParams: Codable, Sendable {
public let agentid: AnyCodable?
public let description: String?
public let enabled: Bool?
public let notify: Bool?
public let deleteafterrun: Bool?
public let schedule: AnyCodable
public let sessiontarget: AnyCodable
@@ -2179,6 +2184,7 @@ public struct CronAddParams: Codable, Sendable {
agentid: AnyCodable?,
description: String?,
enabled: Bool?,
notify: Bool?,
deleteafterrun: Bool?,
schedule: AnyCodable,
sessiontarget: AnyCodable,
@@ -2190,6 +2196,7 @@ public struct CronAddParams: Codable, Sendable {
self.agentid = agentid
self.description = description
self.enabled = enabled
self.notify = notify
self.deleteafterrun = deleteafterrun
self.schedule = schedule
self.sessiontarget = sessiontarget
@@ -2202,6 +2209,7 @@ public struct CronAddParams: Codable, Sendable {
case agentid = "agentId"
case description
case enabled
case notify
case deleteafterrun = "deleteAfterRun"
case schedule
case sessiontarget = "sessionTarget"

View File

@@ -2087,6 +2087,7 @@ public struct CronJob: Codable, Sendable {
public let name: String
public let description: String?
public let enabled: Bool
public let notify: Bool?
public let deleteafterrun: Bool?
public let createdatms: Int
public let updatedatms: Int
@@ -2103,6 +2104,7 @@ public struct CronJob: Codable, Sendable {
name: String,
description: String?,
enabled: Bool,
notify: Bool?,
deleteafterrun: Bool?,
createdatms: Int,
updatedatms: Int,
@@ -2118,6 +2120,7 @@ public struct CronJob: Codable, Sendable {
self.name = name
self.description = description
self.enabled = enabled
self.notify = notify
self.deleteafterrun = deleteafterrun
self.createdatms = createdatms
self.updatedatms = updatedatms
@@ -2134,6 +2137,7 @@ public struct CronJob: Codable, Sendable {
case name
case description
case enabled
case notify
case deleteafterrun = "deleteAfterRun"
case createdatms = "createdAtMs"
case updatedatms = "updatedAtMs"
@@ -2167,6 +2171,7 @@ public struct CronAddParams: Codable, Sendable {
public let agentid: AnyCodable?
public let description: String?
public let enabled: Bool?
public let notify: Bool?
public let deleteafterrun: Bool?
public let schedule: AnyCodable
public let sessiontarget: AnyCodable
@@ -2179,6 +2184,7 @@ public struct CronAddParams: Codable, Sendable {
agentid: AnyCodable?,
description: String?,
enabled: Bool?,
notify: Bool?,
deleteafterrun: Bool?,
schedule: AnyCodable,
sessiontarget: AnyCodable,
@@ -2190,6 +2196,7 @@ public struct CronAddParams: Codable, Sendable {
self.agentid = agentid
self.description = description
self.enabled = enabled
self.notify = notify
self.deleteafterrun = deleteafterrun
self.schedule = schedule
self.sessiontarget = sessiontarget
@@ -2202,6 +2209,7 @@ public struct CronAddParams: Codable, Sendable {
case agentid = "agentId"
case description
case enabled
case notify
case deleteafterrun = "deleteAfterRun"
case schedule
case sessiontarget = "sessionTarget"

View File

@@ -27,6 +27,7 @@ Troubleshooting: [/automation/troubleshooting](/automation/troubleshooting)
- **Main session**: enqueue a system event, then run on the next heartbeat.
- **Isolated**: run a dedicated agent turn in `cron:<jobId>`, with delivery (announce by default or none).
- Wakeups are first-class: a job can request “wake now” vs “next heartbeat”.
- Webhook posting is opt-in per job: set `notify: true` and configure `cron.webhook`.
## Quick start (actionable)
@@ -288,7 +289,7 @@ Notes:
- `schedule.at` accepts ISO 8601 (timezone optional; treated as UTC when omitted).
- `everyMs` is milliseconds.
- `sessionTarget` must be `"main"` or `"isolated"` and must match `payload.kind`.
- Optional fields: `agentId`, `description`, `enabled`, `deleteAfterRun` (defaults to true for `at`),
- Optional fields: `agentId`, `description`, `enabled`, `notify`, `deleteAfterRun` (defaults to true for `at`),
`delivery`.
- `wakeMode` defaults to `"now"` when omitted.
@@ -333,10 +334,19 @@ Notes:
enabled: true, // default true
store: "~/.openclaw/cron/jobs.json",
maxConcurrentRuns: 1, // default 1
webhook: "https://example.invalid/cron-finished", // optional finished-run webhook endpoint
webhookToken: "replace-with-dedicated-webhook-token", // optional, do not reuse gateway auth token
},
}
```
Webhook behavior:
- The Gateway posts finished-run events to `cron.webhook` only when the job has `notify: true`.
- Payload is the cron finished event JSON.
- If `cron.webhookToken` is set, auth header is `Authorization: Bearer <cron.webhookToken>`.
- If `cron.webhookToken` is not set, no `Authorization` header is sent.
Disable cron entirely:
- `cron.enabled: false` (config)

View File

@@ -105,7 +105,7 @@ Want “groups can only see folder X” instead of “no host access”? Keep `w
docker: {
binds: [
// hostPath:containerPath:mode
"~/FriendsShared:/data:ro",
"/home/user/FriendsShared:/data:ro",
],
},
},

View File

@@ -0,0 +1,192 @@
---
summary: "Production plan for reliable interactive process supervision (PTY + non-PTY) with explicit ownership, unified lifecycle, and deterministic cleanup"
owner: "openclaw"
status: "in-progress"
last_updated: "2026-02-15"
title: "PTY and Process Supervision Plan"
---
# PTY and Process Supervision Plan
## 1. Problem and goal
We need one reliable lifecycle for long-running command execution across:
- `exec` foreground runs
- `exec` background runs
- `process` follow-up actions (`poll`, `log`, `send-keys`, `paste`, `submit`, `kill`, `remove`)
- CLI agent runner subprocesses
The goal is not just to support PTY. The goal is predictable ownership, cancellation, timeout, and cleanup with no unsafe process matching heuristics.
## 2. Scope and boundaries
- Keep implementation internal in `src/process/supervisor`.
- Do not create a new package for this.
- Keep current behavior compatibility where practical.
- Do not broaden scope to terminal replay or tmux style session persistence.
## 3. Implemented in this branch
### Supervisor baseline already present
- Supervisor module is in place under `src/process/supervisor/*`.
- Exec runtime and CLI runner are already routed through supervisor spawn and wait.
- Registry finalization is idempotent.
### This pass completed
1. Explicit PTY command contract
- `SpawnInput` is now a discriminated union in `src/process/supervisor/types.ts`.
- PTY runs require `ptyCommand` instead of reusing generic `argv`.
- Supervisor no longer rebuilds PTY command strings from argv joins in `src/process/supervisor/supervisor.ts`.
- Exec runtime now passes `ptyCommand` directly in `src/agents/bash-tools.exec-runtime.ts`.
2. Process layer type decoupling
- Supervisor types no longer import `SessionStdin` from agents.
- Process local stdin contract lives in `src/process/supervisor/types.ts` (`ManagedRunStdin`).
- Adapters now depend only on process level types:
- `src/process/supervisor/adapters/child.ts`
- `src/process/supervisor/adapters/pty.ts`
3. Process tool lifecycle ownership improvement
- `src/agents/bash-tools.process.ts` now requests cancellation through supervisor first.
- `process kill/remove` now use process-tree fallback termination when supervisor lookup misses.
- `remove` keeps deterministic remove behavior by dropping running session entries immediately after termination is requested.
4. Single source watchdog defaults
- Added shared defaults in `src/agents/cli-watchdog-defaults.ts`.
- `src/agents/cli-backends.ts` consumes the shared defaults.
- `src/agents/cli-runner/reliability.ts` consumes the same shared defaults.
5. Dead helper cleanup
- Removed unused `killSession` helper path from `src/agents/bash-tools.shared.ts`.
6. Direct supervisor path tests added
- Added `src/agents/bash-tools.process.supervisor.test.ts` to cover kill and remove routing through supervisor cancellation.
7. Reliability gap fixes completed
- `src/agents/bash-tools.process.ts` now falls back to real OS-level process termination when supervisor lookup misses.
- `src/process/supervisor/adapters/child.ts` now uses process-tree termination semantics for default cancel/timeout kill paths.
- Added shared process-tree utility in `src/process/kill-tree.ts`.
8. PTY contract edge-case coverage added
- Added `src/process/supervisor/supervisor.pty-command.test.ts` for verbatim PTY command forwarding and empty-command rejection.
- Added `src/process/supervisor/adapters/child.test.ts` for process-tree kill behavior in child adapter cancellation.
## 4. Remaining gaps and decisions
### Reliability status
The two required reliability gaps for this pass are now closed:
- `process kill/remove` now has a real OS termination fallback when supervisor lookup misses.
- child cancel/timeout now uses process-tree kill semantics for default kill path.
- Regression tests were added for both behaviors.
### Durability and startup reconciliation
Restart behavior is now explicitly defined as in-memory lifecycle only.
- `reconcileOrphans()` remains a no-op in `src/process/supervisor/supervisor.ts` by design.
- Active runs are not recovered after process restart.
- This boundary is intentional for this implementation pass to avoid partial persistence risks.
### Maintainability follow-ups
1. `runExecProcess` in `src/agents/bash-tools.exec-runtime.ts` still handles multiple responsibilities and can be split into focused helpers in a follow-up.
## 5. Implementation plan
The implementation pass for required reliability and contract items is complete.
Completed:
- `process kill/remove` fallback real termination
- process-tree cancellation for child adapter default kill path
- regression tests for fallback kill and child adapter kill path
- PTY command edge-case tests under explicit `ptyCommand`
- explicit in-memory restart boundary with `reconcileOrphans()` no-op by design
Optional follow-up:
- split `runExecProcess` into focused helpers with no behavior drift
## 6. File map
### Process supervisor
- `src/process/supervisor/types.ts` updated with discriminated spawn input and process local stdin contract.
- `src/process/supervisor/supervisor.ts` updated to use explicit `ptyCommand`.
- `src/process/supervisor/adapters/child.ts` and `src/process/supervisor/adapters/pty.ts` decoupled from agent types.
- `src/process/supervisor/registry.ts` idempotent finalize unchanged and retained.
### Exec and process integration
- `src/agents/bash-tools.exec-runtime.ts` updated to pass PTY command explicitly and keep fallback path.
- `src/agents/bash-tools.process.ts` updated to cancel via supervisor with real process-tree fallback termination.
- `src/agents/bash-tools.shared.ts` removed direct kill helper path.
### CLI reliability
- `src/agents/cli-watchdog-defaults.ts` added as shared baseline.
- `src/agents/cli-backends.ts` and `src/agents/cli-runner/reliability.ts` now consume same defaults.
## 7. Validation run in this pass
Unit tests:
- `pnpm vitest src/process/supervisor/registry.test.ts`
- `pnpm vitest src/process/supervisor/supervisor.test.ts`
- `pnpm vitest src/process/supervisor/supervisor.pty-command.test.ts`
- `pnpm vitest src/process/supervisor/adapters/child.test.ts`
- `pnpm vitest src/agents/cli-backends.test.ts`
- `pnpm vitest src/agents/bash-tools.exec.pty-cleanup.test.ts`
- `pnpm vitest src/agents/bash-tools.process.poll-timeout.test.ts`
- `pnpm vitest src/agents/bash-tools.process.supervisor.test.ts`
- `pnpm vitest src/process/exec.test.ts`
E2E targets:
- `pnpm test:e2e src/agents/cli-runner.e2e.test.ts`
- `pnpm test:e2e src/agents/bash-tools.exec.pty-fallback.e2e.test.ts src/agents/bash-tools.exec.background-abort.e2e.test.ts src/agents/bash-tools.process.send-keys.e2e.test.ts`
Typecheck note:
- `pnpm tsgo` currently fails in this repo due to a pre-existing UI typing dependency issue (`@vitest/browser-playwright` resolution), unrelated to this process supervision work.
## 8. Operational guarantees preserved
- Exec env hardening behavior is unchanged.
- Approval and allowlist flow is unchanged.
- Output sanitization and output caps are unchanged.
- PTY adapter still guarantees wait settlement on forced kill and listener disposal.
## 9. Definition of done
1. Supervisor is lifecycle owner for managed runs.
2. PTY spawn uses explicit command contract with no argv reconstruction.
3. Process layer has no type dependency on agent layer for supervisor stdin contracts.
4. Watchdog defaults are single source.
5. Targeted unit and e2e tests remain green.
6. Restart durability boundary is explicitly documented or fully implemented.
## 10. Summary
The branch now has a coherent and safer supervision shape:
- explicit PTY contract
- cleaner process layering
- supervisor driven cancellation path for process operations
- real fallback termination when supervisor lookup misses
- process-tree cancellation for child-run default kill paths
- unified watchdog defaults
- explicit in-memory restart boundary (no orphan reconciliation across restart in this pass)

View File

@@ -2295,12 +2295,16 @@ Current builds no longer include the TCP bridge. Nodes connect over the Gateway
cron: {
enabled: true,
maxConcurrentRuns: 2,
webhook: "https://example.invalid/cron-finished", // optional, must be http:// or https://
webhookToken: "replace-with-dedicated-token", // optional bearer token for outbound webhook auth
sessionRetention: "24h", // duration string or false
},
}
```
- `sessionRetention`: how long to keep completed cron sessions before pruning. Default: `24h`.
- `webhook`: finished-run webhook endpoint; only used when the job has `notify: true`.
- `webhookToken`: dedicated bearer token for webhook auth; if omitted, no auth header is sent.
See [Cron Jobs](/automation/cron-jobs).

View File

@@ -76,7 +76,7 @@ Global and per-agent binds are **merged** (not replaced). Under `scope: "shared"
- When set (including `[]`), it replaces `agents.defaults.sandbox.docker.binds` for the browser container.
- When omitted, the browser container falls back to `agents.defaults.sandbox.docker.binds` (backwards compatible).
Example (read-only source + docker socket):
Example (read-only source + an extra data directory):
```json5
{
@@ -84,7 +84,7 @@ Example (read-only source + docker socket):
defaults: {
sandbox: {
docker: {
binds: ["/home/user/source:/source:ro", "/var/run/docker.sock:/var/run/docker.sock"],
binds: ["/home/user/source:/source:ro", "/var/data/myapp:/data:ro"],
},
},
},
@@ -105,7 +105,8 @@ Example (read-only source + docker socket):
Security notes:
- Binds bypass the sandbox filesystem: they expose host paths with whatever mode you set (`:ro` or `:rw`).
- Sensitive mounts (e.g., `docker.sock`, secrets, SSH keys) should be `:ro` unless absolutely required.
- OpenClaw blocks dangerous bind sources (for example: `docker.sock`, `/etc`, `/proc`, `/sys`, `/dev`, and parent mounts that would expose them).
- Sensitive mounts (secrets, SSH keys, service credentials) should be `:ro` unless absolutely required.
- Combine with `workspaceAccess: "ro"` if you only need read access to the workspace; bind modes stay independent.
- See [Sandbox vs Tool Policy vs Elevated](/gateway/sandbox-vs-tool-policy-vs-elevated) for how binds interact with tool policy and elevated exec.

View File

@@ -224,6 +224,7 @@ Fetch a URL and extract readable content.
enabled: true,
maxChars: 50000,
maxCharsCap: 50000,
maxResponseBytes: 2000000,
timeoutSeconds: 30,
cacheTtlMinutes: 15,
maxRedirects: 3,
@@ -256,6 +257,7 @@ Notes:
- `web_fetch` sends a Chrome-like User-Agent and `Accept-Language` by default; override `userAgent` if needed.
- `web_fetch` blocks private/internal hostnames and re-checks redirects (limit with `maxRedirects`).
- `maxChars` is clamped to `tools.web.fetch.maxCharsCap`.
- `web_fetch` caps the downloaded response body size to `tools.web.fetch.maxResponseBytes` before parsing; oversized responses are truncated and include a warning.
- `web_fetch` is best-effort extraction; some sites will need the browser tool.
- See [Firecrawl](/tools/firecrawl) for key setup and service details.
- Responses are cached (default 15 minutes) to reduce repeated fetches.

View File

@@ -83,6 +83,9 @@ Cron jobs panel notes:
- For isolated jobs, delivery defaults to announce summary. You can switch to none if you want internal-only runs.
- Channel/target fields appear when announce is selected.
- New job form includes a **Notify webhook** toggle (`notify` on the job).
- Gateway webhook posting requires both `notify: true` on the job and `cron.webhook` in config.
- Set `cron.webhookToken` to send a dedicated bearer token; if omitted, the webhook is sent without an auth header.
## Chat behavior
@@ -93,6 +96,10 @@ Cron jobs panel notes:
- Click **Stop** (calls `chat.abort`)
- Type `/stop` (or `stop|esc|abort|wait|exit|interrupt`) to abort out-of-band
- `chat.abort` supports `{ sessionKey }` (no `runId`) to abort all active runs for that session
- Abort partial retention:
- When a run is aborted, partial assistant text can still be shown in the UI
- Gateway persists aborted partial assistant text into transcript history when buffered output exists
- Persisted entries include abort metadata so transcript consumers can tell abort partials from normal completion output
## Tailnet access (recommended)

View File

@@ -25,6 +25,8 @@ Status: the macOS/iOS SwiftUI chat UI talks directly to the Gateway WebSocket.
- The UI connects to the Gateway WebSocket and uses `chat.history`, `chat.send`, and `chat.inject`.
- `chat.inject` appends an assistant note directly to the transcript and broadcasts it to the UI (no agent run).
- Aborted runs can keep partial assistant output visible in the UI.
- Gateway persists aborted partial assistant text into transcript history when buffered output exists, and marks those entries with abort metadata.
- History is always fetched from the gateway (no local file watching).
- If the gateway is unreachable, WebChat is read-only.

View File

@@ -41,6 +41,7 @@
"android:test": "cd apps/android && ./gradlew :app:testDebugUnitTest",
"build": "pnpm canvas:a2ui:bundle && tsdown && pnpm build:plugin-sdk:dts && node --import tsx scripts/write-plugin-sdk-entry-dts.ts && node --import tsx scripts/canvas-a2ui-copy.ts && node --import tsx scripts/copy-hook-metadata.ts && node --import tsx scripts/write-build-info.ts && node --import tsx scripts/write-cli-compat.ts",
"build:plugin-sdk:dts": "tsc -p tsconfig.plugin-sdk.dts.json",
"build:runtime": "pnpm canvas:a2ui:bundle && tsdown && node --import tsx scripts/canvas-a2ui-copy.ts && node --import tsx scripts/copy-hook-metadata.ts && node --import tsx scripts/write-build-info.ts && node --import tsx scripts/write-cli-compat.ts",
"canvas:a2ui:bundle": "bash scripts/bundle-a2ui.sh",
"check": "pnpm format:check && pnpm tsgo && pnpm lint",
"check:docs": "pnpm format:docs:check && pnpm lint:docs && pnpm docs:check-links",
@@ -177,13 +178,13 @@
"@types/proper-lockfile": "^4.1.4",
"@types/qrcode-terminal": "^0.12.2",
"@types/ws": "^8.18.1",
"@typescript/native-preview": "7.0.0-dev.20260214.1",
"@typescript/native-preview": "7.0.0-dev.20260215.1",
"@vitest/coverage-v8": "^4.0.18",
"lit": "^3.3.2",
"ollama": "^0.6.3",
"oxfmt": "0.32.0",
"oxlint": "^1.47.0",
"oxlint-tsgolint": "^0.12.2",
"oxlint-tsgolint": "^0.13.0",
"rolldown": "1.0.0-rc.4",
"tsdown": "^0.20.3",
"tsx": "^4.21.0",

148
pnpm-lock.yaml generated
View File

@@ -207,8 +207,8 @@ importers:
specifier: ^8.18.1
version: 8.18.1
'@typescript/native-preview':
specifier: 7.0.0-dev.20260214.1
version: 7.0.0-dev.20260214.1
specifier: 7.0.0-dev.20260215.1
version: 7.0.0-dev.20260215.1
'@vitest/coverage-v8':
specifier: ^4.0.18
version: 4.0.18(@vitest/browser@4.0.18(vite@7.3.1(@types/node@25.2.3)(jiti@2.6.1)(lightningcss@1.30.2)(tsx@4.21.0)(yaml@2.8.2))(vitest@4.0.18))(vitest@4.0.18)
@@ -223,16 +223,16 @@ importers:
version: 0.32.0
oxlint:
specifier: ^1.47.0
version: 1.47.0(oxlint-tsgolint@0.12.2)
version: 1.47.0(oxlint-tsgolint@0.13.0)
oxlint-tsgolint:
specifier: ^0.12.2
version: 0.12.2
specifier: ^0.13.0
version: 0.13.0
rolldown:
specifier: 1.0.0-rc.4
version: 1.0.0-rc.4
tsdown:
specifier: ^0.20.3
version: 0.20.3(@typescript/native-preview@7.0.0-dev.20260214.1)(typescript@5.9.3)
version: 0.20.3(@typescript/native-preview@7.0.0-dev.20260215.1)(typescript@5.9.3)
tsx:
specifier: ^4.21.0
version: 4.21.0
@@ -783,8 +783,8 @@ packages:
resolution: {integrity: sha512-qMlSxKbpRlAridDExk92nSobyDdpPijUq2DW6oDnUqd0iOGxmQjyqhMIihI9+zv4LPyZdRje2cavWPbCbWm3eA==}
engines: {node: '>=6.9.0'}
'@babel/helper-string-parser@8.0.0-rc.1':
resolution: {integrity: sha512-vi/pfmbrOtQmqgfboaBhaCU50G7mcySVu69VU8z+lYoPPB6WzI9VgV7WQfL908M4oeSH5fDkmoupIqoE0SdApw==}
'@babel/helper-string-parser@8.0.0-rc.2':
resolution: {integrity: sha512-noLx87RwlBEMrTzncWd/FvTxoJ9+ycHNg0n8yyYydIoDsLZuxknKgWRJUqcrVkNrJ74uGyhWQzQaS3q8xfGAhQ==}
engines: {node: ^20.19.0 || >=22.12.0}
'@babel/helper-validator-identifier@7.28.5':
@@ -2059,33 +2059,33 @@ packages:
cpu: [x64]
os: [win32]
'@oxlint-tsgolint/darwin-arm64@0.12.2':
resolution: {integrity: sha512-XIfavTqkJPGYi/98z7ZCkZvXq2AccMAAB0iwvKDRTQqiweMXVUyeUdx46phCHHH1PgmIVJtVfysThkHq2xCyrw==}
'@oxlint-tsgolint/darwin-arm64@0.13.0':
resolution: {integrity: sha512-OWQ3U+oDjjupmX0WU9oYyKF2iUOKDMLW/+zan0cd0vYIGId80xTRHHA8oXnREmK8dsMMP3nV3VXME3NH/hS0lw==}
cpu: [arm64]
os: [darwin]
'@oxlint-tsgolint/darwin-x64@0.12.2':
resolution: {integrity: sha512-tytsvP6zmNShRNDo4GgQartOXmd4GPd+TylCUMdO/iWl9PZVOgRyswWbYVTNgn85Cib/aY2q3Uu+jOw+QlbxvQ==}
'@oxlint-tsgolint/darwin-x64@0.13.0':
resolution: {integrity: sha512-wZvgj+eVqNkCUjSq2ExlMdbGDpZfaw6J+YctQV1pkGFdn7Y9cySWdfwu5v/AW2JPsJbFMXJ8GAr+WoZbRapz2A==}
cpu: [x64]
os: [darwin]
'@oxlint-tsgolint/linux-arm64@0.12.2':
resolution: {integrity: sha512-3W38yJuF7taEquhEuD6mYQyCeWNAlc1pNPjFkspkhLKZVgbrhDA4V6fCxLDDRvrTHde0bXPmFvuPlUq5pSePgA==}
'@oxlint-tsgolint/linux-arm64@0.13.0':
resolution: {integrity: sha512-nwtf5BgHbAWSVwyIF00l6QpfyFcpDMp6D+3cpe6NTgBYMSSSC0Ip1gswUwzVccOPoQK48t+J6vHyURQ96M1KDg==}
cpu: [arm64]
os: [linux]
'@oxlint-tsgolint/linux-x64@0.12.2':
resolution: {integrity: sha512-EjcEspeeV0NmaopEp4wcN5ntQP9VCJJDrTvzOjMP4W6ajz18M+pni9vkKvmcPIpRa/UmWobeFgKoVd/KGueeuQ==}
'@oxlint-tsgolint/linux-x64@0.13.0':
resolution: {integrity: sha512-Rkzgj38eVoGSBuGDaCrALS4FM19+m1Qlv0hjB4MWvXUej014XkB5ze+svYE3HX+AAm1ey9QYj/CQzfz203FPIg==}
cpu: [x64]
os: [linux]
'@oxlint-tsgolint/win32-arm64@0.12.2':
resolution: {integrity: sha512-a9L7iA5K/Ht/i8d9+7RTp6hbPa4cyXP0MdySVXAO6vczpL/4ildfY9Hr2m2wqL12uK6xe/uVABpVTrqay/wV+g==}
'@oxlint-tsgolint/win32-arm64@0.13.0':
resolution: {integrity: sha512-Y+0hFqLT5M7UIvGvTR3QFK27l17FqXk6UwwpBFOcyBGJ5bLd1RaAPWjqTmcgPvdolA6FCMeW1pxZuNtKDlYd7A==}
cpu: [arm64]
os: [win32]
'@oxlint-tsgolint/win32-x64@0.12.2':
resolution: {integrity: sha512-Cvt40UbTf5ib12DjGN+mMGOnjWa4Bc6Y7KEaXXp9qzckvs3HpNk2wSwMV3gnuR8Ipx4hkzkzrgzD0BAUsySAfA==}
'@oxlint-tsgolint/win32-x64@0.13.0':
resolution: {integrity: sha512-mXjTttzyyfl8d/XvxggmZFBq0pbQmRvHbjQEv70YECNaLEHG8j8WYUwLa641uudAnV1VoBI34pc7bmgJM7qhOA==}
cpu: [x64]
os: [win32]
@@ -2995,43 +2995,43 @@ packages:
'@types/ws@8.18.1':
resolution: {integrity: sha512-ThVF6DCVhA8kUGy+aazFQ4kXQ7E1Ty7A3ypFOe0IcJV8O/M511G99AW24irKrW56Wt44yG9+ij8FaqoBGkuBXg==}
'@typescript/native-preview-darwin-arm64@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-Jb2WcLGpTOC6x58e8QPYC/14xmDbnbFIuKqUvYoI77hVtojVyxZi8L5Y4CgYqXYx8vRWmIFk35c1OGdtPip6Sg==}
'@typescript/native-preview-darwin-arm64@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-icVO/hEMXjWlKhmpjIpqDyCzPvtHqfrPB+2rkd6M3rz84Bmw+o8Xgd7JvRxryZhR+D0y55me/bKh9xgvsgzuhA==}
cpu: [arm64]
os: [darwin]
'@typescript/native-preview-darwin-x64@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-O9l2gVuQFZsb8NIQtu0HN5Tn/Hw2fwylPOPS/0Y4oW+FUMhkqtvetUkb3zZ0qj7capilZ4YnmyGYg3TDqkP4Nw==}
'@typescript/native-preview-darwin-x64@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-Wz73wf1o9+4KwCLg8wnnIZZDAvv2KRZlDyP4X8GfBNzajfIAwYvI0ANWuIDznUUGeDAcqhBJXNe0Bkf4H9y4mg==}
cpu: [x64]
os: [darwin]
'@typescript/native-preview-linux-arm64@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-Hl4e3yxJqzIGgFI8aH/rLGW+a7kSLHJCpAd5JOLG7hHKnamZF4SjlunnoHLV4IcMri+G6UE3W/84i0QvQP5wLA==}
'@typescript/native-preview-linux-arm64@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-AYyXRxVwLZzfkEYN8FGdV4vqXwbTmv93nAZ6gMLvpDG4ItOybAE1R2obFjlFc+Or/rfQmVvfdkTym3c4bRJ3XQ==}
cpu: [arm64]
os: [linux]
'@typescript/native-preview-linux-arm@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-TaFrVnx3iXtl/oH1hzwvFyqWj9tzkjW8Ufl2m0Vx2/7GXnzZadm2KA6tFpGbzzWbZJznmXxKHL4O3AZRQYyZqQ==}
'@typescript/native-preview-linux-arm@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-6WVXFVSp3LBBiBgBMtAHQgTDN72mDhgjrmXH7GoABTxR9asK8oPfmy5cwTp1sPD46pYhqjnSHMrARyg2FaNSeA==}
cpu: [arm]
os: [linux]
'@typescript/native-preview-linux-x64@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-a/JypIXTc/tdodhYdQm24WH6aTfnJJjDbwxce4BS2g6IzYSc2GFcZBvlq1CJYS2FAVLpiSxj0OFAZmgjpCDAKg==}
'@typescript/native-preview-linux-x64@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-Ui6qbTO+nE7fwh5OGTGfL4ndaT+SpiUiv0F1m3+nMaiAKysY5GbgXUfzWzkSrOODsT8F/4jZ4wCzEzJordt8sQ==}
cpu: [x64]
os: [linux]
'@typescript/native-preview-win32-arm64@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-MJGPEDvdXj8olcWH0P+cWYcaN4r/0J4aSbcaISlen3MZ/2hrrgNl46PV4eGJKKCDniY2pH2fJzrMyJWZOcdb0w==}
'@typescript/native-preview-win32-arm64@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-dBFyAH9h3bMUaIp/84c3gKwyQ6jQmtzVoIBamSrYNw0xinJ56A/Ln5igdNOYrH8+/Aofmeh7pAWaa8U456XMjw==}
cpu: [arm64]
os: [win32]
'@typescript/native-preview-win32-x64@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-BtF48TRUyiCKznlOcQ7r7EXhonGSanm9X2eu7d8Yq1vaWO5SDgB0e+ISQXSoIfs3a1S3d5S5QV/vTE4+vocPxA==}
'@typescript/native-preview-win32-x64@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-bEMSwX71OGGvfsfHEa/aX7ZUWbPSI2oKEmeWcDQVY8vH1VK1ZwcFzMhKfgVJPt5pKH2bK3EO3xYnAyKkDO/Ung==}
cpu: [x64]
os: [win32]
'@typescript/native-preview@7.0.0-dev.20260214.1':
resolution: {integrity: sha512-BDM0ZLf2v6ilR0tDi8OMEr4X08lFCToPk3/p1SSE4GhagzmlU/5b+9slR0kKtaKMrds01FhvaKx6U9+NmAWgbQ==}
'@typescript/native-preview@7.0.0-dev.20260215.1':
resolution: {integrity: sha512-grs0BbJyPR7VLNerBVteEToPku1InMKVKVKBUTJi19LfK+LU3+pkU6/fsTfZhH3xmIzIxD/sNRQHLt4x/Yb9yg==}
hasBin: true
'@typespec/ts-http-runtime@0.3.3':
@@ -4728,8 +4728,8 @@ packages:
engines: {node: ^20.19.0 || >=22.12.0}
hasBin: true
oxlint-tsgolint@0.12.2:
resolution: {integrity: sha512-IFiOhYZfSgiHbBznTZOhFpEHpsZFSP0j7fVRake03HEkgH0YljnTFDNoRkGWsTrnrHr7nRIomSsF4TnCI/O+kQ==}
oxlint-tsgolint@0.13.0:
resolution: {integrity: sha512-VUOWP5T9R9RwuPLKvNgvhsjdPFVhr2k8no8ea84+KhDtYPmk9L/3StNP3WClyPOKJOT8bFlO3eyhTKxXK9+Oog==}
hasBin: true
oxlint@1.47.0:
@@ -6285,7 +6285,7 @@ snapshots:
'@babel/helper-string-parser@7.27.1': {}
'@babel/helper-string-parser@8.0.0-rc.1': {}
'@babel/helper-string-parser@8.0.0-rc.2': {}
'@babel/helper-validator-identifier@7.28.5': {}
@@ -6308,7 +6308,7 @@ snapshots:
'@babel/types@8.0.0-rc.1':
dependencies:
'@babel/helper-string-parser': 8.0.0-rc.1
'@babel/helper-string-parser': 8.0.0-rc.2
'@babel/helper-validator-identifier': 8.0.0-rc.1
'@bcoe/v8-coverage@1.0.2': {}
@@ -7546,22 +7546,22 @@ snapshots:
'@oxfmt/binding-win32-x64-msvc@0.32.0':
optional: true
'@oxlint-tsgolint/darwin-arm64@0.12.2':
'@oxlint-tsgolint/darwin-arm64@0.13.0':
optional: true
'@oxlint-tsgolint/darwin-x64@0.12.2':
'@oxlint-tsgolint/darwin-x64@0.13.0':
optional: true
'@oxlint-tsgolint/linux-arm64@0.12.2':
'@oxlint-tsgolint/linux-arm64@0.13.0':
optional: true
'@oxlint-tsgolint/linux-x64@0.12.2':
'@oxlint-tsgolint/linux-x64@0.13.0':
optional: true
'@oxlint-tsgolint/win32-arm64@0.12.2':
'@oxlint-tsgolint/win32-arm64@0.13.0':
optional: true
'@oxlint-tsgolint/win32-x64@0.12.2':
'@oxlint-tsgolint/win32-x64@0.13.0':
optional: true
'@oxlint/binding-android-arm-eabi@1.47.0':
@@ -8473,36 +8473,36 @@ snapshots:
dependencies:
'@types/node': 25.2.3
'@typescript/native-preview-darwin-arm64@7.0.0-dev.20260214.1':
'@typescript/native-preview-darwin-arm64@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview-darwin-x64@7.0.0-dev.20260214.1':
'@typescript/native-preview-darwin-x64@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview-linux-arm64@7.0.0-dev.20260214.1':
'@typescript/native-preview-linux-arm64@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview-linux-arm@7.0.0-dev.20260214.1':
'@typescript/native-preview-linux-arm@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview-linux-x64@7.0.0-dev.20260214.1':
'@typescript/native-preview-linux-x64@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview-win32-arm64@7.0.0-dev.20260214.1':
'@typescript/native-preview-win32-arm64@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview-win32-x64@7.0.0-dev.20260214.1':
'@typescript/native-preview-win32-x64@7.0.0-dev.20260215.1':
optional: true
'@typescript/native-preview@7.0.0-dev.20260214.1':
'@typescript/native-preview@7.0.0-dev.20260215.1':
optionalDependencies:
'@typescript/native-preview-darwin-arm64': 7.0.0-dev.20260214.1
'@typescript/native-preview-darwin-x64': 7.0.0-dev.20260214.1
'@typescript/native-preview-linux-arm': 7.0.0-dev.20260214.1
'@typescript/native-preview-linux-arm64': 7.0.0-dev.20260214.1
'@typescript/native-preview-linux-x64': 7.0.0-dev.20260214.1
'@typescript/native-preview-win32-arm64': 7.0.0-dev.20260214.1
'@typescript/native-preview-win32-x64': 7.0.0-dev.20260214.1
'@typescript/native-preview-darwin-arm64': 7.0.0-dev.20260215.1
'@typescript/native-preview-darwin-x64': 7.0.0-dev.20260215.1
'@typescript/native-preview-linux-arm': 7.0.0-dev.20260215.1
'@typescript/native-preview-linux-arm64': 7.0.0-dev.20260215.1
'@typescript/native-preview-linux-x64': 7.0.0-dev.20260215.1
'@typescript/native-preview-win32-arm64': 7.0.0-dev.20260215.1
'@typescript/native-preview-win32-x64': 7.0.0-dev.20260215.1
'@typespec/ts-http-runtime@0.3.3':
dependencies:
@@ -10385,16 +10385,16 @@ snapshots:
'@oxfmt/binding-win32-ia32-msvc': 0.32.0
'@oxfmt/binding-win32-x64-msvc': 0.32.0
oxlint-tsgolint@0.12.2:
oxlint-tsgolint@0.13.0:
optionalDependencies:
'@oxlint-tsgolint/darwin-arm64': 0.12.2
'@oxlint-tsgolint/darwin-x64': 0.12.2
'@oxlint-tsgolint/linux-arm64': 0.12.2
'@oxlint-tsgolint/linux-x64': 0.12.2
'@oxlint-tsgolint/win32-arm64': 0.12.2
'@oxlint-tsgolint/win32-x64': 0.12.2
'@oxlint-tsgolint/darwin-arm64': 0.13.0
'@oxlint-tsgolint/darwin-x64': 0.13.0
'@oxlint-tsgolint/linux-arm64': 0.13.0
'@oxlint-tsgolint/linux-x64': 0.13.0
'@oxlint-tsgolint/win32-arm64': 0.13.0
'@oxlint-tsgolint/win32-x64': 0.13.0
oxlint@1.47.0(oxlint-tsgolint@0.12.2):
oxlint@1.47.0(oxlint-tsgolint@0.13.0):
optionalDependencies:
'@oxlint/binding-android-arm-eabi': 1.47.0
'@oxlint/binding-android-arm64': 1.47.0
@@ -10415,7 +10415,7 @@ snapshots:
'@oxlint/binding-win32-arm64-msvc': 1.47.0
'@oxlint/binding-win32-ia32-msvc': 1.47.0
'@oxlint/binding-win32-x64-msvc': 1.47.0
oxlint-tsgolint: 0.12.2
oxlint-tsgolint: 0.13.0
p-finally@1.0.0: {}
@@ -10791,7 +10791,7 @@ snapshots:
dependencies:
glob: 10.5.0
rolldown-plugin-dts@0.22.1(@typescript/native-preview@7.0.0-dev.20260214.1)(rolldown@1.0.0-rc.3)(typescript@5.9.3):
rolldown-plugin-dts@0.22.1(@typescript/native-preview@7.0.0-dev.20260215.1)(rolldown@1.0.0-rc.3)(typescript@5.9.3):
dependencies:
'@babel/generator': 8.0.0-rc.1
'@babel/helper-validator-identifier': 8.0.0-rc.1
@@ -10804,7 +10804,7 @@ snapshots:
obug: 2.1.1
rolldown: 1.0.0-rc.3
optionalDependencies:
'@typescript/native-preview': 7.0.0-dev.20260214.1
'@typescript/native-preview': 7.0.0-dev.20260215.1
typescript: 5.9.3
transitivePeerDependencies:
- oxc-resolver
@@ -11269,7 +11269,7 @@ snapshots:
ts-algebra@2.0.0: {}
tsdown@0.20.3(@typescript/native-preview@7.0.0-dev.20260214.1)(typescript@5.9.3):
tsdown@0.20.3(@typescript/native-preview@7.0.0-dev.20260215.1)(typescript@5.9.3):
dependencies:
ansis: 4.2.0
cac: 6.7.14
@@ -11280,7 +11280,7 @@ snapshots:
obug: 2.1.1
picomatch: 4.0.3
rolldown: 1.0.0-rc.3
rolldown-plugin-dts: 0.22.1(@typescript/native-preview@7.0.0-dev.20260214.1)(rolldown@1.0.0-rc.3)(typescript@5.9.3)
rolldown-plugin-dts: 0.22.1(@typescript/native-preview@7.0.0-dev.20260215.1)(rolldown@1.0.0-rc.3)(typescript@5.9.3)
semver: 7.7.4
tinyexec: 1.0.2
tinyglobby: 0.2.15

View File

@@ -165,7 +165,7 @@ const defaultWorkerBudget =
unit: Math.max(2, Math.min(8, Math.floor(localWorkers / 2))),
unitIsolated: 1,
extensions: Math.max(1, Math.min(4, Math.floor(localWorkers / 4))),
gateway: 1,
gateway: 2,
};
// Keep worker counts predictable for local runs; trim macOS CI workers to avoid worker crashes/OOM.

View File

@@ -1,6 +1,7 @@
import { describe, expect, it, vi } from "vitest";
import { afterEach, describe, expect, it, vi } from "vitest";
import type { GatewayClient } from "../gateway/client.js";
import { parseSessionMeta, resolveSessionKey } from "./session-mapper.js";
import { createInMemorySessionStore } from "./session.js";
function createGateway(resolveLabelKey = "agent:main:label"): {
gateway: GatewayClient;
@@ -54,3 +55,26 @@ describe("acp session mapper", () => {
expect(request).not.toHaveBeenCalled();
});
});
describe("acp session manager", () => {
const store = createInMemorySessionStore();
afterEach(() => {
store.clearAllSessionsForTest();
});
it("tracks active runs and clears on cancel", () => {
const session = store.createSession({
sessionKey: "acp:test",
cwd: "/tmp",
});
const controller = new AbortController();
store.setActiveRun(session.sessionId, "run-1", controller);
expect(store.getSessionByRunId("run-1")?.sessionId).toBe(session.sessionId);
const cancelled = store.cancelActiveRun(session.sessionId);
expect(cancelled).toBe(true);
expect(store.getSessionByRunId("run-1")).toBeUndefined();
});
});

View File

@@ -1,25 +0,0 @@
import { describe, expect, it, afterEach } from "vitest";
import { createInMemorySessionStore } from "./session.js";
describe("acp session manager", () => {
const store = createInMemorySessionStore();
afterEach(() => {
store.clearAllSessionsForTest();
});
it("tracks active runs and clears on cancel", () => {
const session = store.createSession({
sessionKey: "acp:test",
cwd: "/tmp",
});
const controller = new AbortController();
store.setActiveRun(session.sessionId, "run-1", controller);
expect(store.getSessionByRunId("run-1")?.sessionId).toBe(session.sessionId);
const cancelled = store.cancelActiveRun(session.sessionId);
expect(cancelled).toBe(true);
expect(store.getSessionByRunId("run-1")).toBeUndefined();
});
});

View File

@@ -2,6 +2,7 @@ import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import {
type AuthProfileStore,
ensureAuthProfileStore,
@@ -10,10 +11,7 @@ import {
import { CHUTES_TOKEN_ENDPOINT, type ChutesStoredOAuth } from "./chutes-oauth.js";
describe("auth-profiles (chutes)", () => {
const previousStateDir = process.env.OPENCLAW_STATE_DIR;
const previousAgentDir = process.env.OPENCLAW_AGENT_DIR;
const previousPiAgentDir = process.env.PI_CODING_AGENT_DIR;
const previousChutesClientId = process.env.CHUTES_CLIENT_ID;
let envSnapshot: ReturnType<typeof captureEnv> | undefined;
let tempDir: string | null = null;
afterEach(async () => {
@@ -22,29 +20,17 @@ describe("auth-profiles (chutes)", () => {
await fs.rm(tempDir, { recursive: true, force: true });
tempDir = null;
}
if (previousStateDir === undefined) {
delete process.env.OPENCLAW_STATE_DIR;
} else {
process.env.OPENCLAW_STATE_DIR = previousStateDir;
}
if (previousAgentDir === undefined) {
delete process.env.OPENCLAW_AGENT_DIR;
} else {
process.env.OPENCLAW_AGENT_DIR = previousAgentDir;
}
if (previousPiAgentDir === undefined) {
delete process.env.PI_CODING_AGENT_DIR;
} else {
process.env.PI_CODING_AGENT_DIR = previousPiAgentDir;
}
if (previousChutesClientId === undefined) {
delete process.env.CHUTES_CLIENT_ID;
} else {
process.env.CHUTES_CLIENT_ID = previousChutesClientId;
}
envSnapshot?.restore();
envSnapshot = undefined;
});
it("refreshes expired Chutes OAuth credentials", async () => {
envSnapshot = captureEnv([
"OPENCLAW_STATE_DIR",
"OPENCLAW_AGENT_DIR",
"PI_CODING_AGENT_DIR",
"CHUTES_CLIENT_ID",
]);
tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-chutes-"));
process.env.OPENCLAW_STATE_DIR = tempDir;
process.env.OPENCLAW_AGENT_DIR = path.join(tempDir, "agents", "main", "agent");

View File

@@ -3,13 +3,16 @@ import os from "node:os";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { AuthProfileStore } from "./types.js";
import { captureEnv } from "../../test-utils/env.js";
import { resolveApiKeyForProfile } from "./oauth.js";
import { ensureAuthProfileStore } from "./store.js";
describe("resolveApiKeyForProfile fallback to main agent", () => {
const previousStateDir = process.env.OPENCLAW_STATE_DIR;
const previousAgentDir = process.env.OPENCLAW_AGENT_DIR;
const previousPiAgentDir = process.env.PI_CODING_AGENT_DIR;
const envSnapshot = captureEnv([
"OPENCLAW_STATE_DIR",
"OPENCLAW_AGENT_DIR",
"PI_CODING_AGENT_DIR",
]);
let tmpDir: string;
let mainAgentDir: string;
let secondaryAgentDir: string;
@@ -30,22 +33,7 @@ describe("resolveApiKeyForProfile fallback to main agent", () => {
afterEach(async () => {
vi.unstubAllGlobals();
// Restore original environment
if (previousStateDir === undefined) {
delete process.env.OPENCLAW_STATE_DIR;
} else {
process.env.OPENCLAW_STATE_DIR = previousStateDir;
}
if (previousAgentDir === undefined) {
delete process.env.OPENCLAW_AGENT_DIR;
} else {
process.env.OPENCLAW_AGENT_DIR = previousAgentDir;
}
if (previousPiAgentDir === undefined) {
delete process.env.PI_CODING_AGENT_DIR;
} else {
process.env.PI_CODING_AGENT_DIR = previousPiAgentDir;
}
envSnapshot.restore();
await fs.rm(tmpDir, { recursive: true, force: true });
});

View File

@@ -1,17 +1,17 @@
import type { AgentToolResult } from "@mariozechner/pi-agent-core";
import type { ChildProcessWithoutNullStreams } from "node:child_process";
import { Type } from "@sinclair/typebox";
import path from "node:path";
import type { ExecAsk, ExecHost, ExecSecurity } from "../infra/exec-approvals.js";
import type { ProcessSession, SessionStdin } from "./bash-process-registry.js";
import type { ProcessSession } from "./bash-process-registry.js";
import type { ExecToolDetails } from "./bash-tools.exec.js";
import type { BashSandboxConfig } from "./bash-tools.shared.js";
import { requestHeartbeatNow } from "../infra/heartbeat-wake.js";
import { mergePathPrepend } from "../infra/path-prepend.js";
import { enqueueSystemEvent } from "../infra/system-events.js";
export { applyPathPrepend, normalizePathPrepend } from "../infra/path-prepend.js";
import type { ManagedRun } from "../process/supervisor/index.js";
import { logWarn } from "../logger.js";
import { formatSpawnError, spawnWithFallback } from "../process/spawn-utils.js";
import { getProcessSupervisor } from "../process/supervisor/index.js";
import {
addSession,
appendOutput,
@@ -23,7 +23,6 @@ import {
buildDockerExecArgs,
chunkString,
clampWithDefault,
killSession,
readEnvInt,
} from "./bash-tools.shared.js";
import { buildCursorPositionResponse, stripDsrRequests } from "./pty-dsr.js";
@@ -147,26 +146,6 @@ export const execSchema = Type.Object({
),
});
type PtyExitEvent = { exitCode: number; signal?: number };
type PtyListener<T> = (event: T) => void;
type PtyHandle = {
pid: number;
write: (data: string | Buffer) => void;
onData: (listener: PtyListener<string>) => void;
onExit: (listener: PtyListener<PtyExitEvent>) => void;
};
type PtySpawn = (
file: string,
args: string[] | string,
options: {
name?: string;
cols?: number;
rows?: number;
cwd?: string;
env?: Record<string, string>;
},
) => PtyHandle;
export type ExecProcessOutcome = {
status: "completed" | "failed";
exitCode: number | null;
@@ -319,138 +298,10 @@ export async function runExecProcess(opts: {
}): Promise<ExecProcessHandle> {
const startedAt = Date.now();
const sessionId = createSessionSlug();
let child: ChildProcessWithoutNullStreams | null = null;
let pty: PtyHandle | null = null;
let stdin: SessionStdin | undefined;
const execCommand = opts.execCommand ?? opts.command;
const supervisor = getProcessSupervisor();
const spawnFallbacks = [
{
label: "no-detach",
options: { detached: false },
},
];
const handleSpawnFallback = (err: unknown, fallback: { label: string }) => {
const errText = formatSpawnError(err);
const warning = `Warning: spawn failed (${errText}); retrying with ${fallback.label}.`;
logWarn(`exec: spawn failed (${errText}); retrying with ${fallback.label}.`);
opts.warnings.push(warning);
};
const spawnShellChild = async (
shell: string,
shellArgs: string[],
): Promise<ChildProcessWithoutNullStreams> => {
const { child: spawned } = await spawnWithFallback({
argv: [shell, ...shellArgs, execCommand],
options: {
cwd: opts.workdir,
env: opts.env,
detached: process.platform !== "win32",
stdio: ["pipe", "pipe", "pipe"],
windowsHide: true,
},
fallbacks: spawnFallbacks,
onFallback: handleSpawnFallback,
});
return spawned as ChildProcessWithoutNullStreams;
};
// `exec` does not currently accept tool-provided stdin content. For non-PTY runs,
// keeping stdin open can cause commands like `wc -l` (or safeBins-hardened segments)
// to block forever waiting for input, leading to accidental backgrounding.
// For interactive flows, callers should use `pty: true` (stdin kept open).
const maybeCloseNonPtyStdin = () => {
if (opts.usePty) {
return;
}
try {
// Signal EOF immediately so stdin-only commands can terminate.
child?.stdin?.end();
} catch {
// ignore stdin close errors
}
};
if (opts.sandbox) {
const { child: spawned } = await spawnWithFallback({
argv: [
"docker",
...buildDockerExecArgs({
containerName: opts.sandbox.containerName,
command: execCommand,
workdir: opts.containerWorkdir ?? opts.sandbox.containerWorkdir,
env: opts.env,
tty: opts.usePty,
}),
],
options: {
cwd: opts.workdir,
env: process.env,
detached: process.platform !== "win32",
stdio: ["pipe", "pipe", "pipe"],
windowsHide: true,
},
fallbacks: spawnFallbacks,
onFallback: handleSpawnFallback,
});
child = spawned as ChildProcessWithoutNullStreams;
stdin = child.stdin;
maybeCloseNonPtyStdin();
} else if (opts.usePty) {
const { shell, args: shellArgs } = getShellConfig();
try {
const ptyModule = (await import("@lydell/node-pty")) as unknown as {
spawn?: PtySpawn;
default?: { spawn?: PtySpawn };
};
const spawnPty = ptyModule.spawn ?? ptyModule.default?.spawn;
if (!spawnPty) {
throw new Error("PTY support is unavailable (node-pty spawn not found).");
}
pty = spawnPty(shell, [...shellArgs, execCommand], {
cwd: opts.workdir,
env: opts.env,
name: process.env.TERM ?? "xterm-256color",
cols: 120,
rows: 30,
});
stdin = {
destroyed: false,
write: (data, cb) => {
try {
pty?.write(data);
cb?.(null);
} catch (err) {
cb?.(err as Error);
}
},
end: () => {
try {
const eof = process.platform === "win32" ? "\x1a" : "\x04";
pty?.write(eof);
} catch {
// ignore EOF errors
}
},
};
} catch (err) {
const errText = String(err);
const warning = `Warning: PTY spawn failed (${errText}); retrying without PTY for \`${opts.command}\`.`;
logWarn(`exec: PTY spawn failed (${errText}); retrying without PTY for "${opts.command}".`);
opts.warnings.push(warning);
child = await spawnShellChild(shell, shellArgs);
stdin = child.stdin;
}
} else {
const { shell, args: shellArgs } = getShellConfig();
child = await spawnShellChild(shell, shellArgs);
stdin = child.stdin;
maybeCloseNonPtyStdin();
}
const session = {
const session: ProcessSession = {
id: sessionId,
command: opts.command,
scopeKey: opts.scopeKey,
@@ -458,9 +309,9 @@ export async function runExecProcess(opts: {
notifyOnExit: opts.notifyOnExit,
notifyOnExitEmptySuccess: opts.notifyOnExitEmptySuccess === true,
exitNotified: false,
child: child ?? undefined,
stdin,
pid: child?.pid ?? pty?.pid,
child: undefined,
stdin: undefined,
pid: undefined,
startedAt,
cwd: opts.workdir,
maxOutputChars: opts.maxOutput,
@@ -477,59 +328,9 @@ export async function runExecProcess(opts: {
exitSignal: undefined as NodeJS.Signals | number | null | undefined,
truncated: false,
backgrounded: false,
} satisfies ProcessSession;
};
addSession(session);
let settled = false;
let timeoutTimer: NodeJS.Timeout | null = null;
let timeoutFinalizeTimer: NodeJS.Timeout | null = null;
let timedOut = false;
const timeoutFinalizeMs = 1000;
let resolveFn: ((outcome: ExecProcessOutcome) => void) | null = null;
const settle = (outcome: ExecProcessOutcome) => {
if (settled) {
return;
}
settled = true;
resolveFn?.(outcome);
};
const finalizeTimeout = () => {
if (session.exited) {
return;
}
markExited(session, null, "SIGKILL", "failed");
maybeNotifyOnExit(session, "failed");
const aggregated = session.aggregated.trim();
const reason = `Command timed out after ${opts.timeoutSec} seconds`;
settle({
status: "failed",
exitCode: null,
exitSignal: "SIGKILL",
durationMs: Date.now() - startedAt,
aggregated,
timedOut: true,
reason: aggregated ? `${aggregated}\n\n${reason}` : reason,
});
};
const onTimeout = () => {
timedOut = true;
killSession(session);
if (!timeoutFinalizeTimer) {
timeoutFinalizeTimer = setTimeout(() => {
finalizeTimeout();
}, timeoutFinalizeMs);
}
};
if (opts.timeoutSec > 0) {
timeoutTimer = setTimeout(() => {
onTimeout();
}, opts.timeoutSec * 1000);
}
const emitUpdate = () => {
if (!opts.onUpdate) {
return;
@@ -565,116 +366,208 @@ export async function runExecProcess(opts: {
}
};
if (pty) {
const cursorResponse = buildCursorPositionResponse();
pty.onData((data) => {
const raw = data.toString();
const { cleaned, requests } = stripDsrRequests(raw);
if (requests > 0) {
const timeoutMs =
typeof opts.timeoutSec === "number" && opts.timeoutSec > 0
? Math.floor(opts.timeoutSec * 1000)
: undefined;
const spawnSpec:
| {
mode: "child";
argv: string[];
env: NodeJS.ProcessEnv;
stdinMode: "pipe-open" | "pipe-closed";
}
| {
mode: "pty";
ptyCommand: string;
childFallbackArgv: string[];
env: NodeJS.ProcessEnv;
stdinMode: "pipe-open";
} = (() => {
if (opts.sandbox) {
return {
mode: "child" as const,
argv: [
"docker",
...buildDockerExecArgs({
containerName: opts.sandbox.containerName,
command: execCommand,
workdir: opts.containerWorkdir ?? opts.sandbox.containerWorkdir,
env: opts.env,
tty: opts.usePty,
}),
],
env: process.env,
stdinMode: opts.usePty ? ("pipe-open" as const) : ("pipe-closed" as const),
};
}
const { shell, args: shellArgs } = getShellConfig();
const childArgv = [shell, ...shellArgs, execCommand];
if (opts.usePty) {
return {
mode: "pty" as const,
ptyCommand: execCommand,
childFallbackArgv: childArgv,
env: opts.env,
stdinMode: "pipe-open" as const,
};
}
return {
mode: "child" as const,
argv: childArgv,
env: opts.env,
stdinMode: "pipe-closed" as const,
};
})();
let managedRun: ManagedRun | null = null;
let usingPty = spawnSpec.mode === "pty";
const cursorResponse = buildCursorPositionResponse();
const onSupervisorStdout = (chunk: string) => {
if (usingPty) {
const { cleaned, requests } = stripDsrRequests(chunk);
if (requests > 0 && managedRun?.stdin) {
for (let i = 0; i < requests; i += 1) {
pty.write(cursorResponse);
managedRun.stdin.write(cursorResponse);
}
}
handleStdout(cleaned);
});
} else if (child) {
child.stdout.on("data", handleStdout);
child.stderr.on("data", handleStderr);
}
return;
}
handleStdout(chunk);
};
const promise = new Promise<ExecProcessOutcome>((resolve) => {
resolveFn = resolve;
const handleExit = (code: number | null, exitSignal: NodeJS.Signals | number | null) => {
if (timeoutTimer) {
clearTimeout(timeoutTimer);
}
if (timeoutFinalizeTimer) {
clearTimeout(timeoutFinalizeTimer);
try {
const spawnBase = {
runId: sessionId,
sessionId: opts.sessionKey?.trim() || sessionId,
backendId: opts.sandbox ? "exec-sandbox" : "exec-host",
scopeKey: opts.scopeKey,
cwd: opts.workdir,
env: spawnSpec.env,
timeoutMs,
captureOutput: false,
onStdout: onSupervisorStdout,
onStderr: handleStderr,
};
managedRun =
spawnSpec.mode === "pty"
? await supervisor.spawn({
...spawnBase,
mode: "pty",
ptyCommand: spawnSpec.ptyCommand,
})
: await supervisor.spawn({
...spawnBase,
mode: "child",
argv: spawnSpec.argv,
stdinMode: spawnSpec.stdinMode,
});
} catch (err) {
if (spawnSpec.mode === "pty") {
const warning = `Warning: PTY spawn failed (${String(err)}); retrying without PTY for \`${opts.command}\`.`;
logWarn(
`exec: PTY spawn failed (${String(err)}); retrying without PTY for "${opts.command}".`,
);
opts.warnings.push(warning);
usingPty = false;
try {
managedRun = await supervisor.spawn({
runId: sessionId,
sessionId: opts.sessionKey?.trim() || sessionId,
backendId: "exec-host",
scopeKey: opts.scopeKey,
mode: "child",
argv: spawnSpec.childFallbackArgv,
cwd: opts.workdir,
env: spawnSpec.env,
stdinMode: "pipe-open",
timeoutMs,
captureOutput: false,
onStdout: handleStdout,
onStderr: handleStderr,
});
} catch (retryErr) {
markExited(session, null, null, "failed");
maybeNotifyOnExit(session, "failed");
throw retryErr;
}
} else {
markExited(session, null, null, "failed");
maybeNotifyOnExit(session, "failed");
throw err;
}
}
session.stdin = managedRun.stdin;
session.pid = managedRun.pid;
const promise = managedRun
.wait()
.then((exit): ExecProcessOutcome => {
const durationMs = Date.now() - startedAt;
const wasSignal = exitSignal != null;
const isSuccess = code === 0 && !wasSignal && !timedOut;
const status: "completed" | "failed" = isSuccess ? "completed" : "failed";
markExited(session, code, exitSignal, status);
const status: "completed" | "failed" =
exit.exitCode === 0 && exit.reason === "exit" ? "completed" : "failed";
markExited(session, exit.exitCode, exit.exitSignal, status);
maybeNotifyOnExit(session, status);
if (!session.child && session.stdin) {
session.stdin.destroyed = true;
}
if (settled) {
return;
}
const aggregated = session.aggregated.trim();
if (!isSuccess) {
const reason = timedOut
? `Command timed out after ${opts.timeoutSec} seconds`
: wasSignal && exitSignal
? `Command aborted by signal ${exitSignal}`
: code === null
? "Command aborted before exit code was captured"
: `Command exited with code ${code}`;
const message = aggregated ? `${aggregated}\n\n${reason}` : reason;
settle({
status: "failed",
exitCode: code ?? null,
exitSignal: exitSignal ?? null,
if (status === "completed") {
return {
status: "completed",
exitCode: exit.exitCode ?? 0,
exitSignal: exit.exitSignal,
durationMs,
aggregated,
timedOut,
reason: message,
});
return;
timedOut: false,
};
}
settle({
status: "completed",
exitCode: code ?? 0,
exitSignal: exitSignal ?? null,
const reason =
exit.reason === "overall-timeout"
? `Command timed out after ${opts.timeoutSec} seconds`
: exit.reason === "no-output-timeout"
? "Command timed out waiting for output"
: exit.exitSignal != null
? `Command aborted by signal ${exit.exitSignal}`
: exit.exitCode == null
? "Command aborted before exit code was captured"
: `Command exited with code ${exit.exitCode}`;
return {
status: "failed",
exitCode: exit.exitCode,
exitSignal: exit.exitSignal,
durationMs,
aggregated,
timedOut: exit.timedOut,
reason: aggregated ? `${aggregated}\n\n${reason}` : reason,
};
})
.catch((err): ExecProcessOutcome => {
markExited(session, null, null, "failed");
maybeNotifyOnExit(session, "failed");
const aggregated = session.aggregated.trim();
const message = aggregated ? `${aggregated}\n\n${String(err)}` : String(err);
return {
status: "failed",
exitCode: null,
exitSignal: null,
durationMs: Date.now() - startedAt,
aggregated,
timedOut: false,
});
};
if (pty) {
pty.onExit((event) => {
const rawSignal = event.signal ?? null;
const normalizedSignal = rawSignal === 0 ? null : rawSignal;
handleExit(event.exitCode ?? null, normalizedSignal);
});
} else if (child) {
child.once("close", (code, exitSignal) => {
handleExit(code, exitSignal);
});
child.once("error", (err) => {
if (timeoutTimer) {
clearTimeout(timeoutTimer);
}
if (timeoutFinalizeTimer) {
clearTimeout(timeoutFinalizeTimer);
}
markExited(session, null, null, "failed");
maybeNotifyOnExit(session, "failed");
const aggregated = session.aggregated.trim();
const message = aggregated ? `${aggregated}\n\n${String(err)}` : String(err);
settle({
status: "failed",
exitCode: null,
exitSignal: null,
durationMs: Date.now() - startedAt,
aggregated,
timedOut,
reason: message,
});
});
}
});
reason: message,
};
});
return {
session,
startedAt,
pid: session.pid ?? undefined,
promise,
kill: () => killSession(session),
kill: () => {
managedRun?.cancel("manual-cancel");
},
};
}

View File

@@ -0,0 +1,73 @@
import { afterEach, expect, test, vi } from "vitest";
import { resetProcessRegistryForTests } from "./bash-process-registry";
afterEach(() => {
  // Drop registry state and reset module/mock state between tests so each
  // test's dynamic import of bash-tools.exec sees a fresh mocked node-pty.
  resetProcessRegistryForTests();
  vi.resetModules();
  vi.clearAllMocks();
});
// Stubs node-pty so the exec tool's PTY path runs without a real terminal.
// onData/onExit return disposables; the test asserts both get disposed.
test("exec disposes PTY listeners after normal exit", async () => {
  const disposeData = vi.fn();
  const disposeExit = vi.fn();
  vi.doMock("@lydell/node-pty", () => ({
    spawn: () => {
      return {
        pid: 0,
        write: vi.fn(),
        onData: (listener: (value: string) => void) => {
          // Emit one chunk asynchronously, mimicking real PTY output.
          setTimeout(() => listener("ok"), 0);
          return { dispose: disposeData };
        },
        onExit: (listener: (event: { exitCode: number; signal?: number }) => void) => {
          // Report a clean exit shortly after spawn.
          setTimeout(() => listener({ exitCode: 0 }), 0);
          return { dispose: disposeExit };
        },
        kill: vi.fn(),
      };
    },
  }));
  // Import after doMock so the tool picks up the stubbed module.
  const { createExecTool } = await import("./bash-tools.exec");
  const tool = createExecTool({ allowBackground: false });
  const result = await tool.execute("toolcall", {
    command: "echo ok",
    pty: true,
  });
  expect(result.details.status).toBe("completed");
  // Both listener registrations must be disposed exactly once after exit.
  expect(disposeData).toHaveBeenCalledTimes(1);
  expect(disposeExit).toHaveBeenCalledTimes(1);
});
test("exec tears down PTY resources on timeout", async () => {
  const disposeData = vi.fn();
  const disposeExit = vi.fn();
  const kill = vi.fn();
  // PTY stub that never emits data or an exit event, forcing the timeout path.
  vi.doMock("@lydell/node-pty", () => ({
    spawn: () => {
      return {
        pid: 0,
        write: vi.fn(),
        onData: () => ({ dispose: disposeData }),
        onExit: () => ({ dispose: disposeExit }),
        kill,
      };
    },
  }));
  const { createExecTool } = await import("./bash-tools.exec");
  const tool = createExecTool({ allowBackground: false });
  await expect(
    tool.execute("toolcall", {
      command: "sleep 5",
      pty: true,
      timeout: 0.01,
    }),
  ).rejects.toThrow("Command timed out");
  // On timeout the tool must kill the PTY and dispose both listeners.
  expect(kill).toHaveBeenCalledTimes(1);
  expect(disposeData).toHaveBeenCalledTimes(1);
  expect(disposeExit).toHaveBeenCalledTimes(1);
});

View File

@@ -0,0 +1,40 @@
import { afterEach, expect, test, vi } from "vitest";
import { listRunningSessions, resetProcessRegistryForTests } from "./bash-process-registry";
// vi.hoisted ensures the mock fn exists before the hoisted vi.mock factory runs.
const { supervisorSpawnMock } = vi.hoisted(() => ({
  supervisorSpawnMock: vi.fn(),
}));

// Replace the process supervisor with a stub whose spawn behavior each test
// controls via supervisorSpawnMock; the other methods are inert spies.
vi.mock("../process/supervisor/index.js", () => ({
  getProcessSupervisor: () => ({
    spawn: (...args: unknown[]) => supervisorSpawnMock(...args),
    cancel: vi.fn(),
    cancelScope: vi.fn(),
    reconcileOrphans: vi.fn(),
    getRecord: vi.fn(),
  }),
}));

afterEach(() => {
  // Reset registry plus module/mock state so each test re-imports fresh.
  resetProcessRegistryForTests();
  vi.resetModules();
  vi.clearAllMocks();
});
test("exec cleans session state when PTY fallback spawn also fails", async () => {
  // First spawn attempt (PTY) fails, then the child-process fallback fails too.
  supervisorSpawnMock
    .mockRejectedValueOnce(new Error("pty spawn failed"))
    .mockRejectedValueOnce(new Error("child fallback failed"));
  const { createExecTool } = await import("./bash-tools.exec");
  const tool = createExecTool({ allowBackground: false });
  // The surfaced error is the fallback's, not the original PTY failure.
  await expect(
    tool.execute("toolcall", {
      command: "echo ok",
      pty: true,
    }),
  ).rejects.toThrow("child fallback failed");
  // The failed run must not linger in the process registry.
  expect(listRunningSessions()).toHaveLength(0);
});

View File

@@ -0,0 +1,152 @@
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { ProcessSession } from "./bash-process-registry.js";
import {
addSession,
getFinishedSession,
getSession,
resetProcessRegistryForTests,
} from "./bash-process-registry.js";
import { createProcessTool } from "./bash-tools.process.js";
// Hoisted so the vi.mock factories below (which vitest hoists to the top of
// the module) can safely reference these mocks.
const { supervisorMock } = vi.hoisted(() => ({
  supervisorMock: {
    spawn: vi.fn(),
    cancel: vi.fn(),
    cancelScope: vi.fn(),
    reconcileOrphans: vi.fn(),
    getRecord: vi.fn(),
  },
}));

const { killProcessTreeMock } = vi.hoisted(() => ({
  killProcessTreeMock: vi.fn(),
}));

// Route supervisor access and process-tree kills through the controllable mocks.
vi.mock("../process/supervisor/index.js", () => ({
  getProcessSupervisor: () => supervisorMock,
}));

vi.mock("../process/kill-tree.js", () => ({
  killProcessTree: (...args: unknown[]) => killProcessTreeMock(...args),
}));
// Builds a minimal backgrounded ProcessSession fixture for registry tests.
// Only `id` and `pid` vary per test; the remaining fields are inert defaults
// (empty output buffers, not yet exited, flagged as backgrounded).
function createBackgroundSession(id: string, pid?: number): ProcessSession {
  return {
    id,
    command: "sleep 999",
    startedAt: Date.now(),
    cwd: "/tmp",
    maxOutputChars: 10_000,
    pendingMaxOutputChars: 30_000,
    totalOutputChars: 0,
    pendingStdout: [],
    pendingStderr: [],
    pendingStdoutChars: 0,
    pendingStderrChars: 0,
    aggregated: "",
    tail: "",
    pid,
    exited: false,
    exitCode: undefined,
    exitSignal: undefined,
    truncated: false,
    backgrounded: true,
  };
}
describe("process tool supervisor cancellation", () => {
  beforeEach(() => {
    // Reset every mock so call counts and return values don't leak across tests.
    supervisorMock.spawn.mockReset();
    supervisorMock.cancel.mockReset();
    supervisorMock.cancelScope.mockReset();
    supervisorMock.reconcileOrphans.mockReset();
    supervisorMock.getRecord.mockReset();
    killProcessTreeMock.mockReset();
  });

  afterEach(() => {
    resetProcessRegistryForTests();
  });

  it("routes kill through supervisor when run is managed", async () => {
    // A live supervisor record means the supervisor owns termination.
    supervisorMock.getRecord.mockReturnValue({
      runId: "sess",
      state: "running",
    });
    addSession(createBackgroundSession("sess"));
    const processTool = createProcessTool();
    const result = await processTool.execute("toolcall", {
      action: "kill",
      sessionId: "sess",
    });
    expect(supervisorMock.cancel).toHaveBeenCalledWith("sess", "manual-cancel");
    // The session stays registered (and not exited) until the supervisor
    // reports the actual exit.
    expect(getSession("sess")).toBeDefined();
    expect(getSession("sess")?.exited).toBe(false);
    expect(result.content[0]).toMatchObject({
      type: "text",
      text: "Termination requested for session sess.",
    });
  });

  it("remove drops running session immediately when cancellation is requested", async () => {
    supervisorMock.getRecord.mockReturnValue({
      runId: "sess",
      state: "running",
    });
    addSession(createBackgroundSession("sess"));
    const processTool = createProcessTool();
    const result = await processTool.execute("toolcall", {
      action: "remove",
      sessionId: "sess",
    });
    expect(supervisorMock.cancel).toHaveBeenCalledWith("sess", "manual-cancel");
    // Unlike kill, remove deletes the session from both live and finished maps.
    expect(getSession("sess")).toBeUndefined();
    expect(getFinishedSession("sess")).toBeUndefined();
    expect(result.content[0]).toMatchObject({
      type: "text",
      text: "Removed session sess (termination requested).",
    });
  });

  it("falls back to process-tree kill when supervisor record is missing", async () => {
    // No supervisor record: the tool should kill by pid instead.
    supervisorMock.getRecord.mockReturnValue(undefined);
    addSession(createBackgroundSession("sess-fallback", 4242));
    const processTool = createProcessTool();
    const result = await processTool.execute("toolcall", {
      action: "kill",
      sessionId: "sess-fallback",
    });
    expect(killProcessTreeMock).toHaveBeenCalledWith(4242);
    // The fallback path marks the session finished immediately.
    expect(getSession("sess-fallback")).toBeUndefined();
    expect(getFinishedSession("sess-fallback")).toBeDefined();
    expect(result.content[0]).toMatchObject({
      type: "text",
      text: "Killed session sess-fallback.",
    });
  });

  it("fails remove when no supervisor record and no pid is available", async () => {
    supervisorMock.getRecord.mockReturnValue(undefined);
    addSession(createBackgroundSession("sess-no-pid"));
    const processTool = createProcessTool();
    const result = await processTool.execute("toolcall", {
      action: "remove",
      sessionId: "sess-no-pid",
    });
    // With nothing to kill, the session must be left untouched and the call fail.
    expect(killProcessTreeMock).not.toHaveBeenCalled();
    expect(getSession("sess-no-pid")).toBeDefined();
    expect(result.details).toMatchObject({ status: "failed" });
    expect(result.content[0]).toMatchObject({
      type: "text",
      text: "Unable to remove session sess-no-pid: no active supervisor run or process id.",
    });
  });
});

View File

@@ -1,7 +1,10 @@
import type { AgentTool, AgentToolResult } from "@mariozechner/pi-agent-core";
import { Type } from "@sinclair/typebox";
import { formatDurationCompact } from "../infra/format-time/format-duration.ts";
import { killProcessTree } from "../process/kill-tree.js";
import { getProcessSupervisor } from "../process/supervisor/index.js";
import {
type ProcessSession,
deleteSession,
drainSession,
getFinishedSession,
@@ -11,13 +14,7 @@ import {
markExited,
setJobTtlMs,
} from "./bash-process-registry.js";
import {
deriveSessionName,
killSession,
pad,
sliceLogLines,
truncateMiddle,
} from "./bash-tools.shared.js";
import { deriveSessionName, pad, sliceLogLines, truncateMiddle } from "./bash-tools.shared.js";
import { encodeKeySequence, encodePaste } from "./pty-keys.js";
export type ProcessToolDefaults = {
@@ -65,8 +62,9 @@ const processSchema = Type.Object({
offset: Type.Optional(Type.Number({ description: "Log offset" })),
limit: Type.Optional(Type.Number({ description: "Log length" })),
timeout: Type.Optional(
Type.Union([Type.Number(), Type.String()], {
Type.Number({
description: "For poll: wait up to this many milliseconds before returning",
minimum: 0,
}),
),
});
@@ -106,9 +104,28 @@ export function createProcessTool(
setJobTtlMs(defaults.cleanupMs);
}
const scopeKey = defaults?.scopeKey;
const supervisor = getProcessSupervisor();
const isInScope = (session?: { scopeKey?: string } | null) =>
!scopeKey || session?.scopeKey === scopeKey;
const cancelManagedSession = (sessionId: string) => {
const record = supervisor.getRecord(sessionId);
if (!record || record.state === "exited") {
return false;
}
supervisor.cancel(sessionId, "manual-cancel");
return true;
};
const terminateSessionFallback = (session: ProcessSession) => {
const pid = session.pid ?? session.child?.pid;
if (typeof pid !== "number" || !Number.isFinite(pid) || pid <= 0) {
return false;
}
killProcessTree(pid);
return true;
};
return {
name: "process",
label: "process",
@@ -138,7 +155,7 @@ export function createProcessTool(
eof?: boolean;
offset?: number;
limit?: number;
timeout?: number | string;
timeout?: unknown;
};
if (params.action === "list") {
@@ -522,10 +539,25 @@ export function createProcessTool(
if (!scopedSession.backgrounded) {
return failText(`Session ${params.sessionId} is not backgrounded.`);
}
killSession(scopedSession);
markExited(scopedSession, null, "SIGKILL", "failed");
const canceled = cancelManagedSession(scopedSession.id);
if (!canceled) {
const terminated = terminateSessionFallback(scopedSession);
if (!terminated) {
return failText(
`Unable to terminate session ${params.sessionId}: no active supervisor run or process id.`,
);
}
markExited(scopedSession, null, "SIGKILL", "failed");
}
return {
content: [{ type: "text", text: `Killed session ${params.sessionId}.` }],
content: [
{
type: "text",
text: canceled
? `Termination requested for session ${params.sessionId}.`
: `Killed session ${params.sessionId}.`,
},
],
details: {
status: "failed",
name: scopedSession ? deriveSessionName(scopedSession.command) : undefined,
@@ -554,10 +586,30 @@ export function createProcessTool(
case "remove": {
if (scopedSession) {
killSession(scopedSession);
markExited(scopedSession, null, "SIGKILL", "failed");
const canceled = cancelManagedSession(scopedSession.id);
if (canceled) {
// Keep remove semantics deterministic: drop from process registry now.
scopedSession.backgrounded = false;
deleteSession(params.sessionId);
} else {
const terminated = terminateSessionFallback(scopedSession);
if (!terminated) {
return failText(
`Unable to remove session ${params.sessionId}: no active supervisor run or process id.`,
);
}
markExited(scopedSession, null, "SIGKILL", "failed");
deleteSession(params.sessionId);
}
return {
content: [{ type: "text", text: `Removed session ${params.sessionId}.` }],
content: [
{
type: "text",
text: canceled
? `Removed session ${params.sessionId} (termination requested).`
: `Removed session ${params.sessionId}.`,
},
],
details: {
status: "failed",
name: scopedSession ? deriveSessionName(scopedSession.command) : undefined,

View File

@@ -1,11 +1,9 @@
import type { ChildProcessWithoutNullStreams } from "node:child_process";
import { existsSync, statSync } from "node:fs";
import fs from "node:fs/promises";
import { homedir } from "node:os";
import path from "node:path";
import { sliceUtf16Safe } from "../utils.js";
import { assertSandboxPath } from "./sandbox-paths.js";
import { killProcessTree } from "./shell-utils.js";
const CHUNK_LIMIT = 8 * 1024;
@@ -115,13 +113,6 @@ export async function resolveSandboxWorkdir(params: {
}
}
/**
 * Best-effort teardown of a tracked process session.
 *
 * Resolves a pid from the session record, preferring the explicitly recorded
 * pid and falling back to the spawned child's pid, then kills the whole
 * process tree rooted at it. A missing (or falsy) pid is a no-op.
 */
export function killSession(session: { pid?: number; child?: ChildProcessWithoutNullStreams }) {
  const targetPid = session.pid ?? session.child?.pid;
  if (!targetPid) {
    return;
  }
  killProcessTree(targetPid);
}
export function resolveWorkdir(workdir: string, warnings: string[]) {
const current = safeCwd();
const fallback = current ?? homedir();

View File

@@ -0,0 +1,36 @@
import { describe, expect, it } from "vitest";
import type { OpenClawConfig } from "../config/config.js";
import { resolveCliBackendConfig } from "./cli-backends.js";
// Checks that backend reliability settings are deep-merged rather than
// replaced wholesale when a user config overrides a single watchdog field.
describe("resolveCliBackendConfig reliability merge", () => {
  it("deep-merges reliability watchdog overrides for codex", () => {
    // Override exactly one resume field; everything else should come from defaults.
    const codexOverride = {
      command: "codex",
      reliability: {
        watchdog: {
          resume: {
            noOutputTimeoutMs: 42_000,
          },
        },
      },
    };
    const cfg = {
      agents: {
        defaults: {
          cliBackends: {
            "codex-cli": codexOverride,
          },
        },
      },
    } satisfies OpenClawConfig;

    const resolved = resolveCliBackendConfig("codex-cli", cfg);
    expect(resolved).not.toBeNull();

    const watchdog = resolved?.config.reliability?.watchdog;
    // The overridden field wins.
    expect(watchdog?.resume?.noOutputTimeoutMs).toBe(42_000);
    // Sibling defaults survive the merge.
    expect(watchdog?.resume?.noOutputTimeoutRatio).toBe(0.3);
    expect(watchdog?.resume?.minMs).toBe(60_000);
    expect(watchdog?.resume?.maxMs).toBe(180_000);
    expect(watchdog?.fresh?.noOutputTimeoutRatio).toBe(0.8);
  });
});

View File

@@ -1,5 +1,9 @@
import type { OpenClawConfig } from "../config/config.js";
import type { CliBackendConfig } from "../config/types.js";
import {
CLI_FRESH_WATCHDOG_DEFAULTS,
CLI_RESUME_WATCHDOG_DEFAULTS,
} from "./cli-watchdog-defaults.js";
import { normalizeProviderId } from "./model-selection.js";
export type ResolvedCliBackend = {
@@ -49,6 +53,12 @@ const DEFAULT_CLAUDE_BACKEND: CliBackendConfig = {
systemPromptMode: "append",
systemPromptWhen: "first",
clearEnv: ["ANTHROPIC_API_KEY", "ANTHROPIC_API_KEY_OLD"],
reliability: {
watchdog: {
fresh: { ...CLI_FRESH_WATCHDOG_DEFAULTS },
resume: { ...CLI_RESUME_WATCHDOG_DEFAULTS },
},
},
serialize: true,
};
@@ -73,6 +83,12 @@ const DEFAULT_CODEX_BACKEND: CliBackendConfig = {
sessionMode: "existing",
imageArg: "--image",
imageMode: "repeat",
reliability: {
watchdog: {
fresh: { ...CLI_FRESH_WATCHDOG_DEFAULTS },
resume: { ...CLI_RESUME_WATCHDOG_DEFAULTS },
},
},
serialize: true,
};
@@ -96,6 +112,10 @@ function mergeBackendConfig(base: CliBackendConfig, override?: CliBackendConfig)
if (!override) {
return { ...base };
}
const baseFresh = base.reliability?.watchdog?.fresh ?? {};
const baseResume = base.reliability?.watchdog?.resume ?? {};
const overrideFresh = override.reliability?.watchdog?.fresh ?? {};
const overrideResume = override.reliability?.watchdog?.resume ?? {};
return {
...base,
...override,
@@ -106,6 +126,22 @@ function mergeBackendConfig(base: CliBackendConfig, override?: CliBackendConfig)
sessionIdFields: override.sessionIdFields ?? base.sessionIdFields,
sessionArgs: override.sessionArgs ?? base.sessionArgs,
resumeArgs: override.resumeArgs ?? base.resumeArgs,
reliability: {
...base.reliability,
...override.reliability,
watchdog: {
...base.reliability?.watchdog,
...override.reliability?.watchdog,
fresh: {
...baseFresh,
...overrideFresh,
},
resume: {
...baseResume,
...overrideResume,
},
},
},
};
}

View File

@@ -3,50 +3,69 @@ import os from "node:os";
import path from "node:path";
import { beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../config/config.js";
import type { CliBackendConfig } from "../config/types.js";
import { runCliAgent } from "./cli-runner.js";
import { cleanupResumeProcesses, cleanupSuspendedCliProcesses } from "./cli-runner/helpers.js";
import { resolveCliNoOutputTimeoutMs } from "./cli-runner/helpers.js";
const runCommandWithTimeoutMock = vi.fn();
const runExecMock = vi.fn();
const supervisorSpawnMock = vi.fn();
vi.mock("../process/exec.js", () => ({
runCommandWithTimeout: (...args: unknown[]) => runCommandWithTimeoutMock(...args),
runExec: (...args: unknown[]) => runExecMock(...args),
vi.mock("../process/supervisor/index.js", () => ({
getProcessSupervisor: () => ({
spawn: (...args: unknown[]) => supervisorSpawnMock(...args),
cancel: vi.fn(),
cancelScope: vi.fn(),
reconcileOrphans: vi.fn(),
getRecord: vi.fn(),
}),
}));
describe("runCliAgent resume cleanup", () => {
type MockRunExit = {
reason:
| "manual-cancel"
| "overall-timeout"
| "no-output-timeout"
| "spawn-error"
| "signal"
| "exit";
exitCode: number | null;
exitSignal: NodeJS.Signals | number | null;
durationMs: number;
stdout: string;
stderr: string;
timedOut: boolean;
noOutputTimedOut: boolean;
};
function createManagedRun(exit: MockRunExit, pid = 1234) {
return {
runId: "run-supervisor",
pid,
startedAtMs: Date.now(),
stdin: undefined,
wait: vi.fn().mockResolvedValue(exit),
cancel: vi.fn(),
};
}
describe("runCliAgent with process supervisor", () => {
beforeEach(() => {
runCommandWithTimeoutMock.mockReset();
runExecMock.mockReset();
supervisorSpawnMock.mockReset();
});
it("kills stale resume processes for codex sessions", async () => {
const selfPid = process.pid;
runExecMock
.mockResolvedValueOnce({
stdout: " 1 999 S /bin/launchd\n",
it("runs CLI through supervisor and returns payload", async () => {
supervisorSpawnMock.mockResolvedValueOnce(
createManagedRun({
reason: "exit",
exitCode: 0,
exitSignal: null,
durationMs: 50,
stdout: "ok",
stderr: "",
}) // cleanupSuspendedCliProcesses (ps) — ppid 999 != selfPid, no match
.mockResolvedValueOnce({
stdout: [
` ${selfPid + 1} ${selfPid} codex exec resume thread-123 --color never --sandbox read-only --skip-git-repo-check`,
` ${selfPid + 2} 999 codex exec resume thread-123 --color never --sandbox read-only --skip-git-repo-check`,
].join("\n"),
stderr: "",
}) // cleanupResumeProcesses (ps)
.mockResolvedValueOnce({ stdout: "", stderr: "" }) // cleanupResumeProcesses (kill -TERM)
.mockResolvedValueOnce({ stdout: "", stderr: "" }); // cleanupResumeProcesses (kill -9)
runCommandWithTimeoutMock.mockResolvedValueOnce({
stdout: "ok",
stderr: "",
code: 0,
signal: null,
killed: false,
});
timedOut: false,
noOutputTimedOut: false,
}),
);
await runCliAgent({
const result = await runCliAgent({
sessionId: "s1",
sessionFile: "/tmp/session.jsonl",
workspaceDir: "/tmp",
@@ -58,28 +77,80 @@ describe("runCliAgent resume cleanup", () => {
cliSessionId: "thread-123",
});
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
expect(result.payloads?.[0]?.text).toBe("ok");
expect(supervisorSpawnMock).toHaveBeenCalledTimes(1);
const input = supervisorSpawnMock.mock.calls[0]?.[0] as {
argv?: string[];
mode?: string;
timeoutMs?: number;
noOutputTimeoutMs?: number;
replaceExistingScope?: boolean;
scopeKey?: string;
};
expect(input.mode).toBe("child");
expect(input.argv?.[0]).toBe("codex");
expect(input.timeoutMs).toBe(1_000);
expect(input.noOutputTimeoutMs).toBeGreaterThanOrEqual(1_000);
expect(input.replaceExistingScope).toBe(true);
expect(input.scopeKey).toContain("thread-123");
});
expect(runExecMock).toHaveBeenCalledTimes(4);
it("fails with timeout when no-output watchdog trips", async () => {
supervisorSpawnMock.mockResolvedValueOnce(
createManagedRun({
reason: "no-output-timeout",
exitCode: null,
exitSignal: "SIGKILL",
durationMs: 200,
stdout: "",
stderr: "",
timedOut: true,
noOutputTimedOut: true,
}),
);
// Second call: cleanupResumeProcesses ps
const psCall = runExecMock.mock.calls[1] ?? [];
expect(psCall[0]).toBe("ps");
await expect(
runCliAgent({
sessionId: "s1",
sessionFile: "/tmp/session.jsonl",
workspaceDir: "/tmp",
prompt: "hi",
provider: "codex-cli",
model: "gpt-5.2-codex",
timeoutMs: 1_000,
runId: "run-2",
cliSessionId: "thread-123",
}),
).rejects.toThrow("produced no output");
});
// Third call: TERM, only the child PID
const termCall = runExecMock.mock.calls[2] ?? [];
expect(termCall[0]).toBe("kill");
const termArgs = termCall[1] as string[];
expect(termArgs).toEqual(["-TERM", String(selfPid + 1)]);
it("fails with timeout when overall timeout trips", async () => {
supervisorSpawnMock.mockResolvedValueOnce(
createManagedRun({
reason: "overall-timeout",
exitCode: null,
exitSignal: "SIGKILL",
durationMs: 200,
stdout: "",
stderr: "",
timedOut: true,
noOutputTimedOut: false,
}),
);
// Fourth call: KILL, only the child PID
const killCall = runExecMock.mock.calls[3] ?? [];
expect(killCall[0]).toBe("kill");
const killArgs = killCall[1] as string[];
expect(killArgs).toEqual(["-9", String(selfPid + 1)]);
await expect(
runCliAgent({
sessionId: "s1",
sessionFile: "/tmp/session.jsonl",
workspaceDir: "/tmp",
prompt: "hi",
provider: "codex-cli",
model: "gpt-5.2-codex",
timeoutMs: 1_000,
runId: "run-3",
cliSessionId: "thread-123",
}),
).rejects.toThrow("exceeded timeout");
});
it("falls back to per-agent workspace when workspaceDir is missing", async () => {
@@ -94,14 +165,18 @@ describe("runCliAgent resume cleanup", () => {
},
} satisfies OpenClawConfig;
runExecMock.mockResolvedValue({ stdout: "", stderr: "" });
runCommandWithTimeoutMock.mockResolvedValueOnce({
stdout: "ok",
stderr: "",
code: 0,
signal: null,
killed: false,
});
supervisorSpawnMock.mockResolvedValueOnce(
createManagedRun({
reason: "exit",
exitCode: 0,
exitSignal: null,
durationMs: 25,
stdout: "ok",
stderr: "",
timedOut: false,
noOutputTimedOut: false,
}),
);
try {
await runCliAgent({
@@ -114,264 +189,33 @@ describe("runCliAgent resume cleanup", () => {
provider: "codex-cli",
model: "gpt-5.2-codex",
timeoutMs: 1_000,
runId: "run-1",
runId: "run-4",
});
} finally {
await fs.rm(tempDir, { recursive: true, force: true });
}
const options = runCommandWithTimeoutMock.mock.calls[0]?.[1] as { cwd?: string };
expect(options.cwd).toBe(path.resolve(fallbackWorkspace));
const input = supervisorSpawnMock.mock.calls[0]?.[0] as { cwd?: string };
expect(input.cwd).toBe(path.resolve(fallbackWorkspace));
});
});
it("throws when sessionKey is malformed", async () => {
const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-cli-runner-"));
const mainWorkspace = path.join(tempDir, "workspace-main");
const researchWorkspace = path.join(tempDir, "workspace-research");
await fs.mkdir(mainWorkspace, { recursive: true });
await fs.mkdir(researchWorkspace, { recursive: true });
const cfg = {
agents: {
defaults: {
workspace: mainWorkspace,
describe("resolveCliNoOutputTimeoutMs", () => {
it("uses backend-configured resume watchdog override", () => {
const timeoutMs = resolveCliNoOutputTimeoutMs({
backend: {
command: "codex",
reliability: {
watchdog: {
resume: {
noOutputTimeoutMs: 42_000,
},
},
},
list: [{ id: "research", workspace: researchWorkspace }],
},
} satisfies OpenClawConfig;
try {
await expect(
runCliAgent({
sessionId: "s1",
sessionKey: "agent::broken",
agentId: "research",
sessionFile: "/tmp/session.jsonl",
workspaceDir: undefined as unknown as string,
config: cfg,
prompt: "hi",
provider: "codex-cli",
model: "gpt-5.2-codex",
timeoutMs: 1_000,
runId: "run-2",
}),
).rejects.toThrow("Malformed agent session key");
} finally {
await fs.rm(tempDir, { recursive: true, force: true });
}
expect(runCommandWithTimeoutMock).not.toHaveBeenCalled();
});
});
describe("cleanupSuspendedCliProcesses", () => {
beforeEach(() => {
runExecMock.mockReset();
});
it("skips when no session tokens are configured", async () => {
await cleanupSuspendedCliProcesses(
{
command: "tool",
} as CliBackendConfig,
0,
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
expect(runExecMock).not.toHaveBeenCalled();
});
it("matches sessionArg-based commands", async () => {
const selfPid = process.pid;
runExecMock
.mockResolvedValueOnce({
stdout: [
` 40 ${selfPid} T+ claude --session-id thread-1 -p`,
` 41 ${selfPid} S claude --session-id thread-2 -p`,
].join("\n"),
stderr: "",
})
.mockResolvedValueOnce({ stdout: "", stderr: "" });
await cleanupSuspendedCliProcesses(
{
command: "claude",
sessionArg: "--session-id",
} as CliBackendConfig,
0,
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
expect(runExecMock).toHaveBeenCalledTimes(2);
const killCall = runExecMock.mock.calls[1] ?? [];
expect(killCall[0]).toBe("kill");
expect(killCall[1]).toEqual(["-9", "40"]);
});
it("matches resumeArgs with positional session id", async () => {
const selfPid = process.pid;
runExecMock
.mockResolvedValueOnce({
stdout: [
` 50 ${selfPid} T codex exec resume thread-99 --color never --sandbox read-only`,
` 51 ${selfPid} T codex exec resume other --color never --sandbox read-only`,
].join("\n"),
stderr: "",
})
.mockResolvedValueOnce({ stdout: "", stderr: "" });
await cleanupSuspendedCliProcesses(
{
command: "codex",
resumeArgs: ["exec", "resume", "{sessionId}", "--color", "never", "--sandbox", "read-only"],
} as CliBackendConfig,
1,
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
expect(runExecMock).toHaveBeenCalledTimes(2);
const killCall = runExecMock.mock.calls[1] ?? [];
expect(killCall[0]).toBe("kill");
expect(killCall[1]).toEqual(["-9", "50", "51"]);
});
it("only kills child processes of current process (ppid validation)", async () => {
const selfPid = process.pid;
const childPid = selfPid + 1;
const unrelatedPid = 9999;
runExecMock
.mockResolvedValueOnce({
stdout: [
` ${childPid} ${selfPid} T claude --session-id thread-1 -p`,
` ${unrelatedPid} 100 T claude --session-id thread-2 -p`,
].join("\n"),
stderr: "",
})
.mockResolvedValueOnce({ stdout: "", stderr: "" });
await cleanupSuspendedCliProcesses(
{
command: "claude",
sessionArg: "--session-id",
} as CliBackendConfig,
0,
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
expect(runExecMock).toHaveBeenCalledTimes(2);
const killCall = runExecMock.mock.calls[1] ?? [];
expect(killCall[0]).toBe("kill");
// Only childPid killed; unrelatedPid (ppid=100) excluded
expect(killCall[1]).toEqual(["-9", String(childPid)]);
});
it("skips all processes when none are children of current process", async () => {
runExecMock.mockResolvedValueOnce({
stdout: [
" 200 100 T claude --session-id thread-1 -p",
" 201 100 T claude --session-id thread-2 -p",
].join("\n"),
stderr: "",
timeoutMs: 120_000,
useResume: true,
});
await cleanupSuspendedCliProcesses(
{
command: "claude",
sessionArg: "--session-id",
} as CliBackendConfig,
0,
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
// Only ps called — no kill because no matching ppid
expect(runExecMock).toHaveBeenCalledTimes(1);
});
});
describe("cleanupResumeProcesses", () => {
beforeEach(() => {
runExecMock.mockReset();
});
it("only kills resume processes owned by current process", async () => {
const selfPid = process.pid;
runExecMock
.mockResolvedValueOnce({
stdout: [
` ${selfPid + 1} ${selfPid} codex exec resume abc-123`,
` ${selfPid + 2} 999 codex exec resume abc-123`,
].join("\n"),
stderr: "",
})
.mockResolvedValueOnce({ stdout: "", stderr: "" })
.mockResolvedValueOnce({ stdout: "", stderr: "" });
await cleanupResumeProcesses(
{
command: "codex",
resumeArgs: ["exec", "resume", "{sessionId}"],
} as CliBackendConfig,
"abc-123",
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
expect(runExecMock).toHaveBeenCalledTimes(3);
const termCall = runExecMock.mock.calls[1] ?? [];
expect(termCall[0]).toBe("kill");
expect(termCall[1]).toEqual(["-TERM", String(selfPid + 1)]);
const killCall = runExecMock.mock.calls[2] ?? [];
expect(killCall[0]).toBe("kill");
expect(killCall[1]).toEqual(["-9", String(selfPid + 1)]);
});
it("skips kill when no resume processes match ppid", async () => {
runExecMock.mockResolvedValueOnce({
stdout: [" 300 100 codex exec resume abc-123", " 301 200 codex exec resume abc-123"].join(
"\n",
),
stderr: "",
});
await cleanupResumeProcesses(
{
command: "codex",
resumeArgs: ["exec", "resume", "{sessionId}"],
} as CliBackendConfig,
"abc-123",
);
if (process.platform === "win32") {
expect(runExecMock).not.toHaveBeenCalled();
return;
}
// Only ps called — no kill because no matching ppid
expect(runExecMock).toHaveBeenCalledTimes(1);
expect(timeoutMs).toBe(42_000);
});
});

View File

@@ -6,20 +6,20 @@ import { resolveHeartbeatPrompt } from "../auto-reply/heartbeat.js";
import { shouldLogVerbose } from "../globals.js";
import { isTruthyEnvValue } from "../infra/env.js";
import { createSubsystemLogger } from "../logging/subsystem.js";
import { runCommandWithTimeout } from "../process/exec.js";
import { getProcessSupervisor } from "../process/supervisor/index.js";
import { resolveSessionAgentIds } from "./agent-scope.js";
import { makeBootstrapWarn, resolveBootstrapContextForRun } from "./bootstrap-files.js";
import { resolveCliBackendConfig } from "./cli-backends.js";
import {
appendImagePathsToPrompt,
buildCliSupervisorScopeKey,
buildCliArgs,
buildSystemPrompt,
cleanupResumeProcesses,
cleanupSuspendedCliProcesses,
enqueueCliRun,
normalizeCliModel,
parseCliJson,
parseCliJsonl,
resolveCliNoOutputTimeoutMs,
resolvePromptInput,
resolveSessionIdToSend,
resolveSystemPromptUsage,
@@ -226,19 +226,32 @@ export async function runCliAgent(params: {
}
return next;
})();
// Cleanup suspended processes that have accumulated (regardless of sessionId)
await cleanupSuspendedCliProcesses(backend);
if (useResume && cliSessionIdToSend) {
await cleanupResumeProcesses(backend, cliSessionIdToSend);
}
const result = await runCommandWithTimeout([backend.command, ...args], {
const noOutputTimeoutMs = resolveCliNoOutputTimeoutMs({
backend,
timeoutMs: params.timeoutMs,
useResume,
});
const supervisor = getProcessSupervisor();
const scopeKey = buildCliSupervisorScopeKey({
backend,
backendId: backendResolved.id,
cliSessionId: useResume ? cliSessionIdToSend : undefined,
});
const managedRun = await supervisor.spawn({
sessionId: params.sessionId,
backendId: backendResolved.id,
scopeKey,
replaceExistingScope: Boolean(useResume && scopeKey),
mode: "child",
argv: [backend.command, ...args],
timeoutMs: params.timeoutMs,
noOutputTimeoutMs,
cwd: workspaceDir,
env,
input: stdinPayload,
});
const result = await managedRun.wait();
const stdout = result.stdout.trim();
const stderr = result.stderr.trim();
@@ -259,7 +272,28 @@ export async function runCliAgent(params: {
}
}
if (result.code !== 0) {
if (result.exitCode !== 0 || result.reason !== "exit") {
if (result.reason === "no-output-timeout" || result.noOutputTimedOut) {
const timeoutReason = `CLI produced no output for ${Math.round(noOutputTimeoutMs / 1000)}s and was terminated.`;
log.warn(
`cli watchdog timeout: provider=${params.provider} model=${modelId} session=${cliSessionIdToSend ?? params.sessionId} noOutputTimeoutMs=${noOutputTimeoutMs} pid=${managedRun.pid ?? "unknown"}`,
);
throw new FailoverError(timeoutReason, {
reason: "timeout",
provider: params.provider,
model: modelId,
status: resolveFailoverStatus("timeout"),
});
}
if (result.reason === "overall-timeout") {
const timeoutReason = `CLI exceeded timeout (${Math.round(params.timeoutMs / 1000)}s) and was terminated.`;
throw new FailoverError(timeoutReason, {
reason: "timeout",
provider: params.provider,
model: modelId,
status: resolveFailoverStatus("timeout"),
});
}
const err = stderr || stdout || "CLI failed.";
const reason = classifyFailoverReason(err) ?? "unknown";
const status = resolveFailoverStatus(reason);

View File

@@ -8,232 +8,27 @@ import type { ThinkLevel } from "../../auto-reply/thinking.js";
import type { OpenClawConfig } from "../../config/config.js";
import type { CliBackendConfig } from "../../config/types.js";
import type { EmbeddedContextFile } from "../pi-embedded-helpers.js";
import { runExec } from "../../process/exec.js";
import { buildTtsSystemPromptHint } from "../../tts/tts.js";
import { escapeRegExp, isRecord } from "../../utils.js";
import { isRecord } from "../../utils.js";
import { buildModelAliasLines } from "../model-alias-lines.js";
import { resolveDefaultModelForAgent } from "../model-selection.js";
import { detectRuntimeShell } from "../shell-utils.js";
import { buildSystemPromptParams } from "../system-prompt-params.js";
import { buildAgentSystemPrompt } from "../system-prompt.js";
export { buildCliSupervisorScopeKey, resolveCliNoOutputTimeoutMs } from "./reliability.js";
const CLI_RUN_QUEUE = new Map<string, Promise<unknown>>();
function buildLooseArgOrderRegex(tokens: string[]): RegExp {
// Scan `ps` output lines. Keep matching flexible, but require whitespace arg boundaries
// to avoid substring matches like `codexx` or `/path/to/codexx`.
const [head, ...rest] = tokens.map((t) => String(t ?? "").trim()).filter(Boolean);
if (!head) {
return /$^/;
}
const headEscaped = escapeRegExp(head);
const headFragment = `(?:^|\\s)(?:${headEscaped}|\\S+\\/${headEscaped})(?=\\s|$)`;
const restFragments = rest.map((t) => `(?:^|\\s)${escapeRegExp(t)}(?=\\s|$)`);
return new RegExp([headFragment, ...restFragments].join(".*"));
}
async function psWithFallback(argsA: string[], argsB: string[]): Promise<string> {
try {
const { stdout } = await runExec("ps", argsA);
return stdout;
} catch {
// fallthrough
}
const { stdout } = await runExec("ps", argsB);
return stdout;
}
export async function cleanupResumeProcesses(
backend: CliBackendConfig,
sessionId: string,
): Promise<void> {
if (process.platform === "win32") {
return;
}
const resumeArgs = backend.resumeArgs ?? [];
if (resumeArgs.length === 0) {
return;
}
if (!resumeArgs.some((arg) => arg.includes("{sessionId}"))) {
return;
}
const commandToken = path.basename(backend.command ?? "").trim();
if (!commandToken) {
return;
}
const resumeTokens = resumeArgs.map((arg) => arg.replaceAll("{sessionId}", sessionId));
const pattern = [commandToken, ...resumeTokens]
.filter(Boolean)
.map((token) => escapeRegExp(token))
.join(".*");
if (!pattern) {
return;
}
try {
const stdout = await psWithFallback(
["-axww", "-o", "pid=,ppid=,command="],
["-ax", "-o", "pid=,ppid=,command="],
);
const patternRegex = buildLooseArgOrderRegex([commandToken, ...resumeTokens]);
const toKill: number[] = [];
for (const line of stdout.split("\n")) {
const trimmed = line.trim();
if (!trimmed) {
continue;
}
const match = /^(\d+)\s+(\d+)\s+(.*)$/.exec(trimmed);
if (!match) {
continue;
}
const pid = Number(match[1]);
const ppid = Number(match[2]);
const cmd = match[3] ?? "";
if (!Number.isFinite(pid)) {
continue;
}
if (ppid !== process.pid) {
continue;
}
if (!patternRegex.test(cmd)) {
continue;
}
toKill.push(pid);
}
if (toKill.length > 0) {
const pidArgs = toKill.map((pid) => String(pid));
try {
await runExec("kill", ["-TERM", ...pidArgs]);
} catch {
// ignore
}
await new Promise((resolve) => setTimeout(resolve, 250));
try {
await runExec("kill", ["-9", ...pidArgs]);
} catch {
// ignore
}
}
} catch {
// ignore errors - best effort cleanup
}
}
function buildSessionMatchers(backend: CliBackendConfig): RegExp[] {
const commandToken = path.basename(backend.command ?? "").trim();
if (!commandToken) {
return [];
}
const matchers: RegExp[] = [];
const sessionArg = backend.sessionArg?.trim();
const sessionArgs = backend.sessionArgs ?? [];
const resumeArgs = backend.resumeArgs ?? [];
const addMatcher = (args: string[]) => {
if (args.length === 0) {
return;
}
const tokens = [commandToken, ...args];
const pattern = tokens
.map((token, index) => {
const tokenPattern = tokenToRegex(token);
return index === 0 ? `(?:^|\\s)${tokenPattern}` : `\\s+${tokenPattern}`;
})
.join("");
matchers.push(new RegExp(pattern));
};
if (sessionArgs.some((arg) => arg.includes("{sessionId}"))) {
addMatcher(sessionArgs);
} else if (sessionArg) {
addMatcher([sessionArg, "{sessionId}"]);
}
if (resumeArgs.some((arg) => arg.includes("{sessionId}"))) {
addMatcher(resumeArgs);
}
return matchers;
}
function tokenToRegex(token: string): string {
if (!token.includes("{sessionId}")) {
return escapeRegExp(token);
}
const parts = token.split("{sessionId}").map((part) => escapeRegExp(part));
return parts.join("\\S+");
}
/**
* Cleanup suspended OpenClaw CLI processes that have accumulated.
* Only cleans up if there are more than the threshold (default: 10).
*/
export async function cleanupSuspendedCliProcesses(
backend: CliBackendConfig,
threshold = 10,
): Promise<void> {
if (process.platform === "win32") {
return;
}
const matchers = buildSessionMatchers(backend);
if (matchers.length === 0) {
return;
}
try {
const stdout = await psWithFallback(
["-axww", "-o", "pid=,ppid=,stat=,command="],
["-ax", "-o", "pid=,ppid=,stat=,command="],
);
const suspended: number[] = [];
for (const line of stdout.split("\n")) {
const trimmed = line.trim();
if (!trimmed) {
continue;
}
const match = /^(\d+)\s+(\d+)\s+(\S+)\s+(.*)$/.exec(trimmed);
if (!match) {
continue;
}
const pid = Number(match[1]);
const ppid = Number(match[2]);
const stat = match[3] ?? "";
const command = match[4] ?? "";
if (!Number.isFinite(pid)) {
continue;
}
if (ppid !== process.pid) {
continue;
}
if (!stat.includes("T")) {
continue;
}
if (!matchers.some((matcher) => matcher.test(command))) {
continue;
}
suspended.push(pid);
}
if (suspended.length > threshold) {
// Verified locally: stopped (T) processes ignore SIGTERM, so use SIGKILL.
await runExec("kill", ["-9", ...suspended.map((pid) => String(pid))]);
}
} catch {
// ignore errors - best effort cleanup
}
}
export function enqueueCliRun<T>(key: string, task: () => Promise<T>): Promise<T> {
const prior = CLI_RUN_QUEUE.get(key) ?? Promise.resolve();
const chained = prior.catch(() => undefined).then(task);
const tracked = chained.finally(() => {
if (CLI_RUN_QUEUE.get(key) === tracked) {
CLI_RUN_QUEUE.delete(key);
}
});
// Keep queue continuity even when a run rejects, without emitting unhandled rejections.
const tracked = chained
.catch(() => undefined)
.finally(() => {
if (CLI_RUN_QUEUE.get(key) === tracked) {
CLI_RUN_QUEUE.delete(key);
}
});
CLI_RUN_QUEUE.set(key, tracked);
return chained;
}

View File

@@ -0,0 +1,88 @@
import path from "node:path";
import type { CliBackendConfig } from "../../config/types.js";
import {
CLI_FRESH_WATCHDOG_DEFAULTS,
CLI_RESUME_WATCHDOG_DEFAULTS,
CLI_WATCHDOG_MIN_TIMEOUT_MS,
} from "../cli-watchdog-defaults.js";
function pickWatchdogProfile(
backend: CliBackendConfig,
useResume: boolean,
): {
noOutputTimeoutMs?: number;
noOutputTimeoutRatio: number;
minMs: number;
maxMs: number;
} {
const defaults = useResume ? CLI_RESUME_WATCHDOG_DEFAULTS : CLI_FRESH_WATCHDOG_DEFAULTS;
const configured = useResume
? backend.reliability?.watchdog?.resume
: backend.reliability?.watchdog?.fresh;
const ratio = (() => {
const value = configured?.noOutputTimeoutRatio;
if (typeof value !== "number" || !Number.isFinite(value)) {
return defaults.noOutputTimeoutRatio;
}
return Math.max(0.05, Math.min(0.95, value));
})();
const minMs = (() => {
const value = configured?.minMs;
if (typeof value !== "number" || !Number.isFinite(value)) {
return defaults.minMs;
}
return Math.max(CLI_WATCHDOG_MIN_TIMEOUT_MS, Math.floor(value));
})();
const maxMs = (() => {
const value = configured?.maxMs;
if (typeof value !== "number" || !Number.isFinite(value)) {
return defaults.maxMs;
}
return Math.max(CLI_WATCHDOG_MIN_TIMEOUT_MS, Math.floor(value));
})();
return {
noOutputTimeoutMs:
typeof configured?.noOutputTimeoutMs === "number" &&
Number.isFinite(configured.noOutputTimeoutMs)
? Math.max(CLI_WATCHDOG_MIN_TIMEOUT_MS, Math.floor(configured.noOutputTimeoutMs))
: undefined,
noOutputTimeoutRatio: ratio,
minMs: Math.min(minMs, maxMs),
maxMs: Math.max(minMs, maxMs),
};
}
export function resolveCliNoOutputTimeoutMs(params: {
backend: CliBackendConfig;
timeoutMs: number;
useResume: boolean;
}): number {
const profile = pickWatchdogProfile(params.backend, params.useResume);
// Keep watchdog below global timeout in normal cases.
const cap = Math.max(CLI_WATCHDOG_MIN_TIMEOUT_MS, params.timeoutMs - 1_000);
if (profile.noOutputTimeoutMs !== undefined) {
return Math.min(profile.noOutputTimeoutMs, cap);
}
const computed = Math.floor(params.timeoutMs * profile.noOutputTimeoutRatio);
const bounded = Math.min(profile.maxMs, Math.max(profile.minMs, computed));
return Math.min(bounded, cap);
}
export function buildCliSupervisorScopeKey(params: {
backend: CliBackendConfig;
backendId: string;
cliSessionId?: string;
}): string | undefined {
const commandToken = path
.basename(params.backend.command ?? "")
.trim()
.toLowerCase();
const backendToken = params.backendId.trim().toLowerCase();
const sessionToken = params.cliSessionId?.trim();
if (!sessionToken) {
return undefined;
}
return `cli:${backendToken}:${commandToken}:${sessionToken}`;
}

View File

@@ -0,0 +1,13 @@
export const CLI_WATCHDOG_MIN_TIMEOUT_MS = 1_000;
export const CLI_FRESH_WATCHDOG_DEFAULTS = {
noOutputTimeoutRatio: 0.8,
minMs: 180_000,
maxMs: 600_000,
} as const;
export const CLI_RESUME_WATCHDOG_DEFAULTS = {
noOutputTimeoutRatio: 0.3,
minMs: 60_000,
maxMs: 180_000,
} as const;

View File

@@ -1,5 +1,6 @@
import { describe, expect, it } from "vitest";
import { applyConfiguredContextWindows } from "./context.js";
import { createSessionManagerRuntimeRegistry } from "./pi-extensions/session-manager-runtime-registry.js";
describe("applyConfiguredContextWindows", () => {
it("overrides discovered cache values with explicit models.providers contextWindow", () => {
@@ -39,3 +40,23 @@ describe("applyConfiguredContextWindows", () => {
expect(cache.has("bad/model")).toBe(false);
});
});
describe("createSessionManagerRuntimeRegistry", () => {
it("stores, reads, and clears values by object identity", () => {
const registry = createSessionManagerRuntimeRegistry<{ value: number }>();
const key = {};
expect(registry.get(key)).toBeNull();
registry.set(key, { value: 1 });
expect(registry.get(key)).toEqual({ value: 1 });
registry.set(key, null);
expect(registry.get(key)).toBeNull();
});
it("ignores non-object keys", () => {
const registry = createSessionManagerRuntimeRegistry<{ value: number }>();
registry.set(null, { value: 1 });
registry.set(123, { value: 1 });
expect(registry.get(null)).toBeNull();
expect(registry.get(123)).toBeNull();
});
});

View File

@@ -3,6 +3,7 @@ import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { describe, expect, it } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import { ensureAuthProfileStore } from "./auth-profiles.js";
import { getApiKeyForModel, resolveApiKeyForProvider, resolveEnvApiKey } from "./model-auth.js";
@@ -15,9 +16,11 @@ const oauthFixture = {
describe("getApiKeyForModel", () => {
it("migrates legacy oauth.json into auth-profiles.json", async () => {
const previousStateDir = process.env.OPENCLAW_STATE_DIR;
const previousAgentDir = process.env.OPENCLAW_AGENT_DIR;
const previousPiAgentDir = process.env.PI_CODING_AGENT_DIR;
const envSnapshot = captureEnv([
"OPENCLAW_STATE_DIR",
"OPENCLAW_AGENT_DIR",
"PI_CODING_AGENT_DIR",
]);
const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-oauth-"));
try {
@@ -73,30 +76,18 @@ describe("getApiKeyForModel", () => {
},
});
} finally {
if (previousStateDir === undefined) {
delete process.env.OPENCLAW_STATE_DIR;
} else {
process.env.OPENCLAW_STATE_DIR = previousStateDir;
}
if (previousAgentDir === undefined) {
delete process.env.OPENCLAW_AGENT_DIR;
} else {
process.env.OPENCLAW_AGENT_DIR = previousAgentDir;
}
if (previousPiAgentDir === undefined) {
delete process.env.PI_CODING_AGENT_DIR;
} else {
process.env.PI_CODING_AGENT_DIR = previousPiAgentDir;
}
envSnapshot.restore();
await fs.rm(tempDir, { recursive: true, force: true });
}
});
it("suggests openai-codex when only Codex OAuth is configured", async () => {
const previousStateDir = process.env.OPENCLAW_STATE_DIR;
const previousAgentDir = process.env.OPENCLAW_AGENT_DIR;
const previousPiAgentDir = process.env.PI_CODING_AGENT_DIR;
const previousOpenAiKey = process.env.OPENAI_API_KEY;
const envSnapshot = captureEnv([
"OPENAI_API_KEY",
"OPENCLAW_STATE_DIR",
"OPENCLAW_AGENT_DIR",
"PI_CODING_AGENT_DIR",
]);
const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-auth-"));
try {
@@ -137,26 +128,7 @@ describe("getApiKeyForModel", () => {
}
expect(String(error)).toContain("openai-codex/gpt-5.3-codex");
} finally {
if (previousOpenAiKey === undefined) {
delete process.env.OPENAI_API_KEY;
} else {
process.env.OPENAI_API_KEY = previousOpenAiKey;
}
if (previousStateDir === undefined) {
delete process.env.OPENCLAW_STATE_DIR;
} else {
process.env.OPENCLAW_STATE_DIR = previousStateDir;
}
if (previousAgentDir === undefined) {
delete process.env.OPENCLAW_AGENT_DIR;
} else {
process.env.OPENCLAW_AGENT_DIR = previousAgentDir;
}
if (previousPiAgentDir === undefined) {
delete process.env.PI_CODING_AGENT_DIR;
} else {
process.env.PI_CODING_AGENT_DIR = previousPiAgentDir;
}
envSnapshot.restore();
await fs.rm(tempDir, { recursive: true, force: true });
}
});

View File

@@ -1,4 +1,5 @@
import { describe, expect, it } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import { scanOpenRouterModels } from "./model-scan.js";
function createFetchFixture(payload: unknown): typeof fetch {
@@ -66,7 +67,7 @@ describe("scanOpenRouterModels", () => {
it("requires an API key when probing", async () => {
const fetchImpl = createFetchFixture({ data: [] });
const previousKey = process.env.OPENROUTER_API_KEY;
const envSnapshot = captureEnv(["OPENROUTER_API_KEY"]);
try {
delete process.env.OPENROUTER_API_KEY;
await expect(
@@ -77,11 +78,7 @@ describe("scanOpenRouterModels", () => {
}),
).rejects.toThrow(/Missing OpenRouter API key/);
} finally {
if (previousKey === undefined) {
delete process.env.OPENROUTER_API_KEY;
} else {
process.env.OPENROUTER_API_KEY = previousKey;
}
envSnapshot.restore();
}
});
});

View File

@@ -1,6 +1,7 @@
import fs from "node:fs/promises";
import path from "node:path";
import { describe, expect, it, vi } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import {
installModelsConfigTestHooks,
withModelsTempHome as withTempHome,
@@ -12,7 +13,7 @@ installModelsConfigTestHooks({ restoreFetch: true });
describe("models-config", () => {
it("auto-injects github-copilot provider when token is present", async () => {
await withTempHome(async (home) => {
const previous = process.env.COPILOT_GITHUB_TOKEN;
const envSnapshot = captureEnv(["COPILOT_GITHUB_TOKEN"]);
process.env.COPILOT_GITHUB_TOKEN = "gh-token";
const fetchMock = vi.fn().mockResolvedValue({
ok: true,
@@ -36,20 +37,14 @@ describe("models-config", () => {
expect(parsed.providers["github-copilot"]?.baseUrl).toBe("https://api.copilot.example");
expect(parsed.providers["github-copilot"]?.models?.length ?? 0).toBe(0);
} finally {
if (previous === undefined) {
delete process.env.COPILOT_GITHUB_TOKEN;
} else {
process.env.COPILOT_GITHUB_TOKEN = previous;
}
envSnapshot.restore();
}
});
});
it("prefers COPILOT_GITHUB_TOKEN over GH_TOKEN and GITHUB_TOKEN", async () => {
await withTempHome(async () => {
const previous = process.env.COPILOT_GITHUB_TOKEN;
const previousGh = process.env.GH_TOKEN;
const previousGithub = process.env.GITHUB_TOKEN;
const envSnapshot = captureEnv(["COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"]);
process.env.COPILOT_GITHUB_TOKEN = "copilot-token";
process.env.GH_TOKEN = "gh-token";
process.env.GITHUB_TOKEN = "github-token";
@@ -70,9 +65,7 @@ describe("models-config", () => {
const [, opts] = fetchMock.mock.calls[0] as [string, { headers?: Record<string, string> }];
expect(opts?.headers?.Authorization).toBe("Bearer copilot-token");
} finally {
process.env.COPILOT_GITHUB_TOKEN = previous;
process.env.GH_TOKEN = previousGh;
process.env.GITHUB_TOKEN = previousGithub;
envSnapshot.restore();
}
});
});

View File

@@ -2,6 +2,7 @@ import fs from "node:fs/promises";
import path from "node:path";
import { describe, expect, it, vi } from "vitest";
import { DEFAULT_COPILOT_API_BASE_URL } from "../providers/github-copilot-token.js";
import { captureEnv } from "../test-utils/env.js";
import {
installModelsConfigTestHooks,
withModelsTempHome as withTempHome,
@@ -13,7 +14,7 @@ installModelsConfigTestHooks({ restoreFetch: true });
describe("models-config", () => {
it("falls back to default baseUrl when token exchange fails", async () => {
await withTempHome(async () => {
const previous = process.env.COPILOT_GITHUB_TOKEN;
const envSnapshot = captureEnv(["COPILOT_GITHUB_TOKEN"]);
process.env.COPILOT_GITHUB_TOKEN = "gh-token";
const fetchMock = vi.fn().mockResolvedValue({
ok: false,
@@ -33,20 +34,14 @@ describe("models-config", () => {
expect(parsed.providers["github-copilot"]?.baseUrl).toBe(DEFAULT_COPILOT_API_BASE_URL);
} finally {
if (previous === undefined) {
delete process.env.COPILOT_GITHUB_TOKEN;
} else {
process.env.COPILOT_GITHUB_TOKEN = previous;
}
envSnapshot.restore();
}
});
});
it("uses agentDir override auth profiles for copilot injection", async () => {
await withTempHome(async (home) => {
const previous = process.env.COPILOT_GITHUB_TOKEN;
const previousGh = process.env.GH_TOKEN;
const previousGithub = process.env.GITHUB_TOKEN;
const envSnapshot = captureEnv(["COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"]);
delete process.env.COPILOT_GITHUB_TOKEN;
delete process.env.GH_TOKEN;
delete process.env.GITHUB_TOKEN;
@@ -91,21 +86,7 @@ describe("models-config", () => {
expect(parsed.providers["github-copilot"]?.baseUrl).toBe("https://api.copilot.example");
} finally {
if (previous === undefined) {
delete process.env.COPILOT_GITHUB_TOKEN;
} else {
process.env.COPILOT_GITHUB_TOKEN = previous;
}
if (previousGh === undefined) {
delete process.env.GH_TOKEN;
} else {
process.env.GH_TOKEN = previousGh;
}
if (previousGithub === undefined) {
delete process.env.GITHUB_TOKEN;
} else {
process.env.GITHUB_TOKEN = previousGithub;
}
envSnapshot.restore();
}
});
});

View File

@@ -2,12 +2,13 @@ import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { describe, expect, it } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import { resolveImplicitProviders } from "./models-config.providers.js";
describe("MiniMax implicit provider (#15275)", () => {
it("should use anthropic-messages API for API-key provider", async () => {
const agentDir = mkdtempSync(join(tmpdir(), "openclaw-test-"));
const previous = process.env.MINIMAX_API_KEY;
const envSnapshot = captureEnv(["MINIMAX_API_KEY"]);
process.env.MINIMAX_API_KEY = "test-key";
try {
@@ -16,11 +17,7 @@ describe("MiniMax implicit provider (#15275)", () => {
expect(providers?.minimax?.api).toBe("anthropic-messages");
expect(providers?.minimax?.baseUrl).toBe("https://api.minimax.io/anthropic");
} finally {
if (previous === undefined) {
delete process.env.MINIMAX_API_KEY;
} else {
process.env.MINIMAX_API_KEY = previous;
}
envSnapshot.restore();
}
});
});

View File

@@ -2,13 +2,14 @@ import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { describe, expect, it } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import { resolveApiKeyForProvider } from "./model-auth.js";
import { buildNvidiaProvider, resolveImplicitProviders } from "./models-config.providers.js";
describe("NVIDIA provider", () => {
it("should include nvidia when NVIDIA_API_KEY is configured", async () => {
const agentDir = mkdtempSync(join(tmpdir(), "openclaw-test-"));
const previous = process.env.NVIDIA_API_KEY;
const envSnapshot = captureEnv(["NVIDIA_API_KEY"]);
process.env.NVIDIA_API_KEY = "test-key";
try {
@@ -16,17 +17,13 @@ describe("NVIDIA provider", () => {
expect(providers?.nvidia).toBeDefined();
expect(providers?.nvidia?.models?.length).toBeGreaterThan(0);
} finally {
if (previous === undefined) {
delete process.env.NVIDIA_API_KEY;
} else {
process.env.NVIDIA_API_KEY = previous;
}
envSnapshot.restore();
}
});
it("resolves the nvidia api key value from env", async () => {
const agentDir = mkdtempSync(join(tmpdir(), "openclaw-test-"));
const previous = process.env.NVIDIA_API_KEY;
const envSnapshot = captureEnv(["NVIDIA_API_KEY"]);
process.env.NVIDIA_API_KEY = "nvidia-test-api-key";
try {
@@ -39,11 +36,7 @@ describe("NVIDIA provider", () => {
expect(auth.mode).toBe("api-key");
expect(auth.source).toContain("NVIDIA_API_KEY");
} finally {
if (previous === undefined) {
delete process.env.NVIDIA_API_KEY;
} else {
process.env.NVIDIA_API_KEY = previous;
}
envSnapshot.restore();
}
});

View File

@@ -2,12 +2,13 @@ import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { describe, expect, it } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import { resolveImplicitProviders } from "./models-config.providers.js";
describe("Qianfan provider", () => {
it("should include qianfan when QIANFAN_API_KEY is configured", async () => {
const agentDir = mkdtempSync(join(tmpdir(), "openclaw-test-"));
const previous = process.env.QIANFAN_API_KEY;
const envSnapshot = captureEnv(["QIANFAN_API_KEY"]);
process.env.QIANFAN_API_KEY = "test-key";
try {
@@ -15,11 +16,7 @@ describe("Qianfan provider", () => {
expect(providers?.qianfan).toBeDefined();
expect(providers?.qianfan?.apiKey).toBe("QIANFAN_API_KEY");
} finally {
if (previous === undefined) {
delete process.env.QIANFAN_API_KEY;
} else {
process.env.QIANFAN_API_KEY = previous;
}
envSnapshot.restore();
}
});
});

View File

@@ -1,6 +1,7 @@
import fs from "node:fs/promises";
import path from "node:path";
import { describe, expect, it, vi } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import { resolveOpenClawAgentDir } from "./agent-paths.js";
import {
installModelsConfigTestHooks,
@@ -13,9 +14,7 @@ installModelsConfigTestHooks({ restoreFetch: true });
describe("models-config", () => {
it("uses the first github-copilot profile when env tokens are missing", async () => {
await withTempHome(async (home) => {
const previous = process.env.COPILOT_GITHUB_TOKEN;
const previousGh = process.env.GH_TOKEN;
const previousGithub = process.env.GITHUB_TOKEN;
const envSnapshot = captureEnv(["COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"]);
delete process.env.COPILOT_GITHUB_TOKEN;
delete process.env.GH_TOKEN;
delete process.env.GITHUB_TOKEN;
@@ -61,28 +60,14 @@ describe("models-config", () => {
const [, opts] = fetchMock.mock.calls[0] as [string, { headers?: Record<string, string> }];
expect(opts?.headers?.Authorization).toBe("Bearer alpha-token");
} finally {
if (previous === undefined) {
delete process.env.COPILOT_GITHUB_TOKEN;
} else {
process.env.COPILOT_GITHUB_TOKEN = previous;
}
if (previousGh === undefined) {
delete process.env.GH_TOKEN;
} else {
process.env.GH_TOKEN = previousGh;
}
if (previousGithub === undefined) {
delete process.env.GITHUB_TOKEN;
} else {
process.env.GITHUB_TOKEN = previousGithub;
}
envSnapshot.restore();
}
});
});
it("does not override explicit github-copilot provider config", async () => {
await withTempHome(async () => {
const previous = process.env.COPILOT_GITHUB_TOKEN;
const envSnapshot = captureEnv(["COPILOT_GITHUB_TOKEN"]);
process.env.COPILOT_GITHUB_TOKEN = "gh-token";
const fetchMock = vi.fn().mockResolvedValue({
ok: true,
@@ -115,11 +100,7 @@ describe("models-config", () => {
expect(parsed.providers["github-copilot"]?.baseUrl).toBe("https://copilot.local");
} finally {
if (previous === undefined) {
delete process.env.COPILOT_GITHUB_TOKEN;
} else {
process.env.COPILOT_GITHUB_TOKEN = previous;
}
envSnapshot.restore();
}
});
});

View File

@@ -2,6 +2,7 @@ import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { describe, expect, it, vi } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import "./test-helpers/fast-core-tools.js";
import { createOpenClawTools } from "./openclaw-tools.js";
@@ -18,8 +19,7 @@ describe("gateway tool", () => {
it("schedules SIGUSR1 restart", async () => {
vi.useFakeTimers();
const kill = vi.spyOn(process, "kill").mockImplementation(() => true);
const previousStateDir = process.env.OPENCLAW_STATE_DIR;
const previousProfile = process.env.OPENCLAW_PROFILE;
const envSnapshot = captureEnv(["OPENCLAW_STATE_DIR", "OPENCLAW_PROFILE"]);
const stateDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-test-"));
process.env.OPENCLAW_STATE_DIR = stateDir;
process.env.OPENCLAW_PROFILE = "isolated";
@@ -60,16 +60,8 @@ describe("gateway tool", () => {
} finally {
kill.mockRestore();
vi.useRealTimers();
if (previousStateDir === undefined) {
delete process.env.OPENCLAW_STATE_DIR;
} else {
process.env.OPENCLAW_STATE_DIR = previousStateDir;
}
if (previousProfile === undefined) {
delete process.env.OPENCLAW_PROFILE;
} else {
process.env.OPENCLAW_PROFILE = previousProfile;
}
envSnapshot.restore();
await fs.rm(stateDir, { recursive: true, force: true });
}
});

View File

@@ -783,7 +783,7 @@ describe("sessions tools", () => {
text?: string;
};
expect(details.status).toBe("ok");
expect(details.text).toContain("tokens 1k (in 12 / out 1k)");
expect(details.text).toMatch(/tokens 1(\.0)?k \(in 12 \/ out 1(\.0)?k\)/);
expect(details.text).toContain("prompt/cache 197k");
expect(details.text).not.toContain("1.0k io");
} finally {

View File

@@ -1,6 +1,5 @@
import { beforeEach, describe, expect, it } from "vitest";
import "./test-helpers/fast-core-tools.js";
import { createOpenClawTools } from "./openclaw-tools.js";
import {
getCallGatewayMock,
resetSessionsSpawnConfigOverride,
@@ -9,7 +8,19 @@ import {
import { resetSubagentRegistryForTests } from "./subagent-registry.js";
const callGatewayMock = getCallGatewayMock();
const setConfigOverride = setSessionsSpawnConfigOverride;
type CreateOpenClawTools = (typeof import("./openclaw-tools.js"))["createOpenClawTools"];
type CreateOpenClawToolsOpts = Parameters<CreateOpenClawTools>[0];
async function getSessionsSpawnTool(opts: CreateOpenClawToolsOpts) {
// Dynamic import: ensure harness mocks are installed before tool modules load.
const { createOpenClawTools } = await import("./openclaw-tools.js");
const tool = createOpenClawTools(opts).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
return tool;
}
describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
beforeEach(() => {
@@ -20,13 +31,10 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call6", {
task: "do thing",
@@ -58,13 +66,10 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
},
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call9", {
task: "do thing",
@@ -79,7 +84,7 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
it("sessions_spawn allows cross-agent spawning when configured", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
setConfigOverride({
setSessionsSpawnConfigOverride({
session: {
mainKey: "main",
scope: "per-sender",
@@ -110,13 +115,10 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call7", {
task: "do thing",
@@ -133,7 +135,7 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
it("sessions_spawn allows any agent when allowlist is *", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
setConfigOverride({
setSessionsSpawnConfigOverride({
session: {
mainKey: "main",
scope: "per-sender",
@@ -164,13 +166,10 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call8", {
task: "do thing",
@@ -187,7 +186,7 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
it("sessions_spawn normalizes allowlisted agent ids", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
setConfigOverride({
setSessionsSpawnConfigOverride({
session: {
mainKey: "main",
scope: "per-sender",
@@ -218,13 +217,10 @@ describe("openclaw-tools: subagents (sessions_spawn allowlist)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call10", {
task: "do thing",

View File

@@ -1,7 +1,7 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { emitAgentEvent } from "../infra/agent-events.js";
import "./test-helpers/fast-core-tools.js";
import { createOpenClawTools } from "./openclaw-tools.js";
import { sleep } from "../utils.js";
import {
getCallGatewayMock,
resetSessionsSpawnConfigOverride,
@@ -17,6 +17,19 @@ vi.mock("./pi-embedded.js", () => ({
const callGatewayMock = getCallGatewayMock();
type CreateOpenClawTools = (typeof import("./openclaw-tools.js"))["createOpenClawTools"];
type CreateOpenClawToolsOpts = Parameters<CreateOpenClawTools>[0];
async function getSessionsSpawnTool(opts: CreateOpenClawToolsOpts) {
// Dynamic import: ensure harness mocks are installed before tool modules load.
const { createOpenClawTools } = await import("./openclaw-tools.js");
const tool = createOpenClawTools(opts).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
return tool;
}
type GatewayRequest = { method?: string; params?: unknown };
type AgentWaitCall = { runId?: string; timeoutMs?: number };
@@ -112,6 +125,16 @@ function setupSessionsSpawnGatewayMock(opts: {
};
}
const waitFor = async (predicate: () => boolean, timeoutMs = 2000) => {
const start = Date.now();
while (!predicate()) {
if (Date.now() - start > timeoutMs) {
throw new Error(`timed out waiting for condition (timeoutMs=${timeoutMs})`);
}
await sleep(10);
}
};
describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
beforeEach(() => {
resetSessionsSpawnConfigOverride();
@@ -120,26 +143,21 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
it("sessions_spawn runs cleanup flow after subagent completion", async () => {
resetSubagentRegistryForTests();
callGatewayMock.mockReset();
let patchParams: { key?: string; label?: string } = {};
const patchCalls: Array<{ key?: string; label?: string }> = [];
const ctx = setupSessionsSpawnGatewayMock({
includeSessionsList: true,
includeChatHistory: true,
onSessionsPatch: (params) => {
const rec = params as { key?: string; label?: string } | undefined;
if (typeof rec?.label === "string" && rec.label.trim()) {
patchParams = { key: rec.key, label: rec.label };
}
patchCalls.push({ key: rec?.key, label: rec?.label });
},
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call2", {
task: "do thing",
@@ -165,18 +183,16 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
},
});
vi.useFakeTimers();
try {
await vi.advanceTimersByTimeAsync(500);
} finally {
vi.useRealTimers();
}
await waitFor(() => ctx.waitCalls.some((call) => call.runId === child.runId));
await waitFor(() => patchCalls.some((call) => call.label === "my-task"));
await waitFor(() => ctx.calls.filter((c) => c.method === "agent").length >= 2);
const childWait = ctx.waitCalls.find((call) => call.runId === child.runId);
expect(childWait?.timeoutMs).toBe(1000);
// Cleanup should patch the label
expect(patchParams.key).toBe(child.sessionKey);
expect(patchParams.label).toBe("my-task");
const labelPatch = patchCalls.find((call) => call.label === "my-task");
expect(labelPatch?.key).toBe(child.sessionKey);
expect(labelPatch?.label).toBe("my-task");
// Two agent calls: subagent spawn + main agent trigger
const agentCalls = ctx.calls.filter((c) => c.method === "agent");
@@ -213,13 +229,10 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
},
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "discord:group:req",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call1", {
task: "do thing",
@@ -307,13 +320,10 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
agentWaitResult: { status: "ok", startedAt: 3000, endedAt: 4000 },
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "discord:group:req",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call1b", {
task: "do thing",
@@ -325,14 +335,14 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
runId: "run-1",
});
vi.useFakeTimers();
try {
await vi.advanceTimersByTimeAsync(500);
} finally {
vi.useRealTimers();
}
const child = ctx.getChild();
if (!child.runId) {
throw new Error("missing child runId");
}
await waitFor(() => ctx.waitCalls.some((call) => call.runId === child.runId));
await waitFor(() => ctx.calls.filter((call) => call.method === "agent").length >= 2);
await waitFor(() => Boolean(deletedKey));
const childWait = ctx.waitCalls.find((call) => call.runId === child.runId);
expect(childWait?.timeoutMs).toBe(1000);
expect(child.sessionKey?.startsWith("agent:main:subagent:")).toBe(true);
@@ -397,13 +407,10 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "discord:group:req",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-timeout", {
task: "do thing",
@@ -415,12 +422,7 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
runId: "run-1",
});
vi.useFakeTimers();
try {
await vi.advanceTimersByTimeAsync(500);
} finally {
vi.useRealTimers();
}
await waitFor(() => calls.filter((call) => call.method === "agent").length >= 2);
const mainAgentCall = calls
.filter((call) => call.method === "agent")
@@ -472,14 +474,11 @@ describe("openclaw-tools: subagents (sessions_spawn lifecycle)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
agentAccountId: "kev",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-announce-account", {
task: "do thing",

View File

@@ -1,7 +1,6 @@
import { beforeEach, describe, expect, it } from "vitest";
import { DEFAULT_MODEL, DEFAULT_PROVIDER } from "./defaults.js";
import "./test-helpers/fast-core-tools.js";
import { createOpenClawTools } from "./openclaw-tools.js";
import {
getCallGatewayMock,
resetSessionsSpawnConfigOverride,
@@ -11,6 +10,19 @@ import { resetSubagentRegistryForTests } from "./subagent-registry.js";
const callGatewayMock = getCallGatewayMock();
type CreateOpenClawTools = (typeof import("./openclaw-tools.js"))["createOpenClawTools"];
type CreateOpenClawToolsOpts = Parameters<CreateOpenClawTools>[0];
async function getSessionsSpawnTool(opts: CreateOpenClawToolsOpts) {
// Dynamic import: ensure harness mocks are installed before tool modules load.
const { createOpenClawTools } = await import("./openclaw-tools.js");
const tool = createOpenClawTools(opts).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
return tool;
}
describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
beforeEach(() => {
resetSessionsSpawnConfigOverride();
@@ -46,13 +58,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "discord:group:req",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call3", {
task: "do thing",
@@ -93,13 +102,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "discord:group:req",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-thinking", {
task: "do thing",
@@ -126,13 +132,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "discord:group:req",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-thinking-invalid", {
task: "do thing",
@@ -166,13 +169,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "agent:main:main",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-default-model", {
task: "do thing",
@@ -207,13 +207,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "agent:main:main",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-runtime-default-model", {
task: "do thing",
@@ -255,13 +252,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "agent:research:main",
agentChannel: "discord",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call-agent-model", {
task: "do thing",
@@ -289,8 +283,8 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
const request = opts as { method?: string; params?: unknown };
calls.push(request);
if (request.method === "sessions.patch") {
const params = request.params as { model?: unknown } | undefined;
if (typeof params?.model === "string" && params.model.trim()) {
const model = (request.params as { model?: unknown } | undefined)?.model;
if (model === "bad-model") {
throw new Error("invalid model: bad-model");
}
return { ok: true };
@@ -313,13 +307,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call4", {
task: "do thing",
@@ -351,13 +342,10 @@ describe("openclaw-tools: subagents (sessions_spawn model + thinking)", () => {
return {};
});
const tool = createOpenClawTools({
const tool = await getSessionsSpawnTool({
agentSessionKey: "main",
agentChannel: "whatsapp",
}).find((candidate) => candidate.name === "sessions_spawn");
if (!tool) {
throw new Error("missing sessions_spawn tool");
}
});
const result = await tool.execute("call5", {
task: "do thing",

View File

@@ -1,22 +0,0 @@
import { describe, expect, it } from "vitest";
import { createSessionManagerRuntimeRegistry } from "./session-manager-runtime-registry.js";
describe("createSessionManagerRuntimeRegistry", () => {
it("stores, reads, and clears values by object identity", () => {
const registry = createSessionManagerRuntimeRegistry<{ value: number }>();
const key = {};
expect(registry.get(key)).toBeNull();
registry.set(key, { value: 1 });
expect(registry.get(key)).toEqual({ value: 1 });
registry.set(key, null);
expect(registry.get(key)).toBeNull();
});
it("ignores non-object keys", () => {
const registry = createSessionManagerRuntimeRegistry<{ value: number }>();
registry.set(null, { value: 1 });
registry.set(123, { value: 1 });
expect(registry.get(null)).toBeNull();
expect(registry.get(123)).toBeNull();
});
});

View File

@@ -4,8 +4,9 @@ import path from "node:path";
import { afterAll, beforeAll, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../config/config.js";
import type { ExecApprovalsResolved } from "../infra/exec-approvals.js";
import { captureEnv } from "../test-utils/env.js";
const previousBundledPluginsDir = process.env.OPENCLAW_BUNDLED_PLUGINS_DIR;
const bundledPluginsDirSnapshot = captureEnv(["OPENCLAW_BUNDLED_PLUGINS_DIR"]);
beforeAll(() => {
process.env.OPENCLAW_BUNDLED_PLUGINS_DIR = path.join(
@@ -15,32 +16,18 @@ beforeAll(() => {
});
afterAll(() => {
if (previousBundledPluginsDir === undefined) {
delete process.env.OPENCLAW_BUNDLED_PLUGINS_DIR;
} else {
process.env.OPENCLAW_BUNDLED_PLUGINS_DIR = previousBundledPluginsDir;
}
bundledPluginsDirSnapshot.restore();
});
vi.mock("../infra/shell-env.js", async (importOriginal) => {
const mod = await importOriginal<typeof import("../infra/shell-env.js")>();
return {
...mod,
getShellPathFromLoginShell: vi.fn(() => "/usr/bin:/bin"),
getShellPathFromLoginShell: vi.fn(() => null),
resolveShellEnvFallbackTimeoutMs: vi.fn(() => 500),
};
});
vi.mock("../plugins/tools.js", () => ({
getPluginToolMeta: () => undefined,
resolvePluginTools: () => [],
}));
vi.mock("../infra/shell-env.js", async (importOriginal) => {
const mod = await importOriginal<typeof import("../infra/shell-env.js")>();
return { ...mod, getShellPathFromLoginShell: () => null };
});
vi.mock("../plugins/tools.js", () => ({
resolvePluginTools: () => [],
getPluginToolMeta: () => undefined,
@@ -109,20 +96,16 @@ describe("createOpenClawCodingTools safeBins", () => {
expect(execTool).toBeDefined();
const marker = `safe-bins-${Date.now()}`;
const prevShellEnvTimeoutMs = process.env.OPENCLAW_SHELL_ENV_TIMEOUT_MS;
process.env.OPENCLAW_SHELL_ENV_TIMEOUT_MS = "1000";
const envSnapshot = captureEnv(["OPENCLAW_SHELL_ENV_TIMEOUT_MS"]);
const result = await (async () => {
try {
process.env.OPENCLAW_SHELL_ENV_TIMEOUT_MS = "1000";
return await execTool!.execute("call1", {
command: `echo ${marker}`,
workdir: tmpDir,
});
} finally {
if (prevShellEnvTimeoutMs === undefined) {
delete process.env.OPENCLAW_SHELL_ENV_TIMEOUT_MS;
} else {
process.env.OPENCLAW_SHELL_ENV_TIMEOUT_MS = prevShellEnvTimeoutMs;
}
envSnapshot.restore();
}
})();
const text = result.content.find((content) => content.type === "text")?.text ?? "";

View File

@@ -94,7 +94,7 @@ describe("buildSandboxCreateArgs", () => {
);
});
it("emits -v flags for custom binds", () => {
it("emits -v flags for safe custom binds", () => {
const cfg: SandboxDockerConfig = {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
@@ -103,7 +103,7 @@ describe("buildSandboxCreateArgs", () => {
tmpfs: [],
network: "none",
capDrop: [],
binds: ["/home/user/source:/source:rw", "/var/run/docker.sock:/var/run/docker.sock"],
binds: ["/home/user/source:/source:rw", "/var/data/myapp:/data:ro"],
};
const args = buildSandboxCreateArgs({
@@ -124,7 +124,116 @@ describe("buildSandboxCreateArgs", () => {
}
}
expect(vFlags).toContain("/home/user/source:/source:rw");
expect(vFlags).toContain("/var/run/docker.sock:/var/run/docker.sock");
expect(vFlags).toContain("/var/data/myapp:/data:ro");
});
it("throws on dangerous bind mounts (Docker socket)", () => {
const cfg: SandboxDockerConfig = {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
workdir: "/workspace",
readOnlyRoot: false,
tmpfs: [],
network: "none",
capDrop: [],
binds: ["/var/run/docker.sock:/var/run/docker.sock"],
};
expect(() =>
buildSandboxCreateArgs({
name: "openclaw-sbx-dangerous",
cfg,
scopeKey: "main",
createdAtMs: 1700000000000,
}),
).toThrow(/blocked path/);
});
it("throws on dangerous bind mounts (parent path)", () => {
const cfg: SandboxDockerConfig = {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
workdir: "/workspace",
readOnlyRoot: false,
tmpfs: [],
network: "none",
capDrop: [],
binds: ["/run:/run"],
};
expect(() =>
buildSandboxCreateArgs({
name: "openclaw-sbx-dangerous-parent",
cfg,
scopeKey: "main",
createdAtMs: 1700000000000,
}),
).toThrow(/blocked path/);
});
it("throws on network host mode", () => {
const cfg: SandboxDockerConfig = {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
workdir: "/workspace",
readOnlyRoot: false,
tmpfs: [],
network: "host",
capDrop: [],
};
expect(() =>
buildSandboxCreateArgs({
name: "openclaw-sbx-host",
cfg,
scopeKey: "main",
createdAtMs: 1700000000000,
}),
).toThrow(/network mode "host" is blocked/);
});
it("throws on seccomp unconfined", () => {
const cfg: SandboxDockerConfig = {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
workdir: "/workspace",
readOnlyRoot: false,
tmpfs: [],
network: "none",
capDrop: [],
seccompProfile: "unconfined",
};
expect(() =>
buildSandboxCreateArgs({
name: "openclaw-sbx-seccomp",
cfg,
scopeKey: "main",
createdAtMs: 1700000000000,
}),
).toThrow(/seccomp profile "unconfined" is blocked/);
});
it("throws on apparmor unconfined", () => {
const cfg: SandboxDockerConfig = {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
workdir: "/workspace",
readOnlyRoot: false,
tmpfs: [],
network: "none",
capDrop: [],
apparmorProfile: "unconfined",
};
expect(() =>
buildSandboxCreateArgs({
name: "openclaw-sbx-apparmor",
cfg,
scopeKey: "main",
createdAtMs: 1700000000000,
}),
).toThrow(/apparmor profile "unconfined" is blocked/);
});
it("omits -v flags when binds is empty or undefined", () => {

View File

@@ -3,6 +3,7 @@ import os from "node:os";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../config/config.js";
import { captureFullEnv } from "../test-utils/env.js";
import { resolveSandboxContext } from "./sandbox.js";
vi.mock("./sandbox/docker.js", () => ({
@@ -27,30 +28,15 @@ async function writeSkill(params: { dir: string; name: string; description: stri
);
}
function restoreEnv(snapshot: Record<string, string | undefined>) {
for (const key of Object.keys(process.env)) {
if (!(key in snapshot)) {
delete process.env[key];
}
}
for (const [key, value] of Object.entries(snapshot)) {
if (value === undefined) {
delete process.env[key];
} else {
process.env[key] = value;
}
}
}
describe("sandbox skill mirroring", () => {
let envSnapshot: Record<string, string | undefined>;
let envSnapshot: ReturnType<typeof captureFullEnv>;
beforeEach(() => {
envSnapshot = { ...process.env };
envSnapshot = captureFullEnv();
});
afterEach(() => {
restoreEnv(envSnapshot);
envSnapshot.restore();
});
const runContext = async (workspaceAccess: "none" | "ro") => {

View File

@@ -111,6 +111,7 @@ import { computeSandboxConfigHash } from "./config-hash.js";
import { DEFAULT_SANDBOX_IMAGE, SANDBOX_AGENT_WORKSPACE_MOUNT } from "./constants.js";
import { readRegistry, updateRegistry } from "./registry.js";
import { resolveSandboxAgentId, resolveSandboxScopeKey, slugifySessionKey } from "./shared.js";
import { validateSandboxSecurity } from "./validate-sandbox-security.js";
const HOT_CONTAINER_WINDOW_MS = 5 * 60 * 1000;
@@ -240,6 +241,9 @@ export function buildSandboxCreateArgs(params: {
labels?: Record<string, string>;
configHash?: string;
}) {
// Runtime security validation: blocks dangerous bind mounts, network modes, and profiles.
validateSandboxSecurity(params.cfg);
const createdAtMs = params.createdAtMs ?? Date.now();
const args = ["create", "--name", params.name];
args.push("--label", "openclaw.sandbox=1");

View File

@@ -0,0 +1,145 @@
import { mkdtempSync, symlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { describe, expect, it } from "vitest";
import {
getBlockedBindReason,
validateBindMounts,
validateNetworkMode,
validateSeccompProfile,
validateApparmorProfile,
validateSandboxSecurity,
} from "./validate-sandbox-security.js";
describe("getBlockedBindReason", () => {
it("blocks common Docker socket directories", () => {
expect(getBlockedBindReason("/run:/run")).toEqual(expect.objectContaining({ kind: "targets" }));
expect(getBlockedBindReason("/var/run:/var/run:ro")).toEqual(
expect.objectContaining({ kind: "targets" }),
);
});
it("does not block /var by default", () => {
expect(getBlockedBindReason("/var:/var")).toBeNull();
});
});
describe("validateBindMounts", () => {
it("allows legitimate project directory mounts", () => {
expect(() =>
validateBindMounts([
"/home/user/source:/source:rw",
"/home/user/projects:/projects:ro",
"/var/data/myapp:/data",
"/opt/myapp/config:/config:ro",
]),
).not.toThrow();
});
it("allows undefined or empty binds", () => {
expect(() => validateBindMounts(undefined)).not.toThrow();
expect(() => validateBindMounts([])).not.toThrow();
});
it("blocks /etc mount", () => {
expect(() => validateBindMounts(["/etc/passwd:/mnt/passwd:ro"])).toThrow(
/blocked path "\/etc"/,
);
});
it("blocks /proc mount", () => {
expect(() => validateBindMounts(["/proc:/proc:ro"])).toThrow(/blocked path "\/proc"/);
});
it("blocks Docker socket mounts (/var/run + /run)", () => {
expect(() => validateBindMounts(["/var/run/docker.sock:/var/run/docker.sock"])).toThrow(
/docker\.sock/,
);
expect(() => validateBindMounts(["/run/docker.sock:/run/docker.sock"])).toThrow(/docker\.sock/);
});
it("blocks parent mounts that would expose the Docker socket", () => {
expect(() => validateBindMounts(["/run:/run"])).toThrow(/blocked path/);
expect(() => validateBindMounts(["/var/run:/var/run"])).toThrow(/blocked path/);
expect(() => validateBindMounts(["/var:/var"])).not.toThrow();
});
it("blocks paths with .. traversal to dangerous directories", () => {
expect(() => validateBindMounts(["/home/user/../../etc/shadow:/mnt/shadow"])).toThrow(
/blocked path "\/etc"/,
);
});
it("blocks paths with double slashes normalizing to dangerous dirs", () => {
expect(() => validateBindMounts(["//etc//passwd:/mnt/passwd"])).toThrow(/blocked path "\/etc"/);
});
it("blocks symlink escapes into blocked directories", () => {
const dir = mkdtempSync(join(tmpdir(), "openclaw-sbx-"));
const link = join(dir, "etc-link");
symlinkSync("/etc", link);
expect(() => validateBindMounts([`${link}/passwd:/mnt/passwd:ro`])).toThrow(/blocked path/);
});
it("rejects non-absolute source paths (relative or named volumes)", () => {
expect(() => validateBindMounts(["../etc/passwd:/mnt/passwd"])).toThrow(/non-absolute/);
expect(() => validateBindMounts(["etc/passwd:/mnt/passwd"])).toThrow(/non-absolute/);
expect(() => validateBindMounts(["myvol:/mnt"])).toThrow(/non-absolute/);
});
});
describe("validateNetworkMode", () => {
it("allows bridge/none/custom/undefined", () => {
expect(() => validateNetworkMode("bridge")).not.toThrow();
expect(() => validateNetworkMode("none")).not.toThrow();
expect(() => validateNetworkMode("my-custom-network")).not.toThrow();
expect(() => validateNetworkMode(undefined)).not.toThrow();
});
it("blocks host mode (case-insensitive)", () => {
expect(() => validateNetworkMode("host")).toThrow(/network mode "host" is blocked/);
expect(() => validateNetworkMode("HOST")).toThrow(/network mode "HOST" is blocked/);
});
});
describe("validateSeccompProfile", () => {
it("allows custom profile paths/undefined", () => {
expect(() => validateSeccompProfile("/tmp/seccomp.json")).not.toThrow();
expect(() => validateSeccompProfile(undefined)).not.toThrow();
});
it("blocks unconfined (case-insensitive)", () => {
expect(() => validateSeccompProfile("unconfined")).toThrow(
/seccomp profile "unconfined" is blocked/,
);
expect(() => validateSeccompProfile("Unconfined")).toThrow(
/seccomp profile "Unconfined" is blocked/,
);
});
});
describe("validateApparmorProfile", () => {
it("allows named profile/undefined", () => {
expect(() => validateApparmorProfile("openclaw-sandbox")).not.toThrow();
expect(() => validateApparmorProfile(undefined)).not.toThrow();
});
it("blocks unconfined (case-insensitive)", () => {
expect(() => validateApparmorProfile("unconfined")).toThrow(
/apparmor profile "unconfined" is blocked/,
);
});
});
describe("validateSandboxSecurity", () => {
it("passes with safe config", () => {
expect(() =>
validateSandboxSecurity({
binds: ["/home/user/src:/src:rw"],
network: "none",
seccompProfile: "/tmp/seccomp.json",
apparmorProfile: "openclaw-sandbox",
}),
).not.toThrow();
});
});

View File

@@ -0,0 +1,195 @@
/**
* Sandbox security validation — blocks dangerous Docker configurations.
*
* Threat model: local-trusted config, but protect against foot-guns and config injection.
* Enforced at runtime when creating sandbox containers.
*/
import { existsSync, realpathSync } from "node:fs";
import { posix } from "node:path";
// Targeted denylist: host paths that should never be exposed inside sandbox containers.
// Exported for reuse in security audit collectors.
export const BLOCKED_HOST_PATHS = [
"/etc",
"/private/etc",
"/proc",
"/sys",
"/dev",
"/root",
"/boot",
// Directories that commonly contain (or alias) the Docker socket.
"/run",
"/var/run",
"/private/var/run",
"/var/run/docker.sock",
"/private/var/run/docker.sock",
"/run/docker.sock",
];
// Membership is checked against trim()+toLowerCase() input, so these sets are
// effectively case-insensitive (see the validate* functions below).
const BLOCKED_NETWORK_MODES = new Set(["host"]);
const BLOCKED_SECCOMP_PROFILES = new Set(["unconfined"]);
const BLOCKED_APPARMOR_PROFILES = new Set(["unconfined"]);
// Why a bind was rejected:
// - "targets": source path equals, or lies under, a blocked path
// - "covers": source path is an ancestor that exposes blocked paths (only "/")
// - "non_absolute": relative path or named volume — refused because it cannot
//   be validated safely as a host path
export type BlockedBindReason =
| { kind: "targets"; blockedPath: string }
| { kind: "covers"; blockedPath: string }
| { kind: "non_absolute"; sourcePath: string };
/**
 * Extract the host/source portion of a Docker bind mount string.
 * Format: `source:target[:mode]` — everything before the first colon.
 * A string with no colon (or one that begins with a colon) is returned
 * whole, trimmed, and treated as the source.
 */
export function parseBindSourcePath(bind: string): string {
  const value = bind.trim();
  const colonIndex = value.indexOf(":");
  return colonIndex <= 0 ? value : value.slice(0, colonIndex);
}
/**
 * Canonicalize a POSIX path: fold `.`/`..` segments, collapse duplicate
 * slashes, and drop trailing slashes. The root path stays "/".
 */
export function normalizeHostPath(raw: string): string {
  const collapsed = posix.normalize(raw.trim());
  const stripped = collapsed.replace(/\/+$/, "");
  return stripped === "" ? "/" : stripped;
}
/**
 * String-only blocked-path check for a single bind (no filesystem I/O).
 * Rejects:
 * - binds whose source targets a blocked path (equal or under it)
 * - binds whose source covers the system root (mounting "/" is never safe)
 * - non-absolute sources (relative paths / volume names), which cannot be
 *   validated safely as host paths
 * Returns null when the bind passes this pass.
 */
export function getBlockedBindReason(bind: string): BlockedBindReason | null {
  const source = parseBindSourcePath(bind);
  if (!source.startsWith("/")) {
    return { kind: "non_absolute", sourcePath: source };
  }
  return getBlockedReasonForSourcePath(normalizeHostPath(source));
}
// Classify an already-normalized absolute source path against the denylist.
// "/" is special-cased: it covers every blocked path at once.
export function getBlockedReasonForSourcePath(sourceNormalized: string): BlockedBindReason | null {
  if (sourceNormalized === "/") {
    return { kind: "covers", blockedPath: "/" };
  }
  const hit = BLOCKED_HOST_PATHS.find(
    (blocked) => sourceNormalized === blocked || sourceNormalized.startsWith(`${blocked}/`),
  );
  return hit ? { kind: "targets", blockedPath: hit } : null;
}
// Resolve symlinks for absolute paths that exist on disk; anything else
// (relative, missing, or failing mid-resolution) is returned unchanged.
function tryRealpathAbsolute(path: string): string {
  const resolvable = path.startsWith("/") && existsSync(path);
  if (!resolvable) {
    return path;
  }
  try {
    // Use native when available (keeps platform semantics); normalize for prefix checks.
    return normalizeHostPath(realpathSync.native(path));
  } catch {
    // Races / permission errors: fall back to the unresolved path.
    return path;
  }
}
// Build the user-facing Error for a rejected bind, phrased per rejection kind.
function formatBindBlockedError(params: { bind: string; reason: BlockedBindReason }): Error {
  const { bind, reason } = params;
  if (reason.kind === "non_absolute") {
    return new Error(
      `Sandbox security: bind mount "${bind}" uses a non-absolute source path ` +
        `"${reason.sourcePath}". Only absolute POSIX paths are supported for sandbox binds.`,
    );
  }
  // "covers" = source is an ancestor exposing the blocked path; "targets" = equal/under it.
  const verb = reason.kind === "covers" ? "covers" : "targets";
  return new Error(
    `Sandbox security: bind mount "${bind}" ${verb} blocked path "${reason.blockedPath}". ` +
      "Mounting system directories (or Docker socket paths) into sandbox containers is not allowed. " +
      "Use project-specific paths instead (e.g. /home/user/myproject).",
  );
}
/**
 * Validate bind mounts — throws if any source path is dangerous.
 * Two passes per bind: a string-only denylist check, then (when the source
 * exists on disk) a realpath re-check to catch symlink escapes.
 */
export function validateBindMounts(binds: string[] | undefined): void {
  for (const rawBind of binds ?? []) {
    const bind = rawBind.trim();
    if (bind === "") {
      continue;
    }
    // Pass 1: pure string logic (handles .., //, ancestor/descendant overlap).
    const stringReason = getBlockedBindReason(bind);
    if (stringReason) {
      throw formatBindBlockedError({ bind, reason: stringReason });
    }
    // Pass 2: symlink-escape hardening — resolve and re-check only when the
    // realpath differs from the normalized source.
    const normalized = normalizeHostPath(parseBindSourcePath(bind));
    const resolved = tryRealpathAbsolute(normalized);
    if (resolved === normalized) {
      continue;
    }
    const realReason = getBlockedReasonForSourcePath(resolved);
    if (realReason) {
      throw formatBindBlockedError({ bind, reason: realReason });
    }
  }
}
// Reject network modes that defeat container isolation (case-insensitive).
export function validateNetworkMode(network: string | undefined): void {
  const normalized = network?.trim().toLowerCase();
  if (!normalized || !BLOCKED_NETWORK_MODES.has(normalized)) {
    return;
  }
  throw new Error(
    `Sandbox security: network mode "${network}" is blocked. ` +
      'Network "host" mode bypasses container network isolation. ' +
      'Use "bridge" or "none" instead.',
  );
}
// Reject seccomp settings that disable syscall filtering (case-insensitive).
export function validateSeccompProfile(profile: string | undefined): void {
  if (!profile || !BLOCKED_SECCOMP_PROFILES.has(profile.trim().toLowerCase())) {
    return;
  }
  throw new Error(
    `Sandbox security: seccomp profile "${profile}" is blocked. ` +
      "Disabling seccomp removes syscall filtering and weakens sandbox isolation. " +
      "Use a custom seccomp profile file or omit this setting.",
  );
}
// Reject AppArmor settings that drop mandatory access control (case-insensitive).
export function validateApparmorProfile(profile: string | undefined): void {
  if (!profile || !BLOCKED_APPARMOR_PROFILES.has(profile.trim().toLowerCase())) {
    return;
  }
  throw new Error(
    `Sandbox security: apparmor profile "${profile}" is blocked. ` +
      "Disabling AppArmor removes mandatory access controls and weakens sandbox isolation. " +
      "Use a named AppArmor profile or omit this setting.",
  );
}
// Aggregate entry point used by buildSandboxCreateArgs: runs every sandbox
// security check against a (partial) sandbox config shape. Throws on the
// first violation found, in the order the checks are listed below.
export function validateSandboxSecurity(cfg: {
binds?: string[];
network?: string;
seccompProfile?: string;
apparmorProfile?: string;
}): void {
validateBindMounts(cfg.binds);
validateNetworkMode(cfg.network);
validateSeccompProfile(cfg.seccompProfile);
validateApparmorProfile(cfg.apparmorProfile);
}

View File

@@ -0,0 +1,53 @@
import { describe, expect, it } from "vitest";
import { sanitizeForPromptLiteral } from "./sanitize-for-prompt.js";
import { buildAgentSystemPrompt } from "./system-prompt.js";
describe("sanitizeForPromptLiteral (OC-19 hardening)", () => {
it("strips ASCII control chars (CR/LF/NUL/tab)", () => {
expect(sanitizeForPromptLiteral("/tmp/a\nb\rc\x00d\te")).toBe("/tmp/abcde");
});
it("strips Unicode line/paragraph separators", () => {
expect(sanitizeForPromptLiteral(`/tmp/a\u2028b\u2029c`)).toBe("/tmp/abc");
});
it("strips Unicode format chars (bidi override)", () => {
// U+202E RIGHT-TO-LEFT OVERRIDE (Cf) can spoof rendered text.
expect(sanitizeForPromptLiteral(`/tmp/a\u202Eb`)).toBe("/tmp/ab");
});
it("preserves ordinary Unicode + spaces", () => {
const value = "/tmp/my project/日本語-folder.v2";
expect(sanitizeForPromptLiteral(value)).toBe(value);
});
});
describe("buildAgentSystemPrompt uses sanitized workspace/sandbox strings", () => {
it("sanitizes workspaceDir (no newlines / separators)", () => {
const prompt = buildAgentSystemPrompt({
workspaceDir: "/tmp/project\nINJECT\u2028MORE",
});
expect(prompt).toContain("Your working directory is: /tmp/projectINJECTMORE");
expect(prompt).not.toContain("Your working directory is: /tmp/project\n");
expect(prompt).not.toContain("\u2028");
});
it("sanitizes sandbox workspace/mount/url strings", () => {
const prompt = buildAgentSystemPrompt({
workspaceDir: "/tmp/test",
sandboxInfo: {
enabled: true,
containerWorkspaceDir: "/work\u2029space",
workspaceDir: "/host\nspace",
workspaceAccess: "read-write",
agentWorkspaceMount: "/mnt\u2028mount",
browserNoVncUrl: "http://example.test/\nui",
},
});
expect(prompt).toContain("Sandbox container workdir: /workspace");
expect(prompt).toContain("Sandbox host workspace: /hostspace");
expect(prompt).toContain("(mounted at /mntmount)");
expect(prompt).toContain("Sandbox browser observer (noVNC): http://example.test/ui");
expect(prompt).not.toContain("\nui");
});
});

View File

@@ -0,0 +1,18 @@
/**
* Sanitize untrusted strings before embedding them into an LLM prompt.
*
* Threat model (OC-19): attacker-controlled directory names (or other runtime strings)
* that contain newline/control characters can break prompt structure and inject
* arbitrary instructions.
*
* Strategy (Option 3 hardening):
* - Strip Unicode "control" (Cc) + "format" (Cf) characters (includes CR/LF/NUL, bidi marks, zero-width chars).
* - Strip explicit line/paragraph separators (Zl/Zp): U+2028/U+2029.
*
* Notes:
* - This is intentionally lossy; it trades edge-case path fidelity for prompt integrity.
* - If you need lossless representation, escape instead of stripping.
*/
export function sanitizeForPromptLiteral(value: string): string {
return value.replace(/[\p{Cc}\p{Cf}\u2028\u2029]/gu, "");
}

View File

@@ -12,6 +12,9 @@ import {
normalizeStringList,
parseFrontmatterBool,
resolveOpenClawManifestBlock,
resolveOpenClawManifestInstall,
resolveOpenClawManifestOs,
resolveOpenClawManifestRequires,
} from "../../shared/frontmatter.js";
export function parseFrontmatter(content: string): ParsedSkillFrontmatter {
@@ -83,15 +86,9 @@ export function resolveOpenClawMetadata(
if (!metadataObj) {
return undefined;
}
const requiresRaw =
typeof metadataObj.requires === "object" && metadataObj.requires !== null
? (metadataObj.requires as Record<string, unknown>)
: undefined;
const installRaw = Array.isArray(metadataObj.install) ? (metadataObj.install as unknown[]) : [];
const install = installRaw
.map((entry) => parseInstallSpec(entry))
.filter((entry): entry is SkillInstallSpec => Boolean(entry));
const osRaw = normalizeStringList(metadataObj.os);
const requires = resolveOpenClawManifestRequires(metadataObj);
const install = resolveOpenClawManifestInstall(metadataObj, parseInstallSpec);
const osRaw = resolveOpenClawManifestOs(metadataObj);
return {
always: typeof metadataObj.always === "boolean" ? metadataObj.always : undefined,
emoji: typeof metadataObj.emoji === "string" ? metadataObj.emoji : undefined,
@@ -99,14 +96,7 @@ export function resolveOpenClawMetadata(
skillKey: typeof metadataObj.skillKey === "string" ? metadataObj.skillKey : undefined,
primaryEnv: typeof metadataObj.primaryEnv === "string" ? metadataObj.primaryEnv : undefined,
os: osRaw.length > 0 ? osRaw : undefined,
requires: requiresRaw
? {
bins: normalizeStringList(requiresRaw.bins),
anyBins: normalizeStringList(requiresRaw.anyBins),
env: normalizeStringList(requiresRaw.env),
config: normalizeStringList(requiresRaw.config),
}
: undefined,
requires: requires,
install: install.length > 0 ? install : undefined,
};
}

View File

@@ -2,6 +2,7 @@ import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import { captureEnv } from "../test-utils/env.js";
import {
initSubagentRegistry,
registerSubagentRun,
@@ -29,7 +30,7 @@ vi.mock("./subagent-announce.js", () => ({
}));
describe("subagent registry persistence", () => {
const previousStateDir = process.env.OPENCLAW_STATE_DIR;
const envSnapshot = captureEnv(["OPENCLAW_STATE_DIR"]);
let tempStateDir: string | null = null;
afterEach(async () => {
@@ -39,11 +40,7 @@ describe("subagent registry persistence", () => {
await fs.rm(tempStateDir, { recursive: true, force: true });
tempStateDir = null;
}
if (previousStateDir === undefined) {
delete process.env.OPENCLAW_STATE_DIR;
} else {
process.env.OPENCLAW_STATE_DIR = previousStateDir;
}
envSnapshot.restore();
});
it("persists runs to disk and resumes after restart", async () => {

View File

@@ -4,6 +4,7 @@ import type { ResolvedTimeFormat } from "./date-time.js";
import type { EmbeddedContextFile } from "./pi-embedded-helpers.js";
import { SILENT_REPLY_TOKEN } from "../auto-reply/tokens.js";
import { listDeliverableMessageChannels } from "../utils/message-channel.js";
import { sanitizeForPromptLiteral } from "./sanitize-for-prompt.js";
/**
* Controls which hardcoded sections are included in the system prompt.
@@ -355,13 +356,17 @@ export function buildAgentSystemPrompt(params: {
const promptMode = params.promptMode ?? "full";
const isMinimal = promptMode === "minimal" || promptMode === "none";
const sandboxContainerWorkspace = params.sandboxInfo?.containerWorkspaceDir?.trim();
const sanitizedWorkspaceDir = sanitizeForPromptLiteral(params.workspaceDir);
const sanitizedSandboxContainerWorkspace = sandboxContainerWorkspace
? sanitizeForPromptLiteral(sandboxContainerWorkspace)
: "";
const displayWorkspaceDir =
params.sandboxInfo?.enabled && sandboxContainerWorkspace
? sandboxContainerWorkspace
: params.workspaceDir;
params.sandboxInfo?.enabled && sanitizedSandboxContainerWorkspace
? sanitizedSandboxContainerWorkspace
: sanitizedWorkspaceDir;
const workspaceGuidance =
params.sandboxInfo?.enabled && sandboxContainerWorkspace
? `For read/write/edit/apply_patch, file paths resolve against host workspace: ${params.workspaceDir}. Prefer relative paths so both sandboxed exec and file tools work consistently.`
params.sandboxInfo?.enabled && sanitizedSandboxContainerWorkspace
? `For read/write/edit/apply_patch, file paths resolve against host workspace: ${sanitizedWorkspaceDir}. Prefer relative paths so both sandboxed exec and file tools work consistently.`
: "Treat this directory as the single global workspace for file operations unless explicitly instructed otherwise.";
const safetySection = [
"## Safety",
@@ -480,21 +485,21 @@ export function buildAgentSystemPrompt(params: {
"Some tools may be unavailable due to sandbox policy.",
"Sub-agents stay sandboxed (no elevated/host access). Need outside-sandbox read/write? Don't spawn; ask first.",
params.sandboxInfo.containerWorkspaceDir
? `Sandbox container workdir: ${params.sandboxInfo.containerWorkspaceDir}`
? `Sandbox container workdir: ${sanitizeForPromptLiteral(params.sandboxInfo.containerWorkspaceDir)}`
: "",
params.sandboxInfo.workspaceDir
? `Sandbox host workspace: ${params.sandboxInfo.workspaceDir}`
? `Sandbox host workspace: ${sanitizeForPromptLiteral(params.sandboxInfo.workspaceDir)}`
: "",
params.sandboxInfo.workspaceAccess
? `Agent workspace access: ${params.sandboxInfo.workspaceAccess}${
params.sandboxInfo.agentWorkspaceMount
? ` (mounted at ${params.sandboxInfo.agentWorkspaceMount})`
? ` (mounted at ${sanitizeForPromptLiteral(params.sandboxInfo.agentWorkspaceMount)})`
: ""
}`
: "",
params.sandboxInfo.browserBridgeUrl ? "Sandbox browser: enabled." : "",
params.sandboxInfo.browserNoVncUrl
? `Sandbox browser observer (noVNC): ${params.sandboxInfo.browserNoVncUrl}`
? `Sandbox browser observer (noVNC): ${sanitizeForPromptLiteral(params.sandboxInfo.browserNoVncUrl)}`
: "",
params.sandboxInfo.hostBrowserAllowed === true
? "Host browser control: allowed."

View File

@@ -1,22 +1 @@
import { vi } from "vitest";
const stubTool = (name: string) => ({
name,
description: `${name} stub`,
parameters: { type: "object", properties: {} },
execute: vi.fn(),
});
vi.mock("../tools/image-tool.js", () => ({
createImageTool: () => stubTool("image"),
}));
vi.mock("../tools/web-tools.js", () => ({
createWebSearchTool: () => null,
createWebFetchTool: () => null,
}));
vi.mock("../../plugins/tools.js", () => ({
resolvePluginTools: () => [],
getPluginToolMeta: () => undefined,
}));
import "./fast-tool-stubs.js";

View File

@@ -1,11 +1,5 @@
import { vi } from "vitest";
const stubTool = (name: string) => ({
name,
description: `${name} stub`,
parameters: { type: "object", properties: {} },
execute: vi.fn(),
});
import { stubTool } from "./fast-tool-stubs.js";
vi.mock("../tools/browser-tool.js", () => ({
createBrowserTool: () => stubTool("browser"),
@@ -14,17 +8,3 @@ vi.mock("../tools/browser-tool.js", () => ({
vi.mock("../tools/canvas-tool.js", () => ({
createCanvasTool: () => stubTool("canvas"),
}));
vi.mock("../tools/image-tool.js", () => ({
createImageTool: () => stubTool("image"),
}));
vi.mock("../tools/web-tools.js", () => ({
createWebSearchTool: () => null,
createWebFetchTool: () => null,
}));
vi.mock("../../plugins/tools.js", () => ({
resolvePluginTools: () => [],
getPluginToolMeta: () => undefined,
}));

View File

@@ -0,0 +1,22 @@
import { vi } from "vitest";
// Minimal stand-in tool: carries the given name, a "<name> stub" description,
// an empty JSON-schema parameter object, and a vi.fn() spy as its executor.
export const stubTool = (name) => {
  const description = `${name} stub`;
  return {
    name,
    description,
    parameters: { type: "object", properties: {} },
    execute: vi.fn(),
  };
};
vi.mock("../tools/image-tool.js", () => ({
createImageTool: () => stubTool("image"),
}));
vi.mock("../tools/web-tools.js", () => ({
createWebSearchTool: () => null,
createWebFetchTool: () => null,
}));
vi.mock("../../plugins/tools.js", () => ({
resolvePluginTools: () => [],
getPluginToolMeta: () => undefined,
}));

View File

@@ -219,7 +219,8 @@ JOB SCHEMA (for add action):
"payload": { ... }, // Required: what to execute
"delivery": { ... }, // Optional: announce summary (isolated only)
"sessionTarget": "main" | "isolated", // Required
"enabled": true | false // Optional, default true
"enabled": true | false, // Optional, default true
"notify": true | false // Optional webhook opt-in; set true for user-facing reminders
}
SCHEDULE TYPES (schedule.kind):
@@ -246,6 +247,7 @@ DELIVERY (isolated-only, top-level):
CRITICAL CONSTRAINTS:
- sessionTarget="main" REQUIRES payload.kind="systemEvent"
- sessionTarget="isolated" REQUIRES payload.kind="agentTurn"
- For reminders users should be notified about, set notify=true.
Default: prefer isolated agentTurn jobs unless the user explicitly wants a main-session system event.
WAKE MODES (for wake action):
@@ -292,6 +294,7 @@ Use jobId as the canonical identifier; id is accepted for compatibility. Use con
"payload",
"delivery",
"enabled",
"notify",
"description",
"deleteAfterRun",
"agentId",

View File

@@ -1,65 +0,0 @@
import { describe, expect, it, vi } from "vitest";
vi.mock("../../memory/index.js", () => {
return {
getMemorySearchManager: async () => {
return {
manager: {
search: async () => {
throw new Error("openai embeddings failed: 429 insufficient_quota");
},
readFile: async () => {
throw new Error("path required");
},
status: () => ({
files: 0,
chunks: 0,
dirty: true,
workspaceDir: "/tmp",
dbPath: "/tmp/index.sqlite",
provider: "openai",
model: "text-embedding-3-small",
requestedProvider: "openai",
}),
},
};
},
};
});
import { createMemoryGetTool, createMemorySearchTool } from "./memory-tool.js";
describe("memory tools", () => {
it("does not throw when memory_search fails (e.g. embeddings 429)", async () => {
const cfg = { agents: { list: [{ id: "main", default: true }] } };
const tool = createMemorySearchTool({ config: cfg });
expect(tool).not.toBeNull();
if (!tool) {
throw new Error("tool missing");
}
const result = await tool.execute("call_1", { query: "hello" });
expect(result.details).toEqual({
results: [],
disabled: true,
error: "openai embeddings failed: 429 insufficient_quota",
});
});
it("does not throw when memory_get fails", async () => {
const cfg = { agents: { list: [{ id: "main", default: true }] } };
const tool = createMemoryGetTool({ config: cfg });
expect(tool).not.toBeNull();
if (!tool) {
throw new Error("tool missing");
}
const result = await tool.execute("call_2", { path: "memory/NOPE.md" });
expect(result.details).toEqual({
path: "memory/NOPE.md",
text: "",
disabled: true,
error: "path required",
});
});
});

View File

@@ -1,18 +1,21 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
let backend: "builtin" | "qmd" = "builtin";
let searchImpl: () => Promise<unknown[]> = async () => [
{
path: "MEMORY.md",
startLine: 5,
endLine: 7,
score: 0.9,
snippet: "@@ -5,3 @@\nAssistant: noted",
source: "memory" as const,
},
];
let readFileImpl: () => Promise<string> = async () => "";
const stubManager = {
search: vi.fn(async () => [
{
path: "MEMORY.md",
startLine: 5,
endLine: 7,
score: 0.9,
snippet: "@@ -5,3 @@\nAssistant: noted",
source: "memory" as const,
},
]),
readFile: vi.fn(),
search: vi.fn(async () => await searchImpl()),
readFile: vi.fn(async () => await readFileImpl()),
status: () => ({
backend,
files: 1,
@@ -37,9 +40,21 @@ vi.mock("../../memory/index.js", () => {
};
});
import { createMemorySearchTool } from "./memory-tool.js";
import { createMemoryGetTool, createMemorySearchTool } from "./memory-tool.js";
beforeEach(() => {
backend = "builtin";
searchImpl = async () => [
{
path: "MEMORY.md",
startLine: 5,
endLine: 7,
score: 0.9,
snippet: "@@ -5,3 @@\nAssistant: noted",
source: "memory" as const,
},
];
readFileImpl = async () => "";
vi.clearAllMocks();
});
@@ -121,3 +136,46 @@ describe("memory search citations", () => {
expect(details.results[0]?.snippet).not.toMatch(/Source:/);
});
});
describe("memory tools", () => {
it("does not throw when memory_search fails (e.g. embeddings 429)", async () => {
searchImpl = async () => {
throw new Error("openai embeddings failed: 429 insufficient_quota");
};
const cfg = { agents: { list: [{ id: "main", default: true }] } };
const tool = createMemorySearchTool({ config: cfg });
expect(tool).not.toBeNull();
if (!tool) {
throw new Error("tool missing");
}
const result = await tool.execute("call_1", { query: "hello" });
expect(result.details).toEqual({
results: [],
disabled: true,
error: "openai embeddings failed: 429 insufficient_quota",
});
});
it("does not throw when memory_get fails", async () => {
readFileImpl = async () => {
throw new Error("path required");
};
const cfg = { agents: { list: [{ id: "main", default: true }] } };
const tool = createMemoryGetTool({ config: cfg });
expect(tool).not.toBeNull();
if (!tool) {
throw new Error("tool missing");
}
const result = await tool.execute("call_2", { path: "memory/NOPE.md" });
expect(result.details).toEqual({
path: "memory/NOPE.md",
text: "",
disabled: true,
error: "path required",
});
});
});

View File

@@ -1,103 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { createTestRegistry } from "../../test-utils/channel-plugins.js";
const callGatewayMock = vi.fn();
vi.mock("../../gateway/call.js", () => ({
callGateway: (opts: unknown) => callGatewayMock(opts),
}));
const loadResolveAnnounceTarget = async () => await import("./sessions-announce-target.js");
const installRegistry = async () => {
const { setActivePluginRegistry } = await import("../../plugins/runtime.js");
setActivePluginRegistry(
createTestRegistry([
{
pluginId: "discord",
source: "test",
plugin: {
id: "discord",
meta: {
id: "discord",
label: "Discord",
selectionLabel: "Discord",
docsPath: "/channels/discord",
blurb: "Discord test stub.",
},
capabilities: { chatTypes: ["direct", "channel", "thread"] },
config: {
listAccountIds: () => ["default"],
resolveAccount: () => ({}),
},
},
},
{
pluginId: "whatsapp",
source: "test",
plugin: {
id: "whatsapp",
meta: {
id: "whatsapp",
label: "WhatsApp",
selectionLabel: "WhatsApp",
docsPath: "/channels/whatsapp",
blurb: "WhatsApp test stub.",
preferSessionLookupForAnnounceTarget: true,
},
capabilities: { chatTypes: ["direct", "group"] },
config: {
listAccountIds: () => ["default"],
resolveAccount: () => ({}),
},
},
},
]),
);
};
describe("resolveAnnounceTarget", () => {
beforeEach(async () => {
callGatewayMock.mockReset();
await installRegistry();
});
it("derives non-WhatsApp announce targets from the session key", async () => {
const { resolveAnnounceTarget } = await loadResolveAnnounceTarget();
const target = await resolveAnnounceTarget({
sessionKey: "agent:main:discord:group:dev",
displayKey: "agent:main:discord:group:dev",
});
expect(target).toEqual({ channel: "discord", to: "channel:dev" });
expect(callGatewayMock).not.toHaveBeenCalled();
});
it("hydrates WhatsApp accountId from sessions.list when available", async () => {
const { resolveAnnounceTarget } = await loadResolveAnnounceTarget();
callGatewayMock.mockResolvedValueOnce({
sessions: [
{
key: "agent:main:whatsapp:group:123@g.us",
deliveryContext: {
channel: "whatsapp",
to: "123@g.us",
accountId: "work",
},
},
],
});
const target = await resolveAnnounceTarget({
sessionKey: "agent:main:whatsapp:group:123@g.us",
displayKey: "agent:main:whatsapp:group:123@g.us",
});
expect(target).toEqual({
channel: "whatsapp",
to: "123@g.us",
accountId: "work",
});
expect(callGatewayMock).toHaveBeenCalledTimes(1);
const first = callGatewayMock.mock.calls[0]?.[0] as { method?: string } | undefined;
expect(first).toBeDefined();
expect(first?.method).toBe("sessions.list");
});
});

View File

@@ -1,58 +0,0 @@
import { describe, expect, it } from "vitest";
import { extractAssistantText, sanitizeTextContent } from "./sessions-helpers.js";
// sanitizeTextContent strips model/tool artifacts (tool-call XML, downgraded
// tool-call markers, <think> blocks) from assistant text before display.
describe("sanitizeTextContent", () => {
  it("strips minimax tool call XML and downgraded markers", () => {
    // Mix of raw <invoke> XML, a stray </minimax:tool_call> close tag, and a
    // "[Tool Call: ...]" downgraded marker — all must be removed.
    const input =
      'Hello <invoke name="tool">payload</invoke></minimax:tool_call> ' +
      "[Tool Call: foo (ID: 1)] world";
    const result = sanitizeTextContent(input).trim();
    expect(result).toBe("Hello world");
    expect(result).not.toContain("invoke");
    expect(result).not.toContain("Tool Call");
  });
  it("strips thinking tags", () => {
    const input = "Before <think>secret</think> after";
    const result = sanitizeTextContent(input).trim();
    expect(result).toBe("Before after");
  });
});
// extractAssistantText flattens assistant content blocks into a single
// sanitized string, with special handling for transcript-marked errors.
describe("extractAssistantText", () => {
  it("sanitizes blocks without injecting newlines", () => {
    const message = {
      role: "assistant",
      content: [
        { type: "text", text: "Hi " },
        { type: "text", text: "<think>secret</think>there" },
      ],
    };
    // Blocks are concatenated directly; no separator is inserted between them.
    expect(extractAssistantText(message)).toBe("Hi there");
  });
  it("rewrites error-ish assistant text only when the transcript marks it as an error", () => {
    const message = {
      role: "assistant",
      stopReason: "error",
      errorMessage: "500 Internal Server Error",
      content: [{ type: "text", text: "500 Internal Server Error" }],
    };
    // stopReason === "error" plus errorMessage triggers the HTTP-style rewrite.
    expect(extractAssistantText(message)).toBe("HTTP 500: Internal Server Error");
  });
  it("keeps normal status text that mentions billing", () => {
    // No error marker on the message, so billing-related wording is left as-is.
    const message = {
      role: "assistant",
      content: [
        {
          type: "text",
          text: "Firebase downgraded us to the free Spark plan. Check whether billing should be re-enabled.",
        },
      ],
    };
    expect(extractAssistantText(message)).toBe(
      "Firebase downgraded us to the free Spark plan. Check whether billing should be re-enabled.",
    );
  });
});

View File

@@ -1,42 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
const callGatewayMock = vi.fn();
vi.mock("../../gateway/call.js", () => ({
callGateway: (opts: unknown) => callGatewayMock(opts),
}));
vi.mock("../../config/config.js", async (importOriginal) => {
const actual = await importOriginal<typeof import("../../config/config.js")>();
return {
...actual,
loadConfig: () =>
({
session: { scope: "per-sender", mainKey: "main" },
tools: { agentToAgent: { enabled: false } },
}) as never,
};
});
import { createSessionsListTool } from "./sessions-list-tool.js";
// With tools.agentToAgent.enabled mocked to false (see the config mock above),
// sessions_list must hide sessions belonging to other agents.
describe("sessions_list gating", () => {
  beforeEach(() => {
    callGatewayMock.mockReset();
    // Gateway returns one session for this agent and one for another agent.
    callGatewayMock.mockResolvedValue({
      path: "/tmp/sessions.json",
      sessions: [
        { key: "agent:main:main", kind: "direct" },
        { key: "agent:other:main", kind: "direct" },
      ],
    });
  });
  it("filters out other agents when tools.agentToAgent.enabled is false", async () => {
    const tool = createSessionsListTool({ agentSessionKey: "agent:main:main" });
    const result = await tool.execute("call1", {});
    // Only the caller's own session survives the filter.
    expect(result.details).toMatchObject({
      count: 1,
      sessions: [{ key: "agent:main:main" }],
    });
  });
});

View File

@@ -1,42 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
const callGatewayMock = vi.fn();
vi.mock("../../gateway/call.js", () => ({
callGateway: (opts: unknown) => callGatewayMock(opts),
}));
vi.mock("../../config/config.js", async (importOriginal) => {
const actual = await importOriginal<typeof import("../../config/config.js")>();
return {
...actual,
loadConfig: () =>
({
session: { scope: "per-sender", mainKey: "main" },
tools: { agentToAgent: { enabled: false } },
}) as never,
};
});
import { createSessionsSendTool } from "./sessions-send-tool.js";
// With tools.agentToAgent.enabled mocked to false (see the config mock above),
// sessions_send must refuse cross-agent targets before any gateway call.
describe("sessions_send gating", () => {
  beforeEach(() => {
    callGatewayMock.mockReset();
  });
  it("blocks cross-agent sends when tools.agentToAgent.enabled is false", async () => {
    const tool = createSessionsSendTool({
      agentSessionKey: "agent:main:main",
      agentChannel: "whatsapp",
    });
    const result = await tool.execute("call1", {
      sessionKey: "agent:other:main",
      message: "hi",
      timeoutSeconds: 0,
    });
    // Rejected locally: no gateway traffic, explicit forbidden status.
    expect(callGatewayMock).not.toHaveBeenCalled();
    expect(result.details).toMatchObject({ status: "forbidden" });
  });
});

View File

@@ -28,6 +28,7 @@ const SessionsSpawnToolSchema = Type.Object({
model: Type.Optional(Type.String()),
thinking: Type.Optional(Type.String()),
runTimeoutSeconds: Type.Optional(Type.Number({ minimum: 0 })),
// Back-compat: older callers used timeoutSeconds for this tool.
timeoutSeconds: Type.Optional(Type.Number({ minimum: 0 })),
cleanup: optionalStringEnum(["delete", "keep"] as const),
});
@@ -98,14 +99,16 @@ export function createSessionsSpawnTool(opts?: {
});
// Default to 0 (no timeout) when omitted. Sub-agent runs are long-lived
// by default and should not inherit the main agent 600s timeout.
const legacyTimeoutSeconds =
typeof params.timeoutSeconds === "number" && Number.isFinite(params.timeoutSeconds)
? Math.max(0, Math.floor(params.timeoutSeconds))
: undefined;
const timeoutSecondsCandidate =
typeof params.runTimeoutSeconds === "number"
? params.runTimeoutSeconds
: typeof params.timeoutSeconds === "number"
? params.timeoutSeconds
: undefined;
const runTimeoutSeconds =
typeof params.runTimeoutSeconds === "number" && Number.isFinite(params.runTimeoutSeconds)
? Math.max(0, Math.floor(params.runTimeoutSeconds))
: (legacyTimeoutSeconds ?? 0);
typeof timeoutSecondsCandidate === "number" && Number.isFinite(timeoutSecondsCandidate)
? Math.max(0, Math.floor(timeoutSecondsCandidate))
: 0;
let modelWarning: string | undefined;
let modelApplied = false;

View File

@@ -0,0 +1,219 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { createTestRegistry } from "../../test-utils/channel-plugins.js";
import { extractAssistantText, sanitizeTextContent } from "./sessions-helpers.js";
const callGatewayMock = vi.fn();
vi.mock("../../gateway/call.js", () => ({
callGateway: (opts: unknown) => callGatewayMock(opts),
}));
vi.mock("../../config/config.js", async (importOriginal) => {
const actual = await importOriginal<typeof import("../../config/config.js")>();
return {
...actual,
loadConfig: () =>
({
session: { scope: "per-sender", mainKey: "main" },
tools: { agentToAgent: { enabled: false } },
}) as never,
};
});
import { createSessionsListTool } from "./sessions-list-tool.js";
import { createSessionsSendTool } from "./sessions-send-tool.js";
const loadResolveAnnounceTarget = async () => await import("./sessions-announce-target.js");
// Install a minimal plugin registry with Discord and WhatsApp channel stubs.
// The WhatsApp stub sets preferSessionLookupForAnnounceTarget so announce-target
// resolution consults sessions.list for it, while Discord resolves from the
// session key alone.
const installRegistry = async () => {
  const { setActivePluginRegistry } = await import("../../plugins/runtime.js");
  setActivePluginRegistry(
    createTestRegistry([
      {
        pluginId: "discord",
        source: "test",
        plugin: {
          id: "discord",
          meta: {
            id: "discord",
            label: "Discord",
            selectionLabel: "Discord",
            docsPath: "/channels/discord",
            blurb: "Discord test stub.",
          },
          capabilities: { chatTypes: ["direct", "channel", "thread"] },
          config: {
            listAccountIds: () => ["default"],
            resolveAccount: () => ({}),
          },
        },
      },
      {
        pluginId: "whatsapp",
        source: "test",
        plugin: {
          id: "whatsapp",
          meta: {
            id: "whatsapp",
            label: "WhatsApp",
            selectionLabel: "WhatsApp",
            docsPath: "/channels/whatsapp",
            blurb: "WhatsApp test stub.",
            // Forces announce-target resolution through sessions.list.
            preferSessionLookupForAnnounceTarget: true,
          },
          capabilities: { chatTypes: ["direct", "group"] },
          config: {
            listAccountIds: () => ["default"],
            resolveAccount: () => ({}),
          },
        },
      },
    ]),
  );
};
// sanitizeTextContent strips model/tool artifacts (tool-call XML, downgraded
// tool-call markers, <think> blocks) from assistant text before display.
describe("sanitizeTextContent", () => {
  it("strips minimax tool call XML and downgraded markers", () => {
    // Mix of raw <invoke> XML, a stray </minimax:tool_call> close tag, and a
    // "[Tool Call: ...]" downgraded marker — all must be removed.
    const input =
      'Hello <invoke name="tool">payload</invoke></minimax:tool_call> ' +
      "[Tool Call: foo (ID: 1)] world";
    const result = sanitizeTextContent(input).trim();
    expect(result).toBe("Hello world");
    expect(result).not.toContain("invoke");
    expect(result).not.toContain("Tool Call");
  });
  it("strips thinking tags", () => {
    const input = "Before <think>secret</think> after";
    const result = sanitizeTextContent(input).trim();
    expect(result).toBe("Before after");
  });
});
// extractAssistantText flattens assistant content blocks into a single
// sanitized string, with special handling for transcript-marked errors.
describe("extractAssistantText", () => {
  it("sanitizes blocks without injecting newlines", () => {
    const message = {
      role: "assistant",
      content: [
        { type: "text", text: "Hi " },
        { type: "text", text: "<think>secret</think>there" },
      ],
    };
    // Blocks are concatenated directly; no separator is inserted between them.
    expect(extractAssistantText(message)).toBe("Hi there");
  });
  it("rewrites error-ish assistant text only when the transcript marks it as an error", () => {
    const message = {
      role: "assistant",
      stopReason: "error",
      errorMessage: "500 Internal Server Error",
      content: [{ type: "text", text: "500 Internal Server Error" }],
    };
    // stopReason === "error" plus errorMessage triggers the HTTP-style rewrite.
    expect(extractAssistantText(message)).toBe("HTTP 500: Internal Server Error");
  });
  it("keeps normal status text that mentions billing", () => {
    // No error marker on the message, so billing-related wording is left as-is.
    const message = {
      role: "assistant",
      content: [
        {
          type: "text",
          text: "Firebase downgraded us to the free Spark plan. Check whether billing should be re-enabled.",
        },
      ],
    };
    expect(extractAssistantText(message)).toBe(
      "Firebase downgraded us to the free Spark plan. Check whether billing should be re-enabled.",
    );
  });
});
// resolveAnnounceTarget derives an announce delivery target from a session key,
// hydrating channel-specific details (e.g. WhatsApp accountId) via the gateway
// only when the channel plugin opts into session lookup.
describe("resolveAnnounceTarget", () => {
  beforeEach(async () => {
    // Fresh gateway mock and plugin registry per test so call counts are isolated.
    callGatewayMock.mockReset();
    await installRegistry();
  });
  it("derives non-WhatsApp announce targets from the session key", async () => {
    const { resolveAnnounceTarget } = await loadResolveAnnounceTarget();
    const target = await resolveAnnounceTarget({
      sessionKey: "agent:main:discord:group:dev",
      displayKey: "agent:main:discord:group:dev",
    });
    // The Discord stub does not set preferSessionLookupForAnnounceTarget, so the
    // target is computed purely from the key and no gateway round-trip happens.
    expect(target).toEqual({ channel: "discord", to: "channel:dev" });
    expect(callGatewayMock).not.toHaveBeenCalled();
  });
  it("hydrates WhatsApp accountId from sessions.list when available", async () => {
    const { resolveAnnounceTarget } = await loadResolveAnnounceTarget();
    // The WhatsApp stub prefers session lookup, so the resolver queries the
    // gateway; the stored deliveryContext supplies the accountId.
    callGatewayMock.mockResolvedValueOnce({
      sessions: [
        {
          key: "agent:main:whatsapp:group:123@g.us",
          deliveryContext: {
            channel: "whatsapp",
            to: "123@g.us",
            accountId: "work",
          },
        },
      ],
    });
    const target = await resolveAnnounceTarget({
      sessionKey: "agent:main:whatsapp:group:123@g.us",
      displayKey: "agent:main:whatsapp:group:123@g.us",
    });
    expect(target).toEqual({
      channel: "whatsapp",
      to: "123@g.us",
      accountId: "work",
    });
    // Exactly one gateway call, and it must be sessions.list.
    expect(callGatewayMock).toHaveBeenCalledTimes(1);
    const first = callGatewayMock.mock.calls[0]?.[0] as { method?: string } | undefined;
    expect(first).toBeDefined();
    expect(first?.method).toBe("sessions.list");
  });
});
// With tools.agentToAgent.enabled mocked to false (see the config mock above),
// sessions_list must hide sessions belonging to other agents.
describe("sessions_list gating", () => {
  beforeEach(() => {
    callGatewayMock.mockReset();
    // Gateway returns one session for this agent and one for another agent.
    callGatewayMock.mockResolvedValue({
      path: "/tmp/sessions.json",
      sessions: [
        { key: "agent:main:main", kind: "direct" },
        { key: "agent:other:main", kind: "direct" },
      ],
    });
  });
  it("filters out other agents when tools.agentToAgent.enabled is false", async () => {
    const tool = createSessionsListTool({ agentSessionKey: "agent:main:main" });
    const result = await tool.execute("call1", {});
    // Only the caller's own session survives the filter.
    expect(result.details).toMatchObject({
      count: 1,
      sessions: [{ key: "agent:main:main" }],
    });
  });
});
// With tools.agentToAgent.enabled mocked to false (see the config mock above),
// sessions_send must refuse cross-agent targets before any gateway call.
describe("sessions_send gating", () => {
  beforeEach(() => {
    callGatewayMock.mockReset();
  });
  it("blocks cross-agent sends when tools.agentToAgent.enabled is false", async () => {
    const tool = createSessionsSendTool({
      agentSessionKey: "agent:main:main",
      agentChannel: "whatsapp",
    });
    const result = await tool.execute("call1", {
      sessionKey: "agent:other:main",
      message: "hi",
      timeoutSeconds: 0,
    });
    // Rejected locally: no gateway traffic, explicit forbidden status.
    expect(callGatewayMock).not.toHaveBeenCalled();
    expect(result.details).toMatchObject({ status: "forbidden" });
  });
});

View File

@@ -1,5 +1,8 @@
export type ExtractMode = "markdown" | "text";
const READABILITY_MAX_HTML_CHARS = 1_000_000;
const READABILITY_MAX_ESTIMATED_NESTING_DEPTH = 3_000;
let readabilityDepsPromise:
| Promise<{
Readability: typeof import("@mozilla/readability").Readability;
@@ -107,6 +110,100 @@ export function truncateText(
return { text: value.slice(0, maxChars), truncated: true };
}
/**
 * Cheap heuristic that scans `html` and returns true when the estimated
 * open-tag nesting depth exceeds `maxDepth`.
 *
 * Used to skip Readability+DOM parsing on pathological HTML (deep
 * attacker-controlled nesting like "<div><div>..." causes stack/memory
 * blowups in a real parser). This is NOT an HTML parser: comment bodies,
 * script/style contents, and long attribute lists are only approximated,
 * and mis-estimates err toward triggering the cheaper fallback path.
 */
function exceedsEstimatedHtmlNestingDepth(html: string, maxDepth: number): boolean {
  // Tags that never take a closing tag and therefore never add depth.
  const voidTags = new Set([
    "area",
    "base",
    "br",
    "col",
    "embed",
    "hr",
    "img",
    "input",
    "link",
    "meta",
    "param",
    "source",
    "track",
    "wbr",
  ]);
  let depth = 0;
  const len = html.length;
  for (let i = 0; i < len; i++) {
    // Only '<' (char code 60) can start a tag; skip everything else.
    if (html.charCodeAt(i) !== 60) {
      continue;
    }
    const next = html.charCodeAt(i + 1);
    // Ignore declarations/comments ("<!...") and processing instructions ("<?...").
    if (next === 33 || next === 63) {
      continue;
    }
    let j = i + 1;
    let closing = false;
    // "</" (47 is '/') marks a closing tag.
    if (html.charCodeAt(j) === 47) {
      closing = true;
      j += 1;
    }
    // Skip whitespace/control characters before the tag name.
    while (j < len && html.charCodeAt(j) <= 32) {
      j += 1;
    }
    const nameStart = j;
    // Consume the tag name: [A-Za-z0-9:-].
    while (j < len) {
      const c = html.charCodeAt(j);
      const isNameChar =
        (c >= 65 && c <= 90) || // A-Z
        (c >= 97 && c <= 122) || // a-z
        (c >= 48 && c <= 57) || // 0-9
        c === 58 || // :
        c === 45; // -
      if (!isNameChar) {
        break;
      }
      j += 1;
    }
    const tagName = html.slice(nameStart, j).toLowerCase();
    if (!tagName) {
      // A bare '<' in text with no name following: not a tag.
      continue;
    }
    if (closing) {
      // Clamp at zero so stray close tags cannot drive the depth negative.
      depth = Math.max(0, depth - 1);
      continue;
    }
    if (voidTags.has(tagName)) {
      continue;
    }
    // Best-effort self-closing detection: scan a short window for "/>"
    // (62 is '>', 47 is '/'). Attribute lists longer than the window make the
    // tag count as open, which only errs toward returning true earlier.
    let selfClosing = false;
    for (let k = j; k < len && k < j + 200; k++) {
      const c = html.charCodeAt(k);
      if (c === 62) {
        if (html.charCodeAt(k - 1) === 47) {
          selfClosing = true;
        }
        break;
      }
    }
    if (selfClosing) {
      continue;
    }
    depth += 1;
    if (depth > maxDepth) {
      return true;
    }
  }
  return false;
}
export async function extractReadableContent(params: {
html: string;
url: string;
@@ -120,6 +217,12 @@ export async function extractReadableContent(params: {
}
return rendered;
};
if (
params.html.length > READABILITY_MAX_HTML_CHARS ||
exceedsEstimatedHtmlNestingDepth(params.html, READABILITY_MAX_ESTIMATED_NESTING_DEPTH)
) {
return fallback();
}
try {
const { Readability, parseHTML } = await loadReadabilityDeps();
const { document } = parseHTML(params.html);

View File

@@ -0,0 +1,66 @@
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import * as ssrf from "../../infra/net/ssrf.js";
import { createWebFetchTool } from "./web-tools.js";
// Avoid dynamic-importing heavy readability deps in this unit test suite.
vi.mock("./web-fetch-utils.js", async () => {
const actual =
await vi.importActual<typeof import("./web-fetch-utils.js")>("./web-fetch-utils.js");
return {
...actual,
extractReadableContent: vi.fn().mockResolvedValue({
title: "HTML Page",
text: "HTML Page\n\nContent here.",
}),
};
});
const lookupMock = vi.fn();
const resolvePinnedHostname = ssrf.resolvePinnedHostname;
const baseToolConfig = {
config: {
tools: {
web: { fetch: { cacheTtlMinutes: 0, firecrawl: { enabled: false }, maxResponseBytes: 1024 } },
},
},
} as const;
// web_fetch must cap how many bytes it reads from a response body so an
// endless stream cannot hang the tool or exhaust memory.
describe("web_fetch response size limits", () => {
  const priorFetch = global.fetch;
  beforeEach(() => {
    // Pin DNS resolution to a fixed address so the SSRF hostname check passes
    // deterministically without performing a real lookup.
    lookupMock.mockResolvedValue([{ address: "93.184.216.34", family: 4 }]);
    vi.spyOn(ssrf, "resolvePinnedHostname").mockImplementation((hostname) =>
      resolvePinnedHostname(hostname, lookupMock),
    );
  });
  afterEach(() => {
    // @ts-expect-error restore
    global.fetch = priorFetch;
    lookupMock.mockReset();
    vi.restoreAllMocks();
  });
  it("caps response bytes and does not hang on endless streams", async () => {
    // A ReadableStream whose pull() always enqueues another chunk — an
    // infinite body. baseToolConfig sets maxResponseBytes to 1024.
    const chunk = new TextEncoder().encode("<html><body><div>hi</div></body></html>");
    const stream = new ReadableStream<Uint8Array>({
      pull(controller) {
        controller.enqueue(chunk);
      },
    });
    const response = new Response(stream, {
      status: 200,
      headers: { "content-type": "text/html; charset=utf-8" },
    });
    const fetchSpy = vi.fn().mockResolvedValue(response);
    // @ts-expect-error mock fetch
    global.fetch = fetchSpy;
    const tool = createWebFetchTool(baseToolConfig);
    const result = await tool?.execute?.("call", { url: "https://example.com/stream" });
    // The tool must return (not hang) and surface a truncation warning.
    expect(result?.details?.warning).toContain("Response body truncated");
  });
});

View File

@@ -33,8 +33,12 @@ export { extractReadableContent } from "./web-fetch-utils.js";
const EXTRACT_MODES = ["markdown", "text"] as const;
const DEFAULT_FETCH_MAX_CHARS = 50_000;
const DEFAULT_FETCH_MAX_RESPONSE_BYTES = 2_000_000;
const FETCH_MAX_RESPONSE_BYTES_MIN = 32_000;
const FETCH_MAX_RESPONSE_BYTES_MAX = 10_000_000;
const DEFAULT_FETCH_MAX_REDIRECTS = 3;
const DEFAULT_ERROR_MAX_CHARS = 4_000;
const DEFAULT_ERROR_MAX_BYTES = 64_000;
const DEFAULT_FIRECRAWL_BASE_URL = "https://api.firecrawl.dev";
const DEFAULT_FIRECRAWL_MAX_AGE_MS = 172_800_000;
const DEFAULT_FETCH_USER_AGENT =
@@ -108,6 +112,18 @@ function resolveFetchMaxCharsCap(fetch?: WebFetchConfig): number {
return Math.max(100, Math.floor(raw));
}
/**
 * Resolve the response-size cap (in bytes) for web_fetch.
 *
 * Missing, non-numeric, non-finite, or non-positive configuration falls back
 * to the default; valid values are floored to an integer and clamped into the
 * supported [min, max] range.
 */
function resolveFetchMaxResponseBytes(fetch?: WebFetchConfig): number {
  let configured: number | undefined;
  if (fetch && "maxResponseBytes" in fetch && typeof fetch.maxResponseBytes === "number") {
    configured = fetch.maxResponseBytes;
  }
  if (configured === undefined || !Number.isFinite(configured) || configured <= 0) {
    return DEFAULT_FETCH_MAX_RESPONSE_BYTES;
  }
  const floored = Math.floor(configured);
  if (floored < FETCH_MAX_RESPONSE_BYTES_MIN) {
    return FETCH_MAX_RESPONSE_BYTES_MIN;
  }
  if (floored > FETCH_MAX_RESPONSE_BYTES_MAX) {
    return FETCH_MAX_RESPONSE_BYTES_MAX;
  }
  return floored;
}
function resolveFirecrawlConfig(fetch?: WebFetchConfig): FirecrawlFetchConfig {
if (!fetch || typeof fetch !== "object") {
return undefined;
@@ -409,15 +425,7 @@ export async function fetchFirecrawlContent(params: {
};
}
async function runWebFetch(params: {
url: string;
extractMode: ExtractMode;
maxChars: number;
maxRedirects: number;
timeoutSeconds: number;
cacheTtlMs: number;
userAgent: string;
readabilityEnabled: boolean;
type FirecrawlRuntimeParams = {
firecrawlEnabled: boolean;
firecrawlApiKey?: string;
firecrawlBaseUrl: string;
@@ -426,7 +434,72 @@ async function runWebFetch(params: {
firecrawlProxy: "auto" | "basic" | "stealth";
firecrawlStoreInCache: boolean;
firecrawlTimeoutSeconds: number;
}): Promise<Record<string, unknown>> {
};
type WebFetchRuntimeParams = FirecrawlRuntimeParams & {
url: string;
extractMode: ExtractMode;
maxChars: number;
maxResponseBytes: number;
maxRedirects: number;
timeoutSeconds: number;
cacheTtlMs: number;
userAgent: string;
readabilityEnabled: boolean;
};
/**
 * Build the argument object for fetchFirecrawlContent from runtime params.
 *
 * Returns null when the Firecrawl fallback is disabled or no API key is
 * configured, letting callers gate the fallback with one truthiness check.
 */
function toFirecrawlContentParams(
  params: FirecrawlRuntimeParams & { url: string; extractMode: ExtractMode },
): Parameters<typeof fetchFirecrawlContent>[0] | null {
  const {
    url,
    extractMode,
    firecrawlEnabled,
    firecrawlApiKey,
    firecrawlBaseUrl,
    firecrawlOnlyMainContent,
    firecrawlMaxAgeMs,
    firecrawlProxy,
    firecrawlStoreInCache,
    firecrawlTimeoutSeconds,
  } = params;
  if (!firecrawlEnabled || !firecrawlApiKey) {
    return null;
  }
  return {
    url,
    extractMode,
    apiKey: firecrawlApiKey,
    baseUrl: firecrawlBaseUrl,
    onlyMainContent: firecrawlOnlyMainContent,
    maxAgeMs: firecrawlMaxAgeMs,
    proxy: firecrawlProxy,
    storeInCache: firecrawlStoreInCache,
    timeoutSeconds: firecrawlTimeoutSeconds,
  };
}
/**
 * Try the Firecrawl fallback for a web_fetch request and, on success, build
 * and cache the web_fetch payload.
 *
 * Returns null when Firecrawl is disabled or has no API key so the caller can
 * continue with its normal error path. `urlToFetch` is the URL handed to
 * Firecrawl (e.g. the final redirect target), while `params.url` stays the raw
 * caller-supplied URL recorded in the payload.
 */
async function maybeFetchFirecrawlWebFetchPayload(
  params: WebFetchRuntimeParams & {
    urlToFetch: string;
    finalUrlFallback: string;
    statusFallback: number;
    cacheKey: string;
    tookMs: number;
  },
): Promise<Record<string, unknown> | null> {
  const firecrawlParams = toFirecrawlContentParams({
    ...params,
    url: params.urlToFetch,
    extractMode: params.extractMode,
  });
  if (!firecrawlParams) {
    // Fallback disabled or unconfigured: nothing to attempt.
    return null;
  }
  const firecrawl = await fetchFirecrawlContent(firecrawlParams);
  const payload = buildFirecrawlWebFetchPayload({
    firecrawl,
    rawUrl: params.url,
    finalUrlFallback: params.finalUrlFallback,
    statusFallback: params.statusFallback,
    extractMode: params.extractMode,
    maxChars: params.maxChars,
    tookMs: params.tookMs,
  });
  // Cache under the caller-provided key so later fetches of the same URL hit
  // the cache just like a successful direct fetch.
  writeCache(FETCH_CACHE, params.cacheKey, payload, params.cacheTtlMs);
  return payload;
}
async function runWebFetch(params: WebFetchRuntimeParams): Promise<Record<string, unknown>> {
const cacheKey = normalizeCacheKey(
`fetch:${params.url}:${params.extractMode}:${params.maxChars}`,
);
@@ -477,28 +550,15 @@ async function runWebFetch(params: {
if (error instanceof SsrFBlockedError) {
throw error;
}
if (params.firecrawlEnabled && params.firecrawlApiKey) {
const firecrawl = await fetchFirecrawlContent({
url: finalUrl,
extractMode: params.extractMode,
apiKey: params.firecrawlApiKey,
baseUrl: params.firecrawlBaseUrl,
onlyMainContent: params.firecrawlOnlyMainContent,
maxAgeMs: params.firecrawlMaxAgeMs,
proxy: params.firecrawlProxy,
storeInCache: params.firecrawlStoreInCache,
timeoutSeconds: params.firecrawlTimeoutSeconds,
});
const payload = buildFirecrawlWebFetchPayload({
firecrawl,
rawUrl: params.url,
finalUrlFallback: finalUrl,
statusFallback: 200,
extractMode: params.extractMode,
maxChars: params.maxChars,
tookMs: Date.now() - start,
});
writeCache(FETCH_CACHE, cacheKey, payload, params.cacheTtlMs);
const payload = await maybeFetchFirecrawlWebFetchPayload({
...params,
urlToFetch: finalUrl,
finalUrlFallback: finalUrl,
statusFallback: 200,
cacheKey,
tookMs: Date.now() - start,
});
if (payload) {
return payload;
}
throw error;
@@ -506,31 +566,19 @@ async function runWebFetch(params: {
try {
if (!res.ok) {
if (params.firecrawlEnabled && params.firecrawlApiKey) {
const firecrawl = await fetchFirecrawlContent({
url: params.url,
extractMode: params.extractMode,
apiKey: params.firecrawlApiKey,
baseUrl: params.firecrawlBaseUrl,
onlyMainContent: params.firecrawlOnlyMainContent,
maxAgeMs: params.firecrawlMaxAgeMs,
proxy: params.firecrawlProxy,
storeInCache: params.firecrawlStoreInCache,
timeoutSeconds: params.firecrawlTimeoutSeconds,
});
const payload = buildFirecrawlWebFetchPayload({
firecrawl,
rawUrl: params.url,
finalUrlFallback: finalUrl,
statusFallback: res.status,
extractMode: params.extractMode,
maxChars: params.maxChars,
tookMs: Date.now() - start,
});
writeCache(FETCH_CACHE, cacheKey, payload, params.cacheTtlMs);
const payload = await maybeFetchFirecrawlWebFetchPayload({
...params,
urlToFetch: params.url,
finalUrlFallback: finalUrl,
statusFallback: res.status,
cacheKey,
tookMs: Date.now() - start,
});
if (payload) {
return payload;
}
const rawDetail = await readResponseText(res);
const rawDetailResult = await readResponseText(res, { maxBytes: DEFAULT_ERROR_MAX_BYTES });
const rawDetail = rawDetailResult.text;
const detail = formatWebFetchErrorDetail({
detail: rawDetail,
contentType: res.headers.get("content-type"),
@@ -542,7 +590,11 @@ async function runWebFetch(params: {
const contentType = res.headers.get("content-type") ?? "application/octet-stream";
const normalizedContentType = normalizeContentType(contentType) ?? "application/octet-stream";
const body = await readResponseText(res);
const bodyResult = await readResponseText(res, { maxBytes: params.maxResponseBytes });
const body = bodyResult.text;
const responseTruncatedWarning = bodyResult.truncated
? `Response body truncated after ${params.maxResponseBytes} bytes.`
: undefined;
let title: string | undefined;
let extractor = "raw";
@@ -593,6 +645,7 @@ async function runWebFetch(params: {
const wrapped = wrapWebFetchContent(text, params.maxChars);
const wrappedTitle = title ? wrapWebFetchField(title) : undefined;
const wrappedWarning = wrapWebFetchField(responseTruncatedWarning);
const payload = {
url: params.url, // Keep raw for tool chaining
finalUrl, // Keep raw
@@ -613,6 +666,7 @@ async function runWebFetch(params: {
fetchedAt: new Date().toISOString(),
tookMs: Date.now() - start,
text: wrapped.text,
warning: wrappedWarning,
};
writeCache(FETCH_CACHE, cacheKey, payload, params.cacheTtlMs);
return payload;
@@ -623,33 +677,15 @@ async function runWebFetch(params: {
}
}
async function tryFirecrawlFallback(params: {
url: string;
extractMode: ExtractMode;
firecrawlEnabled: boolean;
firecrawlApiKey?: string;
firecrawlBaseUrl: string;
firecrawlOnlyMainContent: boolean;
firecrawlMaxAgeMs: number;
firecrawlProxy: "auto" | "basic" | "stealth";
firecrawlStoreInCache: boolean;
firecrawlTimeoutSeconds: number;
}): Promise<{ text: string; title?: string } | null> {
if (!params.firecrawlEnabled || !params.firecrawlApiKey) {
async function tryFirecrawlFallback(
params: FirecrawlRuntimeParams & { url: string; extractMode: ExtractMode },
): Promise<{ text: string; title?: string } | null> {
const firecrawlParams = toFirecrawlContentParams(params);
if (!firecrawlParams) {
return null;
}
try {
const firecrawl = await fetchFirecrawlContent({
url: params.url,
extractMode: params.extractMode,
apiKey: params.firecrawlApiKey,
baseUrl: params.firecrawlBaseUrl,
onlyMainContent: params.firecrawlOnlyMainContent,
maxAgeMs: params.firecrawlMaxAgeMs,
proxy: params.firecrawlProxy,
storeInCache: params.firecrawlStoreInCache,
timeoutSeconds: params.firecrawlTimeoutSeconds,
});
const firecrawl = await fetchFirecrawlContent(firecrawlParams);
return { text: firecrawl.text, title: firecrawl.title };
} catch {
return null;
@@ -695,6 +731,7 @@ export function createWebFetchTool(options?: {
const userAgent =
(fetch && "userAgent" in fetch && typeof fetch.userAgent === "string" && fetch.userAgent) ||
DEFAULT_FETCH_USER_AGENT;
const maxResponseBytes = resolveFetchMaxResponseBytes(fetch);
return {
label: "Web Fetch",
name: "web_fetch",
@@ -715,6 +752,7 @@ export function createWebFetchTool(options?: {
DEFAULT_FETCH_MAX_CHARS,
maxCharsCap,
),
maxResponseBytes,
maxRedirects: resolveMaxRedirects(fetch?.maxRedirects, DEFAULT_FETCH_MAX_REDIRECTS),
timeoutSeconds: resolveTimeoutSeconds(fetch?.timeoutSeconds, DEFAULT_TIMEOUT_SECONDS),
cacheTtlMs: resolveCacheTtlMs(fetch?.cacheTtlMinutes, DEFAULT_CACHE_TTL_MINUTES),

View File

@@ -1,30 +1,7 @@
import { describe, expect, it } from "vitest";
import { withEnv } from "../../test-utils/env.js";
import { __testing } from "./web-search.js";
/**
 * Run `fn` with `process.env` temporarily overridden by `env`, restoring the
 * previous state (including deleting keys that were originally unset) after
 * `fn` returns or throws.
 *
 * An override value of `undefined` removes the variable for the duration of
 * `fn`, which keeps tests hermetic even on machines with real keys set.
 */
function withEnv<T>(env: Record<string, string | undefined>, fn: () => T): T {
  const saved = new Map<string, string | undefined>();
  for (const [name, override] of Object.entries(env)) {
    saved.set(name, process.env[name]);
    if (override === undefined) {
      delete process.env[name];
    } else {
      process.env[name] = override;
    }
  }
  try {
    return fn();
  } finally {
    for (const [name, original] of saved) {
      if (original === undefined) {
        delete process.env[name];
      } else {
        process.env[name] = original;
      }
    }
  }
}
const {
inferPerplexityBaseUrlFromApiKey,
resolvePerplexityBaseUrl,

View File

@@ -486,7 +486,8 @@ async function runPerplexitySearch(params: {
});
if (!res.ok) {
const detail = await readResponseText(res);
const detailResult = await readResponseText(res, { maxBytes: 64_000 });
const detail = detailResult.text;
throw new Error(`Perplexity API error (${res.status}): ${detail || res.statusText}`);
}
@@ -535,7 +536,8 @@ async function runGrokSearch(params: {
});
if (!res.ok) {
const detail = await readResponseText(res);
const detailResult = await readResponseText(res, { maxBytes: 64_000 });
const detail = detailResult.text;
throw new Error(`xAI API error (${res.status}): ${detail || res.statusText}`);
}
@@ -665,7 +667,8 @@ async function runWebSearch(params: {
});
if (!res.ok) {
const detail = await readResponseText(res);
const detailResult = await readResponseText(res, { maxBytes: 64_000 });
const detail = detailResult.text;
throw new Error(`Brave Search API error (${res.status}): ${detail || res.statusText}`);
}

View File

@@ -86,10 +86,85 @@ export function withTimeout(signal: AbortSignal | undefined, timeoutMs: number):
return controller.signal;
}
export async function readResponseText(res: Response): Promise<string> {
export type ReadResponseTextResult = {
text: string;
truncated: boolean;
bytesRead: number;
};
export async function readResponseText(
res: Response,
options?: { maxBytes?: number },
): Promise<ReadResponseTextResult> {
const maxBytesRaw = options?.maxBytes;
const maxBytes =
typeof maxBytesRaw === "number" && Number.isFinite(maxBytesRaw) && maxBytesRaw > 0
? Math.floor(maxBytesRaw)
: undefined;
const body = (res as unknown as { body?: unknown }).body;
if (
maxBytes &&
body &&
typeof body === "object" &&
"getReader" in body &&
typeof (body as { getReader: () => unknown }).getReader === "function"
) {
const reader = (body as ReadableStream<Uint8Array>).getReader();
const decoder = new TextDecoder();
let bytesRead = 0;
let truncated = false;
const parts: string[] = [];
try {
while (true) {
const { value, done } = await reader.read();
if (done) {
break;
}
if (!value || value.byteLength === 0) {
continue;
}
let chunk = value;
if (bytesRead + chunk.byteLength > maxBytes) {
const remaining = Math.max(0, maxBytes - bytesRead);
if (remaining <= 0) {
truncated = true;
break;
}
chunk = chunk.subarray(0, remaining);
truncated = true;
}
bytesRead += chunk.byteLength;
parts.push(decoder.decode(chunk, { stream: true }));
if (truncated || bytesRead >= maxBytes) {
truncated = true;
break;
}
}
} catch {
// Best-effort: return whatever we decoded so far.
} finally {
if (truncated) {
try {
await reader.cancel();
} catch {
// ignore
}
}
}
parts.push(decoder.decode());
return { text: parts.join(""), truncated, bytesRead };
}
try {
return await res.text();
const text = await res.text();
return { text, truncated: false, bytesRead: text.length };
} catch {
return "";
return { text: "", truncated: false, bytesRead: 0 };
}
}

View File

@@ -1,4 +1,5 @@
import type { OpenClawConfig } from "../config/config.js";
import { logWarn } from "../logger.js";
import { redactIdentifier } from "../logging/redact-identifier.js";
import {
classifySessionKeyShape,
@@ -8,6 +9,7 @@ import {
} from "../routing/session-key.js";
import { resolveUserPath } from "../utils.js";
import { resolveAgentWorkspaceDir, resolveDefaultAgentId } from "./agent-scope.js";
import { sanitizeForPromptLiteral } from "./sanitize-for-prompt.js";
export type WorkspaceFallbackReason = "missing" | "blank" | "invalid_type";
type AgentIdSource = "explicit" | "session_key" | "default";
@@ -84,8 +86,12 @@ export function resolveRunWorkspaceDir(params: {
if (typeof requested === "string") {
const trimmed = requested.trim();
if (trimmed) {
const sanitized = sanitizeForPromptLiteral(trimmed);
if (sanitized !== trimmed) {
logWarn("Control/format characters stripped from workspaceDir (OC-19 hardening).");
}
return {
workspaceDir: resolveUserPath(trimmed),
workspaceDir: resolveUserPath(sanitized),
usedFallback: false,
agentId,
agentIdSource,
@@ -96,8 +102,12 @@ export function resolveRunWorkspaceDir(params: {
const fallbackReason: WorkspaceFallbackReason =
requested == null ? "missing" : typeof requested === "string" ? "blank" : "invalid_type";
const fallbackWorkspace = resolveAgentWorkspaceDir(params.config ?? {}, agentId);
const sanitizedFallback = sanitizeForPromptLiteral(fallbackWorkspace);
if (sanitizedFallback !== fallbackWorkspace) {
logWarn("Control/format characters stripped from fallback workspaceDir (OC-19 hardening).");
}
return {
workspaceDir: resolveUserPath(fallbackWorkspace),
workspaceDir: resolveUserPath(sanitizedFallback),
usedFallback: true,
fallbackReason,
agentId,

View File

@@ -127,7 +127,10 @@ describe("group intro prompts", () => {
vi.mocked(runEmbeddedPiAgent).mock.calls.at(-1)?.[0]?.extraSystemPrompt ?? "";
expect(extraSystemPrompt).toContain('"channel": "discord"');
expect(extraSystemPrompt).toContain(
`You are replying inside a Discord group chat. Activation: trigger-only (you are invoked only when explicitly mentioned; recent context may be included). ${groupParticipationNote} Address the specific sender noted in the message context.`,
`You are in the Discord group chat "Release Squad". Participants: Alice, Bob.`,
);
expect(extraSystemPrompt).toContain(
`Activation: trigger-only (you are invoked only when explicitly mentioned; recent context may be included). ${groupParticipationNote} Address the specific sender noted in the message context.`,
);
});
});
@@ -158,8 +161,12 @@ describe("group intro prompts", () => {
const extraSystemPrompt =
vi.mocked(runEmbeddedPiAgent).mock.calls.at(-1)?.[0]?.extraSystemPrompt ?? "";
expect(extraSystemPrompt).toContain('"channel": "whatsapp"');
expect(extraSystemPrompt).toContain(`You are in the WhatsApp group chat "Ops".`);
expect(extraSystemPrompt).toContain(
`You are replying inside a WhatsApp group chat. Activation: trigger-only (you are invoked only when explicitly mentioned; recent context may be included). WhatsApp IDs: SenderId is the participant JID (group participant id). ${groupParticipationNote} Address the specific sender noted in the message context.`,
`WhatsApp IDs: SenderId is the participant JID (group participant id).`,
);
expect(extraSystemPrompt).toContain(
`Activation: trigger-only (you are invoked only when explicitly mentioned; recent context may be included). WhatsApp IDs: SenderId is the participant JID (group participant id). ${groupParticipationNote} Address the specific sender noted in the message context.`,
);
});
});
@@ -190,8 +197,9 @@ describe("group intro prompts", () => {
const extraSystemPrompt =
vi.mocked(runEmbeddedPiAgent).mock.calls.at(-1)?.[0]?.extraSystemPrompt ?? "";
expect(extraSystemPrompt).toContain('"channel": "telegram"');
expect(extraSystemPrompt).toContain(`You are in the Telegram group chat "Dev Chat".`);
expect(extraSystemPrompt).toContain(
`You are replying inside a Telegram group chat. Activation: trigger-only (you are invoked only when explicitly mentioned; recent context may be included). ${groupParticipationNote} Address the specific sender noted in the message context.`,
`Activation: trigger-only (you are invoked only when explicitly mentioned; recent context may be included). ${groupParticipationNote} Address the specific sender noted in the message context.`,
);
});
});

View File

@@ -1,106 +0,0 @@
import { describe, expect, it } from "vitest";
import type { OpenClawConfig } from "../../config/config.js";
import type { TemplateContext } from "../templating.js";
import { buildThreadingToolContext } from "./agent-runner-utils.js";
describe("buildThreadingToolContext", () => {
  const cfg = {} as OpenClawConfig;

  // Shared shorthand: builds the threading tool context from partial session
  // fields, defaulting to an empty config and no replied-ref.
  const build = (fields: Record<string, unknown>, config: OpenClawConfig = cfg) =>
    buildThreadingToolContext({
      sessionCtx: fields as unknown as TemplateContext,
      config,
      hasRepliedRef: undefined,
    });

  it("uses conversation id for WhatsApp", () => {
    const result = build({ Provider: "whatsapp", From: "123@g.us", To: "+15550001" });
    expect(result.currentChannelId).toBe("123@g.us");
  });

  it("falls back to To for WhatsApp when From is missing", () => {
    const result = build({ Provider: "whatsapp", To: "+15550001" });
    expect(result.currentChannelId).toBe("+15550001");
  });

  it("uses the recipient id for other channels", () => {
    const result = build({ Provider: "telegram", From: "user:42", To: "chat:99" });
    expect(result.currentChannelId).toBe("chat:99");
  });

  it("uses the sender handle for iMessage direct chats", () => {
    const result = build({
      Provider: "imessage",
      ChatType: "direct",
      From: "imessage:+15550001",
      To: "chat_id:12",
    });
    expect(result.currentChannelId).toBe("imessage:+15550001");
  });

  it("uses chat_id for iMessage groups", () => {
    const result = build({
      Provider: "imessage",
      ChatType: "group",
      From: "imessage:group:7",
      To: "chat_id:7",
    });
    expect(result.currentChannelId).toBe("chat_id:7");
  });

  it("prefers MessageThreadId for Slack tool threading", () => {
    const result = build(
      { Provider: "slack", To: "channel:C1", MessageThreadId: "123.456" },
      { channels: { slack: { replyToMode: "all" } } } as OpenClawConfig,
    );
    expect(result.currentChannelId).toBe("C1");
    expect(result.currentThreadTs).toBe("123.456");
  });
});

View File

@@ -1,583 +0,0 @@
import fs from "node:fs/promises";
import { tmpdir } from "node:os";
import path from "node:path";
import { afterAll, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import * as sessions from "../../config/sessions.js";
import {
createMinimalRun,
getRunEmbeddedPiAgentMock,
installRunReplyAgentTypingHeartbeatTestHooks,
} from "./agent-runner.heartbeat-typing.test-harness.js";
// Shape of the callback bag runReplyAgent hands to the (mocked) embedded agent
// run. Only the callbacks these tests drive are modelled here.
type AgentRunParams = {
  onPartialReply?: (payload: { text?: string }) => Promise<void> | void;
  onAssistantMessageStart?: () => Promise<void> | void;
  onReasoningStream?: (payload: { text?: string }) => Promise<void> | void;
  onBlockReply?: (payload: { text?: string; mediaUrls?: string[] }) => Promise<void> | void;
  onToolResult?: (payload: { text?: string; mediaUrls?: string[] }) => Promise<void> | void;
  onAgentEvent?: (evt: { stream: string; data: Record<string, unknown> }) => void;
};
// Shared hoisted mock for the embedded agent run; reset per test by the harness hooks.
const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
// Suite-level temp directory root: populated in beforeAll, removed in afterAll.
let fixtureRoot = "";
// Monotonic counter so each withTempStateDir case gets a unique subdirectory.
let caseId = 0;
// Saved value of the single env var these tests mutate (OPENCLAW_STATE_DIR).
type StateEnvSnapshot = {
  OPENCLAW_STATE_DIR: string | undefined;
};
/** Captures the current OPENCLAW_STATE_DIR env value so a test can restore it later. */
function snapshotStateEnv(): StateEnvSnapshot {
  const { OPENCLAW_STATE_DIR } = process.env;
  return { OPENCLAW_STATE_DIR };
}
/** Restores OPENCLAW_STATE_DIR to a captured value, deleting the var if it was unset. */
function restoreStateEnv(snapshot: StateEnvSnapshot) {
  const saved = snapshot.OPENCLAW_STATE_DIR;
  if (saved === undefined) {
    delete process.env.OPENCLAW_STATE_DIR;
  } else {
    process.env.OPENCLAW_STATE_DIR = saved;
  }
}
/**
 * Runs `fn` with OPENCLAW_STATE_DIR pointed at a freshly created per-case
 * directory under the suite fixture root, restoring the previous env value
 * afterwards (even if `fn` throws).
 */
async function withTempStateDir<T>(fn: (stateDir: string) => Promise<T>): Promise<T> {
  caseId += 1;
  const stateDir = path.join(fixtureRoot, `case-${caseId}`);
  await fs.mkdir(stateDir, { recursive: true });
  const saved = snapshotStateEnv();
  process.env.OPENCLAW_STATE_DIR = stateDir;
  try {
    return await fn(stateDir);
  } finally {
    restoreStateEnv(saved);
  }
}
async function writeCorruptGeminiSessionFixture(params: {
stateDir: string;
sessionId: string;
persistStore: boolean;
}) {
const storePath = path.join(params.stateDir, "sessions", "sessions.json");
const sessionEntry = { sessionId: params.sessionId, updatedAt: Date.now() };
const sessionStore = { main: sessionEntry };
await fs.mkdir(path.dirname(storePath), { recursive: true });
if (params.persistStore) {
await fs.writeFile(storePath, JSON.stringify(sessionStore), "utf-8");
}
const transcriptPath = sessions.resolveSessionTranscriptPath(params.sessionId);
await fs.mkdir(path.dirname(transcriptPath), { recursive: true });
await fs.writeFile(transcriptPath, "bad", "utf-8");
return { storePath, sessionEntry, sessionStore, transcriptPath };
}
// Behavioural coverage for runReplyAgent: when typing indicators start per
// typing mode, and how session state is recovered after compaction failures,
// context overflow, role-ordering conflicts, and corrupted history.
describe("runReplyAgent typing (heartbeat)", () => {
  installRunReplyAgentTypingHeartbeatTestHooks();
  // One shared temp root per suite; withTempStateDir allocates subdirectories.
  beforeAll(async () => {
    fixtureRoot = await fs.mkdtemp(path.join(tmpdir(), "openclaw-typing-heartbeat-"));
  });
  afterAll(async () => {
    if (fixtureRoot) {
      await fs.rm(fixtureRoot, { recursive: true, force: true });
    }
  });
  beforeEach(() => {
    // NOTE(review): presumably shortens internal delays/debounces under test — confirm flag semantics.
    vi.stubEnv("OPENCLAW_TEST_FAST", "1");
  });
  // Default mode: streamed partial text reaches the consumer AND starts typing.
  it("signals typing for normal runs", async () => {
    const onPartialReply = vi.fn();
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onPartialReply?.({ text: "hi" });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      opts: { isHeartbeat: false, onPartialReply },
    });
    await run();
    expect(onPartialReply).toHaveBeenCalled();
    expect(typing.startTypingOnText).toHaveBeenCalledWith("hi");
    expect(typing.startTypingLoop).toHaveBeenCalled();
  });
  // Typing is driven by the runner itself, not by the consumer's handler.
  it("signals typing even without consumer partial handler", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onPartialReply?.({ text: "hi" });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "message",
    });
    await run();
    expect(typing.startTypingOnText).toHaveBeenCalledWith("hi");
    expect(typing.startTypingLoop).not.toHaveBeenCalled();
  });
  // Heartbeat runs must stay invisible: no typing indicator at all.
  it("never signals typing for heartbeat runs", async () => {
    const onPartialReply = vi.fn();
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onPartialReply?.({ text: "hi" });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      opts: { isHeartbeat: true, onPartialReply },
    });
    await run();
    expect(onPartialReply).toHaveBeenCalled();
    expect(typing.startTypingOnText).not.toHaveBeenCalled();
    expect(typing.startTypingLoop).not.toHaveBeenCalled();
  });
  // NO_REPLY is the "send nothing" sentinel: neither partials nor typing leak out.
  it("suppresses partial streaming for NO_REPLY", async () => {
    const onPartialReply = vi.fn();
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onPartialReply?.({ text: "NO_REPLY" });
      return { payloads: [{ text: "NO_REPLY" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      opts: { isHeartbeat: false, onPartialReply },
      typingMode: "message",
    });
    await run();
    expect(onPartialReply).not.toHaveBeenCalled();
    expect(typing.startTypingOnText).not.toHaveBeenCalled();
    expect(typing.startTypingLoop).not.toHaveBeenCalled();
  });
  // In "message" mode an assistant-message-start event alone is not enough.
  it("does not start typing on assistant message start without prior text in message mode", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onAssistantMessageStart?.();
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "message",
    });
    await run();
    expect(typing.startTypingLoop).not.toHaveBeenCalled();
    expect(typing.startTypingOnText).not.toHaveBeenCalled();
  });
  // "thinking" mode: reasoning stream starts the typing loop, text does not.
  it("starts typing from reasoning stream in thinking mode", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onReasoningStream?.({ text: "Reasoning:\n_step_" });
      await params.onPartialReply?.({ text: "hi" });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "thinking",
    });
    await run();
    expect(typing.startTypingLoop).toHaveBeenCalled();
    expect(typing.startTypingOnText).not.toHaveBeenCalled();
  });
  // "never" mode disables typing entirely regardless of streamed text.
  it("suppresses typing in never mode", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onPartialReply?.({ text: "hi" });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "never",
    });
    await run();
    expect(typing.startTypingOnText).not.toHaveBeenCalled();
    expect(typing.startTypingLoop).not.toHaveBeenCalled();
  });
  // Block replies are trimmed before forwarding ("\n\nchunk" -> "chunk") and
  // the consumer receives abort/timeout plumbing alongside the payload.
  it("signals typing on normalized block replies", async () => {
    const onBlockReply = vi.fn();
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onBlockReply?.({ text: "\n\nchunk", mediaUrls: [] });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "message",
      blockStreamingEnabled: true,
      opts: { onBlockReply },
    });
    await run();
    expect(typing.startTypingOnText).toHaveBeenCalledWith("chunk");
    expect(onBlockReply).toHaveBeenCalled();
    const [blockPayload, blockOpts] = onBlockReply.mock.calls[0] ?? [];
    expect(blockPayload).toMatchObject({ text: "chunk", audioAsVoice: false });
    expect(blockOpts).toMatchObject({
      abortSignal: expect.any(AbortSignal),
      timeoutMs: expect.any(Number),
    });
  });
  // Tool results are forwarded verbatim and also trigger typing.
  it("signals typing on tool results", async () => {
    const onToolResult = vi.fn();
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onToolResult?.({ text: "tooling", mediaUrls: [] });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "message",
      opts: { onToolResult },
    });
    await run();
    expect(typing.startTypingOnText).toHaveBeenCalledWith("tooling");
    expect(onToolResult).toHaveBeenCalledWith({
      text: "tooling",
      mediaUrls: [],
    });
  });
  // NO_REPLY tool results are dropped entirely: no typing, no forwarding.
  it("skips typing for silent tool results", async () => {
    const onToolResult = vi.fn();
    runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
      await params.onToolResult?.({ text: "NO_REPLY", mediaUrls: [] });
      return { payloads: [{ text: "final" }], meta: {} };
    });
    const { run, typing } = createMinimalRun({
      typingMode: "message",
      opts: { onToolResult },
    });
    await run();
    expect(typing.startTypingOnText).not.toHaveBeenCalled();
    expect(onToolResult).not.toHaveBeenCalled();
  });
  // Verbose mode surfaces a user-visible notice and bumps the per-session count.
  it("announces auto-compaction in verbose mode and tracks count", async () => {
    await withTempStateDir(async (stateDir) => {
      const storePath = path.join(stateDir, "sessions", "sessions.json");
      const sessionEntry = { sessionId: "session", updatedAt: Date.now() };
      const sessionStore = { main: sessionEntry };
      runEmbeddedPiAgentMock.mockImplementationOnce(async (params: AgentRunParams) => {
        params.onAgentEvent?.({
          stream: "compaction",
          data: { phase: "end", willRetry: false },
        });
        return { payloads: [{ text: "final" }], meta: {} };
      });
      const { run } = createMinimalRun({
        resolvedVerboseLevel: "on",
        sessionEntry,
        sessionStore,
        sessionKey: "main",
        storePath,
      });
      const res = await run();
      expect(Array.isArray(res)).toBe(true);
      const payloads = res as { text?: string }[];
      expect(payloads[0]?.text).toContain("Auto-compaction complete");
      expect(payloads[0]?.text).toContain("count 1");
      expect(sessionStore.main.compactionCount).toBe(1);
    });
  });
  // A summarization failure thrown mid-run resets to a fresh session id and
  // persists the new id to the store.
  it("retries after compaction failure by resetting the session", async () => {
    await withTempStateDir(async (stateDir) => {
      const sessionId = "session";
      const storePath = path.join(stateDir, "sessions", "sessions.json");
      const transcriptPath = sessions.resolveSessionTranscriptPath(sessionId);
      const sessionEntry = { sessionId, updatedAt: Date.now(), sessionFile: transcriptPath };
      const sessionStore = { main: sessionEntry };
      await fs.mkdir(path.dirname(storePath), { recursive: true });
      await fs.writeFile(storePath, JSON.stringify(sessionStore), "utf-8");
      await fs.mkdir(path.dirname(transcriptPath), { recursive: true });
      await fs.writeFile(transcriptPath, "ok", "utf-8");
      runEmbeddedPiAgentMock.mockImplementationOnce(async () => {
        throw new Error(
          'Context overflow: Summarization failed: 400 {"message":"prompt is too long"}',
        );
      });
      const { run } = createMinimalRun({
        sessionEntry,
        sessionStore,
        sessionKey: "main",
        storePath,
      });
      const res = await run();
      expect(runEmbeddedPiAgentMock).toHaveBeenCalledTimes(1);
      const payload = Array.isArray(res) ? res[0] : res;
      expect(payload).toMatchObject({
        text: expect.stringContaining("Context limit exceeded during compaction"),
      });
      expect(payload.text?.toLowerCase()).toContain("reset");
      expect(sessionStore.main.sessionId).not.toBe(sessionId);
      const persisted = JSON.parse(await fs.readFile(storePath, "utf-8"));
      expect(persisted.main.sessionId).toBe(sessionStore.main.sessionId);
    });
  });
  // Same reset path when the overflow is reported via an error payload (not thrown).
  it("retries after context overflow payload by resetting the session", async () => {
    await withTempStateDir(async (stateDir) => {
      const sessionId = "session";
      const storePath = path.join(stateDir, "sessions", "sessions.json");
      const transcriptPath = sessions.resolveSessionTranscriptPath(sessionId);
      const sessionEntry = { sessionId, updatedAt: Date.now(), sessionFile: transcriptPath };
      const sessionStore = { main: sessionEntry };
      await fs.mkdir(path.dirname(storePath), { recursive: true });
      await fs.writeFile(storePath, JSON.stringify(sessionStore), "utf-8");
      await fs.mkdir(path.dirname(transcriptPath), { recursive: true });
      await fs.writeFile(transcriptPath, "ok", "utf-8");
      runEmbeddedPiAgentMock.mockImplementationOnce(async () => ({
        payloads: [{ text: "Context overflow: prompt too large", isError: true }],
        meta: {
          durationMs: 1,
          error: {
            kind: "context_overflow",
            message: 'Context overflow: Summarization failed: 400 {"message":"prompt is too long"}',
          },
        },
      }));
      const { run } = createMinimalRun({
        sessionEntry,
        sessionStore,
        sessionKey: "main",
        storePath,
      });
      const res = await run();
      expect(runEmbeddedPiAgentMock).toHaveBeenCalledTimes(1);
      const payload = Array.isArray(res) ? res[0] : res;
      expect(payload).toMatchObject({
        text: expect.stringContaining("Context limit exceeded"),
      });
      expect(payload.text?.toLowerCase()).toContain("reset");
      expect(sessionStore.main.sessionId).not.toBe(sessionId);
      const persisted = JSON.parse(await fs.readFile(storePath, "utf-8"));
      expect(persisted.main.sessionId).toBe(sessionStore.main.sessionId);
    });
  });
  // Role-ordering error payloads reset the session AND delete the transcript.
  it("resets the session after role ordering payloads", async () => {
    await withTempStateDir(async (stateDir) => {
      const sessionId = "session";
      const storePath = path.join(stateDir, "sessions", "sessions.json");
      const transcriptPath = sessions.resolveSessionTranscriptPath(sessionId);
      const sessionEntry = { sessionId, updatedAt: Date.now(), sessionFile: transcriptPath };
      const sessionStore = { main: sessionEntry };
      await fs.mkdir(path.dirname(storePath), { recursive: true });
      await fs.writeFile(storePath, JSON.stringify(sessionStore), "utf-8");
      await fs.mkdir(path.dirname(transcriptPath), { recursive: true });
      await fs.writeFile(transcriptPath, "ok", "utf-8");
      runEmbeddedPiAgentMock.mockImplementationOnce(async () => ({
        payloads: [{ text: "Message ordering conflict - please try again.", isError: true }],
        meta: {
          durationMs: 1,
          error: {
            kind: "role_ordering",
            message: 'messages: roles must alternate between "user" and "assistant"',
          },
        },
      }));
      const { run } = createMinimalRun({
        sessionEntry,
        sessionStore,
        sessionKey: "main",
        storePath,
      });
      const res = await run();
      const payload = Array.isArray(res) ? res[0] : res;
      expect(payload).toMatchObject({
        text: expect.stringContaining("Message ordering conflict"),
      });
      expect(payload.text?.toLowerCase()).toContain("reset");
      expect(sessionStore.main.sessionId).not.toBe(sessionId);
      await expect(fs.access(transcriptPath)).rejects.toBeDefined();
      const persisted = JSON.parse(await fs.readFile(storePath, "utf-8"));
      expect(persisted.main.sessionId).toBe(sessionStore.main.sessionId);
    });
  });
  // Gemini-style turn-order corruption drops the whole entry (no replacement id).
  it("resets corrupted Gemini sessions and deletes transcripts", async () => {
    await withTempStateDir(async (stateDir) => {
      const { storePath, sessionEntry, sessionStore, transcriptPath } =
        await writeCorruptGeminiSessionFixture({
          stateDir,
          sessionId: "session-corrupt",
          persistStore: true,
        });
      runEmbeddedPiAgentMock.mockImplementationOnce(async () => {
        throw new Error(
          "function call turn comes immediately after a user turn or after a function response turn",
        );
      });
      const { run } = createMinimalRun({
        sessionEntry,
        sessionStore,
        sessionKey: "main",
        storePath,
      });
      const res = await run();
      expect(res).toMatchObject({
        text: expect.stringContaining("Session history was corrupted"),
      });
      expect(sessionStore.main).toBeUndefined();
      await expect(fs.access(transcriptPath)).rejects.toThrow();
      const persisted = JSON.parse(await fs.readFile(storePath, "utf-8"));
      expect(persisted.main).toBeUndefined();
    });
  });
  // Unrecognized errors must NOT destroy session state or transcripts.
  it("keeps sessions intact on other errors", async () => {
    await withTempStateDir(async (stateDir) => {
      const sessionId = "session-ok";
      const storePath = path.join(stateDir, "sessions", "sessions.json");
      const sessionEntry = { sessionId, updatedAt: Date.now() };
      const sessionStore = { main: sessionEntry };
      await fs.mkdir(path.dirname(storePath), { recursive: true });
      await fs.writeFile(storePath, JSON.stringify(sessionStore), "utf-8");
      const transcriptPath = sessions.resolveSessionTranscriptPath(sessionId);
      await fs.mkdir(path.dirname(transcriptPath), { recursive: true });
      await fs.writeFile(transcriptPath, "ok", "utf-8");
      runEmbeddedPiAgentMock.mockImplementationOnce(async () => {
        throw new Error("INVALID_ARGUMENT: some other failure");
      });
      const { run } = createMinimalRun({
        sessionEntry,
        sessionStore,
        sessionKey: "main",
        storePath,
      });
      const res = await run();
      expect(res).toMatchObject({
        text: expect.stringContaining("Agent failed before reply"),
      });
      expect(sessionStore.main).toBeDefined();
      await expect(fs.access(transcriptPath)).resolves.toBeUndefined();
      const persisted = JSON.parse(await fs.readFile(storePath, "utf-8"));
      expect(persisted.main).toBeDefined();
    });
  });
  // A failing store save must not block the user-visible corruption reply.
  it("still replies even if session reset fails to persist", async () => {
    await withTempStateDir(async (stateDir) => {
      const saveSpy = vi
        .spyOn(sessions, "saveSessionStore")
        .mockRejectedValueOnce(new Error("boom"));
      try {
        const { storePath, sessionEntry, sessionStore, transcriptPath } =
          await writeCorruptGeminiSessionFixture({
            stateDir,
            sessionId: "session-corrupt",
            persistStore: false,
          });
        runEmbeddedPiAgentMock.mockImplementationOnce(async () => {
          throw new Error(
            "function call turn comes immediately after a user turn or after a function response turn",
          );
        });
        const { run } = createMinimalRun({
          sessionEntry,
          sessionStore,
          sessionKey: "main",
          storePath,
        });
        const res = await run();
        expect(res).toMatchObject({
          text: expect.stringContaining("Session history was corrupted"),
        });
        expect(sessionStore.main).toBeUndefined();
        await expect(fs.access(transcriptPath)).rejects.toThrow();
      } finally {
        saveSpy.mockRestore();
      }
    });
  });
  // Raw provider error text (incl. the status code) must not reach the user.
  it("returns friendly message for role ordering errors thrown as exceptions", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async () => {
      throw new Error("400 Incorrect role information");
    });
    const { run } = createMinimalRun({});
    const res = await run();
    expect(res).toMatchObject({
      text: expect.stringContaining("Message ordering conflict"),
    });
    expect(res).toMatchObject({
      text: expect.not.stringContaining("400"),
    });
  });
  it("returns friendly message for 'roles must alternate' errors thrown as exceptions", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async () => {
      throw new Error('messages: roles must alternate between "user" and "assistant"');
    });
    const { run } = createMinimalRun({});
    const res = await run();
    expect(res).toMatchObject({
      text: expect.stringContaining("Message ordering conflict"),
    });
  });
  // Bun fetch socket failures are wrapped in a friendly message with the
  // original detail preserved in a code fence.
  it("rewrites Bun socket errors into friendly text", async () => {
    runEmbeddedPiAgentMock.mockImplementationOnce(async () => ({
      payloads: [
        {
          text: "TypeError: The socket connection was closed unexpectedly. For more information, pass `verbose: true` in the second argument to fetch()",
          isError: true,
        },
      ],
      meta: {},
    }));
    const { run } = createMinimalRun();
    const res = await run();
    const payloads = Array.isArray(res) ? res : res ? [res] : [];
    expect(payloads.length).toBe(1);
    expect(payloads[0]?.text).toContain("LLM connection failed");
    expect(payloads[0]?.text).toContain("socket connection was closed unexpectedly");
    expect(payloads[0]?.text).toContain("```");
  });
});

View File

@@ -1,135 +0,0 @@
import { beforeAll, beforeEach, vi } from "vitest";
import type { SessionEntry } from "../../config/sessions.js";
import type { TypingMode } from "../../config/types.js";
import type { TemplateContext } from "../templating.js";
import type { GetReplyOptions } from "../types.js";
import type { FollowupRun, QueueSettings } from "./queue.js";
import { createMockTypingController } from "./test-helpers.js";
// Avoid exporting vitest mock types (TS2742 under pnpm + d.ts emit).
// oxlint-disable-next-line typescript/no-explicit-any
type AnyMock = any;
// Hoisted so the vi.mock factories below can reference the mock before imports run.
const state = vi.hoisted(() => ({
  runEmbeddedPiAgentMock: vi.fn(),
}));
// Cached lazy import of runReplyAgent so the module cost is paid exactly once.
let runReplyAgentPromise:
  | Promise<(typeof import("./agent-runner.js"))["runReplyAgent"]>
  | undefined;
// Imports agent-runner on first use (after the vi.mock registrations) and caches it.
async function getRunReplyAgent() {
  if (!runReplyAgentPromise) {
    runReplyAgentPromise = import("./agent-runner.js").then((m) => m.runReplyAgent);
  }
  return await runReplyAgentPromise;
}
// Accessor for the shared embedded-agent mock used by the test files.
export function getRunEmbeddedPiAgentMock(): AnyMock {
  return state.runEmbeddedPiAgentMock;
}
// Installs the suite hooks: pre-warm the agent-runner import, reset the mock per test.
export function installRunReplyAgentTypingHeartbeatTestHooks() {
  beforeAll(async () => {
    // Avoid attributing the initial agent-runner import cost to the first test case.
    await getRunReplyAgent();
  });
  beforeEach(() => {
    state.runEmbeddedPiAgentMock.mockReset();
  });
}
// Loads the shared mock bundle, binding it to this file's hoisted state.
async function loadHarnessMocks() {
  const { loadAgentRunnerHarnessMockBundle } = await import("./agent-runner.test-harness.mocks.js");
  return await loadAgentRunnerHarnessMockBundle(state);
}
// vi.mock calls are hoisted by vitest above the imports; the async factories
// pull the relevant slice of the bundle on first module resolution.
vi.mock("../../agents/model-fallback.js", async () => {
  return (await loadHarnessMocks()).modelFallback;
});
vi.mock("../../agents/pi-embedded.js", async () => {
  return (await loadHarnessMocks()).embeddedPi;
});
vi.mock("./queue.js", async () => {
  return (await loadHarnessMocks()).queue;
});
/**
 * Builds a minimal runReplyAgent invocation (with a mock typing controller)
 * so individual tests only override the knobs they care about. Returns the
 * typing mock, the caller-supplied opts, and a `run` thunk.
 */
export function createMinimalRun(params?: {
  opts?: GetReplyOptions;
  resolvedVerboseLevel?: "off" | "on";
  sessionStore?: Record<string, SessionEntry>;
  sessionEntry?: SessionEntry;
  sessionKey?: string;
  storePath?: string;
  typingMode?: TypingMode;
  blockStreamingEnabled?: boolean;
}) {
  const typing = createMockTypingController();
  const opts = params?.opts;
  const sessionKey = params?.sessionKey ?? "main";
  const verboseLevel = params?.resolvedVerboseLevel ?? "off";
  const sessionCtx = {
    Provider: "whatsapp",
    MessageSid: "msg",
  } as unknown as TemplateContext;
  const resolvedQueue = { mode: "interrupt" } as unknown as QueueSettings;
  // Fixed run configuration; only sessionKey and verboseLevel vary per test.
  const runConfig = {
    sessionId: "session",
    sessionKey,
    messageProvider: "whatsapp",
    sessionFile: "/tmp/session.jsonl",
    workspaceDir: "/tmp",
    config: {},
    skillsSnapshot: {},
    provider: "anthropic",
    model: "claude",
    thinkLevel: "low",
    verboseLevel,
    elevatedLevel: "off",
    bashElevated: {
      enabled: false,
      allowed: false,
      defaultLevel: "off",
    },
    timeoutMs: 1_000,
    blockReplyBreak: "message_end",
  };
  const followupRun = {
    prompt: "hello",
    summaryLine: "hello",
    enqueuedAt: Date.now(),
    run: runConfig,
  } as unknown as FollowupRun;
  const run = async () => {
    const runReplyAgent = await getRunReplyAgent();
    return runReplyAgent({
      commandBody: "hello",
      followupRun,
      queueKey: "main",
      resolvedQueue,
      shouldSteer: false,
      shouldFollowup: false,
      isActive: false,
      isStreaming: false,
      opts,
      typing,
      sessionEntry: params?.sessionEntry,
      sessionStore: params?.sessionStore,
      sessionKey,
      storePath: params?.storePath,
      sessionCtx,
      defaultModel: "anthropic/claude-opus-4-5",
      resolvedVerboseLevel: verboseLevel,
      isNewSession: false,
      blockStreamingEnabled: params?.blockStreamingEnabled ?? false,
      resolvedBlockStreamingBreak: "message_end",
      shouldInjectGroupIntro: false,
      typingMode: params?.typingMode ?? "instant",
    });
  };
  return { typing, opts, run };
}

View File

@@ -1,423 +0,0 @@
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterAll, beforeAll, describe, expect, it } from "vitest";
import {
createBaseRun,
getRunCliAgentMock,
getRunEmbeddedPiAgentMock,
seedSessionStore,
type EmbeddedRunParams,
} from "./agent-runner.memory-flush.test-harness.js";
import { DEFAULT_MEMORY_FLUSH_PROMPT } from "./memory-flush.js";
// Bound in beforeAll once the (mocked) agent-runner module has been imported.
let runReplyAgent: typeof import("./agent-runner.js").runReplyAgent;
// Suite-level temp root plus a counter that gives each case a unique directory.
let fixtureRoot = "";
let caseId = 0;
/** Allocates a fresh per-case directory and hands `fn` a sessions.json path inside it. */
async function withTempStore<T>(fn: (storePath: string) => Promise<T>): Promise<T> {
  caseId += 1;
  const caseDir = path.join(fixtureRoot, `case-${caseId}`);
  await fs.mkdir(caseDir, { recursive: true });
  return await fn(path.join(caseDir, "sessions.json"));
}
// Invokes runReplyAgent with the harness base run plus per-test session wiring.
// The session store handed to the agent is a single-key map around `sessionEntry`.
async function runReplyAgentWithBase(params: {
  baseRun: ReturnType<typeof createBaseRun>;
  storePath: string;
  sessionKey: string;
  sessionEntry: Record<string, unknown>;
  commandBody: string;
  typingMode?: "instant";
}): Promise<void> {
  const { typing, sessionCtx, resolvedQueue, followupRun } = params.baseRun;
  await runReplyAgent({
    commandBody: params.commandBody,
    followupRun,
    queueKey: params.sessionKey,
    resolvedQueue,
    shouldSteer: false,
    shouldFollowup: false,
    isActive: false,
    isStreaming: false,
    typing,
    sessionCtx,
    sessionEntry: params.sessionEntry,
    sessionStore: { [params.sessionKey]: params.sessionEntry },
    sessionKey: params.sessionKey,
    storePath: params.storePath,
    defaultModel: "anthropic/claude-opus-4-5",
    agentCfgContextTokens: 100_000,
    resolvedVerboseLevel: "off",
    isNewSession: false,
    blockStreamingEnabled: false,
    resolvedBlockStreamingBreak: "message_end",
    shouldInjectGroupIntro: false,
    typingMode: params.typingMode ?? "instant",
  });
}
// Asserts that no memory-flush turn runs when the sandbox denies workspace
// writes ("ro" or "none"): only the user prompt reaches the agent, and no
// memoryFlushAt marker is persisted to the store.
async function expectMemoryFlushSkippedWithWorkspaceAccess(
  workspaceAccess: "ro" | "none",
): Promise<void> {
  const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
  runEmbeddedPiAgentMock.mockReset();
  await withTempStore(async (storePath) => {
    const sessionKey = "main";
    // High token count + prior compaction: conditions that would otherwise trigger a flush.
    const sessionEntry = {
      sessionId: "session",
      updatedAt: Date.now(),
      totalTokens: 80_000,
      compactionCount: 1,
    };
    await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
    const calls: Array<{ prompt?: string }> = [];
    runEmbeddedPiAgentMock.mockImplementation(async (params: EmbeddedRunParams) => {
      calls.push({ prompt: params.prompt });
      return {
        payloads: [{ text: "ok" }],
        meta: { agentMeta: { usage: { input: 1, output: 1 } } },
      };
    });
    const baseRun = createBaseRun({
      storePath,
      sessionEntry,
      config: {
        agents: {
          defaults: {
            sandbox: { mode: "all", workspaceAccess },
          },
        },
      },
    });
    await runReplyAgentWithBase({
      baseRun,
      storePath,
      sessionKey,
      sessionEntry,
      commandBody: "hello",
    });
    // Only the user turn ran — no flush prompt was injected ahead of it.
    expect(calls.map((call) => call.prompt)).toEqual(["hello"]);
    const stored = JSON.parse(await fs.readFile(storePath, "utf-8"));
    expect(stored[sessionKey].memoryFlushAt).toBeUndefined();
  });
}
// Suite setup: create the fixture root, then import the (mocked) agent-runner.
beforeAll(async () => {
  fixtureRoot = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-memory-flush-"));
  ({ runReplyAgent } = await import("./agent-runner.js"));
});
// Suite teardown: remove every per-case fixture directory.
afterAll(async () => {
  if (fixtureRoot) {
    await fs.rm(fixtureRoot, { recursive: true, force: true });
  }
});
describe("runReplyAgent memory flush", () => {
it("skips memory flush for CLI providers", async () => {
const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
const runCliAgentMock = getRunCliAgentMock();
runEmbeddedPiAgentMock.mockReset();
runCliAgentMock.mockReset();
await withTempStore(async (storePath) => {
const sessionKey = "main";
const sessionEntry = {
sessionId: "session",
updatedAt: Date.now(),
totalTokens: 80_000,
compactionCount: 1,
};
await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
runEmbeddedPiAgentMock.mockImplementation(async () => ({
payloads: [{ text: "ok" }],
meta: { agentMeta: { usage: { input: 1, output: 1 } } },
}));
runCliAgentMock.mockResolvedValue({
payloads: [{ text: "ok" }],
meta: { agentMeta: { usage: { input: 1, output: 1 } } },
});
const baseRun = createBaseRun({
storePath,
sessionEntry,
runOverrides: { provider: "codex-cli" },
});
await runReplyAgentWithBase({
baseRun,
storePath,
sessionKey,
sessionEntry,
commandBody: "hello",
});
expect(runCliAgentMock).toHaveBeenCalledTimes(1);
const call = runCliAgentMock.mock.calls[0]?.[0] as { prompt?: string } | undefined;
expect(call?.prompt).toBe("hello");
expect(runEmbeddedPiAgentMock).not.toHaveBeenCalled();
});
});
it("uses configured prompts for memory flush runs", async () => {
const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
runEmbeddedPiAgentMock.mockReset();
await withTempStore(async (storePath) => {
const sessionKey = "main";
const sessionEntry = {
sessionId: "session",
updatedAt: Date.now(),
totalTokens: 80_000,
compactionCount: 1,
};
await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
const calls: Array<EmbeddedRunParams> = [];
runEmbeddedPiAgentMock.mockImplementation(async (params: EmbeddedRunParams) => {
calls.push(params);
if (params.prompt === DEFAULT_MEMORY_FLUSH_PROMPT) {
return { payloads: [], meta: {} };
}
return {
payloads: [{ text: "ok" }],
meta: { agentMeta: { usage: { input: 1, output: 1 } } },
};
});
const baseRun = createBaseRun({
storePath,
sessionEntry,
config: {
agents: {
defaults: {
compaction: {
memoryFlush: {
prompt: "Write notes.",
systemPrompt: "Flush memory now.",
},
},
},
},
},
runOverrides: { extraSystemPrompt: "extra system" },
});
await runReplyAgentWithBase({
baseRun,
storePath,
sessionKey,
sessionEntry,
commandBody: "hello",
});
const flushCall = calls[0];
expect(flushCall?.prompt).toContain("Write notes.");
expect(flushCall?.prompt).toContain("NO_REPLY");
expect(flushCall?.extraSystemPrompt).toContain("extra system");
expect(flushCall?.extraSystemPrompt).toContain("Flush memory now.");
expect(flushCall?.extraSystemPrompt).toContain("NO_REPLY");
expect(calls[1]?.prompt).toBe("hello");
});
});
// Verifies that an eligible session triggers a memory-flush turn before the
// user turn, and that the session store records when (and at which compaction
// count) the flush happened.
it("runs a memory flush turn and updates session metadata", async () => {
  const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
  runEmbeddedPiAgentMock.mockReset();
  await withTempStore(async (storePath) => {
    const sessionKey = "main";
    // High totalTokens plus an existing compaction makes the flush eligible.
    const sessionEntry = {
      sessionId: "session",
      updatedAt: Date.now(),
      totalTokens: 80_000,
      compactionCount: 1,
    };
    await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
    // Record only the prompts so call ordering can be asserted below.
    const calls: Array<{ prompt?: string }> = [];
    runEmbeddedPiAgentMock.mockImplementation(async (params: EmbeddedRunParams) => {
      calls.push({ prompt: params.prompt });
      // The flush turn yields no payloads; the user turn replies "ok".
      if (params.prompt === DEFAULT_MEMORY_FLUSH_PROMPT) {
        return { payloads: [], meta: {} };
      }
      return {
        payloads: [{ text: "ok" }],
        meta: { agentMeta: { usage: { input: 1, output: 1 } } },
      };
    });
    const baseRun = createBaseRun({
      storePath,
      sessionEntry,
    });
    await runReplyAgentWithBase({
      baseRun,
      storePath,
      sessionKey,
      sessionEntry,
      commandBody: "hello",
    });
    // The flush prompt must run first, then the actual user prompt.
    expect(calls.map((call) => call.prompt)).toEqual([DEFAULT_MEMORY_FLUSH_PROMPT, "hello"]);
    const stored = JSON.parse(await fs.readFile(storePath, "utf-8"));
    // Flush metadata is persisted back into the session store.
    expect(stored[sessionKey].memoryFlushAt).toBeTypeOf("number");
    expect(stored[sessionKey].memoryFlushCompactionCount).toBe(1);
  });
});
// Verifies that `compaction.memoryFlush.enabled: false` suppresses the flush
// turn entirely: only the user turn runs and no flush metadata is stored.
it("skips memory flush when disabled in config", async () => {
  const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
  runEmbeddedPiAgentMock.mockReset();
  await withTempStore(async (storePath) => {
    const sessionKey = "main";
    // Entry would otherwise be flush-eligible (high token count + compaction).
    const sessionEntry = {
      sessionId: "session",
      updatedAt: Date.now(),
      totalTokens: 80_000,
      compactionCount: 1,
    };
    await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
    runEmbeddedPiAgentMock.mockImplementation(async () => ({
      payloads: [{ text: "ok" }],
      meta: { agentMeta: { usage: { input: 1, output: 1 } } },
    }));
    const baseRun = createBaseRun({
      storePath,
      sessionEntry,
      config: { agents: { defaults: { compaction: { memoryFlush: { enabled: false } } } } },
    });
    await runReplyAgentWithBase({
      baseRun,
      storePath,
      sessionKey,
      sessionEntry,
      commandBody: "hello",
    });
    // Exactly one agent run: the user prompt, no flush turn.
    expect(runEmbeddedPiAgentMock).toHaveBeenCalledTimes(1);
    const call = runEmbeddedPiAgentMock.mock.calls[0]?.[0] as { prompt?: string } | undefined;
    expect(call?.prompt).toBe("hello");
    const stored = JSON.parse(await fs.readFile(storePath, "utf-8"));
    // No flush timestamp must be written when the feature is disabled.
    expect(stored[sessionKey].memoryFlushAt).toBeUndefined();
  });
});
// Verifies the once-per-compaction-cycle guard: when the stored
// memoryFlushCompactionCount already equals the current compactionCount,
// no new flush turn is issued.
it("skips memory flush after a prior flush in the same compaction cycle", async () => {
  const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
  runEmbeddedPiAgentMock.mockReset();
  await withTempStore(async (storePath) => {
    const sessionKey = "main";
    const sessionEntry = {
      sessionId: "session",
      updatedAt: Date.now(),
      totalTokens: 80_000,
      compactionCount: 2,
      // Matches compactionCount: a flush already ran in this cycle.
      memoryFlushCompactionCount: 2,
    };
    await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
    const calls: Array<{ prompt?: string }> = [];
    runEmbeddedPiAgentMock.mockImplementation(async (params: EmbeddedRunParams) => {
      calls.push({ prompt: params.prompt });
      return {
        payloads: [{ text: "ok" }],
        meta: { agentMeta: { usage: { input: 1, output: 1 } } },
      };
    });
    const baseRun = createBaseRun({
      storePath,
      sessionEntry,
    });
    await runReplyAgentWithBase({
      baseRun,
      storePath,
      sessionKey,
      sessionEntry,
      commandBody: "hello",
    });
    // Only the user prompt runs; no flush prompt is prepended.
    expect(calls.map((call) => call.prompt)).toEqual(["hello"]);
  });
});
// A flush needs to write memory files, so restricted sandbox workspace access
// ("ro" or "none") must skip it; the shared helper asserts the skip behavior.
it("skips memory flush when the sandbox workspace is read-only", async () => {
  await expectMemoryFlushSkippedWithWorkspaceAccess("ro");
});
it("skips memory flush when the sandbox workspace is none", async () => {
  await expectMemoryFlushSkippedWithWorkspaceAccess("none");
});
// Verifies that a compaction "end" event emitted during the flush turn bumps
// the stored compactionCount, and that memoryFlushCompactionCount tracks it.
it("increments compaction count when flush compaction completes", async () => {
  const runEmbeddedPiAgentMock = getRunEmbeddedPiAgentMock();
  runEmbeddedPiAgentMock.mockReset();
  await withTempStore(async (storePath) => {
    const sessionKey = "main";
    const sessionEntry = {
      sessionId: "session",
      updatedAt: Date.now(),
      totalTokens: 80_000,
      compactionCount: 1,
    };
    await seedSessionStore({ storePath, sessionKey, entry: sessionEntry });
    runEmbeddedPiAgentMock.mockImplementation(async (params: EmbeddedRunParams) => {
      if (params.prompt === DEFAULT_MEMORY_FLUSH_PROMPT) {
        // Simulate the agent finishing a compaction pass during the flush turn.
        params.onAgentEvent?.({
          stream: "compaction",
          data: { phase: "end", willRetry: false },
        });
        return { payloads: [], meta: {} };
      }
      return {
        payloads: [{ text: "ok" }],
        meta: { agentMeta: { usage: { input: 1, output: 1 } } },
      };
    });
    const baseRun = createBaseRun({
      storePath,
      sessionEntry,
    });
    await runReplyAgentWithBase({
      baseRun,
      storePath,
      sessionKey,
      sessionEntry,
      commandBody: "hello",
    });
    const stored = JSON.parse(await fs.readFile(storePath, "utf-8"));
    // Compaction completed once more (1 -> 2) and the flush cycle marker follows it.
    expect(stored[sessionKey].compactionCount).toBe(2);
    expect(stored[sessionKey].memoryFlushCompactionCount).toBe(2);
  });
});
});

View File

@@ -1,121 +0,0 @@
import fs from "node:fs/promises";
import path from "node:path";
import { vi } from "vitest";
import type { TemplateContext } from "../templating.js";
import type { FollowupRun, QueueSettings } from "./queue.js";
import { createMockTypingController } from "./test-helpers.js";
// Avoid exporting vitest mock types (TS2742 under pnpm + d.ts emit).
// oxlint-disable-next-line typescript/no-explicit-any
type AnyMock = any;
// Shape of the params the embedded-pi agent mock receives in tests.
type EmbeddedRunParams = {
  prompt?: string;
  extraSystemPrompt?: string;
  onAgentEvent?: (evt: { stream?: string; data?: { phase?: string; willRetry?: boolean } }) => void;
};
// Hoisted so the vi.mock factories below can reference it before module init.
const state = vi.hoisted(() => ({
  runEmbeddedPiAgentMock: vi.fn(),
  runCliAgentMock: vi.fn(),
}));
// Accessor for the embedded-pi agent mock shared by all harness consumers.
export function getRunEmbeddedPiAgentMock(): AnyMock {
  return state.runEmbeddedPiAgentMock;
}
// Accessor for the CLI agent mock shared by all harness consumers.
export function getRunCliAgentMock(): AnyMock {
  return state.runCliAgentMock;
}
export type { EmbeddedRunParams };
// Lazily loads the shared mock bundle; imported dynamically so the bundle
// module is only evaluated when a mocked module is first required.
async function loadHarnessMocks() {
  const { loadAgentRunnerHarnessMockBundle } = await import("./agent-runner.test-harness.mocks.js");
  return await loadAgentRunnerHarnessMockBundle(state);
}
vi.mock("../../agents/model-fallback.js", async () => {
  return (await loadHarnessMocks()).modelFallback;
});
vi.mock("../../agents/cli-runner.js", () => ({
  runCliAgent: (params: unknown) => state.runCliAgentMock(params),
}));
vi.mock("../../agents/pi-embedded.js", async () => {
  return (await loadHarnessMocks()).embeddedPi;
});
vi.mock("./queue.js", async () => {
  return (await loadHarnessMocks()).queue;
});
/**
 * Seed a session-store JSON file on disk containing a single session entry,
 * creating parent directories as needed.
 */
export async function seedSessionStore(params: {
  storePath: string;
  sessionKey: string;
  entry: Record<string, unknown>;
}) {
  const { storePath, sessionKey, entry } = params;
  // Make sure the parent directory exists before writing the store file.
  await fs.mkdir(path.dirname(storePath), { recursive: true });
  const serialized = JSON.stringify({ [sessionKey]: entry }, null, 2);
  await fs.writeFile(storePath, serialized, "utf-8");
}
/**
 * Builds the base run fixture shared by agent-runner tests: a mock typing
 * controller, a WhatsApp-flavored template context, an interrupt-mode queue,
 * and a FollowupRun whose `run` can be overridden per test.
 */
export function createBaseRun(params: {
  storePath: string;
  sessionEntry: Record<string, unknown>;
  config?: Record<string, unknown>;
  runOverrides?: Partial<FollowupRun["run"]>;
}) {
  const typing = createMockTypingController();
  // Minimal message context; cast because tests only touch these fields.
  const sessionCtx = {
    Provider: "whatsapp",
    OriginatingTo: "+15550001111",
    AccountId: "primary",
    MessageSid: "msg",
  } as unknown as TemplateContext;
  const resolvedQueue = { mode: "interrupt" } as unknown as QueueSettings;
  const followupRun = {
    prompt: "hello",
    summaryLine: "hello",
    enqueuedAt: Date.now(),
    run: {
      agentId: "main",
      agentDir: "/tmp/agent",
      sessionId: "session",
      sessionKey: "main",
      messageProvider: "whatsapp",
      sessionFile: "/tmp/session.jsonl",
      workspaceDir: "/tmp",
      config: params.config ?? {},
      skillsSnapshot: {},
      provider: "anthropic",
      model: "claude",
      thinkLevel: "low",
      verboseLevel: "off",
      elevatedLevel: "off",
      bashElevated: {
        enabled: false,
        allowed: false,
        defaultLevel: "off",
      },
      timeoutMs: 1_000,
      blockReplyBreak: "message_end",
    },
  } as unknown as FollowupRun;
  // Apply per-test overrides while keeping the caller's config authoritative
  // (overrides must not accidentally clobber an explicitly passed config).
  const run = {
    ...followupRun.run,
    ...params.runOverrides,
    config: params.config ?? followupRun.run.config,
  };
  return {
    typing,
    sessionCtx,
    resolvedQueue,
    followupRun: { ...followupRun, run },
  };
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,51 +0,0 @@
import { vi } from "vitest";
// Minimal shape of the hoisted mock state the embedded-pi mock factory delegates to.
export type AgentRunnerEmbeddedState = {
  runEmbeddedPiAgentMock: (params: unknown) => unknown;
};
/**
 * Mock module for ../../agents/model-fallback.js: runs the supplied callback
 * exactly once with the requested provider/model (no fallback attempts) and
 * echoes the pair back alongside the result.
 */
export function modelFallbackMockFactory(): Record<string, unknown> {
  const runWithModelFallback = async (args: {
    provider: string;
    model: string;
    run: (provider: string, model: string) => Promise<unknown>;
  }) => {
    const result = await args.run(args.provider, args.model);
    return { result, provider: args.provider, model: args.model };
  };
  return { runWithModelFallback };
}
/**
 * Mock module for ../../agents/pi-embedded.js: message queueing always reports
 * "not queued" and agent runs delegate to the shared hoisted mock.
 */
export function embeddedPiMockFactory(state: AgentRunnerEmbeddedState): Record<string, unknown> {
  return {
    queueEmbeddedPiMessage: vi.fn().mockReturnValue(false),
    runEmbeddedPiAgent: (params: unknown) => state.runEmbeddedPiAgentMock(params),
  };
}
/**
 * Mock module for ./queue.js: keeps the real implementation but stubs out the
 * followup enqueue/drain entry points so tests can observe scheduling calls.
 */
export async function queueMockFactory(): Promise<Record<string, unknown>> {
  const actual = await vi.importActual<typeof import("./queue.js")>("./queue.js");
  return {
    ...actual,
    enqueueFollowupRun: vi.fn(),
    scheduleFollowupDrain: vi.fn(),
  };
}
/** Assembles the full mock bundle consumed by the agent-runner test harness. */
export async function loadAgentRunnerHarnessMockBundle(state: AgentRunnerEmbeddedState): Promise<{
  modelFallback: Record<string, unknown>;
  embeddedPi: Record<string, unknown>;
  queue: Record<string, unknown>;
}> {
  return {
    modelFallback: modelFallbackMockFactory(),
    embeddedPi: embeddedPiMockFactory(state),
    queue: await queueMockFactory(),
  };
}

View File

@@ -1,114 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../../config/config.js";
import { callGateway } from "../../gateway/call.js";
import { handleCommands } from "./commands.js";
import { buildCommandTestParams } from "./commands.test-harness.js";
// Stub the gateway RPC so tests can assert what /approve submits without I/O.
vi.mock("../../gateway/call.js", () => ({
  callGateway: vi.fn(),
}));
// Covers /approve usage validation, gateway submission, and the
// operator.approvals / operator.admin scope gating for gateway clients.
describe("/approve command", () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });
  it("rejects invalid usage", async () => {
    const cfg = {
      commands: { text: true },
      channels: { whatsapp: { allowFrom: ["*"] } },
    } as OpenClawConfig;
    // No approval id/decision supplied -> usage error, no gateway call.
    const params = buildCommandTestParams("/approve", cfg);
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Usage: /approve");
  });
  it("submits approval", async () => {
    const cfg = {
      commands: { text: true },
      channels: { whatsapp: { allowFrom: ["*"] } },
    } as OpenClawConfig;
    const params = buildCommandTestParams("/approve abc allow-once", cfg, { SenderId: "123" });
    const mockCallGateway = vi.mocked(callGateway);
    mockCallGateway.mockResolvedValueOnce({ ok: true });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Exec approval allow-once submitted");
    // The approval must be resolved through the gateway RPC.
    expect(mockCallGateway).toHaveBeenCalledWith(
      expect.objectContaining({
        method: "exec.approval.resolve",
        params: { id: "abc", decision: "allow-once" },
      }),
    );
  });
  it("rejects gateway clients without approvals scope", async () => {
    const cfg = {
      commands: { text: true },
    } as OpenClawConfig;
    // operator.write alone is not sufficient to resolve approvals.
    const params = buildCommandTestParams("/approve abc allow-once", cfg, {
      Provider: "webchat",
      Surface: "webchat",
      GatewayClientScopes: ["operator.write"],
    });
    const mockCallGateway = vi.mocked(callGateway);
    mockCallGateway.mockResolvedValueOnce({ ok: true });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("requires operator.approvals");
    expect(mockCallGateway).not.toHaveBeenCalled();
  });
  it("allows gateway clients with approvals scope", async () => {
    const cfg = {
      commands: { text: true },
    } as OpenClawConfig;
    const params = buildCommandTestParams("/approve abc allow-once", cfg, {
      Provider: "webchat",
      Surface: "webchat",
      GatewayClientScopes: ["operator.approvals"],
    });
    const mockCallGateway = vi.mocked(callGateway);
    mockCallGateway.mockResolvedValueOnce({ ok: true });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Exec approval allow-once submitted");
    expect(mockCallGateway).toHaveBeenCalledWith(
      expect.objectContaining({
        method: "exec.approval.resolve",
        params: { id: "abc", decision: "allow-once" },
      }),
    );
  });
  it("allows gateway clients with admin scope", async () => {
    const cfg = {
      commands: { text: true },
    } as OpenClawConfig;
    // operator.admin implies approvals access.
    const params = buildCommandTestParams("/approve abc allow-once", cfg, {
      Provider: "webchat",
      Surface: "webchat",
      GatewayClientScopes: ["operator.admin"],
    });
    const mockCallGateway = vi.mocked(callGateway);
    mockCallGateway.mockResolvedValueOnce({ ok: true });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Exec approval allow-once submitted");
    expect(mockCallGateway).toHaveBeenCalledWith(
      expect.objectContaining({
        method: "exec.approval.resolve",
        params: { id: "abc", decision: "allow-once" },
      }),
    );
  });
});

View File

@@ -1,114 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../../config/config.js";
import { compactEmbeddedPiSession } from "../../agents/pi-embedded.js";
import { handleCompactCommand } from "./commands-compact.js";
import { buildCommandTestParams } from "./commands.test-harness.js";
// Stub the embedded agent, system events, and session bookkeeping so the
// command handler can be exercised in isolation.
vi.mock("../../agents/pi-embedded.js", () => ({
  abortEmbeddedPiRun: vi.fn(),
  compactEmbeddedPiSession: vi.fn(),
  isEmbeddedPiRunActive: vi.fn().mockReturnValue(false),
  waitForEmbeddedPiRunEnd: vi.fn().mockResolvedValue(undefined),
}));
vi.mock("../../infra/system-events.js", () => ({
  enqueueSystemEvent: vi.fn(),
}));
vi.mock("./session-updates.js", () => ({
  incrementCompactionCount: vi.fn(),
}));
// Covers /compact routing: non-match passthrough, sender authorization, and
// the metadata forwarded to compactEmbeddedPiSession for manual compaction.
describe("/compact command", () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });
  it("returns null when command is not /compact", async () => {
    const cfg = {
      commands: { text: true },
      channels: { whatsapp: { allowFrom: ["*"] } },
    } as OpenClawConfig;
    const params = buildCommandTestParams("/status", cfg);
    const result = await handleCompactCommand(
      {
        ...params,
      },
      true,
    );
    // Null signals "not handled" so other command handlers can run.
    expect(result).toBeNull();
    expect(vi.mocked(compactEmbeddedPiSession)).not.toHaveBeenCalled();
  });
  it("rejects unauthorized /compact commands", async () => {
    const cfg = {
      commands: { text: true },
      channels: { whatsapp: { allowFrom: ["*"] } },
    } as OpenClawConfig;
    const params = buildCommandTestParams("/compact", cfg);
    const result = await handleCompactCommand(
      {
        ...params,
        command: {
          ...params.command,
          isAuthorizedSender: false,
          senderId: "unauthorized",
        },
      },
      true,
    );
    // Handled (stops the pipeline) but compaction never starts.
    expect(result).toEqual({ shouldContinue: false });
    expect(vi.mocked(compactEmbeddedPiSession)).not.toHaveBeenCalled();
  });
  it("routes manual compaction with explicit trigger and context metadata", async () => {
    const cfg = {
      commands: { text: true },
      channels: { whatsapp: { allowFrom: ["*"] } },
      session: { store: "/tmp/openclaw-session-store.json" },
    } as OpenClawConfig;
    // "/compact: <text>" carries custom instructions after the colon.
    const params = buildCommandTestParams("/compact: focus on decisions", cfg, {
      From: "+15550001",
      To: "+15550002",
    });
    vi.mocked(compactEmbeddedPiSession).mockResolvedValueOnce({
      ok: true,
      compacted: false,
    });
    const result = await handleCompactCommand(
      {
        ...params,
        sessionEntry: {
          sessionId: "session-1",
          groupId: "group-1",
          groupChannel: "#general",
          space: "workspace-1",
          spawnedBy: "agent:main:parent",
          totalTokens: 12345,
        },
      },
      true,
    );
    expect(result?.shouldContinue).toBe(false);
    expect(vi.mocked(compactEmbeddedPiSession)).toHaveBeenCalledOnce();
    // Group/space/spawn metadata from the session entry must flow through.
    expect(vi.mocked(compactEmbeddedPiSession)).toHaveBeenCalledWith(
      expect.objectContaining({
        sessionId: "session-1",
        sessionKey: "agent:main:main",
        trigger: "manual",
        customInstructions: "focus on decisions",
        messageChannel: "whatsapp",
        groupId: "group-1",
        groupChannel: "#general",
        groupSpace: "workspace-1",
        spawnedBy: "agent:main:parent",
      }),
    );
  });
});

View File

@@ -1,13 +0,0 @@
import { describe, expect, it } from "vitest";
import { buildCommandsPaginationKeyboard } from "./commands-info.js";
// Verifies pagination keyboard layout and that the agent id is threaded into
// every callback_data payload (prev / page indicator / next).
describe("buildCommandsPaginationKeyboard", () => {
  it("adds agent id to callback data when provided", () => {
    const keyboard = buildCommandsPaginationKeyboard(2, 3, "agent-main");
    expect(keyboard[0]).toEqual([
      { text: "◀ Prev", callback_data: "commands_page_1:agent-main" },
      { text: "2/3", callback_data: "commands_page_noop:agent-main" },
      { text: "Next ▶", callback_data: "commands_page_3:agent-main" },
    ]);
  });
});

View File

@@ -1,85 +0,0 @@
import { describe, expect, it } from "vitest";
import type { OpenClawConfig } from "../../config/config.js";
import { extractMessageText } from "./commands-subagents.js";
import { handleCommands } from "./commands.js";
import { buildCommandTestParams } from "./commands.test-harness.js";
import { parseConfigCommand } from "./config-commands.js";
import { parseDebugCommand } from "./debug-commands.js";
// Covers /config parsing: bare command and "show"/"get" alias both map to the
// show action; "unset" carries the path; "set" parses JSON values.
describe("parseConfigCommand", () => {
  it("parses show/unset", () => {
    expect(parseConfigCommand("/config")).toEqual({ action: "show" });
    expect(parseConfigCommand("/config show")).toEqual({
      action: "show",
      path: undefined,
    });
    expect(parseConfigCommand("/config show foo.bar")).toEqual({
      action: "show",
      path: "foo.bar",
    });
    expect(parseConfigCommand("/config get foo.bar")).toEqual({
      action: "show",
      path: "foo.bar",
    });
    expect(parseConfigCommand("/config unset foo.bar")).toEqual({
      action: "unset",
      path: "foo.bar",
    });
  });
  it("parses set with JSON", () => {
    const cmd = parseConfigCommand('/config set foo={"a":1}');
    expect(cmd).toEqual({ action: "set", path: "foo", value: { a: 1 } });
  });
});
// /debug mirrors /config but supports "reset" instead of "get".
describe("parseDebugCommand", () => {
  it("parses show/reset", () => {
    expect(parseDebugCommand("/debug")).toEqual({ action: "show" });
    expect(parseDebugCommand("/debug show")).toEqual({ action: "show" });
    expect(parseDebugCommand("/debug reset")).toEqual({ action: "reset" });
  });
  it("parses set with JSON", () => {
    const cmd = parseDebugCommand('/debug set foo={"a":1}');
    expect(cmd).toEqual({ action: "set", path: "foo", value: { a: 1 } });
  });
  it("parses unset", () => {
    const cmd = parseDebugCommand("/debug unset foo.bar");
    expect(cmd).toEqual({ action: "unset", path: "foo.bar" });
  });
});
// Tool-call markers are stripped only from assistant messages: user text is
// preserved verbatim so legitimate user content is never mangled.
describe("extractMessageText", () => {
  it("preserves user text that looks like tool call markers", () => {
    const message = {
      role: "user",
      content: "Here [Tool Call: foo (ID: 1)] ok",
    };
    const result = extractMessageText(message);
    expect(result?.text).toContain("[Tool Call: foo (ID: 1)]");
  });
  it("sanitizes assistant tool call markers", () => {
    const message = {
      role: "assistant",
      content: "Here [Tool Call: foo (ID: 1)] ok",
    };
    const result = extractMessageText(message);
    expect(result?.text).toBe("Here ok");
  });
});
// Channel-level configWrites:false must block mutating /config commands.
describe("handleCommands /config configWrites gating", () => {
  it("blocks /config set when channel config writes are disabled", async () => {
    const cfg = {
      commands: { config: true, text: true },
      channels: { whatsapp: { allowFrom: ["*"], configWrites: false } },
    } as OpenClawConfig;
    const params = buildCommandTestParams('/config set messages.ackReaction=":)"', cfg);
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Config writes are disabled");
  });
});

View File

@@ -1,335 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../../config/config.js";
import type { MsgContext } from "../templating.js";
import { buildCommandContext, handleCommands } from "./commands.js";
import { parseInlineDirectives } from "./directive-handling.js";
// Hoisted spies for config-file I/O so vi.mock factories can close over them.
const readConfigFileSnapshotMock = vi.hoisted(() => vi.fn());
const validateConfigObjectWithPluginsMock = vi.hoisted(() => vi.fn());
const writeConfigFileMock = vi.hoisted(() => vi.fn());
// Keep the real config module but intercept snapshot/validate/write so tests
// can control config state and assert what gets persisted.
vi.mock("../../config/config.js", async () => {
  const actual =
    await vi.importActual<typeof import("../../config/config.js")>("../../config/config.js");
  return {
    ...actual,
    readConfigFileSnapshot: readConfigFileSnapshotMock,
    validateConfigObjectWithPlugins: validateConfigObjectWithPluginsMock,
    writeConfigFile: writeConfigFileMock,
  };
});
const readChannelAllowFromStoreMock = vi.hoisted(() => vi.fn());
const addChannelAllowFromStoreEntryMock = vi.hoisted(() => vi.fn());
const removeChannelAllowFromStoreEntryMock = vi.hoisted(() => vi.fn());
// Intercept pairing-store allowFrom reads/writes; the rest stays real.
vi.mock("../../pairing/pairing-store.js", async () => {
  const actual = await vi.importActual<typeof import("../../pairing/pairing-store.js")>(
    "../../pairing/pairing-store.js",
  );
  return {
    ...actual,
    readChannelAllowFromStore: readChannelAllowFromStoreMock,
    addChannelAllowFromStoreEntry: addChannelAllowFromStoreEntryMock,
    removeChannelAllowFromStoreEntry: removeChannelAllowFromStoreEntryMock,
  };
});
// Pretend only telegram supports pairing so channel lists are deterministic.
vi.mock("../../channels/plugins/pairing.js", async () => {
  const actual = await vi.importActual<typeof import("../../channels/plugins/pairing.js")>(
    "../../channels/plugins/pairing.js",
  );
  return {
    ...actual,
    listPairingChannels: () => ["telegram"],
  };
});
// Fixed model catalog so /models output is stable across test runs.
vi.mock("../../agents/model-catalog.js", () => ({
  loadModelCatalog: vi.fn(async () => [
    { provider: "anthropic", id: "claude-opus-4-5", name: "Claude Opus" },
    { provider: "anthropic", id: "claude-sonnet-4-5", name: "Claude Sonnet" },
    { provider: "openai", id: "gpt-4.1", name: "GPT-4.1" },
    { provider: "openai", id: "gpt-4.1-mini", name: "GPT-4.1 Mini" },
    { provider: "google", id: "gemini-2.0-flash", name: "Gemini Flash" },
  ]),
}));
/**
 * Builds the handleCommands params for a pre-authorized telegram text command;
 * ctxOverrides lets individual tests switch provider/surface or sender fields.
 */
function buildParams(commandBody: string, cfg: OpenClawConfig, ctxOverrides?: Partial<MsgContext>) {
  const ctx = {
    Body: commandBody,
    CommandBody: commandBody,
    CommandSource: "text",
    CommandAuthorized: true,
    Provider: "telegram",
    Surface: "telegram",
    ...ctxOverrides,
  } as MsgContext;
  const command = buildCommandContext({
    ctx,
    cfg,
    isGroup: false,
    triggerBodyNormalized: commandBody.trim().toLowerCase(),
    commandAuthorized: true,
  });
  return {
    ctx,
    cfg,
    command,
    directives: parseInlineDirectives(commandBody),
    elevated: { enabled: true, allowed: true, failures: [] },
    sessionKey: "agent:main:main",
    workspaceDir: "/tmp",
    defaultGroupActivation: () => "mention",
    resolvedVerboseLevel: "off" as const,
    resolvedReasoningLevel: "off" as const,
    resolveDefaultThinkingLevel: async () => undefined,
    provider: "telegram",
    model: "test-model",
    contextTokens: 0,
    isGroup: false,
  };
}
// Covers /allowlist list/add/remove: merging config + pairing-store entries,
// writing both stores on add, and canonicalizing legacy dm.allowFrom on remove.
describe("handleCommands /allowlist", () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });
  it("lists config + store allowFrom entries", async () => {
    readChannelAllowFromStoreMock.mockResolvedValueOnce(["456"]);
    const cfg = {
      commands: { text: true },
      channels: { telegram: { allowFrom: ["123", "@Alice"] } },
    } as OpenClawConfig;
    const params = buildParams("/allowlist list dm", cfg);
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Channel: telegram");
    // Config entries are normalized to lowercase in the listing.
    expect(result.reply?.text).toContain("DM allowFrom (config): 123, @alice");
    expect(result.reply?.text).toContain("Paired allowFrom (store): 456");
  });
  it("adds entries to config and pairing store", async () => {
    readConfigFileSnapshotMock.mockResolvedValueOnce({
      valid: true,
      parsed: {
        channels: { telegram: { allowFrom: ["123"] } },
      },
    });
    validateConfigObjectWithPluginsMock.mockImplementation((config: unknown) => ({
      ok: true,
      config,
    }));
    addChannelAllowFromStoreEntryMock.mockResolvedValueOnce({
      changed: true,
      allowFrom: ["123", "789"],
    });
    const cfg = {
      commands: { text: true, config: true },
      channels: { telegram: { allowFrom: ["123"] } },
    } as OpenClawConfig;
    const params = buildParams("/allowlist add dm 789", cfg);
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    // Both the config file and the pairing store receive the new entry.
    expect(writeConfigFileMock).toHaveBeenCalledWith(
      expect.objectContaining({
        channels: { telegram: { allowFrom: ["123", "789"] } },
      }),
    );
    expect(addChannelAllowFromStoreEntryMock).toHaveBeenCalledWith({
      channel: "telegram",
      entry: "789",
    });
    expect(result.reply?.text).toContain("DM allowlist added");
  });
  it("removes Slack DM allowlist entries from canonical allowFrom and deletes legacy dm.allowFrom", async () => {
    readConfigFileSnapshotMock.mockResolvedValueOnce({
      valid: true,
      parsed: {
        channels: {
          slack: {
            allowFrom: ["U111", "U222"],
            dm: { allowFrom: ["U111", "U222"] },
            configWrites: true,
          },
        },
      },
    });
    validateConfigObjectWithPluginsMock.mockImplementation((config: unknown) => ({
      ok: true,
      config,
    }));
    const cfg = {
      commands: { text: true, config: true },
      channels: {
        slack: {
          allowFrom: ["U111", "U222"],
          dm: { allowFrom: ["U111", "U222"] },
          configWrites: true,
        },
      },
    } as OpenClawConfig;
    const params = buildParams("/allowlist remove dm U111", cfg, {
      Provider: "slack",
      Surface: "slack",
    });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(writeConfigFileMock).toHaveBeenCalledTimes(1);
    const written = writeConfigFileMock.mock.calls[0]?.[0] as OpenClawConfig;
    // Removal migrates to canonical allowFrom and drops the legacy dm block.
    expect(written.channels?.slack?.allowFrom).toEqual(["U222"]);
    expect(written.channels?.slack?.dm?.allowFrom).toBeUndefined();
    expect(result.reply?.text).toContain("channels.slack.allowFrom");
  });
  it("removes Discord DM allowlist entries from canonical allowFrom and deletes legacy dm.allowFrom", async () => {
    readConfigFileSnapshotMock.mockResolvedValueOnce({
      valid: true,
      parsed: {
        channels: {
          discord: {
            allowFrom: ["111", "222"],
            dm: { allowFrom: ["111", "222"] },
            configWrites: true,
          },
        },
      },
    });
    validateConfigObjectWithPluginsMock.mockImplementation((config: unknown) => ({
      ok: true,
      config,
    }));
    const cfg = {
      commands: { text: true, config: true },
      channels: {
        discord: {
          allowFrom: ["111", "222"],
          dm: { allowFrom: ["111", "222"] },
          configWrites: true,
        },
      },
    } as OpenClawConfig;
    const params = buildParams("/allowlist remove dm 111", cfg, {
      Provider: "discord",
      Surface: "discord",
    });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(writeConfigFileMock).toHaveBeenCalledTimes(1);
    const written = writeConfigFileMock.mock.calls[0]?.[0] as OpenClawConfig;
    // Same canonicalization behavior as the Slack case above.
    expect(written.channels?.discord?.allowFrom).toEqual(["222"]);
    expect(written.channels?.discord?.dm?.allowFrom).toBeUndefined();
    expect(result.reply?.text).toContain("channels.discord.allowFrom");
  });
});
// Covers /models output: provider listing (text vs telegram buttons),
// pagination, unknown providers, and configured models outside the catalog.
describe("/models command", () => {
  const cfg = {
    commands: { text: true },
    agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } },
  } as unknown as OpenClawConfig;
  it.each(["discord", "whatsapp"])("lists providers on %s (text)", async (surface) => {
    const params = buildParams("/models", cfg, { Provider: surface, Surface: surface });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Providers:");
    expect(result.reply?.text).toContain("anthropic");
    expect(result.reply?.text).toContain("Use: /models <provider>");
  });
  it("lists providers on telegram (buttons)", async () => {
    // Telegram surfaces get an inline-button picker instead of plain text.
    const params = buildParams("/models", cfg, { Provider: "telegram", Surface: "telegram" });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toBe("Select a provider:");
    const buttons = (result.reply?.channelData as { telegram?: { buttons?: unknown[][] } })
      ?.telegram?.buttons;
    expect(buttons).toBeDefined();
    expect(buttons?.length).toBeGreaterThan(0);
  });
  it("lists provider models with pagination hints", async () => {
    // Use discord surface for text-based output tests
    const params = buildParams("/models anthropic", cfg, { Surface: "discord" });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Models (anthropic)");
    expect(result.reply?.text).toContain("page 1/");
    expect(result.reply?.text).toContain("anthropic/claude-opus-4-5");
    expect(result.reply?.text).toContain("Switch: /model <provider/model>");
    expect(result.reply?.text).toContain("All: /models anthropic all");
  });
  it("ignores page argument when all flag is present", async () => {
    // Use discord surface for text-based output tests
    const params = buildParams("/models anthropic 3 all", cfg, { Surface: "discord" });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Models (anthropic)");
    // "all" collapses the output to a single page, so page 3 is irrelevant.
    expect(result.reply?.text).toContain("page 1/1");
    expect(result.reply?.text).toContain("anthropic/claude-opus-4-5");
    expect(result.reply?.text).not.toContain("Page out of range");
  });
  it("errors on out-of-range pages", async () => {
    // Use discord surface for text-based output tests
    const params = buildParams("/models anthropic 4", cfg, { Surface: "discord" });
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Page out of range");
    expect(result.reply?.text).toContain("valid: 1-");
  });
  it("handles unknown providers", async () => {
    const params = buildParams("/models not-a-provider", cfg);
    const result = await handleCommands(params);
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Unknown provider");
    expect(result.reply?.text).toContain("Available providers");
  });
  it("lists configured models outside the curated catalog", async () => {
    // Providers/models referenced only by config must still appear.
    const customCfg = {
      commands: { text: true },
      agents: {
        defaults: {
          model: {
            primary: "localai/ultra-chat",
            fallbacks: ["anthropic/claude-opus-4-5"],
          },
          imageModel: "visionpro/studio-v1",
        },
      },
    } as unknown as OpenClawConfig;
    // Use discord surface for text-based output tests
    const providerList = await handleCommands(
      buildParams("/models", customCfg, { Surface: "discord" }),
    );
    expect(providerList.reply?.text).toContain("localai");
    expect(providerList.reply?.text).toContain("visionpro");
    const result = await handleCommands(
      buildParams("/models localai", customCfg, { Surface: "discord" }),
    );
    expect(result.shouldContinue).toBe(false);
    expect(result.reply?.text).toContain("Models (localai)");
    expect(result.reply?.text).toContain("localai/ultra-chat");
    expect(result.reply?.text).not.toContain("Unknown provider");
  });
});

View File

@@ -0,0 +1,38 @@
import { parseConfigValue } from "./config-value.js";
/** Result of parsing a set/unset slash-command argument string. */
export type SetUnsetParseResult =
  | { kind: "set"; path: string; value: unknown }
  | { kind: "unset"; path: string }
  | { kind: "error"; message: string };
/**
 * Parses the argument portion of a `set`/`unset` slash command.
 * `unset` expects a bare config path; `set` expects `path=value`, where the
 * value (everything after the first "=") is decoded via parseConfigValue.
 */
export function parseSetUnsetCommand(params: {
  slash: string;
  action: "set" | "unset";
  args: string;
}): SetUnsetParseResult {
  const args = params.args.trim();
  if (params.action === "unset") {
    // `unset` only needs a non-empty path.
    return args
      ? { kind: "unset", path: args }
      : { kind: "error", message: `Usage: ${params.slash} unset path` };
  }
  const usageMessage = `Usage: ${params.slash} set path=value`;
  if (!args) {
    return { kind: "error", message: usageMessage };
  }
  // The path is everything before the first "="; it must be non-empty.
  const eqIndex = args.indexOf("=");
  if (eqIndex <= 0) {
    return { kind: "error", message: usageMessage };
  }
  const path = args.slice(0, eqIndex).trim();
  if (!path) {
    return { kind: "error", message: usageMessage };
  }
  // Preserve the raw value verbatim (no trim) so string values keep spaces.
  const parsed = parseConfigValue(args.slice(eqIndex + 1));
  if (parsed.error) {
    return { kind: "error", message: parsed.error };
  }
  return { kind: "set", path, value: parsed.value };
}

View File

@@ -0,0 +1,46 @@
/** Low-level result of splitting a slash command into action and args. */
export type SlashCommandParseResult =
  | { kind: "no-match" }
  | { kind: "empty" }
  | { kind: "invalid" }
  | { kind: "parsed"; action: string; args: string };
/** Consumer-facing result: either an action/args pair or an error message. */
export type ParsedSlashCommand =
  | { ok: true; action: string; args: string }
  | { ok: false; message: string };
/**
 * Splits a raw message into a slash command's action token and argument tail.
 * Matching against `slash` is case-insensitive; the action is lowercased and
 * the args are trimmed.
 */
export function parseSlashCommandActionArgs(raw: string, slash: string): SlashCommandParseResult {
  const trimmed = raw.trim();
  // Case-insensitive prefix check against the slash command itself.
  if (!trimmed.toLowerCase().startsWith(slash.toLowerCase())) {
    return { kind: "no-match" };
  }
  const rest = trimmed.slice(slash.length).trim();
  if (rest.length === 0) {
    return { kind: "empty" };
  }
  // First whitespace-delimited token is the action; the remainder is args.
  const match = /^(\S+)(?:\s+([\s\S]+))?$/.exec(rest);
  if (match === null) {
    return { kind: "invalid" };
  }
  return {
    kind: "parsed",
    action: (match[1] ?? "").toLowerCase(),
    args: (match[2] ?? "").trim(),
  };
}
/**
 * Convenience wrapper: returns null when `raw` is not this slash command,
 * an error result for unparsable input, and falls back to `defaultAction`
 * (default "show") when the command is given with no arguments.
 */
export function parseSlashCommandOrNull(
  raw: string,
  slash: string,
  opts: { invalidMessage: string; defaultAction?: string },
): ParsedSlashCommand | null {
  const parsed = parseSlashCommandActionArgs(raw, slash);
  switch (parsed.kind) {
    case "no-match":
      return null;
    case "invalid":
      return { ok: false, message: opts.invalidMessage };
    case "empty":
      // Bare slash command maps to the default action with no args.
      return { ok: true, action: opts.defaultAction ?? "show", args: "" };
    default:
      return { ok: true, action: parsed.action, args: parsed.args };
  }
}

View File

@@ -1,7 +1,7 @@
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterAll, beforeAll, describe, expect, it, vi } from "vitest";
import { afterAll, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../../config/config.js";
import type { MsgContext } from "../templating.js";
import {
@@ -13,14 +13,96 @@ import { updateSessionStore } from "../../config/sessions.js";
import * as internalHooks from "../../hooks/internal-hooks.js";
import { clearPluginCommands, registerPluginCommand } from "../../plugins/commands.js";
import { resetBashChatCommandForTests } from "./bash-command.js";
import { handleCompactCommand } from "./commands-compact.js";
import { buildCommandsPaginationKeyboard } from "./commands-info.js";
import { extractMessageText } from "./commands-subagents.js";
import { buildCommandTestParams } from "./commands.test-harness.js";
import { parseConfigCommand } from "./config-commands.js";
import { parseDebugCommand } from "./debug-commands.js";
import { parseInlineDirectives } from "./directive-handling.js";
const readConfigFileSnapshotMock = vi.hoisted(() => vi.fn());
const validateConfigObjectWithPluginsMock = vi.hoisted(() => vi.fn());
const writeConfigFileMock = vi.hoisted(() => vi.fn());
vi.mock("../../config/config.js", async () => {
const actual =
await vi.importActual<typeof import("../../config/config.js")>("../../config/config.js");
return {
...actual,
readConfigFileSnapshot: readConfigFileSnapshotMock,
validateConfigObjectWithPlugins: validateConfigObjectWithPluginsMock,
writeConfigFile: writeConfigFileMock,
};
});
const readChannelAllowFromStoreMock = vi.hoisted(() => vi.fn());
const addChannelAllowFromStoreEntryMock = vi.hoisted(() => vi.fn());
const removeChannelAllowFromStoreEntryMock = vi.hoisted(() => vi.fn());
vi.mock("../../pairing/pairing-store.js", async () => {
const actual = await vi.importActual<typeof import("../../pairing/pairing-store.js")>(
"../../pairing/pairing-store.js",
);
return {
...actual,
readChannelAllowFromStore: readChannelAllowFromStoreMock,
addChannelAllowFromStoreEntry: addChannelAllowFromStoreEntryMock,
removeChannelAllowFromStoreEntry: removeChannelAllowFromStoreEntryMock,
};
});
vi.mock("../../channels/plugins/pairing.js", async () => {
const actual = await vi.importActual<typeof import("../../channels/plugins/pairing.js")>(
"../../channels/plugins/pairing.js",
);
return {
...actual,
listPairingChannels: () => ["telegram"],
};
});
vi.mock("../../agents/model-catalog.js", () => ({
loadModelCatalog: vi.fn(async () => [
{ provider: "anthropic", id: "claude-opus-4-5", name: "Claude Opus" },
{ provider: "anthropic", id: "claude-sonnet-4-5", name: "Claude Sonnet" },
{ provider: "openai", id: "gpt-4.1", name: "GPT-4.1" },
{ provider: "openai", id: "gpt-4.1-mini", name: "GPT-4.1 Mini" },
{ provider: "google", id: "gemini-2.0-flash", name: "Gemini Flash" },
]),
}));
vi.mock("../../agents/pi-embedded.js", () => {
const resolveEmbeddedSessionLane = (key: string) => {
const cleaned = key.trim() || "main";
return cleaned.startsWith("session:") ? cleaned : `session:${cleaned}`;
};
return {
abortEmbeddedPiRun: vi.fn(),
compactEmbeddedPiSession: vi.fn(),
isEmbeddedPiRunActive: vi.fn().mockReturnValue(false),
isEmbeddedPiRunStreaming: vi.fn().mockReturnValue(false),
queueEmbeddedPiMessage: vi.fn().mockReturnValue(false),
resolveEmbeddedSessionLane,
runEmbeddedPiAgent: vi.fn(),
waitForEmbeddedPiRunEnd: vi.fn().mockResolvedValue(undefined),
};
});
vi.mock("../../infra/system-events.js", () => ({
enqueueSystemEvent: vi.fn(),
}));
vi.mock("./session-updates.js", () => ({
incrementCompactionCount: vi.fn(),
}));
const callGatewayMock = vi.fn();
vi.mock("../../gateway/call.js", () => ({
callGateway: (opts: unknown) => callGatewayMock(opts),
}));
import { handleCommands } from "./commands.js";
import { buildCommandContext, handleCommands } from "./commands.js";
// Avoid expensive workspace scans during /context tests.
vi.mock("./commands-context-report.js", () => ({
@@ -104,6 +186,293 @@ describe("handleCommands gating", () => {
});
});
describe("/approve command", () => {
beforeEach(() => {
vi.clearAllMocks();
});
it("rejects invalid usage", async () => {
const cfg = {
commands: { text: true },
channels: { whatsapp: { allowFrom: ["*"] } },
} as OpenClawConfig;
const params = buildParams("/approve", cfg);
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Usage: /approve");
});
it("submits approval", async () => {
const cfg = {
commands: { text: true },
channels: { whatsapp: { allowFrom: ["*"] } },
} as OpenClawConfig;
const params = buildParams("/approve abc allow-once", cfg, { SenderId: "123" });
callGatewayMock.mockResolvedValueOnce({ ok: true });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Exec approval allow-once submitted");
expect(callGatewayMock).toHaveBeenCalledWith(
expect.objectContaining({
method: "exec.approval.resolve",
params: { id: "abc", decision: "allow-once" },
}),
);
});
it("rejects gateway clients without approvals scope", async () => {
const cfg = {
commands: { text: true },
} as OpenClawConfig;
const params = buildParams("/approve abc allow-once", cfg, {
Provider: "webchat",
Surface: "webchat",
GatewayClientScopes: ["operator.write"],
});
callGatewayMock.mockResolvedValueOnce({ ok: true });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("requires operator.approvals");
expect(callGatewayMock).not.toHaveBeenCalled();
});
it("allows gateway clients with approvals scope", async () => {
const cfg = {
commands: { text: true },
} as OpenClawConfig;
const params = buildParams("/approve abc allow-once", cfg, {
Provider: "webchat",
Surface: "webchat",
GatewayClientScopes: ["operator.approvals"],
});
callGatewayMock.mockResolvedValueOnce({ ok: true });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Exec approval allow-once submitted");
expect(callGatewayMock).toHaveBeenCalledWith(
expect.objectContaining({
method: "exec.approval.resolve",
params: { id: "abc", decision: "allow-once" },
}),
);
});
it("allows gateway clients with admin scope", async () => {
const cfg = {
commands: { text: true },
} as OpenClawConfig;
const params = buildParams("/approve abc allow-once", cfg, {
Provider: "webchat",
Surface: "webchat",
GatewayClientScopes: ["operator.admin"],
});
callGatewayMock.mockResolvedValueOnce({ ok: true });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Exec approval allow-once submitted");
expect(callGatewayMock).toHaveBeenCalledWith(
expect.objectContaining({
method: "exec.approval.resolve",
params: { id: "abc", decision: "allow-once" },
}),
);
});
});
describe("/compact command", () => {
beforeEach(() => {
vi.clearAllMocks();
});
it("returns null when command is not /compact", async () => {
const { compactEmbeddedPiSession } = await import("../../agents/pi-embedded.js");
const cfg = {
commands: { text: true },
channels: { whatsapp: { allowFrom: ["*"] } },
} as OpenClawConfig;
const params = buildParams("/status", cfg);
const result = await handleCompactCommand(
{
...params,
},
true,
);
expect(result).toBeNull();
expect(vi.mocked(compactEmbeddedPiSession)).not.toHaveBeenCalled();
});
it("rejects unauthorized /compact commands", async () => {
const { compactEmbeddedPiSession } = await import("../../agents/pi-embedded.js");
const cfg = {
commands: { text: true },
channels: { whatsapp: { allowFrom: ["*"] } },
} as OpenClawConfig;
const params = buildParams("/compact", cfg);
const result = await handleCompactCommand(
{
...params,
command: {
...params.command,
isAuthorizedSender: false,
senderId: "unauthorized",
},
},
true,
);
expect(result).toEqual({ shouldContinue: false });
expect(vi.mocked(compactEmbeddedPiSession)).not.toHaveBeenCalled();
});
it("routes manual compaction with explicit trigger and context metadata", async () => {
const { compactEmbeddedPiSession } = await import("../../agents/pi-embedded.js");
const cfg = {
commands: { text: true },
channels: { whatsapp: { allowFrom: ["*"] } },
session: { store: "/tmp/openclaw-session-store.json" },
} as OpenClawConfig;
const params = buildParams("/compact: focus on decisions", cfg, {
From: "+15550001",
To: "+15550002",
});
vi.mocked(compactEmbeddedPiSession).mockResolvedValueOnce({
ok: true,
compacted: false,
});
const result = await handleCompactCommand(
{
...params,
sessionEntry: {
sessionId: "session-1",
groupId: "group-1",
groupChannel: "#general",
space: "workspace-1",
spawnedBy: "agent:main:parent",
totalTokens: 12345,
},
},
true,
);
expect(result?.shouldContinue).toBe(false);
expect(vi.mocked(compactEmbeddedPiSession)).toHaveBeenCalledOnce();
expect(vi.mocked(compactEmbeddedPiSession)).toHaveBeenCalledWith(
expect.objectContaining({
sessionId: "session-1",
sessionKey: "agent:main:main",
trigger: "manual",
customInstructions: "focus on decisions",
messageChannel: "whatsapp",
groupId: "group-1",
groupChannel: "#general",
groupSpace: "workspace-1",
spawnedBy: "agent:main:parent",
}),
);
});
});
// Inline-keyboard layout for /commands pagination: prev / position / next.
describe("buildCommandsPaginationKeyboard", () => {
it("adds agent id to callback data when provided", () => {
// Page 2 of 3, scoped to "agent-main": every callback payload carries the id.
const keyboard = buildCommandsPaginationKeyboard(2, 3, "agent-main");
expect(keyboard[0]).toEqual([
{ text: "◀ Prev", callback_data: "commands_page_1:agent-main" },
{ text: "2/3", callback_data: "commands_page_noop:agent-main" },
{ text: "Next ▶", callback_data: "commands_page_3:agent-main" },
]);
});
});
// Parsing of the /config slash command into { action, path?, value? } shapes.
describe("parseConfigCommand", () => {
it("parses show/unset", () => {
// Bare /config defaults to "show" with no path.
expect(parseConfigCommand("/config")).toEqual({ action: "show" });
expect(parseConfigCommand("/config show")).toEqual({
action: "show",
path: undefined,
});
expect(parseConfigCommand("/config show foo.bar")).toEqual({
action: "show",
path: "foo.bar",
});
// "get" is an alias for "show".
expect(parseConfigCommand("/config get foo.bar")).toEqual({
action: "show",
path: "foo.bar",
});
expect(parseConfigCommand("/config unset foo.bar")).toEqual({
action: "unset",
path: "foo.bar",
});
});
it("parses set with JSON", () => {
// The value side of path=value is parsed as JSON when possible.
const cmd = parseConfigCommand('/config set foo={"a":1}');
expect(cmd).toEqual({ action: "set", path: "foo", value: { a: 1 } });
});
});
// Parsing of the /debug slash command; mirrors /config plus a "reset" action.
describe("parseDebugCommand", () => {
it("parses show/reset", () => {
// Bare /debug defaults to "show".
expect(parseDebugCommand("/debug")).toEqual({ action: "show" });
expect(parseDebugCommand("/debug show")).toEqual({ action: "show" });
expect(parseDebugCommand("/debug reset")).toEqual({ action: "reset" });
});
it("parses set with JSON", () => {
// path=value with a JSON-parsed right-hand side.
const cmd = parseDebugCommand('/debug set foo={"a":1}');
expect(cmd).toEqual({ action: "set", path: "foo", value: { a: 1 } });
});
it("parses unset", () => {
const cmd = parseDebugCommand("/debug unset foo.bar");
expect(cmd).toEqual({ action: "unset", path: "foo.bar" });
});
});
// Tool-call marker sanitization is role-dependent: user text is kept verbatim,
// assistant text has "[Tool Call: ...]" markers stripped.
describe("extractMessageText", () => {
it("preserves user text that looks like tool call markers", () => {
const message = {
role: "user",
content: "Here [Tool Call: foo (ID: 1)] ok",
};
const result = extractMessageText(message);
// User-authored marker survives untouched.
expect(result?.text).toContain("[Tool Call: foo (ID: 1)]");
});
it("sanitizes assistant tool call markers", () => {
const message = {
role: "assistant",
content: "Here [Tool Call: foo (ID: 1)] ok",
};
const result = extractMessageText(message);
// Assistant-authored marker is removed from the extracted text.
expect(result?.text).toBe("Here ok");
});
});
// Per-channel configWrites flag must block /config set even when the
// config command itself is enabled.
describe("handleCommands /config configWrites gating", () => {
it("blocks /config set when channel config writes are disabled", async () => {
const cfg = {
commands: { config: true, text: true },
channels: { whatsapp: { allowFrom: ["*"], configWrites: false } },
} as OpenClawConfig;
const params = buildParams('/config set messages.ackReaction=":)"', cfg);
const result = await handleCommands(params);
// Command is consumed (no fall-through) and the user sees the refusal.
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Config writes are disabled");
});
});
describe("handleCommands bash alias", () => {
it("routes !poll through the /bash handler", async () => {
resetBashChatCommandForTests();
@@ -130,6 +499,289 @@ describe("handleCommands bash alias", () => {
});
});
/**
 * Builds the handleCommands() parameter bag used by the policy-oriented tests.
 * Defaults to an authorized Telegram DM context; ctxOverrides lets individual
 * tests switch provider/surface or other message-context fields.
 */
function buildPolicyParams(
  commandBody: string,
  cfg: OpenClawConfig,
  ctxOverrides?: Partial<MsgContext>,
) {
  const normalizedTrigger = commandBody.trim().toLowerCase();
  const ctx = {
    Body: commandBody,
    CommandBody: commandBody,
    CommandSource: "text",
    CommandAuthorized: true,
    Provider: "telegram",
    Surface: "telegram",
    ...ctxOverrides,
  } as MsgContext;
  return {
    ctx,
    cfg,
    command: buildCommandContext({
      ctx,
      cfg,
      isGroup: false,
      triggerBodyNormalized: normalizedTrigger,
      commandAuthorized: true,
    }),
    directives: parseInlineDirectives(commandBody),
    elevated: { enabled: true, allowed: true, failures: [] },
    sessionKey: "agent:main:main",
    workspaceDir: "/tmp",
    defaultGroupActivation: () => "mention",
    resolvedVerboseLevel: "off" as const,
    resolvedReasoningLevel: "off" as const,
    resolveDefaultThinkingLevel: async () => undefined,
    provider: "telegram",
    model: "test-model",
    contextTokens: 0,
    isGroup: false,
  };
}
describe("handleCommands /allowlist", () => {
beforeEach(() => {
vi.clearAllMocks();
});
it("lists config + store allowFrom entries", async () => {
readChannelAllowFromStoreMock.mockResolvedValueOnce(["456"]);
const cfg = {
commands: { text: true },
channels: { telegram: { allowFrom: ["123", "@Alice"] } },
} as OpenClawConfig;
const params = buildPolicyParams("/allowlist list dm", cfg);
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Channel: telegram");
expect(result.reply?.text).toContain("DM allowFrom (config): 123, @alice");
expect(result.reply?.text).toContain("Paired allowFrom (store): 456");
});
it("adds entries to config and pairing store", async () => {
readConfigFileSnapshotMock.mockResolvedValueOnce({
valid: true,
parsed: {
channels: { telegram: { allowFrom: ["123"] } },
},
});
validateConfigObjectWithPluginsMock.mockImplementation((config: unknown) => ({
ok: true,
config,
}));
addChannelAllowFromStoreEntryMock.mockResolvedValueOnce({
changed: true,
allowFrom: ["123", "789"],
});
const cfg = {
commands: { text: true, config: true },
channels: { telegram: { allowFrom: ["123"] } },
} as OpenClawConfig;
const params = buildPolicyParams("/allowlist add dm 789", cfg);
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(writeConfigFileMock).toHaveBeenCalledWith(
expect.objectContaining({
channels: { telegram: { allowFrom: ["123", "789"] } },
}),
);
expect(addChannelAllowFromStoreEntryMock).toHaveBeenCalledWith({
channel: "telegram",
entry: "789",
});
expect(result.reply?.text).toContain("DM allowlist added");
});
it("removes Slack DM allowlist entries from canonical allowFrom and deletes legacy dm.allowFrom", async () => {
readConfigFileSnapshotMock.mockResolvedValueOnce({
valid: true,
parsed: {
channels: {
slack: {
allowFrom: ["U111", "U222"],
dm: { allowFrom: ["U111", "U222"] },
configWrites: true,
},
},
},
});
validateConfigObjectWithPluginsMock.mockImplementation((config: unknown) => ({
ok: true,
config,
}));
const cfg = {
commands: { text: true, config: true },
channels: {
slack: {
allowFrom: ["U111", "U222"],
dm: { allowFrom: ["U111", "U222"] },
configWrites: true,
},
},
} as OpenClawConfig;
const params = buildPolicyParams("/allowlist remove dm U111", cfg, {
Provider: "slack",
Surface: "slack",
});
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(writeConfigFileMock).toHaveBeenCalledTimes(1);
const written = writeConfigFileMock.mock.calls[0]?.[0] as OpenClawConfig;
expect(written.channels?.slack?.allowFrom).toEqual(["U222"]);
expect(written.channels?.slack?.dm?.allowFrom).toBeUndefined();
expect(result.reply?.text).toContain("channels.slack.allowFrom");
});
it("removes Discord DM allowlist entries from canonical allowFrom and deletes legacy dm.allowFrom", async () => {
readConfigFileSnapshotMock.mockResolvedValueOnce({
valid: true,
parsed: {
channels: {
discord: {
allowFrom: ["111", "222"],
dm: { allowFrom: ["111", "222"] },
configWrites: true,
},
},
},
});
validateConfigObjectWithPluginsMock.mockImplementation((config: unknown) => ({
ok: true,
config,
}));
const cfg = {
commands: { text: true, config: true },
channels: {
discord: {
allowFrom: ["111", "222"],
dm: { allowFrom: ["111", "222"] },
configWrites: true,
},
},
} as OpenClawConfig;
const params = buildPolicyParams("/allowlist remove dm 111", cfg, {
Provider: "discord",
Surface: "discord",
});
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(writeConfigFileMock).toHaveBeenCalledTimes(1);
const written = writeConfigFileMock.mock.calls[0]?.[0] as OpenClawConfig;
expect(written.channels?.discord?.allowFrom).toEqual(["222"]);
expect(written.channels?.discord?.dm?.allowFrom).toBeUndefined();
expect(result.reply?.text).toContain("channels.discord.allowFrom");
});
});
describe("/models command", () => {
const cfg = {
commands: { text: true },
agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } },
} as unknown as OpenClawConfig;
it.each(["discord", "whatsapp"])("lists providers on %s (text)", async (surface) => {
const params = buildPolicyParams("/models", cfg, { Provider: surface, Surface: surface });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Providers:");
expect(result.reply?.text).toContain("anthropic");
expect(result.reply?.text).toContain("Use: /models <provider>");
});
it("lists providers on telegram (buttons)", async () => {
const params = buildPolicyParams("/models", cfg, { Provider: "telegram", Surface: "telegram" });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toBe("Select a provider:");
const buttons = (result.reply?.channelData as { telegram?: { buttons?: unknown[][] } })
?.telegram?.buttons;
expect(buttons).toBeDefined();
expect(buttons?.length).toBeGreaterThan(0);
});
it("lists provider models with pagination hints", async () => {
// Use discord surface for text-based output tests
const params = buildPolicyParams("/models anthropic", cfg, { Surface: "discord" });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Models (anthropic)");
expect(result.reply?.text).toContain("page 1/");
expect(result.reply?.text).toContain("anthropic/claude-opus-4-5");
expect(result.reply?.text).toContain("Switch: /model <provider/model>");
expect(result.reply?.text).toContain("All: /models anthropic all");
});
it("ignores page argument when all flag is present", async () => {
// Use discord surface for text-based output tests
const params = buildPolicyParams("/models anthropic 3 all", cfg, { Surface: "discord" });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Models (anthropic)");
expect(result.reply?.text).toContain("page 1/1");
expect(result.reply?.text).toContain("anthropic/claude-opus-4-5");
expect(result.reply?.text).not.toContain("Page out of range");
});
it("errors on out-of-range pages", async () => {
// Use discord surface for text-based output tests
const params = buildPolicyParams("/models anthropic 4", cfg, { Surface: "discord" });
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Page out of range");
expect(result.reply?.text).toContain("valid: 1-");
});
it("handles unknown providers", async () => {
const params = buildPolicyParams("/models not-a-provider", cfg);
const result = await handleCommands(params);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Unknown provider");
expect(result.reply?.text).toContain("Available providers");
});
it("lists configured models outside the curated catalog", async () => {
const customCfg = {
commands: { text: true },
agents: {
defaults: {
model: {
primary: "localai/ultra-chat",
fallbacks: ["anthropic/claude-opus-4-5"],
},
imageModel: "visionpro/studio-v1",
},
},
} as unknown as OpenClawConfig;
// Use discord surface for text-based output tests
const providerList = await handleCommands(
buildPolicyParams("/models", customCfg, { Surface: "discord" }),
);
expect(providerList.reply?.text).toContain("localai");
expect(providerList.reply?.text).toContain("visionpro");
const result = await handleCommands(
buildPolicyParams("/models localai", customCfg, { Surface: "discord" }),
);
expect(result.shouldContinue).toBe(false);
expect(result.reply?.text).toContain("Models (localai)");
expect(result.reply?.text).toContain("localai/ultra-chat");
expect(result.reply?.text).not.toContain("Unknown provider");
});
});
describe("handleCommands plugin commands", () => {
it("dispatches registered plugin commands", async () => {
clearPluginCommands();

View File

@@ -1,4 +1,5 @@
import { parseConfigValue } from "./config-value.js";
import { parseSetUnsetCommand } from "./commands-setunset.js";
import { parseSlashCommandOrNull } from "./commands-slash-parse.js";
export type ConfigCommand =
| { action: "show"; path?: string }
@@ -7,60 +8,31 @@ export type ConfigCommand =
| { action: "error"; message: string };
export function parseConfigCommand(raw: string): ConfigCommand | null {
const trimmed = raw.trim();
if (!trimmed.toLowerCase().startsWith("/config")) {
const parsed = parseSlashCommandOrNull(raw, "/config", {
invalidMessage: "Invalid /config syntax.",
});
if (!parsed) {
return null;
}
const rest = trimmed.slice("/config".length).trim();
if (!rest) {
return { action: "show" };
if (!parsed.ok) {
return { action: "error", message: parsed.message };
}
const match = rest.match(/^(\S+)(?:\s+([\s\S]+))?$/);
if (!match) {
return { action: "error", message: "Invalid /config syntax." };
}
const action = match[1].toLowerCase();
const args = (match[2] ?? "").trim();
const { action, args } = parsed;
switch (action) {
case "show":
return { action: "show", path: args || undefined };
case "get":
return { action: "show", path: args || undefined };
case "unset": {
if (!args) {
return { action: "error", message: "Usage: /config unset path" };
}
return { action: "unset", path: args };
}
case "unset":
case "set": {
if (!args) {
return {
action: "error",
message: "Usage: /config set path=value",
};
const parsed = parseSetUnsetCommand({ slash: "/config", action, args });
if (parsed.kind === "error") {
return { action: "error", message: parsed.message };
}
const eqIndex = args.indexOf("=");
if (eqIndex <= 0) {
return {
action: "error",
message: "Usage: /config set path=value",
};
}
const path = args.slice(0, eqIndex).trim();
const rawValue = args.slice(eqIndex + 1);
if (!path) {
return {
action: "error",
message: "Usage: /config set path=value",
};
}
const parsed = parseConfigValue(rawValue);
if (parsed.error) {
return { action: "error", message: parsed.error };
}
return { action: "set", path, value: parsed.value };
return parsed.kind === "set"
? { action: "set", path: parsed.path, value: parsed.value }
: { action: "unset", path: parsed.path };
}
default:
return {

View File

@@ -1,4 +1,5 @@
import { parseConfigValue } from "./config-value.js";
import { parseSetUnsetCommand } from "./commands-setunset.js";
import { parseSlashCommandOrNull } from "./commands-slash-parse.js";
export type DebugCommand =
| { action: "show" }
@@ -8,60 +9,31 @@ export type DebugCommand =
| { action: "error"; message: string };
export function parseDebugCommand(raw: string): DebugCommand | null {
const trimmed = raw.trim();
if (!trimmed.toLowerCase().startsWith("/debug")) {
const parsed = parseSlashCommandOrNull(raw, "/debug", {
invalidMessage: "Invalid /debug syntax.",
});
if (!parsed) {
return null;
}
const rest = trimmed.slice("/debug".length).trim();
if (!rest) {
return { action: "show" };
if (!parsed.ok) {
return { action: "error", message: parsed.message };
}
const match = rest.match(/^(\S+)(?:\s+([\s\S]+))?$/);
if (!match) {
return { action: "error", message: "Invalid /debug syntax." };
}
const action = match[1].toLowerCase();
const args = (match[2] ?? "").trim();
const { action, args } = parsed;
switch (action) {
case "show":
return { action: "show" };
case "reset":
return { action: "reset" };
case "unset": {
if (!args) {
return { action: "error", message: "Usage: /debug unset path" };
}
return { action: "unset", path: args };
}
case "unset":
case "set": {
if (!args) {
return {
action: "error",
message: "Usage: /debug set path=value",
};
const parsed = parseSetUnsetCommand({ slash: "/debug", action, args });
if (parsed.kind === "error") {
return { action: "error", message: parsed.message };
}
const eqIndex = args.indexOf("=");
if (eqIndex <= 0) {
return {
action: "error",
message: "Usage: /debug set path=value",
};
}
const path = args.slice(0, eqIndex).trim();
const rawValue = args.slice(eqIndex + 1);
if (!path) {
return {
action: "error",
message: "Usage: /debug set path=value",
};
}
const parsed = parseConfigValue(rawValue);
if (parsed.error) {
return { action: "error", message: parsed.error };
}
return { action: "set", path, value: parsed.value };
return parsed.kind === "set"
? { action: "set", path: parsed.path, value: parsed.value }
: { action: "unset", path: parsed.path };
}
default:
return {

View File

@@ -1,7 +1,7 @@
import type { ReplyPayload } from "../types.js";
import type { ApplyInlineDirectivesFastLaneParams } from "./directive-handling.params.js";
import type { ElevatedLevel, ReasoningLevel, ThinkLevel, VerboseLevel } from "./directives.js";
import { handleDirectiveOnly } from "./directive-handling.impl.js";
import { resolveCurrentDirectiveLevels } from "./directive-handling.levels.js";
import { isDirectiveOnly } from "./directive-handling.parse.js";
export async function applyInlineDirectivesFastLane(
@@ -48,19 +48,12 @@ export async function applyInlineDirectivesFastLane(
}
const agentCfg = params.agentCfg;
const resolvedDefaultThinkLevel =
(sessionEntry?.thinkingLevel as ThinkLevel | undefined) ??
(agentCfg?.thinkingDefault as ThinkLevel | undefined) ??
(await modelState.resolveDefaultThinkingLevel());
const currentThinkLevel = resolvedDefaultThinkLevel;
const currentVerboseLevel =
(sessionEntry?.verboseLevel as VerboseLevel | undefined) ??
(agentCfg?.verboseDefault as VerboseLevel | undefined);
const currentReasoningLevel =
(sessionEntry?.reasoningLevel as ReasoningLevel | undefined) ?? "off";
const currentElevatedLevel =
(sessionEntry?.elevatedLevel as ElevatedLevel | undefined) ??
(agentCfg?.elevatedDefault as ElevatedLevel | undefined);
const { currentThinkLevel, currentVerboseLevel, currentReasoningLevel, currentElevatedLevel } =
await resolveCurrentDirectiveLevels({
sessionEntry,
agentCfg,
resolveDefaultThinkingLevel: () => modelState.resolveDefaultThinkingLevel(),
});
const directiveAck = await handleDirectiveOnly({
cfg,

View File

@@ -0,0 +1,41 @@
import type { ElevatedLevel, ReasoningLevel, ThinkLevel, VerboseLevel } from "../thinking.js";
/**
 * Resolves the directive levels (think / verbose / reasoning / elevated)
 * currently in effect for a session.
 *
 * Precedence per level: session-stored value, then agent-config default.
 * The think level additionally falls back to an async model-derived default,
 * awaited only when neither session nor agent config supplies one.
 * The reasoning level bottoms out at "off" rather than undefined.
 */
export async function resolveCurrentDirectiveLevels(params: {
  sessionEntry?: {
    thinkingLevel?: unknown;
    verboseLevel?: unknown;
    reasoningLevel?: unknown;
    elevatedLevel?: unknown;
  };
  agentCfg?: {
    thinkingDefault?: unknown;
    verboseDefault?: unknown;
    elevatedDefault?: unknown;
  };
  resolveDefaultThinkingLevel: () => Promise<ThinkLevel | undefined>;
}): Promise<{
  currentThinkLevel: ThinkLevel | undefined;
  currentVerboseLevel: VerboseLevel | undefined;
  currentReasoningLevel: ReasoningLevel;
  currentElevatedLevel: ElevatedLevel | undefined;
}> {
  const { sessionEntry, agentCfg } = params;
  // `??` short-circuits, so the async fallback runs only when needed.
  const currentThinkLevel =
    (sessionEntry?.thinkingLevel as ThinkLevel | undefined) ??
    (agentCfg?.thinkingDefault as ThinkLevel | undefined) ??
    (await params.resolveDefaultThinkingLevel());
  return {
    currentThinkLevel,
    currentVerboseLevel:
      (sessionEntry?.verboseLevel as VerboseLevel | undefined) ??
      (agentCfg?.verboseDefault as VerboseLevel | undefined),
    currentReasoningLevel:
      (sessionEntry?.reasoningLevel as ReasoningLevel | undefined) ?? "off",
    currentElevatedLevel:
      (sessionEntry?.elevatedLevel as ElevatedLevel | undefined) ??
      (agentCfg?.elevatedDefault as ElevatedLevel | undefined),
  };
}

View File

@@ -1,280 +0,0 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import { parseAudioTag } from "./audio-tags.js";
import { createBlockReplyCoalescer } from "./block-reply-coalescer.js";
import { createReplyReferencePlanner } from "./reply-reference.js";
import { createStreamingDirectiveAccumulator } from "./streaming-directives.js";
describe("parseAudioTag", () => {
it("detects audio_as_voice and strips the tag", () => {
const result = parseAudioTag("Hello [[audio_as_voice]] world");
expect(result.audioAsVoice).toBe(true);
expect(result.hadTag).toBe(true);
expect(result.text).toBe("Hello world");
});
it("returns empty output for missing text", () => {
const result = parseAudioTag(undefined);
expect(result.audioAsVoice).toBe(false);
expect(result.hadTag).toBe(false);
expect(result.text).toBe("");
});
it("removes tag-only messages", () => {
const result = parseAudioTag("[[audio_as_voice]]");
expect(result.audioAsVoice).toBe(true);
expect(result.text).toBe("");
});
});
describe("block reply coalescer", () => {
afterEach(() => {
vi.useRealTimers();
});
it("coalesces chunks within the idle window", async () => {
vi.useFakeTimers();
const flushes: string[] = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 1, maxChars: 200, idleMs: 100, joiner: " " },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push(payload.text ?? "");
},
});
coalescer.enqueue({ text: "Hello" });
coalescer.enqueue({ text: "world" });
await vi.advanceTimersByTimeAsync(100);
expect(flushes).toEqual(["Hello world"]);
coalescer.stop();
});
it("waits until minChars before idle flush", async () => {
vi.useFakeTimers();
const flushes: string[] = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 10, maxChars: 200, idleMs: 50, joiner: " " },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push(payload.text ?? "");
},
});
coalescer.enqueue({ text: "short" });
await vi.advanceTimersByTimeAsync(50);
expect(flushes).toEqual([]);
coalescer.enqueue({ text: "message" });
await vi.advanceTimersByTimeAsync(50);
expect(flushes).toEqual(["short message"]);
coalescer.stop();
});
it("flushes each enqueued payload separately when flushOnEnqueue is set", async () => {
const flushes: string[] = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 1, maxChars: 200, idleMs: 100, joiner: "\n\n", flushOnEnqueue: true },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push(payload.text ?? "");
},
});
coalescer.enqueue({ text: "First paragraph" });
coalescer.enqueue({ text: "Second paragraph" });
coalescer.enqueue({ text: "Third paragraph" });
await Promise.resolve();
expect(flushes).toEqual(["First paragraph", "Second paragraph", "Third paragraph"]);
coalescer.stop();
});
it("still accumulates when flushOnEnqueue is not set (default)", async () => {
vi.useFakeTimers();
const flushes: string[] = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 1, maxChars: 2000, idleMs: 100, joiner: "\n\n" },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push(payload.text ?? "");
},
});
coalescer.enqueue({ text: "First paragraph" });
coalescer.enqueue({ text: "Second paragraph" });
await vi.advanceTimersByTimeAsync(100);
expect(flushes).toEqual(["First paragraph\n\nSecond paragraph"]);
coalescer.stop();
});
it("flushes short payloads immediately when flushOnEnqueue is set", async () => {
const flushes: string[] = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 10, maxChars: 200, idleMs: 50, joiner: "\n\n", flushOnEnqueue: true },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push(payload.text ?? "");
},
});
coalescer.enqueue({ text: "Hi" });
await Promise.resolve();
expect(flushes).toEqual(["Hi"]);
coalescer.stop();
});
it("resets char budget per paragraph with flushOnEnqueue", async () => {
const flushes: string[] = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 1, maxChars: 30, idleMs: 100, joiner: "\n\n", flushOnEnqueue: true },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push(payload.text ?? "");
},
});
// Each 20-char payload fits within maxChars=30 individually
coalescer.enqueue({ text: "12345678901234567890" });
coalescer.enqueue({ text: "abcdefghijklmnopqrst" });
await Promise.resolve();
// Without flushOnEnqueue, these would be joined to 40+ chars and trigger maxChars split.
// With flushOnEnqueue, each is sent independently within budget.
expect(flushes).toEqual(["12345678901234567890", "abcdefghijklmnopqrst"]);
coalescer.stop();
});
it("flushes buffered text before media payloads", () => {
const flushes: Array<{ text?: string; mediaUrls?: string[] }> = [];
const coalescer = createBlockReplyCoalescer({
config: { minChars: 1, maxChars: 200, idleMs: 0, joiner: " " },
shouldAbort: () => false,
onFlush: (payload) => {
flushes.push({
text: payload.text,
mediaUrls: payload.mediaUrls,
});
},
});
coalescer.enqueue({ text: "Hello" });
coalescer.enqueue({ text: "world" });
coalescer.enqueue({ mediaUrls: ["https://example.com/a.png"] });
void coalescer.flush({ force: true });
expect(flushes[0].text).toBe("Hello world");
expect(flushes[1].mediaUrls).toEqual(["https://example.com/a.png"]);
coalescer.stop();
});
});
describe("createReplyReferencePlanner", () => {
it("disables references when mode is off", () => {
const planner = createReplyReferencePlanner({
replyToMode: "off",
startId: "parent",
});
expect(planner.use()).toBeUndefined();
expect(planner.hasReplied()).toBe(false);
});
it("uses startId once when mode is first", () => {
const planner = createReplyReferencePlanner({
replyToMode: "first",
startId: "parent",
});
expect(planner.use()).toBe("parent");
expect(planner.hasReplied()).toBe(true);
planner.markSent();
expect(planner.use()).toBeUndefined();
});
it("returns startId for every call when mode is all", () => {
const planner = createReplyReferencePlanner({
replyToMode: "all",
startId: "parent",
});
expect(planner.use()).toBe("parent");
expect(planner.use()).toBe("parent");
});
it("respects replyToMode off even with existingId", () => {
const planner = createReplyReferencePlanner({
replyToMode: "off",
existingId: "thread-1",
startId: "parent",
});
expect(planner.use()).toBeUndefined();
expect(planner.hasReplied()).toBe(false);
});
it("uses existingId once when mode is first", () => {
const planner = createReplyReferencePlanner({
replyToMode: "first",
existingId: "thread-1",
startId: "parent",
});
expect(planner.use()).toBe("thread-1");
expect(planner.hasReplied()).toBe(true);
expect(planner.use()).toBeUndefined();
});
it("uses existingId on every call when mode is all", () => {
const planner = createReplyReferencePlanner({
replyToMode: "all",
existingId: "thread-1",
startId: "parent",
});
expect(planner.use()).toBe("thread-1");
expect(planner.use()).toBe("thread-1");
});
it("honors allowReference=false", () => {
const planner = createReplyReferencePlanner({
replyToMode: "all",
startId: "parent",
allowReference: false,
});
expect(planner.use()).toBeUndefined();
expect(planner.hasReplied()).toBe(false);
planner.markSent();
expect(planner.hasReplied()).toBe(true);
});
});
describe("createStreamingDirectiveAccumulator", () => {
it("stashes reply_to_current until a renderable chunk arrives", () => {
const accumulator = createStreamingDirectiveAccumulator();
expect(accumulator.consume("[[reply_to_current]]")).toBeNull();
const result = accumulator.consume("Hello");
expect(result?.text).toBe("Hello");
expect(result?.replyToCurrent).toBe(true);
expect(result?.replyToTag).toBe(true);
});
it("handles reply tags split across chunks", () => {
const accumulator = createStreamingDirectiveAccumulator();
expect(accumulator.consume("[[reply_to_")).toBeNull();
const result = accumulator.consume("current]] Yo");
expect(result?.text).toBe("Yo");
expect(result?.replyToCurrent).toBe(true);
expect(result?.replyToTag).toBe(true);
});
it("propagates explicit reply ids across chunks", () => {
const accumulator = createStreamingDirectiveAccumulator();
expect(accumulator.consume("[[reply_to: abc-123]]")).toBeNull();
const result = accumulator.consume("Hi");
expect(result?.text).toBe("Hi");
expect(result?.replyToId).toBe("abc-123");
expect(result?.replyToTag).toBe(true);
});
});

Some files were not shown because too many files have changed in this diff Show More