first commit
47
content/experiments/onboarding-config-protocol.md
Normal file
@@ -0,0 +1,47 @@
---
read_when: Changing onboarding wizard steps or config schema endpoints
summary: RPC protocol notes for the onboarding wizard and config schema
title: Onboarding and Config Protocol
x-i18n:
  generated_at: "2026-02-03T07:47:10Z"
  model: claude-opus-4-5
  provider: pi
  source_hash: 55163b3ee029c02476800cb616a054e5adfe97dae5bb72f2763dce0079851e06
  source_path: experiments/onboarding-config-protocol.md
  workflow: 15
---

# Onboarding + Config Protocol

Purpose: a shared onboarding + config surface across the CLI, the macOS app, and the Web UI.

## Components

- Wizard engine (shared session + prompts + onboarding state).
- CLI onboarding uses the same wizard flow as the UI clients.
- Gateway RPC exposes the wizard + config schema endpoints.
- macOS onboarding uses the wizard step model.
- Web UI renders config forms from JSON Schema + UI hints.

## Gateway RPC

- `wizard.start` params: `{ mode?: "local"|"remote", workspace?: string }`
- `wizard.next` params: `{ sessionId, answer?: { stepId, value? } }`
- `wizard.cancel` params: `{ sessionId }`
- `wizard.status` params: `{ sessionId }`
- `config.schema` params: `{}`

Responses (shape)

- Wizard: `{ sessionId, done, step?, status?, error? }`
- Config schema: `{ schema, uiHints, version, generatedAt }`
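
The parameter and response shapes above can be written down as types. This is a minimal sketch, not the project's actual definitions: the interface names, the `nextParams` helper, and every field type the document does not state (for example `sessionId: string`) are assumptions.

```typescript
// Hypothetical type names; field names are copied from the lists above,
// field types that the document does not state are assumptions.
type WizardMode = "local" | "remote";

interface WizardStartParams {
  mode?: WizardMode;
  workspace?: string;
}

interface WizardNextParams {
  sessionId: string;
  answer?: { stepId: string; value?: unknown };
}

interface WizardResponse {
  sessionId: string;
  done: boolean;
  step?: unknown; // the step model is not specified in this document
  status?: string;
  error?: string;
}

// Builds the params for the next `wizard.next` call,
// or returns null once the wizard reports `done`.
function nextParams(
  res: WizardResponse,
  stepId: string,
  value: unknown,
): WizardNextParams | null {
  if (res.done) return null;
  return { sessionId: res.sessionId, answer: { stepId, value } };
}
```

A client loop would call `wizard.start`, then keep sending `wizard.next` with the value returned by `nextParams` until it gets `null`.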

## UI hints

- `uiHints` is keyed by path; the metadata is optional (label/help/group/order/advanced/sensitive/placeholder).
- Sensitive fields render as password inputs; there is no redaction layer.
- Unsupported schema nodes fall back to the raw JSON editor.
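
As a rough illustration of the rules above, a form renderer could pick a widget per config path like this. The `Widget` names, the `pickWidget` function, the dotted path format, and the set of schema types treated as "supported" are all assumptions made for the sketch.

```typescript
// Metadata fields are the ones listed above; their types are assumed.
interface UiHint {
  label?: string;
  help?: string;
  group?: string;
  order?: number;
  advanced?: boolean;
  sensitive?: boolean;
  placeholder?: string;
}

// Hypothetical widget names.
type Widget = "password" | "text" | "number" | "checkbox" | "raw-json";

function pickWidget(
  path: string,
  schemaType: string | undefined,
  uiHints: Record<string, UiHint>,
): Widget {
  // Sensitive fields render as password inputs.
  if (uiHints[path]?.sensitive) return "password";
  switch (schemaType) {
    case "string":
      return "text";
    case "number":
    case "integer":
      return "number";
    case "boolean":
      return "checkbox";
    default:
      // Anything this renderer does not understand falls back to raw JSON.
      return "raw-json";
  }
}
```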

## Notes

- This document is the single place for tracking the onboarding/config protocol refactor.
1304
content/experiments/plans/acp-thread-bound-agents.md
Normal file
File diff suppressed because it is too large
158
content/experiments/plans/acp-unified-streaming-refactor.md
Normal file
@@ -0,0 +1,158 @@
---
summary: "Holy grail refactor plan for one unified runtime streaming pipeline across main, subagent, and ACP"
owner: "onutc"
status: "draft"
last_updated: "2026-02-25"
title: "Unified Runtime Streaming Refactor Plan"
---

# Unified Runtime Streaming Refactor Plan

## Objective

Deliver one shared streaming pipeline for `main`, `subagent`, and `acp` so all runtimes get identical coalescing, chunking, delivery ordering, and crash recovery behavior.

## Why this exists

- Current behavior is split across multiple runtime-specific shaping paths.
- Formatting/coalescing bugs can be fixed in one path but remain in others.
- Delivery consistency, duplicate suppression, and recovery semantics are harder to reason about.

## Target architecture

Single pipeline, runtime-specific adapters:

1. Runtime adapters emit canonical events only.
2. Shared stream assembler coalesces and finalizes text/tool/status events.
3. Shared channel projector applies channel-specific chunking/formatting once.
4. Shared delivery ledger enforces idempotent send/replay semantics.
5. Outbound channel adapter executes sends and records delivery checkpoints.

Canonical event contract:

- `turn_started`
- `text_delta`
- `block_final`
- `tool_started`
- `tool_finished`
- `status`
- `turn_completed`
- `turn_failed`
- `turn_cancelled`
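
One way to pin the contract down is a closed union of event types plus an early validator. The sketch below takes only the nine event names from the list above; the `turnId` and `text` fields, the `CanonicalEvent` shape, and `validateEvent` are hypothetical.

```typescript
// Event names come from the contract above.
const CANONICAL_EVENT_TYPES = [
  "turn_started",
  "text_delta",
  "block_final",
  "tool_started",
  "tool_finished",
  "status",
  "turn_completed",
  "turn_failed",
  "turn_cancelled",
] as const;

type CanonicalEventType = (typeof CANONICAL_EVENT_TYPES)[number];

// Hypothetical payload shape: the plan does not define event fields.
interface CanonicalEvent {
  type: CanonicalEventType;
  turnId: string;
  text?: string;
}

type ValidationResult =
  | { ok: true; event: CanonicalEvent }
  | { ok: false; reason: string };

// Rejects malformed runtime events with a structured reason.
function validateEvent(raw: unknown): ValidationResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, reason: "event must be an object" };
  }
  const candidate = raw as { type?: unknown; turnId?: unknown; text?: unknown };
  const type = candidate.type;
  const known = CANONICAL_EVENT_TYPES.find((t) => t === type);
  if (known === undefined) {
    return { ok: false, reason: `unknown event type: ${String(type)}` };
  }
  if (typeof candidate.turnId !== "string") {
    return { ok: false, reason: "turnId must be a string" };
  }
  if (known === "text_delta" && typeof candidate.text !== "string") {
    return { ok: false, reason: "text_delta requires text" };
  }
  const event: CanonicalEvent = { type: known, turnId: candidate.turnId };
  if (typeof candidate.text === "string") event.text = candidate.text;
  return { ok: true, event };
}
```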

## Workstreams

### 1) Canonical streaming contract

- Define strict event schema + validation in core.
- Add adapter contract tests to guarantee each runtime emits compatible events.
- Reject malformed runtime events early and surface structured diagnostics.

### 2) Shared stream processor

- Replace runtime-specific coalescer/projector logic with one processor.
- Processor owns text delta buffering, idle flush, max-chunk splitting, and completion flush.
- Move ACP/main/subagent config resolution into one helper to prevent drift.
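
The buffering responsibilities of the shared processor can be sketched as a small coalescer. `TextCoalescer` and its parameters are hypothetical names; the idle flush is left to the caller, who would invoke `flush()` from a timer, and the same method doubles as the completion flush.

```typescript
// Buffers tiny text deltas and emits chunks of at most `maxChunk` characters.
class TextCoalescer {
  private buffer = "";
  private readonly maxChunk: number;
  private readonly emit: (chunk: string) => void;

  constructor(maxChunk: number, emit: (chunk: string) => void) {
    if (maxChunk <= 0) throw new Error("maxChunk must be positive");
    this.maxChunk = maxChunk;
    this.emit = emit;
  }

  // Appends a delta; emits full chunks as soon as the buffer is large enough.
  push(delta: string): void {
    this.buffer += delta;
    while (this.buffer.length >= this.maxChunk) {
      this.emit(this.buffer.slice(0, this.maxChunk));
      this.buffer = this.buffer.slice(this.maxChunk);
    }
  }

  // Emits whatever is left. Call on idle timeout and on turn completion.
  flush(): void {
    if (this.buffer.length > 0) {
      this.emit(this.buffer);
      this.buffer = "";
    }
  }
}
```

Splitting on raw character counts is the simplest possible policy; channel-specific rules such as not splitting inside a code fence would belong in the channel projector rather than here.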

### 3) Shared channel projection

- Keep channel adapters dumb: accept finalized blocks and send.
- Move Discord-specific chunking quirks to channel projector only.
- Keep pipeline channel-agnostic before projection.

### 4) Delivery ledger + replay

- Add per-turn/per-chunk delivery IDs.
- Record checkpoints before and after physical send.
- On restart, replay pending chunks idempotently and avoid duplicates.
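
The ledger idea can be illustrated with a toy in-memory version. Everything here is an assumption made for the sketch: the `DeliveryLedger` class, the `turnId:chunkIndex` ID format, and the `Map` that stands in for durable storage.

```typescript
type DeliveryState = "pending" | "sent";

// In-memory stand-in for a durable ledger keyed by delivery ID.
class DeliveryLedger {
  private readonly states = new Map<string, DeliveryState>();

  // Hypothetical ID format: one ID per turn and chunk index.
  static deliveryId(turnId: string, chunkIndex: number): string {
    return `${turnId}:${chunkIndex}`;
  }

  isSent(id: string): boolean {
    return this.states.get(id) === "sent";
  }

  markPending(id: string): void {
    if (!this.states.has(id)) this.states.set(id, "pending");
  }

  markSent(id: string): void {
    this.states.set(id, "sent");
  }

  // IDs that were checkpointed before send but never confirmed.
  pendingIds(): string[] {
    return Array.from(this.states.entries())
      .filter(([, state]) => state === "pending")
      .map(([id]) => id);
  }
}

// Sends a chunk at most once per delivery ID; returns false if skipped.
function deliver(ledger: DeliveryLedger, id: string, send: () => void): boolean {
  if (ledger.isSent(id)) return false;
  ledger.markPending(id); // checkpoint before the physical send
  send();
  ledger.markSent(id); // checkpoint after the physical send
  return true;
}
```

Note that a crash between `send()` and `markSent()` leaves the chunk pending, so replay would send it again unless the channel itself deduplicates on the delivery ID; the plan does not say which side closes that gap.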

### 5) Migration and cutover

- Phase 1: shadow mode (new pipeline computes output but old path sends; compare).
- Phase 2: runtime-by-runtime cutover (`acp`, then `subagent`, then `main` or reverse by risk).
- Phase 3: delete legacy runtime-specific streaming code.

## Non-goals

- No changes to ACP policy/permissions model in this refactor.
- No channel-specific feature expansion outside projection compatibility fixes.
- No transport/backend redesign (acpx plugin contract remains as-is unless needed for event parity).

## Risks and mitigations

- Risk: behavioral regressions in existing main/subagent paths.
  Mitigation: shadow mode diffing + adapter contract tests + channel e2e tests.
- Risk: duplicate sends during crash recovery.
  Mitigation: durable delivery IDs + idempotent replay in delivery adapter.
- Risk: runtime adapters diverge again.
  Mitigation: required shared contract test suite for all adapters.

## Acceptance criteria

- All runtimes pass shared streaming contract tests.
- Discord ACP/main/subagent produce equivalent spacing/chunking behavior for tiny deltas.
- Crash/restart replay sends no duplicate chunk for the same delivery ID.
- Legacy ACP projector/coalescer path is removed.
- Streaming config resolution is shared and runtime-independent.
396
content/experiments/plans/browser-evaluate-cdp-refactor.md
Normal file
@@ -0,0 +1,396 @@
---
summary: "Plan: isolate browser act:evaluate from Playwright queue using CDP, with end-to-end deadlines and safer ref resolution"
read_when:
  - Working on browser `act:evaluate` timeout, abort, or queue blocking issues
  - Planning CDP based isolation for evaluate execution
owner: "openclaw"
status: "draft"
last_updated: "2026-02-10"
title: "Browser Evaluate CDP Refactor"
---

# Browser Evaluate CDP Refactor Plan

## Context

`act:evaluate` executes user-provided JavaScript in the page. Today it runs via Playwright
(`page.evaluate` or `locator.evaluate`). Playwright serializes CDP commands per page, so a
stuck or long-running evaluate can block the page command queue and make every later action
on that tab look "stuck".

PR #13498 adds a pragmatic safety net (bounded evaluate, abort propagation, and best-effort
recovery). This document describes a larger refactor that makes `act:evaluate` inherently
isolated from Playwright so a stuck evaluate cannot wedge normal Playwright operations.

## Goals

- `act:evaluate` cannot permanently block later browser actions on the same tab.
- Timeouts have a single source of truth end to end, so a caller can rely on a budget.
- Abort and timeout are treated the same way across HTTP and in-process dispatch.
- Element targeting for evaluate is supported without switching everything off Playwright.
- Maintain backward compatibility for existing callers and payloads.

## Non-goals

- Replace all browser actions (click, type, wait, etc.) with CDP implementations.
- Remove the existing safety net introduced in PR #13498 (it remains a useful fallback).
- Introduce new unsafe capabilities beyond the existing `browser.evaluateEnabled` gate.
- Add process isolation (worker process/thread) for evaluate. If we still see hard-to-recover
  stuck states after this refactor, that is a follow-up idea.

## Current Architecture (Why It Gets Stuck)

At a high level:

- Callers send `act:evaluate` to the browser control service.
- The route handler calls into Playwright to execute the JavaScript.
- Playwright serializes page commands, so an evaluate that never finishes blocks the queue.
- A stuck queue means later click/type/wait operations on the tab can appear to hang.

## Proposed Architecture

### 1. Deadline Propagation

Introduce a single budget concept and derive everything from it:

- Caller sets `timeoutMs` (or a deadline in the future).
- The outer request timeout, route handler logic, and the execution budget inside the page
  all use the same budget, with small headroom where needed for serialization overhead.
- Abort is propagated as an `AbortSignal` everywhere so cancellation is consistent.

Implementation direction:

- Add a small helper (for example `createBudget({ timeoutMs, signal })`) that returns:
  - `signal`: the linked AbortSignal
  - `deadlineAtMs`: absolute deadline
  - `remainingMs()`: remaining budget for child operations
- Use this helper in:
  - `src/browser/client-fetch.ts` (HTTP and in-process dispatch)
  - `src/node-host/runner.ts` (proxy path)
  - browser action implementations (Playwright and CDP)
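
A minimal sketch of such a helper, assuming the three fields named above. The injectable `now` clock and the `dispose()` method are additions made for this example so the timer can be cleaned up and the math can be tested; they are not part of the plan.

```typescript
interface Budget {
  signal: AbortSignal; // aborts on upstream abort or when the timeout fires
  deadlineAtMs: number; // absolute deadline
  remainingMs(): number; // remaining budget for child operations, never negative
  dispose(): void; // assumption: releases the timer and the upstream listener
}

function createBudget(opts: {
  timeoutMs: number;
  signal?: AbortSignal;
  now?: () => number; // assumption: injectable clock, defaults to Date.now
}): Budget {
  const now = opts.now ?? (() => Date.now());
  const deadlineAtMs = now() + opts.timeoutMs;
  const controller = new AbortController();
  const upstream = opts.signal;
  const onUpstreamAbort = () => controller.abort();

  // Link the upstream signal so caller cancellation propagates.
  if (upstream?.aborted) {
    controller.abort();
  } else {
    upstream?.addEventListener("abort", onUpstreamAbort, { once: true });
  }

  // The timeout itself also aborts the linked signal.
  const timer = setTimeout(() => controller.abort(), opts.timeoutMs);

  return {
    signal: controller.signal,
    deadlineAtMs,
    remainingMs: () => Math.max(0, deadlineAtMs - now()),
    dispose: () => {
      clearTimeout(timer);
      upstream?.removeEventListener("abort", onUpstreamAbort);
    },
  };
}
```

A child operation would then be given `budget.remainingMs()` (minus any headroom) rather than the original `timeoutMs`, so nested timeouts cannot outlive the caller's deadline.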

### 2. Separate Evaluate Engine (CDP Path)

Add a CDP-based evaluate implementation that does not share Playwright's per-page command
queue. The key property is that the evaluate transport is a separate WebSocket connection
and a separate CDP session attached to the target.

Implementation direction:

- New module, for example `src/browser/cdp-evaluate.ts`, that:
  - Connects to the configured CDP endpoint (browser-level socket).
  - Uses `Target.attachToTarget({ targetId, flatten: true })` to get a `sessionId`.
  - Runs either:
    - `Runtime.evaluate` for page-level evaluate, or
    - `DOM.resolveNode` plus `Runtime.callFunctionOn` for element evaluate.
  - On timeout or abort:
    - Sends `Runtime.terminateExecution` best-effort for the session.
    - Closes the WebSocket and returns a clear error.

Notes:

- This still executes JavaScript in the page, so termination can have side effects. The win
  is that it does not wedge the Playwright queue, and it is cancelable at the transport
  layer by killing the CDP session.
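
To make the message flow concrete, the sketch below only builds the JSON messages such a module would send over the browser-level socket; it does not open a WebSocket. The helper names and the chosen `Runtime.evaluate` options (`awaitPromise`, `returnByValue`) are assumptions; the CDP method names are the ones listed above. With `flatten: true`, later messages carry the `sessionId` at the top level of the message.

```typescript
// A CDP request frame as sent over the WebSocket (JSON-encoded).
interface CdpRequest {
  id: number;
  method: string;
  params?: Record<string, unknown>;
  sessionId?: string; // top-level when the session was attached with flatten: true
}

function attachRequest(id: number, targetId: string): CdpRequest {
  return { id, method: "Target.attachToTarget", params: { targetId, flatten: true } };
}

// Page-level evaluate on the attached session.
function pageEvaluateRequest(id: number, sessionId: string, expression: string): CdpRequest {
  return {
    id,
    sessionId,
    method: "Runtime.evaluate",
    params: { expression, awaitPromise: true, returnByValue: true },
  };
}

// Element evaluate, step 1: turn a backend node id into a remote object.
function resolveNodeRequest(id: number, sessionId: string, backendNodeId: number): CdpRequest {
  return { id, sessionId, method: "DOM.resolveNode", params: { backendNodeId } };
}

// Element evaluate, step 2: call the function with the element as `this`.
function callOnElementRequest(
  id: number,
  sessionId: string,
  objectId: string,
  functionDeclaration: string,
): CdpRequest {
  return {
    id,
    sessionId,
    method: "Runtime.callFunctionOn",
    params: { objectId, functionDeclaration, awaitPromise: true, returnByValue: true },
  };
}

// Best-effort cancellation on timeout or abort.
function terminateRequest(id: number, sessionId: string): CdpRequest {
  return { id, sessionId, method: "Runtime.terminateExecution" };
}
```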

### 3. Ref Story (Element Targeting Without A Full Rewrite)

The hard part is element targeting. CDP needs a DOM handle or `backendDOMNodeId`, while
today most browser actions use Playwright locators based on refs from snapshots.

Recommended approach: keep existing refs, but attach an optional CDP-resolvable id.

#### 3.1 Extend Stored Ref Info

Extend the stored role ref metadata to optionally include a CDP id:

- Today: `{ role, name, nth }`
- Proposed: `{ role, name, nth, backendDOMNodeId?: number }`

This keeps all existing Playwright-based actions working and allows CDP evaluate to accept
the same `ref` value when the `backendDOMNodeId` is available.

#### 3.2 Populate backendDOMNodeId At Snapshot Time

When producing a role snapshot:

1. Generate the existing role ref map as today (role, name, nth).
2. Fetch the AX tree via CDP (`Accessibility.getFullAXTree`) and compute a parallel map of
   `(role, name, nth) -> backendDOMNodeId` using the same duplicate handling rules.
3. Merge the id back into the stored ref info for the current tab.

If mapping fails for a ref, leave `backendDOMNodeId` undefined. This makes the feature
best-effort and safe to roll out.
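
The merge step can be sketched as follows. This is an illustration under two loud assumptions: `nth` is treated as a zero-based index among nodes with the same role and name in tree order (the document only says "the same duplicate handling rules"), and `AxNodeLite` is a flattened stand-in for the real AX node objects, whose `role` and `name` are wrapped value objects in CDP.

```typescript
// Simplified AX node: real CDP nodes wrap role/name in value objects.
interface AxNodeLite {
  role?: string;
  name?: string;
  backendDOMNodeId?: number;
  ignored?: boolean;
}

interface RoleRef {
  role: string;
  name: string;
  nth: number; // assumption: zero-based index among same (role, name) nodes
  backendDOMNodeId?: number;
}

// Returns refs with backendDOMNodeId filled in where a match exists.
// Refs without a match are returned unchanged (id stays undefined).
function attachBackendIds(refs: RoleRef[], axNodes: AxNodeLite[]): RoleRef[] {
  const idsByKey = new Map<string, number[]>();
  for (const node of axNodes) {
    if (node.ignored || node.role === undefined || node.backendDOMNodeId === undefined) {
      continue;
    }
    const key = JSON.stringify([node.role, node.name ?? ""]);
    const ids = idsByKey.get(key) ?? [];
    ids.push(node.backendDOMNodeId);
    idsByKey.set(key, ids);
  }
  return refs.map((ref) => {
    const ids = idsByKey.get(JSON.stringify([ref.role, ref.name]));
    const id = ids === undefined ? undefined : ids[ref.nth];
    return id === undefined ? ref : { ...ref, backendDOMNodeId: id };
  });
}
```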

#### 3.3 Evaluate Behavior With Ref

In `act:evaluate`:

- If `ref` is present and has `backendDOMNodeId`, run element evaluate via CDP.
- If `ref` is present but has no `backendDOMNodeId`, fall back to the Playwright path (with
  the safety net).

Optional escape hatch:

- Extend the request shape to accept `backendDOMNodeId` directly for advanced callers (and
  for debugging), while keeping `ref` as the primary interface.

### 4. Keep A Last Resort Recovery Path

Even with CDP evaluate, there are other ways to wedge a tab or a connection. Keep the
existing recovery mechanisms (terminate execution + disconnect Playwright) as a last resort
for:

- legacy callers
- environments where CDP attach is blocked
- unexpected Playwright edge cases

## Implementation Plan (Single Iteration)

### Deliverables

- A CDP-based evaluate engine that runs outside the Playwright per-page command queue.
- A single end-to-end timeout/abort budget used consistently by callers and handlers.
- Ref metadata that can optionally carry `backendDOMNodeId` for element evaluate.
- `act:evaluate` prefers the CDP engine when possible and falls back to Playwright when not.
- Tests that prove a stuck evaluate does not wedge later actions.
- Logs/metrics that make failures and fallbacks visible.

### Implementation Checklist

1. Add a shared "budget" helper to link `timeoutMs` + upstream `AbortSignal` into:
   - a single `AbortSignal`
   - an absolute deadline
   - a `remainingMs()` helper for downstream operations
2. Update all caller paths to use that helper so `timeoutMs` means the same thing everywhere:
   - `src/browser/client-fetch.ts` (HTTP and in-process dispatch)
   - `src/node-host/runner.ts` (node proxy path)
   - CLI wrappers that call `/act` (add `--timeout-ms` to `browser evaluate`)
3. Implement `src/browser/cdp-evaluate.ts`:
   - connect to the browser-level CDP socket
   - `Target.attachToTarget` to get a `sessionId`
   - run `Runtime.evaluate` for page evaluate
   - run `DOM.resolveNode` + `Runtime.callFunctionOn` for element evaluate
   - on timeout/abort: best-effort `Runtime.terminateExecution` then close the socket
4. Extend stored role ref metadata to optionally include `backendDOMNodeId`:
   - keep existing `{ role, name, nth }` behavior for Playwright actions
   - add `backendDOMNodeId?: number` for CDP element targeting
5. Populate `backendDOMNodeId` during snapshot creation (best-effort):
   - fetch AX tree via CDP (`Accessibility.getFullAXTree`)
   - compute `(role, name, nth) -> backendDOMNodeId` and merge into the stored ref map
   - if mapping is ambiguous or missing, leave the id undefined
6. Update `act:evaluate` routing:
   - if no `ref`: always use CDP evaluate
   - if `ref` resolves to a `backendDOMNodeId`: use CDP element evaluate
   - otherwise: fall back to Playwright evaluate (still bounded and abortable)
7. Keep the existing "last resort" recovery path as a fallback, not the default path.
8. Add tests:
   - stuck evaluate times out within budget and the next click/type succeeds
   - abort cancels evaluate (client disconnect or timeout) and unblocks subsequent actions
   - mapping failures cleanly fall back to Playwright
9. Add observability:
   - evaluate duration and timeout counters
   - terminateExecution usage
   - fallback rate (CDP -> Playwright) and reasons
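
The routing rule from checklist item 6 is small enough to express as one pure function. The `EvaluateRoute` shape, the `routeEvaluate` name, and the ref key format used in the example (`"e1"`) are hypothetical; the three branches mirror the checklist.

```typescript
interface StoredRef {
  role: string;
  name: string;
  nth: number;
  backendDOMNodeId?: number;
}

// Hypothetical result shape; `reason` feeds the fallback-rate metric.
type EvaluateRoute =
  | { engine: "cdp"; backendDOMNodeId?: number }
  | { engine: "playwright"; reason: string };

function routeEvaluate(
  ref: string | undefined,
  refs: Map<string, StoredRef>,
): EvaluateRoute {
  // No ref: page-level evaluate always goes through CDP.
  if (ref === undefined) return { engine: "cdp" };
  const stored = refs.get(ref);
  // Ref with a mapped id: element evaluate through CDP.
  if (stored !== undefined && stored.backendDOMNodeId !== undefined) {
    return { engine: "cdp", backendDOMNodeId: stored.backendDOMNodeId };
  }
  // Otherwise fall back to the bounded Playwright path.
  return {
    engine: "playwright",
    reason: stored === undefined ? "unknown ref" : "ref has no backendDOMNodeId",
  };
}
```

Keeping the decision separate from execution makes the fallback reasons easy to count and the mapping-failure test case trivial to write.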

### Acceptance Criteria

- A deliberately hung `act:evaluate` returns within the caller budget and does not wedge the
  tab for later actions.
- `timeoutMs` behaves consistently across CLI, agent tool, node proxy, and in-process calls.
- If `ref` can be mapped to `backendDOMNodeId`, element evaluate uses CDP; otherwise the
  fallback path is still bounded and recoverable.

## Testing Plan

- Unit tests:
  - `(role, name, nth)` matching logic between role refs and AX tree nodes.
  - Budget helper behavior (headroom, remaining time math).
- Integration tests:
  - CDP evaluate timeout returns within budget and does not block the next action.
  - Abort cancels evaluate and triggers termination best-effort.
- Contract tests:
  - Ensure `BrowserActRequest` and `BrowserActResponse` remain compatible.

## Risks And Mitigations

- Mapping is imperfect:
  - Mitigation: best-effort mapping, fallback to Playwright evaluate, and add debug tooling.
- `Runtime.terminateExecution` has side effects:
  - Mitigation: only use on timeout/abort and document the behavior in errors.
- Extra overhead:
  - Mitigation: only fetch AX tree when snapshots are requested, cache per target, and keep
    CDP session short-lived.
- Extension relay limitations:
  - Mitigation: use browser-level attach APIs when per-page sockets are not available, and
    keep the current Playwright path as fallback.

## Open Questions

- Should the new engine be configurable as `playwright`, `cdp`, or `auto`?
- Do we want to expose a new "nodeRef" format for advanced users, or keep `ref` only?
- How should frame snapshots and selector-scoped snapshots participate in AX mapping?
70
content/experiments/plans/cron-add-hardening.md
Normal file
@@ -0,0 +1,70 @@
---
last_updated: "2026-01-05"
owner: openclaw
status: complete
summary: Harden cron.add input handling, align schemas, improve cron UI/agent tooling
title: Cron Add Hardening
x-i18n:
  generated_at: "2026-02-03T07:47:26Z"
  model: claude-opus-4-5
  provider: pi
  source_hash: d7e469674bd9435b846757ea0d5dc8f174eaa8533917fc013b1ef4f82859496d
  source_path: experiments/plans/cron-add-hardening.md
  workflow: 15
---

# Cron Add Hardening & Schema Alignment

## Context

Recent Gateway logs show repeated `cron.add` failures with invalid parameters (missing `sessionTarget`, `wakeMode`, `payload`, and a malformed `schedule`). This indicates that at least one client (likely the agent tool call path) is sending wrapped or partially specified job payloads. Separately, there is drift between the cron provider enums in TypeScript, the Gateway schema, the CLI flags, and the UI form types, plus a UI mismatch for `cron.status` (the UI expects `jobCount` while the Gateway returns `jobs`).

## Goals

- Stop `cron.add` INVALID_REQUEST spam by normalizing common wrapper payloads and inferring missing `kind` fields.
- Align the cron provider list across the Gateway schema, cron types, CLI docs, and UI forms.
- Make the agent cron tool schema explicit so the LLM produces correct job payloads.
- Fix the Control UI cron status job count display.
- Add tests covering normalization and tool behavior.

## Non-goals

- Change cron scheduling semantics or job execution behavior.
- Add new schedule kinds or cron expression parsing.
- Overhaul the cron UI/UX beyond the necessary field fixes.

## Findings (current gaps)

- `CronPayloadSchema` in the Gateway excludes `signal` + `imessage`, while the TS types include them.
- Control UI CronStatus expects `jobCount`, but the Gateway returns `jobs`.
- The agent cron tool schema allows arbitrary `job` objects, which enables malformed input.
- The Gateway strictly validates `cron.add` with no normalization, so wrapped payloads fail.

## What changed

- `cron.add` and `cron.update` now normalize common wrapper shapes and infer missing `kind` fields.
- The agent cron tool schema matches the Gateway schema, which reduces invalid payloads.
- Provider enums are aligned across the Gateway, CLI, UI, and macOS picker.
- The Control UI uses the Gateway's `jobs` count field for status.

## Current behavior

- **Normalization:** wrapped `data`/`job` payloads are unwrapped; `schedule.kind` and `payload.kind` are inferred when safe.
- **Defaults:** safe defaults are applied for `wakeMode` and `sessionTarget` when missing.
- **Providers:** Discord/Slack/Signal/iMessage are now consistently surfaced across CLI/UI.

See [Cron jobs](/automation/cron-jobs) for the normalized shape and examples.
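
The unwrap-and-default part of that behavior might look roughly like the sketch below. It is illustrative only: `normalizeCronAdd` is a hypothetical name, the default values are passed in by the caller because this document does not state them, and `kind` inference is left out because its rules are not described here.

```typescript
type Json = Record<string, unknown>;

// Returns the value if it is a plain (non-array) object, otherwise undefined.
function asObject(value: unknown): Json | undefined {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Json)
    : undefined;
}

// Unwraps `{ data: {...} }` or `{ job: {...} }` and fills missing defaults.
function normalizeCronAdd(
  raw: unknown,
  defaults: { wakeMode: string; sessionTarget: string },
): Json {
  const top = asObject(raw) ?? {};
  // Assumption: a top-level object under `data` or `job` is a wrapper.
  const job = asObject(top.data) ?? asObject(top.job) ?? top;
  return {
    ...job,
    wakeMode: job.wakeMode ?? defaults.wakeMode,
    sessionTarget: job.sessionTarget ?? defaults.sessionTarget,
  };
}
```

Fields that are already present are never overwritten, so a fully specified payload passes through unchanged.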

## Verification

- Watch the Gateway logs for a reduction in `cron.add` INVALID_REQUEST errors.
- Confirm that the Control UI cron status shows the job count after a refresh.

## Optional follow-ups

- Manual Control UI smoke test: add one cron job per provider + verify the status job count.

## Open questions

- Should `cron.add` accept an explicit `state` from clients (currently disallowed by the schema)?
- Should we allow `webchat` as an explicit delivery provider (currently filtered out in delivery resolution)?
45
content/experiments/plans/group-policy-hardening.md
Normal file
@@ -0,0 +1,45 @@
---
read_when:
  - Reviewing historical Telegram allowlist changes
summary: "Telegram allowlist hardening: prefix + whitespace normalization"
title: Telegram Allowlist Hardening
x-i18n:
  generated_at: "2026-02-03T07:47:16Z"
  model: claude-opus-4-5
  provider: pi
  source_hash: a2eca5fcc85376948cfe1b6044f1a8bc69c7f0eb94d1ceafedc1e507ba544162
  source_path: experiments/plans/group-policy-hardening.md
  workflow: 15
---

# Telegram Allowlist Hardening

**Date**: 2026-01-05
**Status**: Complete
**PR**: #216

## Summary

Telegram allowlists now accept the `telegram:` and `tg:` prefixes case-insensitively and tolerate accidental whitespace. This aligns inbound allowlist checks with outbound send normalization.

## What changed

- The prefixes `telegram:` and `tg:` are treated the same (case-insensitive).
- Allowlist entries are trimmed; empty entries are ignored.

## Examples

All of the following forms are accepted as the same ID:

- `telegram:123456`
- `TG:123456`
- `tg:123456`
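
The normalization described above fits in a few lines. This is a sketch rather than the code from PR #216: the function names are made up, and reducing every entry to the bare ID is an assumption about the canonical form.

```typescript
// Strips an optional `telegram:` / `tg:` prefix (any case) and whitespace.
// Returns null for entries that are empty after normalization.
function normalizeTelegramAllowEntry(entry: string): string | null {
  const trimmed = entry.trim();
  if (trimmed === "") return null;
  const id = trimmed.replace(/^(telegram|tg):/i, "").trim();
  return id === "" ? null : id;
}

// Normalizes a whole allowlist, dropping empty entries.
function normalizeTelegramAllowlist(entries: string[]): string[] {
  const result: string[] = [];
  for (const entry of entries) {
    const id = normalizeTelegramAllowEntry(entry);
    if (id !== null) result.push(id);
  }
  return result;
}
```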

## Why it matters

Copy/paste from logs or chat IDs often includes prefixes and whitespace. Normalizing avoids false negatives when deciding whether to respond in DMs or groups.

## Related docs

- [Group chats](/channels/groups)
- [Telegram provider](/channels/telegram)
121
content/experiments/plans/openresponses-gateway.md
Normal file
@@ -0,0 +1,121 @@
---
last_updated: "2026-01-19"
owner: openclaw
status: draft
summary: "Plan: add an OpenResponses /v1/responses endpoint and deprecate chat completions cleanly"
title: OpenResponses Gateway Plan
x-i18n:
  generated_at: "2026-02-03T07:47:33Z"
  model: claude-opus-4-5
  provider: pi
  source_hash: 71a22c48397507d1648b40766a3153e420c54f2a2d5186d07e51eb3d12e4636a
  source_path: experiments/plans/openresponses-gateway.md
  workflow: 15
---

# OpenResponses Gateway Integration Plan

## Context

The OpenClaw Gateway currently exposes a minimal OpenAI-compatible Chat Completions endpoint at `/v1/chat/completions` (see [OpenAI Chat Completions](/gateway/openai-http-api)).

Open Responses is an open inference standard based on the OpenAI Responses API. It is designed for agentic workflows and uses item-based inputs plus semantic streaming events. The OpenResponses spec defines `/v1/responses`, not `/v1/chat/completions`.

## Goals

- Add a `/v1/responses` endpoint that follows OpenResponses semantics.
- Keep Chat Completions as a compatibility layer that is easy to disable and eventually remove.
- Standardize validation and parsing with isolated, reusable schemas.

## Non-goals

- Full OpenResponses feature parity in the first phase (images, files, hosted tools).
- Replacing the internal agent execution logic or tool orchestration.
- Changing the existing `/v1/chat/completions` behavior during the first phase.

## Research summary

Sources: the OpenResponses OpenAPI, the OpenResponses specification site, and the Hugging Face blog post.

Key points extracted:

- `POST /v1/responses` accepts `CreateResponseBody` fields such as `model`, `input` (string or `ItemParam[]`), `instructions`, `tools`, `tool_choice`, `stream`, `max_output_tokens`, and `max_tool_calls`.
- `ItemParam` is a discriminated union of:
  - `message` items with the roles `system`, `developer`, `user`, `assistant`
  - `function_call` and `function_call_output`
  - `reasoning`
  - `item_reference`
- Successful responses return a `ResponseResource` with `object: "response"`, `status`, and `output` items.
- Streaming uses semantic events such as:
  - `response.created`, `response.in_progress`, `response.completed`, `response.failed`
  - `response.output_item.added`, `response.output_item.done`
  - `response.content_part.added`, `response.content_part.done`
  - `response.output_text.delta`, `response.output_text.done`
- The spec requires:
  - `Content-Type: text/event-stream`
  - `event:` must match the JSON `type` field
  - the terminal event must be the literal `[DONE]`
- Reasoning items may expose `content`, `encrypted_content`, and `summary`.
- The HF examples include `OpenResponses-Version: latest` in requests (optional header).

## Proposed architecture

- Add `src/gateway/open-responses.schema.ts` containing Zod schemas only (no gateway imports).
- Add `src/gateway/openresponses-http.ts` (or `open-responses-http.ts`) for `/v1/responses`.
- Keep `src/gateway/openai-http.ts` intact as a legacy compatibility adapter.
- Add the config `gateway.http.endpoints.responses.enabled` (default `false`).
- Keep `gateway.http.endpoints.chatCompletions.enabled` independent; allow both endpoints to be toggled separately.
- Emit a startup warning when Chat Completions is enabled to signal its legacy status.

## Deprecation path for Chat Completions

- Maintain strict module boundaries: no shared schema types between responses and chat completions.
- Make Chat Completions opt-in through config so it can be disabled without code changes.
- Update the docs to label Chat Completions as legacy once `/v1/responses` is stable.
- Optional future step: map Chat Completions requests to the Responses handler for a simpler removal path.

## Phase 1 support subset

- Accept `input` as a string or an `ItemParam[]` with message roles and `function_call_output`.
- Extract system and developer messages into `extraSystemPrompt`.
- Use the most recent `user` or `function_call_output` as the current message for the agent run.
- Reject unsupported content parts (image/file) with `invalid_request_error`.
- Return a single assistant message with `output_text` content.
- Return `usage` with zeroed values until token accounting is wired in.

## Validation strategy (no SDK)

- Implement Zod schemas for the supported subset of:
  - `CreateResponseBody`
  - `ItemParam` + message content part unions
  - `ResponseResource`
  - the streaming event shapes used by the Gateway
- Keep the schemas in a single, isolated module to avoid drift and allow future codegen.

## Streaming implementation (phase 1)

- SSE lines with both `event:` and `data:`.
- Required sequence (minimum viable):
  - `response.created`
  - `response.output_item.added`
  - `response.content_part.added`
  - `response.output_text.delta` (repeated as needed)
  - `response.output_text.done`
  - `response.content_part.done`
  - `response.completed`
  - `[DONE]`
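
The minimum sequence above can be sketched as a function that produces the SSE frames for a list of text deltas. The event names and the `event:`/`data:` framing come from this plan; the helper names and the payload fields other than `type` (here only `delta`) are trimmed-down assumptions, not the full OpenResponses event shapes.

```typescript
// Formats one SSE frame; `event:` always matches the JSON `type` field.
function sseFrame(payload: { type: string; [key: string]: unknown }): string {
  return `event: ${payload.type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// The literal terminal frame.
const SSE_DONE = "data: [DONE]\n\n";

// Produces the minimum viable frame sequence for a single text output.
function streamTextFrames(deltas: string[]): string[] {
  const frames: string[] = [
    sseFrame({ type: "response.created" }),
    sseFrame({ type: "response.output_item.added" }),
    sseFrame({ type: "response.content_part.added" }),
  ];
  for (const delta of deltas) {
    frames.push(sseFrame({ type: "response.output_text.delta", delta }));
  }
  frames.push(
    sseFrame({ type: "response.output_text.done" }),
    sseFrame({ type: "response.content_part.done" }),
    sseFrame({ type: "response.completed" }),
    SSE_DONE,
  );
  return frames;
}
```

An HTTP handler would write these frames to a response that has `Content-Type: text/event-stream` set, flushing after each one.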

## Tests and verification plan

- Add end-to-end coverage for `/v1/responses`:
  - authentication required
  - non-streaming response shape
  - streaming event ordering and `[DONE]`
  - session routing with headers and `user`
- Keep `src/gateway/openai-http.e2e.test.ts` unchanged.
- Manual: curl `/v1/responses` with `stream: true` and verify the event ordering and the terminal `[DONE]`.

## Doc updates (follow-up)

- Add a new docs page for `/v1/responses` usage and examples.
- Update `/gateway/openai-http-api` with a legacy note and a pointer to `/v1/responses`.
317
content/experiments/plans/pty-process-supervision.md
Normal file
@@ -0,0 +1,317 @@
|
||||
---
|
||||
summary: "Production plan for reliable interactive process supervision (PTY + non-PTY) with explicit ownership, unified lifecycle, and deterministic cleanup"
|
||||
read_when:
|
||||
- Working on exec/process lifecycle ownership and cleanup
|
||||
- Debugging PTY and non-PTY supervision behavior
|
||||
owner: "openclaw"
|
||||
status: "in-progress"
|
||||
last_updated: "2026-02-15"
|
||||
title: "PTY and Process Supervision Plan"
|
||||
---
|
||||
|
||||
# PTY and Process Supervision Plan
|
||||
|
||||
|
||||
## 1. Problem and goal
|
||||
|
||||
|
||||
We need one reliable lifecycle for long-running command execution across:
|
||||
|
||||
|
||||
- `exec` foreground runs
|
||||
|
||||
- `exec` background runs
|
||||
|
||||
- `process` follow up actions (`poll`, `log`, `send-keys`, `paste`, `submit`, `kill`, `remove`)
|
||||
|
||||
- CLI agent runner subprocesses
|
||||
|
||||
|
||||
The goal is not just to support PTY. The goal is predictable ownership, cancellation, timeout, and cleanup with no unsafe process matching heuristics.
|
||||
|
||||
|
||||
## 2. Scope and boundaries
|
||||
|
||||
|
||||
- Keep implementation internal in `src/process/supervisor`.
|
||||
|
||||
- Do not create a new package for this.
|
||||
|
||||
- Keep current behavior compatibility where practical.
|
||||
|
||||
- Do not broaden scope to terminal replay or tmux style session persistence.
|
||||
|
||||
|
||||
## 3. Implemented in this branch
|
||||
|
||||
|
||||
### Supervisor baseline already present
|
||||
|
||||
|
||||
- Supervisor module is in place under `src/process/supervisor/*`.
|
||||
|
||||
- Exec runtime and CLI runner are already routed through supervisor spawn and wait.
|
||||
|
||||
- Registry finalization is idempotent.
|
||||
|
||||
|
||||
### This pass completed
|
||||
|
||||
|
||||
1. Explicit PTY command contract
|
||||
|
||||
|
||||
- `SpawnInput` is now a discriminated union in `src/process/supervisor/types.ts`.
|
||||
|
||||
- PTY runs require `ptyCommand` instead of reusing generic `argv`.
|
||||
|
||||
- Supervisor no longer rebuilds PTY command strings from argv joins in `src/process/supervisor/supervisor.ts`.
|
||||
|
||||
- Exec runtime now passes `ptyCommand` directly in `src/agents/bash-tools.exec-runtime.ts`.
|
||||
|
||||
|
||||
2. Process layer type decoupling
|
||||
|
||||
|
||||
- Supervisor types no longer import `SessionStdin` from agents.
|
||||
|
||||
- Process local stdin contract lives in `src/process/supervisor/types.ts` (`ManagedRunStdin`).
|
||||
|
||||
- Adapters now depend only on process level types:
|
||||
|
||||
- `src/process/supervisor/adapters/child.ts`
|
||||
|
||||
- `src/process/supervisor/adapters/pty.ts`
|
||||
|
||||
|
||||
3. Process tool lifecycle ownership improvement
|
||||
|
||||
|
||||
- `src/agents/bash-tools.process.ts` now requests cancellation through supervisor first.
|
||||
|
||||
- `process kill/remove` now use process-tree fallback termination when supervisor lookup misses.
|
||||
|
||||
- `remove` keeps deterministic remove behavior by dropping running session entries immediately after termination is requested.
|
||||
|
||||
|
||||
4. Single source watchdog defaults
|
||||
|
||||
|
||||
- Added shared defaults in `src/agents/cli-watchdog-defaults.ts`.
|
||||
|
||||
- `src/agents/cli-backends.ts` consumes the shared defaults.
|
||||
|
||||
- `src/agents/cli-runner/reliability.ts` consumes the same shared defaults.
|
||||
|
||||
|
||||
5. Dead helper cleanup
|
||||
|
||||
|
||||
- Removed unused `killSession` helper path from `src/agents/bash-tools.shared.ts`.
|
||||
|
||||
|
||||
6. Direct supervisor path tests added
|
||||
|
||||
|
||||
- Added `src/agents/bash-tools.process.supervisor.test.ts` to cover kill and remove routing through supervisor cancellation.
|
||||
|
||||
|
||||
7. Reliability gap fixes completed
|
||||
|
||||
|
||||
- `src/agents/bash-tools.process.ts` now falls back to real OS-level process termination when supervisor lookup misses.
|
||||
|
||||
- `src/process/supervisor/adapters/child.ts` now uses process-tree termination semantics for default cancel/timeout kill paths.
|
||||
|
||||
- Added shared process-tree utility in `src/process/kill-tree.ts`.
|
||||
|
||||
|
||||
8. PTY contract edge-case coverage added
|
||||
|
||||
|
||||
- Added `src/process/supervisor/supervisor.pty-command.test.ts` for verbatim PTY command forwarding and empty-command rejection.
|
||||
|
||||
- Added `src/process/supervisor/adapters/child.test.ts` for process-tree kill behavior in child adapter cancellation.
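The explicit PTY command contract from item 1 can be sketched as a discriminated union. This is a minimal sketch, not the contents of `src/process/supervisor/types.ts`: only `SpawnInput` and `ptyCommand` are named in this plan, so the `mode` discriminant, the `argv` field shape, the `resolveCommand` helper, and the choice to treat whitespace-only commands as empty are assumptions made for illustration.

```typescript
// Hypothetical shapes: only `SpawnInput` and `ptyCommand` come from the plan.
type ChildSpawnInput = {
  mode: "child"; // assumed discriminant name
  argv: string[];
};

type PtySpawnInput = {
  mode: "pty";
  ptyCommand: string; // forwarded verbatim, never rebuilt from argv
};

type SpawnInput = ChildSpawnInput | PtySpawnInput;

// Returns exactly what an adapter would receive, rejecting empty commands.
function resolveCommand(input: SpawnInput): string | string[] {
  if (input.mode === "pty") {
    // Assumption: whitespace-only counts as empty.
    if (input.ptyCommand.trim().length === 0) {
      throw new Error("ptyCommand must not be empty");
    }
    return input.ptyCommand;
  }
  if (input.argv.length === 0) {
    throw new Error("argv must not be empty");
  }
  return input.argv;
}
```

Because the PTY variant has no `argv` field at all, code that tries to join argv into a PTY command string fails to type-check instead of silently changing quoting.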
|
||||
|
||||
|
||||
## 4. Remaining gaps and decisions
|
||||
|
||||
|
||||
### Reliability status
|
||||
|
||||
|
||||
The two required reliability gaps for this pass are now closed:
|
||||
|
||||
|
||||
- `process kill/remove` now has a real OS termination fallback when supervisor lookup misses.
|
||||
|
||||
- Child cancel/timeout now uses process-tree kill semantics for the default kill path.
|
||||
|
||||
- Regression tests were added for both behaviors.
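The "supervisor first, real termination as fallback" ordering described above can be sketched as follows. The `RunSupervisor` interface, the `KillTree` callback, and `terminateRun` are hypothetical names; the actual supervisor and `src/process/kill-tree.ts` APIs are not shown in this plan.

```typescript
// Hypothetical interfaces for illustration only.
interface RunSupervisor {
  // Returns true when the run is known and cancellation was requested.
  cancel(runId: string, reason: string): boolean;
}

type KillTree = (pid: number) => void;

type TerminationPath = "supervisor" | "process-tree-fallback" | "none";

function terminateRun(
  supervisor: RunSupervisor,
  killTree: KillTree,
  run: { runId: string; pid?: number },
): TerminationPath {
  // Preferred path: the supervisor owns the lifecycle of managed runs.
  if (supervisor.cancel(run.runId, "manual-cancel")) {
    return "supervisor";
  }
  // Supervisor lookup missed: fall back to OS-level process-tree termination.
  if (run.pid !== undefined) {
    killTree(run.pid);
    return "process-tree-fallback";
  }
  // No pid recorded, so there is nothing left to terminate.
  return "none";
}
```

Returning the path taken keeps the fallback observable, which makes it straightforward to assert in regression tests.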
|
||||
|
||||
|
||||
### Durability and startup reconciliation
|
||||
|
||||
|
||||
Restart behavior is now explicitly defined as in-memory lifecycle only.
|
||||
|
||||
|
||||
- `reconcileOrphans()` remains a no-op in `src/process/supervisor/supervisor.ts` by design.
|
||||
|
||||
- Active runs are not recovered after process restart.
|
||||
|
||||
- This boundary is intentional for this implementation pass to avoid partial persistence risks.
|
||||
|
||||
|
||||
### Maintainability follow-ups
|
||||
|
||||
|
||||
1. `runExecProcess` in `src/agents/bash-tools.exec-runtime.ts` still handles multiple responsibilities and can be split into focused helpers in a follow-up.
|
||||
|
||||
|
||||
## 5. Implementation plan
|
||||
|
||||
|
||||
The implementation pass for required reliability and contract items is complete.
|
||||
|
||||
|
||||
Completed:
|
||||
|
||||
|
||||
- real termination fallback for `process kill/remove`
|
||||
|
||||
- process-tree cancellation for child adapter default kill path
|
||||
|
||||
- regression tests for fallback kill and child adapter kill path
|
||||
|
||||
- PTY command edge-case tests under explicit `ptyCommand`
|
||||
|
||||
- explicit in-memory restart boundary with `reconcileOrphans()` no-op by design
|
||||
|
||||
|
||||
Optional follow-up:
|
||||
|
||||
|
||||
- split `runExecProcess` into focused helpers with no behavior drift
|
||||
|
||||
|
||||
## 6. File map
|
||||
|
||||
|
||||
### Process supervisor
|
||||
|
||||
|
||||
- `src/process/supervisor/types.ts` updated with discriminated spawn input and process local stdin contract.
|
||||
|
||||
- `src/process/supervisor/supervisor.ts` updated to use explicit `ptyCommand`.
|
||||
|
||||
- `src/process/supervisor/adapters/child.ts` and `src/process/supervisor/adapters/pty.ts` decoupled from agent types.
|
||||
|
||||
- `src/process/supervisor/registry.ts` idempotent finalize retained unchanged.
|
||||
|
||||
|
||||
### Exec and process integration
|
||||
|
||||
|
||||
- `src/agents/bash-tools.exec-runtime.ts` updated to pass PTY command explicitly and keep fallback path.
|
||||
|
||||
- `src/agents/bash-tools.process.ts` updated to cancel via supervisor with real process-tree fallback termination.
|
||||
|
||||
- `src/agents/bash-tools.shared.ts` removed direct kill helper path.
|
||||
|
||||
|
||||
### CLI reliability
|
||||
|
||||
|
||||
- `src/agents/cli-watchdog-defaults.ts` added as shared baseline.
|
||||
|
||||
- `src/agents/cli-backends.ts` and `src/agents/cli-runner/reliability.ts` now consume the same defaults.
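A single-source defaults module might look like the sketch below. The field names, the values, and the `resolveWatchdog` helper are all hypothetical; the plan only states that both consumers read one shared baseline from `src/agents/cli-watchdog-defaults.ts`.

```typescript
// Hypothetical field names and values, for illustration only.
type WatchdogConfig = {
  noOutputTimeoutMs: number;
  maxRuntimeMs: number;
};

const CLI_WATCHDOG_DEFAULTS: Readonly<WatchdogConfig> = Object.freeze({
  noOutputTimeoutMs: 60000,
  maxRuntimeMs: 600000,
});

// Every consumer resolves its config through this one function,
// so a default only ever changes in one place.
function resolveWatchdog(overrides: Partial<WatchdogConfig> = {}): WatchdogConfig {
  return {
    noOutputTimeoutMs: overrides.noOutputTimeoutMs ?? CLI_WATCHDOG_DEFAULTS.noOutputTimeoutMs,
    maxRuntimeMs: overrides.maxRuntimeMs ?? CLI_WATCHDOG_DEFAULTS.maxRuntimeMs,
  };
}

// Two consumers: one takes the baseline, one overrides a single field.
const backendWatchdog = resolveWatchdog();
const runnerWatchdog = resolveWatchdog({ maxRuntimeMs: 120000 });
```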
|
||||
|
||||
|
||||
## 7. Validation run in this pass
|
||||
|
||||
|
||||
Unit tests:
|
||||
|
||||
|
||||
- `pnpm vitest src/process/supervisor/registry.test.ts`
|
||||
|
||||
- `pnpm vitest src/process/supervisor/supervisor.test.ts`
|
||||
|
||||
- `pnpm vitest src/process/supervisor/supervisor.pty-command.test.ts`
|
||||
|
||||
- `pnpm vitest src/process/supervisor/adapters/child.test.ts`
|
||||
|
||||
- `pnpm vitest src/agents/cli-backends.test.ts`
|
||||
|
||||
- `pnpm vitest src/agents/bash-tools.exec.pty-cleanup.test.ts`
|
||||
|
||||
- `pnpm vitest src/agents/bash-tools.process.poll-timeout.test.ts`
|
||||
|
||||
- `pnpm vitest src/agents/bash-tools.process.supervisor.test.ts`
|
||||
|
||||
- `pnpm vitest src/process/exec.test.ts`
|
||||
|
||||
|
||||
E2E targets:
|
||||
|
||||
|
||||
- `pnpm vitest src/agents/cli-runner.test.ts`
|
||||
|
||||
- `pnpm vitest run src/agents/bash-tools.exec.pty-fallback.test.ts src/agents/bash-tools.exec.background-abort.test.ts src/agents/bash-tools.process.send-keys.test.ts`
|
||||
|
||||
|
||||
Typecheck note:
|
||||
|
||||
|
||||
- Use `pnpm build` (and `pnpm check` for full lint/docs gate) in this repo. Older notes that mention `pnpm tsgo` are obsolete.
|
||||
|
||||
|
||||
## 8. Operational guarantees preserved
|
||||
|
||||
|
||||
- Exec env hardening behavior is unchanged.
|
||||
|
||||
- Approval and allowlist flow is unchanged.
|
||||
|
||||
- Output sanitization and output caps are unchanged.
|
||||
|
||||
- PTY adapter still guarantees wait settlement on forced kill and listener disposal.
|
||||
|
||||
|
||||
## 9. Definition of done
|
||||
|
||||
|
||||
1. Supervisor is lifecycle owner for managed runs.
|
||||
|
||||
2. PTY spawn uses explicit command contract with no argv reconstruction.
|
||||
|
||||
3. Process layer has no type dependency on agent layer for supervisor stdin contracts.
|
||||
|
||||
4. Watchdog defaults are single source.
|
||||
|
||||
5. Targeted unit and e2e tests remain green.
|
||||
|
||||
6. Restart durability boundary is explicitly documented or fully implemented.
|
||||
|
||||
|
||||
## 10. Summary
|
||||
|
||||
|
||||
The branch now has a coherent and safer supervision shape:
|
||||
|
||||
|
||||
- explicit PTY contract
|
||||
|
||||
- cleaner process layering
|
||||
|
||||
- supervisor driven cancellation path for process operations
|
||||
|
||||
- real fallback termination when supervisor lookup misses
|
||||
|
||||
- process-tree cancellation for child-run default kill paths
|
||||
|
||||
- unified watchdog defaults
|
||||
|
||||
- explicit in-memory restart boundary (no orphan reconciliation across restart in this pass)
|
||||
|
||||
324
content/experiments/plans/session-binding-channel-agnostic.md
Normal file
@@ -0,0 +1,324 @@
|
||||
---
|
||||
summary: "Channel agnostic session binding architecture and iteration 1 delivery scope"
|
||||
read_when:
|
||||
- Refactoring channel-agnostic session routing and bindings
|
||||
- Investigating duplicate, stale, or missing session delivery across channels
|
||||
owner: "onutc"
|
||||
status: "in-progress"
|
||||
last_updated: "2026-02-21"
|
||||
title: "Session Binding Channel Agnostic Plan"
|
||||
---
|
||||
|
||||
# Session Binding Channel Agnostic Plan
|
||||
|
||||
|
||||
## Overview
|
||||
|
||||
|
||||
This document defines the long-term, channel-agnostic session binding model and the concrete scope for the next implementation iteration.
|
||||
|
||||
|
||||
Goal:
|
||||
|
||||
|
||||
- make subagent bound session routing a core capability
|
||||
|
||||
- keep channel specific behavior in adapters
|
||||
|
||||
- avoid regressions in normal Discord behavior
|
||||
|
||||
|
||||
## Why this exists
|
||||
|
||||
|
||||
Current behavior mixes:
|
||||
|
||||
|
||||
- completion content policy
|
||||
|
||||
- destination routing policy
|
||||
|
||||
- Discord specific details
|
||||
|
||||
|
||||
This caused edge cases such as:
|
||||
|
||||
|
||||
- duplicate main and thread delivery under concurrent runs
|
||||
|
||||
- stale token usage on reused binding managers
|
||||
|
||||
- missing activity accounting for webhook sends
|
||||
|
||||
|
||||
## Iteration 1 scope
|
||||
|
||||
|
||||
This iteration is intentionally limited.
|
||||
|
||||
|
||||
### 1. Add channel agnostic core interfaces
|
||||
|
||||
|
||||
Add core types and service interfaces for bindings and routing.
|
||||
|
||||
|
||||
Proposed core types:
|
||||
|
||||
|
||||
```ts
export type BindingTargetKind = "subagent" | "session";
export type BindingStatus = "active" | "ending" | "ended";

export type ConversationRef = {
  channel: string;
  accountId: string;
  conversationId: string;
  parentConversationId?: string;
};

export type SessionBindingRecord = {
  bindingId: string;
  targetSessionKey: string;
  targetKind: BindingTargetKind;
  conversation: ConversationRef;
  status: BindingStatus;
  boundAt: number;
  expiresAt?: number;
  metadata?: Record<string, unknown>;
};
```
|
||||
|
||||
Core service contract:
|
||||
|
||||
|
||||
```ts
export interface SessionBindingService {
  bind(input: {
    targetSessionKey: string;
    targetKind: BindingTargetKind;
    conversation: ConversationRef;
    metadata?: Record<string, unknown>;
    ttlMs?: number;
  }): Promise<SessionBindingRecord>;

  listBySession(targetSessionKey: string): SessionBindingRecord[];
  resolveByConversation(ref: ConversationRef): SessionBindingRecord | null;
  touch(bindingId: string, at?: number): void;
  unbind(input: {
    bindingId?: string;
    targetSessionKey?: string;
    reason: string;
  }): Promise<SessionBindingRecord[]>;
}
```
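To make the contract concrete, here is a minimal in-memory sketch of most of it. It is not the planned `src/infra/outbound/session-binding-service.ts`: the class name, the `binding-N` id scheme, the injected clock, the `endReason` metadata key, and the rule that expired bindings stop resolving are assumptions. `touch` is left out because this plan does not define its semantics.

```typescript
type BindingTargetKind = "subagent" | "session";
type BindingStatus = "active" | "ending" | "ended";
type ConversationRef = {
  channel: string;
  accountId: string;
  conversationId: string;
  parentConversationId?: string;
};
type SessionBindingRecord = {
  bindingId: string;
  targetSessionKey: string;
  targetKind: BindingTargetKind;
  conversation: ConversationRef;
  status: BindingStatus;
  boundAt: number;
  expiresAt?: number;
  metadata?: Record<string, unknown>;
};

// Hypothetical in-memory implementation; persistence is out of scope here.
class InMemorySessionBindingService {
  private readonly records = new Map<string, SessionBindingRecord>();
  private nextId = 1;
  private readonly now: () => number;

  // The clock is injectable so TTL behavior can be tested deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async bind(input: {
    targetSessionKey: string;
    targetKind: BindingTargetKind;
    conversation: ConversationRef;
    metadata?: Record<string, unknown>;
    ttlMs?: number;
  }): Promise<SessionBindingRecord> {
    const boundAt = this.now();
    const record: SessionBindingRecord = {
      bindingId: `binding-${this.nextId++}`,
      targetSessionKey: input.targetSessionKey,
      targetKind: input.targetKind,
      conversation: input.conversation,
      status: "active",
      boundAt,
    };
    if (input.ttlMs !== undefined) record.expiresAt = boundAt + input.ttlMs;
    if (input.metadata !== undefined) record.metadata = input.metadata;
    this.records.set(record.bindingId, record);
    return record;
  }

  listBySession(targetSessionKey: string): SessionBindingRecord[] {
    return this.live().filter((r) => r.targetSessionKey === targetSessionKey);
  }

  resolveByConversation(ref: ConversationRef): SessionBindingRecord | null {
    const match = this.live().find(
      (r) =>
        r.conversation.channel === ref.channel &&
        r.conversation.accountId === ref.accountId &&
        r.conversation.conversationId === ref.conversationId,
    );
    return match ?? null;
  }

  async unbind(input: {
    bindingId?: string;
    targetSessionKey?: string;
    reason: string;
  }): Promise<SessionBindingRecord[]> {
    const ended = Array.from(this.records.values()).filter(
      (r) => r.bindingId === input.bindingId || r.targetSessionKey === input.targetSessionKey,
    );
    return ended.map((r) => {
      this.records.delete(r.bindingId);
      return { ...r, status: "ended" as const, metadata: { ...r.metadata, endReason: input.reason } };
    });
  }

  // Assumption: only active, unexpired bindings are visible to lookups.
  private live(): SessionBindingRecord[] {
    const now = this.now();
    return Array.from(this.records.values()).filter(
      (r) => r.status === "active" && (r.expiresAt === undefined || r.expiresAt > now),
    );
  }
}
```

Nothing in the sketch mentions Discord; an adapter would only supply the `ConversationRef` values and any adapter-specific `metadata`.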
|
||||
|
||||
### 2. Add one core delivery router for subagent completions
|
||||
|
||||
|
||||
Add a single destination resolution path for completion events.
|
||||
|
||||
|
||||
Router contract:
|
||||
|
||||
|
||||
```ts
export interface BoundDeliveryRouter {
  resolveDestination(input: {
    eventKind: "task_completion";
    targetSessionKey: string;
    requester?: ConversationRef;
    failClosed: boolean;
  }): {
    binding: SessionBindingRecord | null;
    mode: "bound" | "fallback";
    reason: string;
  };
}
```
|
||||
|
||||
For this iteration:
|
||||
|
||||
|
||||
- only `task_completion` is routed through this new path
|
||||
|
||||
- existing paths for other event kinds remain as-is
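One way the resolution step could behave is sketched below. The reason strings, the requester-matching rule (same channel and account), and the reading of `failClosed` as "do not guess when the requester does not match" are this sketch's interpretation; the plan does not specify them. `BindingRecord` is trimmed to the fields the sketch reads.

```typescript
type ConversationRef = {
  channel: string;
  accountId: string;
  conversationId: string;
  parentConversationId?: string;
};

// Trimmed to the fields this sketch reads.
type BindingRecord = {
  bindingId: string;
  targetSessionKey: string;
  conversation: ConversationRef;
  status: "active" | "ending" | "ended";
};

type Resolution = {
  binding: BindingRecord | null;
  mode: "bound" | "fallback";
  reason: string; // hypothetical reason codes, kept explicit for logs
};

function resolveDestination(
  bindings: BindingRecord[],
  input: {
    eventKind: "task_completion";
    targetSessionKey: string;
    requester?: ConversationRef;
    failClosed: boolean;
  },
): Resolution {
  const active = bindings.filter(
    (b) => b.targetSessionKey === input.targetSessionKey && b.status === "active",
  );
  if (active.length === 0) {
    return { binding: null, mode: "fallback", reason: "no-active-binding" };
  }
  const requester = input.requester;
  if (requester !== undefined) {
    const match = active.find(
      (b) =>
        b.conversation.channel === requester.channel &&
        b.conversation.accountId === requester.accountId,
    );
    if (match !== undefined) {
      return { binding: match, mode: "bound", reason: "requester-match" };
    }
    // Assumed meaning of failClosed: never guess across a requester mismatch.
    if (input.failClosed) {
      return { binding: null, mode: "fallback", reason: "requester-mismatch" };
    }
  }
  const only = active.length === 1 ? active[0] : undefined;
  if (only !== undefined) {
    return { binding: only, mode: "bound", reason: "single-active-binding" };
  }
  return { binding: null, mode: "fallback", reason: "ambiguous-bindings" };
}
```

Every branch returns a `reason`, so a fallback is a visible decision rather than a silent reroute.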
|
||||
|
||||
|
||||
### 3. Keep Discord as adapter
|
||||
|
||||
|
||||
Discord remains the first adapter implementation.
|
||||
|
||||
|
||||
Adapter responsibilities:
|
||||
|
||||
|
||||
- create/reuse thread conversations
|
||||
|
||||
- send bound messages via webhook or channel send
|
||||
|
||||
- validate thread state (archived/deleted)
|
||||
|
||||
- map adapter metadata (webhook identity, thread ids)
|
||||
|
||||
|
||||
### 4. Fix currently known correctness issues
|
||||
|
||||
|
||||
Required in this iteration:
|
||||
|
||||
|
||||
- refresh token usage when reusing existing thread binding manager
|
||||
|
||||
- record outbound activity for webhook based Discord sends
|
||||
|
||||
- stop implicit main channel fallback when a bound thread destination is selected for session mode completion
|
||||
|
||||
|
||||
### 5. Preserve current runtime safety defaults
|
||||
|
||||
|
||||
No behavior change for users with thread bound spawn disabled.
|
||||
|
||||
|
||||
Defaults stay:
|
||||
|
||||
|
||||
- `channels.discord.threadBindings.spawnSubagentSessions = false`
|
||||
|
||||
|
||||
Result:
|
||||
|
||||
|
||||
- normal Discord users stay on current behavior
|
||||
|
||||
- new core path affects only bound session completion routing where enabled
|
||||
|
||||
|
||||
## Not in iteration 1
|
||||
|
||||
|
||||
Explicitly deferred:
|
||||
|
||||
|
||||
- ACP binding targets (`targetKind: "acp"`)
|
||||
|
||||
- new channel adapters beyond Discord
|
||||
|
||||
- global replacement of all delivery paths (`spawn_ack`, future `subagent_message`)
|
||||
|
||||
- protocol level changes
|
||||
|
||||
- store migration/versioning redesign for all binding persistence
|
||||
|
||||
|
||||
Notes on ACP:
|
||||
|
||||
|
||||
- interface design keeps room for ACP
|
||||
|
||||
- ACP implementation is not started in this iteration
|
||||
|
||||
|
||||
## Routing invariants
|
||||
|
||||
|
||||
These invariants are mandatory for iteration 1.
|
||||
|
||||
|
||||
- destination selection and content generation are separate steps
|
||||
|
||||
- if session mode completion resolves to an active bound destination, delivery must target that destination
|
||||
|
||||
- no hidden reroute from bound destination to main channel
|
||||
|
||||
- fallback behavior must be explicit and observable
|
||||
|
||||
|
||||
## Compatibility and rollout
|
||||
|
||||
|
||||
Compatibility target:
|
||||
|
||||
|
||||
- no regression for users with thread bound spawning off
|
||||
|
||||
- no change to non-Discord channels in this iteration
|
||||
|
||||
|
||||
Rollout:
|
||||
|
||||
|
||||
1. Land interfaces and router behind current feature gates.
|
||||
|
||||
2. Route Discord completion mode bound deliveries through router.
|
||||
|
||||
3. Keep legacy path for non-bound flows.
|
||||
|
||||
4. Verify with targeted tests and canary runtime logs.
|
||||
|
||||
|
||||
## Tests required in iteration 1
|
||||
|
||||
|
||||
Unit and integration coverage required:
|
||||
|
||||
|
||||
- manager token rotation uses latest token after manager reuse
|
||||
|
||||
- webhook sends update channel activity timestamps
|
||||
|
||||
- two active bound sessions in same requester channel do not duplicate to main channel
|
||||
|
||||
- completion for bound session mode run resolves to thread destination only
|
||||
|
||||
- disabled spawn flag keeps legacy behavior unchanged
|
||||
|
||||
|
||||
## Proposed implementation files
|
||||
|
||||
|
||||
Core:
|
||||
|
||||
|
||||
- `src/infra/outbound/session-binding-service.ts` (new)
|
||||
|
||||
- `src/infra/outbound/bound-delivery-router.ts` (new)
|
||||
|
||||
- `src/agents/subagent-announce.ts` (completion destination resolution integration)
|
||||
|
||||
|
||||
Discord adapter and runtime:
|
||||
|
||||
|
||||
- `src/discord/monitor/thread-bindings.manager.ts`
|
||||
|
||||
- `src/discord/monitor/reply-delivery.ts`
|
||||
|
||||
- `src/discord/send.outbound.ts`
|
||||
|
||||
|
||||
Tests:
|
||||
|
||||
|
||||
- `src/discord/monitor/provider*.test.ts`
|
||||
|
||||
- `src/discord/monitor/reply-delivery.test.ts`
|
||||
|
||||
- `src/agents/subagent-announce.format.test.ts`
|
||||
|
||||
|
||||
## Done criteria for iteration 1
|
||||
|
||||
|
||||
- core interfaces exist and are wired for completion routing
|
||||
|
||||
- correctness fixes above are merged with tests
|
||||
|
||||
- no main and thread duplicate completion delivery in session mode bound runs
|
||||
|
||||
- no behavior change for disabled bound spawn deployments
|
||||
|
||||
- ACP remains explicitly deferred
|
||||
|
||||
42
content/experiments/proposals/model-config.md
Normal file
@@ -0,0 +1,42 @@
|
||||
---
read_when:
  - Exploring future approaches to model selection and auth profiles
summary: "Exploration: model config, auth profiles, and fallback behavior"
title: Model Config Exploration
x-i18n:
  generated_at: "2026-02-01T20:25:05Z"
  model: claude-opus-4-5
  provider: pi
  source_hash: 48623233d80f874c0ae853b51f888599cf8b50ae6fbfe47f6d7b0216bae9500b
  source_path: experiments/proposals/model-config.md
  workflow: 14
---
|
||||
|
||||
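# Model Config (Exploration)

This document captures **ideas** for future model configuration. It is not a shipping spec. For current behavior, see:

- [Models](/concepts/models)
- [Model failover](/concepts/model-failover)
- [OAuth + profiles](/concepts/oauth)

## Motivation

Operators want:

- Multiple auth profiles per provider (personal vs. work).
- Simple `/model` selection with predictable fallback behavior.
- Clear separation between text models and image models.

## Possible direction (high level)

- Keep model selection simple: `provider/model` with optional aliases.
- Let providers have multiple auth profiles, with an explicit order.
- Use a global fallback list so all sessions fail over consistently.
- Only override image routing when explicitly configured.

## Open questions

- Should profile rotation be per provider or per model?
- How should the UI surface profile selection for a session?
- What is the safest migration path from legacy config keys?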
|
||||
235
content/experiments/research/memory.md
Normal file
@@ -0,0 +1,235 @@
|
||||
---
read_when:
  - Designing workspace memory (~/.openclaw/workspace) beyond daily Markdown logs
  - "Deciding: standalone CLI vs deep OpenClaw integration"
  - Adding offline recall + reflection (retain/recall/reflect)
summary: "Research notes: offline memory system for Clawd workspaces (Markdown source of truth + derived index)"
title: Workspace Memory Research
x-i18n:
  generated_at: "2026-02-03T10:06:14Z"
  model: claude-opus-4-5
  provider: pi
  source_hash: 1753c8ee6284999fab4a94ff5fae7421c85233699c9d3088453d0c2133ac0feb
  source_path: experiments/research/memory.md
  workflow: 15
---
|
||||
|
||||
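# Workspace Memory v2 (offline): research notes

Target: a Clawd-style workspace (`agents.defaults.workspace`, default `~/.openclaw/workspace`) where "memory" is stored as one Markdown file per day (`memory/YYYY-MM-DD.md`) plus a small set of stable files (e.g. `memory.md`, `SOUL.md`).

This document proposes an **offline-first** memory architecture that keeps Markdown as the canonical, reviewable source of truth, but adds **structured recall** (search, entity summaries, confidence updates) through a derived index.

## Why change?

The current setup (one file per day) is excellent for:

- "append-only" journaling
- human editing
- git-backed durability + auditability
- low-friction capture ("just write it down")

But it is weak for:

- high-recall retrieval ("what did we decide about X?", "last time we tried Y?")
- entity-centric answers ("tell me about Alice / The Castle / warelay") without rereading many files
- opinion/preference stability (and evidence when it changes)
- time constraints ("what was true during Nov 2025?") and conflict resolution

## Design goals

- **Offline**: works without a network; can run on a laptop/Castle; no cloud dependency.
- **Explainable**: retrieved items should be attributable (file + location) and separable from inference.
- **Low ceremony**: daily logging stays Markdown, with no heavy schema work.
- **Incremental**: v1 is useful with FTS alone; semantic/vector search and graphs are optional upgrades.
- **Agent-friendly**: makes "recall within a token budget" easy (return small bundles of facts).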
|
||||
|
||||
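## North star model (Hindsight × Letta)

Two pieces to blend:

1. **Letta/MemGPT-style control loop**
   - keep a small "core" always in context (persona + key user facts)
   - everything else is out of context and retrieved via tools
   - memory writes are explicit tool calls (append/replace/insert), persisted, then re-injected on the next turn

2. **Hindsight-style memory substrate**
   - separate what is observed, what is believed, and what is summarized
   - support retain/recall/reflect
   - confidence-bearing opinions that can evolve with evidence
   - entity-aware retrieval + temporal queries (even without a full knowledge graph)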
|
||||
|
||||
## Proposed architecture (Markdown source of truth + derived index)

### Canonical store (git-friendly)

Keep `~/.openclaw/workspace` as the canonical human-readable memory.

Suggested workspace layout:

```
~/.openclaw/workspace/
  memory.md                    # small: durable facts + preferences (core-ish)
  memory/
    YYYY-MM-DD.md              # daily log (append; narrative)
  bank/                        # "typed" memory pages (stable, reviewable)
    world.md                   # objective facts about the world
    experience.md              # what the agent did (first person)
    opinions.md                # subjective preferences/judgments + confidence + evidence pointers
    entities/
      Peter.md
      The-Castle.md
      warelay.md
      ...
```

Notes:

- **The daily log stays a daily log**. There is no need to turn it into JSON.
- The `bank/` files are **curated**, produced by reflection jobs, and can still be edited by hand.
- `memory.md` remains "small + core-ish": the things you want Clawd to see in every session.

### Derived store (machine recall)

Add a derived index under the workspace (not necessarily tracked by git):

```
~/.openclaw/workspace/.memory/index.sqlite
```

Back it with:

- a SQLite schema for facts + entity links + opinion metadata
- SQLite **FTS5** for lexical recall (fast, tiny, offline)
- an optional embeddings table for semantic recall (still offline)

The index is always **rebuildable from Markdown**.
|
||||
|
||||
## Retain / Recall / Reflect (operational loop)

### Retain: normalize daily logs into "facts"

The key Hindsight insight that matters here: store **narrative, self-contained facts**, not tiny snippets.

Practical rule for `memory/YYYY-MM-DD.md`:

- at the end of the day (or during it), add a `## Retain` section with 2-5 bullets that are:
  - narrative (cross-turn context preserved)
  - self-contained (meaningful when read standalone)
  - tagged with type + entity mentions

Example:

```
## Retain
- W @Peter: Currently in Marrakech (Nov 27–Dec 1, 2025) for Andy's birthday.
- B @warelay: I fixed the Baileys WS crash by wrapping connection.update handlers in try/catch (see memory/2025-11-27.md).
- O(c=0.95) @Peter: Prefers concise replies (<1500 chars) on WhatsApp; long content goes into files.
```

Minimal parsing:

- Type prefix: `W` (world), `B` (experience/biographical), `O` (opinion), `S` (observation/summary; usually generated)
- Entities: `@Peter`, `@warelay`, etc. (slugs map to `bank/entities/*.md`)
- Opinion confidence: `O(c=0.0..1.0)`, optional
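The minimal parsing rules above can be sketched as a single-line parser. The `RetainedFact` shape, the function name, and the exact regular expression are assumptions for illustration; the notes only define the prefix letters, the `@entity` mentions, and the optional `c=` confidence.

```typescript
type RetainKind = "world" | "experience" | "opinion" | "observation";

// Hypothetical record shape for one parsed `## Retain` bullet.
interface RetainedFact {
  kind: RetainKind;
  confidence?: number;
  entities: string[];
  content: string;
}

const KIND_BY_PREFIX: Record<string, RetainKind> = {
  W: "world",
  B: "experience",
  O: "opinion",
  S: "observation",
};

// Matches e.g. "- O(c=0.95) @Peter: Prefers concise replies"
const RETAIN_LINE =
  /^-\s+([WBOS])(?:\(c=([0-9]*\.?[0-9]+)\))?\s+((?:@[\w-]+\s*)+):\s*(.+)$/;

function parseRetainLine(line: string): RetainedFact | null {
  const m = RETAIN_LINE.exec(line.trim());
  if (m === null) return null;
  const kind = KIND_BY_PREFIX[m[1] ?? ""];
  if (kind === undefined) return null;
  const entities = (m[3] ?? "")
    .trim()
    .split(/\s+/)
    .filter((mention) => mention.length > 1)
    .map((mention) => mention.slice(1)); // drop the leading "@"
  const fact: RetainedFact = { kind, entities, content: (m[4] ?? "").trim() };
  if (m[2] !== undefined) {
    const confidence = Number(m[2]);
    // Confidence must stay inside [0, 1]; anything else is treated as malformed.
    if (confidence > 1) return null;
    fact.confidence = confidence;
  }
  return fact;
}
```

Lines that do not match return `null`, so ordinary prose in a daily log is ignored rather than misread as a fact.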
|
||||
|
||||
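If you do not want authors to think about this: the reflection job can infer these bullets from the rest of the log, but an explicit `## Retain` section is the easiest "quality lever".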
|
||||
|
||||
### Recall: queries over the derived index

Recall should support:

- **lexical**: "find exact terms / names / commands" (FTS5)
- **entity**: "tell me about X" (entity pages + entity-linked facts)
- **temporal**: "what happened around Nov 27" / "since last week"
- **opinion**: "what does Peter prefer?" (with confidence + evidence)

The return format should be agent-friendly and cite sources:

- `kind` (`world|experience|opinion|observation`)
- `timestamp` (source day, or the extracted time range if present)
- `entities` (`["Peter","warelay"]`)
- `content` (the narrative fact)
- `source` (`memory/2025-11-27.md#L12`, etc.)
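The return format above maps naturally onto a small record type. The sketch below also shows one way to honor a budget when handing facts to an agent; `packWithinBudget` and the use of a character count as a stand-in for tokens are assumptions, not part of these notes.

```typescript
type RecallKind = "world" | "experience" | "opinion" | "observation";

interface RecallItem {
  kind: RecallKind;
  timestamp: string; // source day, or an extracted time range
  entities: string[];
  content: string; // the narrative fact
  source: string; // e.g. "memory/2025-11-27.md#L12"
}

// Keeps items in their ranked order until the character budget is used up.
// Characters stand in for tokens here; a real budget would use a tokenizer.
function packWithinBudget(items: RecallItem[], maxChars: number): string[] {
  const lines: string[] = [];
  let used = 0;
  for (const item of items) {
    const line = `[${item.kind} ${item.timestamp}] ${item.content} (${item.source})`;
    if (used + line.length > maxChars) break;
    lines.push(line);
    used += line.length;
  }
  return lines;
}
```

Each emitted line carries its own `source`, so the agent can cite the file and location without a second lookup.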
|
||||
|
||||
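### Reflect: produce stable pages + update beliefs

Reflection is a scheduled job (daily, or a heartbeat `ultrathink`) that:

- updates `bank/entities/*.md` from recent facts (entity summaries)
- updates `bank/opinions.md` confidence based on reinforcement/contradiction
- optionally proposes edits to `memory.md` ("core-ish" durable facts)

Opinion evolution (simple, explainable):

- each opinion has:
  - a statement
  - a confidence `c ∈ [0,1]`
  - last_updated
  - evidence links (supporting + contradicting fact IDs)
- when new facts arrive:
  - find candidate opinions by entity overlap + similarity (FTS first, embeddings later)
  - update confidence in small increments; big jumps require strong contradiction + repeated evidence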
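The "small increments, big jumps only with repeated evidence" rule can be sketched as below. The step sizes, the cap, and the idea of scaling the decrease by the number of contradicting facts are illustrative assumptions; the notes do not fix a formula.

```typescript
interface Opinion {
  statement: string;
  confidence: number; // c in [0, 1]
  lastUpdated: string;
  supporting: string[]; // fact IDs
  contradicting: string[]; // fact IDs
}

// Hypothetical tuning constants.
const STEP = 0.05;
const MAX_STEP = 0.3;

function applyEvidence(
  opinion: Opinion,
  factId: string,
  kind: "supports" | "contradicts",
  at: string,
): Opinion {
  const supporting = kind === "supports" ? [...opinion.supporting, factId] : opinion.supporting;
  const contradicting =
    kind === "contradicts" ? [...opinion.contradicting, factId] : opinion.contradicting;
  // Each repeated contradiction takes a larger (capped) step, so a single
  // conflicting fact nudges the belief while repeated evidence can move it far.
  const delta =
    kind === "supports" ? STEP : -Math.min(STEP * contradicting.length, MAX_STEP);
  const confidence = Math.min(1, Math.max(0, opinion.confidence + delta));
  return { ...opinion, confidence, lastUpdated: at, supporting, contradicting };
}
```

Because every update appends a fact ID, the current confidence can always be explained by listing the evidence behind it.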
|
||||
|
||||
## CLI integration: standalone vs. deep integration

Recommendation: **deep integration into OpenClaw**, but keep a separable core library.

### Why integrate into OpenClaw?

- OpenClaw already knows:
  - the workspace path (`agents.defaults.workspace`)
  - the session model + heartbeats
  - logging + troubleshooting patterns
- You want the agent itself to call the tools:
  - `openclaw memory recall "…" --k 25 --since 30d`
  - `openclaw memory reflect --since 7d`

### Why still split out a library?

- keeps the memory logic testable without the Gateway/runtime
- reusable from other contexts (local scripts, a future desktop app, etc.)

Shape:
The memory tooling is expected to be a small CLI + library layer, but this is exploratory only.
|
||||
|
||||
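## "S-Collide" / SuCo: when to use it (research)

If "S-Collide" refers to **SuCo (Subspace Collision)**: it is an ANN retrieval approach that targets strong recall/latency tradeoffs by using learned/structured collisions in subspaces (paper: arXiv 2411.14754, 2024).

Pragmatic take for `~/.openclaw/workspace`:

- **do not start** with SuCo.
- start with SQLite FTS + (optional) simple embeddings; you will get most of the UX wins immediately.
- consider SuCo/HNSW/ScaNN-class solutions only once:
  - the corpus is big (tens or hundreds of thousands of chunks)
  - brute-force embedding search becomes too slow
  - recall quality is clearly bottlenecked by lexical search

Offline-friendly alternatives (in increasing complexity):

- SQLite FTS5 + metadata filters (zero ML)
- embeddings + brute-force search (works surprisingly well if the chunk count is low)
- an HNSW index (common, robust; needs a library binding)
- SuCo (research-grade; attractive if there is a solid implementation you can embed)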
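The "embeddings + brute-force search" option in the list above is small enough to sketch in full. The `EmbeddedChunk` shape and the function names are assumptions, and the vectors would come from whichever local embedding model is chosen; nothing here depends on a specific one.

```typescript
// Hypothetical shape: one embedded chunk with a citation back to Markdown.
interface EmbeddedChunk {
  source: string; // e.g. "memory/2025-11-27.md#L12"
  vector: number[];
}

// Cosine similarity; returns 0 when either vector has zero length or norm.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every chunk against the query and keeps the k best: O(n) per query,
// with no index to build or keep in sync.
function topK(
  query: number[],
  chunks: EmbeddedChunk[],
  k: number,
): Array<{ source: string; score: number }> {
  return chunks
    .map((chunk) => ({ source: chunk.source, score: cosine(query, chunk.vector) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k);
}
```

With no index structure there is nothing to rebuild when the Markdown changes, which fits the rule that the derived store is always reconstructible.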
|
||||
|
||||
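Open question:

- what is the **best** offline embedding model for "personal assistant memory" on your machines (laptop + desktop)?
  - if you already have Ollama: embed with a local model; otherwise ship a small embedding model in the toolchain.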
|
||||
|
||||
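## Smallest useful pilot

If you want a minimal version that is still useful:

- Add `bank/` entity pages and a `## Retain` section in daily logs.
- Use SQLite FTS for recall with citations (path + line numbers).
- Add embeddings only if recall quality or scale demands it.

## References

- Letta / MemGPT concepts: "core memory blocks" + "archival memory" + tool-driven self-editing memory.
- Hindsight Technical Report: "retain / recall / reflect", four-network memory, narrative fact extraction, opinion confidence evolution.
- SuCo: arXiv 2411.14754 (2024): "Subspace Collision" approximate nearest neighbor retrieval.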
|
||||