first commit

2026-02-28 23:01:30 +08:00
commit 3956ee4806
415 changed files with 74538 additions and 0 deletions
--- a/content/security/CONTRIBUTING-THREAT-MODEL.md
+++ b/content/security/CONTRIBUTING-THREAT-MODEL.md
@@ -0,0 +1,90 @@
+# Contributing to the OpenClaw Threat Model
+
+Thanks for helping make OpenClaw more secure. This threat model is a living document and we welcome contributions from anyone - you don't need to be a security expert.
+
+## Ways to Contribute
+
+### Add a Threat
+
+Spotted an attack vector or risk we haven't covered? Open an issue on [openclaw/trust](https://github.com/openclaw/trust/issues) and describe it in your own words. You don't need to know any frameworks or fill in every field - just describe the scenario.
+
+**Helpful to include (but not required):**
+
+- The attack scenario and how it could be exploited
+- Which parts of OpenClaw are affected (CLI, gateway, channels, ClawHub, MCP servers, etc.)
+- How severe you think it is (low / medium / high / critical)
+- Any links to related research, CVEs, or real-world examples
+
+We'll handle the ATLAS mapping, threat IDs, and risk assessment during review. If you want to include those details, great - but it's not expected.
+
+> **This is for adding to the threat model, not reporting live vulnerabilities.** If you've found an exploitable vulnerability, see our [Trust page](https://trust.openclaw.ai) for responsible disclosure instructions.
+
+### Suggest a Mitigation
+
+Have an idea for how to address an existing threat? Open an issue or PR referencing the threat. Useful mitigations are specific and actionable - for example, "per-sender rate limiting of 10 messages/minute at the gateway" is better than "implement rate limiting."
+
+### Propose an Attack Chain
+
+Attack chains show how multiple threats combine into a realistic attack scenario. If you see a dangerous combination, describe the steps and how an attacker would chain them together. A short narrative of how the attack unfolds in practice is more valuable than a formal template.
+
+### Fix or Improve Existing Content
+
+Typos, clarifications, outdated info, better examples - PRs welcome, no issue needed.
+
+## What We Use
+
+### MITRE ATLAS
+
+This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/) (Adversarial Threat Landscape for AI Systems), a framework designed specifically for AI/ML threats like prompt injection, tool misuse, and agent exploitation. You don't need to know ATLAS to contribute - we map submissions to the framework during review.
+
+### Threat IDs
+
+Each threat gets an ID like `T-EXEC-003`. The categories are:
+
+| Code    | Category                                   |
+| ------- | ------------------------------------------ |
+| RECON   | Reconnaissance - information gathering     |
+| ACCESS  | Initial access - gaining entry             |
+| EXEC    | Execution - running malicious actions      |
+| PERSIST | Persistence - maintaining access           |
+| EVADE   | Defense evasion - avoiding detection       |
+| DISC    | Discovery - learning about the environment |
+| EXFIL   | Exfiltration - stealing data               |
+| IMPACT  | Impact - damage or disruption              |
+
+IDs are assigned by maintainers during review. You don't need to pick one.
+
+### Risk Levels
+
+| Level        | Meaning                                                           |
+| ------------ | ----------------------------------------------------------------- |
+| **Critical** | Full system compromise, or high likelihood + critical impact      |
+| **High**     | Significant damage likely, or medium likelihood + critical impact |
+| **Medium**   | Moderate risk, or low likelihood + high impact                    |
+| **Low**      | Unlikely and limited impact                                       |
+
+If you're unsure about the risk level, just describe the impact and we'll assess it.
+
+## Review Process
+
+1. **Triage** - We review new submissions within 48 hours
+2. **Assessment** - We verify feasibility, assign ATLAS mapping and threat ID, validate risk level
+3. **Documentation** - We ensure everything is formatted and complete
+4. **Merge** - Added to the threat model and visualization
+
+## Resources
+
+- [ATLAS Website](https://atlas.mitre.org/)
+- [ATLAS Techniques](https://atlas.mitre.org/techniques/)
+- [ATLAS Case Studies](https://atlas.mitre.org/studies/)
+- [OpenClaw Threat Model](./THREAT-MODEL-ATLAS.md)
+
+## Contact
+
+- **Security vulnerabilities:** See our [Trust page](https://trust.openclaw.ai) for reporting instructions
+- **Threat model questions:** Open an issue on [openclaw/trust](https://github.com/openclaw/trust/issues)
+- **General chat:** Discord #security channel
+
+## Recognition
+
+Contributors to the threat model are recognized in the threat model acknowledgments, release notes, and the OpenClaw security hall of fame for significant contributions.
--- a/content/security/README.md
+++ b/content/security/README.md
@@ -0,0 +1,17 @@
+# OpenClaw Security & Trust
+
+**Live:** [trust.openclaw.ai](https://trust.openclaw.ai)
+
+## Documents
+
+- [Threat Model](./THREAT-MODEL-ATLAS.md) - MITRE ATLAS-based threat model for the OpenClaw ecosystem
+- [Contributing to the Threat Model](./CONTRIBUTING-THREAT-MODEL.md) - How to add threats, mitigations, and attack chains
+
+## Reporting Vulnerabilities
+
+See the [Trust page](https://trust.openclaw.ai) for full reporting instructions covering all repos.
+
+## Contact
+
+- **Jamieson O'Reilly** ([@theonejvo](https://twitter.com/theonejvo)) - Security & Trust
+- Discord: #security channel
--- a/content/security/THREAT-MODEL-ATLAS.md
+++ b/content/security/THREAT-MODEL-ATLAS.md
@@ -0,0 +1,603 @@
+# OpenClaw Threat Model v1.0
+
+## MITRE ATLAS Framework
+
+**Version:** 1.0-draft
+**Last Updated:** 2026-02-04
+**Methodology:** MITRE ATLAS + Data Flow Diagrams
+**Framework:** [MITRE ATLAS](https://atlas.mitre.org/) (Adversarial Threat Landscape for AI Systems)
+
+### Framework Attribution
+
+This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/), the industry-standard framework for documenting adversarial threats to AI/ML systems. ATLAS is maintained by [MITRE](https://www.mitre.org/) in collaboration with the AI security community.
+
+**Key ATLAS Resources:**
+
+- [ATLAS Techniques](https://atlas.mitre.org/techniques/)
+- [ATLAS Tactics](https://atlas.mitre.org/tactics/)
+- [ATLAS Case Studies](https://atlas.mitre.org/studies/)
+- [ATLAS GitHub](https://github.com/mitre-atlas/atlas-data)
+- [Contributing to ATLAS](https://atlas.mitre.org/resources/contribute)
+
+### Contributing to This Threat Model
+
+This is a living document maintained by the OpenClaw community. See [CONTRIBUTING-THREAT-MODEL.md](./CONTRIBUTING-THREAT-MODEL.md) for guidelines on contributing:
+
+- Reporting new threats
+- Updating existing threats
+- Proposing attack chains
+- Suggesting mitigations
+
+---
+
+## 1. Introduction
+
+### 1.1 Purpose
+
+This threat model documents adversarial threats to the OpenClaw AI agent platform and ClawHub skill marketplace, using the MITRE ATLAS framework designed specifically for AI/ML systems.
+
+### 1.2 Scope
+
+| Component              | Included | Notes                                            |
+| ---------------------- | -------- | ------------------------------------------------ |
+| OpenClaw Agent Runtime | Yes      | Core agent execution, tool calls, sessions       |
+| Gateway                | Yes      | Authentication, routing, channel integration     |
+| Channel Integrations   | Yes      | WhatsApp, Telegram, Discord, Signal, Slack, etc. |
+| ClawHub Marketplace    | Yes      | Skill publishing, moderation, distribution       |
+| MCP Servers            | Yes      | External tool providers                          |
+| User Devices           | Partial  | Mobile apps, desktop clients                     |
+
+### 1.3 Out of Scope
+
+Nothing is explicitly out of scope for this threat model.
+
+---
+
+## 2. System Architecture
+
+### 2.1 Trust Boundaries
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                    UNTRUSTED ZONE                                │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
+│  │  WhatsApp   │  │  Telegram   │  │   Discord   │  ...         │
+│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘              │
+│         │                │                │                      │
+└─────────┼────────────────┼────────────────┼──────────────────────┘
+          │                │                │
+          ▼                ▼                ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                 TRUST BOUNDARY 1: Channel Access                 │
+│  ┌──────────────────────────────────────────────────────────┐   │
+│  │                      GATEWAY                              │   │
+│  │  • Device Pairing (30s grace period)                      │   │
+│  │  • AllowFrom / AllowList validation                       │   │
+│  │  • Token/Password/Tailscale auth                          │   │
+│  └──────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                 TRUST BOUNDARY 2: Session Isolation              │
+│  ┌──────────────────────────────────────────────────────────┐   │
+│  │                   AGENT SESSIONS                          │   │
+│  │  • Session key = agent:channel:peer                       │   │
+│  │  • Tool policies per agent                                │   │
+│  │  • Transcript logging                                     │   │
+│  └──────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                 TRUST BOUNDARY 3: Tool Execution                 │
+│  ┌──────────────────────────────────────────────────────────┐   │
+│  │                  EXECUTION SANDBOX                        │   │
+│  │  • Docker sandbox OR Host (exec-approvals)                │   │
+│  │  • Node remote execution                                  │   │
+│  │  • SSRF protection (DNS pinning + IP blocking)            │   │
+│  └──────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                 TRUST BOUNDARY 4: External Content               │
+│  ┌──────────────────────────────────────────────────────────┐   │
+│  │              FETCHED URLs / EMAILS / WEBHOOKS             │   │
+│  │  • External content wrapping (XML tags)                   │   │
+│  │  • Security notice injection                              │   │
+│  └──────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                 TRUST BOUNDARY 5: Supply Chain                   │
+│  ┌──────────────────────────────────────────────────────────┐   │
+│  │                      CLAWHUB                              │   │
+│  │  • Skill publishing (semver, SKILL.md required)           │   │
+│  │  • Pattern-based moderation flags                         │   │
+│  │  • VirusTotal scanning (coming soon)                      │   │
+│  │  • GitHub account age verification                        │   │
+│  └──────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+### 2.2 Data Flows
+
+| Flow | Source  | Destination | Data               | Protection           |
+| ---- | ------- | ----------- | ------------------ | -------------------- |
+| F1   | Channel | Gateway     | User messages      | TLS, AllowFrom       |
+| F2   | Gateway | Agent       | Routed messages    | Session isolation    |
+| F3   | Agent   | 工具 | Tool invocations   | Policy enforcement   |
+| F4   | Agent   | External    | web_fetch requests | SSRF blocking        |
+| F5   | ClawHub | Agent       | Skill code         | Moderation, scanning |
+| F6   | Agent   | Channel     | Responses          | Output filtering     |
+
+---
+
+## 3. Threat Analysis by ATLAS Tactic
+
+### 3.1 Reconnaissance (AML.TA0002)
+
+#### T-RECON-001: Agent Endpoint Discovery
+
+| Attribute               | Value                                                                |
+| ----------------------- | -------------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0006 - Active Scanning                                          |
+| **Description**         | Attacker scans for exposed OpenClaw gateway endpoints                |
+| **Attack Vector**       | Network scanning, shodan queries, DNS enumeration                    |
+| **Affected Components** | Gateway, exposed API endpoints                                       |
+| **Current Mitigations** | Tailscale auth option, bind to loopback by default                   |
+| **Residual Risk**       | Medium - Public gateways discoverable                                |
+| **Recommendations**     | Document secure deployment, add rate limiting on discovery endpoints |
+
+#### T-RECON-002: Channel Integration Probing
+
+| Attribute               | Value                                                              |
+| ----------------------- | ------------------------------------------------------------------ |
+| **ATLAS ID**            | AML.T0006 - Active Scanning                                        |
+| **Description**         | Attacker probes messaging channels to identify AI-managed accounts |
+| **Attack Vector**       | Sending test messages, observing response patterns                 |
+| **Affected Components** | All channel integrations                                           |
+| **Current Mitigations** | None specific                                                      |
+| **Residual Risk**       | Low - Limited value from discovery alone                           |
+| **Recommendations**     | Consider response timing randomization                             |
+
+---
+
+### 3.2 Initial Access (AML.TA0004)
+
+#### T-ACCESS-001: Pairing Code Interception
+
+| Attribute               | Value                                                    |
+| ----------------------- | -------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0040 - AI Model Inference API Access                |
+| **Description**         | Attacker intercepts pairing code during 30s grace period |
+| **Attack Vector**       | Shoulder surfing, network sniffing, social engineering   |
+| **Affected Components** | Device pairing system                                    |
+| **Current Mitigations** | 30s expiry, codes sent via existing channel              |
+| **Residual Risk**       | Medium - Grace period exploitable                        |
+| **Recommendations**     | Reduce grace period, add confirmation step               |
+
+#### T-ACCESS-002: AllowFrom Spoofing
+
+| Attribute               | Value                                                                          |
+| ----------------------- | ------------------------------------------------------------------------------ |
+| **ATLAS ID**            | AML.T0040 - AI Model Inference API Access                                      |
+| **Description**         | Attacker spoofs allowed sender identity in channel                             |
+| **Attack Vector**       | Depends on channel - phone number spoofing, username impersonation             |
+| **Affected Components** | AllowFrom validation per channel                                               |
+| **Current Mitigations** | Channel-specific identity verification                                         |
+| **Residual Risk**       | Medium - Some channels vulnerable to spoofing                                  |
+| **Recommendations**     | Document channel-specific risks, add cryptographic verification where possible |
+
+#### T-ACCESS-003: Token Theft
+
+| Attribute               | Value                                                       |
+| ----------------------- | ----------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0040 - AI Model Inference API Access                   |
+| **Description**         | Attacker steals authentication tokens from config files     |
+| **Attack Vector**       | Malware, unauthorized device access, config backup exposure |
+| **Affected Components** | ~/.openclaw/credentials/, config storage                    |
+| **Current Mitigations** | File permissions                                            |
+| **Residual Risk**       | High - Tokens stored in plaintext                           |
+| **Recommendations**     | Implement token encryption at rest, add token rotation      |
+
+---
+
+### 3.3 Execution (AML.TA0005)
+
+#### T-EXEC-001: Direct Prompt Injection
+
+| Attribute               | Value                                                                                     |
+| ----------------------- | ----------------------------------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0051.000 - LLM Prompt Injection: Direct                                              |
+| **Description**         | Attacker sends crafted prompts to manipulate agent behavior                               |
+| **Attack Vector**       | Channel messages containing adversarial instructions                                      |
+| **Affected Components** | Agent LLM, all input surfaces                                                             |
+| **Current Mitigations** | Pattern detection, external content wrapping                                              |
+| **Residual Risk**       | Critical - Detection only, no blocking; sophisticated attacks bypass                      |
+| **Recommendations**     | Implement multi-layer defense, output validation, user confirmation for sensitive actions |
+
+#### T-EXEC-002: Indirect Prompt Injection
+
+| Attribute               | Value                                                       |
+| ----------------------- | ----------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0051.001 - LLM Prompt Injection: Indirect              |
+| **Description**         | Attacker embeds malicious instructions in fetched content   |
+| **Attack Vector**       | Malicious URLs, poisoned emails, compromised webhooks       |
+| **Affected Components** | web_fetch, email ingestion, external data sources           |
+| **Current Mitigations** | Content wrapping with XML tags and security notice          |
+| **Residual Risk**       | High - LLM may ignore wrapper instructions                  |
+| **Recommendations**     | Implement content sanitization, separate execution contexts |
+
+#### T-EXEC-003: Tool Argument Injection
+
+| Attribute               | Value                                                        |
+| ----------------------- | ------------------------------------------------------------ |
+| **ATLAS ID**            | AML.T0051.000 - LLM Prompt Injection: Direct                 |
+| **Description**         | Attacker manipulates tool arguments through prompt injection |
+| **Attack Vector**       | Crafted prompts that influence tool parameter values         |
+| **Affected Components** | All tool invocations                                         |
+| **Current Mitigations** | Exec approvals for dangerous commands                        |
+| **Residual Risk**       | High - Relies on user judgment                               |
+| **Recommendations**     | Implement argument validation, parameterized tool calls      |
+
+#### T-EXEC-004: Exec Approval Bypass
+
+| Attribute               | Value                                                      |
+| ----------------------- | ---------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0043 - Craft Adversarial Data                         |
+| **Description**         | Attacker crafts commands that bypass approval allowlist    |
+| **Attack Vector**       | Command obfuscation, alias exploitation, path manipulation |
+| **Affected Components** | exec-approvals.ts, command allowlist                       |
+| **Current Mitigations** | Allowlist + ask mode                                       |
+| **Residual Risk**       | High - No command sanitization                             |
+| **Recommendations**     | Implement command normalization, expand blocklist          |
+
+---
+
+### 3.4 Persistence (AML.TA0006)
+
+#### T-PERSIST-001: Malicious Skill Installation
+
+| Attribute               | Value                                                                    |
+| ----------------------- | ------------------------------------------------------------------------ |
+| **ATLAS ID**            | AML.T0010.001 - Supply Chain Compromise: AI Software                     |
+| **Description**         | Attacker publishes malicious skill to ClawHub                            |
+| **Attack Vector**       | Create account, publish skill with hidden malicious code                 |
+| **Affected Components** | ClawHub, skill loading, agent execution                                  |
+| **Current Mitigations** | GitHub account age verification, pattern-based moderation flags          |
+| **Residual Risk**       | Critical - No sandboxing, limited review                                 |
+| **Recommendations**     | VirusTotal integration (in progress), skill sandboxing, community review |
+
+#### T-PERSIST-002: Skill Update Poisoning
+
+| Attribute               | Value                                                          |
+| ----------------------- | -------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0010.001 - Supply Chain Compromise: AI Software           |
+| **Description**         | Attacker compromises popular skill and pushes malicious update |
+| **Attack Vector**       | Account compromise, social engineering of skill owner          |
+| **Affected Components** | ClawHub versioning, auto-update flows                          |
+| **Current Mitigations** | Version fingerprinting                                         |
+| **Residual Risk**       | High - Auto-updates may pull malicious versions                |
+| **Recommendations**     | Implement update signing, rollback capability, version pinning |
+
+#### T-PERSIST-003: Agent Configuration Tampering
+
+| Attribute               | Value                                                           |
+| ----------------------- | --------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0010.002 - Supply Chain Compromise: Data                   |
+| **Description**         | Attacker modifies agent configuration to persist access         |
+| **Attack Vector**       | Config file modification, settings injection                    |
+| **Affected Components** | Agent config, tool policies                                     |
+| **Current Mitigations** | File permissions                                                |
+| **Residual Risk**       | Medium - Requires local access                                  |
+| **Recommendations**     | Config integrity verification, audit logging for config changes |
+
+---
+
+### 3.5 Defense Evasion (AML.TA0007)
+
+#### T-EVADE-001: Moderation Pattern Bypass
+
+| Attribute               | Value                                                                  |
+| ----------------------- | ---------------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0043 - Craft Adversarial Data                                     |
+| **Description**         | Attacker crafts skill content to evade moderation patterns             |
+| **Attack Vector**       | Unicode homoglyphs, encoding tricks, dynamic loading                   |
+| **Affected Components** | ClawHub moderation.ts                                                  |
+| **Current Mitigations** | Pattern-based FLAG_RULES                                               |
+| **Residual Risk**       | High - Simple regex easily bypassed                                    |
+| **Recommendations**     | Add behavioral analysis (VirusTotal Code Insight), AST-based detection |
+
+#### T-EVADE-002: Content Wrapper Escape
+
+| Attribute               | Value                                                     |
+| ----------------------- | --------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0043 - Craft Adversarial Data                        |
+| **Description**         | Attacker crafts content that escapes XML wrapper context  |
+| **Attack Vector**       | Tag manipulation, context confusion, instruction override |
+| **Affected Components** | External content wrapping                                 |
+| **Current Mitigations** | XML tags + security notice                                |
+| **Residual Risk**       | Medium - Novel escapes discovered regularly               |
+| **Recommendations**     | Multiple wrapper layers, output-side validation           |
+
+---
+
+### 3.6 Discovery (AML.TA0008)
+
+#### T-DISC-001: Tool Enumeration
+
+| Attribute               | Value                                                 |
+| ----------------------- | ----------------------------------------------------- |
+| **ATLAS ID**            | AML.T0040 - AI Model Inference API Access             |
+| **Description**         | Attacker enumerates available tools through prompting |
+| **Attack Vector**       | "What tools do you have?" style queries               |
+| **Affected Components** | Agent tool registry                                   |
+| **Current Mitigations** | None specific                                         |
+| **Residual Risk**       | Low - Tools generally documented                      |
+| **Recommendations**     | Consider tool visibility controls                     |
+
+#### T-DISC-002: Session Data Extraction
+
+| Attribute               | Value                                                 |
+| ----------------------- | ----------------------------------------------------- |
+| **ATLAS ID**            | AML.T0040 - AI Model Inference API Access             |
+| **Description**         | Attacker extracts sensitive data from session context |
+| **Attack Vector**       | "What did we discuss?" queries, context probing       |
+| **Affected Components** | Session transcripts, context window                   |
+| **Current Mitigations** | Session isolation per sender                          |
+| **Residual Risk**       | Medium - Within-session data accessible               |
+| **Recommendations**     | Implement sensitive data redaction in context         |
+
+---
+
+### 3.7 Collection & Exfiltration (AML.TA0009, AML.TA0010)
+
+#### T-EXFIL-001: Data Theft via web_fetch
+
+| Attribute               | Value                                                                  |
+| ----------------------- | ---------------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0009 - Collection                                                 |
+| **Description**         | Attacker exfiltrates data by instructing agent to send to external URL |
+| **Attack Vector**       | Prompt injection causing agent to POST data to attacker server         |
+| **Affected Components** | web_fetch tool                                                         |
+| **Current Mitigations** | SSRF blocking for internal networks                                    |
+| **Residual Risk**       | High - External URLs permitted                                         |
+| **Recommendations**     | Implement URL allowlisting, data classification awareness              |
+
+#### T-EXFIL-002: Unauthorized Message Sending
+
+| Attribute               | Value                                                            |
+| ----------------------- | ---------------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0009 - Collection                                           |
+| **Description**         | Attacker causes agent to send messages containing sensitive data |
+| **Attack Vector**       | Prompt injection causing agent to message attacker               |
+| **Affected Components** | Message tool, channel integrations                               |
+| **Current Mitigations** | Outbound messaging gating                                        |
+| **Residual Risk**       | Medium - Gating may be bypassed                                  |
+| **Recommendations**     | Require explicit confirmation for new recipients                 |
+
+#### T-EXFIL-003: Credential Harvesting
+
+| Attribute               | Value                                                   |
+| ----------------------- | ------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0009 - Collection                                  |
+| **Description**         | Malicious skill harvests credentials from agent context |
+| **Attack Vector**       | Skill code reads environment variables, config files    |
+| **Affected Components** | Skill execution environment                             |
+| **Current Mitigations** | None specific to skills                                 |
+| **Residual Risk**       | Critical - Skills run with agent privileges             |
+| **Recommendations**     | Skill sandboxing, credential isolation                  |
+
+---
+
+### 3.8 Impact (AML.TA0011)
+
+#### T-IMPACT-001: Unauthorized Command Execution
+
+| Attribute               | Value                                               |
+| ----------------------- | --------------------------------------------------- |
+| **ATLAS ID**            | AML.T0031 - Erode AI Model Integrity                |
+| **Description**         | Attacker executes arbitrary commands on user system |
+| **Attack Vector**       | Prompt injection combined with exec approval bypass |
+| **Affected Components** | Bash tool, command execution                        |
+| **Current Mitigations** | Exec approvals, Docker sandbox option               |
+| **Residual Risk**       | Critical - Host execution without sandbox           |
+| **Recommendations**     | Default to sandbox, improve approval UX             |
+
+#### T-IMPACT-002: Resource Exhaustion (DoS)
+
+| Attribute               | Value                                              |
+| ----------------------- | -------------------------------------------------- |
+| **ATLAS ID**            | AML.T0031 - Erode AI Model Integrity               |
+| **Description**         | Attacker exhausts API credits or compute resources |
+| **Attack Vector**       | Automated message flooding, expensive tool calls   |
+| **Affected Components** | Gateway, agent sessions, API provider              |
+| **Current Mitigations** | None                                               |
+| **Residual Risk**       | High - No rate limiting                            |
+| **Recommendations**     | Implement per-sender rate limits, cost budgets     |
+
+#### T-IMPACT-003: Reputation Damage
+
+| Attribute               | Value                                                   |
+| ----------------------- | ------------------------------------------------------- |
+| **ATLAS ID**            | AML.T0031 - Erode AI Model Integrity                    |
+| **Description**         | Attacker causes agent to send harmful/offensive content |
+| **Attack Vector**       | Prompt injection causing inappropriate responses        |
+| **Affected Components** | Output generation, channel messaging                    |
+| **Current Mitigations** | LLM provider content policies                           |
+| **Residual Risk**       | Medium - Provider filters imperfect                     |
+| **Recommendations**     | Output filtering layer, user controls                   |
+
+---
+
+## 4. ClawHub Supply Chain Analysis
+
+### 4.1 Current Security Controls
+
+| Control              | Implementation              | Effectiveness                                        |
+| -------------------- | --------------------------- | ---------------------------------------------------- |
+| GitHub Account Age   | `requireGitHubAccountAge()` | Medium - Raises bar for new attackers                |
+| Path Sanitization    | `sanitizePath()`            | High - Prevents path traversal                       |
+| File Type Validation | `isTextFile()`              | Medium - Only text files, but can still be malicious |
+| Size Limits          | 50MB total bundle           | High - Prevents resource exhaustion                  |
+| Required SKILL.md    | Mandatory readme            | Low security value - Informational only              |
+| Pattern Moderation   | FLAG_RULES in moderation.ts | Low - Easily bypassed                                |
+| Moderation Status    | `moderationStatus` field    | Medium - Manual review possible                      |
+
+### 4.2 Moderation Flag Patterns
+
+Current patterns in `moderation.ts`:
+
+```javascript
+// Known-bad identifiers
+/(keepcold131\/ClawdAuthenticatorTool|ClawdAuthenticatorTool)/i
+
+// Suspicious keywords
+/(malware|stealer|phish|phishing|keylogger)/i
+/(api[-_ ]?key|token|password|private key|secret)/i
+/(wallet|seed phrase|mnemonic|crypto)/i
+/(discord\.gg|webhook|hooks\.slack)/i
+/(curl[^\n]+\|\s*(sh|bash))/i
+/(bit\.ly|tinyurl\.com|t\.co|goo\.gl|is\.gd)/i
+```
+
+**Limitations:**
+
+- Only checks slug, displayName, summary, frontmatter, metadata, file paths
+- Does not analyze actual skill code content
+- Simple regex easily bypassed with obfuscation
+- No behavioral analysis
+
+### 4.3 Planned Improvements
+
+| Improvement            | Status                                | Impact                                                                |
+| ---------------------- | ------------------------------------- | --------------------------------------------------------------------- |
+| VirusTotal Integration | In Progress                           | High - Code Insight behavioral analysis                               |
+| Community Reporting    | Partial (`skillReports` table exists) | Medium                                                                |
+| Audit Logging          | Partial (`auditLogs` table exists)    | Medium                                                                |
+| Badge System           | Implemented                           | Medium - `highlighted`, `official`, `deprecated`, `redactionApproved` |
+
+---
+
+## 5. Risk Matrix
+
+### 5.1 Likelihood vs Impact
+
+| Threat ID     | Likelihood | Impact   | Risk Level   | Priority |
+| ------------- | ---------- | -------- | ------------ | -------- |
+| T-EXEC-001    | High       | Critical | **Critical** | P0       |
+| T-PERSIST-001 | High       | Critical | **Critical** | P0       |
+| T-EXFIL-003   | Medium     | Critical | **Critical** | P0       |
+| T-IMPACT-001  | Medium     | Critical | **High**     | P1       |
+| T-EXEC-002    | High       | High     | **High**     | P1       |
+| T-EXEC-004    | Medium     | High     | **High**     | P1       |
+| T-ACCESS-003  | Medium     | High     | **High**     | P1       |
+| T-EXFIL-001   | Medium     | High     | **High**     | P1       |
+| T-IMPACT-002  | High       | Medium   | **High**     | P1       |
+| T-EVADE-001   | High       | Medium   | **Medium**   | P2       |
+| T-ACCESS-001  | Low        | High     | **Medium**   | P2       |
+| T-ACCESS-002  | Low        | High     | **Medium**   | P2       |
+| T-PERSIST-002 | Low        | High     | **Medium**   | P2       |
+
+### 5.2 Critical Path Attack Chains
+
+**Attack Chain 1: Skill-Based Data Theft**
+
+```
+T-PERSIST-001 → T-EVADE-001 → T-EXFIL-003
+(Publish malicious skill) → (Evade moderation) → (Harvest credentials)
+```
+
+**Attack Chain 2: Prompt Injection to RCE**
+
+```
+T-EXEC-001 → T-EXEC-004 → T-IMPACT-001
+(Inject prompt) → (Bypass exec approval) → (Execute commands)
+```
+
+**Attack Chain 3: Indirect Injection via Fetched Content**
+
+```
+T-EXEC-002 → T-EXFIL-001 → External exfiltration
+(Poison URL content) → (Agent fetches & follows instructions) → (Data sent to attacker)
+```
+
+---
+
+## 6. Recommendations Summary
+
+### 6.1 Immediate (P0)
+
+| ID    | Recommendation                              | Addresses                  |
+| ----- | ------------------------------------------- | -------------------------- |
+| R-001 | Complete VirusTotal integration             | T-PERSIST-001, T-EVADE-001 |
+| R-002 | Implement skill sandboxing                  | T-PERSIST-001, T-EXFIL-003 |
+| R-003 | Add output validation for sensitive actions | T-EXEC-001, T-EXEC-002     |
+
+### 6.2 Short-term (P1)
+
+| ID    | Recommendation                           | Addresses    |
+| ----- | ---------------------------------------- | ------------ |
+| R-004 | Implement rate limiting                  | T-IMPACT-002 |
+| R-005 | Add token encryption at rest             | T-ACCESS-003 |
+| R-006 | Improve exec approval UX and validation  | T-EXEC-004   |
+| R-007 | Implement URL allowlisting for web_fetch | T-EXFIL-001  |
+
+### 6.3 Medium-term (P2)
+
+| ID    | Recommendation                                        | Addresses     |
+| ----- | ----------------------------------------------------- | ------------- |
+| R-008 | Add cryptographic channel verification where possible | T-ACCESS-002  |
+| R-009 | Implement config integrity verification               | T-PERSIST-003 |
+| R-010 | Add update signing and version pinning                | T-PERSIST-002 |
+
+---
+
+## 7. Appendices
+
+### 7.1 ATLAS Technique Mapping
+
+| ATLAS ID      | Technique Name                 | OpenClaw Threats                                                 |
+| ------------- | ------------------------------ | ---------------------------------------------------------------- |
+| AML.T0006     | Active Scanning                | T-RECON-001, T-RECON-002                                         |
+| AML.T0009     | Collection                     | T-EXFIL-001, T-EXFIL-002, T-EXFIL-003                            |
+| AML.T0010.001 | Supply Chain: AI Software      | T-PERSIST-001, T-PERSIST-002                                     |
+| AML.T0010.002 | Supply Chain: Data             | T-PERSIST-003                                                    |
+| AML.T0031     | Erode AI Model Integrity       | T-IMPACT-001, T-IMPACT-002, T-IMPACT-003                         |
+| AML.T0040     | AI Model Inference API Access  | T-ACCESS-001, T-ACCESS-002, T-ACCESS-003, T-DISC-001, T-DISC-002 |
+| AML.T0043     | Craft Adversarial Data         | T-EXEC-004, T-EVADE-001, T-EVADE-002                             |
+| AML.T0051.000 | LLM Prompt Injection: Direct   | T-EXEC-001, T-EXEC-003                                           |
+| AML.T0051.001 | LLM Prompt Injection: Indirect | T-EXEC-002                                                       |
+
+### 7.2 Key Security Files
+
+| Path                                | Purpose                     | Risk Level   |
+| ----------------------------------- | --------------------------- | ------------ |
+| `src/infra/exec-approvals.ts`       | Command approval logic      | **Critical** |
+| `src/gateway/auth.ts`               | Gateway authentication      | **Critical** |
+| `src/web/inbound/access-control.ts` | Channel access control      | **Critical** |
+| `src/infra/net/ssrf.ts`             | SSRF protection             | **Critical** |
+| `src/security/external-content.ts`  | Prompt injection mitigation | **Critical** |
+| `src/agents/sandbox/tool-policy.ts` | Tool policy enforcement     | **Critical** |
+| `convex/lib/moderation.ts`          | ClawHub moderation          | **High**     |
+| `convex/lib/skillPublish.ts`        | Skill publishing flow       | **High**     |
+| `src/routing/resolve-route.ts`      | Session isolation           | **Medium**   |
+
+### 7.3 Glossary
+
+| Term                 | Definition                                                |
+| -------------------- | --------------------------------------------------------- |
+| **ATLAS**            | MITRE's Adversarial Threat Landscape for AI Systems       |
+| **ClawHub**          | OpenClaw's skill marketplace                              |
+| **Gateway**          | OpenClaw's message routing and authentication layer       |
+| **MCP**              | Model Context Protocol - tool provider interface          |
+| **Prompt Injection** | Attack where malicious instructions are embedded in input |
+| **Skill**            | Downloadable extension for OpenClaw agents                |
+| **SSRF**             | Server-Side Request Forgery                               |
+
+---
+
+_This threat model is a living document. Report security issues to security@openclaw.ai_
--- a/content/security/formal-verification.md
+++ b/content/security/formal-verification.md
@@ -0,0 +1,171 @@
+---
+permalink: /security/formal-verification/
+summary: 针对 OpenClaw 最高风险路径的机器检查安全模型。
+title: 形式化验证（安全模型）
+x-i18n:
+  generated_at: "2026-02-03T07:54:04Z"
+  model: claude-opus-4-5
+  provider: pi
+  source_hash: 8dff6ea41a37fb6b870424e4e788015c3f8a6099075eece5dbf909883c045106
+  source_path: security/formal-verification.md
+  workflow: 15
+---
+
+# 形式化验证（安全模型）
+
+本页跟踪 OpenClaw 的**形式化安全模型**（目前是 TLA+/TLC；根据需要会添加更多）。
+
+> 注意：一些较旧的链接可能引用了以前的项目名称。
+
+**目标（北极星）：** 提供机器检查的论证，证明 OpenClaw 在明确假设下执行其
+预期的安全策略（授权、会话隔离、工具门控和
+配置错误安全）。
+
+**目前是什么：** 一个可执行的、攻击者驱动的**安全回归测试套件**：
+
+- 每个声明都有一个在有限状态空间上运行的模型检查。
+- 许多声明有一个配对的**负面模型**，为现实的 bug 类别生成反例追踪。
+
+**目前还不是什么：** 证明"OpenClaw 在所有方面都是安全的"或完整 TypeScript 实现是正确的。
+
+## 模型存放位置
+
+模型维护在一个单独的仓库中：[vignesh07/openclaw-formal-models](https://github.com/vignesh07/openclaw-formal-models)。
+
+## 重要注意事项
+
+- 这些是**模型**，不是完整的 TypeScript 实现。模型和代码之间可能存在偏差。
+- 结果受 TLC 探索的状态空间限制；"绿色"并不意味着在建模的假设和边界之外也是安全的。
+- 一些声明依赖于明确的环境假设（例如，正确的部署、正确的配置输入）。
+
+## 复现结果
+
+目前，结果通过在本地克隆模型仓库并运行 TLC 来复现（见下文）。未来的迭代可能提供：
+
+- 带有公开产物（反例追踪、运行日志）的 CI 运行模型
+- 用于小型、有界检查的托管"运行此模型"工作流
+
+开始使用：
+
+```bash
+git clone https://github.com/vignesh07/openclaw-formal-models
+cd openclaw-formal-models
+
+# 需要 Java 11+（TLC 在 JVM 上运行）。
+# 仓库内置了固定版本的 `tla2tools.jar`（TLA+ 工具）并提供 `bin/tlc` + Make 目标。
+
+make <target>
+```
+
+### Gateway 网关暴露和开放 Gateway 网关配置错误
+
+**声明：** 在没有认证的情况下绑定到 loopback 之外可能使远程入侵成为可能 / 增加暴露；令牌/密码可以阻止未认证的攻击者（根据模型假设）。
+
+- 绿色运行：
+  - `make gateway-exposure-v2`
+  - `make gateway-exposure-v2-protected`
+- 红色（预期）：
+  - `make gateway-exposure-v2-negative`
+
+另见：模型仓库中的 `docs/gateway-exposure-matrix.md`。
+
+### Nodes.run 管道（最高风险能力）
+
+**声明：** `nodes.run` 需要（a）节点命令允许列表加上声明的命令以及（b）配置时的实时批准；批准被令牌化以防止重放（在模型中）。
+
+- 绿色运行：
+  - `make nodes-pipeline`
+  - `make approvals-token`
+- 红色（预期）：
+  - `make nodes-pipeline-negative`
+  - `make approvals-token-negative`
+
+### 配对存储（私信门控）
+
+**声明：** 配对请求遵守 TTL 和待处理请求上限。
+
+- 绿色运行：
+  - `make pairing`
+  - `make pairing-cap`
+- 红色（预期）：
+  - `make pairing-negative`
+  - `make pairing-cap-negative`
+
+### 入站门控（提及 + 控制命令绕过）
+
+**声明：** 在需要提及的群组上下文中，未授权的"控制命令"不能绕过提及门控。
+
+- 绿色：
+  - `make ingress-gating`
+- 红色（预期）：
+  - `make ingress-gating-negative`
+
+### 路由/会话键隔离
+
+**声明：** 来自不同对等方的私信不会折叠到同一会话中，除非明确链接/配置。
+
+- 绿色：
+  - `make routing-isolation`
+- 红色（预期）：
+  - `make routing-isolation-negative`
+
+## v1++：额外的有界模型（并发、重试、追踪正确性）
+
+这些是后续模型，围绕真实世界的故障模式（非原子更新、重试和消息扇出）提高保真度。
+
+### 配对存储并发 / 幂等性
+
+**声明：** 配对存储应该在交错情况下也强制执行 `MaxPending` 和幂等性（即"检查然后写入"必须是原子/加锁的；刷新不应创建重复项）。
+
+这意味着：
+
+- 在并发请求下，你不能超过渠道的 `MaxPending`。
+- 对同一 `(channel, sender)` 的重复请求/刷新不应创建重复的活跃待处理行。
+
+- 绿色运行：
+  - `make pairing-race`（原子/加锁的上限检查）
+  - `make pairing-idempotency`
+  - `make pairing-refresh`
+  - `make pairing-refresh-race`
+- 红色（预期）：
+  - `make pairing-race-negative`（非原子 begin/commit 上限竞争）
+  - `make pairing-idempotency-negative`
+  - `make pairing-refresh-negative`
+  - `make pairing-refresh-race-negative`
+
+### 入站追踪关联 / 幂等性
+
+**声明：** 摄入应在扇出过程中保持追踪关联，并在提供商重试下保持幂等。
+
+这意味着：
+
+- 当一个外部事件变成多个内部消息时，每个部分都保持相同的追踪/事件标识。
+- 重试不会导致重复处理。
+- 如果提供商事件 ID 缺失，去重回退到安全键（例如，追踪 ID）以避免丢弃不同的事件。
+
+- 绿色：
+  - `make ingress-trace`
+  - `make ingress-trace2`
+  - `make ingress-idempotency`
+  - `make ingress-dedupe-fallback`
+- 红色（预期）：
+  - `make ingress-trace-negative`
+  - `make ingress-trace2-negative`
+  - `make ingress-idempotency-negative`
+  - `make ingress-dedupe-fallback-negative`
+
+### 路由 dmScope 优先级 + identityLinks
+
+**声明：** 路由必须默认保持私信会话隔离，只有在明确配置时才折叠会话（渠道优先级 + 身份链接）。
+
+这意味着：
+
+- 渠道特定的 dmScope 覆盖必须优先于全局默认值。
+- identityLinks 应该只在明确链接的组内折叠，而不是跨不相关的对等方。
+
+- 绿色：
+  - `make routing-precedence`
+  - `make routing-identitylinks`
+- 红色（预期）：
+  - `make routing-precedence-negative`
+  - `make routing-identitylinks-negative`