first commit

This commit is contained in:
2026-02-28 23:01:30 +08:00
commit 3956ee4806
415 changed files with 74538 additions and 0 deletions

View File

@@ -0,0 +1,90 @@
# Contributing to the OpenClaw Threat Model
Thanks for helping make OpenClaw more secure. This threat model is a living document and we welcome contributions from anyone - you don't need to be a security expert.
## Ways to Contribute
### Add a Threat
Spotted an attack vector or risk we haven't covered? Open an issue on [openclaw/trust](https://github.com/openclaw/trust/issues) and describe it in your own words. You don't need to know any frameworks or fill in every field - just describe the scenario.
**Helpful to include (but not required):**
- The attack scenario and how it could be exploited
- Which parts of OpenClaw are affected (CLI, gateway, channels, ClawHub, MCP servers, etc.)
- How severe you think it is (low / medium / high / critical)
- Any links to related research, CVEs, or real-world examples
We'll handle the ATLAS mapping, threat IDs, and risk assessment during review. If you want to include those details, great - but it's not expected.
> **This is for adding to the threat model, not reporting live vulnerabilities.** If you've found an exploitable vulnerability, see our [Trust page](https://trust.openclaw.ai) for responsible disclosure instructions.
### Suggest a Mitigation
Have an idea for how to address an existing threat? Open an issue or PR referencing the threat. Useful mitigations are specific and actionable - for example, "per-sender rate limiting of 10 messages/minute at the gateway" is better than "implement rate limiting."
### Propose an Attack Chain
Attack chains show how multiple threats combine into a realistic attack scenario. If you see a dangerous combination, describe the steps and how an attacker would chain them together. A short narrative of how the attack unfolds in practice is more valuable than a formal template.
### Fix or Improve Existing Content
Typos, clarifications, outdated info, better examples - PRs welcome, no issue needed.
## What We Use
### MITRE ATLAS
This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/) (Adversarial Threat Landscape for AI Systems), a framework designed specifically for AI/ML threats like prompt injection, tool misuse, and agent exploitation. You don't need to know ATLAS to contribute - we map submissions to the framework during review.
### Threat IDs
Each threat gets an ID like `T-EXEC-003`. The categories are:
| Code | Category |
| ------- | ------------------------------------------ |
| RECON | Reconnaissance - information gathering |
| ACCESS | Initial access - gaining entry |
| EXEC | Execution - running malicious actions |
| PERSIST | Persistence - maintaining access |
| EVADE | Defense evasion - avoiding detection |
| DISC | Discovery - learning about the environment |
| EXFIL | Exfiltration - stealing data |
| IMPACT | Impact - damage or disruption |
IDs are assigned by maintainers during review. You don't need to pick one.
### Risk Levels
| Level | Meaning |
| ------------ | ----------------------------------------------------------------- |
| **Critical** | Full system compromise, or high likelihood + critical impact |
| **High** | Significant damage likely, or medium likelihood + critical impact |
| **Medium** | Moderate risk, or low likelihood + high impact |
| **Low** | Unlikely and limited impact |
If you're unsure about the risk level, just describe the impact and we'll assess it.
## Review Process
1. **Triage** - We review new submissions within 48 hours
2. **Assessment** - We verify feasibility, assign ATLAS mapping and threat ID, validate risk level
3. **Documentation** - We ensure everything is formatted and complete
4. **Merge** - Added to the threat model and visualization
## Resources
- [ATLAS Website](https://atlas.mitre.org/)
- [ATLAS Techniques](https://atlas.mitre.org/techniques/)
- [ATLAS Case Studies](https://atlas.mitre.org/studies/)
- [OpenClaw Threat Model](./THREAT-MODEL-ATLAS.md)
## Contact
- **Security vulnerabilities:** See our [Trust page](https://trust.openclaw.ai) for reporting instructions
- **Threat model questions:** Open an issue on [openclaw/trust](https://github.com/openclaw/trust/issues)
- **General chat:** Discord #security channel
## Recognition
Contributors to the threat model are recognized in the threat model acknowledgments, release notes, and the OpenClaw security hall of fame for significant contributions.

View File

@@ -0,0 +1,17 @@
# OpenClaw Security & Trust
**Live:** [trust.openclaw.ai](https://trust.openclaw.ai)
## Documents
- [Threat Model](./THREAT-MODEL-ATLAS.md) - MITRE ATLAS-based threat model for the OpenClaw ecosystem
- [Contributing to the Threat Model](./CONTRIBUTING-THREAT-MODEL.md) - How to add threats, mitigations, and attack chains
## Reporting Vulnerabilities
See the [Trust page](https://trust.openclaw.ai) for full reporting instructions covering all repos.
## Contact
- **Jamieson O'Reilly** ([@theonejvo](https://twitter.com/theonejvo)) - Security & Trust
- Discord: #security channel

View File

@@ -0,0 +1,603 @@
# OpenClaw Threat Model v1.0
## MITRE ATLAS Framework
**Version:** 1.0-draft
**Last Updated:** 2026-02-04
**Methodology:** MITRE ATLAS + Data Flow Diagrams
**Framework:** [MITRE ATLAS](https://atlas.mitre.org/) (Adversarial Threat Landscape for AI Systems)
### Framework Attribution
This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/), the industry-standard framework for documenting adversarial threats to AI/ML systems. ATLAS is maintained by [MITRE](https://www.mitre.org/) in collaboration with the AI security community.
**Key ATLAS Resources:**
- [ATLAS Techniques](https://atlas.mitre.org/techniques/)
- [ATLAS Tactics](https://atlas.mitre.org/tactics/)
- [ATLAS Case Studies](https://atlas.mitre.org/studies/)
- [ATLAS GitHub](https://github.com/mitre-atlas/atlas-data)
- [Contributing to ATLAS](https://atlas.mitre.org/resources/contribute)
### Contributing to This Threat Model
This is a living document maintained by the OpenClaw community. See [CONTRIBUTING-THREAT-MODEL.md](./CONTRIBUTING-THREAT-MODEL.md) for guidelines on contributing:
- Reporting new threats
- Updating existing threats
- Proposing attack chains
- Suggesting mitigations
---
## 1. Introduction
### 1.1 Purpose
This threat model documents adversarial threats to the OpenClaw AI agent platform and ClawHub skill marketplace, using the MITRE ATLAS framework designed specifically for AI/ML systems.
### 1.2 Scope
| Component | Included | Notes |
| ---------------------- | -------- | ------------------------------------------------ |
| OpenClaw Agent Runtime | Yes | Core agent execution, tool calls, sessions |
| Gateway | Yes | Authentication, routing, channel integration |
| Channel Integrations | Yes | WhatsApp, Telegram, Discord, Signal, Slack, etc. |
| ClawHub Marketplace | Yes | Skill publishing, moderation, distribution |
| MCP Servers | Yes | External tool providers |
| User Devices | Partial | Mobile apps, desktop clients |
### 1.3 Out of Scope
Nothing is explicitly out of scope for this threat model.
---
## 2. System Architecture
### 2.1 Trust Boundaries
```
┌─────────────────────────────────────────────────────────────────┐
│ UNTRUSTED ZONE │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ WhatsApp │ │ Telegram │ │ Discord │ ... │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
└─────────┼────────────────┼────────────────┼──────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 1: Channel Access │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ GATEWAY │ │
│ │ • Device Pairing (30s grace period) │ │
│ │ • AllowFrom / AllowList validation │ │
│ │ • Token/Password/Tailscale auth │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 2: Session Isolation │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ AGENT SESSIONS │ │
│ │ • Session key = agent:channel:peer │ │
│ │ • Tool policies per agent │ │
│ │ • Transcript logging │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 3: Tool Execution │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ EXECUTION SANDBOX │ │
│ │ • Docker sandbox OR Host (exec-approvals) │ │
│ │ • Node remote execution │ │
│ │ • SSRF protection (DNS pinning + IP blocking) │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 4: External Content │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ FETCHED URLs / EMAILS / WEBHOOKS │ │
│ │ • External content wrapping (XML tags) │ │
│ │ • Security notice injection │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 5: Supply Chain │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ CLAWHUB │ │
│ │ • Skill publishing (semver, SKILL.md required) │ │
│ │ • Pattern-based moderation flags │ │
│ │ • VirusTotal scanning (coming soon) │ │
│ │ • GitHub account age verification │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
### 2.2 Data Flows
| Flow | Source | Destination | Data | Protection |
| ---- | ------- | ----------- | ------------------ | -------------------- |
| F1 | Channel | Gateway | User messages | TLS, AllowFrom |
| F2 | Gateway | Agent | Routed messages | Session isolation |
| F3 | Agent | 工具 | Tool invocations | Policy enforcement |
| F4 | Agent | External | web_fetch requests | SSRF blocking |
| F5 | ClawHub | Agent | Skill code | Moderation, scanning |
| F6 | Agent | Channel | Responses | Output filtering |
---
## 3. Threat Analysis by ATLAS Tactic
### 3.1 Reconnaissance (AML.TA0002)
#### T-RECON-001: Agent Endpoint Discovery
| Attribute | Value |
| ----------------------- | -------------------------------------------------------------------- |
| **ATLAS ID** | AML.T0006 - Active Scanning |
| **Description** | Attacker scans for exposed OpenClaw gateway endpoints |
| **Attack Vector** | Network scanning, shodan queries, DNS enumeration |
| **Affected Components** | Gateway, exposed API endpoints |
| **Current Mitigations** | Tailscale auth option, bind to loopback by default |
| **Residual Risk** | Medium - Public gateways discoverable |
| **Recommendations** | Document secure deployment, add rate limiting on discovery endpoints |
#### T-RECON-002: Channel Integration Probing
| Attribute | Value |
| ----------------------- | ------------------------------------------------------------------ |
| **ATLAS ID** | AML.T0006 - Active Scanning |
| **Description** | Attacker probes messaging channels to identify AI-managed accounts |
| **Attack Vector** | Sending test messages, observing response patterns |
| **Affected Components** | All channel integrations |
| **Current Mitigations** | None specific |
| **Residual Risk** | Low - Limited value from discovery alone |
| **Recommendations** | Consider response timing randomization |
---
### 3.2 Initial Access (AML.TA0004)
#### T-ACCESS-001: Pairing Code Interception
| Attribute | Value |
| ----------------------- | -------------------------------------------------------- |
| **ATLAS ID** | AML.T0040 - AI Model Inference API Access |
| **Description** | Attacker intercepts pairing code during 30s grace period |
| **Attack Vector** | Shoulder surfing, network sniffing, social engineering |
| **Affected Components** | Device pairing system |
| **Current Mitigations** | 30s expiry, codes sent via existing channel |
| **Residual Risk** | Medium - Grace period exploitable |
| **Recommendations** | Reduce grace period, add confirmation step |
#### T-ACCESS-002: AllowFrom Spoofing
| Attribute | Value |
| ----------------------- | ------------------------------------------------------------------------------ |
| **ATLAS ID** | AML.T0040 - AI Model Inference API Access |
| **Description** | Attacker spoofs allowed sender identity in channel |
| **Attack Vector** | Depends on channel - phone number spoofing, username impersonation |
| **Affected Components** | AllowFrom validation per channel |
| **Current Mitigations** | Channel-specific identity verification |
| **Residual Risk** | Medium - Some channels vulnerable to spoofing |
| **Recommendations** | Document channel-specific risks, add cryptographic verification where possible |
#### T-ACCESS-003: Token Theft
| Attribute | Value |
| ----------------------- | ----------------------------------------------------------- |
| **ATLAS ID** | AML.T0040 - AI Model Inference API Access |
| **Description** | Attacker steals authentication tokens from config files |
| **Attack Vector** | Malware, unauthorized device access, config backup exposure |
| **Affected Components** | ~/.openclaw/credentials/, config storage |
| **Current Mitigations** | File permissions |
| **Residual Risk** | High - Tokens stored in plaintext |
| **Recommendations** | Implement token encryption at rest, add token rotation |
---
### 3.3 Execution (AML.TA0005)
#### T-EXEC-001: Direct Prompt Injection
| Attribute | Value |
| ----------------------- | ----------------------------------------------------------------------------------------- |
| **ATLAS ID** | AML.T0051.000 - LLM Prompt Injection: Direct |
| **Description** | Attacker sends crafted prompts to manipulate agent behavior |
| **Attack Vector** | Channel messages containing adversarial instructions |
| **Affected Components** | Agent LLM, all input surfaces |
| **Current Mitigations** | Pattern detection, external content wrapping |
| **Residual Risk** | Critical - Detection only, no blocking; sophisticated attacks bypass |
| **Recommendations** | Implement multi-layer defense, output validation, user confirmation for sensitive actions |
#### T-EXEC-002: Indirect Prompt Injection
| Attribute | Value |
| ----------------------- | ----------------------------------------------------------- |
| **ATLAS ID** | AML.T0051.001 - LLM Prompt Injection: Indirect |
| **Description** | Attacker embeds malicious instructions in fetched content |
| **Attack Vector** | Malicious URLs, poisoned emails, compromised webhooks |
| **Affected Components** | web_fetch, email ingestion, external data sources |
| **Current Mitigations** | Content wrapping with XML tags and security notice |
| **Residual Risk** | High - LLM may ignore wrapper instructions |
| **Recommendations** | Implement content sanitization, separate execution contexts |
#### T-EXEC-003: Tool Argument Injection
| Attribute | Value |
| ----------------------- | ------------------------------------------------------------ |
| **ATLAS ID** | AML.T0051.000 - LLM Prompt Injection: Direct |
| **Description** | Attacker manipulates tool arguments through prompt injection |
| **Attack Vector** | Crafted prompts that influence tool parameter values |
| **Affected Components** | All tool invocations |
| **Current Mitigations** | Exec approvals for dangerous commands |
| **Residual Risk** | High - Relies on user judgment |
| **Recommendations** | Implement argument validation, parameterized tool calls |
#### T-EXEC-004: Exec Approval Bypass
| Attribute | Value |
| ----------------------- | ---------------------------------------------------------- |
| **ATLAS ID** | AML.T0043 - Craft Adversarial Data |
| **Description** | Attacker crafts commands that bypass approval allowlist |
| **Attack Vector** | Command obfuscation, alias exploitation, path manipulation |
| **Affected Components** | exec-approvals.ts, command allowlist |
| **Current Mitigations** | Allowlist + ask mode |
| **Residual Risk** | High - No command sanitization |
| **Recommendations** | Implement command normalization, expand blocklist |
---
### 3.4 Persistence (AML.TA0006)
#### T-PERSIST-001: Malicious Skill Installation
| Attribute | Value |
| ----------------------- | ------------------------------------------------------------------------ |
| **ATLAS ID** | AML.T0010.001 - Supply Chain Compromise: AI Software |
| **Description** | Attacker publishes malicious skill to ClawHub |
| **Attack Vector** | Create account, publish skill with hidden malicious code |
| **Affected Components** | ClawHub, skill loading, agent execution |
| **Current Mitigations** | GitHub account age verification, pattern-based moderation flags |
| **Residual Risk** | Critical - No sandboxing, limited review |
| **Recommendations** | VirusTotal integration (in progress), skill sandboxing, community review |
#### T-PERSIST-002: Skill Update Poisoning
| Attribute | Value |
| ----------------------- | -------------------------------------------------------------- |
| **ATLAS ID** | AML.T0010.001 - Supply Chain Compromise: AI Software |
| **Description** | Attacker compromises popular skill and pushes malicious update |
| **Attack Vector** | Account compromise, social engineering of skill owner |
| **Affected Components** | ClawHub versioning, auto-update flows |
| **Current Mitigations** | Version fingerprinting |
| **Residual Risk** | High - Auto-updates may pull malicious versions |
| **Recommendations** | Implement update signing, rollback capability, version pinning |
#### T-PERSIST-003: Agent Configuration Tampering
| Attribute | Value |
| ----------------------- | --------------------------------------------------------------- |
| **ATLAS ID** | AML.T0010.002 - Supply Chain Compromise: Data |
| **Description** | Attacker modifies agent configuration to persist access |
| **Attack Vector** | Config file modification, settings injection |
| **Affected Components** | Agent config, tool policies |
| **Current Mitigations** | File permissions |
| **Residual Risk** | Medium - Requires local access |
| **Recommendations** | Config integrity verification, audit logging for config changes |
---
### 3.5 Defense Evasion (AML.TA0007)
#### T-EVADE-001: Moderation Pattern Bypass
| Attribute | Value |
| ----------------------- | ---------------------------------------------------------------------- |
| **ATLAS ID** | AML.T0043 - Craft Adversarial Data |
| **Description** | Attacker crafts skill content to evade moderation patterns |
| **Attack Vector** | Unicode homoglyphs, encoding tricks, dynamic loading |
| **Affected Components** | ClawHub moderation.ts |
| **Current Mitigations** | Pattern-based FLAG_RULES |
| **Residual Risk** | High - Simple regex easily bypassed |
| **Recommendations** | Add behavioral analysis (VirusTotal Code Insight), AST-based detection |
#### T-EVADE-002: Content Wrapper Escape
| Attribute | Value |
| ----------------------- | --------------------------------------------------------- |
| **ATLAS ID** | AML.T0043 - Craft Adversarial Data |
| **Description** | Attacker crafts content that escapes XML wrapper context |
| **Attack Vector** | Tag manipulation, context confusion, instruction override |
| **Affected Components** | External content wrapping |
| **Current Mitigations** | XML tags + security notice |
| **Residual Risk** | Medium - Novel escapes discovered regularly |
| **Recommendations** | Multiple wrapper layers, output-side validation |
---
### 3.6 Discovery (AML.TA0008)
#### T-DISC-001: Tool Enumeration
| Attribute | Value |
| ----------------------- | ----------------------------------------------------- |
| **ATLAS ID** | AML.T0040 - AI Model Inference API Access |
| **Description** | Attacker enumerates available tools through prompting |
| **Attack Vector** | "What tools do you have?" style queries |
| **Affected Components** | Agent tool registry |
| **Current Mitigations** | None specific |
| **Residual Risk** | Low - Tools generally documented |
| **Recommendations** | Consider tool visibility controls |
#### T-DISC-002: Session Data Extraction
| Attribute | Value |
| ----------------------- | ----------------------------------------------------- |
| **ATLAS ID** | AML.T0040 - AI Model Inference API Access |
| **Description** | Attacker extracts sensitive data from session context |
| **Attack Vector** | "What did we discuss?" queries, context probing |
| **Affected Components** | Session transcripts, context window |
| **Current Mitigations** | Session isolation per sender |
| **Residual Risk** | Medium - Within-session data accessible |
| **Recommendations** | Implement sensitive data redaction in context |
---
### 3.7 Collection & Exfiltration (AML.TA0009, AML.TA0010)
#### T-EXFIL-001: Data Theft via web_fetch
| Attribute | Value |
| ----------------------- | ---------------------------------------------------------------------- |
| **ATLAS ID** | AML.T0009 - Collection |
| **Description** | Attacker exfiltrates data by instructing agent to send to external URL |
| **Attack Vector** | Prompt injection causing agent to POST data to attacker server |
| **Affected Components** | web_fetch tool |
| **Current Mitigations** | SSRF blocking for internal networks |
| **Residual Risk** | High - External URLs permitted |
| **Recommendations** | Implement URL allowlisting, data classification awareness |
#### T-EXFIL-002: Unauthorized Message Sending
| Attribute | Value |
| ----------------------- | ---------------------------------------------------------------- |
| **ATLAS ID** | AML.T0009 - Collection |
| **Description** | Attacker causes agent to send messages containing sensitive data |
| **Attack Vector** | Prompt injection causing agent to message attacker |
| **Affected Components** | Message tool, channel integrations |
| **Current Mitigations** | Outbound messaging gating |
| **Residual Risk** | Medium - Gating may be bypassed |
| **Recommendations** | Require explicit confirmation for new recipients |
#### T-EXFIL-003: Credential Harvesting
| Attribute | Value |
| ----------------------- | ------------------------------------------------------- |
| **ATLAS ID** | AML.T0009 - Collection |
| **Description** | Malicious skill harvests credentials from agent context |
| **Attack Vector** | Skill code reads environment variables, config files |
| **Affected Components** | Skill execution environment |
| **Current Mitigations** | None specific to skills |
| **Residual Risk** | Critical - Skills run with agent privileges |
| **Recommendations** | Skill sandboxing, credential isolation |
---
### 3.8 Impact (AML.TA0011)
#### T-IMPACT-001: Unauthorized Command Execution
| Attribute | Value |
| ----------------------- | --------------------------------------------------- |
| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity |
| **Description** | Attacker executes arbitrary commands on user system |
| **Attack Vector** | Prompt injection combined with exec approval bypass |
| **Affected Components** | Bash tool, command execution |
| **Current Mitigations** | Exec approvals, Docker sandbox option |
| **Residual Risk** | Critical - Host execution without sandbox |
| **Recommendations** | Default to sandbox, improve approval UX |
#### T-IMPACT-002: Resource Exhaustion (DoS)
| Attribute | Value |
| ----------------------- | -------------------------------------------------- |
| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity |
| **Description** | Attacker exhausts API credits or compute resources |
| **Attack Vector** | Automated message flooding, expensive tool calls |
| **Affected Components** | Gateway, agent sessions, API provider |
| **Current Mitigations** | None |
| **Residual Risk** | High - No rate limiting |
| **Recommendations** | Implement per-sender rate limits, cost budgets |
#### T-IMPACT-003: Reputation Damage
| Attribute | Value |
| ----------------------- | ------------------------------------------------------- |
| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity |
| **Description** | Attacker causes agent to send harmful/offensive content |
| **Attack Vector** | Prompt injection causing inappropriate responses |
| **Affected Components** | Output generation, channel messaging |
| **Current Mitigations** | LLM provider content policies |
| **Residual Risk** | Medium - Provider filters imperfect |
| **Recommendations** | Output filtering layer, user controls |
---
## 4. ClawHub Supply Chain Analysis
### 4.1 Current Security Controls
| Control | Implementation | Effectiveness |
| -------------------- | --------------------------- | ---------------------------------------------------- |
| GitHub Account Age | `requireGitHubAccountAge()` | Medium - Raises bar for new attackers |
| Path Sanitization | `sanitizePath()` | High - Prevents path traversal |
| File Type Validation | `isTextFile()` | Medium - Only text files, but can still be malicious |
| Size Limits | 50MB total bundle | High - Prevents resource exhaustion |
| Required SKILL.md | Mandatory readme | Low security value - Informational only |
| Pattern Moderation | FLAG_RULES in moderation.ts | Low - Easily bypassed |
| Moderation Status | `moderationStatus` field | Medium - Manual review possible |
### 4.2 Moderation Flag Patterns
Current patterns in `moderation.ts`:
```javascript
// Known-bad identifiers
/(keepcold131\/ClawdAuthenticatorTool|ClawdAuthenticatorTool)/i
// Suspicious keywords
/(malware|stealer|phish|phishing|keylogger)/i
/(api[-_ ]?key|token|password|private key|secret)/i
/(wallet|seed phrase|mnemonic|crypto)/i
/(discord\.gg|webhook|hooks\.slack)/i
/(curl[^\n]+\|\s*(sh|bash))/i
/(bit\.ly|tinyurl\.com|t\.co|goo\.gl|is\.gd)/i
```
**Limitations:**
- Only checks slug, displayName, summary, frontmatter, metadata, file paths
- Does not analyze actual skill code content
- Simple regex easily bypassed with obfuscation
- No behavioral analysis
### 4.3 Planned Improvements
| Improvement | Status | Impact |
| ---------------------- | ------------------------------------- | --------------------------------------------------------------------- |
| VirusTotal Integration | In Progress | High - Code Insight behavioral analysis |
| Community Reporting | Partial (`skillReports` table exists) | Medium |
| Audit Logging | Partial (`auditLogs` table exists) | Medium |
| Badge System | Implemented | Medium - `highlighted`, `official`, `deprecated`, `redactionApproved` |
---
## 5. Risk Matrix
### 5.1 Likelihood vs Impact
| Threat ID | Likelihood | Impact | Risk Level | Priority |
| ------------- | ---------- | -------- | ------------ | -------- |
| T-EXEC-001 | High | Critical | **Critical** | P0 |
| T-PERSIST-001 | High | Critical | **Critical** | P0 |
| T-EXFIL-003 | Medium | Critical | **Critical** | P0 |
| T-IMPACT-001 | Medium | Critical | **High** | P1 |
| T-EXEC-002 | High | High | **High** | P1 |
| T-EXEC-004 | Medium | High | **High** | P1 |
| T-ACCESS-003 | Medium | High | **High** | P1 |
| T-EXFIL-001 | Medium | High | **High** | P1 |
| T-IMPACT-002 | High | Medium | **High** | P1 |
| T-EVADE-001 | High | Medium | **Medium** | P2 |
| T-ACCESS-001 | Low | High | **Medium** | P2 |
| T-ACCESS-002 | Low | High | **Medium** | P2 |
| T-PERSIST-002 | Low | High | **Medium** | P2 |
### 5.2 Critical Path Attack Chains
**Attack Chain 1: Skill-Based Data Theft**
```
T-PERSIST-001 → T-EVADE-001 → T-EXFIL-003
(Publish malicious skill) → (Evade moderation) → (Harvest credentials)
```
**Attack Chain 2: Prompt Injection to RCE**
```
T-EXEC-001 → T-EXEC-004 → T-IMPACT-001
(Inject prompt) → (Bypass exec approval) → (Execute commands)
```
**Attack Chain 3: Indirect Injection via Fetched Content**
```
T-EXEC-002 → T-EXFIL-001 → External exfiltration
(Poison URL content) → (Agent fetches & follows instructions) → (Data sent to attacker)
```
---
## 6. Recommendations Summary
### 6.1 Immediate (P0)
| ID | Recommendation | Addresses |
| ----- | ------------------------------------------- | -------------------------- |
| R-001 | Complete VirusTotal integration | T-PERSIST-001, T-EVADE-001 |
| R-002 | Implement skill sandboxing | T-PERSIST-001, T-EXFIL-003 |
| R-003 | Add output validation for sensitive actions | T-EXEC-001, T-EXEC-002 |
### 6.2 Short-term (P1)
| ID | Recommendation | Addresses |
| ----- | ---------------------------------------- | ------------ |
| R-004 | Implement rate limiting | T-IMPACT-002 |
| R-005 | Add token encryption at rest | T-ACCESS-003 |
| R-006 | Improve exec approval UX and validation | T-EXEC-004 |
| R-007 | Implement URL allowlisting for web_fetch | T-EXFIL-001 |
### 6.3 Medium-term (P2)
| ID | Recommendation | Addresses |
| ----- | ----------------------------------------------------- | ------------- |
| R-008 | Add cryptographic channel verification where possible | T-ACCESS-002 |
| R-009 | Implement config integrity verification | T-PERSIST-003 |
| R-010 | Add update signing and version pinning | T-PERSIST-002 |
---
## 7. Appendices
### 7.1 ATLAS Technique Mapping
| ATLAS ID | Technique Name | OpenClaw Threats |
| ------------- | ------------------------------ | ---------------------------------------------------------------- |
| AML.T0006 | Active Scanning | T-RECON-001, T-RECON-002 |
| AML.T0009 | Collection | T-EXFIL-001, T-EXFIL-002, T-EXFIL-003 |
| AML.T0010.001 | Supply Chain: AI Software | T-PERSIST-001, T-PERSIST-002 |
| AML.T0010.002 | Supply Chain: Data | T-PERSIST-003 |
| AML.T0031 | Erode AI Model Integrity | T-IMPACT-001, T-IMPACT-002, T-IMPACT-003 |
| AML.T0040 | AI Model Inference API Access | T-ACCESS-001, T-ACCESS-002, T-ACCESS-003, T-DISC-001, T-DISC-002 |
| AML.T0043 | Craft Adversarial Data | T-EXEC-004, T-EVADE-001, T-EVADE-002 |
| AML.T0051.000 | LLM Prompt Injection: Direct | T-EXEC-001, T-EXEC-003 |
| AML.T0051.001 | LLM Prompt Injection: Indirect | T-EXEC-002 |
### 7.2 Key Security Files
| Path | Purpose | Risk Level |
| ----------------------------------- | --------------------------- | ------------ |
| `src/infra/exec-approvals.ts` | Command approval logic | **Critical** |
| `src/gateway/auth.ts` | Gateway authentication | **Critical** |
| `src/web/inbound/access-control.ts` | Channel access control | **Critical** |
| `src/infra/net/ssrf.ts` | SSRF protection | **Critical** |
| `src/security/external-content.ts` | Prompt injection mitigation | **Critical** |
| `src/agents/sandbox/tool-policy.ts` | Tool policy enforcement | **Critical** |
| `convex/lib/moderation.ts` | ClawHub moderation | **High** |
| `convex/lib/skillPublish.ts` | Skill publishing flow | **High** |
| `src/routing/resolve-route.ts` | Session isolation | **Medium** |
### 7.3 Glossary
| Term | Definition |
| -------------------- | --------------------------------------------------------- |
| **ATLAS** | MITRE's Adversarial Threat Landscape for AI Systems |
| **ClawHub** | OpenClaw's skill marketplace |
| **Gateway** | OpenClaw's message routing and authentication layer |
| **MCP** | Model Context Protocol - tool provider interface |
| **Prompt Injection** | Attack where malicious instructions are embedded in input |
| **Skill** | Downloadable extension for OpenClaw agents |
| **SSRF** | Server-Side Request Forgery |
---
_This threat model is a living document. Report security issues to security@openclaw.ai_

View File

@@ -0,0 +1,171 @@
---
permalink: /security/formal-verification/
summary: 针对 OpenClaw 最高风险路径的机器检查安全模型。
title: 形式化验证(安全模型)
x-i18n:
generated_at: "2026-02-03T07:54:04Z"
model: claude-opus-4-5
provider: pi
source_hash: 8dff6ea41a37fb6b870424e4e788015c3f8a6099075eece5dbf909883c045106
source_path: security/formal-verification.md
workflow: 15
---
# 形式化验证(安全模型)
本页跟踪 OpenClaw 的**形式化安全模型**(目前是 TLA+/TLC根据需要会添加更多
> 注意:一些较旧的链接可能引用了以前的项目名称。
**目标(北极星):** 提供机器检查的论证,证明 OpenClaw 在明确假设下执行其
预期的安全策略(授权、会话隔离、工具门控和
配置错误安全)。
**目前是什么:** 一个可执行的、攻击者驱动的**安全回归测试套件**
- 每个声明都有一个在有限状态空间上运行的模型检查。
- 许多声明有一个配对的**负面模型**,为现实的 bug 类别生成反例追踪。
**目前还不是什么:** 证明"OpenClaw 在所有方面都是安全的"或完整 TypeScript 实现是正确的。
## 模型存放位置
模型维护在一个单独的仓库中:[vignesh07/openclaw-formal-models](https://github.com/vignesh07/openclaw-formal-models)。
## 重要注意事项
- 这些是**模型**,不是完整的 TypeScript 实现。模型和代码之间可能存在偏差。
- 结果受 TLC 探索的状态空间限制;"绿色"并不意味着在建模的假设和边界之外也是安全的。
- 一些声明依赖于明确的环境假设(例如,正确的部署、正确的配置输入)。
## 复现结果
目前,结果通过在本地克隆模型仓库并运行 TLC 来复现(见下文)。未来的迭代可能提供:
- 带有公开产物(反例追踪、运行日志)的 CI 运行模型
- 用于小型、有界检查的托管"运行此模型"工作流
开始使用:
```bash
git clone https://github.com/vignesh07/openclaw-formal-models
cd openclaw-formal-models
# 需要 Java 11+TLC 在 JVM 上运行)。
# 仓库内置了固定版本的 `tla2tools.jar`TLA+ 工具)并提供 `bin/tlc` + Make 目标。
make <target>
```
### Gateway 网关暴露和开放 Gateway 网关配置错误
**声明:** 在没有认证的情况下绑定到 loopback 之外可能使远程入侵成为可能 / 增加暴露;令牌/密码可以阻止未认证的攻击者(根据模型假设)。
- 绿色运行:
- `make gateway-exposure-v2`
- `make gateway-exposure-v2-protected`
- 红色(预期):
- `make gateway-exposure-v2-negative`
另见:模型仓库中的 `docs/gateway-exposure-matrix.md`
### Nodes.run 管道(最高风险能力)
**声明:** `nodes.run` 需要a节点命令允许列表加上声明的命令以及b配置时的实时批准批准被令牌化以防止重放在模型中
- 绿色运行:
- `make nodes-pipeline`
- `make approvals-token`
- 红色(预期):
- `make nodes-pipeline-negative`
- `make approvals-token-negative`
### 配对存储(私信门控)
**声明:** 配对请求遵守 TTL 和待处理请求上限。
- 绿色运行:
- `make pairing`
- `make pairing-cap`
- 红色(预期):
- `make pairing-negative`
- `make pairing-cap-negative`
### 入站门控(提及 + 控制命令绕过)
**声明:** 在需要提及的群组上下文中,未授权的"控制命令"不能绕过提及门控。
- 绿色:
- `make ingress-gating`
- 红色(预期):
- `make ingress-gating-negative`
### 路由/会话键隔离
**声明:** 来自不同对等方的私信不会折叠到同一会话中,除非明确链接/配置。
- 绿色:
- `make routing-isolation`
- 红色(预期):
- `make routing-isolation-negative`
## v1++:额外的有界模型(并发、重试、追踪正确性)
这些是后续模型,围绕真实世界的故障模式(非原子更新、重试和消息扇出)提高保真度。
### 配对存储并发 / 幂等性
**声明:** 配对存储应该在交错情况下也强制执行 `MaxPending` 和幂等性(即"检查然后写入"必须是原子/加锁的;刷新不应创建重复项)。
这意味着:
- 在并发请求下,你不能超过渠道的 `MaxPending`
- 对同一 `(channel, sender)` 的重复请求/刷新不应创建重复的活跃待处理行。
- 绿色运行:
- `make pairing-race`(原子/加锁的上限检查)
- `make pairing-idempotency`
- `make pairing-refresh`
- `make pairing-refresh-race`
- 红色(预期):
- `make pairing-race-negative`(非原子 begin/commit 上限竞争)
- `make pairing-idempotency-negative`
- `make pairing-refresh-negative`
- `make pairing-refresh-race-negative`
### 入站追踪关联 / 幂等性
**声明:** 摄入应在扇出过程中保持追踪关联,并在提供商重试下保持幂等。
这意味着:
- 当一个外部事件变成多个内部消息时,每个部分都保持相同的追踪/事件标识。
- 重试不会导致重复处理。
- 如果提供商事件 ID 缺失,去重回退到安全键(例如,追踪 ID以避免丢弃不同的事件。
- 绿色:
- `make ingress-trace`
- `make ingress-trace2`
- `make ingress-idempotency`
- `make ingress-dedupe-fallback`
- 红色(预期):
- `make ingress-trace-negative`
- `make ingress-trace2-negative`
- `make ingress-idempotency-negative`
- `make ingress-dedupe-fallback-negative`
### 路由 dmScope 优先级 + identityLinks
**声明:** 路由必须默认保持私信会话隔离,只有在明确配置时才折叠会话(渠道优先级 + 身份链接)。
这意味着:
- 渠道特定的 dmScope 覆盖必须优先于全局默认值。
- identityLinks 应该只在明确链接的组内折叠,而不是跨不相关的对等方。
- 绿色:
- `make routing-precedence`
- `make routing-identitylinks`
- 红色(预期):
- `make routing-precedence-negative`
- `make routing-identitylinks-negative`