4 changes: 2 additions & 2 deletions docs/architecture.md
@@ -74,7 +74,7 @@ Skill system including the embedded and local skill registries, SKILL.md parser,
| `pipeline` | Build pipeline context and orchestration | `Pipeline`, `Stage`, `BuildContext` |
| `plugins` | Plugin and framework plugin interfaces | `Plugin`, `FrameworkPlugin`, `AgentConfig`, `FrameworkRegistry` |
| `registry` | Embedded skill registry | — |
| `runtime` | LLM agent loop, executor, hooks, memory, guardrails | `AgentExecutor`, `LLMExecutor`, `ToolExecutor` |
| `runtime` | LLM agent loop, executor, hooks, memory, guardrail interface | `AgentExecutor`, `LLMExecutor`, `ToolExecutor`, `GuardrailChecker` |
| `schemas` | Embedded JSON schemas | `agentspec.v1.0.schema.json` |
| `security` | Egress allowlist, security policies, network policies | `EgressConfig`, `Resolve`, `GenerateAllowlistJSON` |
| `skills` | Skill parsing, compilation, requirements resolution | `CompiledSkills`, `Compile`, `WriteArtifacts` |
@@ -98,7 +98,7 @@ Skill system including the embedded and local skill registries, SKILL.md parser,
| `plugins/crewai` | CrewAI framework adapter | — |
| `plugins/langchain` | LangChain framework adapter | — |
| `plugins/custom` | Custom framework plugin | — |
| `runtime` | CLI-specific runtime (subprocess, watchers, stubs, mocks) | |
| `runtime` | CLI-specific runtime (subprocess, guardrail engine, watchers, stubs, mocks) | `LibraryGuardrailEngine` |
| `server` | A2A HTTP server implementation | — |
| `channels` | Channel configuration and routing | — |
| `skills` | Skill file loading and writing | — |
12 changes: 12 additions & 0 deletions docs/commands.md
@@ -38,6 +38,18 @@ forge init [name] [flags]
| `--from-skills` | | | Path to a SKILL.md file for auto-configuration |
| `--non-interactive` | | `false` | Skip interactive prompts |

### Generated Files

`forge init` generates these key files:

| File | Purpose |
|------|---------|
| `forge.yaml` | Agent configuration |
| `guardrails.json` | Guardrail policy config (PII, security, secret patterns, gate config) |
| `SKILL.md` | Agent skill definition |
| `.env` | Environment variables |
| `.gitignore` | Includes `guardrails.json`, `.env`, `.forge/` |

### Examples

```bash
5 changes: 5 additions & 0 deletions docs/configuration.md
@@ -77,6 +77,8 @@ memory:
keyword_weight: 0.3 # Hybrid search keyword weight
decay_half_life_days: 7 # Temporal decay half-life

guardrails_path: "guardrails.json" # Path to guardrails config (default: "guardrails.json")

schedules: # Recurring scheduled tasks (optional)
- id: "daily-report"
cron: "@daily"
@@ -108,6 +110,9 @@ schedules: # Recurring scheduled tasks (optional)
| `FORGE_CORS_ORIGINS` | Comma-separated CORS allowed origins for A2A server |
| `FORGE_AUTH_URL` | External auth provider URL for token validation |
| `FORGE_AUTH_ORG_ID` | Organization ID sent to external auth provider |
| `FORGE_GUARDRAILS_DB` | MongoDB URI for DB-backed guardrails config + audit |
| `FORGE_AGENT_ID` | Agent identifier for DB guardrails (falls back to `agent_id` in YAML) |
| `FORGE_ORG_ID` | Organization identifier for DB guardrails |
| `FORGE_PASSPHRASE` | Passphrase for encrypted secrets file |

---
1 change: 1 addition & 0 deletions docs/deployment.md
@@ -50,6 +50,7 @@ Every `forge build` generates container-ready artifacts:

| Artifact | Purpose |
|----------|---------|
| `guardrails.json` | Guardrail policy config (copied from project root if present) |
| `Dockerfile` | Container image with minimal attack surface |
| `deployment.yaml` | Kubernetes Deployment manifest |
| `service.yaml` | Kubernetes Service manifest |
1 change: 1 addition & 0 deletions docs/memory.md
@@ -15,6 +15,7 @@ memory:
```

- Sessions are saved as JSON files with atomic writes (temp file + fsync + rename)
- Orphaned tool calls (assistant tool_calls without matching tool results) are stripped on both save and recovery, preventing API rejection errors
- Automatic cleanup of sessions older than 7 days at startup
- Session recovery on subsequent requests (disk snapshot supersedes task history)
- **Session max age** (default 30 minutes): stale sessions are discarded on recovery to prevent poisoned error context from blocking tool retries. When an LLM accumulates repeated tool failures in a session, it may stop retrying altogether. The max age ensures these poisoned sessions expire, giving the agent a fresh start.
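
The orphan-stripping rule in the list above can be sketched in Go. This is a minimal illustration under stated assumptions, not Forge's implementation: the `Message` and `ToolCall` types and the `stripOrphanedToolCalls` function are hypothetical names, and the real session format may differ.

```go
package main

import "fmt"

// ToolCall and Message are hypothetical, simplified types for illustration only.
type ToolCall struct {
	ID   string
	Name string
}

type Message struct {
	Role       string     // "user", "assistant", or "tool"
	Content    string     // text content, may be empty
	ToolCalls  []ToolCall // set on assistant messages that request tools
	ToolCallID string     // set on tool-result messages
}

// stripOrphanedToolCalls removes assistant tool calls that have no matching
// tool-result message anywhere in the history. Assistant messages left with
// neither text nor tool calls are dropped entirely.
func stripOrphanedToolCalls(history []Message) []Message {
	// First pass: collect the IDs of every tool call that received a result.
	answered := make(map[string]bool)
	for _, m := range history {
		if m.Role == "tool" {
			answered[m.ToolCallID] = true
		}
	}

	// Second pass: keep only answered tool calls.
	out := make([]Message, 0, len(history))
	for _, m := range history {
		if m.Role == "assistant" && len(m.ToolCalls) > 0 {
			kept := make([]ToolCall, 0, len(m.ToolCalls))
			for _, tc := range m.ToolCalls {
				if answered[tc.ID] {
					kept = append(kept, tc)
				}
			}
			m.ToolCalls = kept
			if len(kept) == 0 && m.Content == "" {
				continue
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "list files"},
		{Role: "assistant", ToolCalls: []ToolCall{{ID: "call_1", Name: "ls"}}},
		{Role: "tool", ToolCallID: "call_1", Content: "a.txt"},
		{Role: "assistant", ToolCalls: []ToolCall{{ID: "call_2", Name: "cat"}}}, // orphan: no result
	}
	for _, m := range stripOrphanedToolCalls(history) {
		fmt.Printf("%s tool_calls=%d\n", m.Role, len(m.ToolCalls))
	}
}
```

Running the same function on both save and recovery, as the docs describe, means a snapshot written mid-turn can still be replayed without the API rejecting an unanswered call.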
4 changes: 2 additions & 2 deletions docs/runtime.md
@@ -19,7 +19,7 @@ The core agent loop follows a simple pattern:
User message → Memory → LLM → tool_calls? → Execute tools → LLM → ... → text → Done
```

The loop terminates when `FinishReason == "stop"` or `len(ToolCalls) == 0`.
The loop terminates when `len(ToolCalls) == 0`. Tool calls are always executed even if `FinishReason` is `"stop"` — this prevents orphaned function calls that would cause API rejection on session recovery.
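
The revised termination rule can be sketched as follows. This is a simplified model rather than the actual executor: `Response` and `runLoop` are hypothetical names, and tool execution is reduced to a counter.

```go
package main

import "fmt"

// Response is a hypothetical, simplified LLM response used for illustration.
type Response struct {
	FinishReason string
	ToolCalls    []string
	Text         string
}

// runLoop consumes a scripted sequence of LLM responses and returns the final
// text plus the number of tool calls that were executed.
func runLoop(responses []Response) (string, int) {
	executed := 0
	for _, resp := range responses {
		// Terminate only when the response carries no tool calls.
		// FinishReason is deliberately not consulted here.
		if len(resp.ToolCalls) == 0 {
			return resp.Text, executed
		}
		// Stand-in for real tool execution: every requested call runs,
		// so no tool call is left without a result.
		executed += len(resp.ToolCalls)
	}
	return "", executed
}

func main() {
	text, n := runLoop([]Response{
		{FinishReason: "stop", ToolCalls: []string{"search"}}, // "stop", but tools still run
		{FinishReason: "stop", Text: "done"},
	})
	fmt.Println(text, n)
}
```

Under the old rule the first response would have ended the loop and left `search` unanswered, which is the orphaned-call case the change is meant to prevent.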

### Q&A Nudge Suppression

@@ -245,7 +245,7 @@ For details on session persistence, context window management, compaction, and l

The engine fires hooks at key points in the loop. See [Hooks](hooks.md) for details.

The runner registers five hook groups: logging, audit, progress, global guardrail hooks, and skill guardrail hooks. The global guardrail `AfterToolExec` hook scans tool output for secrets and PII, redacting or blocking before results enter the LLM context. Skill guardrail hooks enforce domain-specific rules declared in `SKILL.md` — blocking commands, redacting output, intercepting capability enumeration probes, and replacing binary-enumerating responses. Skill guardrails are loaded from build artifacts or parsed directly from `SKILL.md` at runtime (no `forge build` required). See [Tool Output Scanning](security/guardrails.md#tool-output-scanning) and [Skill Guardrails](security/guardrails.md#skill-guardrails).
The runner registers five hook groups: logging, audit, progress, global guardrail hooks, and skill guardrail hooks. Global guardrails use the `GuardrailChecker` interface backed by the `github.com/initializ/guardrails` library — the `AfterToolExec` hook scans tool output for secrets and PII, redacting or blocking before results enter the LLM context. Guardrail config is loaded from `guardrails.json` (file mode) or MongoDB (DB mode). Skill guardrail hooks enforce domain-specific rules declared in `SKILL.md` — blocking commands, redacting output, intercepting capability enumeration probes, and replacing binary-enumerating responses. Skill guardrails are loaded from build artifacts or parsed directly from `SKILL.md` at runtime (no `forge build` required). See [Guardrails](security/guardrails.md) for full details.

## Streaming

220 changes: 139 additions & 81 deletions docs/security/guardrails.md
@@ -2,130 +2,186 @@

> Part of [Forge Documentation](../../README.md)

The guardrail engine checks inbound and outbound messages against configurable policy rules.
The guardrail engine validates inbound and outbound messages against configurable policy rules using the `github.com/initializ/guardrails` library.

## Built-in Guardrails
## Architecture

| Guardrail | Direction | Description |
|-----------|-----------|-------------|
| `content_filter` | Inbound + Outbound | Blocks messages containing configured blocked words |
| `no_pii` | Outbound | Detects email, phone, SSNs (with structural validation), and credit cards (with Luhn check) |
| `jailbreak_protection` | Inbound | Detects common jailbreak phrases ("ignore previous instructions", etc.) |
| `no_secrets` | Outbound | Detects API keys, tokens, and private keys (OpenAI, Anthropic, AWS, GitHub, Slack, Telegram, etc.) |
Guardrails are implemented as a `GuardrailChecker` interface in forge-core, with the concrete engine in forge-cli wrapping the external guardrails library. Two operational modes are supported:

| Mode | Config Source | Use Case |
|------|--------------|----------|
| **File mode** (default) | `guardrails.json` in project root | Local development, standalone deployments |
| **DB mode** | MongoDB (`AgentConfig` collection) | Platform deployments with centralized config + audit |

Config source priority, highest first: the `FORGE_GUARDRAILS_DB` environment variable, then `guardrails.json`, then built-in defaults.
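
A minimal sketch of that priority order, assuming a hypothetical `resolveSource` helper (the real engine's function names and return values are not shown in this change):

```go
package main

import (
	"fmt"
	"os"
)

// resolveSource is a hypothetical helper that picks the guardrails config
// source. It takes its inputs as parameters so the decision is easy to test.
func resolveSource(dbURI string, fileExists bool) string {
	switch {
	case dbURI != "":
		return "db" // FORGE_GUARDRAILS_DB is set: DB mode wins
	case fileExists:
		return "file" // guardrails.json is present: file mode
	default:
		return "defaults" // neither is available: built-in defaults
	}
}

func main() {
	_, err := os.Stat("guardrails.json")
	source := resolveSource(os.Getenv("FORGE_GUARDRAILS_DB"), err == nil)
	fmt.Println("guardrails config source:", source)
}
```

The sketch covers only the static priority. It does not model the runtime fallback to file mode when the database is unreachable.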

## Built-in Evaluators

The guardrails library provides these evaluator categories:

| Category | Direction | Description |
|----------|-----------|-------------|
| PII detection | Inbound + Outbound | Detects email, phone, SSN, credit card numbers |
| Jailbreak detection | Inbound | Detects jailbreak and prompt manipulation attempts |
| Prompt injection | Inbound | Detects injection attacks in user input |
| Command injection | Inbound | Detects shell/command injection patterns |
| Secret detection | Outbound + Tool output | Detects API keys, tokens, and private keys via regex rules |
| Custom rules | Configurable per gate | User-defined regex and keyword rules |

## Modes

| Mode | Behavior |
|------|----------|
| `enforce` | Blocks violating inbound messages; **redacts** outbound messages (see below) |
| `enforce` | Blocks violating inbound messages; **redacts** outbound messages |
| `warn` | Logs violation, allows message to pass |

### Outbound Redaction
### Inbound Masking

Outbound messages (from the agent to the user) are always **redacted** rather than blocked, even in `enforce` mode. Blocking would discard a potentially useful agent response (e.g., code analysis) over a false positive from broad PII/secret patterns matching source code. Matched content is replaced with `[REDACTED]` and a warning is logged.
When PII or secrets are detected in inbound messages with action `mask`, the content is redacted **before** it reaches the LLM. The LLM never sees the original sensitive data.

### PII Validators

To reduce false positives, PII patterns use structural validators beyond simple regex:
### Outbound Redaction

| Pattern | Validator | What it checks |
|---------|-----------|---------------|
| SSN | `validateSSN` | Rejects area=000/666/900+, group=00, serial=0000, all-same digits, known test SSNs |
| Credit card | `validateLuhn` | Luhn checksum validation, 13-19 digit length check |
| Email | — | Regex only |
| Phone | — | Regex only (area code 2-9, separators required) |
Outbound messages (from the agent to the user) are always **redacted** rather than blocked, even in `enforce` mode. Blocking would discard a potentially useful agent response over a false positive. Matched content is replaced with the library's masked output and a warning is logged.

## Configuration

Guardrails are defined in the policy scaffold, loaded from `policy-scaffold.json` or generated during `forge build`.
### `guardrails.json`

Custom guardrail rules can be added to the policy scaffold:
Guardrails are configured in `guardrails.json` at the project root. This file is generated by `forge init` and can be customized:

```json
{
"guardrails": {
"content_filter": {
"mode": "enforce",
"blocked_words": ["password", "credit card"]
},
"no_pii": {
"mode": "enforce"
"pii": {
"enabled": true,
"action": "mask",
"categories": {
"email": { "enabled": true, "action": "mask" },
"phoneNumber": { "enabled": true, "action": "mask" },
"ssn": { "enabled": true, "action": "mask" },
"creditCard": { "enabled": true, "action": "mask" }
}
},
"security": {
"jailbreakDetection": {
"enabled": true,
"confidenceThreshold": 25,
"action": "block"
},
"jailbreak_protection": {
"mode": "warn"
"promptInjection": {
"enabled": true,
"confidenceThreshold": 30,
"action": "block"
},
"no_secrets": {
"mode": "enforce"
"commandInjection": {
"enabled": true,
"confidenceThreshold": 35,
"action": "block"
}
},
"customRules": {
"rules": [
{
"id": "secret_openai",
"name": "OpenAI API Key",
"type": "regex",
"constraint": "hard",
"pattern": "sk-[A-Za-z0-9]{20,}",
"action": "mask",
"gates": ["output", "tool_call"]
}
]
},
"gateConfig": {
"inputGate": true,
"toolCallGate": true,
"outputGate": true,
"contextGate": false,
"streamGate": false
}
}
```

### Custom Path

Override the guardrails config file path in `forge.yaml`:

```yaml
guardrails_path: "config/my-guardrails.json"
```

### Default Secret Patterns

The default `guardrails.json` includes regex rules for these secret types:

| Rule ID | Pattern |
|---------|---------|
| `secret_anthropic` | `sk-ant-[A-Za-z0-9\-]{20,}` |
| `secret_openai` | `sk-[A-Za-z0-9]{20,}` |
| `secret_github_pat` | `ghp_[A-Za-z0-9]{36}` |
| `secret_github_oauth` | `gho_[A-Za-z0-9]{36}` |
| `secret_github_server` | `ghs_[A-Za-z0-9]{36}` |
| `secret_github_fine` | `github_pat_[A-Za-z0-9_]{22,}` |
| `secret_aws` | `AKIA[0-9A-Z]{16}` |
| `secret_slack_bot` | `xoxb-[0-9]{10,}-[A-Za-z0-9-]+` |
| `secret_slack_user` | `xoxp-[0-9]{10,}-[A-Za-z0-9-]+` |
| `secret_private_key` | `-----BEGIN (RSA\|EC\|OPENSSH\|PRIVATE) .*KEY-----` |
| `secret_telegram` | `[0-9]{8,10}:[A-Za-z0-9_-]{35,}` |
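
To show how regex rules like these can mask secrets, here is a small Go sketch using a subset of the patterns from the table. It is an illustration only: `maskSecrets` and the `[MASKED]` placeholder are assumptions, and the guardrails library's actual masked output may look different.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretRules holds a subset of the rule IDs and patterns listed above.
// More specific rules come first so they are reported under their own ID.
var secretRules = []struct {
	ID      string
	Pattern *regexp.Regexp
}{
	{"secret_anthropic", regexp.MustCompile(`sk-ant-[A-Za-z0-9\-]{20,}`)},
	{"secret_openai", regexp.MustCompile(`sk-[A-Za-z0-9]{20,}`)},
	{"secret_github_pat", regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`)},
	{"secret_aws", regexp.MustCompile(`AKIA[0-9A-Z]{16}`)},
}

// maskSecrets replaces every match with a placeholder and returns the masked
// text together with the IDs of the rules that fired.
// The "[MASKED]" placeholder is an assumption made for this sketch.
func maskSecrets(text string) (string, []string) {
	var hits []string
	for _, r := range secretRules {
		if r.Pattern.MatchString(text) {
			hits = append(hits, r.ID)
			text = r.Pattern.ReplaceAllString(text, "[MASKED]")
		}
	}
	return text, hits
}

func main() {
	masked, hits := maskSecrets("aws key: AKIAABCDEFGHIJKLMNOP")
	fmt.Println(masked)
	fmt.Println(hits)
}
```

Go's `regexp` package uses RE2 syntax, which accepts all of the patterns in the table as written.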

### Gate Configuration

Gates control which evaluation points are active:

| Gate | Default | Description |
|------|---------|-------------|
| `inputGate` | `true` | Validates user messages before LLM processing |
| `toolCallGate` | `true` | Validates tool arguments before execution |
| `outputGate` | `true` | Validates agent responses before delivery |
| `contextGate` | `false` | Validates context window content |
| `streamGate` | `false` | Validates streaming chunks |
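
The `gateConfig` block maps naturally onto a Go struct. The sketch below is a hypothetical reader for that section, not the library's own types: `GateConfig`, `defaultGates`, and `parseGates` are illustrative names, and applying the table's defaults to omitted keys is an assumption about the intended behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GateConfig mirrors the gateConfig keys shown in guardrails.json.
type GateConfig struct {
	InputGate    bool `json:"inputGate"`
	ToolCallGate bool `json:"toolCallGate"`
	OutputGate   bool `json:"outputGate"`
	ContextGate  bool `json:"contextGate"`
	StreamGate   bool `json:"streamGate"`
}

// guardrailsFile models only the part of guardrails.json this sketch needs.
type guardrailsFile struct {
	GateConfig GateConfig `json:"gateConfig"`
}

// defaultGates returns the defaults from the table above.
func defaultGates() GateConfig {
	return GateConfig{InputGate: true, ToolCallGate: true, OutputGate: true}
}

// parseGates decodes the gateConfig section. Because decoding starts from the
// defaults, keys missing from the JSON keep their default values.
func parseGates(data []byte) (GateConfig, error) {
	f := guardrailsFile{GateConfig: defaultGates()}
	if err := json.Unmarshal(data, &f); err != nil {
		return GateConfig{}, err
	}
	return f.GateConfig, nil
}

func main() {
	gates, err := parseGates([]byte(`{"gateConfig": {"outputGate": false, "streamGate": true}}`))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", gates)
}
```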

## DB Mode (Platform Deployments)

When `FORGE_GUARDRAILS_DB` is set to a MongoDB connection URI, the engine loads guardrails config from the `AgentConfig` collection and enables audit logging.

```bash
export FORGE_GUARDRAILS_DB="mongodb://localhost:27017"
export FORGE_AGENT_ID="my-agent"
export FORGE_ORG_ID="my-org"
forge run
```

The library queries `AgentConfig` with `{agent_id, org_id}` to load the `StructuredGuardrails` config. If the DB is unreachable, it falls back to file mode.

| Environment Variable | Description |
|---------------------|-------------|
| `FORGE_GUARDRAILS_DB` | MongoDB connection URI |
| `FORGE_AGENT_ID` | Agent identifier (falls back to `agent_id` in `forge.yaml`) |
| `FORGE_ORG_ID` | Organization identifier |

## Runtime

```bash
# Default: guardrails enforced (all built-in guardrails active)
# Default: guardrails enforced (all evaluators active)
forge run

# Explicitly disable guardrail enforcement
forge run --no-guardrails
```

All four built-in guardrails (`content_filter`, `no_pii`, `jailbreak_protection`, `no_secrets`) are active by default, even without running `forge build`. Use `--no-guardrails` to opt out.
All configured guardrails are active by default, even without running `forge build`. Use `--no-guardrails` to opt out.

## Tool Output Scanning

The guardrail engine scans tool output via an `AfterToolExec` hook, catching secrets and PII before they enter the LLM context or outbound messages. The hook passes the tool name to enable per-tool exemptions (see [Per-Tool PII Exemptions](#per-tool-pii-exemptions) below).

| Guardrail | What it detects in tool output |
|-----------|-------------------------------|
| `no_secrets` | API keys, tokens, private keys (same patterns as outbound message scanning) |
| `no_pii` | Email addresses, phone numbers, SSNs |
The guardrail engine scans tool output via an `AfterToolExec` hook, catching secrets and PII before they enter the LLM context or outbound messages. The engine calls the library's `OutputGate` with tool metadata attached.

**Behavior by mode:**

| Mode | Behavior |
|------|----------|
| `enforce` | Returns an error identifying the guardrail that triggered (e.g., `"tool output blocked by no_pii guardrail (PII detected in output)"`), blocking the result from entering the LLM context. |
| `warn` | Replaces matched patterns with `[REDACTED]`, logs a warning, and allows the redacted output through |

The hook writes the redacted text back to `HookContext.ToolOutput`, which the agent loop reads after all hooks fire. This is backwards-compatible — existing hooks that don't modify `ToolOutput` leave it unchanged.

### Per-Tool PII Exemptions
| `enforce` | Returns an error identifying the violation, blocking the result from entering the LLM context |
| `warn` | Replaces matched patterns with masked content, logs a warning, and allows the masked output through |

Some tools legitimately return PII as part of their function (e.g., `github_get_user` returning public email addresses). The `allow_tools` config option lets specific tools bypass a guardrail entirely.

```json
{
"guardrails": [
{
"type": "no_pii",
"config": {
"allow_tools": [
"github_get_user",
"github_pr_author_profiles",
"github_stargazer_profiles",
"file_create",
"code_agent_write",
"code_agent_edit",
"cli_execute",
"web_search"
]
}
}
]
}
```

**Key behaviors:**

| Behavior | Detail |
|----------|--------|
| Per-guardrail scope | `allow_tools` on `no_pii` does **not** bypass `no_secrets` — each guardrail has its own allowlist |
| Write tools included | `file_create`, `code_agent_write`, `code_agent_edit`, and `cli_execute` are included because they echo back content the LLM already has or return operational output that may contain incidental PII (e.g., git log author emails) |
| Web search included | `web_search` is included because search results routinely contain names, emails, and other PII that is public web content — blocking these results would make Q&A conversations unusable |
| Default config | The default policy scaffold pre-configures `allow_tools` for GitHub profile tools and write tools |
| Custom overrides | Override via `policy-scaffold.json` to add or remove tools from the allowlist |
The hook writes the redacted text back to `HookContext.ToolOutput`, which the agent loop reads after all hooks fire.
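
The two modes can be sketched as a hook that either returns an error or rewrites the tool output in place. This is a simplified stand-in under stated assumptions: only the `ToolOutput` field name comes from the docs, while `afterToolExec`, the single email pattern, and the `[MASKED]` placeholder are hypothetical and do not reflect the library's `OutputGate` API.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"regexp"
)

// HookContext is a simplified stand-in for the real hook context.
type HookContext struct {
	ToolName   string
	ToolOutput string
}

// emailPattern is a deliberately simple PII pattern used only for this sketch.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// afterToolExec blocks the output in "enforce" mode. In any other mode it
// masks matches in place and logs a warning, so the agent loop reads the
// masked text from ToolOutput once all hooks have fired.
func afterToolExec(hc *HookContext, mode string) error {
	if !emailPattern.MatchString(hc.ToolOutput) {
		return nil // nothing detected, output is left unchanged
	}
	if mode == "enforce" {
		return errors.New("tool output blocked: PII detected in output")
	}
	hc.ToolOutput = emailPattern.ReplaceAllString(hc.ToolOutput, "[MASKED]")
	log.Printf("guardrail warning: masked PII in output of tool %q", hc.ToolName)
	return nil
}

func main() {
	hc := &HookContext{ToolName: "github_get_user", ToolOutput: "contact: a@example.com"}
	if err := afterToolExec(hc, "warn"); err != nil {
		fmt.Println("blocked:", err)
		return
	}
	fmt.Println(hc.ToolOutput)
}
```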

## Path Containment

@@ -241,9 +297,11 @@ The `cli_execute` tool blocks arguments containing `file://` URLs (case-insensit
Guardrail evaluations are logged as structured audit events:

```json
{"ts":"2026-02-28T10:00:00Z","event":"guardrail_check","correlation_id":"a1b2c3d4","fields":{"guardrail":"no_pii","direction":"outbound","result":"blocked"}}
{"ts":"2026-02-28T10:00:00Z","event":"guardrail_check","correlation_id":"a1b2c3d4","fields":{"guardrail":"pii","direction":"inbound","result":"masked"}}
```

In DB mode, the guardrails library writes audit records to MongoDB automatically when `EnableAudit` is set.

See [Security Overview](overview.md) for the full security architecture.

---
2 changes: 1 addition & 1 deletion forge-cli/build/dockerfile_stage.go
@@ -140,7 +140,7 @@ func (s *DockerfileStage) copyProjectSources(bc *pipeline.BuildContext) error {
outDir := bc.Opts.OutputDir

// Individual files to copy
filesToCopy := []string{"forge.yaml"}
filesToCopy := []string{"forge.yaml", "guardrails.json"}
// Include channel config files (e.g. slack-config.yaml, telegram-config.yaml)
if bc.Config != nil {
for _, ch := range bc.Config.Channels {