feat: add DLP scanning to block credential exfiltration in URLs #1288
Add opt-in `--enable-dlp` flag that configures Squid proxy URL regex ACLs to detect and block outbound requests containing credential patterns (GitHub tokens, OpenAI/Anthropic API keys, AWS keys, Slack tokens, etc.) in URLs. This protects against accidental credential leakage via query parameters, path segments, and encoded URL content.

Closes #308

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✅ Coverage Check Passed
📁 Per-file Coverage Changes (3 files)
✨ New Files (1 files)
Smoke Test Results
Overall: PASS
Pull request overview
Adds an opt-in Data Loss Prevention (DLP) capability to the firewall by generating Squid url_regex ACL rules that deny requests containing credential-like patterns in URLs, wired through a new CLI flag and config plumbing.
Changes:
- Add `--enable-dlp` CLI flag and plumb `enableDlp` through `WrapperConfig` → `SquidConfig` → Squid config generation.
- Introduce `src/dlp.ts` with built-in credential regex patterns, a scanner helper, and Squid ACL/rule generation.
- Add unit/integration tests for DLP pattern matching and Squid config integration.
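To make the ACL/rule generation step concrete, here is a minimal sketch of how pattern entries could be turned into Squid directives. It is not the PR's implementation: `DlpPatternSketch`, `examplePatterns`, and `buildDlpSquidConfigSketch` are hypothetical names, and the two regexes are assumptions based only on the prefixes and lengths mentioned in this review. The `dlp_blocked` ACL name and the `aclLines` / `accessRules` return shape do appear in the PR.

```typescript
// Hypothetical shape of a pattern entry; the real type in src/dlp.ts may differ.
interface DlpPatternSketch {
  name: string;
  regex: string; // regex source as it would be written into squid.conf
}

// Illustrative patterns only. Prefixes and lengths follow the review notes
// (ghp_ + 36 chars, AKIA + 16 chars); the character classes are assumptions.
const examplePatterns: DlpPatternSketch[] = [
  { name: 'GitHub Personal Access Token (classic)', regex: 'ghp_[a-zA-Z0-9]{36}' },
  { name: 'AWS Access Key ID', regex: 'AKIA[0-9A-Z]{16}' },
];

function buildDlpSquidConfigSketch(
  patterns: DlpPatternSketch[]
): { aclLines: string[]; accessRules: string[] } {
  // Repeating the same ACL name on several lines makes Squid OR the values,
  // so a URL matching any one pattern matches dlp_blocked.
  const aclLines = patterns.map((p) => `acl dlp_blocked url_regex ${p.regex}`);
  // A single deny rule then covers every pattern.
  const accessRules = ['http_access deny dlp_blocked'];
  return { aclLines, accessRules };
}
```

Keeping all patterns under one ACL name means the access rules stay a single line regardless of how many patterns are added later.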
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/types.ts | Adds enableDlp config fields and documents intended behavior. |
| src/cli.ts | Adds --enable-dlp flag and logs when DLP is enabled; passes into WrapperConfig. |
| src/docker-manager.ts | Passes enableDlp into generateSquidConfig() when writing squid.conf. |
| src/squid-config.ts | Injects generated DLP ACLs and http_access deny dlp_blocked into Squid config output. |
| src/squid-config.test.ts | Adds integration tests asserting presence/placement of DLP rules in generated config. |
| src/dlp.ts | Defines credential patterns and functions to scan strings and generate Squid ACL/rules. |
| src/dlp.test.ts | Adds unit tests for pattern matching and Squid ACL generation. |
Comments suppressed due to low confidence (5)
src/dlp.test.ts:58
- This test includes a fine-grained GitHub PAT-shaped string (`github_pat_...`) as a contiguous literal, which can trigger secret scanning / push protection. Generate it dynamically (split/concatenate) so the full token format never appears verbatim in the repo.
it('should detect GitHub fine-grained PAT (github_pat_)', () => {
const matches = scanForCredentials(
'https://api.example.com/?key=github_pat_1234567890abcdefghijkl_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456'
);
expect(matches).toContain('GitHub Fine-Grained PAT');
src/dlp.test.ts:44
- This test includes a GitHub token-shaped string (`ghs_` + 36 chars) as a contiguous literal, which can trigger secret scanning / push protection. Build the token dynamically so the full pattern is not present verbatim in the file.
it('should detect GitHub App installation token (ghs_)', () => {
const matches = scanForCredentials(
'https://api.example.com/?key=ghs_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij'
);
expect(matches).toContain('GitHub App Installation Token');
src/dlp.test.ts:51
- This test includes a GitHub token-shaped string (`ghu_` + 36 chars) as a contiguous literal, which can trigger secret scanning / push protection. Build the token dynamically so the full pattern is not present verbatim in the file.
it('should detect GitHub App user-to-server token (ghu_)', () => {
const matches = scanForCredentials(
'https://api.example.com/?key=ghu_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij'
);
expect(matches).toContain('GitHub App User-to-Server Token');
src/dlp.test.ts:90
- This test includes an AWS Access Key ID-shaped string (`AKIA` + 16 chars) as a contiguous literal, which is frequently flagged by secret scanning / push protection. Construct it dynamically so the full key pattern is not present verbatim in the source.
// AWS
it('should detect AWS access key ID (AKIA)', () => {
const matches = scanForCredentials(
'https://api.example.com/?key=AKIAIOSFODNN7EXAMPLE'
);
expect(matches).toContain('AWS Access Key ID');
src/dlp.test.ts:30
- This test includes a GitHub token-shaped string (`ghp_` + 36 chars) as a contiguous literal. GitHub secret scanning / push protection commonly blocks commits containing these formats even when they are fake. Generate the token at runtime (e.g., via concatenation/repeat) so the full token pattern never appears verbatim in the repository.
it('should detect GitHub personal access token (ghp_)', () => {
const matches = scanForCredentials(
'https://api.example.com/data?token=ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij'
);
expect(matches).toContain('GitHub Personal Access Token (classic)');
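The suggestion repeated in the comments above (build token-shaped strings at runtime) could look roughly like the following. This is a sketch, not code from the PR: `fakeToken` is a hypothetical helper, and splitting the prefix as `'gh' + 'p_'` is one assumed way of keeping the full token shape out of the source file.

```typescript
// Hypothetical test helper: assembles a token-shaped string at runtime so the
// complete token format never appears as a contiguous literal in the source.
function fakeToken(prefix: string, bodyLength: number): string {
  const alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  let body = '';
  for (let i = 0; i < bodyLength; i++) {
    // Cycle through the alphabet; the body is deterministic, not random.
    body += alphabet.charAt(i % alphabet.length);
  }
  return prefix + body;
}

// The prefix itself is split so that it is not a literal match either.
const ghpToken = fakeToken('gh' + 'p_', 36);
const akiaKey = fakeToken('AK' + 'IA', 16);

// In a test, the URL would then be built from the generated value, e.g.:
const exampleUrl = 'https://api.example.com/data?token=' + ghpToken;
```

A deterministic body keeps test failures reproducible; whether this is enough to satisfy a given secret scanner is not something the sketch can guarantee.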
src/dlp.test.ts
    const matches = scanForCredentials(
      'https://api.example.com/data?token=ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij'
    );
    expect(matches).toContain('GitHub Personal Access Token (classic)');
  });

  it('should detect GitHub OAuth token (gho_)', () => {
    const matches = scanForCredentials(
      'https://api.example.com/gho_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij/resource'
    );
    expect(matches).toContain('GitHub OAuth Access Token');
  });

  it('should detect GitHub App installation token (ghs_)', () => {
    const matches = scanForCredentials(
      'https://api.example.com/?key=ghs_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij'
    );
    expect(matches).toContain('GitHub App Installation Token');
  });

  it('should detect GitHub App user-to-server token (ghu_)', () => {
    const matches = scanForCredentials(
      'https://api.example.com/?key=ghu_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij'
    );
    expect(matches).toContain('GitHub App User-to-Server Token');
  });

  it('should detect GitHub fine-grained PAT (github_pat_)', () => {
    const matches = scanForCredentials(
      'https://api.example.com/?key=github_pat_1234567890abcdefghijkl_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456'
src/squid-config.ts
  if (enableDlp) {
    const dlp = generateDlpSquidConfig();
    dlpAclSection = '\n' + dlp.aclLines.join('\n') + '\n';
    dlpAccessSection = '\n' + dlp.accessRules.join('\n') + '\n';
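Because Squid evaluates `http_access` rules in order and stops at the first match, where the DLP deny rule lands relative to the allow rules matters. The sketch below shows one way the two sections from the hunk above could be spliced into a config so the deny comes first. It is an assumption-laden illustration: `assembleSquidConfigSketch` and the surrounding `allowed_domains` rules are hypothetical and are not taken from `src/squid-config.ts`.

```typescript
// Hypothetical assembly function; the real generateSquidConfig() is not shown
// in this review, so the surrounding directives here are placeholders.
function assembleSquidConfigSketch(
  enableDlp: boolean,
  dlp: { aclLines: string[]; accessRules: string[] }
): string {
  let dlpAclSection = '';
  let dlpAccessSection = '';
  if (enableDlp) {
    dlpAclSection = '\n' + dlp.aclLines.join('\n') + '\n';
    dlpAccessSection = '\n' + dlp.accessRules.join('\n') + '\n';
  }
  return [
    'acl allowed_domains dstdomain .example.com', // placeholder allow-list ACL
    dlpAclSection,
    // The DLP deny is emitted before any allow rule, so a URL carrying a
    // credential is rejected even when its domain is on the allow-list.
    dlpAccessSection,
    'http_access allow allowed_domains',
    'http_access deny all',
  ].join('\n');
}
```

With `enableDlp` set to false both sections are empty strings, so the generated config contains no DLP directives at all.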
src/dlp.ts
  // OpenAI
  {
    name: 'OpenAI API Key',
    description: 'OpenAI API key (sk-)',
    regex: 'sk-[a-zA-Z0-9]{20}T3BlbkFJ[a-zA-Z0-9]{20}',
  },
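For context on how an entry like this might be consumed, here is a minimal sketch of a scanner that returns the names of all matching patterns, which is the behaviour the tests in this PR assert on (`expect(matches).toContain(...)`). The function `scanSketch` and the `ScanPatternSketch` type are hypothetical; the real `scanForCredentials` may compile, cache, or normalise differently.

```typescript
// Hypothetical pattern shape, reduced to the fields the scanner needs.
interface ScanPatternSketch {
  name: string;
  regex: string;
}

// The single entry mirrors the OpenAI pattern quoted in the hunk above.
const sketchPatterns: ScanPatternSketch[] = [
  { name: 'OpenAI API Key', regex: 'sk-[a-zA-Z0-9]{20}T3BlbkFJ[a-zA-Z0-9]{20}' },
];

// Returns the name of every pattern whose regex matches anywhere in the URL.
function scanSketch(url: string, patterns: ScanPatternSketch[]): string[] {
  return patterns
    .filter((p) => new RegExp(p.regex).test(url))
    .map((p) => p.name);
}
```

Note that a pattern anchored on a fixed infix such as `T3BlbkFJ` only matches keys that contain it; keys in other formats would pass through this particular entry unmatched.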
src/types.ts
   * Enable Data Loss Prevention (DLP) scanning
   *
   * When true, Squid proxy will block outgoing requests that contain
   * credential-like patterns (API keys, tokens, secrets) in URLs.
   * This protects against accidental credential exfiltration via
   * query parameters, path segments, or encoded URL content.
   *
   * Detected patterns include: GitHub tokens (ghp_, gho_, ghs_, ghu_,
   * github_pat_), OpenAI keys (sk-), Anthropic keys (sk-ant-),
   * AWS access keys (AKIA), Google API keys (AIza), Slack tokens,
   * and generic credential patterns.
Chroot Version Comparison Results
Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Summary
- `--enable-dlp` CLI flag that enables Data Loss Prevention scanning on outbound HTTP/HTTPS traffic
- `src/dlp.ts` module with pattern definitions, credential scanning function, and Squid ACL generation

Test plan
- `sudo awf --enable-dlp --allow-domains example.com -- curl "https://example.com/?token=ghp_testtoken123456789012345678901234"`

Closes #308
🤖 Generated with Claude Code