Ignore file-like domain suffixes in terminal sandbox #307780

Merged
dileepyavan merged 2 commits into microsoft:main from dileepyavan:DileepY/307510
Apr 4, 2026
@dileepyavan dileepyavan commented Apr 4, 2026

Summary

  • ignore hostname matches whose final label looks like a common file extension when normalizing candidate domains in the terminal sandbox service
  • reduce false positives from commands that reference local files such as .js, .json, .php, .java, .css, .md, and .env
  • add targeted regressions covering direct filename-like matches and commands that include filenames so they remain sandboxed
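
The filtering idea in the summary can be sketched as follows. This is a minimal illustration, not the PR's code: the set below contains only the extensions named in the summary, and `looksLikeFileName` and `filterCandidateDomains` are hypothetical helper names.

```typescript
// Assumption: only the extensions named in the PR summary; the real set may be larger.
const fileExtensionSuffixes = new Set(['js', 'json', 'php', 'java', 'css', 'md', 'env']);

// Returns true when the text after the last dot is a known file extension.
// Lowercasing here is an assumption made for this sketch.
function looksLikeFileName(host: string): boolean {
	const lastLabel = host.slice(host.lastIndexOf('.') + 1).toLowerCase();
	return fileExtensionSuffixes.has(lastLabel);
}

// Drops hostname-like candidates that are more plausibly local file names.
function filterCandidateDomains(candidates: string[]): string[] {
	return candidates.filter(candidate => !looksLikeFileName(candidate));
}

console.log(filterCandidateDomains(['bundle.js', 'example.com', 'README.md']));
// → [ 'example.com' ]
```

Only the final label is inspected, so a token such as `api.example.com` is kept while `bundle.js` is dropped.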

Testing

  • ./scripts/test.sh --run src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/terminalSandboxService.test.ts
  • node --experimental-strip-types build/hygiene.ts src/vs/workbench/contrib/terminalContrib/chatAgentTools/common/terminalSandboxService.ts src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/terminalSandboxService.test.ts

fixes #307510

Copilot AI review requested due to automatic review settings April 4, 2026 04:06
@dileepyavan dileepyavan enabled auto-merge (squash) April 4, 2026 04:08

Copilot AI left a comment

Pull request overview

This PR adjusts the terminal sandbox’s domain detection to reduce false positives by ignoring hostname-like matches whose final label resembles a common file extension, and adds regressions to ensure typical filename-containing commands remain sandboxed.

Changes:

  • Add a file-extension suffix filter to domain normalization to ignore filename-like “domains”.
  • Add tests covering filename-like tokens (e.g. bundle.js, README.md) so they don’t trigger blocked-domain prompts.
  • Add additional test cases for common commands that include filenames.
| File | Description |
| --- | --- |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/common/terminalSandboxService.ts | Adds a suffix-based filter during domain normalization to drop filename-like matches. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/terminalSandboxService.test.ts | Adds regressions ensuring commands with common filename extensions don’t surface blocked domains. |

Copilot's findings

Comments suppressed due to low confidence (2)

src/vs/workbench/contrib/terminalContrib/chatAgentTools/common/terminalSandboxService.ts:583

  • The file-extension filter is implemented inside _normalizeDomain, which is also used to normalize user-configured allow/deny patterns (_matchesDomainPattern) and domains extracted from explicit URLs/ssh remotes. This couples a command-parsing heuristic (avoid filename false positives) to configuration validation and URL parsing, and can unexpectedly reject otherwise-valid allow/deny patterns or ignore URLs.

To keep behavior scoped, consider moving this check to _extractDomains only for _hostRegex matches (bare hostname-like tokens), or add a parameter so URL/setting normalization isn’t affected.

		// Disallow patterns that look like file names with common extensions, as these are unlikely to be intended as network domains and may be false positives from the regex.
		const lastLabel = host.slice(host.lastIndexOf('.') + 1);
		if (TerminalSandboxService._fileExtensionSuffixes.has(lastLabel)) {
			return undefined;
		}

		return hasWildcardPrefix ? `*.${host}` : host;
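
One way to read the "add a parameter" suggestion is sketched below. It is an assumption-laden illustration rather than the service's actual code: `normalizeDomain`, `NormalizeOptions`, and `rejectFileLikeSuffix` are hypothetical names, the extension set is a subset, and the surrounding normalization is reduced to trimming, lowercasing, and wildcard handling.

```typescript
// Assumption: a small subset of extensions, for illustration only.
const fileExtensionSuffixes = new Set(['js', 'json', 'php', 'java', 'css', 'md', 'env']);

// Hypothetical options bag; the real method takes no such parameter in the quoted snippet.
interface NormalizeOptions {
	rejectFileLikeSuffix: boolean;
}

function normalizeDomain(raw: string, options: NormalizeOptions): string | undefined {
	let host = raw.trim().toLowerCase();
	const hasWildcardPrefix = host.startsWith('*.');
	if (hasWildcardPrefix) {
		host = host.slice(2);
	}
	if (host.length === 0) {
		return undefined;
	}
	// Apply the filename heuristic only when the caller asks for it,
	// i.e. for bare hostname-like tokens found in a command line.
	if (options.rejectFileLikeSuffix) {
		const lastLabel = host.slice(host.lastIndexOf('.') + 1);
		if (fileExtensionSuffixes.has(lastLabel)) {
			return undefined;
		}
	}
	return hasWildcardPrefix ? `*.${host}` : host;
}

// Bare token from a command: treated as a file name and dropped.
console.log(normalizeDomain('server.js', { rejectFileLikeSuffix: true }));
// User-configured pattern: kept even though it ends in a file-like suffix.
console.log(normalizeDomain('*.example.md', { rejectFileLikeSuffix: false }));
```

The second call shows the case the comment is worried about: `.md` is also a country-code TLD, so a configured pattern ending in it would be rejected if the heuristic ran unconditionally.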

src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/terminalSandboxService.test.ts:441

  • The .env entry in this test doesn’t actually exercise the new domain-suffix filtering: .env won’t match the domain regexes because there’s no label before the dot. As written, this assertion would pass even without the PR’s change.

Consider replacing it with a filename that does match the hostname regex (e.g. config.env) so the test validates the intended behavior.

		const commands = [
			'node server.js',
			'php index.php',
			'java -jar app.java',
			'cat styles.css',
			'cat README.md',
			'cat .env',
		];
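
The claim about `.env` can be checked directly against the `_hostRegex` pattern quoted in this PR's diff context. The sketch below copies that pattern; `extractHostCandidates` is a hypothetical helper written for this illustration, not a method of the service.

```typescript
// Collects the first capture group of every host-like match in a command string.
function extractHostCandidates(command: string): string[] {
	// Pattern copied from _hostRegex in the diff. A fresh literal per call
	// avoids sharing lastIndex state, which the global flag would otherwise keep.
	const hostRegex = /(?:^|[\s'"`(=])([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})(?::\d+)?(?=(?:\/[^\s'"`|&;<>]*)?(?:$|[\s'"`)\]|,;|&<>]))/gi;
	const found: string[] = [];
	let match: RegExpExecArray | null;
	while ((match = hostRegex.exec(command)) !== null) {
		found.push(match[1]);
	}
	return found;
}

console.log(extractHostCandidates('cat .env'));        // → []
console.log(extractHostCandidates('cat config.env'));  // → [ 'config.env' ]
console.log(extractHostCandidates('node server.js'));  // → [ 'server.js' ]
```

`.env` produces no candidate because the pattern needs at least one character before a dot that is followed by two or more letters, and the only dot in `.env` is its first character. `config.env` does produce a candidate, so it is the kind of input that would exercise the suffix filter.
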
  • Files reviewed: 2/2 changed files
  • Comments generated: 2

@dileepyavan dileepyavan disabled auto-merge April 4, 2026 04:27
private static readonly _urlRegex = /(?:https?|wss?):\/\/[^\s'"`|&;<>]+/gi;
private static readonly _sshRemoteRegex = /(?:^|[\s'"`])(?:[^\s@:'"`]+@)?([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})(?::[^\s'"`|&;<>]+)(?=$|[\s'"`|&;<>])/gi;
private static readonly _hostRegex = /(?:^|[\s'"`(=])([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})(?::\d+)?(?=(?:\/[^\s'"`|&;<>]*)?(?:$|[\s'"`)\]|,;|&<>]))/gi;
private static readonly _fileExtensionSuffixes = new Set([
Collaborator

nit: it's kinda annoying to maintain a hardcoded list like this (unless we know for sure these are all the cases we might have), maybe down the line we can use regex or something of the sort.

Member Author

yeah this is tricky as the valid regex for domain can pick up file names too. This list is to avoid false positives.

justschen previously approved these changes Apr 4, 2026
@dileepyavan dileepyavan merged commit b33353e into microsoft:main Apr 4, 2026
19 checks passed
@vs-code-engineering vs-code-engineering bot added this to the 1.115.0 milestone Apr 4, 2026
Development

Successfully merging this pull request may close these issues.

Agent sandbox incorrectly identifies a file name as a domain name

3 participants