fix(skill): customize-opencode off by default; promote browser-execute guide to a registered skill #72
Merged
Alezander9 merged 2 commits on May 16, 2026
…e guide to a registered skill

Eval-agent deep-dive on v0.1.6 regressed traces: the only skill registered in browser sessions was upstream's customize-opencode (opencode.json schema authoring), which is pure pollution for browser-driving workflows. Meanwhile the genuinely useful browser-execute-guide.md was not a registered skill, only surfaced because the tool description said 'you MUST Read this file first.'

Two changes:

1) Gate customize-opencode built-in registration on BCODE_ENABLE_CUSTOMIZE_OPENCODE=1 (default off). A user-disk skill of the same name still loads, since the gate runs before disk discovery.

2) Rename packages/bcode-browser/skills/browser-execute-guide.md → browser-execute/SKILL.md, add frontmatter (name: browser-execute, description front-loads 'Use ONLY when calling browser_execute'), and extend discoverSkills to scan <dataDir>/skills/ where the bcode-browser package already materializes first-party skills. The skill now appears in <available_skills> at planning time and is loaded via the skill tool.

Updated browser-execute.txt to instruct 'you MUST use the skill tool first to load the browser-execute skill', which keeps the strong MUST language verbatim per user confirmation that the wording materially improves eval scores.
Per user: env-var gate added unnecessary surface area. The skill teaches opencode.json schema authoring; for BrowserCode that's the wrong product surface, so don't ship it at all. Removes the const + import + registration block (about 17 lines), and the now-orphaned 377-line prompt body. Net diff: -388 lines.
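The first commit adds frontmatter to the renamed SKILL.md so the skill has a name and description to show at planning time. As a rough illustration only, the sketch below parses a header of that shape. The `SkillMeta` type, the `parseFrontmatter` helper, and the exact file layout are assumptions made for this example, not opencode's actual parser; only the `name` and `description` values come from this PR.

```typescript
// Hypothetical shape of the metadata a skill registry might keep per skill.
interface SkillMeta {
  name: string;
  description: string;
}

// Hypothetical parser: reads flat `key: value` pairs between the leading
// `---` fences. Real YAML frontmatter is richer; this sketch handles only
// the two fields mentioned in the commit message.
function parseFrontmatter(markdown: string): SkillMeta | undefined {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return undefined; // no frontmatter block at the top of the file

  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip lines that are not key: value pairs
    fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }

  const name = fields.get("name");
  const description = fields.get("description");
  return name && description ? { name, description } : undefined;
}

// Assumed layout of the new SKILL.md header (field values taken from the PR).
const skillFile = [
  "---",
  "name: browser-execute",
  "description: Use ONLY when calling the browser_execute tool or driving a real browser via the Chrome DevTools Protocol. Required reading before the first browser_execute call in a session.",
  "---",
  "",
  "(guide body)",
].join("\n");

const meta = parseFrontmatter(skillFile);
const plainGuide = parseFrontmatter("# Browser execute guide\n\nNo frontmatter here.");
console.log(meta?.name, plainGuide);
```

Under these assumptions a plain markdown guide with no header yields no metadata at all, which is consistent with the old file having nothing to contribute to <available_skills>.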
Summary
Eval-agent's deep-dive on v0.1.6 regression traces (gemini-3-flash-preview pooled n=296, -7.5pp vs v0.1.2 baseline) surfaced two skill-registration problems on browser sessions:

- customize-opencode is force-registered in every session. The skill ships from upstream (opencode-meta → customize-opencode, #26617 / #26899) and teaches the agent how to author opencode.json, opencode plugins, and opencode agent files. For BrowserCode browser-driving workflows it is pure system-prompt pollution: ~480 chars the model has to evaluate-and-discard every turn, with a negatively-correlated name (opencode ≠ what the binary is called now).
- browser-execute-guide.md is not a registered skill. It sits at <dataDir>/skills/browser-execute-guide.md as a plain markdown file. The only reason agents find it is the browser_execute tool description's hard requirement: "you MUST use the Read tool first to read …/browser-execute-guide.md". Eval data shows 77% of tasks do eventually read it, but it has no presence in <available_skills> at planning time, so the agent only discovers it after committing to browser_execute.

Changes
- Gate customize-opencode registration on BCODE_ENABLE_CUSTOMIZE_OPENCODE=1 (default off). The built-in registration in packages/opencode/src/skill/index.ts is now wrapped in an env check. When unset / falsy, the skill is not registered at all: it doesn't appear in <available_skills> and the skill tool can't load it. The gate runs before disk discovery, so a user who places their own customize-opencode/SKILL.md on disk still gets it. Opt-in via BCODE_ENABLE_CUSTOMIZE_OPENCODE=1 for sessions that are genuinely editing bcode.json or agent configs.
- Promote browser-execute-guide.md to a real registered skill.
  - packages/bcode-browser/skills/browser-execute-guide.md → packages/bcode-browser/skills/browser-execute/SKILL.md (matches opencode's **/SKILL.md discovery glob).
  - Frontmatter added (name: browser-execute); the description front-loads "Use ONLY when calling the browser_execute tool or driving a real browser via the Chrome DevTools Protocol. Required reading before the first browser_execute call in a session."
  - discoverSkills extended to scan the bcode-shipped skills materialization dir (<dataDir>/skills/) so the skill auto-registers without users needing a bcode.json entry.
  - browser-execute.txt tool description updated: "you MUST use the skill tool first to load the browser-execute skill". This keeps the strong MUST wording verbatim (the eval-agent confirmed it materially improves scores).

Why this should help eval scores
- Agents see browser-execute in <available_skills> before they ever pick the tool, instead of discovering the guide reactively after the first browser_execute call.
- One relevant skill (browser-execute) instead of one irrelevant one (customize-opencode).
- Same MUST read first prompting on browser_execute, now routed through the canonical skill-loading path.

Diff size
8 files, +46 / -17. Yellow-zone modifications in packages/opencode/src/skill/index.ts (gate + new scan, ~20 lines net) and packages/opencode/src/tool/browser-execute.txt (1 line). Everything else is in packages/bcode-browser/ (Green zone). Logged in maintainer-side EXCEPTIONS.md.

Test plan
- bun typecheck from packages/opencode/ ✓
- bun typecheck from packages/bcode-browser/ ✓
- bun typecheck from repo root (filtered) ✓
- bun test in packages/bcode-browser/ ✓ (13 pass, 8 skip; the 8 skips are smoke tests gated on BCODE_SMOKE_CHROME=1)
- bun test test/skill in packages/opencode/: 14 pass, 5 fail. The 5 failures are pre-existing on main (verified by stashing my changes) and unrelated to this PR.

Suggested next eval: re-run glm-5.1 and gemini-3-flash-preview baselines on a build of this branch and compare to v0.1.6.

Summary by cubic
Removed the built-in customize-opencode skill and promoted the browser execution guide to a registered browser-execute skill. This cleans up prompt noise and makes the guide visible at planning time; browser_execute now instructs loading the skill first.

New Features
- Added skills/browser-execute/SKILL.md with frontmatter (name: browser-execute). It auto-registers, appears in available skills, and is loaded via the skill tool. Tool text now says “MUST use the skill tool to load browser-execute.”
- Skill discovery now scans <dataDir>/skills/ so first-party skills shipped by @browser-use/bcode-browser are picked up automatically.

Bug Fixes
- Removed the built-in customize-opencode (no env gate; it no longer ships). A user skill with the same name on disk still loads via normal discovery.

Written for commit a6ac76a.
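The rename in this PR matters because discovery is described as matching a `**/SKILL.md` glob. A minimal sketch of that matching rule, assuming paths are relative to <dataDir>/skills/ and use forward slashes: `isSkillFile` and `skillDirName` are hypothetical helpers written for this illustration, not functions from packages/opencode/src/skill/index.ts, and the real implementation may use a glob library instead.

```typescript
// Hypothetical helper: true when a relative path would match `**/SKILL.md`,
// i.e. its final path segment is exactly "SKILL.md" at any directory depth.
function isSkillFile(relativePath: string): boolean {
  const segments = relativePath.split("/");
  return segments[segments.length - 1] === "SKILL.md";
}

// Hypothetical helper: the directory that directly contains the SKILL.md,
// or undefined when the file sits at the top level of the scanned dir.
function skillDirName(relativePath: string): string | undefined {
  const segments = relativePath.split("/");
  return segments.length >= 2 ? segments[segments.length - 2] : undefined;
}

// Layout before this PR: a plain markdown guide at the top of the skills dir.
const oldLayout = ["browser-execute-guide.md"];
// Layout after this PR: the same guide under its own directory as SKILL.md.
const newLayout = ["browser-execute/SKILL.md"];

const foundBefore = oldLayout.filter(isSkillFile);
const foundAfter = newLayout.filter(isSkillFile);

console.log("matches before rename:", foundBefore.length);
console.log("matches after rename:", foundAfter.length);
```

With this rule the old file name never matches regardless of where it is placed, so scanning <dataDir>/skills/ only helps once the guide is renamed to SKILL.md.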