Add cold-outbound skill — SDR lead gen with Browserbase search and deep research #68
jay-sahnan wants to merge 1 commit into main
Conversation
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Reviewed by Cursor Bugbot for commit 985032a.
```python
def load_batches(tmp_dir: str) -> list[dict]:
    """Load all enrichment batch files and extract company records."""
    pattern = os.path.join(tmp_dir, "cold_enrichment_batch_*.json")
    files = sorted(glob.glob(pattern))
```
Final CSV misses email data from Step 8
High Severity
load_batches only globs for cold_enrichment_batch_*.json, but Step 8 email-generation subagents write their output (with emails and contact columns) to cold_final_batch_*.json. When compile_csv.py is re-run in Step 8 to produce the final CSV, it never reads the final batch files — so the compiled CSV silently omits all personalized emails and contact info. Notably the cleanup code on line 163 does reference cold_final_batch_*.json for deletion, confirming the file pattern exists but is never loaded.
```javascript
}

// Structured mode: parse HTML into company data
console.error(`[smart_fetch] Fetch API succeeded, parsing HTML...`);
```
Log message hardcodes wrong fetch method source
Low Severity
The log message on line 221 hardcodes "Fetch API succeeded" but this code path also executes when content came from the browser fallback. The fetchMethod variable already tracks the actual source. This produces misleading debug output when diagnosing fetch issues.
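A minimal fix is to interpolate the tracked source into the message rather than hardcoding it. A hedged sketch: only the `fetchMethod` variable is named in the report; the helper and its labels are hypothetical.

```javascript
// Hypothetical sketch: derive the log line from the tracked fetch source
// instead of hardcoding "Fetch API". Labels are illustrative.
function fetchLogMessage(fetchMethod) {
  const label = fetchMethod === "browser" ? "Browser fallback" : "Fetch API";
  return `[smart_fetch] ${label} succeeded, parsing HTML...`;
}
```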
```javascript
console.error("[smart_fetch] Falling back to browser...");
content = await fetchWithBrowser(url);
fetchMethod = "browser";
}
```
Raw mode bypassed by HTML content quality heuristics
Medium Severity
tryFetchApi applies HTML-focused quality heuristics (needsBrowserFallback) unconditionally, but the raw flag isn't checked until after the fallback decision in main(). For --raw fetches of sitemap.xml or llms.txt, short content (<500 chars) or low text density in XML triggers a browser fallback. The browser then returns rendered HTML via page.content() instead of raw XML/text, corrupting the output that downstream sitemap parsing depends on.
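One way to fix this is to thread the raw flag into the quality heuristics so `--raw` content is never routed through the browser. A sketch under assumptions: only `needsBrowserFallback`, the raw flag, and the 500-char threshold come from the report; the text-density check and its cutoff are illustrative.

```javascript
// Hypothetical sketch: gate the HTML quality heuristics on the raw flag
// so --raw fetches of sitemap.xml / llms.txt never trigger the browser
// fallback. The 500-char threshold is from the report; the 5% text-density
// cutoff is made up for illustration.
function needsBrowserFallback(content, { raw = false } = {}) {
  if (raw) return false; // raw mode: return bytes as-is, never render
  if (!content || content.length < 500) return true; // suspiciously short page
  const text = content.replace(/<[^>]*>/g, "").trim();
  return text.length / content.length < 0.05; // mostly markup: likely JS-rendered
}
```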
shrey150 left a comment
Structurally need to fix:

- Remove Playwright dep & use Stagehand directly
- Use `bb` instead of custom API wrappers
- Confirm that you're following best skill standards
- Use `.mjs` instead of `.py` files for custom scripts (since `bb` is Node-based)
```
@@ -0,0 +1,96 @@
// Browserbase Search API wrapper for cold outbound lead discovery.
```
Can we use `bb search` instead of needing this script?
```
@@ -0,0 +1,42 @@
#!/usr/bin/env python3
```
Overall, let's not mix and match Python scripts and TS/JS. I'd prefer everything in TypeScript via `.mjs` files; see `/cookie-sync` for an example of how to do this.
```json
"private": true,
"dependencies": {
  "@browserbasehq/sdk": "^2.8.0",
  "playwright": "^1.52.0",
```
Can we avoid Playwright unless needed? Please use Stagehand instead. Note that you end up with a double dependency (Browserbase + Playwright) instead of just Stagehand because of this.
```
@@ -0,0 +1,169 @@
#!/usr/bin/env python3
```
Instead of compiling a CSV from disparate JSON files, it might be better to provide example individual company research in mini Markdown files, then instruct the LLM to compile all Markdown files into a final CSV by generating code on the fly, instead of prescribing a solution via Python this way.

TL;DR: skills should remain readable (i.e. prefer Markdown as the intermediate data format) and not overly prescriptive (i.e. don't provide a set script to run). Open to opinion on the second point based on your benchmarking as to what works best. My guess is the model is smart enough to generate its own approach given a few pointers. Perhaps try auto-researching to eval what works best?
```
@@ -0,0 +1,236 @@
// Smart fetch with Browserbase Fetch API fast-path and Stagehand browser fallback.
```
Again, let's use `bb fetch` here?
```
@@ -0,0 +1,287 @@
---
```
Can you verify in PR description that you've done the following:
- Followed the Agent Skills standard as best as possible (https://agentskills.io/llms-full.txt)
- Compared against popular skills in https://github.com/anthropics/skills
- Observed 5-10 runs and noticed for any steps for which the agent thrashes, distilled learnings from that into the skill, and ran again to verify it doesn't happen anymore
- Checked that the skill performs well on both Claude Code & Codex; ideally also benchmark on OpenCode.
^ may be best to automate this in CI as mentioned!


Summary
Adds a cold-outbound skill that discovers target companies, scores ICP fit, finds contacts, and generates personalized cold emails compiled into a scored CSV.
Note
Medium Risk
Adds a brand-new outbound automation skill plus helper scripts that fetch external web content and write/clean up files under /tmp, so failures could affect lead output quality and local file hygiene, but the changes are isolated to a new skills/cold-outbound directory.

Overview
Introduces a new cold-outbound skill that defines an 8-step pipeline for outbound lead generation: build/confirm a persistent sender company profile, discover targets via Browserbase Search, enrich and ICP-score companies using a Plan→Research→Synthesize workflow, optionally find contacts, then generate personalized emails and compile a final scored CSV.

Adds bundled automation scripts to support the pipeline: `bb_search.ts` (Browserbase Search with rate-limit retry), `bb_smart_fetch.ts` (Fetch API fast-path with Playwright/Browserbase session fallback and optional raw mode for `sitemap.xml`/`llms.txt`), `list_discovery_urls.py` (dedupe discovered URLs by domain), `write_batch.py` (stdin→JSON batch writer), and `compile_csv.py` (merge/dedupe enrichment batches, produce CSV, emit summary, and optionally clean up batch files). Includes reference docs and example profiles to standardize subagent prompts, schemas, and email templates.

Reviewed by Cursor Bugbot for commit 985032a. Bugbot is set up for automated code reviews on this repo.
bb_search.ts(Browserbase Search with rate-limit retry),bb_smart_fetch.ts(Fetch API fast-path with Playwright/Browserbase session fallback and optional raw-mode forsitemap.xml/llms.txt),list_discovery_urls.py(dedupe discovered URLs by domain),write_batch.py(stdin→JSON batch writer), andcompile_csv.py(merge/dedupe enrichment batches, produce CSV, emit summary, and optionally clean up batch files). Includes reference docs and example profiles to standardize subagent prompts, schemas, and email templates.Reviewed by Cursor Bugbot for commit 985032a. Bugbot is set up for automated code reviews on this repo. Configure here.