Skip to content

[refactor] Semantic function clustering: consolidate duplicated value-serialization logic and util-package reimplementations #37451

@github-actions

Description

@github-actions

Overview

This is an automated semantic function-clustering pass over all non-test .go files in pkg/ (886 source files; ~4,000 top-level functions across 26 packages). The codebase is, on the whole, well organized: it already centralizes helpers into dedicated *util packages (stringutil, sliceutil, fileutil, gitutil, typeutil, timeutil, jsonutil, repoutil, errorutil, envutil), follows a clean one-feature-per-file convention (e.g. the 50+ codemod_*.go files, the 21 linter analyzers), and uses _wasm.go build-tag variants idiomatically. Most superficially "duplicate" function names are deliberate: build-tag pairs, per-analyzer run(pass) entry points, linter testdata fixtures, and init().

The findings below are the genuine consolidation opportunities that survived verification — each was confirmed by reading both definitions. The single highest-value item is duplicated import-input serialization logic between pkg/parser and pkg/workflow.

Summary

Metric Value
Source files analyzed (pkg/, excl. tests) 886
Top-level functions cataloged ~4,049
Distinct duplicate-name clusters reviewed ~40
Genuine refactor candidates (verified) 5
Status ⚠️ Minor, low-risk cleanups available

Critical / High-Value Finding

1. Duplicated import-input resolution + serialization (pkg/parserpkg/workflow)

The same logic — dotted-path lookup into an import-inputs map, plus reflection-based normalization of typed slices/maps before JSON marshaling — is implemented twice, independently:

  • pkg/parser/import_field_extractor.go — cleanly decomposed: resolveImportInputPath (:1236), resolveImportInputValue (:1244), formatResolvedImportInputValue (:1262), formatReflectiveImportInputValue (:1275), marshalImportInputValue (:1287), normalizeSliceForImportInput (:1295), normalizeMapForImportInput (:1303).
  • pkg/workflow/expression_extraction.go — a monolithic marshalImportInputValue (:505) and resolveImportInputPath (:554) that inline the identical dotted-path cut, typed-slice→[]any reflection, sorted-key typed-map→map[string]any reflection, and json.Marshal fallback.

The dotted-path resolution is character-for-character the same algorithm (strings.Cut(path, "."), one level of nesting), and the reflection normalization blocks are identical down to sort.Strings(keys).

Recommendation: Extract a single shared implementation (the pkg/parser decomposition is the better-factored version) and have pkg/workflow call into it, or move the primitives into a small shared helper. Watch the signature divergence — parser returns (string, bool) / (any, bool) while workflow returns string / (any, bool).

Impact: High — removes a real maintenance hazard where a fix to import-input handling in one package silently misses the other.

Util-Package Reimplementations

These reimplement logic that already exists in a centralized *util package, without delegating to it.

2. IsCommitSHA reinvents gitutil.IsHexString
  • Location: pkg/cli/spec.go:507
  • Duplicates: pkg/gitutil/gitutil.go:51 (IsHexString) / :63 (IsValidFullSHA)
  • IsCommitSHA inlines len(version) != 40 followed by the exact hex character-class loop found verbatim in gitutil.IsHexString:
for _, char := range version {
    if (char < '0' || char > '9') && (char < 'a' || char > 'f') && (char < 'A' || char > 'F') {
        return false
    }
}
  • Recommendation: Implement as len(version) == 40 && gitutil.IsHexString(version). Note the one behavioral difference: IsCommitSHA accepts uppercase hex, whereas gitutil.IsValidFullSHA is lowercase-only — decide which semantics are intended and converge.
  • Impact: Medium — removes a copy of security-relevant SHA validation.
3. extractIntField near-duplicates typeutil.ParseIntValue
  • Location: pkg/workflow/compiler_experiments.go:226
  • Duplicates: pkg/typeutil/convert.go:48 (ParseIntValue)
  • Both are func(any) (int, bool) with the same int / int64 / uint64 / float64 type switch and overflow guards. extractIntField additionally rejects negative values and non-integral floats.
  • Recommendation: Either extend typeutil with a ParseNonNegativeIntValue (or an option) and delegate, or document why the stricter local variant must diverge. If the extra strictness is incidental, just call typeutil.ParseIntValue.
  • Impact: Medium — needs care because the semantics are intentionally stricter; not a blind merge.
4. getString duplicates typeutil.LookupString
  • Location: pkg/cli/project_command.go:709
  • Duplicates: pkg/typeutil/convert.go (LookupString)
func getString(m map[string]any, key string) string {
    if val, ok := m[key].(string); ok {
        return val
    }
    return ""
}
  • Recommendation: Replace with typeutil.LookupString (which returns (string, bool)); discard the bool at call sites where only the value is needed. Trivial but removes a redundant local helper.
  • Impact: Low.

Lower-Priority / Conceptual Overlaps

5. hasCopilotRequestsWritePermission — parallel implementations over different inputs
  • pkg/workflow/permissions_operations.go:54func(*WorkflowData) bool
  • pkg/cli/workflow_secrets.go:116func(map[string]any) bool

Both answer the same question (does the config grant copilot-requests write?) but over different data models, so this is a softer overlap. Worth a shared predicate only if a common representation is convenient — otherwise leave as-is.

Impact: Low.

String/error classifiers that overlap errorutil but are intentionally broader
  • pkg/cli/codespace.go:23 is403PermissionError overlaps errorutil.IsForbiddenError but adds a "permission denied" substring check and takes a string rather than error.
  • pkg/cli/audit.go:255 isPermissionErrorStr / :267 isPermissionError — a bespoke gh-auth-failure classifier, broader than any single errorutil helper.

These are not clean reimplementations (different contracts/inputs) and are listed only for awareness. No action recommended unless errorutil later grows string-based variants.

What Was Checked and Found Clean

Patterns reviewed and confirmed as good organization (no action)
  • codemod_*.go (50+ files in pkg/cli) — one codemod per file, dispatched via a factory. Textbook one-feature-per-file. ✓
  • Linter analyzers — 21 run(pass *analysis.Pass) entry points, one per analyzer package under pkg/linters/. Same name, unrelated bodies by design. ✓
  • _wasm.go build-tag pairs (pkg/parser, pkg/workflow, pkg/console, pkg/tty) — mutually exclusive native/WASM variants. Idiomatic, not duplication. ✓
  • Existing thin wrappers already delegate correctly: cli/engine_secrets.go splitRepoSlugrepoutil.SplitRepoSlug; cli/git.go findGitRootForPathgitutil.FindGitRootFrom; parser/frontmatter_hash.go parseBoolFromFrontmattertypeutil.ParseBool. ✓
  • pkg/cli/audit_agentic_analysis.go:492 containsAny is a string-substring helper, not a reimplementation of sliceutil.Any (different contract). ✓

Recommended Priority

  1. High: Consolidate the import-input logic shared by pkg/parser and pkg/workflow (Finding 1).
  2. Medium: Route IsCommitSHA through gitutil (Finding 2); reconcile extractIntField with typeutil.ParseIntValue (Finding 3).
  3. Low: Replace getString with typeutil.LookupString (Finding 4); optional shared predicate for Finding 5.

Implementation Checklist

  • Extract shared import-input resolve/serialize helper; point pkg/workflow at it (Finding 1)
  • Reimplement IsCommitSHA via gitutil.IsHexString, reconciling upper/lowercase semantics (Finding 2)
  • Decide whether extractIntField can delegate to typeutil.ParseIntValue or document the divergence (Finding 3)
  • Replace getString with typeutil.LookupString (Finding 4)
  • Run go test ./... and the linter suite after each change

Analysis Metadata

  • Method: ripgrep extraction of all top-level function declarations + cross-file name clustering, followed by manual verification of each candidate against both definitions.
  • Total Go files analyzed: 886 (non-test, pkg/)
  • Genuine refactor candidates: 5 (1 high, 2 medium, 2 low)
  • Analysis date: 2026-06-07

References: §27077516276

Generated by 🔧 Semantic Function Refactoring · 661.7 AIC · ⌖ 15.8 AIC · ⊞ 9.9K ·

  • expires on Jun 9, 2026, 12:19 AM UTC

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions