
Optimize OSV vulnerability scanning #44684

Open
ksykulev wants to merge 5 commits into main from 44391-osv-optimization

Conversation


@ksykulev ksykulev commented May 4, 2026

Related issue: Resolves #44391

Checklist for submitter

If some of the following don't apply, delete the relevant line.

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.
    See Changes files for more information.
  • Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements), JS inline code is prevented especially for url redirects, and untrusted data interpolated into shell scripts/commands is validated against shell metacharacters.

Testing

I tested this locally:
  • number of Ubuntu hosts: 6,252
  • average software per host: 2,302
  • distinct software items: 61,213
  • host_software rows: 14.4M
  • generated software_cve rows: 305,826
  • OS sub-versions: 25

The optimization reduced the scan time from 10m53s to 4m26s.

Summary by CodeRabbit

  • Chores

    • Optimized OSV vulnerability scanning to aggregate work by OS version and batch lookups, reducing redundant queries for faster scans.
  • Refactor

    • Restructured scanning flow to process OS versions in batched chunks with clearer logging and early exits when no work is required.
  • Tests

    • Added tests for querying, batching, source filtering, deduplication, and empty-input behaviors.

Copilot AI review requested due to automatic review settings May 4, 2026 18:23
@ksykulev ksykulev requested a review from a team as a code owner May 4, 2026 18:23

@claude claude Bot left a comment


Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.


coderabbitai Bot commented May 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c9da2f0d-3e8e-4cbd-8f01-02610cce5aaa

📥 Commits

Reviewing files that changed from the base of the PR and between 172bd43 and 372aa13.

📒 Files selected for processing (1)
  • changes/44391-osv-vuln-optimizations
✅ Files skipped from review due to trivial changes (1)
  • changes/44391-osv-vuln-optimizations

Walkthrough

This PR changes OSV vulnerability scanning to operate per OS version instead of per host. It adds two Datastore APIs—ListSoftwareForVulnDetectionByOSVersion and ListSoftwareVulnerabilitiesBySoftwareIDs—with MySQL implementations that batch software and vulnerability queries (configurable batch size). The OSV analyzer was refactored to use a new analyzeOSV helper which lists distinct software by OS version, processes software in chunks with a matcher, computes vulnerability deltas, performs batched deletes/inserts, deduplicates inserts, and optionally returns inserted vulnerabilities. Mock methods and unit tests were added.
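The chunked matching and cross-chunk deduplication described in the walkthrough can be sketched roughly as follows. All types and the `analyzeChunks` helper here are simplified stand-ins for illustration; they are not Fleet's actual `analyzeOSV` code or datastore API:

```go
package main

import "fmt"

// Software and Vuln are toy stand-ins for fleet.Software and
// fleet.SoftwareVulnerability in this sketch.
type Software struct {
	ID      uint
	Name    string
	Version string
}

type Vuln struct {
	SoftwareID uint
	CVE        string
}

// Key uniquely identifies a (software, CVE) pair for deduplication.
func (v Vuln) Key() string { return fmt.Sprintf("%d:%s", v.SoftwareID, v.CVE) }

// analyzeChunks runs the matcher over fixed-size chunks of software and
// collects matched vulnerabilities, deduplicating across chunks with a
// global "seen" map so the same vuln is never queued for insert twice.
func analyzeChunks(software []Software, chunkSize int, match func([]Software) []Vuln) []Vuln {
	seen := make(map[string]struct{})
	var toInsert []Vuln
	for start := 0; start < len(software); start += chunkSize {
		end := start + chunkSize
		if end > len(software) {
			end = len(software)
		}
		for _, v := range match(software[start:end]) {
			if _, ok := seen[v.Key()]; ok {
				continue // already queued in a previous chunk
			}
			seen[v.Key()] = struct{}{}
			toInsert = append(toInsert, v)
		}
	}
	return toInsert
}

func main() {
	sw := []Software{{1, "libfoo", "1.2.3"}, {2, "libbar", "4.5.6"}, {3, "libbaz", "7.8.9"}}
	// Hypothetical matcher: flags everything except libbaz.
	match := func(chunk []Software) []Vuln {
		var out []Vuln
		for _, s := range chunk {
			if s.Name != "libbaz" {
				out = append(out, Vuln{SoftwareID: s.ID, CVE: "CVE-2026-0001"})
			}
		}
		return out
	}
	fmt.Println(len(analyzeChunks(sw, 2, match)))
}
```

Bounding each matcher call to a chunk keeps peak memory proportional to the chunk size rather than the whole OS-version bucket, while the shared dedupe map preserves insert-once semantics.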

Possibly related PRs

  • fleetdm/fleet#43377: Modified server/vulnerabilities/osv/analyzer.go to add RHEL OSV scanning logic that this change now batches per OS version.
  • fleetdm/fleet#42063: Introduced OSV analyzer/artifact handling and dataflow that was refactored here to use per-OSVersion batching.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly summarizes the main change: optimizing OSV vulnerability scanning performance through the implementation of batched queries grouped by OS version instead of per-host queries.
Description check ✅ Passed The PR description comprehensively addresses the template requirements: links to issue #44391, confirms changes file added, validates secure SQL practices, documents testing including automated tests with host isolation and manual QA, and provides concrete performance metrics showing 10m53s to 4m26s improvement.
Linked Issues check ✅ Passed The PR directly addresses issue #44391's core objective of reducing vulnerability cron runtime. The implementation optimizes OSV scanning from per-host queries to per-OS-version batched queries, reducing redundant database operations and demonstrating 60% performance improvement (10m53s → 4m26s) on large-scale test data matching the issue's ~175K-host scenario.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the optimization objective: new datastore methods for batched vulnerability queries, corresponding interface definitions, mock implementations, comprehensive tests, and refactored analyzer logic. No extraneous modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/vulnerabilities/osv/analyzer.go (1)

128-137: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return the matched vulnerabilities, not just the insert delta.

When collectVulns is true, this now returns only newVulns, so a no-op re-scan yields an empty slice even though found is non-empty. That changes Analyze/AnalyzeRHEL from “collect detected vulns” to “collect newly inserted vulns”.

Suggested fix
-	newVulns, err := ds.InsertSoftwareVulnerabilities(ctx, toInsert, source)
-	if err != nil {
-		return nil, fmt.Errorf("inserting software vulnerabilities: %w", err)
-	}
+	if _, err := ds.InsertSoftwareVulnerabilities(ctx, toInsert, source); err != nil {
+		return nil, fmt.Errorf("inserting software vulnerabilities: %w", err)
+	}
 
 	if !collectVulns {
 		return nil, nil
 	}
 
-	return newVulns, nil
+	return found, nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/vulnerabilities/osv/analyzer.go` around lines 128 - 137, The current
code returns only the insertion delta (newVulns) when collectVulns is true, but
callers expect the matched/detected vulnerabilities (the variable found); change
the logic so that after calling ds.InsertSoftwareVulnerabilities you still
return the detected slice (found) when collectVulns is true instead of newVulns.
In practice, in analyzer.go replace the branch that returns newVulns with a
return of found (or the appropriate merged slice of found items if needed),
keeping the error handling around ds.InsertSoftwareVulnerabilities unchanged and
ensuring the function's return types still match.
🧹 Nitpick comments (4)
server/fleet/datastore.go (1)

683-685: 💤 Low value

Document nil/empty-slice behavior for softwareIDs in the interface contract comment.

The current comment is silent on what callers should expect when softwareIDs is nil or empty. Given that the OSV analyzer calls this method after a potentially empty ListSoftwareForVulnDetectionByOSVersion result, a well-defined contract (e.g., "returns nil, nil when softwareIDs is empty") prevents defensive coding duplication in every caller.

📝 Suggested doc addition
-	// ListSoftwareVulnerabilitiesBySoftwareIDs returns vulnerabilities for the given software IDs
-	// filtered by source. Queries software_cve directly without joining through host_software.
-	ListSoftwareVulnerabilitiesBySoftwareIDs(ctx context.Context, softwareIDs []uint, source VulnerabilitySource) ([]SoftwareVulnerability, error)
+	// ListSoftwareVulnerabilitiesBySoftwareIDs returns vulnerabilities for the given software IDs
+	// filtered by source. Queries software_cve directly without joining through host_software.
+	// Returns nil, nil when softwareIDs is nil or empty.
+	ListSoftwareVulnerabilitiesBySoftwareIDs(ctx context.Context, softwareIDs []uint, source VulnerabilitySource) ([]SoftwareVulnerability, error)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/fleet/datastore.go` around lines 683 - 685, Update the interface
comment for ListSoftwareVulnerabilitiesBySoftwareIDs to explicitly document
nil/empty-slice behavior for the softwareIDs parameter (e.g., state that passing
a nil or zero-length slice returns (nil, nil) or other defined result), so
callers like the OSV analyzer and ListSoftwareForVulnDetectionByOSVersion don’t
need to defensively handle empty inputs; mention the types involved
(softwareIDs, VulnerabilitySource, SoftwareVulnerability) and the chosen
contract (exact return values) so implementers and callers have a clear
expectation.
server/datastore/mysql/software_test.go (1)

12064-12069: ⚡ Quick win

Strengthen OS-version result assertions beyond name-only checks.

Right now the test only asserts software names, so a regression that returns incorrect Version/Source for the same names could pass undetected. Prefer asserting full identity tuples.

Suggested assertion tightening
-	names := make([]string, len(result))
-	for i, sw := range result {
-		names[i] = sw.Name
-	}
-	sort.Strings(names)
-	require.Equal(t, []string{"libbar", "libbaz", "libfoo"}, names)
+	type swKey struct {
+		Name    string
+		Version string
+		Source  string
+	}
+	got := make([]swKey, 0, len(result))
+	for _, sw := range result {
+		got = append(got, swKey{Name: sw.Name, Version: sw.Version, Source: sw.Source})
+	}
+	require.ElementsMatch(t, []swKey{
+		{Name: "libbar", Version: "4.5.6", Source: "deb_packages"},
+		{Name: "libbaz", Version: "7.8.9", Source: "deb_packages"},
+		{Name: "libfoo", Version: "1.2.3", Source: "deb_packages"},
+	}, got)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/datastore/mysql/software_test.go` around lines 12064 - 12069, Current
test only collects sw.Name from result and asserts names; modify it to assert
full identity tuples (Name, Version, Source) to catch regressions in
version/source. Replace the names slice with a slice built from result that
combines sw.Name, sw.Version and sw.Source (e.g., formatted tuple or small
struct), sort that slice, and assert equality against an expected sorted list of
full tuples like {"libbar|1.2.3|apt", ...}. Update the require.Equal call to
compare these full tuples instead of just names, referencing the existing result
and sw variables.
server/datastore/mysql/software.go (1)

3314-3392: Add stage-level timing metrics for these new batch paths.

Given the runtime-regression objective, consider recording per-call metrics (ID fetch duration, batch count, rows returned, vulnerability fetch duration) so first-run vs steady-state behavior can be compared in low-sampling environments.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/datastore/mysql/software.go` around lines 3314 - 3392, Instrument the
two new batch paths to record stage-level timing and counts: in
ListSoftwareForVulnDetectionByOSVersion, measure and emit the duration and row
count for the initial SELECT DISTINCT (the sqlx.SelectContext that produces
softwareIDs), then for each batch inside the common_mysql.BatchProcessSimple
lambda measure batch duration, record batch size and rows returned from the
batchResult query, and emit a cumulative batch count; in
ListSoftwareVulnerabilitiesBySoftwareIDs, likewise instrument each
BatchProcessSimple invocation to time the per-batch SELECT (the
sqlx.SelectContext that fetches from software_cve), record batch size, rows
returned and cumulative batches processed, and include the source
(fleet.VulnerabilitySource) as a tag/label. Use the project’s existing
metrics/telemetry helper on ds (or the common metrics facility used elsewhere in
Datastore) to record durations (histogram) and counts (counter/gauge) with
labels for function name, platform/os_version or source, and batch index so
first-run vs steady-state comparisons are possible; ensure metrics are emitted
on both success and error paths inside the lambdas.
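As a rough illustration of the stage-level timing this comment asks for, the pattern below times one stage and reports its duration with a row count. The `timedStage` helper is invented for this sketch; Fleet's actual metrics and logging helpers may differ:

```go
package main

import (
	"fmt"
	"time"
)

// timedStage is a hypothetical helper that times one stage of a batch
// path and reports its duration together with the number of rows it
// produced, so first-run vs steady-state behavior can be compared.
func timedStage(name string, f func() int) (rows int, d time.Duration) {
	start := time.Now()
	rows = f()
	d = time.Since(start)
	fmt.Printf("stage=%s rows=%d duration=%s\n", name, rows, d)
	return rows, d
}

func main() {
	// Stand-in for the initial SELECT DISTINCT over host_software.
	timedStage("select_distinct_software_ids", func() int {
		time.Sleep(5 * time.Millisecond) // simulate query latency
		return 61213
	})
	// Stand-in for one batched software_cve lookup.
	timedStage("software_cve_batch", func() int {
		time.Sleep(2 * time.Millisecond)
		return 500
	})
}
```

In the real code the printed line would instead feed whatever counter/histogram facility the datastore already uses, labeled with function name, source, and batch index.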
server/vulnerabilities/osv/analyzer.go (1)

101-130: ⚡ Quick win

Add write-side timings to the new bulk-flow log.

The new debug log only measures the read/match phases. If delete/insert is still the dominant cost on large runs, this output will not explain the remaining runtime.

Suggested instrumentation
+	var deleteTime time.Duration
 	// Delete stale vulnerabilities.
 	if len(toDelete) > 0 {
+		deleteStart := time.Now().UTC()
 		toDeleteMap := make(map[string]fleet.SoftwareVulnerability, len(toDelete))
 		for _, v := range toDelete {
 			toDeleteMap[v.Key()] = v
 		}
 		if err := utils.BatchProcess(toDeleteMap, func(v []fleet.SoftwareVulnerability) error {
 			return ds.DeleteSoftwareVulnerabilities(ctx, v)
 		}, vulnBatchSize); err != nil {
 			return nil, fmt.Errorf("deleting stale vulnerabilities: %w", err)
 		}
+		deleteTime = time.Since(deleteStart)
 	}
 
 	// Insert new vulnerabilities.
-	newVulns, err := ds.InsertSoftwareVulnerabilities(ctx, toInsert, source)
+	insertStart := time.Now().UTC()
+	newVulns, err := ds.InsertSoftwareVulnerabilities(ctx, toInsert, source)
 	if err != nil {
 		return nil, fmt.Errorf("inserting software vulnerabilities: %w", err)
 	}
+	insertTime := time.Since(insertStart)
 
 	logger.DebugContext(ctx, "osv analysis completed",
 		"platform", ver.Platform,
 		"version", ver.Version,
 		"distinct_software", len(software),
 		"software_query_time", softwareTime,
 		"match_time", matchTime,
 		"existing_query_time", existingTime,
+		"delete_time", deleteTime,
+		"insert_time", insertTime,
 		"found_vulns", len(found),
 		"existing_vulns", len(existing),
 		"to_insert", len(toInsert),
 		"to_delete", len(toDelete),
 	)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/vulnerabilities/osv/analyzer.go` around lines 101 - 130, The debug log
at logger.DebugContext("osv analysis completed", ...) only includes read/match
timings; measure and log the write-side timings by timing the delete and insert
operations around the existing delete and insert blocks and add those durations
and counts to the same DebugContext output (e.g., capture deleteStart, measure
deleteTime around utils.BatchProcess/ ds.DeleteSoftwareVulnerabilities and
capture insertStart, measure insertTime around
ds.InsertSoftwareVulnerabilities), and include fields like "delete_time",
"delete_count", "insert_time", and "insert_count" (or "to_delete", "to_insert"
equivalents) so the final debug record reflects both read and write costs.
Ensure timings are recorded even on errors so the error paths can surface the
measured durations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@server/vulnerabilities/osv/analyzer.go`:
- Around line 128-137: The current code returns only the insertion delta
(newVulns) when collectVulns is true, but callers expect the matched/detected
vulnerabilities (the variable found); change the logic so that after calling
ds.InsertSoftwareVulnerabilities you still return the detected slice (found)
when collectVulns is true instead of newVulns. In practice, in analyzer.go
replace the branch that returns newVulns with a return of found (or the
appropriate merged slice of found items if needed), keeping the error handling
around ds.InsertSoftwareVulnerabilities unchanged and ensuring the function's
return types still match.

---

Nitpick comments:
In `@server/datastore/mysql/software_test.go`:
- Around line 12064-12069: Current test only collects sw.Name from result and
asserts names; modify it to assert full identity tuples (Name, Version, Source)
to catch regressions in version/source. Replace the names slice with a slice
built from result that combines sw.Name, sw.Version and sw.Source (e.g.,
formatted tuple or small struct), sort that slice, and assert equality against
an expected sorted list of full tuples like {"libbar|1.2.3|apt", ...}. Update
the require.Equal call to compare these full tuples instead of just names,
referencing the existing result and sw variables.

In `@server/datastore/mysql/software.go`:
- Around line 3314-3392: Instrument the two new batch paths to record
stage-level timing and counts: in ListSoftwareForVulnDetectionByOSVersion,
measure and emit the duration and row count for the initial SELECT DISTINCT (the
sqlx.SelectContext that produces softwareIDs), then for each batch inside the
common_mysql.BatchProcessSimple lambda measure batch duration, record batch size
and rows returned from the batchResult query, and emit a cumulative batch count;
in ListSoftwareVulnerabilitiesBySoftwareIDs, likewise instrument each
BatchProcessSimple invocation to time the per-batch SELECT (the
sqlx.SelectContext that fetches from software_cve), record batch size, rows
returned and cumulative batches processed, and include the source
(fleet.VulnerabilitySource) as a tag/label. Use the project’s existing
metrics/telemetry helper on ds (or the common metrics facility used elsewhere in
Datastore) to record durations (histogram) and counts (counter/gauge) with
labels for function name, platform/os_version or source, and batch index so
first-run vs steady-state comparisons are possible; ensure metrics are emitted
on both success and error paths inside the lambdas.

In `@server/fleet/datastore.go`:
- Around line 683-685: Update the interface comment for
ListSoftwareVulnerabilitiesBySoftwareIDs to explicitly document nil/empty-slice
behavior for the softwareIDs parameter (e.g., state that passing a nil or
zero-length slice returns (nil, nil) or other defined result), so callers like
the OSV analyzer and ListSoftwareForVulnDetectionByOSVersion don’t need to
defensively handle empty inputs; mention the types involved (softwareIDs,
VulnerabilitySource, SoftwareVulnerability) and the chosen contract (exact
return values) so implementers and callers have a clear expectation.

In `@server/vulnerabilities/osv/analyzer.go`:
- Around line 101-130: The debug log at logger.DebugContext("osv analysis
completed", ...) only includes read/match timings; measure and log the
write-side timings by timing the delete and insert operations around the
existing delete and insert blocks and add those durations and counts to the same
DebugContext output (e.g., capture deleteStart, measure deleteTime around
utils.BatchProcess/ ds.DeleteSoftwareVulnerabilities and capture insertStart,
measure insertTime around ds.InsertSoftwareVulnerabilities), and include fields
like "delete_time", "delete_count", "insert_time", and "insert_count" (or
"to_delete", "to_insert" equivalents) so the final debug record reflects both
read and write costs. Ensure timings are recorded even on errors so the error
paths can surface the measured durations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1d4e34a5-500c-4c4c-adbb-6dae6014183d

📥 Commits

Reviewing files that changed from the base of the PR and between 6a5c67d and 0bd77e3.

📒 Files selected for processing (6)
  • changes/44391-osv-vuln-optimizations
  • server/datastore/mysql/software.go
  • server/datastore/mysql/software_test.go
  • server/fleet/datastore.go
  • server/mock/datastore_mock.go
  • server/vulnerabilities/osv/analyzer.go


Copilot AI left a comment


Pull request overview

This PR targets the vulnerabilities cron runtime regression by reducing redundant OSV scanning work at scale. It refactors OSV analyzers to operate on distinct software per OS version (instead of per-host) and adds supporting datastore APIs to query software/vulns more directly.

Changes:

  • Refactored Ubuntu/RHEL OSV analysis to query distinct software per OS version and compute a single vuln delta per OS version.
  • Added datastore methods to (1) list distinct software for vuln detection by OS version and (2) list existing vulns by software IDs + source.
  • Added MySQL unit tests for the new datastore methods and a user-visible changes entry.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
server/vulnerabilities/osv/analyzer.go Consolidates OSV analysis to a per-OS-version distinct-software pass and scopes existing vuln lookup by software IDs.
server/mock/datastore_mock.go Extends the datastore mock with the new software/vuln listing APIs.
server/fleet/datastore.go Extends the Datastore interface with the new OS-version and software-ID based methods.
server/datastore/mysql/software.go Implements the new MySQL queries for distinct software-by-OS-version and vulns-by-software-IDs.
server/datastore/mysql/software_test.go Adds coverage for the new MySQL datastore methods.
changes/44391-osv-vuln-optimizations Documents the performance optimization for release notes.



codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 33.08824% with 91 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.69%. Comparing base (6a5c67d) to head (372aa13).
⚠️ Report is 24 commits behind head on main.

Files with missing lines Patch % Lines
server/vulnerabilities/osv/analyzer.go 0.00% 81 Missing ⚠️
server/datastore/mysql/software.go 81.81% 7 Missing and 3 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #44684   +/-   ##
=======================================
  Coverage   66.69%   66.69%           
=======================================
  Files        2651     2651           
  Lines      213559   213605   +46     
  Branches     9610     9610           
=======================================
+ Hits       142424   142472   +48     
+ Misses      58170    58167    -3     
- Partials    12965    12966    +1     
Flag Coverage Δ
backend 68.56% <33.08%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
server/vulnerabilities/osv/analyzer.go (1)

101-112: Log delete/insert timings too.

The new debug event times the read/match phases, but not the delete/insert phases. Since the linked investigation explicitly called out DB writes as a suspected bottleneck, counts alone will still leave the slowest stage opaque in customer runs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/vulnerabilities/osv/analyzer.go` around lines 101 - 112, Add timing
for the DB write phases and include them in the existing debug event: capture
the duration of the insert and delete operations that act on toInsert and
toDelete (e.g., record start := time.Now() before performing the writes and
compute insertTime := time.Since(start) / deleteTime similarly around the delete
path), then pass "insert_time" and "delete_time" (and keep existing counts
len(toInsert)/len(toDelete)) into the logger.DebugContext call so the debug line
that uses logger.DebugContext reports both write durations and counts alongside
the existing read/match timings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/vulnerabilities/osv/analyzer.go`:
- Around line 67-140: The helper currently materializes the entire OS-version
dataset (software, softwareIDs, found, existing, delta, dedupe maps) which can
blow memory for large OS buckets; update the flow in the function that calls
ds.ListSoftwareForVulnDetectionByOSVersion, matcher, utils.VulnsDelta,
ds.ListSoftwareVulnerabilitiesBySoftwareIDs, ds.DeleteSoftwareVulnerabilities
and ds.InsertSoftwareVulnerabilities to process the software slice in
configurable chunks (e.g., softwareChunkSize or reusing vulnBatchSize), for each
chunk: 1) build chunkSoftware and chunkSoftwareIDs, 2) call
matcher(chunkSoftware) and then ListSoftwareVulnerabilitiesBySoftwareIDs for
that chunk, 3) compute toInsert/toDelete via utils.VulnsDelta for the chunk, 4)
perform deletes with utils.BatchProcess and ds.DeleteSoftwareVulnerabilities and
perform inserts immediately via ds.InsertSoftwareVulnerabilities (deduplicating
inserts using a global seen map keyed by v.Key() to avoid duplicate insert
across chunks), and 5) aggregate counters/timings for logging; this keeps query
optimization but bounds peak memory to the chunk size while preserving existing
function names and behavior.

---

Nitpick comments:
In `@server/vulnerabilities/osv/analyzer.go`:
- Around line 101-112: Add timing for the DB write phases and include them in
the existing debug event: capture the duration of the insert and delete
operations that act on toInsert and toDelete (e.g., record start := time.Now()
before performing the writes and compute insertTime := time.Since(start) /
deleteTime similarly around the delete path), then pass "insert_time" and
"delete_time" (and keep existing counts len(toInsert)/len(toDelete)) into the
logger.DebugContext call so the debug line that uses logger.DebugContext reports
both write durations and counts alongside the existing read/match timings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7afab038-b37a-4d58-a8ad-dd90b663c799

📥 Commits

Reviewing files that changed from the base of the PR and between 0bd77e3 and 92df2b2.

📒 Files selected for processing (3)
  • server/datastore/mysql/software.go
  • server/datastore/mysql/software_test.go
  • server/vulnerabilities/osv/analyzer.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/datastore/mysql/software.go
  • server/datastore/mysql/software_test.go


Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.



Comment on lines +3313 to +3328

func (ds *Datastore) ListSoftwareForVulnDetectionByOSVersion(
	ctx context.Context,
	osVer fleet.OSVersion,
) ([]fleet.Software, error) {
	var softwareIDs []uint
	err := sqlx.SelectContext(ctx, ds.reader(ctx), &softwareIDs, `
		SELECT DISTINCT hs.software_id
		FROM host_software hs
		JOIN hosts h ON hs.host_id = h.id
		WHERE h.platform = ? AND h.os_version = ?
	`, osVer.Platform, osVer.Name)
	if err != nil {
		return nil, ctxerr.Wrap(ctx, err, "listing distinct software IDs for OS version")
	}

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

@mostlikelee mostlikelee left a comment


Nice! I'm guessing iterating over OS version software will also perform much better than the OVAL scanner as well. Just have one question.

func (ds *Datastore) ListSoftwareForVulnDetectionByOSVersion(
	ctx context.Context,
	osVer fleet.OSVersion,
) ([]fleet.Software, error) {


curious if we'll run into memory issues here. I believe the old pattern paginated
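One way to bound memory for this query, assuming an auto-increment `software_id`, is keyset pagination over the ID column instead of loading all distinct IDs at once. The sketch below is illustrative only; the `fetchPage` callback and helper names are hypothetical, not part of Fleet's datastore:

```go
package main

import "fmt"

// listIDsPaginated drains all software IDs using keyset pagination:
// each page asks for IDs strictly greater than the last one seen, so
// each query's result set is bounded by the page size.
func listIDsPaginated(fetchPage func(afterID uint, limit int) []uint, limit int) []uint {
	var all []uint
	var after uint
	for {
		page := fetchPage(after, limit)
		all = append(all, page...)
		if len(page) < limit {
			return all // a short page means we've reached the end
		}
		after = page[len(page)-1]
	}
}

func main() {
	// Mock data source standing in for something like:
	// SELECT DISTINCT software_id ... WHERE software_id > ?
	// ORDER BY software_id LIMIT ?
	data := []uint{1, 2, 3, 5, 8, 13, 21}
	fetch := func(after uint, limit int) []uint {
		var page []uint
		for _, id := range data {
			if id > after && len(page) < limit {
				page = append(page, id)
			}
		}
		return page
	}
	fmt.Println(len(listIDsPaginated(fetch, 3)))
}
```

Keyset pagination also avoids the growing scan cost of `OFFSET`-based pages, since each query seeks directly to the last seen ID.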



Development

Successfully merging this pull request may close these issues.

Vuln cron runtime ~2x longer (~4h→8h) on v4.84.0 at large scale

3 participants