Skip to content

Conversation

@kendrickcurtis
Copy link

OpenSSF publishes a regularly updated list of 3rd party packages that contain malware. This PR adds a new Trivy rule "Malicious packages detection" at the highest severity level.

The OpenSSF DB is 227mb which is about 1/3rd the size of Trivy's vuln DB (734mb) -- I would hope this would not add a vast extra burden on processing times. Probably we would want/need to run malicious package detection both on commit and in the nightly SCA process, since packages can be retroactively designated as malicious.

@kendrickcurtis kendrickcurtis requested a review from a team as a code owner August 11, 2025 11:34
@alerizzo alerizzo requested a review from Copilot August 13, 2025 14:09
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR integrates OpenSSF's malicious packages database into Trivy to detect known malicious packages. This adds a new "Malicious packages detection" rule at the highest severity level to identify packages containing malware, typosquatting attacks, and dependency confusion attacks.

  • Implements OpenSSF scanner with OSV format parsing for malicious package detection
  • Adds new malicious_packages rule with critical severity level
  • Integrates scanning into the main tool execution flow

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
internal/tool/tool.go Integrates OpenSSF scanner into main execution flow and adds new rule pattern
internal/tool/openssf_scanner.go Core implementation of OpenSSF malicious packages scanner with OSV parsing
internal/docgen/rule.go Adds rule definition for malicious packages detection
docs/patterns.json Adds pattern configuration for malicious packages rule
docs/multiple-tests/pattern-malicious/* Test files demonstrating malicious package detection
docs/description/* Documentation files for the new pattern
Dockerfile Copies OpenSSF cache data into container
.circleci/config.yml Downloads OpenSSF database during CI build

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@afsmeira afsmeira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still more fixes and improvements but I think I've left enough comments for a first pass.

We'll move to those fixes after that. And obviously there are still tests to write.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete and add a line to .gitignore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not delete the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete and add a line to .gitignore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not delete the file.

Comment on lines 212 to 214
if s.shouldAnalyzeFile(toolExecution.Files, target) {
return []codacy.Result{issue}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already logic to filter out results from files that should not be analyzed in tool.go. We should rely on that for consistency.

Or we want to filter when OSSF scanning, do it way earlier.

Comment on lines 524 to 546
func (s *OpenSSFScanner) isPatternEnabled(patterns *[]codacy.Pattern, patternID string) bool {
if patterns == nil {
return false
}
for _, pattern := range *patterns {
if pattern.ID == patternID {
return true
}
}
return false
}

func (s *OpenSSFScanner) shouldAnalyzeFile(knownFiles *[]string, fileName string) bool {
if knownFiles == nil {
return true
}
for _, file := range *knownFiles {
if file == fileName {
return true
}
}
return false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given my suggestion above you can delete these.

}

func (s *OpenSSFScanner) findPackageLineNumber(sourceDir, fileName, pkgName string) int {
return fallbackSearchForLineNumber(sourceDir, fileName, pkgName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This simply uses the fallback. Please use the same logic in tool.go get the line number.

Comment on lines 468 to 472
for _, affectedVersion := range affectedVersions {
if version == affectedVersion {
return true
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for _, affectedVersion := range affectedVersions {
if version == affectedVersion {
return true
}
}
if slices.Contains(affectedVersions, version) {
return true
}

Comment on lines 474 to 475
if versionRange.Type == "SEMVER" || versionRange.Type == "ECOSYSTEM" {
if s.checkSingleVersionRange(version, versionRange.Events) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if versionRange.Type == "SEMVER" || versionRange.Type == "ECOSYSTEM" {
if s.checkSingleVersionRange(version, versionRange.Events) {
if (versionRange.Type == "SEMVER" || versionRange.Type == "ECOSYSTEM") && s.checkSingleVersionRange(version, versionRange.Events) {

Copy link
Contributor

@afsmeira afsmeira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still comments to address. And unit tests to add.

Comment on lines +23 to +26




Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: delete empty lines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not delete the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not delete the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not delete the file.

Comment on lines +39 to +40
BASE = os.environ.get('OPENSSF_OSV_DIR', 'openssf-cache/osv')
OUT = os.environ.get('OPENSSF_INDEX_OUT', 'openssf-index.json.gz')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the environment variables are not used right now, I would remove them and just use the hardcoded values.

# Generate the OpenSSF index during build
COPY scripts/ scripts/
COPY openssf-cache/osv/ openssf-cache/osv/
RUN apk add --no-cache python3 && \
Copy link
Contributor

@afsmeira afsmeira Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I was thinking that we should probably run this in the CI and just copy the resulting file to the image.

This avoids adding python3 to the image and I'm pretty sure our CI already has python available.

with gzip.open(OUT, 'wt', encoding='utf-8') as gz:
json.dump(index, gz)

print(f"Wrote index: {OUT} (ecosystems={len(index)})")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if print adds a new line, but if it doesn't please add one manually.

Comment on lines +239 to +241
if toolExecution.Files == nil {
return mapIssuesWithoutLineNumber(issues), nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this generated by AI? Do you actually know if this description is true?

Comment on lines +72 to +75
if err := s.ensureIndexLoaded(); err != nil {
log.Printf("Warning: Failed to load OpenSSF malicious packages database: %v", err)
return results
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this pattern is selected but we can't load the file, the should fail instead of returning empty results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants