
Commit 1e146bf

Example custom semgrep rule for detecting fixed time references that is stored in repo for scanning against pull requests (#26647)
* Include custom semgrep rule stored in repo for scanning against pull requests
* Disable metrics and root path to avoid warnings
* This rule must use the generic semgrep parser
* Include a way to skip the local semgrep scan by including [skip semgrep] in commit message
* Require a fetch-depth of 0 to get all of the history
* In CI we compare committed changes, but when run locally we want to consider all changes made to the working directory (including uncommitted)
* Improved warning message for coming soon and included both committed and uncommitted changes in the local semgrep check
* Avoid fatal git error on ownership within CLI working directory
1 parent 6704faf commit 1e146bf

4 files changed: +105 -2 lines changed


.github/workflows/semgrep.yml

Lines changed: 36 additions & 2 deletions
@@ -2,12 +2,15 @@ on:
   workflow_dispatch: {}
   schedule:
     - cron: "0 4 * * *"
+  pull_request: {}
+
 name: Semgrep config
 permissions:
   contents: read
+
 jobs:
   semgrep:
-    name: semgrep/ci
+    name: semgrep
     runs-on: ubuntu-latest
     env:
       SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
@@ -18,4 +21,35 @@ jobs:
       image: semgrep/semgrep
     steps:
       - uses: actions/checkout@v4
-      - run: semgrep ci
+        with:
+          # fetch full history so Semgrep can compare against the base branch
+          fetch-depth: 0
+
+      # Semgrep CI to run on Schedule (Cron) or Manual Dispatch
+      # scans using managed rules at cloudflare.semgrep.dev
+      - name: Semgrep CI Rules (Managed rules at cloudflare.semgrep.dev)
+        if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
+        run: semgrep ci
+
+      # Semgrep Scan to run on Pull Request events
+      # scans using rules inside the .semgrep/ folder and fails on error
+      # include [skip semgrep] in top-most commit message to skip scan
+      - name: Semgrep Repo Rules (Custom rules found in .semgrep/)
+        if: github.event_name == 'pull_request' && !contains(github.event.head_commit.message, '[skip semgrep]')
+        run: |
+
+          git config --global --add safe.directory $PWD
+          base_commit=$(git merge-base HEAD origin/$GITHUB_BASE_REF)
+          git diff $base_commit... --diff-filter=ACMRT --name-only | grep -E '\.(htm|html|yaml|yml|md|mdx)$' > tools/relevant_changed_files.txt || true
+
+          # Check if file list is empty to prevent errors
+          if [ -s tools/relevant_changed_files.txt ]; then
+            list_of_files=$(cat tools/relevant_changed_files.txt | tr '\n' ' ')
+            semgrep scan \
+              --config .semgrep --metrics=off \
+              --include "*.mdx" --include "*.mdx" \
+              $list_of_files
+              # add '--error' to return error code to workflow
+          else
+            echo "No relevant files changed."
+          fi
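
The pull-request step above can be sketched as two small shell functions. This is a minimal sketch, not the workflow's actual code: the function names `should_skip_scan` and `filter_relevant_files` are hypothetical, and the commit message is passed in as an argument rather than read from the GitHub event payload.

```shell
# Hypothetical helpers illustrating the pull-request step; the names are
# assumptions and do not appear in the commit.

# Succeeds (exit status 0) when the commit message contains the skip marker.
should_skip_scan() {
  case "$1" in
    *'[skip semgrep]'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads file names on stdin, one per line, and prints only those whose
# extension is one the custom rules apply to. `|| true` keeps the function
# from failing when nothing matches, as in the workflow step.
filter_relevant_files() {
  grep -E '\.(htm|html|yaml|yml|md|mdx)$' || true
}

# Example usage: only the .mdx and .md names are printed.
printf '%s\n' 'docs/a.mdx' 'src/b.ts' 'notes/c.md' | filter_relevant_files
```

Keeping the filter as a function that reads stdin means the same check can be fed from `git diff --name-only` in CI or from any other list of paths when testing locally.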

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -30,3 +30,5 @@ pnpm-debug.log*
 /worker/functions/
 
 .idea
+
+tools/relevant_changed_files.txt

.semgrep/dates-in-docs.yaml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+rules:
+  - id: coming-soon
+    languages: [generic]
+    message: "Found forbidden string 'coming soon'. Too often we set expectations unfairly by attaching this phrase to a feature that may not actually arrive soon."
+    severity: MEDIUM
+    paths:
+      include:
+        - "*.htm"
+        - "*.html"
+        - "*.md"
+        - "*.mdx"
+        - "*.yaml"
+        - "*.yml"
+      exclude:
+        - "/src/content/changelog/**"
+        - "/src/content/release-notes/**"
+        - "/.semgrep/**"
+        - "/.github/**"
+    patterns:
+      - pattern-regex: "[Cc]oming [Ss]oon"
+
+  - id: potential-date
+    languages: [generic]
+    message: "Potential date found. Documentation should strive to represent universal truth, not something time-bound."
+    severity: MEDIUM
+    paths:
+      include:
+        - "*.htm"
+        - "*.html"
+        - "*.md"
+        - "*.mdx"
+        - "*.yaml"
+        - "*.yml"
+      exclude:
+        - "/src/content/changelog/**"
+        - "/src/content/release-notes/**"
+        - "/.semgrep/**"
+        - "/.github/**"
+    pattern-either:
+      - pattern-regex: Jan\| Feb\| Mar\| Apr\| May\| Jun\| Jul\| Aug\| Sep\| Nov\| Dec
+      - pattern-regex: \ 20[0-9][0-9]
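
One way to sanity-check the intent of the `potential-date` rule outside Semgrep is with `grep -E`. This is a sketch under stated assumptions: the function name `has_potential_date` is hypothetical, the month list adds `Oct` (absent from the rule above), and the alternation uses an unescaped `|`, because in POSIX extended and PCRE-style syntax `\|` matches a literal pipe character rather than acting as alternation. Semgrep's own matching may differ in detail.

```shell
# Hypothetical check mirroring the intent of the potential-date rule;
# not part of the commit.
# Succeeds when the given text contains a month abbreviation or a
# space followed by a year of the form 20NN.
has_potential_date() {
  printf '%s\n' "$1" |
    grep -Eq '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)| 20[0-9][0-9]'
}

# Example usage
if has_potential_date 'Available since Oct 2023'; then
  echo 'potential date found'
fi
```

Month abbreviations in this sketch also match inside longer words such as `Maybe` or `Decide`, so a stricter pattern would need word boundaries around the alternation.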

tools/semgrep-repo-rules

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+#! /bin/bash
+
+repo_root_dir="$(git rev-parse --show-toplevel)"
+
+pushd "${repo_root_dir}" > /dev/null || return
+
+base_commit=$(git merge-base HEAD origin/production)
+git diff $base_commit... --diff-filter=ACMRT --name-only | grep -E '\.(htm|html|yaml|yml|md|mdx)$' > tools/relevant_changed_files.txt || true
+
+# this file wants to also match uncommitted changes, not just committed changes (in CI this is not the case)
+git diff --diff-filter=ACMRT --name-only | grep -E '\.(htm|html|yaml|yml|md|mdx)$' >> tools/relevant_changed_files.txt || true
+
+if [ -s tools/relevant_changed_files.txt ]; then
+  list_of_files=$(cat tools/relevant_changed_files.txt | tr '\n' ' ')
+
+  docker run --rm -v "${PWD}:/src" semgrep/semgrep \
+    semgrep scan \
+    --config .semgrep --metrics=off \
+    --include "*.mdx" --include "*.mdx" \
+    --force-color \
+    $list_of_files
+else
+  echo "No relevant files changed."
+fi
+
+popd > /dev/null || return
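
Because the local script appends the uncommitted list to the committed one, a file that appears in both would be listed twice. A minimal sketch of de-duplicating the combined list follows; the helper name `unique_lines` is hypothetical and is not part of the commit.

```shell
# Hypothetical helper: reads file names on stdin (one per line) and
# prints each distinct non-empty name once, in sorted order.
unique_lines() {
  grep -v '^$' | sort -u
}

# Example usage: the same file reported by both `git diff` invocations
# is printed only once.
printf '%s\n' 'docs/a.mdx' 'docs/b.md' 'docs/a.mdx' | unique_lines
```

In the script, a filter like this could sit between the two `git diff` outputs and the file that feeds the scan, assuming the order of the file names does not matter to the scanner.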
