Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds automated analysis of file changes from the "Sync fabricv4 API spec" workflow and automatically updates PR descriptions with a formatted summary of added, modified, deleted, and renamed API and model files. The implementation includes a Python script to analyze git diffs and a GitHub Actions workflow to run the analysis and update PRs.
Changes:
- Added Python script (
branch_file_analyzer.py) to analyze git branch changes and categorize Java API and model files - Added GitHub Actions workflow (
pr-analysis.yml) that triggers after the sync workflow completes to update PR descriptions - Modified test assertion in
MetrosApiTest.javato be less specific (checks for non-empty data instead of specific metro code)
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| script/branch_file_analyzer.py | New Python script that analyzes git diffs, categorizes Java files (APIs/Models), and generates formatted reports |
| .github/workflows/pr-analysis.yml | New workflow that triggers after sync workflow, runs the analyzer, and updates PR descriptions with the analysis |
| equinix-openapi-fabric-tests/src/test/java/com/equinix/openapi/fabric/tests/MetrosApiTest.java | Weakened test assertion from checking specific metro code to just checking non-empty response |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assertEquals(200, metrosApi.getApiClient().getStatusCode()); | ||
| boolean metroFound = metroResponse.getData().stream().anyMatch(metro -> metro.getCode().equals(metroCode)); | ||
| assertTrue(metroFound); | ||
| assertTrue(!metroResponse.getData().isEmpty()); |
There was a problem hiding this comment.
The assertion assertTrue(!metroResponse.getData().isEmpty()) uses double negation, which is harder to read and less idiomatic than using assertFalse.
Based on other test files in the codebase (e.g., NetworksApiTest.java:155, ServiceProfilesApiTest.java:86, 98, 143), the convention is to use assertFalse for isEmpty checks. Consider changing this to:
assertFalse(metroResponse.getData().isEmpty());This is more consistent with the existing test code style.
| assertEquals(200, metrosApi.getApiClient().getStatusCode()); | ||
| boolean metroFound = metroResponse.getData().stream().anyMatch(metro -> metro.getCode().equals(metroCode)); | ||
| assertTrue(metroFound); | ||
| assertTrue(!metroResponse.getData().isEmpty()); |
There was a problem hiding this comment.
This test change appears unrelated to the PR's stated purpose of "adding flow to push added, modified, renamed files after sync action to pull request description." The PR description doesn't mention any test modifications.
If this change is intentional and related to the analysis workflow (perhaps as a test case), it should be explained in the PR description. If it's unrelated, it should be moved to a separate PR to keep changes focused and easier to review.
| # Remove old analysis section if exists | ||
| NEW_DESC=$(echo "$CURRENT_DESC" | sed '/## 📋 Branch File Analysis/,/^\*Auto-generated by Branch File Analysis workflow\*$/d') | ||
|
|
||
| # Add new analysis section | ||
| UPDATED_DESC=$(printf "%s\n\n## 📋 Branch File Analysis\n\n%s\n\n---\n*Auto-generated by Branch File Analysis workflow*" "$NEW_DESC" "$ANALYSIS_RESULT") | ||
|
|
||
| # Update PR | ||
| gh pr edit $PR_NUMBER --body "$UPDATED_DESC" |
There was a problem hiding this comment.
The sed command at line 48 and the printf command at line 51 process user-generated content (PR descriptions) without proper escaping. If a PR description contains special characters like backticks, dollar signs, or backslashes, it could lead to command injection or unexpected behavior.
Consider using a more robust method to manipulate the PR description, such as:
- Writing the content to temporary files instead of using shell variables
- Using a Python or other scripting language that handles strings more safely
- Properly escaping the content before using it in shell commands
This is especially important since the PR description could potentially be modified by attackers if they have write access to the repository.
| # Remove old analysis section if exists | |
| NEW_DESC=$(echo "$CURRENT_DESC" | sed '/## 📋 Branch File Analysis/,/^\*Auto-generated by Branch File Analysis workflow\*$/d') | |
| # Add new analysis section | |
| UPDATED_DESC=$(printf "%s\n\n## 📋 Branch File Analysis\n\n%s\n\n---\n*Auto-generated by Branch File Analysis workflow*" "$NEW_DESC" "$ANALYSIS_RESULT") | |
| # Update PR | |
| gh pr edit $PR_NUMBER --body "$UPDATED_DESC" | |
| # Export variables for Python script | |
| export ANALYSIS_RESULT | |
| export CURRENT_DESC | |
| # Use Python to safely update the PR description and write it to a file | |
| python3 - << 'PY' | |
| import os | |
| import re | |
| current_desc = os.environ.get("CURRENT_DESC", "") | |
| analysis_result = os.environ.get("ANALYSIS_RESULT", "") | |
| # Remove existing "Branch File Analysis" section, if present | |
| pattern = ( | |
| r"\n?## 📋 Branch File Analysis" | |
| r".*?" | |
| r"\*Auto-generated by Branch File Analysis workflow\*" | |
| r"\n?" | |
| ) | |
| new_desc = re.sub(pattern, "", current_desc, flags=re.DOTALL) | |
| new_desc = new_desc.rstrip() | |
| updated_desc = ( | |
| (new_desc + "\n\n" if new_desc else "") | |
| + "## 📋 Branch File Analysis\n\n" | |
| + analysis_result | |
| + "\n\n---\n*Auto-generated by Branch File Analysis workflow*" | |
| ) | |
| with open("updated_desc.txt", "w", encoding="utf-8") as f: | |
| f.write(updated_desc) | |
| PY | |
| # Update PR using the file to avoid shell interpolation issues | |
| gh pr edit $PR_NUMBER --body-file updated_desc.txt |
| BRANCH="${{ github.event.workflow_run.head_branch }}" | ||
| PR_NUMBER=$(gh pr list --head "$BRANCH" --state open --json number --jq '.[0].number // empty') | ||
|
|
||
| # Run Python script and get result | ||
| ANALYSIS_RESULT=$(python3 script/branch_file_analyzer.py) | ||
|
|
||
| # Get current PR description | ||
| CURRENT_DESC=$(gh pr view $PR_NUMBER --json body --jq '.body // ""') | ||
|
|
||
| # Remove old analysis section if exists | ||
| NEW_DESC=$(echo "$CURRENT_DESC" | sed '/## 📋 Branch File Analysis/,/^\*Auto-generated by Branch File Analysis workflow\*$/d') | ||
|
|
||
| # Add new analysis section | ||
| UPDATED_DESC=$(printf "%s\n\n## 📋 Branch File Analysis\n\n%s\n\n---\n*Auto-generated by Branch File Analysis workflow*" "$NEW_DESC" "$ANALYSIS_RESULT") | ||
|
|
||
| # Update PR | ||
| gh pr edit $PR_NUMBER --body "$UPDATED_DESC" |
There was a problem hiding this comment.
The shell script uses unquoted variables in several places (e.g., $PR_NUMBER, $ANALYSIS_RESULT, $NEW_DESC). If any of these variables contain special characters or whitespace, the commands could fail or behave unexpectedly.
While GitHub Actions usually sets set -e by default (which helps catch errors), it's still a best practice to quote variables, especially when they contain user-generated content like PR descriptions. Consider quoting variables throughout:
PR_NUMBER=$(gh pr list --head "$BRANCH" --state open --json number --jq '.[0].number // empty')
CURRENT_DESC=$(gh pr view "$PR_NUMBER" --json body --jq '.body // ""')| # Install GitHub CLI | ||
| curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg | ||
| echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null | ||
| sudo apt update | ||
| sudo apt install gh |
There was a problem hiding this comment.
The GitHub CLI is being manually installed instead of using the official GitHub-provided action. GitHub-hosted runners already have gh CLI pre-installed, so lines 30-34 (installing gh) are unnecessary and add execution time.
You can remove these lines and directly use the gh command. If you need a specific version, you can use the gh version that comes with the runner or explicitly verify the version is available.
| except subprocess.CalledProcessError: | ||
| return "" |
There was a problem hiding this comment.
The _run_git_command method silently swallows all git command errors by catching CalledProcessError and returning an empty string. This can hide important errors and make debugging difficult. For example, if git is not installed or if there's a permission issue, the error will be silently ignored.
Consider logging the error or at least making the error handling more selective. For critical commands, you may want to let the exception propagate or provide more context about what failed.
| except subprocess.CalledProcessError: | |
| return "" | |
| except subprocess.CalledProcessError as e: | |
| print( | |
| f"Warning: git command failed: {cmd} (return code {e.returncode}). " | |
| f"stderr: {e.stderr.strip() if e.stderr else 'None'}", | |
| file=sys.stderr, | |
| ) | |
| return "" | |
| except FileNotFoundError as e: | |
| # git executable not found | |
| print( | |
| f"Error: git command not found while running: {cmd}. " | |
| f"Details: {e}", | |
| file=sys.stderr, | |
| ) | |
| return "" |
| # Added Files by category | ||
| if self.added_files: | ||
| # Separate by specific categories | ||
| added_apis = [f for f in self.added_files if f.endswith('.java') and ('/api/' in f and f.endswith('Api.java'))] |
There was a problem hiding this comment.
The condition ('/api/' in f and f.endswith('Api.java')) is redundant because it checks f.endswith('Api.java') when you've already checked f.endswith('.java') in the outer condition. This pattern repeats on lines 161, 185, 209, 238, and 239.
The inner check f.endswith('Api.java') is sufficient and more specific than the combined check. Consider simplifying to:
added_apis = [f for f in self.added_files if '/api/' in f and f.endswith('Api.java')]This removes the redundant .endswith('.java') check and makes the code clearer.
| added_apis = [f for f in self.added_files if f.endswith('.java') and ('/api/' in f and f.endswith('Api.java'))] | |
| added_apis = [f for f in self.added_files if '/api/' in f and f.endswith('Api.java')] |
| def categorize_files_by_type(self, file_list): | ||
| """Categorize files by their type/extension.""" | ||
| categories = defaultdict(list) | ||
|
|
||
| for filepath in file_list: | ||
| file_path = Path(filepath) | ||
| filename = file_path.name | ||
|
|
||
| # Categorize by file type | ||
| if filepath.endswith('.java'): | ||
| if '/model/' in filepath or '/dto/' in filepath: | ||
| categories['Java Models'].append(filename) | ||
| elif '/api/' in filepath and filename.endswith('Api.java'): | ||
| categories['API Classes'].append(filename) | ||
| elif '/test/' in filepath or 'test' in filename.lower(): | ||
| categories['Test Files'].append(filename) | ||
| # Removed Java Files category - these files will be ignored | ||
| elif filepath.endswith(('.py', '.sh', '.js', '.ts')): | ||
| categories['Scripts'].append(filename) | ||
| elif filepath.endswith(('.json', '.xml')): | ||
| categories['Data Files'].append(filename) | ||
| else: | ||
| categories['Other Files'].append(filename) | ||
|
|
||
| return categories | ||
|
|
There was a problem hiding this comment.
The categorize_files_by_type method (lines 127-151) is defined but never used anywhere in the code. This dead code should be removed to improve code maintainability and avoid confusion.
If this method is intended for future use, consider removing it for now and adding it back when it's actually needed.
| def categorize_files_by_type(self, file_list): | |
| """Categorize files by their type/extension.""" | |
| categories = defaultdict(list) | |
| for filepath in file_list: | |
| file_path = Path(filepath) | |
| filename = file_path.name | |
| # Categorize by file type | |
| if filepath.endswith('.java'): | |
| if '/model/' in filepath or '/dto/' in filepath: | |
| categories['Java Models'].append(filename) | |
| elif '/api/' in filepath and filename.endswith('Api.java'): | |
| categories['API Classes'].append(filename) | |
| elif '/test/' in filepath or 'test' in filename.lower(): | |
| categories['Test Files'].append(filename) | |
| # Removed Java Files category - these files will be ignored | |
| elif filepath.endswith(('.py', '.sh', '.js', '.ts')): | |
| categories['Scripts'].append(filename) | |
| elif filepath.endswith(('.json', '.xml')): | |
| categories['Data Files'].append(filename) | |
| else: | |
| categories['Other Files'].append(filename) | |
| return categories |
| uses: actions/checkout@v4 | ||
|
|
There was a problem hiding this comment.
The workflow doesn't check out the correct branch. The actions/checkout@v4 step at line 21 will check out the default branch (main), not the branch from the triggering workflow. This means the Python script will analyze the wrong branch and won't detect the actual changes made by the "Sync fabricv4 API spec" workflow.
To fix this, you need to check out the branch that triggered the workflow using:
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{ github.event.workflow_run.head_branch }}Additionally, you need to fetch the base branch to ensure git diff has the proper reference:
- name: Fetch base branch
run: git fetch origin main:main| uses: actions/checkout@v4 | |
| uses: actions/checkout@v4 | |
| with: | |
| ref: ${{ github.event.workflow_run.head_branch }} | |
| - name: Fetch base branch | |
| run: git fetch origin main:main |
|
|
||
| # Get PR for branch | ||
| BRANCH="${{ github.event.workflow_run.head_branch }}" | ||
| PR_NUMBER=$(gh pr list --head "$BRANCH" --state open --json number --jq '.[0].number // empty') |
There was a problem hiding this comment.
The workflow lacks error handling for the case when no PR is found for the branch. If PR_NUMBER is empty (which can happen if the branch has no open PR), subsequent commands using $PR_NUMBER will fail with unclear errors.
Add a check after line 39:
if [ -z "$PR_NUMBER" ]; then
echo "No open PR found for branch $BRANCH"
exit 0
fi| PR_NUMBER=$(gh pr list --head "$BRANCH" --state open --json number --jq '.[0].number // empty') | |
| PR_NUMBER=$(gh pr list --head "$BRANCH" --state open --json number --jq '.[0].number // empty') | |
| if [ -z "$PR_NUMBER" ]; then | |
| echo "No open PR found for branch $BRANCH" | |
| exit 0 | |
| fi |
|
@copilot are there GH Action alternatives to this custom package diff / breaking change notifier? JApiCmp as an example (perhaps without needing the jar, but if we need the jar we would need to build both, presumably) |
|
@displague I've opened a new pull request, #50, to work on those changes. Once the pull request is ready, I'll request review from you. |
hmm I've created that POC for my team responsible for prepare release notes. it is enough to make faster our manual work - adding release notes. I don't need any dependency in the repository, python script just verifies commits for current branch and categorize each file to models, api etc. |
Added flow to push added, modified, renamed files after sync action to pull request description automatically