Add pytest test suite and CI for data_format module#65
Open
francose wants to merge 2 commits intoComcast:mainfrom
Open
Add pytest test suite and CI for data_format module#65francose wants to merge 2 commits intoComcast:mainfrom
francose wants to merge 2 commits intoComcast:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a first pytest-based test suite for the core xgitguard.common.data_format extraction/preprocessing functions and introduces a GitHub Actions workflow to run the tests on pushes/PRs.
Changes:
- Added pytest coverage for
keys_extractor(),credential_extractor(),remove_url_from_keys(), andremove_url_from_creds(). - Added a CI workflow to run the test suite across a Python version matrix.
- Added
testspackage initializer.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
tests/test_data_format.py |
New unit tests covering the key/credential extraction pipeline and URL/special-char stripping helpers. |
tests/__init__.py |
Marks tests as a package (empty initializer). |
.github/workflows/tests.yml |
New GitHub Actions workflow to install deps and run pytest on pushes/PRs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| runs-on: ubuntu-latest | ||
| strategy: | ||
| matrix: | ||
| python-version: ["3.9", "3.10", "3.11"] |
Comment on lines
+27
to
+28
| pip install -r requirements.txt | ||
| pip install pytest |
| prove the detection pipeline works end to end. | ||
| """ | ||
|
|
||
| import pytest |
Comment on lines
+10
to
+14
| import sys | ||
| import os | ||
|
|
||
| sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..")) | ||
|
|
- Remove unused pytest import and sys.path hack from test module - Install only urlextract+pytest in CI instead of full requirements.txt - Drop Python 3.11 from matrix (pinned numpy/pandas don't support it)
Author
|
Addressed all four Copilot suggestions in 5767bdc:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Noticed the repo doesn't have a test suite yet, so I put one together for the core detection module.
Added 18 tests covering keys_extractor(), credential_extractor(), remove_url_from_keys(), and remove_url_from_creds() — the core detection pipeline. Tests cover true positives (AWS keys, private key headers, Slack webhooks), true negatives (plain text, empty strings), and the preprocessing helpers.
Also added a GitHub Actions workflow that runs pytest on Python 3.9/3.10/3.11 on every push and PR to main. Nothing fancy, just the basics so regressions get caught before they ship.
Tests pass against current main. Kept the scope tight — just data_format.py for now. Happy to expand to the other modules if this lands.