refactor: consistent cloning & pattern-handling #388
Merged
7ab4af2 to
831a36d
Compare
cf1aa6f to
d1224a6
Compare
979c88a to
3031e7c
Compare
ix-56h
reviewed
Jul 12, 2025
ix-56h
reviewed
Jul 12, 2025
ix-56h
reviewed
Jul 13, 2025
ix-56h
approved these changes
Jul 13, 2025
Contributor
ix-56h
left a comment
LGTM
Contributor
Author
TODO:
f98250f to
8438492
Compare
Add `_handle_remove_readonly` on-error callback for `shutil.rmtree` that removes the read-only attribute and retries, preventing WinError 5 during temp-repo cleanup in tests.
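The callback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `handle_remove_readonly` and `remove_tree` are hypothetical, and the version check is an assumption added so the sketch works both with the older `onerror` parameter and with `onexc` (Python 3.12+).

```python
import os
import shutil
import stat
import sys
import tempfile
from pathlib import Path


def handle_remove_readonly(func, path, exc):
    """Error callback for shutil.rmtree (hypothetical name).

    `func` is the os function that failed (e.g. os.unlink or os.rmdir),
    `path` is the path it failed on. The third argument is an exception
    (onexc) or an exc_info tuple (onerror); it is not needed here.
    """
    # Clear the read-only attribute, then retry the failed operation once.
    os.chmod(path, stat.S_IWRITE)
    func(path)


def remove_tree(path):
    """Remove a directory tree, retrying on read-only entries."""
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=handle_remove_readonly)
    else:
        shutil.rmtree(path, onerror=handle_remove_readonly)


if __name__ == "__main__":
    # Simulate a temp repo that contains a read-only packed-object file.
    base = tempfile.mkdtemp()
    pack = Path(base) / "objects" / "pack.idx"
    pack.parent.mkdir()
    pack.write_bytes(b"data")
    os.chmod(pack, stat.S_IREAD)

    remove_tree(base)
    print(os.path.exists(base))
```

On POSIX systems a read-only file inside a writable directory can usually be unlinked anyway, so the callback mainly matters on Windows, where deleting a read-only file raises a permission error.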
fix blob/tag added test
2d6d8c0 to
c5c9d5d
Compare
NicolasIRAGNE
approved these changes
Jul 23, 2025
Contributor
NicolasIRAGNE
left a comment
LGTM
Thanks Filip!
⚙️ Preview environment was undeployed.
✨ Refactor: consistent cloning & pattern-handling
Why
- […] → Caching and reproducibility suffered.
- […] `query_parser` and duplicated ignore/include logic. → Hard to unit-test and reuse.
- […] packed-object files. → Shallow-clone cleanup broke with WinError 5.
What’s new
- `utils.git_utils.resolve_commit()` guarantees we always fetch the exact SHA (HEAD / branch / tag) before checkout. Deterministic → enables caching (feat: Implement caching on a per-commit basis #343).
- `utils.pattern_utils.process_patterns()` centralises include/exclude parsing (moved out of `query_parser`). Adds thorough tests.
- `_handle_remove_readonly` on-error callback makes read-only Git objects writable and retries `shutil.rmtree()`, preventing WinError 5 in CI.
- `parse_query()` removed in favour of
  - `parse_remote_repo()` (URLs/slugs)
  - `parse_local_dir_path()` (local paths)
- `clone_repo`, `ingest_query`, `parse_query` are no longer re-exported from `gitingest.__init__`.
- […] (optional) → fetch commit → checkout → submodule update (optional).
- `_checkout_partial_clone` renamed & moved → `git_utils.checkout_partial_clone`.
- `query_processor` uses new pattern utilities and passes a typed enum.
- `IngestRequest.validate_input_text()` now removes any `.git` suffix.
- […] `_is_safe_symlink`, `_is_valid_pattern`, `InvalidPatternError`, and `test_parse_patterns_invalid_characters`)

File changes
gitingest
- `__init__.py`: […] `clone_repo`, `ingest_query`, `parse_query`)
- `clone.py`: […] `clone_repo` to call `resolve_commit`, sparse checkout, and uniform fetch/checkout steps; move helper & rename
- `entrypoint.py`: […] `_handle_remove_readonly` callback for Windows temp-dir cleanup; `ingest_async` now uses `parse_remote_repo` / `parse_local_dir_path`
- `output_formatter.py`: […] `Tag` prefix in `_create_summary_prefix`
- `query_parser.py`: […] `parse_query`; move pattern helpers to `pattern_utils.py`
- `utils/exceptions.py`: […] `InvalidPatternError`
- `utils/git_utils.py`: […] `resolve_commit` helper
- `utils/os_utils.py`: […] `ensure_directory` → `ensure_directory_exists_or_create`
- `utils/pattern_utils.py`: […] `process_patterns`
- `utils/query_parser_utils.py`: […] `_is_valid_pattern`
- `path_utils.py` (removed): […] `_is_safe_symlink`

server
- `models.py`: […] `.git` suffix in `validate_input_text`
- `query_processor.py`: […] `parse_query` with `parse_remote_repo`, integrate `PatternType`, use `process_patterns`
- `routers_utils.py`: […] `pattern_type` into `PatternType` enum

tests
- `conftest.py`: […]
- `query_parser/test_git_host_agnostic.py`: […] `parse_remote_repo`
- `query_parser/test_query_parser.py`: […] `parse_remote_repo`; add `parse_local_dir_path` coverage
- `test_clone.py`: […]
- `test_pattern_utils.py`: […] `_parse_patterns` and `process_patterns`; remove `test_parse_patterns_invalid_characters` (pattern validation no longer enforced)
- `test_summary.py`: […] `gitingest.ingest()` emits correct summaries