fix: granite33 response_end span uses sentence length not full respon… #845
Open
planetf1 wants to merge 5 commits into generative-computing:main
Contributor
The PR description has been updated. Please fill out the template for your PR to be reviewed.
…se length
_add_citation_response_spans computed response_end as index + len(response_text_without_citations), which is the length of the entire stripped response rather than the cited sentence. This caused every citation span to overshoot, potentially past the end of the string.
Fixes generative-computing#843
…issue generative-computing#843)
Adds TestAddCitationResponseSpans covering:
- multi-sentence response where cited sentence is shorter than full response (the exact scenario that triggered the bug: response_end overshot)
- multiple citations each getting a correct, non-overlapping span
- single-sentence response as baseline
Fixes generative-computing#843
Remove misleading doc_id parameter from _make_citation (citations are matched positionally, the value was never read by the function under test). Add comment clarifying positional matching. Remove issue number from class docstring.
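For reviewers, the off-by-length error described in the commits above can be sketched roughly as follows. This is a simplified illustration, not the code in granite33/output.py: the function names, the `buggy` flag, and the dict keys `response_begin` / `response_text` are assumptions made for the example; only `response_end` and the two `len(...)` expressions come from the commit message.

```python
def add_response_spans(response: str, cited_sentences: list[str], buggy: bool = False) -> list[dict]:
    """Hypothetical stand-in for _add_citation_response_spans.

    `response` plays the role of response_text_without_citations and each
    entry of `cited_sentences` plays the role of a citation's response_text.
    """
    spans = []
    for sentence in cited_sentences:
        index = response.find(sentence)
        if index == -1:
            # The real code has its own not-found handling; skipped here.
            continue
        if buggy:
            # Old behaviour: end offset uses the length of the whole response.
            end = index + len(response)
        else:
            # Fixed behaviour: end offset uses the length of the cited sentence.
            end = index + len(sentence)
        spans.append(
            {"response_text": sentence, "response_begin": index, "response_end": end}
        )
    return spans


response = "The sky is blue. Grass is green."
sentences = ["The sky is blue.", "Grass is green."]

fixed = add_response_spans(response, sentences)
old = add_response_spans(response, sentences, buggy=True)
```

With the fixed computation, `response[span["response_begin"]:span["response_end"]]` returns exactly the cited sentence. With the old computation, the first span covers the whole response and the second span's end index lies past the end of the string.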
Misc PR
Type of PR
Description
One-line fix: response_end in granite 3.3 citation spans was computed as index + len(response_text_without_citations) (the full response length) instead of index + len(response_text) (the cited sentence length). Every citation span therefore overshot its end index, potentially beyond the end of the string, so any consumer slicing response[begin:end] would get far more text than intended.
Granite 3.2 already does this correctly (see granite32/output.py:291–293).
Branch rebased on main (clean; #818 has merged).
Testing
Three regression tests added in TestAddCitationResponseSpans (test/formatters/granite/test_granite33_output.py):
- test_response_end_uses_sentence_length_not_full_response: direct regression for #843 (Bug: response_end span in granite33 uses full response length instead of sentence length). Multi-sentence response where the cited sentence is shorter than the full response; pins begin == 0 and end == len(sent1). Fails against the old code.
- test_multiple_citations_each_span_correct: two citations across two sentences; asserts both spans are correct and non-overlapping.
- test_single_sentence_response: baseline with a single sentence; the span covers it exactly.
Coverage on granite33/output.py: 81.64% → 82.77% (+1.1pp). The remaining uncovered lines are error/warning paths (e.g. citation-not-found, sentence-splitting edge cases) that require malformed model output to trigger.
Follow-up
#851: str.find() first-occurrence issue. If the same sentence appears more than once in a response, citations after the first get the wrong span. Spotted during review of this PR; out of scope here.
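To make the follow-up concrete, here is a small sketch of the first-occurrence behaviour, plus one possible mitigation. Both helpers are hypothetical and are not taken from the codebase or from #851. The cursor-based variant assumes citations arrive in the same order as their sentences appear in the response, which may not hold for real model output.

```python
from typing import Optional

Span = tuple[int, int]


def spans_first_occurrence(response: str, sentences: list[str]) -> list[Optional[Span]]:
    """Locate each sentence with a plain str.find(), as described in the follow-up."""
    spans: list[Optional[Span]] = []
    for sentence in sentences:
        index = response.find(sentence)
        spans.append(None if index == -1 else (index, index + len(sentence)))
    return spans


def spans_with_cursor(response: str, sentences: list[str]) -> list[Optional[Span]]:
    """One possible mitigation (an assumption, not the adopted fix):
    start each search after the end of the previous match."""
    spans: list[Optional[Span]] = []
    cursor = 0
    for sentence in sentences:
        index = response.find(sentence, cursor)
        if index == -1:
            spans.append(None)
            continue
        end = index + len(sentence)
        spans.append((index, end))
        cursor = end  # later citations can only match further along
    return spans


response = "It works. See docs. It works."
cited = ["It works.", "It works."]

naive = spans_first_occurrence(response, cited)
ordered = spans_with_cursor(response, cited)
```

Here `naive` assigns both citations the span of the first "It works.", while `ordered` gives the second citation the span of the later occurrence.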