test: pure-logic unit tests for stdlib, core, backends, telemetry (#860) #862

planetf1 wants to merge 3 commits into generative-computing:main from planetf1:test/coverage-gaps-813
Contributor

planetf1 commented Apr 15, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Adds ~280 new unit tests covering the remaining pure-logic gaps in the library — everything that can be tested without a live backend, GPU, or optional infrastructure dependency. All tests run in the default pytest invocation with no backend markers needed.

What's been added

| Area | Tests | Files |
| --- | --- | --- |
| stdlib/components | ~97 | test_simple, test_instruction, test_chat (extended), test_genstub_unit, test_mobject |
| stdlib/functional | 6 | test_functional_unit |
| stdlib/sampling | 30 | test_majority_voting_unit, test_sofai_unit, test_sampling_base_unit |
| stdlib/session | 7 | test_session_unit |
| telemetry | 34 | test_backend_instrumentation, test_tracing_helpers |
| backends | 37 | test_utils (extended), test_openai_unit, test_ollama_unit |
| formatters/granite/base | 8 | test_base_util |
| core | 24 | test_base (extended), test_requirement_helpers |

Two small infrastructure additions worth calling out:

  • require_nltk_data() predicate added to test/predicates.py. The Granite 3.2/3.3 citation pipeline tests call nltk.sent_tokenize, which needs the punkt_tab data download. The predicate skips with an actionable message that distinguishes "package not installed" from "data not downloaded". CI already runs python -m nltk.downloader punkt_tab in quality.yml.

  • test_tracing_helpers.py — the mock-based tracing helper tests have been split out of the OTel-gated test_tracing.py into a new file without a module-level importorskip, so they run unconditionally. The one test that genuinely needs OTel uses a per-test importorskip.
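A minimal sketch of how a predicate like require_nltk_data() could separate the two failure modes. The function name `nltk_skip_reason`, its injectable parameters, and the exact message wording are assumptions made for illustration; the real code in test/predicates.py is not shown in this description and may differ. The resource path `tokenizers/punkt_tab` is assumed to be where the downloader places the data.

```python
import importlib.util
from typing import Callable, Optional


def nltk_skip_reason(
    is_installed: Optional[Callable[[], bool]] = None,
    find_data: Optional[Callable[[str], object]] = None,
) -> Optional[str]:
    """Return an actionable skip reason, or None when punkt_tab is usable.

    Hypothetical helper: both callables are injectable so the logic can be
    exercised without nltk being present.
    """
    if is_installed is None:
        # Check for the package without importing it.
        is_installed = lambda: importlib.util.find_spec("nltk") is not None
    if not is_installed():
        return "nltk is not installed; run: pip install nltk"

    if find_data is None:
        import nltk  # safe here: the package was found above

        find_data = nltk.data.find
    try:
        # nltk.data.find raises LookupError when a resource is missing.
        find_data("tokenizers/punkt_tab")
    except LookupError:
        return "punkt_tab data not downloaded; run: python -m nltk.downloader punkt_tab"
    return None
```

Under pytest, the returned reason could feed a `pytest.mark.skipif(reason is not None, reason=reason or "")` marker, so the skip message tells the reader which of the two fixes applies.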

Coverage improvement (local filter, unit/integration only)

| Metric | Before (2026-04-14) | After |
| --- | --- | --- |
| Tests | 1,202 | ~1,480 |
| Line coverage | 58.1% | ~62% |
| stdlib/components/ | 62.0% | 76.9% |
| backends/ (all) | 33.0% | 38.3% |
| core/ | 79.4% | 81.9% |

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

~280 new unit tests covering every remaining pure-logic gap in the library
that doesn't require live backends, GPU, or optional infrastructure.

Areas covered:
- stdlib/components: SimpleComponent, Instruction, Message/ToolMessage/chat,
  GenerativeStub (pure helpers + _parse), Query/Transform/MObject
- stdlib/functional: _parse_and_clean_image_args
- stdlib/sampling: majority_voting compare_strings (math + RougeL),
  sofai static helpers, RepairTemplateStrategy.repair()
- stdlib/session: backend_name_to_class, get_session error path
- telemetry: backend_instrumentation helpers, tracing _set_attribute_safe,
  end_backend_span (in new test_tracing_helpers.py, not behind OTel guard)
- backends: get_value/to_tool_calls/to_chat (utils), OpenAI filter/merge
  helpers, Ollama merge helpers + chat_response_delta_merge
- formatters/granite/base: find_substring_in_text
- core: CBlock TypeError, ImageBlock validation, MOT._copy_from,
  ValidationResult properties, default_output_to_bool

Infrastructure:
- require_nltk_data() predicate added to test/predicates.py — distinguishes
  "nltk not installed" from "punkt_tab not downloaded" in skip messages.
  Applied to the two tests (Granite 3.2/3.3 citation pipeline) that call
  nltk.sent_tokenize().
- test_tracing_helpers.py: split 9 mock-based tracing tests out of the
  OTel-gated test_tracing.py so they run unconditionally.
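To illustrate why mock-based tracing tests need no OTel guard, here is a sketch using only `unittest.mock`. The helper `set_attribute_safe` is a hypothetical stand-in written for this example (it simply swallows errors raised by the span); it is not the library's `_set_attribute_safe`, whose actual behavior is not described here.

```python
from unittest.mock import MagicMock


def set_attribute_safe(span, key, value):
    """Hypothetical stand-in: set an attribute, never letting a tracing error escape."""
    if span is None:
        return
    try:
        span.set_attribute(key, value)
    except Exception:
        # Telemetry failures should not break the caller.
        pass


def test_sets_attribute_on_span():
    span = MagicMock()  # stands in for a span; no opentelemetry import needed
    set_attribute_safe(span, "model", "demo")
    span.set_attribute.assert_called_once_with("model", "demo")


def test_swallows_span_errors():
    span = MagicMock()
    span.set_attribute.side_effect = RuntimeError("exporter down")
    set_attribute_safe(span, "model", "demo")  # must not raise
```

Because the span is a `MagicMock`, tests of this shape run in the default pytest invocation whether or not the optional tracing dependency is installed.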

Closes generative-computing#860. Part of generative-computing#813. Parent epic: generative-computing#726.
planetf1 force-pushed the test/coverage-gaps-813 branch from 2dff0b7 to 9caaeb7 Apr 15, 2026 09:06
planetf1 changed the title from "Test/coverage gaps 813" to "test: pure-logic unit tests for stdlib, core, backends, telemetry (#860)" Apr 15, 2026
planetf1 marked this pull request as ready for review Apr 15, 2026 09:20
planetf1 requested review from a team as code owners Apr 15, 2026 09:20
3 comment threads on test/backends/test_ollama_unit.py (all outdated)
- test_simplify_and_merge_none_returns_empty_dict: assert result == {}
  rather than just isinstance check
- test_simplify_and_merge_per_call_overrides_backend: extract
  _make_backend() helper so the override test can construct a backend
  with a pre-set num_predict=128, proving the per-call value of 256
  actually wins the merge
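
The two changes above can be sketched as follows. `merge_options` is a hypothetical stand-in for the merge step, written only to show the assertion style; the real `_simplify_and_merge` and `_make_backend()` helper are not reproduced here.

```python
def merge_options(backend_defaults, per_call):
    """Hypothetical stand-in: per-call options override backend defaults."""
    merged = dict(backend_defaults or {})
    merged.update(per_call or {})
    return merged


def test_none_returns_empty_dict():
    result = merge_options(None, None)
    # Equality is stricter than isinstance(result, dict): it also rules out
    # a dict that unexpectedly contains entries.
    assert result == {}


def test_per_call_overrides_backend():
    # The backend default must be pre-set, otherwise the test cannot tell
    # "per-call wins" apart from "per-call was the only value".
    result = merge_options({"num_predict": 128}, {"num_predict": 256})
    assert result["num_predict"] == 256
```
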
Address review feedback on mapping tests (ollama + openai):

1. Structural consistency — assert from_mellea keys are a subset of
   to_mellea values (maps agree with each other)
2. Iterative round-trip — exercise _simplify_and_merge and
   _make_backend_specific_and_remove against every map entry, verifying
   the methods actually use the maps correctly
3. Hardcoded anchor — one named test per backend for the most critical
   mapping (num_predict / max_completion_tokens ↔ MAX_NEW_TOKENS) to
   catch wrong-string regressions with a meaningful failure message

For OpenAI, structural and round-trip tests are parametrized over
chats/completions since it maintains separate map pairs.

Also: extract _make_backend() helper in test_openai_unit.py (matching
the ollama pattern) so the override test can construct a backend with
pre-set model options.
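
The three kinds of mapping test listed above can be sketched with plain dictionaries. The map contents, the `rename_keys` helper, and the string sentinel `"MAX_NEW_TOKENS"` are assumptions chosen for illustration; the real backends hold their own maps and translation methods.

```python
# Hypothetical maps: backend option name <-> backend-neutral option name.
TO_MELLEA = {"num_predict": "MAX_NEW_TOKENS", "temperature": "TEMPERATURE", "seed": "SEED"}
FROM_MELLEA = {"MAX_NEW_TOKENS": "num_predict", "TEMPERATURE": "temperature"}


def rename_keys(options, mapping):
    """Rename keys found in mapping; leave unknown keys untouched."""
    return {mapping.get(key, key): value for key, value in options.items()}


def test_structural_consistency():
    # 1. Every neutral name that can be sent out must also be accepted back.
    assert set(FROM_MELLEA) <= set(TO_MELLEA.values())


def test_round_trip_every_entry():
    # 2. Exercise each map entry, not just a hand-picked one.
    for neutral_key in FROM_MELLEA:
        backend_opts = rename_keys({neutral_key: 1}, FROM_MELLEA)
        assert rename_keys(backend_opts, TO_MELLEA) == {neutral_key: 1}


def test_max_new_tokens_anchor():
    # 3. One hardcoded check so a wrong string fails with a clear message.
    assert FROM_MELLEA["MAX_NEW_TOKENS"] == "num_predict", "MAX_NEW_TOKENS must map to num_predict"
```

The structural and round-trip checks stay valid as entries are added to the maps, while the anchor test guards the single mapping where a silent typo would be most costly.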
planetf1 added this pull request to the merge queue Apr 15, 2026
Development

Successfully merging this pull request may close these issues.

test: coverage wrap-up — remaining pure-logic gaps in stdlib, core, backends, telemetry