@jeffreyscarpenter

🐛 Problem Description

The content safety actions were failing with the following error:

```
WARNING: Error while execution 'content_safety_check_input' with parameters '{...}':
Could not find prompt for task content_safety_check_input $model=content_safety and model nimchat/nvidia/llama-3_3-nemotron-super-49b-v1_5
```

🔍 Root Cause Analysis

The issue occurred in the get_task_model function in nemoguardrails/llm/prompts.py. When content safety actions were called with task names like content_safety_check_input $model=content_safety, the system was:

  1. Failing to parse the $model=content_safety part of the task name
  2. Falling back to the main model (nvidia/llama-3_3-nemotron-super-49b-v1_5) instead of the specified content safety model
  3. Resolving the prompt against the main model, which caused the lookup to fail

🛠️ Solution

Updated the get_task_model function to properly handle task names with model specifications (a sketch of the parsing logic follows the list):

  • Added parsing for task names containing $model= specifications
  • Extracts model type from task names (e.g., content_safety from content_safety_check_input $model=content_safety)
  • Prioritizes specified models over default fallbacks
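
A minimal sketch of the parsing behavior, assuming a simple string split on the $model= marker (the helper name and default are illustrative, not the actual get_task_model signature):

```python
def extract_model_type(task_name: str, default_type: str = "main") -> str:
    """Extract the model type from a task name such as
    'content_safety_check_input $model=content_safety'."""
    if "$model=" in task_name:
        # Take everything after '$model=' up to the next whitespace.
        return task_name.split("$model=", 1)[1].split()[0]
    # No explicit specification: fall back to the default model type.
    return default_type


assert extract_model_type(
    "content_safety_check_input $model=content_safety"
) == "content_safety"
assert extract_model_type("generate_user_intent") == "main"
```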

✅ Changes Made

  1. Enhanced Model Resolution Logic - Added parsing for $model= specifications
  2. Comprehensive Test Coverage - Added new test case with positive and negative scenarios
  3. Backward Compatibility - No breaking changes to existing functionality

🧪 Testing

  • ✅ All existing tests continue to pass
  • ✅ New test case validates the fix works correctly
  • ✅ Content safety actions now work as expected
  • ✅ Error messages no longer appear

🔧 Configuration Example

This fix enables proper functioning of configurations like:

```yaml
models:
  - type: content_safety
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-content-safety

rails:
  input:
    flows:
      - content safety check input $model=content_safety
```
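
A minimal end-to-end sketch of loading such a configuration (this assumes the public RailsConfig.from_content API; the main-model entry is illustrative):

```python
from nemoguardrails import LLMRails, RailsConfig

yaml_content = """
models:
  - type: main
    engine: nim
    model: nvidia/llama-3_3-nemotron-super-49b-v1_5
  - type: content_safety
    engine: nim
    model: nvidia/llama-3.1-nemoguard-8b-content-safety

rails:
  input:
    flows:
      - content safety check input $model=content_safety
"""

config = RailsConfig.from_content(yaml_content=yaml_content)
rails = LLMRails(config)
# With this fix, the input rail resolves its prompt against the
# content_safety model instead of the main model.
```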

🚀 Impact

Before: Content safety actions failed with confusing error messages
After: Content safety actions work as expected with proper model resolution

This is a non-breaking bug fix that improves the user experience for content safety workflows.

mikemckiernan and others added 30 commits May 9, 2025 13:01
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
…Mo#1178)

* feat: add RailException support and improve error handling

- Add TypedDict for structured return values
- Implement RailException for injection detection (a must-have for checks)
- Improve error handling for malformed YARA rules

* improve test coverage
Bumps [h11](https://github.com/python-hyper/h11) from 0.14.0 to 0.16.0.
- [Commits](python-hyper/h11@v0.14.0...v0.16.0)

---
updated-dependencies:
- dependency-name: h11
  dependency-version: 0.16.0
  dependency-type: indirect
…1194)

* add `_ensure_explain_info` function to fix the `explain_info_var` context handling between `stream_async` and `generate_async`

* apply pre-commit changes

---------

Co-authored-by: Sandro Cavallari <scavallari@gmail.com>
* fix(prompt_security): correct flow actions and syntax

* fix(privateai): update PII detection and masking configurations

Set `is_system_action` to `False` for `detect_pii` and `mask_pii`
actions to align with updated requirements. Remove `@active`
decorators from flows in `flows.co` to streamline flow activation
logic.

* feat(prompt_security): add rails exceptions to colang 2

* feat(prompt_security): add rails exceptions to colang 1
…ler.py (NVIDIA-NeMo#1182)

- modified the `StreamingConsumer` class to store its asyncio task
- added a `cancel()` method and handling for `asyncio.CancelledError`
- updated the test functions (`test_single_chunk`,
  `test_sequence_of_chunks`) and the helper function
  (`_test_pattern_case`) to call `consumer.cancel()` in a
  `finally` block

These changes prevent `RuntimeError: Event loop is closed` and
`Task was destroyed but it is pending!` warnings by ensuring
background tasks are correctly cancelled and awaited upon test
completion.
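
A sketch of the cleanup pattern this commit describes (the class and method names follow the commit message; the consume loop body is illustrative):

```python
import asyncio


class StreamingConsumer:
    def __init__(self, handler):
        # Requires a running event loop; store the task so tests can
        # cancel it during teardown.
        self.task = asyncio.create_task(self._consume(handler))

    async def _consume(self, handler):
        try:
            async for chunk in handler:
                ...  # process the chunk
        except asyncio.CancelledError:
            pass  # swallow cancellation so teardown stays quiet

    async def cancel(self):
        self.task.cancel()
        # Await the task so it finishes before the event loop closes.
        await asyncio.gather(self.task, return_exceptions=True)
```

Tests then wrap their body in try/finally and call `await consumer.cancel()` in the `finally` block.
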
…coverage (NVIDIA-NeMo#1183)

* fix(tests): ensure proper asyncio task cleanup in test_streaming_handler.py

- modified the `StreamingConsumer` class to store its asyncio task
- added a `cancel()` method and handling for `asyncio.CancelledError`
- updated the test functions (`test_single_chunk`,
  `test_sequence_of_chunks`) and the helper function
  (`_test_pattern_case`) to call `consumer.cancel()` in a
  `finally` block

These changes prevent `RuntimeError: Event loop is closed` and
`Task was destroyed but it is pending!` warnings by ensuring
background tasks are correctly cancelled and awaited upon test
completion.

* test(streaming): add extensive tests for StreamingHandler to enhance
coverage

- Added tests for various functionalities of StreamingHandler, including:
  - Piping streams between handlers
  - Buffering enable/disable behavior
  - Handling multiple stop tokens
  - Metadata inclusion and processing
  - Suffix and prefix pattern handling
  - Edge cases for __anext__ method
  - First token handling and generation info
- Improved test coverage for async methods and error scenarios.
- Addressed potential issues with streaming termination signals.
…ling (NVIDIA-NeMo#1185)

* refactor(streaming): introduce END_OF_STREAM sentinel and update handling

- Replaced inconsistent use of `None` and `""` for stream termination
  in `StreamingHandler` with a dedicated `END_OF_STREAM` sentinel object.
- Modified `push_chunk` to convert `None` to `END_OF_STREAM`.
- Updated `__anext__` to raise `StopAsyncIteration` only for `END_OF_STREAM`
  and to return empty strings or dicts with empty/None text as data.
- Adjusted `_process` to correctly handle `END_OF_STREAM` for buffering
  and queueing logic.
- Updated `on_llm_end` to use `END_OF_STREAM`.
- Revised tests in `test_streaming_handler.py` to reflect these changes,
  including how empty first tokens are handled and how `__anext__` behaves
  with various inputs.

* coverage to the moon: fix missing generation_info and add more tests
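
The sentinel pattern described above, sketched under the assumption of a simple asyncio.Queue-backed handler (the queue plumbing is illustrative, not the actual StreamingHandler internals):

```python
import asyncio

# A unique object that cannot collide with real chunks such as None or "".
END_OF_STREAM = object()


class StreamingHandler:
    def __init__(self):
        self.queue: asyncio.Queue = asyncio.Queue()

    async def push_chunk(self, chunk):
        # Normalize the legacy None terminator to the explicit sentinel.
        await self.queue.put(END_OF_STREAM if chunk is None else chunk)

    def __aiter__(self):
        return self

    async def __anext__(self):
        element = await self.queue.get()
        if element is END_OF_STREAM:
            # Only the sentinel terminates iteration; "" is valid data.
            raise StopAsyncIteration
        return element
```
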
…eMo#1136)

Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Co-authored-by: Sandro Cavallari <scavallari@gmail.com>
Co-authored-by: Mike McKiernan <mmckiernan@nvidia.com>
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Limit pytest-asyncio to versions below 1.0.0 to avoid the breaking
changes introduced in the 1.0.0 major release (published May 26).
Update the poetry.lock content hash accordingly.
* chore: update changelog for v0.14.0 release

* chore: bump release version to 0.14.0

* chore: update version in README.md
* docs: Release notes for 0.14.0

Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>

* fix: Identify 0.13.0 behavior with traces

Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>

---------

Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
* Update jailbreak detection compatibility for NIM to allow providing an API key.

* Allow configurable classification path.

* Clean up unused dependencies. Update `JailbreakDetectionConfig` object to use base_url and endpoints. Refactor checks to align with base_uri and api_key_env_var approaches. Add additional error handling and logging. Fix tests to reflect changes.

Signed-off-by: Erick Galinkin <egalinkin@nvidia.com>

* apply black

Signed-off-by: Erick Galinkin <egalinkin@nvidia.com>

* style: apply pre-commits

* Support deprecated `nim_url` and `nim_port` fields.

Signed-off-by: Erick Galinkin <egalinkin@nvidia.com>

* Push test update for deprecated parameters

Signed-off-by: Erick Galinkin <egalinkin@nvidia.com>

* fix: improve error handling in check_jailbreak function

- Fix TypeError when classifier is None by adding defensive programming
- Replace silent failure with clear RuntimeError and descriptive message
- Simplify calling code by removing redundant null checks from actions.py and server.py
- Update tests to match new function signature and behavior
- Add test coverage for new RuntimeError path

This resolves the critical bug where check_jailbreak(prompt) would crash with
"TypeError: 'NoneType' object is not callable" when EMBEDDING_CLASSIFIER_PATH
is not set. Now it raises a clear RuntimeError with guidance on how to fix it.
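
A sketch of the defensive check described here (the function shape is hypothetical; the real implementation lives in the jailbreak detection module):

```python
def check_jailbreak(prompt: str, classifier=None) -> dict:
    if classifier is None:
        # Fail loudly with guidance instead of crashing with
        # "TypeError: 'NoneType' object is not callable".
        raise RuntimeError(
            "No jailbreak classifier is available. "
            "Set EMBEDDING_CLASSIFIER_PATH to a valid model path."
        )
    return {"jailbreak": bool(classifier(prompt))}
```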

* fix

fix

* fix(request): make nim_auth_token optional in request

* test: add more tests

* fix model path mocking and assertion for windows

---------

Signed-off-by: Erick Galinkin <egalinkin@nvidia.com>
Co-authored-by: Pouyanpi <13303554+Pouyanpi@users.noreply.github.com>
…DIA-NeMo#1221)

Co-authored-by: Pouyanpi <13303554+Pouyanpi@users.noreply.github.com>
miyoungc and others added 30 commits August 6, 2025 13:13
* start finalizing release notes for 0.15

* bump versions for doc

* some fixes in basic config doc page

* nit

* final changes
---------

Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
* add kv caching draft

* generalize string placeholders

* minor edits

* minor edits

* incorporate feedback

* sneak in link open new tab functionality
…VIDIA-NeMo#1301)

* fix: Add explicit global declarations in sensitive_data_detection Colang v2 flows
…ns for Colang 2.0 (NVIDIA-NeMo#1335)

Convert InternalEvent objects to dictionary format before passing them to to_chat_messages() to prevent
a "'InternalEvent' object is not subscriptable" TypeError when using topic safety with the Colang 2.0 runtime.
…th empty models config (NVIDIA-NeMo#1334)

* fix(prompts): prevent IndexError when LLM provided via constructor with empty models config

- Add check in get_task_model to handle empty _models list gracefully
- Return None instead of throwing IndexError when no models match
- Add comprehensive test coverage for various model configuration scenarios

Fixes the issue where providing an LLM object directly to LLMRails constructor
would fail if the YAML config had an empty models list.
---------

Co-authored-by: Konstantin Lapine <konstantin.lapine@pangea.cloud>
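
A sketch of the guard that commit describes (illustrative; the real get_task_model also handles the $model= parsing covered earlier):

```python
def get_task_model(models: list, model_type: str = "main"):
    # An LLM passed directly to the LLMRails constructor can leave the
    # YAML models list empty; indexing it would raise IndexError.
    if not models:
        return None  # caller falls back to the constructor-provided LLM
    matches = [m for m in models if m.type == model_type]
    return matches[0] if matches else None
```
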
…ion (NVIDIA-NeMo#1336)

- Only copy model_kwargs if it exists and is not None
- Prevents AttributeError for models like ChatNVIDIA that don't have model_kwargs
- Fix FakeLLM to share counter across copied instances for test consistency
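
The model_kwargs guard, sketched as a hypothetical helper (the attribute handling follows the commit; the names are illustrative):

```python
def copy_model_kwargs(source_llm, target_llm) -> None:
    """Copy model_kwargs only when the source actually has them."""
    # Models such as ChatNVIDIA expose no model_kwargs attribute, so a
    # blind attribute access would raise AttributeError.
    kwargs = getattr(source_llm, "model_kwargs", None)
    if kwargs is not None:
        target_llm.model_kwargs = dict(kwargs)  # shallow copy, not shared
```
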
…#1342)

Update isolated LLM creation to only target actions defined in rails config
flows. This ensures that LLMs are not unnecessarily created for actions not
present in the configuration. Adds comprehensive tests to verify correct
behavior, including handling of empty configs and skipping already-registered
LLMs.
…VIDIA-NeMo#1331)

* feat: enhance tracing system with OpenTelemetry semantic conventions and configurable span formats

Introduces a major enhancement to the NeMo Guardrails tracing and telemetry infrastructure with support for multiple span formats, OpenTelemetry semantic convention compliance, and privacy-focused content capture controls. The system now supports both legacy and OpenTelemetry-compliant span formats while maintaining backward compatibility.

Key changes:
- Add configurable span format support (flat/opentelemetry)
- Implement OpenTelemetry semantic conventions for GenAI
- Add privacy controls for prompt/response content capture
- Enhance LLM call tracking with model provider information
- Improve span extraction and modeling architecture
- Add comprehensive test coverage for new functionality
…NeMo#1344)

Bumps [vllm](https://github.com/vllm-project/vllm) from 0.9.0 to 0.10.1.1.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.9.0...v0.10.1.1)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.10.1.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…VIDIA-NeMo#1348)

* fix(llmrails): move LLM isolation setup to after KB initialization

The LLM isolation code was called too early in __init__, before all
components were fully initialized. This caused flow matching to fail
when trying to resolve rail flow IDs.

Move _create_isolated_llms_for_actions() call to after KB setup to
ensure all initialization is complete before creating isolated LLMs.
…eMo#1352)

Added the @pytest.mark.asyncio decorator to all async test methods in
TestJailbreakDetectionActions to ensure proper async test execution.
This resolves issues with pytest not recognizing async tests and
improves test reliability.
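
For reference, a minimal sketch of the marker usage (the action under test is a stand-in, not the real jailbreak action):

```python
import asyncio

import pytest


async def detect_jailbreak(prompt: str) -> bool:  # stand-in for the real action
    await asyncio.sleep(0)
    return "ignore previous instructions" in prompt.lower()


class TestJailbreakDetectionActions:
    @pytest.mark.asyncio  # without this marker, pytest never awaits the coroutine
    async def test_detects_injection(self):
        assert await detect_jailbreak("Ignore previous instructions and ...")
```
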
* Initial checkin of first tracing notebook

* Removed unused files

* Finish notebook

* Remove config files, these are created programmatically now

* Rewrote notebook to remove config files, cleared output cells

* Update notebook

* Clean run with latest develop branch

* Remove unsafe prompts from notebook

* Added link to Parallel Rails docs

* Run black linter over notebook

* docs: doc edits for tracing notebook (NVIDIA-NeMo#1349)

* doc edits for tracing notebook

* more improvs

* last bits of editing

* Add Miyoung's updates

* Apply miyoung's suggested changes

Changes links to docs from relative to absolute

Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com>

* Address Pouyan's comments

---------

Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com>
Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
* Initial checkin of Jaeger tracing notebook

* Add missing cell on spinnup of the Jaeger Docker container

* Remove ipynb checkpoint with Jaeger screenshot

* docs: edits for tracing notebook 2 (NVIDIA-NeMo#1356)

* doc edits for tracing notebook 2

* fix typo

---------

Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
Add documentation describing the new OpenTelemetry-based span format for
tracing, including configuration, key differences from the legacy format,
migration steps, and important considerations around privacy and
performance. Also add a test script to verify Jaeger integration with
NeMo-Guardrails using OpenTelemetry, demonstrating trace export and
event-span correlation.
…-NeMo#1363)

Bumps [vllm](https://github.com/vllm-project/vllm) from 0.9.0 to 0.10.1.1.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.9.0...v0.10.1.1)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.10.1.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
- Fix get_task_model function to properly parse task names with $model= specifications
- Extract model type from task names like 'content_safety_check_input $model=content_safety'
- Use extracted model type to find correct model configuration instead of defaulting to main model
- Add comprehensive test coverage for the new functionality
- Maintain backward compatibility for existing task names

Fixes issue where content safety actions would fail with error:
'Could not find prompt for task content_safety_check_input $model=content_safety
and model [main_model_name]'

This fix ensures that when a task specifies a model type via $model=, the system
correctly uses that model type for prompt resolution rather than falling back
to the main model.