@szalpal
Member

@szalpal szalpal commented Dec 16, 2025

Category:

Refactoring (Redesign of existing code that doesn't affect functionality)

Description:

This PR promotes the following operators out of experimental:
experimental.audio_resample
experimental.debayer
experimental.equalize
experimental.filter
experimental.tensor_resize
nvidia.dali.plugin.numba.fn.experimental.numba_function

The experimental flavours of the operators are kept to maintain backwards compatibility.

Tests are configured to cover both the experimental and the regular flavours of each operator. Because backwards compatibility is maintained, both flavours need to be tested.
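
The compatibility approach described above can be pictured with a small, library-independent sketch. The names `debayer`, `experimental_debayer`, and `_deprecated_alias` are hypothetical stand-ins and are not DALI's actual registration mechanism; the sketch only illustrates keeping an old name alive as a thin alias that warns and forwards to the promoted implementation.

```python
import functools
import warnings


def debayer(data):
    """Hypothetical stand-in for a promoted (non-experimental) operator."""
    return [2 * x for x in data]


def _deprecated_alias(new_fn, old_name):
    """Build an alias that warns on each call and forwards to new_fn."""

    @functools.wraps(new_fn)
    def alias(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_fn.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_fn(*args, **kwargs)

    return alias


# The old experimental name keeps working but runs the same code.
experimental_debayer = _deprecated_alias(debayer, "experimental.debayer")
```

Because both names reach the same function, a behavioural difference between them could only come from the alias wrapper itself.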

TODO:

  • Adjust documentation (hide experimental flavours)
  • Unexperimentalize decoders.video

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-4504


def check_numba_compatibility_gpu(if_skip=True):
import nvidia.dali.plugin.numba.experimental as ex
def check_numba_compatibility_gpu(if_skip=True, use_experimental: bool = False):

Check notice

Code scanning / CodeQL

Explicit returns mixed with implicit (fall through) returns Note test

Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.
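
To illustrate what this notice flags, here is a minimal sketch that is unrelated to the actual body of `check_numba_compatibility_gpu`. The functions `find_index_mixed` and `find_index_explicit` are hypothetical; the second shows the usual remedy of stating the return value on every exit path.

```python
def find_index_mixed(items, target):
    """Mixes an explicit return with an implicit fall-through return."""
    for i, item in enumerate(items):
        if item == target:
            return i
    # Falls off the end here, so the caller silently receives None.


def find_index_explicit(items, target):
    """Same behaviour, but every exit path states its return value."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return None
```
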
@szalpal
Member Author

szalpal commented Dec 16, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40306038]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [40306038]: BUILD FAILED

@szalpal szalpal force-pushed the unexperimentalize branch 3 times, most recently from e630529 to 51e599d Compare December 17, 2025 07:12
# try importing cuda.core as it can be used later to check the compatibility
# it is okay to fail as it may not be installed, the check later can handle this
import cuda.core
except ImportError:

Check notice

Code scanning / CodeQL

Empty except Note

'except' clause does nothing but pass and there is no explanatory comment.
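
One common way to address this kind of notice is to put the explanation inside the `except` clause and record the outcome instead of using a bare `pass`. The sketch below assumes a module-level flag named `HAS_CUDA_CORE`, which is hypothetical and not taken from the PR.

```python
try:
    # Optional dependency: only needed later for a compatibility check.
    import cuda.core  # noqa: F401

    HAS_CUDA_CORE = True
except ImportError:
    # Fine to continue: cuda.core may not be installed, and the later
    # check is expected to handle its absence.
    HAS_CUDA_CORE = False
```
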
@szalpal
Member Author

szalpal commented Dec 17, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40349534]: BUILD STARTED

@szalpal szalpal force-pushed the unexperimentalize branch 2 times, most recently from a3c3ac6 to a424878 Compare December 17, 2025 07:41
@szalpal
Member Author

szalpal commented Dec 17, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40350589]: BUILD STARTED

@szalpal szalpal marked this pull request as ready for review December 17, 2025 11:06
@greptile-apps

greptile-apps bot commented Dec 17, 2025

Greptile Summary

This PR promotes several DALI operators from experimental to stable status while maintaining full backwards compatibility. The following operators are now available without the experimental prefix:

  • fn.audio_resample (from fn.experimental.audio_resample)
  • fn.debayer (from fn.experimental.debayer)
  • fn.equalize (from fn.experimental.equalize)
  • fn.filter (from fn.experimental.filter)
  • fn.tensor_resize (from fn.experimental.tensor_resize)
  • fn.decoders.video (from fn.experimental.decoders.video)
  • nvidia.dali.plugin.numba.fn.numba_function (from nvidia.dali.plugin.numba.fn.experimental.numba_function)

Key Implementation Details:

  • All experimental operator names are preserved as deprecated aliases pointing to the new stable versions
  • C++ schemas use DALI_SCHEMA().Deprecate() to mark experimental variants as deprecated with proper documentation
  • Python numba plugin moved NumbaFunction class to the parent module with re-exports for backwards compatibility
  • Tests are updated to verify both experimental and regular operator variants work correctly

No issues found. The refactoring is well-executed with proper deprecation handling and comprehensive test coverage for both old and new API paths.

Confidence Score: 5/5

  • This PR is safe to merge - it's a clean refactoring that promotes stable operators while preserving backwards compatibility.
  • The PR is a straightforward refactoring with no functional changes. All experimental operators are preserved as deprecated aliases, ensuring existing code continues to work. Tests are updated to verify both old and new API paths.
  • No files require special attention - all changes follow a consistent pattern.

Important Files Changed

Filename: Overview

  • dali/operators/generic/resize/tensor_resize_cpu.cc: Renamed DALI_SCHEMA from experimental__TensorResize to TensorResize and added deprecated alias for backwards compatibility. Both experimental and regular operators are registered correctly.
  • dali/operators/image/color/debayer.cc: Renamed DALI_SCHEMA from experimental__Debayer to Debayer with a deprecated alias for experimental version. Documentation properly hidden for deprecated alias.
  • dali/operators/image/color/equalize.cc: Renamed DALI_SCHEMA from experimental__Equalize to Equalize with deprecated alias. CPU operator registration added for both variants.
  • dali/operators/image/convolution/filter.cc: Renamed DALI_SCHEMA from experimental__Filter to Filter with deprecated alias. CPU operator registration added for both variants.
  • dali/operators/video/decoder/video_decoder_cpu.cc: Renamed DALI_SCHEMA from experimental__decoders__Video to decoders__Video with deprecated alias. CPU operator registration added for both variants.
  • dali/python/nvidia/dali/plugin/numba/__init__.py: Moved NumbaFunction class from experimental submodule to the main numba module. Registered both regular and experimental fn wrappers for backwards compatibility.
  • dali/test/python/test_dali_variable_batch_size.py: Updated variable batch size tests to include both experimental and regular operator variants. Added entries in tested_methods list for new operator names.

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant ExpAPI as fn.experimental.*
    participant StableAPI as fn.*
    participant OpImpl as Operator Implementation

    Note over User,OpImpl: Before PR (Experimental Only)
    User->>ExpAPI: fn.experimental.debayer()
    ExpAPI->>OpImpl: experimental__Debayer
    OpImpl-->>User: Result

    Note over User,OpImpl: After PR (Both APIs Work)
    User->>StableAPI: fn.debayer()
    StableAPI->>OpImpl: Debayer
    OpImpl-->>User: Result

    User->>ExpAPI: fn.experimental.debayer()
    ExpAPI->>StableAPI: Deprecated alias → Debayer
    StableAPI->>OpImpl: Debayer
    OpImpl-->>User: Result

@greptile-apps greptile-apps bot left a comment

Additional Comments (6)

  1. dali/operators/image/color/equalize.cc, line 37 (link)

    style: The docstring reference should use :meth:`nvidia.dali.fn.equalize` for consistency with DALI's documentation standards

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. dali/python/nvidia/dali/plugin/numba/__init__.pyi, line 35 (link)

    logic: Return type annotation may be incorrect - setup_fn appears to return None but the signature suggests it takes parameters and returns None

    Should the setup_fn callable return type be None or should it return something else based on the actual implementation?

  3. dali/python/nvidia/dali/plugin/numba/fn/__init__.pyi, line 36 (link)

    syntax: The setup_fn parameter type annotation appears malformed - it has None as a return type in the callable signature but should likely return something or be Optional[Callable[...]] without the trailing None

  4. dali/operators/generic/resize/tensor_resize_cpu.cc, line 49-62 (link)

    style: The deprecated schema definition duplicates parent metadata (NumInput, NumOutput, SupportVolumetric, AllowSequences) that is already inherited from "TensorResize". Since AddParent("TensorResize") inherits all properties, these redundant specifications are unnecessary.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  5. dali/test/python/checkpointing/test_dali_stateless_operators.py, line 643-646 (link)

    logic: inconsistent with other tests - this test only uses fn.audio_resample but the decorator includes both experimental and non-experimental versions

  6. dali/python/nvidia/dali/plugin/numba/__init__.py, line 187 (link)

    logic: Incorrect variable used: should be num_ins instead of num_outs for the input shapes calculation
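
On comments 2 and 3 above: in Python type hints, `Callable[[...], None]` is the ordinary way to annotate a callback that returns nothing, and wrapping it in `Optional[...]` makes the parameter itself optional. The sketch below is an assumption about the general shape only; `Shape`, `SetupFn`, `numba_function_stub`, and `my_setup` are hypothetical names and do not reproduce the real stub file.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical shape alias, used only for this illustration.
Shape = Sequence[int]

# A callable that takes output and input shapes and returns nothing.
SetupFn = Callable[[List[Shape], List[Shape]], None]


def numba_function_stub(setup_fn: Optional[SetupFn] = None) -> bool:
    """Report whether a setup callback was supplied (illustrative only)."""
    return setup_fn is not None


def my_setup(outs: List[Shape], ins: List[Shape]) -> None:
    """Example callback matching SetupFn; returns None by design."""
    return None
```
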

25 files reviewed, 6 comments


@szalpal szalpal marked this pull request as draft December 17, 2025 13:13
@szalpal
Member Author

szalpal commented Dec 17, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40376844]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [40376844]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [40349534]: BUILD FAILED

@szalpal
Member Author

szalpal commented Dec 17, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40397505]: BUILD STARTED

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
@szalpal szalpal marked this pull request as ready for review December 18, 2025 00:16
@szalpal
Member Author

szalpal commented Dec 18, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40400586]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [40350589]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [40397505]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [40400586]: BUILD FAILED

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
@szalpal
Member Author

szalpal commented Dec 18, 2025

!build

@dali-automaton
Collaborator

CI MESSAGE: [40430374]: BUILD STARTED

@JanuszL
Contributor

JanuszL commented Dec 19, 2025

A general remark: do we need to extensively test the legacy experimental aliases against the non-experimental operators, or would 1–2 basic tests confirming that the operator is available be sufficient? Since the underlying implementation is the same, I don't think thorough verification is necessary.

Member

I don't think we really need to test experimental here, I'd just mark both as tested and run the non-exp one.

Member

@stiepan stiepan left a comment

Echoing the comments on the tests: I am not a fan of doubling the number of tests run just to make sure the old aliases behave the same.

You could either run a subset of the suites with the experimental ops or, when parametrizing a large suite, make sure some of the cases (say, every second one) are assigned the experimental op, without changing the total number of tests.
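
The second option can be sketched as follows. This is a minimal illustration with hypothetical names (`OPS`, `cases`, `params`), not the PR's actual test code: the operator flavour is chosen by case index, so the experimental alias is exercised without adding any test cases.

```python
# Hypothetical operator names: regular first, deprecated alias second.
OPS = ("debayer", "experimental.debayer")

# Hypothetical parameter list of an existing large suite.
cases = [1, 2, 4, 8, 16, 32]

# Every second case is assigned the experimental alias; the number of
# generated tests equals the number of original cases.
params = [(OPS[i % 2], batch_size) for i, batch_size in enumerate(cases)]
```

With a multi-dimensional parameter grid, strict alternation can end up always pairing the alias with the same value of the innermost axis, so it may be worth checking how the two flavours are distributed.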

Running with the deprecated op is supposed to issue a warning. We have the assert_warns context manager, which would fit here nicely to: 1. test that we warn as we are supposed to, 2. prevent the deprecation warnings from showing up and cluttering the logs.
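
Both goals named above can be illustrated with the standard library alone. The helper `expect_deprecation` and the function `old_op` below are hypothetical and are not the test utility from the DALI repository; the sketch only shows that one context manager can both assert the warning and keep it out of the logs.

```python
import contextlib
import warnings


@contextlib.contextmanager
def expect_deprecation():
    """Capture warnings in the block and require a DeprecationWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        yield
    # Captured warnings are recorded rather than printed.
    if not any(issubclass(w.category, DeprecationWarning) for w in caught):
        raise AssertionError("expected a DeprecationWarning")


def old_op():
    """Hypothetical deprecated alias."""
    warnings.warn("old_op is deprecated", DeprecationWarning, stacklevel=2)
    return 42


with expect_deprecation():
    result = old_op()
```
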

@stiepan
Member

stiepan commented Dec 19, 2025

FYI (de632e0): @klecki pointed out that we can use the parent schema for less duplication. Indeed, it looks like only NumInput and NumOutput need to be redefined.
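
The idea of inheriting from a parent schema and redefining only a few properties can be pictured with a loose Python analogy. This is not how DALI schemas are implemented: the dictionaries, the `Deprecated` key, and all values below are placeholders chosen for illustration, and only the property names come from the review thread.

```python
from collections import ChainMap

# Hypothetical parent property set; values are placeholders.
tensor_resize = {
    "NumInput": 1,
    "NumOutput": 1,
    "SupportVolumetric": True,
    "AllowSequences": True,
}

# The deprecated alias redefines only what it must; every other
# property is looked up in the parent.
experimental_tensor_resize = ChainMap(
    {"NumInput": 1, "NumOutput": 1, "Deprecated": True},
    tensor_resize,
)
```
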


6 participants