
test_vmm_allocator_policy_configuration failure: Windows / A6000 / WDDM #1264

@rwgk

Description

Tracking the failure below.

xref: #1242 (comment)

All details are in the full logs:

qa_bindings_windows_2025-11-18+102913_build_log.txt

qa_bindings_windows_2025-11-18+102913_tests_log.txt

The only non-obvious detail:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0 was installed from cuda_13.0.1_windows.exe

EDIT: The exact same error appeared when retesting with v13.0 installed from cuda_13.0.2_windows.exe.

C:\Users\rgrossekunst\forked\cuda-python>nvidia-smi
Tue Nov 18 10:31:56 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.34                 Driver Version: 591.34         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000             WDDM  |   00000000:C1:00.0 Off |                  Off |
| 30%   31C    P8             19W /  300W |    1778MiB /  49140MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
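
For triage, it may help to capture what the driver itself reports for VMM and handle-type support on this device under WDDM. Below is a minimal sketch, assuming cuda.bindings is importable (as in this repo); the attribute names are standard CUDA driver device attributes, and the tuple-return convention is cuda-python's:

    # Query the driver-reported capabilities that gate this test.
    from cuda.bindings import driver

    (err,) = driver.cuInit(0)
    assert err == driver.CUresult.CUDA_SUCCESS
    err, dev = driver.cuDeviceGet(0)
    assert err == driver.CUresult.CUDA_SUCCESS

    for name in (
        "CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED",
        "CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_WIN32_HANDLE_SUPPORTED",
        "CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_WIN32_KMT_HANDLE_SUPPORTED",
        "CU_DEVICE_ATTRIBUTE_GPU_DIRECT_RDMA_SUPPORTED",
        "CU_DEVICE_ATTRIBUTE_GPU_DIRECT_RDMA_WITH_CUDA_VMM_SUPPORTED",
    ):
        attr = getattr(driver.CUdevice_attribute, name)
        err, value = driver.cuDeviceGetAttribute(attr, dev)
        assert err == driver.CUresult.CUDA_SUCCESS
        print(f"{name} = {value}")
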
================================== FAILURES ===================================
___________________ test_vmm_allocator_policy_configuration ___________________

    def test_vmm_allocator_policy_configuration():
        """Test VMM allocator with different policy configurations.
    
        This test verifies that VirtualMemoryResource can be configured
        with different allocation policies and that the configuration affects
        the allocation behavior.
        """
        device = Device()
        device.set_current()
    
        # Skip if virtual memory management is not supported
        if not device.properties.virtual_memory_management_supported:
            pytest.skip("Virtual memory management is not supported on this device")
    
        # Skip if GPU Direct RDMA is supported (we want to test the unsupported case)
        if not device.properties.gpu_direct_rdma_supported:
            pytest.skip("This test requires a device that doesn't support GPU Direct RDMA")
    
        # Test with custom VMM config
        custom_config = VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="minimum",
            gpu_direct_rdma=True,
            handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
            peers=(),
            self_access="rw",
            peer_access="rw",
        )
    
        vmm_mr = VirtualMemoryResource(device, config=custom_config)
    
        # Verify configuration is applied
        assert vmm_mr.config == custom_config
        assert vmm_mr.config.gpu_direct_rdma is True
        assert vmm_mr.config.granularity == "minimum"
    
        # Test allocation with custom config
        buffer = vmm_mr.allocate(8192)
        assert buffer.size >= 8192
        assert buffer.device_id == device.device_id
    
        # Test policy modification
        new_config = VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="recommended",
            gpu_direct_rdma=False,
            handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
            peers=(),
            self_access="r",  # Read-only access
            peer_access="r",
        )
    
        # Modify allocation policy
>       modified_buffer = vmm_mr.modify_allocation(buffer, 16384, config=new_config)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests\test_memory.py:440: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cuda\core\experimental\_memory\_virtual_memory_resource.py:230: in modify_allocation
    raise_if_driver_error(res)
cuda\core\experimental\_utils\cuda_utils.pyx:67: in cuda.core.experimental._utils.cuda_utils._check_driver_error
    cpdef inline int _check_driver_error(cydriver.CUresult error) except?-1 nogil:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise CUDAError(f"{name.decode()}: {expl}")
E   cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.

cuda\core\experimental\_utils\cuda_utils.pyx:78: CUDAError
=========================== short test summary info ===========================
SKIPPED [6] tests\example_tests\utils.py:37: cupy not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:37: torch not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:43: skip C:\Users\rgrossekunst\forked\cuda-python\cuda_core\tests\example_tests\..\..\examples\thread_block_cluster.py
SKIPPED [5] tests\memory_ipc\test_errors.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:91: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_event_ipc.py:106: Device does not support IPC
SKIPPED [8] tests\memory_ipc\test_event_ipc.py:123: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_leaks.py:26: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [12] tests\memory_ipc\test_leaks.py:82: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:16: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:53: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:103: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:153: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_send_buffers.py:18: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:24: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:79: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:125: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:29: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:65: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:109: Device does not support IPC
SKIPPED [1] tests\test_device.py:327: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_device.py:375: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_launcher.py:92: Driver or GPU not new enough for thread block clusters
SKIPPED [1] tests\test_launcher.py:122: Driver or GPU not new enough for thread block clusters
SKIPPED [2] tests\test_launcher.py:274: cupy not installed
SKIPPED [1] tests\test_linker.py:113: nvjitlink requires lto for ptx linking
SKIPPED [1] tests\test_memory.py:514: This test requires a device that doesn't support GPU Direct RDMA
SKIPPED [1] tests\test_memory.py:645: Driver rejects IPC-enabled mempool creation on this platform
SKIPPED [7] tests\test_module.py:345: Test requires numba to be installed
SKIPPED [2] tests\test_module.py:389: Device with compute capability 90 or higher is required for cluster support
SKIPPED [1] tests\test_module.py:404: Device with compute capability 90 or higher is required for cluster support
SKIPPED [2] tests\test_utils.py: got empty parameter set for (in_arr, use_stream)
SKIPPED [1] tests\test_utils.py: CuPy is not installed
FAILED tests/test_memory.py::test_vmm_allocator_policy_configuration - cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.
============ 1 failed, 518 passed, 75 skipped in 68.77s (0:01:08) =============
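
For reference, here is a stripped-down reproducer for just the failing step, using the same names that appear in the traceback above. This is an untested sketch; the import path is assumed to match what tests/test_memory.py uses:

    from cuda.core.experimental import (
        Device,
        VirtualMemoryResource,
        VirtualMemoryResourceOptions,
    )

    device = Device()
    device.set_current()

    mr = VirtualMemoryResource(
        device,
        config=VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="minimum",
            gpu_direct_rdma=True,
            handle_type="win32_kmt",
            peers=(),
            self_access="rw",
            peer_access="rw",
        ),
    )
    buffer = mr.allocate(8192)

    # The failing step: grow the allocation to 16384 bytes under a new policy
    # (recommended granularity, RDMA off, read-only access).
    new_config = VirtualMemoryResourceOptions(
        allocation_type="pinned",
        location_type="device",
        granularity="recommended",
        gpu_direct_rdma=False,
        handle_type="win32_kmt",
        peers=(),
        self_access="r",
        peer_access="r",
    )
    mr.modify_allocation(buffer, 16384, config=new_config)  # CUDA_ERROR_UNKNOWN here

If this reproduces outside pytest, the next step would be narrowing down which driver call inside modify_allocation (cuda\core\experimental\_memory\_virtual_memory_resource.py:230) returns the error.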


Labels: bug (Something isn't working), cuda.core (Everything related to the cuda.core module)
