Labels: bug (Something isn't working), cuda.core (Everything related to the cuda.core module)
Description
Tracking the failure below.
xref: #1242 (comment)
All details are in the full logs:
qa_bindings_windows_2025-11-18+102913_build_log.txt
qa_bindings_windows_2025-11-18+102913_tests_log.txt
The only non-obvious detail:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0 was installed from cuda_13.0.1_windows.exe.
EDIT: The exact same error appeared when retesting with v13.0 installed from cuda_13.0.2_windows.exe.
C:\Users\rgrossekunst\forked\cuda-python>nvidia-smi
Tue Nov 18 10:31:56 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.34 Driver Version: 591.34 CUDA Version: 13.1 |
+-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX A6000 WDDM | 00000000:C1:00.0 Off | Off |
| 30% 31C P8 19W / 300W | 1778MiB / 49140MiB | 2% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
================================== FAILURES ===================================
___________________ test_vmm_allocator_policy_configuration ___________________
def test_vmm_allocator_policy_configuration():
    """Test VMM allocator with different policy configurations.

    This test verifies that VirtualMemoryResource can be configured
    with different allocation policies and that the configuration affects
    the allocation behavior.
    """
    device = Device()
    device.set_current()
    # Skip if virtual memory management is not supported
    if not device.properties.virtual_memory_management_supported:
        pytest.skip("Virtual memory management is not supported on this device")
    # Skip if GPU Direct RDMA is supported (we want to test the unsupported case)
    if not device.properties.gpu_direct_rdma_supported:
        pytest.skip("This test requires a device that doesn't support GPU Direct RDMA")
    # Test with custom VMM config
    custom_config = VirtualMemoryResourceOptions(
        allocation_type="pinned",
        location_type="device",
        granularity="minimum",
        gpu_direct_rdma=True,
        handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
        peers=(),
        self_access="rw",
        peer_access="rw",
    )
    vmm_mr = VirtualMemoryResource(device, config=custom_config)
    # Verify configuration is applied
    assert vmm_mr.config == custom_config
    assert vmm_mr.config.gpu_direct_rdma is True
    assert vmm_mr.config.granularity == "minimum"
    # Test allocation with custom config
    buffer = vmm_mr.allocate(8192)
    assert buffer.size >= 8192
    assert buffer.device_id == device.device_id
    # Test policy modification
    new_config = VirtualMemoryResourceOptions(
        allocation_type="pinned",
        location_type="device",
        granularity="recommended",
        gpu_direct_rdma=False,
        handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
        peers=(),
        self_access="r",  # Read-only access
        peer_access="r",
    )
    # Modify allocation policy
>   modified_buffer = vmm_mr.modify_allocation(buffer, 16384, config=new_config)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests\test_memory.py:440:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cuda\core\experimental\_memory\_virtual_memory_resource.py:230: in modify_allocation
raise_if_driver_error(res)
cuda\core\experimental\_utils\cuda_utils.pyx:67: in cuda.core.experimental._utils.cuda_utils._check_driver_error
cpdef inline int _check_driver_error(cydriver.CUresult error) except?-1 nogil:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> raise CUDAError(f"{name.decode()}: {expl}")
E cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.
cuda\core\experimental\_utils\cuda_utils.pyx:78: CUDAError
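For anyone who wants to reproduce this outside the test suite, here is a minimal repro sketch distilled from the failing test. It assumes the cuda.core experimental API shown above (Device, VirtualMemoryResource, VirtualMemoryResourceOptions, modify_allocation) is importable from cuda.core.experimental; build_configs is a helper introduced here only to keep the two option sets side by side, and all option values are copied verbatim from the test.

```python
# Minimal repro sketch for the modify_allocation CUDA_ERROR_UNKNOWN failure.
# Assumption: Device, VirtualMemoryResource, and VirtualMemoryResourceOptions
# are importable from cuda.core.experimental, as in tests/test_memory.py.
import platform

IS_WINDOWS = platform.system() == "Windows"


def build_configs(options_cls):
    """Build the two option sets the test uses: the initial read-write
    config and the read-only config later passed to modify_allocation()."""
    handle_type = "win32_kmt" if IS_WINDOWS else "posix_fd"
    initial = options_cls(
        allocation_type="pinned",
        location_type="device",
        granularity="minimum",
        gpu_direct_rdma=True,
        handle_type=handle_type,
        peers=(),
        self_access="rw",
        peer_access="rw",
    )
    modified = options_cls(
        allocation_type="pinned",
        location_type="device",
        granularity="recommended",
        gpu_direct_rdma=False,
        handle_type=handle_type,
        peers=(),
        self_access="r",  # read-only access
        peer_access="r",
    )
    return initial, modified


if __name__ == "__main__":
    from cuda.core.experimental import (
        Device,
        VirtualMemoryResource,
        VirtualMemoryResourceOptions,
    )

    device = Device()
    device.set_current()
    initial, modified = build_configs(VirtualMemoryResourceOptions)
    mr = VirtualMemoryResource(device, config=initial)
    buffer = mr.allocate(8192)
    # On the Windows setup described above, this call raises
    # CUDA_ERROR_UNKNOWN instead of returning a resized buffer:
    mr.modify_allocation(buffer, 16384, config=modified)
```

Running the script on the affected machine should hit the same CUDA_ERROR_UNKNOWN without going through pytest, which may help narrow down whether the failure is specific to the win32_kmt handle type or to changing self_access/peer_access from "rw" to "r".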
=========================== short test summary info ===========================
SKIPPED [6] tests\example_tests\utils.py:37: cupy not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:37: torch not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:43: skip C:\Users\rgrossekunst\forked\cuda-python\cuda_core\tests\example_tests\..\..\examples\thread_block_cluster.py
SKIPPED [5] tests\memory_ipc\test_errors.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:91: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_event_ipc.py:106: Device does not support IPC
SKIPPED [8] tests\memory_ipc\test_event_ipc.py:123: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_leaks.py:26: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [12] tests\memory_ipc\test_leaks.py:82: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:16: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:53: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:103: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:153: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_send_buffers.py:18: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:24: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:79: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:125: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:29: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:65: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:109: Device does not support IPC
SKIPPED [1] tests\test_device.py:327: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_device.py:375: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_launcher.py:92: Driver or GPU not new enough for thread block clusters
SKIPPED [1] tests\test_launcher.py:122: Driver or GPU not new enough for thread block clusters
SKIPPED [2] tests\test_launcher.py:274: cupy not installed
SKIPPED [1] tests\test_linker.py:113: nvjitlink requires lto for ptx linking
SKIPPED [1] tests\test_memory.py:514: This test requires a device that doesn't support GPU Direct RDMA
SKIPPED [1] tests\test_memory.py:645: Driver rejects IPC-enabled mempool creation on this platform
SKIPPED [7] tests\test_module.py:345: Test requires numba to be installed
SKIPPED [2] tests\test_module.py:389: Device with compute capability 90 or higher is required for cluster support
SKIPPED [1] tests\test_module.py:404: Device with compute capability 90 or higher is required for cluster support
SKIPPED [2] tests\test_utils.py: got empty parameter set for (in_arr, use_stream)
SKIPPED [1] tests\test_utils.py: CuPy is not installed
FAILED tests/test_memory.py::test_vmm_allocator_policy_configuration - cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.
============ 1 failed, 518 passed, 75 skipped in 68.77s (0:01:08) =============