[python-package] Add test for converting a `ctypes` int64 pointer array to a NumPy array #7071

nicklamiller · 2025-10-24T18:02:47Z

Contributes to: #7031

Adds a test for `_cint64_array_to_numpy`:

LightGBM/python-package/lightgbm/basic.py

Lines 504 to 509 in 5dbfcdc

    
    def _cint64_array_to_numpy(*, cptr: "ctypes._Pointer", length: int) -> np.ndarray:
        """Convert a ctypes int pointer array to a numpy array."""
        if isinstance(cptr, ctypes.POINTER(ctypes.c_int64)):
            return np.ctypeslib.as_array(cptr, shape=(length,)).copy()
        else:
            raise RuntimeError("Expected int64 pointer")
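For readers unfamiliar with the helper, here is a minimal standalone sketch of its behavior, assuming only `ctypes` and NumPy are available. The function body is copied from the snippet above; the un-prefixed name `cint64_array_to_numpy` and the `c_buffer` array are illustrative stand-ins for memory that the C library would normally own. It also shows why the `.copy()` likely matters: the returned array no longer shares memory with the C buffer.

```python
import ctypes

import numpy as np


def cint64_array_to_numpy(*, cptr, length):
    """Standalone copy of the helper quoted above (illustrative name)."""
    if isinstance(cptr, ctypes.POINTER(ctypes.c_int64)):
        # as_array() creates a view on the C memory; copy() detaches it.
        return np.ctypeslib.as_array(cptr, shape=(length,)).copy()
    else:
        raise RuntimeError("Expected int64 pointer")


# Hypothetical buffer standing in for memory allocated by the C library.
c_buffer = (ctypes.c_int64 * 4)(0, 2, 5, 9)
cptr = ctypes.cast(c_buffer, ctypes.POINTER(ctypes.c_int64))

arr = cint64_array_to_numpy(cptr=cptr, length=4)

# Overwriting the C buffer afterwards does not change the copied array.
c_buffer[0] = 100

# A pointer of any other type is rejected.
wrong_ptr = ctypes.cast((ctypes.c_int32 * 4)(), ctypes.POINTER(ctypes.c_int32))
try:
    cint64_array_to_numpy(cptr=wrong_ptr, length=4)
    rejected = False
except RuntimeError:
    rejected = True
```

Here `arr` keeps the original values after the buffer is modified, and `rejected` ends up `True`.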

jameslamb

Thanks for working on this @nicklamiller

Before I review this... did you try to do this through LightGBM's public API?

From #7031:

Tests should only use lightgbm's public API, unless that is very difficult or expensive. Any function whose name begins with a _ is considered private.

This particular internal function is so small and simple that I think it'd be preferable to have tests which cover it via the public API. That'd give us more coverage of lightgbm under whatever conditions lead to this function being invoked. I'm not exactly sure which code paths do that (probably passing int64 arrays for label or something), so you'd have to do a bit of investigation.


nicklamiller · 2025-10-29T00:50:27Z

did you try to do this through LightGBM's public API?

@jameslamb thank you very much for the detailed instructions in #7031 and sorry for completely missing that point! Testing through the public API to mimic how functionality is called in the wild by users makes sense, and _cint64_array_to_numpy is now tested through Booster(...).predict(...).

Newly covered lines

Running:

pytest \
    --cov=lightgbm \
    --cov-report="term" \
    --cov-report="html:htmlcov" \
    tests/python_package_test/test_engine.py

and viewing the coverage report for python-package/lightgbm/basic.py (where _cint64_array_to_numpy is defined) on master and on this PR's branch:

[Screenshots: coverage report on master; coverage report on this PR's branch]

Some additional notes/thoughts:

_cint64_array_to_numpy only ever gets called if pred_contrib=True in Booster(...).predict(...); see the call chain below.

Even though _cint64_array_to_numpy is defined in basic.py, I noticed that some tests in test_engine.py directly test Booster(...).predict(...) with pred_contrib=True, so I placed my test in test_engine.py instead of test_basic.py. One notable example of such a test is test_contribs_sparse:

LightGBM/tests/python_package_test/test_engine.py

Lines 1929 to 1962 in f2a32f9

    
    def test_contribs_sparse():
        n_features = 20
        n_samples = 100
        # generate CSR sparse dataset
        X, y = make_multilabel_classification(
            n_samples=n_samples, sparse=True, n_features=n_features, n_classes=1, n_labels=2
        )
        y = y.flatten()
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
        params = {
            "objective": "binary",
            "verbose": -1,
        }
        lgb_train = lgb.Dataset(X_train, y_train)
        gbm = lgb.train(params, lgb_train, num_boost_round=20)
        contribs_csr = gbm.predict(X_test, pred_contrib=True)
        assert isspmatrix_csr(contribs_csr)
        # convert data to dense and get back same contribs
        contribs_dense = gbm.predict(X_test.toarray(), pred_contrib=True)
        # validate the values are the same
        if platform.machine() == "aarch64":
            np.testing.assert_allclose(contribs_csr.toarray(), contribs_dense, rtol=1, atol=1e-12)
        else:
            np.testing.assert_allclose(contribs_csr.toarray(), contribs_dense)
        assert np.linalg.norm(gbm.predict(X_test, raw_score=True) - np.sum(contribs_dense, axis=1)) < 1e-4
        # validate using CSC matrix
        X_test_csc = X_test.tocsc()
        contribs_csc = gbm.predict(X_test_csc, pred_contrib=True)
        assert isspmatrix_csc(contribs_csc)
        # validate the values are the same
        if platform.machine() == "aarch64":
            np.testing.assert_allclose(contribs_csc.toarray(), contribs_dense, rtol=1, atol=1e-12)
        else:
            np.testing.assert_allclose(contribs_csc.toarray(), contribs_dense)

I debated appending to this test, but given that it was already testing quite a few things, I decided to add test_predict_contrib_int64 as a separate test.

There are some failing CI jobs for older CUDA versions. It's not immediately obvious to me whether these are related to my changes, so I'll dig a bit deeper.

`_cint64_array_to_numpy` call chain from Booster(...).predict(...)

Booster.predict(pred_contrib=True)
    ↓
_InnerPredictor.predict(pred_contrib=True)
    ↓
__pred_for_csr(csr, predict_type=_C_API_PREDICT_CONTRIB)
    ↓
__inner_predict_csr_sparse(csr, predict_type=_C_API_PREDICT_CONTRIB)
    ↓
_LIB.LGBM_BoosterPredictSparseOutput()  # C API call
    ↓
__create_sparse_native(csr, out_ptr_indptr, ...)
    ↓
_cint64_array_to_numpy(cptr=out_ptr_indptr, length=indptr_len)
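To illustrate the last two steps of the chain, here is a rough sketch of how CSR pieces returned as raw C pointers could be turned back into NumPy arrays and then into a dense matrix. Everything in it is an assumption made for illustration: the three buffers stand in for output that the C API would allocate, and `pointer_to_numpy` and the densifying loop are hypothetical helpers, not LightGBM code. Only `ctypes` and NumPy are used.

```python
import ctypes

import numpy as np

# Hypothetical stand-ins for the output buffers of a sparse prediction call:
# a 2 x 3 matrix with three non-zero entries, stored in CSR layout.
indptr_buf = (ctypes.c_int64 * 3)(0, 2, 3)
indices_buf = (ctypes.c_int32 * 3)(0, 2, 1)
data_buf = (ctypes.c_double * 3)(0.5, -1.0, 2.0)


def pointer_to_numpy(buf, ctype, length):
    """Cast a ctypes buffer to a typed pointer and copy it into a NumPy array."""
    cptr = ctypes.cast(buf, ctypes.POINTER(ctype))
    return np.ctypeslib.as_array(cptr, shape=(length,)).copy()


indptr = pointer_to_numpy(indptr_buf, ctypes.c_int64, 3)
indices = pointer_to_numpy(indices_buf, ctypes.c_int32, 3)
data = pointer_to_numpy(data_buf, ctypes.c_double, 3)

# Row i owns the entries in positions indptr[i]:indptr[i + 1].
n_rows, n_cols = len(indptr) - 1, 3
dense = np.zeros((n_rows, n_cols))
for row in range(n_rows):
    start, stop = indptr[row], indptr[row + 1]
    dense[row, indices[start:stop]] = data[start:stop]
```

The `indptr` array is the piece that corresponds to `out_ptr_indptr` in the chain above; with an int64 row-pointer array it is the int64 conversion that gets exercised.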

Add test for _cint64_array_to_numpy

511070f

nicklamiller requested review from StrikerRUS, borchero, guolinke, jameslamb, jmoralez and shiyu1994 as code owners October 24, 2025 18:02

nicklamiller changed the title ~~Add test for converting a ctypes int64 pointer array to a NumPy array~~ [python-package] Add test for converting a ctypes int64 pointer array to a NumPy array Oct 24, 2025

jameslamb requested changes Oct 25, 2025

View reviewed changes

jameslamb added in progress maintenance labels Oct 25, 2025

nicklamiller added 5 commits October 25, 2025 10:34

Revert "Add test for _cint64_array_to_numpy"

195c487

This reverts commit 511070f.

Add test for _cint64_array_to_numpy through public API

643e3b3

Merge remote-tracking branch 'upstream/master' into cint64-arr-test

e43b31b

Move test_predict_contrib_int64 to test_engine.py

f674626

Use conventions in test_engine.py

cbf6908
