Commit 1ed773c
FEAT: streaming support in fetchone for varcharmax data type (#219)
### Work Item / Issue Reference
> [AB#38110](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/38110)
> [AB#34162](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34162)
-------------------------------------------------------------------
### Summary
This pull request improves how the MSSQL Python driver fetches large
object (LOB) data, such as large strings and binary values, by streaming
variable-length columns in chunks. The streaming logic prevents data
truncation and keeps type handling correct for both single-row and batch
fetches. The code also detects LOB columns and automatically switches to
per-row streaming when necessary, so large values are retrieved in full.
**LOB Streaming and Fetching Improvements:**
* Introduced the `FetchLobColumnData` function in `ddbc_bindings.cpp` to
stream LOB data (CHAR, WCHAR, and BINARY types) in chunks, correctly
handling nulls, null-terminators, and platform-specific encoding. This
prevents truncation and errors when fetching large columns.
* Updated `SQLGetData_wrap` to use streaming for LOB columns or when
data length is unknown/too large, for both narrow and wide character
types, as well as binary data. This ensures correct retrieval of all
data regardless of size.
[[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L1749-L1771)
[[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L1782-L1800)
[[3]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R1916-R1967)
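
The chunked retrieval described above can be illustrated with a minimal Python sketch. It is not the C++ code from `ddbc_bindings.cpp`: `make_reader`, `stream_lob`, and `CHUNK_SIZE` are hypothetical names, and the reader is an in-memory stand-in for whatever driver call returns the next piece of a column value.

```python
import io
from typing import Callable, Optional

CHUNK_SIZE = 4096  # hypothetical; the PR description does not state the chunk size


def make_reader(data: Optional[bytes]) -> Callable[[int], Optional[bytes]]:
    """Hypothetical stand-in for a driver call that returns the next chunk.

    The reader returns None for a NULL column, up to n bytes per call
    otherwise, and b"" once the value is exhausted.
    """
    buf = None if data is None else io.BytesIO(data)

    def read_chunk(n: int) -> Optional[bytes]:
        return None if buf is None else buf.read(n)

    return read_chunk


def stream_lob(read_chunk: Callable[[int], Optional[bytes]],
               chunk_size: int = CHUNK_SIZE) -> Optional[bytes]:
    """Accumulate a column value chunk by chunk so it is never truncated."""
    chunk = read_chunk(chunk_size)
    if chunk is None:
        return None  # NULL column: nothing to stream
    parts = []
    while chunk:  # stop when the reader reports no more data
        parts.append(chunk)
        chunk = read_chunk(chunk_size)
    return b"".join(parts)
```

For example, `stream_lob(make_reader(payload))` returns `payload` unchanged even when it is several times larger than `CHUNK_SIZE`, because the loop keeps reading until the source is exhausted instead of relying on one fixed-size buffer.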
**Batch Fetch Logic Enhancements:**
* Modified `FetchBatchData` to detect LOB columns and use streaming
fetch for those columns, avoiding exceptions and ensuring all data is
retrieved for large columns in batch operations.
[[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2319-R2415)
[[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2379-R2487)
[[3]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R2499-R2501)
[[4]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2424-R2515)
[[5]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2516-R2609)
* Updated `FetchMany_wrap` to pre-scan columns for LOB types and, if any
are found, fall back to row-by-row streaming fetch for those rows;
otherwise, it proceeds with standard batch fetching.
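
The pre-scan decision can be sketched in Python as follows. This is only an illustration of the idea, not the logic in `FetchMany_wrap`: `ColumnInfo`, `is_lob_column`, `choose_fetch_strategy`, and the set of type names are hypothetical, and treating a reported column size of 0 as "unbounded" is an assumption made for the sketch. The 8000 threshold is the `SQL_MAX_LOB_SIZE` value named in this description.

```python
from dataclasses import dataclass
from typing import Iterable

SQL_MAX_LOB_SIZE = 8000  # threshold named in the PR description

# Hypothetical set of type names that can hold LOB-sized values.
LOB_CAPABLE_TYPES = {"varchar", "nvarchar", "varbinary"}


@dataclass
class ColumnInfo:
    """Hypothetical column metadata; not the driver's own structure."""
    name: str
    type_name: str
    column_size: int  # assumption: 0 means the size is unknown or unbounded


def is_lob_column(col: ColumnInfo) -> bool:
    """Treat a column as a LOB when its size is unknown or above the threshold."""
    if col.type_name not in LOB_CAPABLE_TYPES:
        return False
    return col.column_size == 0 or col.column_size > SQL_MAX_LOB_SIZE


def choose_fetch_strategy(columns: Iterable[ColumnInfo]) -> str:
    """Pre-scan the columns once and pick a fetch path for the result set."""
    if any(is_lob_column(c) for c in columns):
        return "row_by_row"  # stream each row so large values are not cut off
    return "batch"           # fixed-size buffers are safe for every column
```

Scanning the metadata once up front keeps the common case (no LOB columns) on the faster batch path and only pays the per-row cost when a large column is actually present.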
**Type Mapping and Constants:**
* Adjusted `_map_sql_type` in `cursor.py` to map long string types to
`SQL_WVARCHAR`/`SQL_VARCHAR` with length 0 for streaming, aligning with
the new LOB streaming logic.
* Defined `SQL_MAX_LOB_SIZE` (8000) as the threshold for LOB streaming,
centralizing the logic for when to treat columns as LOBs.
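
As a rough illustration of the mapping described above, the following Python sketch returns an ODBC type code and a column size for a string parameter. It is not the body of `_map_sql_type`: the function name `map_string_param`, the ASCII check used to choose between narrow and wide types, and measuring the threshold in characters are all assumptions. The numeric values of `SQL_VARCHAR` and `SQL_WVARCHAR` are the standard ODBC constants.

```python
from typing import Tuple

SQL_VARCHAR = 12         # standard ODBC type code
SQL_WVARCHAR = -9        # standard ODBC type code
SQL_MAX_LOB_SIZE = 8000  # threshold named in the PR description


def map_string_param(value: str) -> Tuple[int, int]:
    """Return (sql_type, column_size) for a string parameter.

    Assumption: ASCII-only strings use the narrow type and all others the
    wide type. Strings longer than the threshold get column size 0, which
    signals that the value should be streamed rather than bound whole.
    """
    sql_type = SQL_VARCHAR if value.isascii() else SQL_WVARCHAR
    column_size = 0 if len(value) > SQL_MAX_LOB_SIZE else len(value)
    return sql_type, column_size
```

Under these assumptions, `map_string_param("abc")` yields `(SQL_VARCHAR, 3)`, while any string longer than 8000 characters yields a column size of 0.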
These changes collectively make LOB handling more robust, reduce the
risk of data truncation, and improve compatibility across platforms.
1 parent 99d7bd0 commit 1ed773c
4 files changed, +603 -373 lines changed (file tree: mssql_python, pybind, tests)