[Storage] Add support for CRC64 content validation #47001

Open
jalauzon-msft wants to merge 14 commits into main from feature/storage/content-validation-102

@jalauzon-msft
Member

This change adds support for transactional content validation using the Storage-specific CRC64 algorithm for all upload and download APIs across all packages. It reuses the existing validate_content keyword, extending it to also accept crc64 and auto.

Copilot AI review requested due to automatic review settings May 19, 2026 21:02
github-actions bot added the Storage label (Storage Service: Queues, Blobs, Files) May 19, 2026

Contributor

Copilot AI left a comment

Pull request overview

Adds Azure Storage CRC64 content validation support across all data-plane Storage SDKs (blob, queue, file-share, file-datalake). Extends the existing validate_content keyword to accept "auto", "crc64", and "md5" (in addition to the legacy bool), with CRC64 implemented via the azure-storage-extensions C extension and the service's structured-message framing.
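The widened keyword can be pictured as a small normalization step. The sketch below is illustrative only and is not the PR's parse_validation_option: the function name normalize_validate_content is hypothetical, the mapping of the legacy True to MD5 is an assumption based on the keyword's earlier MD5-only behavior, and exact-match string handling is also assumed. How "auto" is later resolved to a concrete algorithm is not stated in this summary, so the sketch passes it through unchanged.

```python
from typing import Literal, Optional, Union

ValidationMode = Literal["auto", "crc64", "md5"]

# String values named in the review summary.
_ALLOWED_MODES = ("auto", "crc64", "md5")


def normalize_validate_content(
    value: Union[bool, str, None],
) -> Optional[ValidationMode]:
    """Hypothetical helper: map a validate_content argument to a mode or None."""
    # None and False both mean "no validation requested".
    if value is None or value is False:
        return None
    # Assumption: the legacy boolean True keeps its old MD5 meaning.
    if value is True:
        return "md5"
    # Assumption: strings must match one of the documented values exactly.
    if isinstance(value, str) and value in _ALLOWED_MODES:
        return value  # type: ignore[return-value]
    raise ValueError(
        f"validate_content must be a bool or one of {_ALLOWED_MODES}, got {value!r}"
    )
```

Normalizing once at the public entry point would let the rest of the pipeline branch on a small closed set of modes instead of re-checking bools and strings at each step.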

Changes:

  • New shared validation.py (option parsing, MD5/CRC64 helpers) and streams_async.py (AsyncStructuredMessageDecoder) duplicated into each Storage package's _shared/ folder.
  • policies.py/policies_async.py refactored: extracts _prepare_content_validation and _validate_content_response, adds AsyncContentValidationPolicy, drops the old StorageContentValidation.get_content_md5 static method (callers/tests updated to calculate_content_md5); download paths wrap the response stream in a structured-message decoder when CRC64 is used.
  • Public clients (*_client.py + .pyi) widen the validate_content keyword type and update docstrings; new parametrized test_content_validation[_async].py suites added for file-share and file-datalake, plus async structured-message decoder tests for blob; GenericTestProxyParametrize{1,2} helpers added to devtools_testutils.storage.
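For the MD5 side of the helpers mentioned above, a minimal sketch follows. It assumes the standard Content-MD5 convention (base64 of the raw 16-byte MD5 digest). The names content_md5_header and check_content_md5 are hypothetical and do not reflect the signatures of calculate_content_md5 or _validate_content_response, which this summary does not show.

```python
import base64
import hashlib


def content_md5_header(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of data, as used in Content-MD5."""
    digest = hashlib.md5(data).digest()  # raw 16-byte digest, not hex
    return base64.b64encode(digest).decode("ascii")


def check_content_md5(data: bytes, header_value: str) -> None:
    """Raise ValueError if data does not match the received header value."""
    computed = content_md5_header(data)
    if computed != header_value:
        raise ValueError(
            f"MD5 mismatch: computed {computed!r}, received {header_value!r}"
        )
```

Because MD5 needs the whole body before the header can be produced or checked, a policy doing this has to buffer the content first, which is consistent with the note that the async policy needs the body loaded for MD5.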

Reviewed changes

Copilot reviewed 76 out of 76 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/validation.py (and matching files in queue/file-share/file-datalake) | New shared module: parse validate_content, compute MD5/CRC64. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/streams_async.py (and matching files) | New async structured-message decoder mirroring the sync one. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/policies.py (and matching files) | Replace StorageContentValidation body with helper functions; add CRC64/structured-message upload + download handling. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/policies_async.py (and matching files) | Add AsyncContentValidationPolicy (needs body load for MD5) wired through _prepare_content_validation/_validate_content_response. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/base_client_async.py (and matching files) | Swap sync StorageContentValidation for AsyncContentValidationPolicy in async pipeline. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/request_handlers.py | range_validation switched from string "true" to bool True. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_blob_client.py / .pyi / _container_client.* / aio variants | Widen validate_content keyword type, new docs, parse via parse_validation_option, reject CRC64 + client-side encryption combo. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_blob_client_helpers.py | Threads validate_content through _upload_blob_options/_download_blob_options; uses parse_validation_option in stage/upload page/append. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_upload_helpers.py + aio | validate_content typed as CV_TYPE_PARSED; updated check for "original upload path". |
| sdk/storage/azure-storage-blob/azure/storage/blob/_download.py + aio | Use is_md5_validation for MD5-only paths; track _is_structured_message to gate _download_complete. |
| sdk/storage/azure-storage-file-share/azure/storage/fileshare/_file_client.py / aio + .pyi | Same expansion of validate_content for create_file/upload_file/upload_range/download_file. |
| sdk/storage/azure-storage-file-share/azure/storage/fileshare/_download.py + aio | MD5-only behavior gated through is_md5_validation. |
| sdk/storage/azure-storage-file-datalake/azure/storage/filedatalake/_data_lake_file_client.py / aio + .pyi + _helpers | Same widening for upload_data/append_data/download_file. |
| sdk/storage/azure-storage-*/dev_requirements.txt | Add local azure-storage-extensions dep for tests. |
| sdk/storage/azure-storage-*/assets.json | Bump recorded-test asset tags. |
| sdk/storage/azure-storage-file-share/tests/test_content_validation*.py, file-datalake equivalents, azure-storage-blob/tests/test_streams_async.py | New tests for content-validation modes and async structured-message decoder. |
| sdk/storage/azure-storage-blob/tests/test_{append,block,page}_blob*.py | Switch from removed StorageContentValidation.get_content_md5 to calculate_content_md5. |
| eng/tools/azure-sdk-tools/devtools_testutils/storage/{,aio/}{,async}decorators.py + __init__.py | Add GenericTestProxyParametrize{1,2} helpers exposed via storage testutils. |

Comment thread sdk/storage/azure-storage-blob/azure/storage/blob/_shared/policies.py Outdated

Member

weirongw23-msft left a comment

LGTM 🚀