Skip to content

Support for Azure storage#571

Open
tramora wants to merge 1 commit intomainfrom
244-support-azure-storage
Open

Support for Azure storage#571
tramora wants to merge 1 commit intomainfrom
244-support-azure-storage

Conversation

@tramora
Copy link
Copy Markdown
Collaborator

@tramora tramora commented Apr 10, 2026

  • Among all the Azure storages, Khiops supports only (via its specific driver) "Files" and "Blobs" (Binary Large Objects)
  • The only supported authentication method is currently the AZURE_STORAGE_CONNECTION_STRING embedding the account name and account key

Completes #244


TODO Before Asking for a Review

  • Rebase your branch to the latest version of main (or main-v10)
  • Make sure all CI workflows are green
  • When adding a public feature/fix: Update the Unreleased section of CHANGELOG.md (no date)
  • Self-Review: Review "Files Changed" tab and fix any problems you find
  • API Docs (only if there are changes in docstrings, rst files or samples):
    • Check the docs build without warning: see the log of the API Docs workflow
    • Check that your changes render well in HTML: download the API Docs artifact and open index.html
    • If there are any problems it is faster to iterate by building locally the API Docs

@tramora tramora force-pushed the 244-support-azure-storage branch 3 times, most recently from be97119 to 08f91e2 Compare April 10, 2026 14:34
@tramora tramora force-pushed the 244-support-azure-storage branch 4 times, most recently from 17fe956 to 80691c3 Compare April 12, 2026 12:59
@tramora
Copy link
Copy Markdown
Collaborator Author

tramora commented Apr 13, 2026

The automated tests on conda environments currently fail because of an outdate version of the azure-storage-file-share package in the conda-forge channel. This PR conda-forge/azure-storage-file-share-feedstock#9 should fix the issue but it is blocked for the moment.

- Among all the Azure storages, Khiops supports only (via its specific driver) "Files" and "Blobs" (Binary Large Objects)
 - The only supported authentication method is currently the `AZURE_STORAGE_CONNECTION_STRING` embedding the account name and account key
@tramora tramora force-pushed the 244-support-azure-storage branch from 80691c3 to 701678b Compare April 13, 2026 16:32
class AzureStorageResourceMixin:
"""Azure compatible Storage Resource Mixin

Among all the Azure storages, Khiops supports only (via its specific driver) :
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the space before :.

- Files
- Blobs (Binary Large Objects)

For shared Files, the URI pattern of a resource is the following :
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the space before :.

Comment on lines +1087 to +1092
"""
Notes:
by default, the sdk will list all the blobs belonging to the container.
An extract filter is then required to simulate a directory listing.
Moreover, the virtual folders hierarchy must be deleted from the result.
"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a code comment not a docstring, because it is an implementation detail.

Comment on lines +1070 to +1077
"""
Notes:
in order to be consistent with the other drivers,
this method will overwrite the destination blob if it exists.
This is not the default behavior.
Moreover, if the virtual folders hierarchy does not exist
it is created automatically as it is part of the blob name.
"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a code comment not a docstring, because it is an implementation detail.

Comment on lines +1052 to +1057
"""
Notes:
as the virtual folder names are part of the blob name,
there is no such test like checking a virtual folder existence.
Only a blob existence is checked here.
"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below: This is a comment.

Comment on lines +1061 to +1066
"""
Notes:
as the virtual folder names are part of the blob name,
there is no such action like removing a virtual folder.
Only a blob removal is done here.
"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below: This is a comment.

Comment on lines +117 to +127
try:
os.remove("/tmp/dummy")
except: # pylint: disable=broad-exception-caught
pass
fs.copy_to_local(
f"{proto}://{bucket_name}/khiops-cicd/samples/{file}", "/tmp/dummy"
f"{proto}://{bucket_name_or_storage_prefix}/khiops-cicd/"
f"samples/{file}",
"/tmp/dummy",
)
# We cannot use `unittest.TestCase.assertTrue` in a class method
assert os.path.isfile("/tmp/dummy")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I don't get why this new code was added
  • If it stays I'd refactor it as:
    • Before the loop create /tmp/khiops-dummy
    • Save each file to /tmp/khiops-dummy/{file}
    • Do not use assert as it is not a programming error, I think it should be something like if not os.path.isfile(...): raise RuntimeError
    • Clean /tmp/khiops-dummy afterwards
      This avoids the try/except with pass in the middle.

Comment on lines -315 to 320
def should_skip_in_a_conda_env(self):
# The S3 driver is now released for conda too.
# No need to skip the tests any longer in a conda environment
return False

def config_exists(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these elimination should go in a different commit.

Comment on lines +759 to +760
class AzureStorageResourceMixin:
"""Azure compatible Storage Resource Mixin
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these docs should go in the classes AzureStorage*Resource.

In the docstring of this class they should point to those, or it could be duplicated.

Also I think we should add an URL to a relevant Azure storage page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants