[Doc-18277] Clarify tenant boundary for external data permissions#18278

Open
fzhsbc wants to merge 6 commits into apache:dev from fzhsbc:docs-runtime-credential-boundary

@fzhsbc fzhsbc commented May 21, 2026

Was this PR generated or assisted by AI?

NO

Purpose of the pull request

This pull request resolves #18277 by clarifying the boundary between DolphinScheduler tenants and authorization in external data systems.

DolphinScheduler tenants are execution/resource identities, such as the Linux user that the Worker uses to run a task process. On their own, they should not be documented as fine-grained permission principals for external data systems.

Brief change log

  • Clarify in the English and Chinese security docs that tenants are execution/resource boundaries.
  • Document that external data systems remain responsible for validating and enforcing fine-grained data permissions.
  • Clarify that long-lived external credentials should not be stored in task definitions.

Verify this pull request

This is a documentation-only change. No runtime behavior is changed.

The modified English and Chinese Markdown files were checked for consistency with the narrowed documentation scope.

Pull Request Notice

This pull request does not contain incompatible changes, so no entry is required in docs/docs/en/guide/upgrade/incompatible.md.

Closes #18277

@fzhsbc fzhsbc requested a review from SbloodyS as a code owner May 21, 2026 08:31

boring-cyborg Bot commented May 21, 2026

Thanks for opening this pull request! Please check out our contributing guidelines. (https://github.com/apache/dolphinscheduler/blob/dev/docs/docs/en/contribute/join/pull-request.md)

@fzhsbc fzhsbc force-pushed the docs-runtime-credential-boundary branch from 8001027 to 4066c4c Compare May 21, 2026 08:44
@fzhsbc fzhsbc changed the title [Docs] Clarify tenant boundary and runtime credential handling [Doc-18277] Clarify tenant boundary and runtime credentials May 21, 2026
Member

@SbloodyS SbloodyS left a comment

Please follow the PR and issue templates and fill in the relevant information.

Author

fzhsbc commented May 21, 2026

Thanks for the reminder. I have updated both the PR description and the related issue to follow the project templates.

@SbloodyS SbloodyS added the "first time contributor" and "improvement" labels May 21, 2026
@SbloodyS SbloodyS added this to the 3.4.2 milestone May 21, 2026
@SbloodyS SbloodyS requested a review from Copilot May 22, 2026 01:38
Member

@SbloodyS SbloodyS left a comment

You should run `mvn spotless:apply` to format the code. @fzhsbc

Contributor

Copilot AI left a comment

Pull request overview

Documentation update clarifying that DolphinScheduler tenants are execution/resource identities (e.g., the Linux user used by Workers) rather than fine-grained authorization principals for external data systems, and adding guidance for task plugin authors on handling short-lived runtime credentials.

Changes:

  • Clarify tenant boundary vs. external system authorization responsibilities in the security guides (EN/ZH).
  • Add SPI guidance for task plugins on requesting, passing, masking, and cleaning up short-lived runtime credentials (EN/ZH).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docs/docs/en/guide/security/security.md | Adds a tenant-boundary section emphasizing external systems enforce fine-grained data permissions. |
| docs/docs/zh/guide/security/security.md | Chinese equivalent tenant-boundary clarification for external data permissions. |
| docs/docs/en/contribute/backend/spi/task.md | Adds recommended practices for task plugins using short-lived runtime credentials. |
| docs/docs/zh/contribute/backend/spi/task.md | Chinese equivalent runtime-credential guidance for task plugins. |

Comment thread docs/docs/en/guide/security/security.md Outdated

The tenant code is used by the Worker as the execution identity, for example the Linux user that runs a task process. It is an execution resource boundary in DolphinScheduler. Fine-grained permissions for external systems such as databases, object stores, data catalogs, or lakehouse tables should be validated and enforced by those external systems.

Do not treat the tenant code alone as a user-level data permission principal for external systems. If a task needs short-lived external credentials, bind those credentials to auditable task context such as the project, workflow, task instance, datasource, tenant, and worker group, and avoid storing long-lived credentials in task definitions.

Recommended practice:

- Use the task execution context, such as project, workflow, task instance, datasource, tenant, and worker group, when requesting runtime credentials from an external authorization service.
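
As an illustration of the practice above, here is a minimal sketch of how a task might bind a credential request to auditable task context. It is not taken from the PR or from DolphinScheduler: `TaskContext`, `build_credential_request`, the `ttl_seconds` field, and the JSON layout are all hypothetical, and the external authorization service is assumed to accept a JSON body.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskContext:
    """Hypothetical container for the audit fields listed in the guidance."""
    project: str
    workflow: str
    task_instance_id: int
    data_source: str
    tenant: str
    worker_group: str


def build_credential_request(ctx: TaskContext, ttl_seconds: int = 900) -> str:
    """Serialize the task context into a request body (assumed JSON format).

    The tenant is sent as audit context only. The external service, not
    DolphinScheduler, decides what the issued credential may access.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    body = {"context": asdict(ctx), "ttl_seconds": ttl_seconds}
    # sort_keys gives a stable body that is easier to compare in audit logs
    return json.dumps(body, sort_keys=True)
```

The point of the sketch is that the tenant is one field among several, so the external system can audit the full context instead of trusting the tenant code alone.
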
@fzhsbc fzhsbc force-pushed the docs-runtime-credential-boundary branch from 8b07bc7 to 9e93d10 Compare May 22, 2026 09:56
Author

fzhsbc commented May 22, 2026

Thanks. I updated the PR branch to address the formatting issue and standardized the English wording from datasource to data source in the added documentation.

Comment on lines +19 to +29
#### Runtime credentials for task plugins

Some task plugins need to access external systems using short-lived credentials. A plugin should avoid storing long-lived credentials in task parameters and should avoid printing credentials in task logs.

Recommended practice:

- Use the task execution context, such as project, workflow, task instance, data source, tenant, and worker group, when requesting runtime credentials from an external authorization service.
- Pass short-lived credentials to the task process through environment variables or temporary files with restricted file permissions.
- Mask sensitive values before logging command lines, environment variables, or generated configuration files.
- Remove temporary credential files after task completion or cancellation.
- Keep external data authorization in the external system. DolphinScheduler should provide task context and execution lifecycle, while the external system validates and enforces data permissions.
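
The file-handling and masking items in the list above could look roughly like the following. This is a language-neutral sketch written in Python, not DolphinScheduler code: `credential_file` and `mask` are hypothetical helper names, and a real task plugin would implement the same steps in Java inside its own task lifecycle.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterable, Iterator


@contextmanager
def credential_file(secret: str) -> Iterator[str]:
    """Write a short-lived secret to a temp file and delete it on exit."""
    # mkstemp creates the file readable and writable only by the creating user
    fd, path = tempfile.mkstemp(prefix="task-cred-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(secret)
        yield path
    finally:
        # Runs on normal completion and when the body raises,
        # for example if cancellation is delivered as an exception.
        if os.path.exists(path):
            os.remove(path)


def mask(text: str, secrets: Iterable[str]) -> str:
    """Replace every known secret value in text before it is logged."""
    for value in secrets:
        if value:  # skip empty strings, which would match everywhere
            text = text.replace(value, "******")
    return text
```

A caller would pass the yielded path to the task process (for instance through an environment variable of its choosing) and run every command line through `mask` before logging it. Cleanup in a `finally` block does not cover a process that is killed outright, so a plugin would still need its own cancellation handling.
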
Member

Are these written by AI? I don't understand the relevance of what you added.

Author

Thanks for the feedback. The motivation was a real deployment pattern where DolphinScheduler tasks need short-lived credentials for external systems, while the DolphinScheduler tenant is only the execution/resource identity and should not be described as the fine-grained authorization principal for those systems.

That said, I agree the added SPI guidance may be too broad for this page and can look unrelated without a concrete task plugin implementation. I can narrow this PR to only clarify the tenant/security boundary in the security guide, and remove the task SPI additions if that is the preferred scope.

@fzhsbc fzhsbc changed the title [Doc-18277] Clarify tenant boundary and runtime credentials [Doc-18277] Clarify tenant boundary for external data permissions Jun 1, 2026
Author

fzhsbc commented Jun 1, 2026

Thanks for the review. I narrowed the PR scope based on your feedback.

The task SPI additions have been removed. The PR now only updates the security guide to clarify the DolphinScheduler tenant boundary: a tenant is an execution/resource identity, while fine-grained permissions for external data systems should be enforced by those external systems.

Labels

document, first time contributor, improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Doc][Security] Clarify tenant boundary for external data permissions

3 participants