Move dataset file path resolution out of CU Master/Worker #5013

@bobbai00

Task Summary

Sub-issue of #5011.

Move the direct DB call in FileResolver.datasetResolveFunc
(common/workflow-core/.../FileResolver.scala) behind an HTTP service
that owns the database credentials. The executor forwards the
originating user's JWT to that service.

datasetResolveFunc joins USER × DATASET × DATASET_VERSION to translate
/owner/dataset/version/file into a dataset:///<repo>/<hash>/<file>
URI. It is the only SqlServer call site reachable from CU Master /
Worker that is not about execution metadata. It is invoked from
LogicalPlan.resolveScanSourceOpFileName, which runs during workflow
compile on every execution.
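To make the translation concrete, here is a minimal sketch of the pure string handling on either side of the lookup. The names DatasetFilePath, DatasetPathSketch, parse, and toUri are hypothetical and not taken from FileResolver.scala; the repo name and version hash are assumed to come from the USER × DATASET × DATASET_VERSION join, which is not reproduced here.

```scala
// Hypothetical model of the four path components; the real code may differ.
final case class DatasetFilePath(owner: String, dataset: String, version: String, file: String)

object DatasetPathSketch {
  // Split "/owner/dataset/version/file" into its parts.
  // The file part may contain slashes, so split into at most four pieces.
  def parse(path: String): Option[DatasetFilePath] =
    path.stripPrefix("/").split("/", 4) match {
      case Array(owner, dataset, version, file)
          if Seq(owner, dataset, version, file).forall(_.nonEmpty) =>
        Some(DatasetFilePath(owner, dataset, version, file))
      case _ => None // too few components, or an empty one
    }

  // Format the resolved URI. repo and versionHash are assumed to be
  // the values returned by the database lookup.
  def toUri(repo: String, versionHash: String, file: String): String =
    s"dataset:///$repo/$versionHash/$file"
}
```

Only the lookup between parse and toUri needs database access, which is the piece this task would move behind the service. The sketch does no escaping of the file component.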

The natural owner for this lookup is file-service (it already owns
the dataset model). Done when no code reachable from CU Master / Worker
calls SqlServer for dataset path resolution, the new endpoint is
@Auth-checked, and the existing FileResolverSpec plus an end-to-end
workflow run that scans a dataset file still pass.
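On the executor side, the call could look roughly like the sketch below, which builds the HTTP request with the JDK's java.net.http API. The route "/dataset/resolve", the "path" query parameter, and the use of a Bearer Authorization header are assumptions for illustration; the issue does not specify the endpoint shape or how the JWT is carried.

```scala
import java.net.URI
import java.net.URLEncoder
import java.net.http.HttpRequest
import java.nio.charset.StandardCharsets

object ResolveRequestSketch {
  // baseUrl, the route, and the query parameter name are hypothetical.
  def build(baseUrl: String, datasetPath: String, userJwt: String): HttpRequest = {
    // Percent-encode the path so its slashes survive as one query value.
    val encoded = URLEncoder.encode(datasetPath, StandardCharsets.UTF_8)
    HttpRequest
      .newBuilder(URI.create(s"$baseUrl/dataset/resolve?path=$encoded"))
      // Forward the originating user's JWT (Bearer scheme is an assumption).
      .header("Authorization", s"Bearer $userJwt")
      .GET()
      .build()
  }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Keeping request construction separate from sending makes the JWT forwarding easy to unit test without a running file-service.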

Task Type

  • Refactor / Cleanup
  • DevOps / Deployment / CI
  • Testing / QA
  • Documentation
  • Performance
  • Other
