Task Summary
Improve dataset upload retry behavior so batch uploads distinguish between incomplete multipart uploads and files that already exist in the dataset.

| Case | Expected behavior |
| --- | --- |
| Active multipart upload session exists for the same path | Prompt the user to resume or restart the incomplete upload. |
| A file with the same path and size already exists in committed or staged dataset files | Prompt the user to upload again or skip the matching file. |

The completed-file prompt should use cautious wording because matching by path and size does not prove byte-for-byte equality.
Implementation should include:
- A backend dataset-scoped check for candidate upload paths and sizes.
- Frontend logic that checks active multipart sessions first, then checks existing matching files.
- Support for mixed retry batches where one file resumes and another file can be skipped.
- Tests for multipart resume behavior, completed-file skip behavior, backend committed/staged matches, and invalid or unauthorized requests.
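
The check order above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Candidate`, `RetryAction`, and `classifyRetryBatch` are hypothetical names, and the sketch assumes the active multipart session paths and the backend's path-and-size matches have already been fetched.

```typescript
// Hypothetical types for illustration; none of these names come from the codebase.
type Candidate = { path: string; size: number };
type RetryAction =
  | "prompt-resume-or-restart" // an active multipart session exists for this path
  | "prompt-upload-again-or-skip" // a file with the same path and size already exists
  | "upload"; // no conflict found

function classifyRetryBatch(
  candidates: Candidate[],
  activeMultipartPaths: Set<string>,
  existingFiles: Candidate[], // assumed to cover committed and staged files
): Map<string, RetryAction> {
  // Key on size and path together: a path match with a different size is not a match.
  const existingKeys = new Set(existingFiles.map((f) => `${f.size}:${f.path}`));
  const plan = new Map<string, RetryAction>();

  for (const c of candidates) {
    if (activeMultipartPaths.has(c.path)) {
      // Multipart sessions are checked first, per the ordering described above.
      plan.set(c.path, "prompt-resume-or-restart");
    } else if (existingKeys.has(`${c.size}:${c.path}`)) {
      plan.set(c.path, "prompt-upload-again-or-skip");
    } else {
      plan.set(c.path, "upload");
    }
  }
  return plan;
}

// Mixed batch: one file resumes, one can be skipped, one uploads normally.
const plan = classifyRetryBatch(
  [
    { path: "a.csv", size: 10 },
    { path: "b.csv", size: 20 },
    { path: "c.csv", size: 30 },
  ],
  new Set(["a.csv"]),
  [
    { path: "b.csv", size: 20 },
    { path: "c.csv", size: 99 }, // same path, different size: not treated as a match
  ],
);
```

Returning a per-file action rather than a single batch-level decision is what lets a mixed retry batch resume one file while offering to skip another.
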
Related discussion: #5744
Related PR: #5929
Task Type