-
Notifications
You must be signed in to change notification settings - Fork 122
Parallelize file uploads in fs cp command. #4132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
5896ae4
b933398
2fa03e1
d393388
14cfad9
2a9ec1b
179a944
79dd600
2d1f19d
caaff68
12dec5c
2be2174
7db079d
e720670
4029571
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| localdir/file1.txt -> dbfs:/Volumes/main/default/data/uploaded-dir/file1.txt | ||
| localdir/file2.txt -> dbfs:/Volumes/main/default/data/uploaded-dir/file2.txt |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| mkdir -p localdir | ||
| echo -n "file1 content" > localdir/file1.txt | ||
| echo -n "file2 content" > localdir/file2.txt | ||
|
|
||
| # Recursive directory copy (output sorted for deterministic ordering). | ||
| $CLI fs cp -r localdir dbfs:/Volumes/main/default/data/uploaded-dir 2>&1 | sort | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| Local = true | ||
| Cloud = false | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I recommend setting Cloud = true for this test as well. That validates the implementation against a real server. You can use
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is also nice to ensure that the local test server implementation matches the remote behaviour, atleast for the interfaces we own (like the fs commands) |
||
| Ignore = ["localdir"] | ||
|
|
||
| # Recursive copy: localdir/ -> uploaded-dir/. | ||
| [[Server]] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We already have a local sserver supporing these operations (see: libs/testserver/handlers.go). This should work even without the overrides: |
||
| Pattern = "PUT /api/2.0/fs/directories/Volumes/main/default/data/uploaded-dir" | ||
| Response.StatusCode = 200 | ||
|
|
||
| [[Server]] | ||
| Pattern = "HEAD /api/2.0/fs/directories/Volumes/main/default/data/uploaded-dir" | ||
| Response.StatusCode = 200 | ||
|
|
||
| [[Server]] | ||
| Pattern = "PUT /api/2.0/fs/files/Volumes/main/default/data/uploaded-dir/file1.txt" | ||
| Response.StatusCode = 200 | ||
|
|
||
| [[Server]] | ||
| Pattern = "PUT /api/2.0/fs/files/Volumes/main/default/data/uploaded-dir/file2.txt" | ||
| Response.StatusCode = 200 | ||
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
|
|
||
| >>> [CLI] fs cp local.txt dbfs:/Volumes/main/default/data/mydir/ | ||
| local.txt -> dbfs:/Volumes/main/default/data/mydir/local.txt |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| echo -n "hello world!" > local.txt | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same as above, local.txt can be commited. |
||
|
|
||
| # Copy file into a directory (trailing slash indicates directory target). | ||
| trace $CLI fs cp local.txt dbfs:/Volumes/main/default/data/mydir/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| Local = true | ||
| Cloud = false | ||
| Ignore = ["local.txt"] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you commit local.txt, you can remove this config. Same for the test above. |
||
|
|
||
| # Copy file into existing directory: local.txt -> mydir/local.txt. | ||
| [[Server]] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same as earlier, we should not need these overrides. |
||
| Pattern = "HEAD /api/2.0/fs/directories/Volumes/main/default/data/mydir" | ||
| Response.StatusCode = 200 | ||
|
|
||
| [[Server]] | ||
| Pattern = "PUT /api/2.0/fs/files/Volumes/main/default/data/mydir/local.txt" | ||
| Response.StatusCode = 200 | ||
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
|
|
||
| >>> [CLI] fs cp local.txt dbfs:/Volumes/main/default/data/uploaded.txt | ||
| local.txt -> dbfs:/Volumes/main/default/data/uploaded.txt | ||
|
|
||
| >>> [CLI] fs cp dbfs:/Volumes/main/default/data/remote.txt downloaded.txt | ||
| dbfs:/Volumes/main/default/data/remote.txt -> downloaded.txt | ||
| content from volume |
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,9 @@ | ||||||||
| echo -n "hello world!" > local.txt | ||||||||
|
|
||||||||
| # Upload local file to volume. | ||||||||
| trace $CLI fs cp local.txt dbfs:/Volumes/main/default/data/uploaded.txt | ||||||||
|
|
||||||||
| # Download file from volume to local. | ||||||||
| trace $CLI fs cp dbfs:/Volumes/main/default/data/remote.txt downloaded.txt | ||||||||
|
|
||||||||
| cat downloaded.txt | ||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (optional)
Suggested change
By removing it here, you do not need to Ignore it in test.toml. |
||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| Local = true | ||
| Cloud = false | ||
| Ignore = ["local.txt", "downloaded.txt"] | ||
|
|
||
| # Upload: local.txt -> dbfs:/Volumes/.../uploaded.txt. | ||
| [[Server]] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same, remove overrides. |
||
| Pattern = "HEAD /api/2.0/fs/directories/Volumes/main/default/data/uploaded.txt" | ||
| Response.StatusCode = 404 | ||
|
|
||
| [[Server]] | ||
| Pattern = "HEAD /api/2.0/fs/files/Volumes/main/default/data/uploaded.txt" | ||
| Response.StatusCode = 404 | ||
|
|
||
| [[Server]] | ||
| Pattern = "HEAD /api/2.0/fs/directories/Volumes/main/default/data" | ||
| Response.StatusCode = 200 | ||
|
|
||
| [[Server]] | ||
| Pattern = "PUT /api/2.0/fs/files/Volumes/main/default/data/uploaded.txt" | ||
| Response.StatusCode = 200 | ||
|
|
||
| # Download: dbfs:/Volumes/.../remote.txt -> downloaded.txt. | ||
| [[Server]] | ||
| Pattern = "HEAD /api/2.0/fs/directories/Volumes/main/default/data/remote.txt" | ||
| Response.StatusCode = 404 | ||
|
|
||
| [[Server]] | ||
| Pattern = "HEAD /api/2.0/fs/files/Volumes/main/default/data/remote.txt" | ||
| Response.StatusCode = 200 | ||
|
|
||
| [[Server]] | ||
| Pattern = "GET /api/2.0/fs/files/Volumes/main/default/data/remote.txt" | ||
| Response.StatusCode = 200 | ||
| Response.Body = "content from volume" | ||
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
|
|
||
| >>> errcode [CLI] fs cp src dst --concurrency -1 | ||
| Error: --concurrency must be at least 1 | ||
|
|
||
| Exit code: 1 | ||
|
|
||
| >>> errcode [CLI] fs cp src dst --concurrency 0 | ||
| Error: --concurrency must be at least 1 | ||
|
|
||
| Exit code: 1 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # Invalid concurrency values should fail. | ||
| trace errcode $CLI fs cp src dst --concurrency -1 | ||
| trace errcode $CLI fs cp src dst --concurrency 0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| Local = true | ||
| Cloud = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of creating these files at test runtime, you can create and commit them here. That way, the test can directly read them without having to write them first.
I recommend commiting input files directly unless the file contents are dynamic.