fix(engine): Base64-encode Python worker startup config to survive Windows argv #5917

Open
yangzhang75 wants to merge 1 commit into apache:main from yangzhang75:fix/python-worker-startup-config-base64

Conversation

@yangzhang75

Contributor

What changes were proposed in this PR?

Follow-up fix for #5597, which started passing the Python worker startup config as a
single JSON-string command-line argument. On Windows the JVM assembles argv into one
command line and the inner double quotes are stripped before Python receives them, so
json.loads fails with JSONDecodeError: Expecting property name enclosed in double quotes.
Linux/macOS are unaffected (the JVM passes argv directly, quotes survive).
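
The symptom can be reproduced without a Windows machine. The sketch below only simulates it by deleting the double quotes from a JSON string; it does not model the actual Windows command-line rules, and the two config keys are made up for illustration.

```python
import json

# Hypothetical two-key config; the real startup config has 19 keys.
config = {"worker_id": "w1", "output_port": 5005}
as_sent = json.dumps(config)

# Simulate the symptom: the inner double quotes are gone by the time
# the Python process reads its argument.
as_received = as_sent.replace('"', "")


def try_parse(raw: str) -> str:
    """Return 'ok' if raw parses as JSON, else the decoder's error message."""
    try:
        json.loads(raw)
        return "ok"
    except json.JSONDecodeError as exc:
        return exc.msg


print(try_parse(as_sent))
print(try_parse(as_received))
```

The second call reports the same "Expecting property name enclosed in double quotes" message quoted above, because `{worker_id: ...}` is no longer valid JSON.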

This Base64-encodes the JSON on the JVM side (encodeStartupConfig) and Base64-decodes it
in texera_run_python_worker.py before parsing. Base64 uses only [A-Za-z0-9+/=], so the
argument carries no quotes or spaces and survives argv quoting on every platform. The 19-key
contract and all validation are unchanged.
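
A minimal sketch of the encode/decode pair, written entirely in Python so it is runnable on its own. In the PR the encoding side is the Scala `encodeStartupConfig`; the Python function names and the sample keys here are hypothetical stand-ins, not the actual code in `texera_run_python_worker.py`.

```python
import base64
import json


def encode_startup_config(config: dict) -> str:
    """JSON-serialize, then Base64-encode with the standard alphabet."""
    raw = json.dumps(config).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_startup_config(arg: str) -> dict:
    """Reverse of encode_startup_config: Base64-decode, then parse JSON."""
    raw = base64.b64decode(arg, validate=True)
    return json.loads(raw.decode("utf-8"))


# Values with spaces and embedded quotes are the ones that used to break.
config = {"worker_id": "w 1", "note": 'has "quotes"'}
arg = encode_startup_config(config)
restored = decode_startup_config(arg)
print(arg)
print(restored == config)
```

Passing `validate=True` makes the decoder reject characters outside the Base64 alphabet instead of silently skipping them, so a mangled argument fails early rather than producing a confusing JSON error.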

Any related issues, documentation, discussions?

Closes #5916

How was this PR tested?

  • Scala: PythonWorkflowWorkerStartupConfigSpec updated to decode Base64; added a regression
    test asserting the encoded output contains no quotes/whitespace. 5/5 pass; scalafmt clean.
  • Python: test_run_python_worker.py updated to feed Base64 input; added a round-trip test.
    ruff check + format clean.
  • Verified the Base64 round-trip is byte-compatible between Java Base64.getEncoder and
    Python base64.b64decode.
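
The regression check described in the bullets above could look roughly like this in Python. It is an illustration, not the test added in this PR; `encode`, `is_argv_safe`, and the sample values are hypothetical. Java's `Base64.getEncoder` and Python's `base64.b64encode` both produce the RFC 4648 basic alphabet with padding and no line breaks, which is what the pattern checks.

```python
import base64
import json
import re

# Standard Base64 alphabet followed by at most two padding characters.
BASE64_ARG = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def encode(config: dict) -> str:
    """Hypothetical stand-in for the JVM-side encoder."""
    return base64.b64encode(json.dumps(config).encode("utf-8")).decode("ascii")


def is_argv_safe(arg: str) -> bool:
    """True if arg has no quotes, whitespace, or other non-Base64 characters."""
    return BASE64_ARG.fullmatch(arg) is not None


# Backslashes, spaces, quotes, a newline, and a non-ASCII character.
tricky = {"path": "C:\\Program Files\\x", "msg": 'say "hi"\n', "name": "naïve"}
arg = encode(tricky)

print(is_argv_safe(arg))
print(json.loads(base64.b64decode(arg)) == tricky)
```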

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • No candidates found from git blame history.

@github-actions github-actions Bot added the engine, dependencies, ddl-change, fix, pyamber, frontend, ci, docs, dev, common, platform, agent-service, and amber-integration labels Jun 23, 2026
@xuang7

xuang7 commented Jun 23, 2026

Contributor

It seems this branch has diverged from the current main branch. Could you clean up the branch?

@yangzhang75 yangzhang75 force-pushed the fix/python-worker-startup-config-base64 branch from bb77e2c to 28c8df1 Compare June 23, 2026 23:20
@github-actions github-actions Bot removed the dependencies, ddl-change, frontend, ci, docs, dev, common, platform, agent-service, and amber-integration labels Jun 23, 2026
@codecov-commenter

codecov-commenter commented Jun 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 54.69%. Comparing base (c3161f7) to head (28c8df1).
⚠️ Report is 15 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5917      +/-   ##
============================================
+ Coverage     54.50%   54.69%   +0.18%     
- Complexity     2915     2952      +37     
============================================
  Files          1108     1110       +2     
  Lines         42807    42855      +48     
  Branches       4604     4608       +4     
============================================
+ Hits          23332    23438     +106     
+ Misses        18119    18050      -69     
- Partials       1356     1367      +11     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| access-control-service | 70.44% <ø> (ø) | Carriedforward from c3161f7 |
| agent-service | 34.36% <ø> (ø) | Carriedforward from c3161f7 |
| amber | 56.99% <100.00%> (+0.47%) ⬆️ | |
| computing-unit-managing-service | 1.65% <ø> (ø) | Carriedforward from c3161f7 |
| config-service | 57.35% <ø> (ø) | Carriedforward from c3161f7 |
| file-service | 58.59% <ø> (ø) | Carriedforward from c3161f7 |
| frontend | 48.27% <ø> (ø) | Carriedforward from c3161f7 |
| pyamber | 90.20% <100.00%> (+<0.01%) ⬆️ | |
| python | 90.76% <ø> (ø) | Carriedforward from c3161f7 |
| workflow-compiling-service | 58.69% <ø> (ø) | Carriedforward from c3161f7 |

*This pull request uses carry forward flags.

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 5 worse · ⚪ 10 noise (<±5%) · 0 without baseline

Compared against main 1c580e5 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

| config | throughput (tuples/sec) | MB/s | latency p50/p95/p99 | max Δ latest / 7d |
|---|---|---|---|---|
| 🔴 bs=10 sw=10 sl=64 | 411 | 0.251 | 23,767/31,766/31,766 us | 🔴 +22.9% / 🔴 +113.4% |
| bs=100 sw=10 sl=64 | 817 | 0.499 | 120,635/156,208/156,208 us | ⚪ within ±5% / 🔴 +47.0% |
| bs=1000 sw=10 sl=64 | 912 | 0.557 | 1,100,507/1,127,471/1,127,471 us | ⚪ within ±5% / 🔴 +13.4% |

Baseline details

Latest main 1c580e5 from same runner

| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 411 tuples/sec | 459 tuples/sec | 781.13 tuples/sec | -10.5% | -47.4% |
| bs=10 sw=10 sl=64 | MB/s | 0.251 MB/s | 0.28 MB/s | 0.477 MB/s | -10.4% | -47.4% |
| bs=10 sw=10 sl=64 | p50 | 23,767 us | 22,514 us | 12,542 us | +5.6% | +89.5% |
| bs=10 sw=10 sl=64 | p95 | 31,766 us | 25,853 us | 14,886 us | +22.9% | +113.4% |
| bs=10 sw=10 sl=64 | p99 | 31,766 us | 25,853 us | 17,580 us | +22.9% | +80.7% |
| bs=100 sw=10 sl=64 | throughput | 817 tuples/sec | 812 tuples/sec | 999.37 tuples/sec | +0.6% | -18.2% |
| bs=100 sw=10 sl=64 | MB/s | 0.499 MB/s | 0.496 MB/s | 0.61 MB/s | +0.6% | -18.2% |
| bs=100 sw=10 sl=64 | p50 | 120,635 us | 120,386 us | 99,687 us | +0.2% | +21.0% |
| bs=100 sw=10 sl=64 | p95 | 156,208 us | 149,966 us | 106,271 us | +4.2% | +47.0% |
| bs=100 sw=10 sl=64 | p99 | 156,208 us | 149,966 us | 115,445 us | +4.2% | +35.3% |
| bs=1000 sw=10 sl=64 | throughput | 912 tuples/sec | 924 tuples/sec | 1,036 tuples/sec | -1.3% | -12.0% |
| bs=1000 sw=10 sl=64 | MB/s | 0.557 MB/s | 0.564 MB/s | 0.632 MB/s | -1.2% | -11.9% |
| bs=1000 sw=10 sl=64 | p50 | 1,100,507 us | 1,073,848 us | 970,675 us | +2.5% | +13.4% |
| bs=1000 sw=10 sl=64 | p95 | 1,127,471 us | 1,137,266 us | 1,011,928 us | -0.9% | +11.4% |
| bs=1000 sw=10 sl=64 | p99 | 1,127,471 us | 1,137,266 us | 1,045,045 us | -0.9% | +7.9% |

Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,486.52,200,128000,411,0.251,23766.55,31766.35,31766.35
1,100,10,64,20,2448.25,2000,1280000,817,0.499,120635.30,156207.52,156207.52
2,1000,10,64,20,21935.23,20000,12800000,912,0.557,1100507.11,1127470.78,1127470.78
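
For anyone checking the figures, the summary columns can be recomputed from the raw CSV. The sketch below is not the benchmark's own code. It assumes throughput is `total_tuples / total_seconds`, that MB/s uses 1 MB = 1024 × 1024 bytes (inferred because it reproduces the reported values; the report does not state it), and that each Δ is the relative change against the baseline.

```python
import csv
import io

RAW = """config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,486.52,200,128000,411,0.251,23766.55,31766.35,31766.35
1,100,10,64,20,2448.25,2000,1280000,817,0.499,120635.30,156207.52,156207.52
2,1000,10,64,20,21935.23,20000,12800000,912,0.557,1100507.11,1127470.78,1127470.78
"""


def summarize(row: dict) -> tuple:
    """Recompute (tuples/sec, MB/s) from the totals in one CSV row."""
    seconds = float(row["total_ms"]) / 1000.0
    tuples_per_sec = int(row["total_tuples"]) / seconds
    # Assumption: 1 MB = 1024 * 1024 bytes.
    mb_per_sec = int(row["total_bytes"]) / seconds / (1024 * 1024)
    return round(tuples_per_sec), round(mb_per_sec, 3)


def pct_change(new: float, base: float) -> float:
    """Relative change of new against base, in percent, one decimal place."""
    return round((new - base) / base * 100, 1)


rows = list(csv.DictReader(io.StringIO(RAW)))
for row in rows:
    print(f"bs={row['batch_size']}", summarize(row))

# Deltas for bs=10 against latest main: throughput, then p95 latency.
print(pct_change(411, 459), pct_change(31766, 25853))
```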

@github-actions github-actions Bot added the docs, dev, common, platform, agent-service, and amber-integration labels Jun 24, 2026
@Yicong-Huang Yicong-Huang marked this pull request as draft June 24, 2026 02:02
@Yicong-Huang

Contributor

@yangzhang75 please rebase your commit history on this branch. Right now it contains 4461 commits.

I marked the PR as a draft for now. Please reopen it when ready.

@yangzhang75 yangzhang75 force-pushed the fix/python-worker-startup-config-base64 branch from 967020a to 28c8df1 Compare June 24, 2026 05:11
@github-actions github-actions Bot removed the dependencies, ddl-change, frontend, ci, docs, dev, common, platform, agent-service, and amber-integration labels Jun 24, 2026
@yangzhang75 yangzhang75 marked this pull request as ready for review June 24, 2026 05:13
@carloea2

Contributor

Sure, feel free to ping me when it is ready for a review pass. Thanks.

@Yicong-Huang

Contributor

By default, a PR is ready for review. If it is not, the author should mark it as a draft.

@carloea2

Contributor

OK, I am testing it now.

@carloea2

Contributor

It worked, thanks.

@Yicong-Huang Yicong-Huang left a comment

Contributor

OK, approved for the change. However, Base64 makes the payload unreadable to humans, which could make future debugging harder. Low priority for now.

Thanks!
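
One low-cost way to ease the readability concern would be a small decode helper for debugging. This is only a sketch; the function name and sample payload are hypothetical and nothing like it is part of this PR.

```python
import base64
import json


def pretty_startup_config(arg: str) -> str:
    """Decode a Base64 startup-config argument into indented JSON for humans."""
    decoded = base64.b64decode(arg, validate=True).decode("utf-8")
    return json.dumps(json.loads(decoded), indent=2, sort_keys=True)


# Hypothetical payload, built here only so the example is self-contained.
sample = base64.b64encode(b'{"worker_id": "w1", "output_port": 5005}').decode("ascii")
print(pretty_startup_config(sample))
```

Pasting the encoded argument from a process listing or log into such a helper recovers the original JSON, so the encoding costs one extra step when debugging.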

@Yicong-Huang Yicong-Huang added and then removed the release/v1.2 label Jun 25, 2026
@Yicong-Huang

Contributor

As this is a fix for a refactor done in #5597, we don't need to backport it to v1.2.

cc @xuang7 for context.

@Yicong-Huang Yicong-Huang enabled auto-merge June 25, 2026 20:16
Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python worker fails to start on Windows: startup-config JSON argv loses quotes

5 participants