fix(test): stabilize flaky ReconfigurationIntegrationSpec pause race #5915

Open
Ma77Ball wants to merge 4 commits into apache:main from Ma77Ball:fix/reconfiguration-spec-flaky

@Ma77Ball

Contributor

What changes were proposed in this PR?

  • Switch the two CSV-sourced tests in ReconfigurationIntegrationSpec from smallCsvScanOpDesc (100 rows) to mediumCsvScanOpDesc (100k rows), so the workflow is still running when pauseWorkflow lands.
  • This removes a timing race: with a 100-row source, the run could reach COMPLETED before the pause took effect, so PauseHandler.pauseWorkflow emitted COMPLETED instead of PAUSED and the 10s pausedReached await in TestUtils.shouldReconfigure timed out.
  • Aligns the integration spec with its non-flaky siblings ReconfigurationSpec and PauseSpec, which already use mediumCsvScanOpDesc for the same pause/reconfigure/resume flow.
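
The race described in the bullets above can be modeled with a small deterministic sketch. The throughput and pause-latency figures are hypothetical, chosen only to show the ordering, and `stateWhenPauseLands` is an illustrative helper, not engine code.

```scala
object PauseRaceSketch {
  sealed trait State
  case object Paused extends State
  case object Completed extends State

  /** Hypothetical model: a source emits `rows` tuples at `rowsPerMs`, and a
    * pause request takes effect `pauseDelayMs` after the run starts. If the
    * source drains first, the observed state is Completed rather than Paused.
    */
  def stateWhenPauseLands(rows: Long, rowsPerMs: Double, pauseDelayMs: Double): State = {
    val runtimeMs = rows / rowsPerMs
    if (runtimeMs <= pauseDelayMs) Completed else Paused
  }

  def main(args: Array[String]): Unit = {
    val rowsPerMs = 10.0    // assumed throughput, for illustration only
    val pauseDelayMs = 50.0 // assumed pause latency, for illustration only
    println(stateWhenPauseLands(100L, rowsPerMs, pauseDelayMs))    // small source
    println(stateWhenPauseLands(100000L, rowsPerMs, pauseDelayMs)) // medium source
  }
}
```

Under these assumed numbers the 100-row run finishes before the pause lands, while the 100k-row run is still in flight, which is the margin the PR relies on.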

Any related issues, documentation, discussions?

Closes: #5913

How was this PR tested?

  • sbt "WorkflowExecutionService/Test/compile" compiles cleanly with the change.
  • Full run requires the amber-integration stack (Python UDF + Postgres/MinIO/Lakekeeper) that CI provisions; could not run locally. Reviewer: run the amber-integration job (or sbt "WorkflowExecutionService/testOnly *ReconfigurationIntegrationSpec" in that environment) and confirm all 3 tests pass across repeated runs.
  • Expect a workflow paused log line for every test run (the flaky run showed fewer pauses than test runs).
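
One way a reviewer might check the last point is to count pause log lines against the number of test runs. This is a minimal sketch that assumes each successful pause logs a line containing the text "workflow paused"; the exact log format is not shown in this PR, and the sample lines are made up.

```scala
object PausedLogCheck {
  // Assumption: each successful pause logs a line containing this marker.
  val Marker = "workflow paused"

  /** Counts log lines that contain the pause marker, ignoring case. */
  def countPauses(logLines: Seq[String]): Int =
    logLines.count(_.toLowerCase.contains(Marker))

  /** True when at least one pause line was seen per test run. */
  def everyRunPaused(logLines: Seq[String], testRuns: Int): Boolean =
    countPauses(logLines) >= testRuns

  def main(args: Array[String]): Unit = {
    // Made-up log lines for illustration, not real CI output.
    val sample = Seq(
      "INFO workflow paused",
      "INFO workflow resumed",
      "INFO workflow completed"
    )
    println(everyRunPaused(sample, testRuns = 1))
    println(everyRunPaused(sample, testRuns = 3))
  }
}
```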

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF

… a pause is generated leading to a failed ci
@github-actions

Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • No candidates found from git blame history.

github-actions Bot commented Jun 23, 2026

Contributor

⚠️ Benchmark changes need a look

🟢 2 better · 🔴 5 worse · ⚪ 8 noise (<±5%) · 0 without baseline

Compared against main 7a9730b benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

| config | throughput | MB/s | latency (p50/p95/p99) | max Δ latest / 7d |
| --- | --- | --- | --- | --- |
| 🔴 bs=10 sw=10 sl=64 | 435 | 0.265 | 22,605/32,977/32,977 us | 🔴 +19.5% / 🔴 +111.7% |
| 🟢 bs=100 sw=10 sl=64 | 954 | 0.582 | 106,934/119,979/119,979 us | 🟢 -8.5% / 🔴 +9.4% |
| bs=1000 sw=10 sl=64 | 1,090 | 0.666 | 924,444/952,529/952,529 us | ⚪ within ±5% / 🟢 -12.0% |

Baseline details

Latest main 7a9730b from same runner

| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
| --- | --- | --- | --- | --- | --- | --- |
| bs=10 sw=10 sl=64 | throughput | 435 tuples/sec | 476 tuples/sec | 758.88 tuples/sec | -8.6% | -42.7% |
| bs=10 sw=10 sl=64 | MB/s | 0.265 MB/s | 0.291 MB/s | 0.463 MB/s | -8.9% | -42.8% |
| bs=10 sw=10 sl=64 | p50 | 22,605 us | 18,915 us | 12,965 us | +19.5% | +74.4% |
| bs=10 sw=10 sl=64 | p95 | 32,977 us | 29,881 us | 15,578 us | +10.4% | +111.7% |
| bs=10 sw=10 sl=64 | p99 | 32,977 us | 29,881 us | 18,378 us | +10.4% | +79.4% |
| bs=100 sw=10 sl=64 | throughput | 954 tuples/sec | 956 tuples/sec | 968.9 tuples/sec | -0.2% | -1.5% |
| bs=100 sw=10 sl=64 | MB/s | 0.582 MB/s | 0.584 MB/s | 0.591 MB/s | -0.3% | -1.6% |
| bs=100 sw=10 sl=64 | p50 | 106,934 us | 102,685 us | 102,767 us | +4.1% | +4.1% |
| bs=100 sw=10 sl=64 | p95 | 119,979 us | 131,160 us | 109,629 us | -8.5% | +9.4% |
| bs=100 sw=10 sl=64 | p99 | 119,979 us | 131,160 us | 118,129 us | -8.5% | +1.6% |
| bs=1000 sw=10 sl=64 | throughput | 1,090 tuples/sec | 1,092 tuples/sec | 997.01 tuples/sec | -0.2% | +9.3% |
| bs=1000 sw=10 sl=64 | MB/s | 0.666 MB/s | 0.667 MB/s | 0.609 MB/s | -0.1% | +9.4% |
| bs=1000 sw=10 sl=64 | p50 | 924,444 us | 918,924 us | 1,009,306 us | +0.6% | -8.4% |
| bs=1000 sw=10 sl=64 | p95 | 952,529 us | 958,708 us | 1,051,088 us | -0.6% | -9.4% |
| bs=1000 sw=10 sl=64 | p99 | 952,529 us | 958,708 us | 1,082,535 us | -0.6% | -12.0% |

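
The "2 better · 5 worse · 8 noise" tally can be reproduced from the PR and latest-main columns. The sketch below assumes a rule the bot does not spell out: a change under 5% in magnitude is noise, and otherwise higher throughput or lower latency counts as better. `BenchmarkDelta` and its members are illustrative names, not part of the benchmark tooling.

```scala
object BenchmarkDelta {
  sealed trait Verdict
  case object Better extends Verdict
  case object Worse extends Verdict
  case object Noise extends Verdict

  final case class Row(pr: Double, base: Double, higherIsBetter: Boolean)

  /** Relative change of the PR value against the baseline, in percent. */
  def deltaPct(pr: Double, base: Double): Double = (pr - base) / base * 100.0

  /** Assumed rule: |delta| < 5% is noise; otherwise the sign decides. */
  def classify(r: Row): Verdict = {
    val d = deltaPct(r.pr, r.base)
    if (math.abs(d) < 5.0) Noise
    else if ((d > 0) == r.higherIsBetter) Better
    else Worse
  }

  // (PR, latest main) pairs copied from the baseline table; throughput and
  // MB/s are higher-is-better, latency percentiles are lower-is-better.
  val rows: Seq[Row] = Seq(
    Row(435, 476, true), Row(0.265, 0.291, true),
    Row(22605, 18915, false), Row(32977, 29881, false), Row(32977, 29881, false),
    Row(954, 956, true), Row(0.582, 0.584, true),
    Row(106934, 102685, false), Row(119979, 131160, false), Row(119979, 131160, false),
    Row(1090, 1092, true), Row(0.666, 0.667, true),
    Row(924444, 918924, false), Row(952529, 958708, false), Row(952529, 958708, false)
  )

  /** Number of rows per verdict. */
  def tally: Map[Verdict, Int] =
    rows.map(classify).groupBy(identity).map { case (v, vs) => v -> vs.size }

  def main(args: Array[String]): Unit =
    println(tally)
}
```

With this rule, all five regressions come from the bs=10 configuration, and both improvements are the p95 and p99 latencies at bs=100.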
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,460.20,200,128000,435,0.265,22605.42,32977.19,32977.19
1,100,10,64,20,2096.28,2000,1280000,954,0.582,106933.52,119978.70,119978.70
2,1000,10,64,20,18340.58,20000,12800000,1090,0.666,924444.10,952529.41,952529.41
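
The throughput columns can be recomputed from the totals in the raw CSV as a sanity check. In this sketch the reported mb_per_sec values appear to match a megabyte of 1024 × 1024 bytes; that is inferred from the numbers, not taken from the benchmark source, and `RawCsvCheck` is an illustrative name.

```scala
object RawCsvCheck {
  // Data rows copied from the raw CSV above. Columns used, by position:
  // 5 = total_ms, 6 = total_tuples, 7 = total_bytes.
  val lines: Seq[String] = Seq(
    "0,10,10,64,20,460.20,200,128000,435,0.265,22605.42,32977.19,32977.19",
    "1,100,10,64,20,2096.28,2000,1280000,954,0.582,106933.52,119978.70,119978.70",
    "2,1000,10,64,20,18340.58,20000,12800000,1090,0.666,924444.10,952529.41,952529.41"
  )

  /** Returns (tuples per second, MB per second) recomputed from the totals. */
  def derive(line: String): (Long, Double) = {
    val f = line.split(",")
    val seconds = f(5).toDouble / 1000.0
    val tuplesPerSec = math.round(f(6).toDouble / seconds)
    // Assumption: 1 MB = 1024 * 1024 bytes, which matches the reported values.
    val mbPerSec = f(7).toDouble / seconds / (1024.0 * 1024.0)
    (tuplesPerSec, mbPerSec)
  }

  def main(args: Array[String]): Unit =
    lines.map(derive).foreach { case (tps, mbps) => println(f"$tps%d tuples/sec, $mbps%.3f MB/s") }
}
```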

codecov-commenter commented Jun 23, 2026


❌ 2 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 7 | 2 | 5 | 0 |

View the top 2 failed test(s) by shortest run time
org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec::Engine should be able to modify a python UDF worker in workflow
Stack Traces | 67.9s run time
com.twitter.util.TimeoutException: 5.seconds
	at com.twitter.util.Promise.ready(Promise.scala:680)
	at com.twitter.util.Promise.result(Promise.scala:689)
	at com.twitter.util.Await$.$anonfun$result$1(Awaitable.scala:155)
	at com.twitter.concurrent.LocalScheduler$Activation.blocking(Scheduler.scala:189)
	at com.twitter.concurrent.LocalScheduler.blocking(Scheduler.scala:256)
	at com.twitter.concurrent.Scheduler$.blocking(Scheduler.scala:85)
	at com.twitter.util.Await$.result(Awaitable.scala:155)
	at org.apache.texera.amber.engine.e2e.TestUtils$.shouldReconfigure(TestUtils.scala:314)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.shouldReconfigure(ReconfigurationIntegrationSpec.scala:167)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.$anonfun$new$1(ReconfigurationIntegrationSpec.scala:194)
	at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
	at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
	at org.scalatest.Transformer.apply(Transformer.scala:22)
	at org.scalatest.Transformer.apply(Transformer.scala:20)
	at org.scalatest.flatspec.AnyFlatSpecLike$$anon$5.apply(AnyFlatSpecLike.scala:1832)
	at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
	at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.super$withFixture(ReconfigurationIntegrationSpec.scala:78)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.$anonfun$withFixture$1(ReconfigurationIntegrationSpec.scala:78)
	at org.scalatest.Retries.withRetry(Retries.scala:345)
	at org.scalatest.Retries.withRetry$(Retries.scala:344)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.withRetry(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.Retries.withRetry(Retries.scala:205)
	at org.scalatest.Retries.withRetry$(Retries.scala:205)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.withRetry(ReconfigurationIntegrationSpec.scala:64)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.withFixture(ReconfigurationIntegrationSpec.scala:78)
	at org.scalatest.flatspec.AnyFlatSpecLike.invokeWithFixture$1(AnyFlatSpecLike.scala:1830)
	at org.scalatest.flatspec.AnyFlatSpecLike.$anonfun$runTest$1(AnyFlatSpecLike.scala:1842)
	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTest(AnyFlatSpecLike.scala:1842)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTest$(AnyFlatSpecLike.scala:1824)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.org$scalatest$BeforeAndAfterEach$$super$runTest(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
	at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.runTest(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.flatspec.AnyFlatSpecLike.$anonfun$runTests$1(AnyFlatSpecLike.scala:1900)
	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
	at scala.collection.immutable.List.foreach(List.scala:323)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:390)
	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:427)
	at scala.collection.immutable.List.foreach(List.scala:323)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTests(AnyFlatSpecLike.scala:1900)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTests$(AnyFlatSpecLike.scala:1899)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.runTests(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.Suite.run(Suite.scala:1114)
	at org.scalatest.Suite.run$(Suite.scala:1096)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.org$scalatest$flatspec$AnyFlatSpecLike$$super$run(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.flatspec.AnyFlatSpecLike.$anonfun$run$1(AnyFlatSpecLike.scala:1945)
	at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
	at org.scalatest.flatspec.AnyFlatSpecLike.run(AnyFlatSpecLike.scala:1945)
	at org.scalatest.flatspec.AnyFlatSpecLike.run$(AnyFlatSpecLike.scala:1943)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.org$scalatest$BeforeAndAfterAll$$super$run(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
	at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
	at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.run(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
	at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
	at sbt.TestRunner.runTest$1(TestFramework.scala:153)
	at sbt.TestRunner.run(TestFramework.scala:168)
	at sbt.TestFramework$$anon$3$$anonfun$$lessinit$greater$1.$anonfun$apply$1(TestFramework.scala:336)
	at sbt.TestFramework$.sbt$TestFramework$$withContextLoader(TestFramework.scala:296)
	at sbt.TestFramework$$anon$3$$anonfun$$lessinit$greater$1.apply(TestFramework.scala:336)
	at sbt.TestFramework$$anon$3$$anonfun$$lessinit$greater$1.apply(TestFramework.scala:336)
	at sbt.TestFunction.apply(TestFramework.scala:348)
	at sbt.Tests$.$anonfun$toTask$1(Tests.scala:436)
	at sbt.std.Transform$$anon$3.$anonfun$apply$2(Transform.scala:47)
	at sbt.std.Transform$$anon$4.work(Transform.scala:69)
	at sbt.Execute.$anonfun$submit$2(Execute.scala:283)
	at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:24)
	at sbt.Execute.work(Execute.scala:292)
	at sbt.Execute.$anonfun$submit$1(Execute.scala:283)
	at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
	at sbt.CompletionService$$anon$2.call(CompletionService.scala:65)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec::Engine should be able to modify two python UDFs in workflow
Stack Traces | 126s run time
com.twitter.util.TimeoutException: 1.minutes
	at com.twitter.util.Promise.ready(Promise.scala:680)
	at com.twitter.util.Promise.result(Promise.scala:689)
	at com.twitter.util.Await$.$anonfun$result$1(Awaitable.scala:155)
	at com.twitter.concurrent.LocalScheduler$Activation.blocking(Scheduler.scala:189)
	at com.twitter.concurrent.LocalScheduler.blocking(Scheduler.scala:256)
	at com.twitter.concurrent.Scheduler$.blocking(Scheduler.scala:85)
	at com.twitter.util.Await$.result(Awaitable.scala:155)
	at org.apache.texera.amber.engine.e2e.TestUtils$.shouldReconfigure(TestUtils.scala:339)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.shouldReconfigure(ReconfigurationIntegrationSpec.scala:167)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.$anonfun$new$5(ReconfigurationIntegrationSpec.scala:263)
	at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
	at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
	at org.scalatest.Transformer.apply(Transformer.scala:22)
	at org.scalatest.Transformer.apply(Transformer.scala:20)
	at org.scalatest.flatspec.AnyFlatSpecLike$$anon$5.apply(AnyFlatSpecLike.scala:1832)
	at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
	at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.super$withFixture(ReconfigurationIntegrationSpec.scala:78)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.$anonfun$withFixture$1(ReconfigurationIntegrationSpec.scala:78)
	at org.scalatest.Retries.withRetry(Retries.scala:345)
	at org.scalatest.Retries.withRetry$(Retries.scala:344)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.withRetry(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.Retries.withRetry(Retries.scala:205)
	at org.scalatest.Retries.withRetry$(Retries.scala:205)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.withRetry(ReconfigurationIntegrationSpec.scala:64)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.withFixture(ReconfigurationIntegrationSpec.scala:78)
	at org.scalatest.flatspec.AnyFlatSpecLike.invokeWithFixture$1(AnyFlatSpecLike.scala:1830)
	at org.scalatest.flatspec.AnyFlatSpecLike.$anonfun$runTest$1(AnyFlatSpecLike.scala:1842)
	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTest(AnyFlatSpecLike.scala:1842)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTest$(AnyFlatSpecLike.scala:1824)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.org$scalatest$BeforeAndAfterEach$$super$runTest(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
	at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.runTest(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.flatspec.AnyFlatSpecLike.$anonfun$runTests$1(AnyFlatSpecLike.scala:1900)
	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
	at scala.collection.immutable.List.foreach(List.scala:323)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:390)
	at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:427)
	at scala.collection.immutable.List.foreach(List.scala:323)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
	at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTests(AnyFlatSpecLike.scala:1900)
	at org.scalatest.flatspec.AnyFlatSpecLike.runTests$(AnyFlatSpecLike.scala:1899)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.runTests(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.Suite.run(Suite.scala:1114)
	at org.scalatest.Suite.run$(Suite.scala:1096)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.org$scalatest$flatspec$AnyFlatSpecLike$$super$run(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.flatspec.AnyFlatSpecLike.$anonfun$run$1(AnyFlatSpecLike.scala:1945)
	at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
	at org.scalatest.flatspec.AnyFlatSpecLike.run(AnyFlatSpecLike.scala:1945)
	at org.scalatest.flatspec.AnyFlatSpecLike.run$(AnyFlatSpecLike.scala:1943)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.org$scalatest$BeforeAndAfterAll$$super$run(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
	at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
	at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
	at org.apache.texera.amber.engine.e2e.ReconfigurationIntegrationSpec.run(ReconfigurationIntegrationSpec.scala:64)
	at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
	at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
	at sbt.TestRunner.runTest$1(TestFramework.scala:153)
	at sbt.TestRunner.run(TestFramework.scala:168)
	at sbt.TestFramework$$anon$3$$anonfun$$lessinit$greater$1.$anonfun$apply$1(TestFramework.scala:336)
	at sbt.TestFramework$.sbt$TestFramework$$withContextLoader(TestFramework.scala:296)
	at sbt.TestFramework$$anon$3$$anonfun$$lessinit$greater$1.apply(TestFramework.scala:336)
	at sbt.TestFramework$$anon$3$$anonfun$$lessinit$greater$1.apply(TestFramework.scala:336)
	at sbt.TestFunction.apply(TestFramework.scala:348)
	at sbt.Tests$.$anonfun$toTask$1(Tests.scala:436)
	at sbt.std.Transform$$anon$3.$anonfun$apply$2(Transform.scala:47)
	at sbt.std.Transform$$anon$4.work(Transform.scala:69)
	at sbt.Execute.$anonfun$submit$2(Execute.scala:283)
	at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:24)
	at sbt.Execute.work(Execute.scala:292)
	at sbt.Execute.$anonfun$submit$1(Execute.scala:283)
	at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
	at sbt.CompletionService$$anon$2.call(CompletionService.scala:65)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
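
Both traces end in an Await.result call inside TestUtils.shouldReconfigure that gave up waiting. The sketch below shows the same failure shape using the standard library's scala.concurrent.Await in place of com.twitter.util.Await, which the test actually uses; `reachedWithin` and the promise names are hypothetical.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

object AwaitTimeoutSketch {
  /** Waits for `pausedReached`; returns false if it is not completed in time. */
  def reachedWithin(pausedReached: Promise[Unit], timeout: FiniteDuration): Boolean =
    try { Await.result(pausedReached.future, timeout); true }
    catch { case _: TimeoutException => false }

  def main(args: Array[String]): Unit = {
    // Never completed: models a run that finished before the pause landed.
    val neverPaused = Promise[Unit]()
    // Already completed: models a pause that landed in time.
    val paused = Promise[Unit]().success(())
    println(reachedWithin(neverPaused, 100.millis))
    println(reachedWithin(paused, 100.millis))
  }
}
```

A promise that is never fulfilled makes the await fail only after the full timeout elapses, so a missed pause costs the whole wait before the test reports an error.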


@Ma77Ball Ma77Ball marked this pull request as ready for review June 26, 2026 23:39
@Ma77Ball

Contributor Author

/request-review @aglinxinyuan

@github-actions github-actions Bot requested a review from aglinxinyuan June 27, 2026 00:20

Development

Successfully merging this pull request may close these issues.

Flaky ReconfigurationIntegrationSpec: small CSV source can finish before pause
