
[SPARK-55656][BUILD] Upgrade Guava to 33.5.0-jre#54447

Closed
LuciferYang wants to merge 4 commits into apache:master from LuciferYang:SPARK-55656

Conversation

LuciferYang (Contributor) commented Feb 24, 2026:

What changes were proposed in this pull request?

This PR aims to upgrade Guava from 33.4.8 to 33.5.0.
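
An upgrade like this typically reduces to bumping a version property in the parent `pom.xml`. The sketch below is illustrative only; the property name and its location in Spark's actual build files may differ:

```xml
<!-- Illustrative sketch only: the property name and surrounding
     context in Spark's parent pom.xml may differ. -->
<properties>
  <!-- was 33.4.8-jre before this change -->
  <guava.version>33.5.0-jre</guava.version>
</properties>
```

The `-jre` suffix selects the JDK flavor of Guava (as opposed to the `-android` flavor).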

Why are the changes needed?

The new version brings some improvements; see the Guava 33.5.0 release notes for the full list.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang marked this pull request as draft February 24, 2026 11:45
LuciferYang (Contributor, Author) commented Feb 24, 2026:

Waiting for #54446 to fix the test failures after the upgrade.

[info] CountVectorizerSuite:
[info] - params (24 milliseconds)
[info] - CountVectorizerModel common cases (1 second, 779 milliseconds)
[info] - CountVectorizer common cases (400 milliseconds)
[info] - CountVectorizer vocabSize and minDF (735 milliseconds)
[info] - CountVectorizer maxDF (113 milliseconds)
[info] - CountVectorizer using both minDF and maxDF (103 milliseconds)
[info] - CountVectorizerModel with minTF count (212 milliseconds)
[info] - CountVectorizerModel with minTF freq (222 milliseconds)
[info] - CountVectorizerModel and CountVectorizer with binary *** FAILED *** (227 milliseconds)
[info]   org.scalatest.exceptions.TestFailedException: Expected (4,[1],[1.0]) and (4,[2],[1.0]) to be within 1.0E-14 using absolute tolerance for all elements.
[info]   
[info]   == Progress ==
[info]      AddData to MemoryStream[_1#606,_2#607,_3#608]: (0,List(a, a, a, a, b, b, b, b, c, d),(4,[0,1,2,3],[1.0,1.0,1.0,1.0])),(1,List(c, c, c),(4,[2],[1.0])),(2,List(a),(4,[0],[1.0]))
[info]   => CheckAnswerByFunc
[info]   
[info]   == Stream ==
[info]   Output Mode: Append
[info]   Stream state: {MemoryStream[_1#606,_2#607,_3#608]: 0}
[info]   Thread state: alive
[info]   Thread stack trace: java.base@17.0.18/java.lang.Thread.sleep(Native Method)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.executeOneBatch(MicroBatchExecution.scala:603)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:521)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.$anonfun$runActivatedStream$1$adapted(MicroBatchExecution.scala:521)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution$$Lambda$2784/0x0000000801e93698.apply(Unknown Source)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.TriggerExecutor.runOneBatch(TriggerExecutor.scala:40)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.TriggerExecutor.runOneBatch$(TriggerExecutor.scala:38)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.ProcessingTimeExecutor.runOneBatch(TriggerExecutor.scala:71)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.ProcessingTimeExecutor.execute(TriggerExecutor.scala:83)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:521)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution.$anonfun$runStream$1(StreamExecution.scala:353)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution$$Lambda$2751/0x0000000801e813a0.apply$mcV$sp(Unknown Source)
[info]   app//scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[info]   app//org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:810)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution.org$apache$spark$sql$execution$streaming$runtime$StreamExecution$$runStream(StreamExecution.scala:313)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution$$anon$1.run(StreamExecution.scala:236)
[info]   
[info]   
[info]   == Sink ==
[info]   0: [(4,[0,1,2,3],[1.0,1.0,1.0,1.0]),(4,[0,1,2,3],[1.0,1.0,1.0,1.0])] [(4,[1],[1.0]),(4,[2],[1.0])] [(4,[0],[1.0]),(4,[0],[1.0])]
[info]   
[info]   
[info]   == Plan ==
[info]   == Parsed Logical Plan ==
[info]   ~WriteToMicroBatchDataSource MemorySink, 0779d714-07d5-4b31-88a2-e89b6406eadb, Append, 0
[info]   +- ~Project [features#632, expected#621]
[info]      +- ~Project [id#619, words#620, expected#621, UDF(words#620) AS features#632]
[info]         +- ~Project [id#616 AS id#619, words#617 AS words#620, expected#618 AS expected#621]
[info]            +- ~Project [_1#606 AS id#616, _2#607 AS words#617, _3#608 AS expected#618]
[info]               +- ~StreamingDataSourceV2ScanRelation[_1#606, _2#607, _3#608] MemoryStreamDataSource
[info]   
[info]   == Analyzed Logical Plan ==
[info]   ~WriteToMicroBatchDataSource MemorySink, 0779d714-07d5-4b31-88a2-e89b6406eadb, Append, 0
[info]   +- ~Project [features#632, expected#621]
[info]      +- ~Project [id#619, words#620, expected#621, UDF(words#620) AS features#632]
[info]         +- ~Project [id#616 AS id#619, words#617 AS words#620, expected#618 AS expected#621]
[info]            +- ~Project [_1#606 AS id#616, _2#607 AS words#617, _3#608 AS expected#618]
[info]               +- ~StreamingDataSourceV2ScanRelation[_1#606, _2#607, _3#608] MemoryStreamDataSource
[info]   
[info]   == Optimized Logical Plan ==
[info]   ~WriteToDataSourceV2 MicroBatchWrite[epoch: 0, writer: org.apache.spark.sql.execution.streaming.sources.MemoryStreamingWrite@3ddc0dda]
[info]   +- ~Project [UDF(_2#607) AS features#632, _3#608 AS expected#621]
[info]      +- ~StreamingDataSourceV2ScanRelation[_1#606, _2#607, _3#608] MemoryStreamDataSource
[info]   
[info]   == Physical Plan ==
[info]   WriteToDataSourceV2 MicroBatchWrite[epoch: 0, writer: org.apache.spark.sql.execution.streaming.sources.MemoryStreamingWrite@3ddc0dda]
[info]   +- *(1) Project [UDF(_2#607) AS features#632, _3#608 AS expected#621]
[info]      +- MicroBatchScan[_1#606, _2#607, _3#608] MemoryStreamDataSource (StreamTest.scala:512)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Assertions.fail(Assertions.scala:933)
[info]   at org.scalatest.Assertions.fail$(Assertions.scala:929)
[info]   at org.scalatest.funsuite.AnyFunSuite.fail(AnyFunSuite.scala:1564)
[info]   at org.apache.spark.sql.streaming.StreamTest.failTest$1(StreamTest.scala:512)
[info]   at org.apache.spark.sql.streaming.StreamTest.executeAction$1(StreamTest.scala:862)
[info]   at org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$69(StreamTest.scala:887)
[info]   at org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$69$adapted(StreamTest.scala:874)
[info]   at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:630)
[info]   at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:628)
[info]   at scala.collection.AbstractIterable.foreach(Iterable.scala:936)
[info]   at org.apache.spark.sql.streaming.StreamTest.liftedTree1$1(StreamTest.scala:874)
[info]   at org.apache.spark.sql.streaming.StreamTest.testStream(StreamTest.scala:873)
[info]   at org.apache.spark.sql.streaming.StreamTest.testStream$(StreamTest.scala:391)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testStream(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerOnStreamData(MLTest.scala:125)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerOnStreamData$(MLTest.scala:104)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testTransformerOnStreamData(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerByGlobalCheckFunc(MLTest.scala:160)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerByGlobalCheckFunc$(MLTest.scala:153)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testTransformerByGlobalCheckFunc(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.util.MLTest.testTransformer(MLTest.scala:150)
[info]   at org.apache.spark.ml.util.MLTest.testTransformer$(MLTest.scala:140)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testTransformer(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.$anonfun$new$19(CountVectorizerSuite.scala:244)
[info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[info]   at org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
[info]   at org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
[info]   at org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
[info]   at org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
[info]   at org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:30)
[info]   at org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:41)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
[info]   at org.apache.spark.SparkTestSuite.withFixture(SparkTestSuite.scala:175)
[info]   at org.apache.spark.SparkTestSuite.withFixture$(SparkTestSuite.scala:169)
[info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:30)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:30)
[info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
[info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
[info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:30)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:323)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
[info]   at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Suite.run(Suite.scala:1114)
[info]   at org.scalatest.Suite.run$(Suite.scala:1096)
[info]   at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:30)
[info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
[info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
[info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
[info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:30)
[info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
[info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
[info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:414)
[info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[info]   at java.base/java.lang.Thread.run(Thread.java:840)
[info] - CountVectorizer read/write (360 milliseconds)
[info] - CountVectorizerModel read/write (1 second, 82 milliseconds)
[info] - SPARK-22974: CountVectorModel should attach proper attribute to output column (18 milliseconds)
[info] - SPARK-32662: Test on empty dataset (25 milliseconds)
[info] - SPARK-32662: Remove requirement for minimum vocabulary size (739 milliseconds)
[info] Run completed in 9 seconds, 415 milliseconds.
[info] Total number of tests run: 14
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 13, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.ml.feature.CountVectorizerSuite
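
The failure above is an element-wise absolute-tolerance assertion: the expected and actual sparse vectors each contain a single 1.0, but at different indices, so no tolerance can make them match. A minimal sketch of that kind of check (class and method names here are hypothetical, not Spark's actual TestingUtils API):

```java
// Hypothetical sketch of an element-wise absolute-tolerance comparison,
// similar in spirit to what the failing assertion performs. A non-zero
// value at a *different index* can never pass, no matter how small the
// tolerance is.
public class AbsTolCheck {
    // Densify a sparse (size, indices, values) vector representation.
    static double[] toDense(int size, int[] indices, double[] values) {
        double[] dense = new double[size];
        for (int i = 0; i < indices.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }

    // True iff every element of a and b matches within eps.
    static boolean absTolEqual(double[] a, double[] b, double eps) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (Math.abs(a[i] - b[i]) > eps) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // The failure compared (4,[1],[1.0]) with (4,[2],[1.0]).
        double[] expected = toDense(4, new int[]{1}, new double[]{1.0});
        double[] actual   = toDense(4, new int[]{2}, new double[]{1.0});
        System.out.println(absTolEqual(expected, actual, 1e-14)); // false
    }
}
```

This is why the failure pointed at an upstream behavior change rather than floating-point noise: the vocabulary index assigned to a term shifted, which a tolerance comparison can never absorb.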

dongjoon-hyun (Member) left a comment:

I merged #54446 to unblock this. Could you rebase once more, @LuciferYang?

pan3793 (Member) left a comment:

Checked the Guava 33.5.0 release notes; no surprises.

LuciferYang (Contributor, Author) commented:

rebased

@LuciferYang LuciferYang marked this pull request as ready for review February 25, 2026 02:49
dongjoon-hyun (Member) left a comment:

+1, LGTM.

LuciferYang (Contributor, Author) commented:

Merged into master. Thanks @dongjoon-hyun @pan3793
