
[SPARK-55656][BUILD] Upgrade Guava to 33.5.0-jre#54447

Closed
LuciferYang wants to merge 4 commits into apache:master from LuciferYang:SPARK-55656

Conversation

LuciferYang (Contributor) commented Feb 24, 2026:

What changes were proposed in this pull request?

This PR aims to upgrade Guava from 33.4.8 to 33.5.0.
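
An upgrade like this typically reduces to bumping a version property in the parent `pom.xml`. The sketch below is illustrative only; the property name and its location in Spark's actual build files may differ:

```xml
<!-- Illustrative sketch only: the property name and surrounding
     context in Spark's parent pom.xml may differ. -->
<properties>
  <!-- was 33.4.8-jre before this change -->
  <guava.version>33.5.0-jre</guava.version>
</properties>
```

The `-jre` suffix selects the JDK flavor of Guava (as opposed to the `-android` flavor).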

Why are the changes needed?

The new version brings some improvements; see the Guava 33.5.0 release notes for the full list.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang marked this pull request as draft February 24, 2026 11:45
LuciferYang (Contributor, Author) commented Feb 24, 2026:

Waiting for #54446 to fix the test failures after the upgrade.

[info] CountVectorizerSuite:
[info] - params (24 milliseconds)
[info] - CountVectorizerModel common cases (1 second, 779 milliseconds)
[info] - CountVectorizer common cases (400 milliseconds)
[info] - CountVectorizer vocabSize and minDF (735 milliseconds)
[info] - CountVectorizer maxDF (113 milliseconds)
[info] - CountVectorizer using both minDF and maxDF (103 milliseconds)
[info] - CountVectorizerModel with minTF count (212 milliseconds)
[info] - CountVectorizerModel with minTF freq (222 milliseconds)
[info] - CountVectorizerModel and CountVectorizer with binary *** FAILED *** (227 milliseconds)
[info]   org.scalatest.exceptions.TestFailedException: Expected (4,[1],[1.0]) and (4,[2],[1.0]) to be within 1.0E-14 using absolute tolerance for all elements.
[info]   
[info]   == Progress ==
[info]      AddData to MemoryStream[_1#606,_2#607,_3#608]: (0,List(a, a, a, a, b, b, b, b, c, d),(4,[0,1,2,3],[1.0,1.0,1.0,1.0])),(1,List(c, c, c),(4,[2],[1.0])),(2,List(a),(4,[0],[1.0]))
[info]   => CheckAnswerByFunc
[info]   
[info]   == Stream ==
[info]   Output Mode: Append
[info]   Stream state: {MemoryStream[_1#606,_2#607,_3#608]: 0}
[info]   Thread state: alive
[info]   Thread stack trace: java.base@17.0.18/java.lang.Thread.sleep(Native Method)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.executeOneBatch(MicroBatchExecution.scala:603)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:521)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.$anonfun$runActivatedStream$1$adapted(MicroBatchExecution.scala:521)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution$$Lambda$2784/0x0000000801e93698.apply(Unknown Source)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.TriggerExecutor.runOneBatch(TriggerExecutor.scala:40)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.TriggerExecutor.runOneBatch$(TriggerExecutor.scala:38)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.ProcessingTimeExecutor.runOneBatch(TriggerExecutor.scala:71)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.ProcessingTimeExecutor.execute(TriggerExecutor.scala:83)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:521)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution.$anonfun$runStream$1(StreamExecution.scala:353)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution$$Lambda$2751/0x0000000801e813a0.apply$mcV$sp(Unknown Source)
[info]   app//scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[info]   app//org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:810)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution.org$apache$spark$sql$execution$streaming$runtime$StreamExecution$$runStream(StreamExecution.scala:313)
[info]   app//org.apache.spark.sql.execution.streaming.runtime.StreamExecution$$anon$1.run(StreamExecution.scala:236)
[info]   
[info]   
[info]   == Sink ==
[info]   0: [(4,[0,1,2,3],[1.0,1.0,1.0,1.0]),(4,[0,1,2,3],[1.0,1.0,1.0,1.0])] [(4,[1],[1.0]),(4,[2],[1.0])] [(4,[0],[1.0]),(4,[0],[1.0])]
[info]   
[info]   
[info]   == Plan ==
[info]   == Parsed Logical Plan ==
[info]   ~WriteToMicroBatchDataSource MemorySink, 0779d714-07d5-4b31-88a2-e89b6406eadb, Append, 0
[info]   +- ~Project [features#632, expected#621]
[info]      +- ~Project [id#619, words#620, expected#621, UDF(words#620) AS features#632]
[info]         +- ~Project [id#616 AS id#619, words#617 AS words#620, expected#618 AS expected#621]
[info]            +- ~Project [_1#606 AS id#616, _2#607 AS words#617, _3#608 AS expected#618]
[info]               +- ~StreamingDataSourceV2ScanRelation[_1#606, _2#607, _3#608] MemoryStreamDataSource
[info]   
[info]   == Analyzed Logical Plan ==
[info]   ~WriteToMicroBatchDataSource MemorySink, 0779d714-07d5-4b31-88a2-e89b6406eadb, Append, 0
[info]   +- ~Project [features#632, expected#621]
[info]      +- ~Project [id#619, words#620, expected#621, UDF(words#620) AS features#632]
[info]         +- ~Project [id#616 AS id#619, words#617 AS words#620, expected#618 AS expected#621]
[info]            +- ~Project [_1#606 AS id#616, _2#607 AS words#617, _3#608 AS expected#618]
[info]               +- ~StreamingDataSourceV2ScanRelation[_1#606, _2#607, _3#608] MemoryStreamDataSource
[info]   
[info]   == Optimized Logical Plan ==
[info]   ~WriteToDataSourceV2 MicroBatchWrite[epoch: 0, writer: org.apache.spark.sql.execution.streaming.sources.MemoryStreamingWrite@3ddc0dda]
[info]   +- ~Project [UDF(_2#607) AS features#632, _3#608 AS expected#621]
[info]      +- ~StreamingDataSourceV2ScanRelation[_1#606, _2#607, _3#608] MemoryStreamDataSource
[info]   
[info]   == Physical Plan ==
[info]   WriteToDataSourceV2 MicroBatchWrite[epoch: 0, writer: org.apache.spark.sql.execution.streaming.sources.MemoryStreamingWrite@3ddc0dda]
[info]   +- *(1) Project [UDF(_2#607) AS features#632, _3#608 AS expected#621]
[info]      +- MicroBatchScan[_1#606, _2#607, _3#608] MemoryStreamDataSource (StreamTest.scala:512)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Assertions.fail(Assertions.scala:933)
[info]   at org.scalatest.Assertions.fail$(Assertions.scala:929)
[info]   at org.scalatest.funsuite.AnyFunSuite.fail(AnyFunSuite.scala:1564)
[info]   at org.apache.spark.sql.streaming.StreamTest.failTest$1(StreamTest.scala:512)
[info]   at org.apache.spark.sql.streaming.StreamTest.executeAction$1(StreamTest.scala:862)
[info]   at org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$69(StreamTest.scala:887)
[info]   at org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$69$adapted(StreamTest.scala:874)
[info]   at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:630)
[info]   at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:628)
[info]   at scala.collection.AbstractIterable.foreach(Iterable.scala:936)
[info]   at org.apache.spark.sql.streaming.StreamTest.liftedTree1$1(StreamTest.scala:874)
[info]   at org.apache.spark.sql.streaming.StreamTest.testStream(StreamTest.scala:873)
[info]   at org.apache.spark.sql.streaming.StreamTest.testStream$(StreamTest.scala:391)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testStream(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerOnStreamData(MLTest.scala:125)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerOnStreamData$(MLTest.scala:104)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testTransformerOnStreamData(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerByGlobalCheckFunc(MLTest.scala:160)
[info]   at org.apache.spark.ml.util.MLTest.testTransformerByGlobalCheckFunc$(MLTest.scala:153)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testTransformerByGlobalCheckFunc(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.util.MLTest.testTransformer(MLTest.scala:150)
[info]   at org.apache.spark.ml.util.MLTest.testTransformer$(MLTest.scala:140)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.testTransformer(CountVectorizerSuite.scala:26)
[info]   at org.apache.spark.ml.feature.CountVectorizerSuite.$anonfun$new$19(CountVectorizerSuite.scala:244)
[info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[info]   at org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
[info]   at org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
[info]   at org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
[info]   at org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
[info]   at org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:30)
[info]   at org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:41)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
[info]   at org.apache.spark.SparkTestSuite.withFixture(SparkTestSuite.scala:175)
[info]   at org.apache.spark.SparkTestSuite.withFixture$(SparkTestSuite.scala:169)
[info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:30)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:30)
[info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
[info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
[info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:30)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:323)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
[info]   at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Suite.run(Suite.scala:1114)
[info]   at org.scalatest.Suite.run$(Suite.scala:1096)
[info]   at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:30)
[info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
[info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
[info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
[info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:30)
[info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
[info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
[info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:414)
[info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[info]   at java.base/java.lang.Thread.run(Thread.java:840)
[info] - CountVectorizer read/write (360 milliseconds)
[info] - CountVectorizerModel read/write (1 second, 82 milliseconds)
[info] - SPARK-22974: CountVectorModel should attach proper attribute to output column (18 milliseconds)
[info] - SPARK-32662: Test on empty dataset (25 milliseconds)
[info] - SPARK-32662: Remove requirement for minimum vocabulary size (739 milliseconds)
[info] Run completed in 9 seconds, 415 milliseconds.
[info] Total number of tests run: 14
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 13, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.ml.feature.CountVectorizerSuite
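
The failure above is an element-wise absolute-tolerance assertion: the expected and actual sparse vectors each contain a single 1.0, but at different indices, so no tolerance can make them match. A minimal sketch of that kind of check (class and method names here are hypothetical, not Spark's actual TestingUtils API):

```java
// Hypothetical sketch of an element-wise absolute-tolerance comparison,
// similar in spirit to what the failing assertion performs. A non-zero
// value at a *different index* can never pass, no matter how small the
// tolerance is.
public class AbsTolCheck {
    // Densify a sparse (size, indices, values) vector representation.
    static double[] toDense(int size, int[] indices, double[] values) {
        double[] dense = new double[size];
        for (int i = 0; i < indices.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }

    // True iff every element of a and b matches within eps.
    static boolean absTolEqual(double[] a, double[] b, double eps) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (Math.abs(a[i] - b[i]) > eps) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // The failure compared (4,[1],[1.0]) with (4,[2],[1.0]).
        double[] expected = toDense(4, new int[]{1}, new double[]{1.0});
        double[] actual   = toDense(4, new int[]{2}, new double[]{1.0});
        System.out.println(absTolEqual(expected, actual, 1e-14)); // false
    }
}
```

This is why the failure pointed at an upstream behavior change rather than floating-point noise: the vocabulary index assigned to a term shifted, which a tolerance comparison can never absorb.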

dongjoon-hyun (Member) left a comment:

I merged #54446 to unblock this. Could you rebase once more, @LuciferYang?

pan3793 (Member) left a comment:

Checked the Guava 33.5.0 release notes; no surprises.

LuciferYang (Contributor, Author) commented:

rebased

@LuciferYang LuciferYang marked this pull request as ready for review February 25, 2026 02:49
dongjoon-hyun (Member) left a comment:

+1, LGTM.

LuciferYang (Contributor, Author) commented:

Merged into master. Thanks @dongjoon-hyun @pan3793
