Commit 32da21c
Update java client version (#428)
* Upgrade Java Client to V2 syncQuery & syncInsert
* Refactor to use the new client v2 api
* Add timeout to query operation
* Clean NodeClient
* Change binary reader
* Update client version
* Fix project to use snapshots
* merge with main
* Run spotlessScalaApply and implement readAllBytes, since Java 8 does not support it
* Remove unneeded remarks
* Change to client version 0.9.3
* Update socket timeout in new client
* Change max connections to 20
* ConnectTimeout to 1200000
* Add 3 sec to sleep
* Setting a new setConnectionRequestTimeout for experiment
* spotlessScalaApply fix
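For the `readAllBytes` item in the list above: `InputStream.readAllBytes` was only added in Java 9, so a Java 8 build needs its own helper. The following is a minimal sketch of such a helper, not the code from this commit; the object name `StreamUtils` and the 8 KiB buffer size are assumptions made for illustration.

```scala
import java.io.{ByteArrayOutputStream, InputStream}

// Hypothetical helper: drains an InputStream into a byte array on Java 8.
object StreamUtils {
  def readAllBytes(in: InputStream): Array[Byte] = {
    val out    = new ByteArrayOutputStream()
    val buffer = new Array[Byte](8192) // buffer size is an arbitrary choice
    var n      = in.read(buffer)
    // read returns -1 once the end of the stream is reached
    while (n != -1) {
      out.write(buffer, 0, n)
      n = in.read(buffer)
    }
    out.toByteArray
  }
}
```

The helper does not close the stream; the caller stays responsible for that.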
* Fix/json reader fixedstring v2 (#448)
* Wake up ClickHouse Cloud instance before tests (#429)
* fix: Handle FixedString as plain text in JSON reader for all Spark versions
Problem:
ClickHouse returns FixedString as plain text in JSON format, but the
connector was trying to decode it as Base64, causing InvalidFormatException.
Solution:
Use pattern matching with guard to check if the JSON node is textual.
- If textual (FixedString): decode as UTF-8 bytes
- If not textual (true binary): decode as Base64
Applied to Spark 3.3, 3.4, and 3.5.
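The textual-versus-binary branching described above can be sketched as follows. This is a simplified illustration rather than the connector's code: `JsonValue` is a hypothetical stand-in for a real JSON node type, and `decodeBinary` is an assumed name.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

object FixedStringDecoding {
  // Hypothetical, simplified stand-in for a JSON node.
  final case class JsonValue(raw: String, isTextual: Boolean)

  // A pattern match with a guard: textual nodes are taken as plain text,
  // anything else is assumed to carry Base64-encoded bytes.
  def decodeBinary(node: JsonValue): Array[Byte] = node match {
    case n if n.isTextual => n.raw.getBytes(StandardCharsets.UTF_8)
    case n                => Base64.getDecoder.decode(n.raw)
  }
}
```

With this shape, a FixedString value such as `abc` is never handed to the Base64 decoder, which is the step the problem statement identifies as the source of the exception.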
---------
Co-authored-by: Bentsi Leviav <bentsi.leviav@clickhouse.com>
Co-authored-by: Shimon Steinitz <shimonsteinitz@Shimons-MacBook-Pro.local>
* Added reader and writer tests (#449)
* Wake up ClickHouse Cloud instance before tests (#429)
* feat: Add comprehensive read test coverage for Spark 3.3, 3.4, and 3.5
Add shared test trait ClickHouseReaderTestBase with 48 test scenarios covering:
- All primitive types (Boolean, Byte, Short, Int, Long, Float, Double)
- Large integers (UInt64, Int128, UInt128, Int256, UInt256)
- Decimals (Decimal32, Decimal64, Decimal128)
- Date/Time types (Date, Date32, DateTime, DateTime32, DateTime64)
- String types (String, UUID, FixedString)
- Enums (Enum8, Enum16)
- IP addresses (IPv4, IPv6)
- JSON data
- Collections (Arrays, Maps)
- Edge cases (empty strings, long strings, empty arrays, nullable variants)
Test suites for Binary and JSON read formats.
Test results: 96 tests per Spark version (288 total)
- Binary format: 47/48 passing
- JSON format: 47/48 passing
- Overall: 94/96 passing per version (98% pass rate)
Remaining failures are known bugs with fixes on separate branches.
* feat: Add comprehensive write test coverage for Spark 3.3, 3.4, and 3.5
Add shared test trait ClickHouseWriterTestBase with 17 test scenarios covering:
- Primitive types (Boolean, Byte, Short, Int, Long, Float, Double)
- Decimal types
- String types (regular and empty strings)
- Date and Timestamp types
- Collections (Arrays and Maps, including empty variants)
- Nullable variants
Test suites for JSON and Arrow write formats.
Note: Binary write format is not supported (only JSON and Arrow).
Test results: 34 tests per Spark version (102 total)
- JSON format: 17/17 passing (100%)
- Arrow format: 17/17 passing (100%)
- Overall: 34/34 passing per version (100% pass rate)
Known behavior: Boolean values write as BooleanType but read back as ShortType (0/1)
due to ClickHouse storing Boolean as UInt8.
* style: Apply spotless formatting
* style: Apply spotless formatting for Spark 3.3 and 3.4
Remove trailing whitespace from test files to pass CI spotless checks.
* fix: Change write format from binary to arrow in BinaryReaderSuite
The 'binary' write format option doesn't exist. Changed to 'arrow'
which is a valid write format option.
Applied to Spark 3.3, 3.4, and 3.5.
* test: Add nullable tests for ShortType, IntegerType, and LongType
Added missing nullable variant tests to ensure comprehensive coverage:
- decode ShortType - nullable with null values (Nullable(Int16))
- decode IntegerType - nullable with null values (Nullable(Int32))
- decode LongType - nullable with null values (Nullable(Int64))
These tests verify that nullable primitive types correctly handle NULL
values in both Binary and JSON read formats.
Applied to Spark 3.3, 3.4, and 3.5.
Total tests per Spark version: 51 (was 48)
Total across all versions: 153 (was 144)
* Refactor ClickHouseReaderTestBase: Add nullable tests and organize alphabetically
- Add missing nullable test cases for: Date32, Decimal32, Decimal128, UInt16, UUID, DateTime64
- Organize all 69 tests alphabetically by data type for better maintainability
- Ensure comprehensive coverage with both nullable and non-nullable variants for all data types
- Apply changes consistently across Spark 3.3, 3.4, and 3.5
* ci: Skip cloud tests on forks where secrets are unavailable
Add repository check to cloud workflow to prevent failures on forks
that don't have access to ClickHouse Cloud secrets. Tests will still
run on the main repository where secrets are properly configured.
* Refactor and enhance Reader/Writer tests for all Spark versions
- Add BooleanType tests to Reader (2 tests) with format-aware assertions
- Add 6 new tests to Writer: nested arrays, arrays with nullable elements,
multiple Decimal precisions (18,4 and 38,10), Map with nullable values, and StructType
- Reorder all tests lexicographically for better organization
- Writer tests increased from 17 to 33 tests
- Reader tests increased from 69 to 71 tests
- Remove section header comments for cleaner code
- Apply changes to all Spark versions: 3.3, 3.4, and 3.5
- All tests now properly sorted alphabetically by data type and variant
* style: Apply spotless formatting to Reader/Writer tests
---------
Co-authored-by: Bentsi Leviav <bentsi.leviav@clickhouse.com>
Co-authored-by: Shimon Steinitz <shimon.steinitz@clickhouse.com>
* Fix BinaryReader to handle new Java client types
- Fix DecimalType: Handle both BigInteger (Int256/UInt256) and BigDecimal (Decimal types)
- Fix ArrayType: Direct call to BinaryStreamReader.ArrayValue.getArrayOfObjects()
- Fix StringType: Handle UUID, InetAddress, and EnumValue types
- Fix DateType: Handle both LocalDate and ZonedDateTime
- Fix MapType: Handle all util.Map implementations
Removed reflection and defensive pattern matching for better performance.
All 34 Binary Reader test failures are now fixed (71/71 tests passing).
Fixes compatibility with new Java client API in update-java-client-version branch.
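The type handling listed above can be illustrated with plain pattern matching on the runtime type of each value. This is a hedged sketch using only JDK types; the object and method names are hypothetical, and the client-specific `EnumValue` and `ArrayValue` types are left out because their API is not shown in this commit message.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}
import java.net.InetAddress
import java.time.{LocalDate, ZonedDateTime}
import java.util.UUID

object BinaryValueNormalizer {
  // DecimalType: accept both BigInteger (wide integers) and BigDecimal.
  def toDecimal(value: Any): JBigDecimal = value match {
    case i: BigInteger  => new JBigDecimal(i)
    case d: JBigDecimal => d
    case other          => throw new IllegalArgumentException(s"Unsupported decimal value: $other")
  }

  // DateType: accept both LocalDate and ZonedDateTime, returning days since the epoch.
  def toEpochDay(value: Any): Int = value match {
    case d: LocalDate     => d.toEpochDay.toInt
    case z: ZonedDateTime => z.toLocalDate.toEpochDay.toInt
    case other            => throw new IllegalArgumentException(s"Unsupported date value: $other")
  }

  // StringType: render UUID and InetAddress values as text.
  // Falling back to toString for everything else is an assumption of this sketch.
  def toText(value: Any): String = value match {
    case u: UUID        => u.toString
    case a: InetAddress => a.getHostAddress
    case other          => other.toString
  }
}
```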
* Add high-precision decimal tests with tolerance
- Add Decimal(18,4) test with 0.001 tolerance for JSON/Arrow formats
- Documents precision limitation for decimals with >15-17 significant digits
- Uses tolerance-based assertions to account for observed precision loss
- Binary format preserves full precision (already tested in Binary Reader suite)
- All 278 tests passing
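A tolerance-based assertion of the kind described above might look like the sketch below. The helper name `approxEqual` and the sample value are assumptions. The 15-17 significant digit limit matches what a 64-bit floating-point value can carry, so the sketch simulates the loss with a round trip through `Double`; whether that is the actual cause in the JSON and Arrow paths is not stated in the commit message.

```scala
object DecimalTolerance {
  // True when the two decimals differ by no more than the given tolerance.
  def approxEqual(actual: BigDecimal, expected: BigDecimal, tolerance: BigDecimal): Boolean =
    (actual - expected).abs <= tolerance

  // Simulates precision loss by passing the value through a Double (an assumption of this sketch).
  def throughDouble(value: BigDecimal): BigDecimal = BigDecimal(value.toDouble)
}
```

For example, `BigDecimal("12345678901234.5678")` has 18 significant digits. After `throughDouble` it no longer compares equal to the original, but `approxEqual` with a tolerance of `0.001` still accepts it.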
* Simplify build-and-test workflow trigger to run on all pushes
* Fix Scala 2.13 compatibility for nested arrays
- Convert mutable.ArraySeq to Array in ClickHouseJsonReader to ensure immutable collections
- Add test workaround for Spark's Row.getSeq behavior in Scala 2.13
- Fix Spotless formatting: remove trailing whitespace in ClickHouseBinaryReader
- Applied to all Spark versions: 3.3, 3.4, 3.5
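The conversion of nested sequences to arrays mentioned above can be sketched with a small recursive function. This is an illustration only: `toArrayDeep` is a hypothetical name, and the sketch works on plain Scala collections rather than on the reader's own value types.

```scala
object NestedArrays {
  // Recursively replaces every Seq (including mutable.ArraySeq) with an Array,
  // so nested levels do not leak a mutable sequence type to the caller.
  def toArrayDeep(value: Any): Any = value match {
    case seq: scala.collection.Seq[_] => seq.map(toArrayDeep).toArray[Any]
    case other                        => other // scalars and strings pass through unchanged
  }
}
```

Matching on `scala.collection.Seq` rather than the default `Seq` is deliberate: in Scala 2.13 the default alias refers to the immutable variant, which a `mutable.ArraySeq` would not match.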
* Update Java client version to 0.9.4
* Enable compression
* Add logging to TPCDSClusterSuite and change client buffers
* Change InputStream read code
* Remove hard coded settings for experiments
* Clean log from insert method
---------
Co-authored-by: Shimon Steinitz <shimonste@gmail.com>
Co-authored-by: Bentsi Leviav <bentsi.leviav@clickhouse.com>
Co-authored-by: Shimon Steinitz <shimonsteinitz@Shimons-MacBook-Pro.local>
Co-authored-by: Shimon Steinitz <shimon.steinitz@clickhouse.com>
1 parent c05885e commit 32da21c
File tree
39 files changed (+7043, -332 lines)
- .github/workflows
- clickhouse-core/src/main/scala/com/clickhouse/spark/client
- spark-3.3
- clickhouse-spark-it/src/test/scala/org/apache/spark/sql/clickhouse
- cluster
- single
- clickhouse-spark/src/main/scala/com/clickhouse/spark
- read
- format
- write
- spark-3.4
- clickhouse-spark-it/src/test/scala/org/apache/spark/sql/clickhouse
- cluster
- single
- clickhouse-spark/src/main/scala/com/clickhouse/spark
- read
- format
- write
- spark-3.5
- clickhouse-spark-it/src/test/scala/org/apache/spark/sql/clickhouse
- cluster
- single
- clickhouse-spark/src/main/scala/com/clickhouse/spark
- read
- format
- write