Error codes, error-condition tests, and privacy hardening #101
Open
robertbuessow wants to merge 3 commits into
Adds `test/error_tests.jl` with comprehensive tests for failure paths when
reading and writing Iceberg tables:
- **`table_open` failures**: non-existent local path, invalid/empty/non-Iceberg
JSON metadata, non-existent S3 key, wrong S3 credentials
- **Missing/corrupted Parquet data files**: write data via memory catalog,
delete or corrupt the `.parquet` files, assert scan throws `IcebergException`
at `next_batch`
- **Catalog error conditions**: load non-existent table/namespace, create table
in missing namespace, duplicate namespace/table, drop non-existent
table/namespace; all using the in-process memory catalog (no Docker)
- **Writer schema mismatch**: write a NamedTuple with a missing required column
or entirely non-matching column names
All new assertions use `RustyIceberg.IcebergException` for consistency with
existing stricter tests (writer_tests.jl, transaction_tests.jl).
Tests in groups 1–4 run without Docker; S3 tests require MinIO via
`make run-containers`.
Labels: dismiss-release-notes, build:benchmark
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Replaces the untyped `msg::String / code::Union{Int,Nothing}` exception
with a three-field struct that gives callers a stable, machine-readable
semantic code alongside a user-displayable message and a technical detail
string for logs.
## Rust (new `error_codes.rs` + 7 modified files)
`iceberg_rust_ffi/src/error_codes.rs` introduces:
- 24 `pub const` error codes in seven groups: Not Found (1xx), Auth (2xx),
Data/Corruption (3xx), Catalog (4xx), Resource State (5xx), I/O (6xx),
Internal (9xx).
- `ClassifiedError` — an `anyhow`-compatible error whose `Display` emits
`"{code}\t{user_msg}\t{detail}"`, threading the code through the
existing `error_message` C-string with no new FFI fields.
- `classify(anyhow::Error)` — downcasts to `iceberg::Error` for precise
`ErrorKind` matching (`TableNotFound`, `NamespaceNotFound`,
`TableAlreadyExists`, `CatalogCommitConflicts`, `DataInvalid`, …),
then falls back to pattern-matching the message string for S3/auth
errors, state errors, and I/O errors.
- `classify_iceberg(iceberg::Error)` convenience wrapper.
- `classified_error(code, user_msg, detail)` for pre-classified errors.
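As a rough illustration of the wire format described in the list above, the following is a minimal sketch and not the code in `error_codes.rs`. The numeric value of `TABLE_NOT_FOUND` and the tab-sanitizing step are assumptions; the PR only states the code ranges and the `"{code}\t{user_msg}\t{detail}"` layout.

```rust
use std::fmt;

// Hypothetical value: the PR only says Not Found codes live in the 1xx range.
pub const TABLE_NOT_FOUND: u32 = 101;

/// Error carrying a semantic code, a short user message, and a technical detail.
#[derive(Debug)]
pub struct ClassifiedError {
    pub code: u32,
    pub user_msg: String,
    pub detail: String,
}

impl fmt::Display for ClassifiedError {
    // Emits "{code}\t{user_msg}\t{detail}" so the code can travel through a
    // single error-message string.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}\t{}\t{}", self.code, self.user_msg, self.detail)
    }
}

// Implementing std::error::Error (with Send + Sync + 'static fields) is what
// lets a type like this convert into anyhow::Error via `?`.
impl std::error::Error for ClassifiedError {}

/// Builds a pre-classified error. Assumption: tabs in the user message are
/// replaced with spaces so the first two separators stay unambiguous.
pub fn classified_error(code: u32, user_msg: &str, detail: &str) -> ClassifiedError {
    ClassifiedError {
        code,
        user_msg: user_msg.replace('\t', " "),
        detail: detail.to_string(),
    }
}
```

Calling `classified_error(TABLE_NOT_FOUND, "Table not found", "...")` and formatting the result with `to_string()` yields the three tab-separated fields in order. The detail field comes last so that any tabs it contains do not disturb the first two separators.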
Every async block that can fail now routes through `classify` /
`classify_iceberg`. Explicit null-pointer and resource-state errors are
constructed with `classified_error` and the matching constant.
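The message-pattern fallback could look roughly like the sketch below. It deliberately omits the `iceberg::Error` downcast and `ErrorKind` matching, which need the `iceberg` crate. The numeric values, the `AUTH_FAILED` name, the substrings, and the user messages are all assumptions for illustration; only the `IO_NETWORK` and `INTERNAL` names appear in this PR.

```rust
// Hypothetical numeric values; the PR only fixes the ranges (2xx, 6xx, 9xx).
pub const AUTH_FAILED: u32 = 201; // hypothetical name
pub const IO_NETWORK: u32 = 601;
pub const INTERNAL: u32 = 900;

/// Fallback classification by substring match on the error text.
/// Returns (code, generic user message). The patterns are illustrative only.
pub fn classify_by_message(detail: &str) -> (u32, &'static str) {
    let lower = detail.to_lowercase();
    if lower.contains("access denied") || lower.contains("invalid credentials") {
        (AUTH_FAILED, "Authentication failed")
    } else if lower.contains("connection refused") || lower.contains("timed out") {
        (IO_NETWORK, "Network error")
    } else {
        // Anything unrecognized is reported as an internal error.
        (INTERNAL, "Internal error")
    }
}
```

String matching of this kind is fragile if upstream wording changes, which is presumably why the PR tries typed `ErrorKind` matching first and uses patterns only as a fallback.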
## Julia (6 modified files)
`src/RustyIceberg.jl`:
- `@enum IcebergError::UInt32` — all 24 codes, exported.
- `IcebergException` now carries `code::IcebergError`, `msg::String`
(short, user-displayable), `detail::String` (full Rust text for logs).
- `parse_and_throw` splits the `CODE\tUSER_MSG\tDETAIL` string, parses
the code into `IcebergError`, and throws a typed exception. Errors that
don't carry the classified prefix fall back to `INTERNAL`.
- `@throw_on_error` updated to delegate to `parse_and_throw`.
All explicit `IcebergException("…")` calls in `src/` updated to the
three-field form with the appropriate semantic code.
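The Julia `parse_and_throw` is not reproduced here. To keep the examples in one language, the sketch below mirrors the described parsing logic in Rust. The `ParsedError` type, the function name, the numeric value of `INTERNAL`, and the fallback message text are assumptions.

```rust
// Hypothetical numeric value for the INTERNAL fallback code.
pub const INTERNAL: u32 = 900;

#[derive(Debug, PartialEq)]
pub struct ParsedError {
    pub code: u32,
    pub msg: String,
    pub detail: String,
}

/// Splits "CODE\tUSER_MSG\tDETAIL" into its three fields. Input without a
/// parseable classified prefix falls back to INTERNAL, keeping the raw text
/// only in `detail`.
pub fn parse_error_message(raw: &str) -> ParsedError {
    // splitn(3, ..) leaves any further tabs inside the detail field intact.
    let mut parts = raw.splitn(3, '\t');
    if let (Some(code), Some(msg), Some(detail)) = (parts.next(), parts.next(), parts.next()) {
        if let Ok(code) = code.parse::<u32>() {
            return ParsedError {
                code,
                msg: msg.to_string(),
                detail: detail.to_string(),
            };
        }
    }
    ParsedError {
        code: INTERNAL,
        msg: "Internal error".to_string(),
        detail: raw.to_string(),
    }
}
```

On the Julia side the parsed code would then be converted into the `IcebergError` enum before the exception is thrown.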
Labels: dismiss-release-notes, build:benchmark
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
IO_NETWORK and IO_S3 user messages no longer embed a truncated slice of the
technical detail string (which can contain S3 bucket paths, file paths, or
endpoint URLs). The `msg` field of IcebergException now always contains a
short, generic description; the full context remains available in `detail`
for log files and bug reports.
Labels: dismiss-release-notes, build:benchmark
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
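One way to guarantee that the user message never contains location data is to select it from a fixed table keyed by code, so it is never derived from the detail string. This is a sketch under assumptions: the numeric values, the message wording, and the `user_message` and `to_wire` helper names are hypothetical.

```rust
// Hypothetical numeric values in the I/O (6xx) range.
pub const IO_NETWORK: u32 = 601;
pub const IO_S3: u32 = 602;

/// Returns a fixed, generic message per code. The detail string is never
/// consulted, so bucket names, paths, and URLs cannot leak into it.
pub fn user_message(code: u32) -> &'static str {
    match code {
        IO_NETWORK => "A network error occurred",
        IO_S3 => "An object storage error occurred",
        _ => "An internal error occurred",
    }
}

/// Builds the "CODE\tUSER_MSG\tDETAIL" string; only DETAIL carries context.
pub fn to_wire(code: u32, detail: &str) -> String {
    format!("{}\t{}\t{}", code, user_message(code), detail)
}
```

With this shape, a check such as "the message contains no `s3://` prefix" holds by construction rather than depending on how the detail was truncated.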
## Summary
- Structured error codes (`IcebergException` now carries `code::IcebergError`, `msg::String`, `detail::String`): callers can programmatically branch on 24 stable semantic codes (Not Found 1xx, Auth 2xx, Data/Corruption 3xx, Catalog 4xx, Resource State 5xx, I/O 6xx, Internal 9xx) rather than parsing error strings. The Rust layer classifies every error at the FFI boundary using `iceberg::ErrorKind` matching and message-pattern fallback; the code is threaded through the existing `error_message` C-string as `CODE\tUSER_MSG\tDETAIL`, so no new FFI fields are needed.
- Error-condition tests (`test/error_tests.jl`): 20 tests covering `table_open` with invalid/missing/corrupt metadata, missing and corrupted Parquet data files, catalog errors (non-existent table/namespace, duplicates, non-empty namespace), and writer schema mismatches, all using the in-process memory catalog where possible (no Docker required for most tests).
- Privacy hardening: S3 paths, file paths, and table names are no longer included in the user-visible `msg` field; they remain only in `detail` for log files.
## Test plan
- `make test` passes with containers running (`make run-containers`)
- `error_tests.jl` passes without Docker (memory catalog only)
- `e.code`, `e.msg`, `e.detail` all populated on caught `IcebergException`
- `e.msg` contains no file paths or table names

🤖 Generated with Claude Code