Error codes, error-condition tests, and privacy hardening #101

Open: robertbuessow wants to merge 3 commits into main from rb-error-tests

@robertbuessow
Contributor

Summary

  • Structured error codes (IcebergException now carries code::IcebergError, msg::String, detail::String): callers can branch programmatically on 24 stable semantic codes (Not Found 1xx, Auth 2xx, Data/Corruption 3xx, Catalog 4xx, Resource State 5xx, I/O 6xx, Internal 9xx) instead of parsing error strings. The Rust layer classifies every error at the FFI boundary, matching on iceberg::ErrorKind first and falling back to message patterns. The code is threaded through the existing error_message C-string as CODE\tUSER_MSG\tDETAIL, so no new FFI fields are needed.

  • Error-condition tests (test/error_tests.jl): 20 tests covering table_open with invalid/missing/corrupt metadata, missing and corrupted Parquet data files, catalog errors (non-existent table/namespace, duplicates, non-empty namespace), and writer schema mismatches. They use the in-process memory catalog where possible, so most tests need no Docker.

  • Privacy hardening: S3 paths, file paths, and table names are no longer included in the user-visible msg field; they remain only in detail for log files.
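
The numeric ranges in the first bullet suggest that a caller can recover the broad category from the hundreds digit alone. A minimal sketch of that idea follows; the `Category` enum and `category_of` function are hypothetical names that do not appear in this PR, and sending any unlisted range to `Internal` is an assumption.

```rust
/// Broad error category, derived from the hundreds digit of a code.
/// Hypothetical helper type; the PR itself exposes individual codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    NotFound,       // 1xx
    Auth,           // 2xx
    DataCorruption, // 3xx
    Catalog,        // 4xx
    ResourceState,  // 5xx
    Io,             // 6xx
    Internal,       // 9xx
}

/// Maps a numeric code to its category using the ranges listed in the summary.
/// Assumption: any range not listed there is treated as `Internal`.
pub fn category_of(code: u32) -> Category {
    match code / 100 {
        1 => Category::NotFound,
        2 => Category::Auth,
        3 => Category::DataCorruption,
        4 => Category::Catalog,
        5 => Category::ResourceState,
        6 => Category::Io,
        _ => Category::Internal,
    }
}
```

With this sketch, `category_of(404)` would yield `Category::Catalog`; the specific value 404 is illustrative only and not one of the PR's constants.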

Test plan

  • make test passes with containers running (make run-containers)
  • Groups 1–4 of error_tests.jl pass without Docker (memory catalog only)
  • e.code, e.msg, e.detail all populated on caught IcebergException
  • e.msg contains no file paths or table names

🤖 Generated with Claude Code

robertbuessow and others added 3 commits May 18, 2026 14:15
Adds `test/error_tests.jl` with comprehensive tests for failure paths
when reading and writing Iceberg tables:

- **`table_open` failures**: non-existent local path, invalid/empty/non-Iceberg
  JSON metadata, non-existent S3 key, wrong S3 credentials
- **Missing/corrupted Parquet data files**: write data via memory catalog,
  delete or corrupt the `.parquet` files, assert scan throws
  `IcebergException` at `next_batch`
- **Catalog error conditions**: load non-existent table/namespace, create
  table in missing namespace, duplicate namespace/table, drop non-existent
  table/namespace; all using the in-process memory catalog (no Docker)
- **Writer schema mismatch**: write a NamedTuple with a missing required
  column or entirely non-matching column names

All new assertions use `RustyIceberg.IcebergException` for consistency with
existing stricter tests (writer_tests.jl, transaction_tests.jl).
Tests in groups 1–4 run without Docker; S3 tests require MinIO via
`make run-containers`.

Labels: dismiss-release-notes, build:benchmark

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Replaces the untyped `msg::String / code::Union{Int,Nothing}` exception
with a three-field struct that gives callers a stable, machine-readable
semantic code alongside a user-displayable message and a technical detail
string for logs.

## Rust (new `error_codes.rs` + 7 modified files)

`iceberg_rust_ffi/src/error_codes.rs` introduces:
- 24 `pub const` error codes in seven groups: Not Found (1xx), Auth (2xx),
  Data/Corruption (3xx), Catalog (4xx), Resource State (5xx), I/O (6xx),
  Internal (9xx).
- `ClassifiedError` — an `anyhow`-compatible error whose `Display` emits
  `"{code}\t{user_msg}\t{detail}"`, threading the code through the
  existing `error_message` C-string with no new FFI fields.
- `classify(anyhow::Error)` — downcasts to `iceberg::Error` for precise
  `ErrorKind` matching (`TableNotFound`, `NamespaceNotFound`,
  `TableAlreadyExists`, `CatalogCommitConflicts`, `DataInvalid`, …),
  then falls back to pattern-matching the message string for S3/auth
  errors, state errors, and I/O errors.
- `classify_iceberg(iceberg::Error)` convenience wrapper.
- `classified_error(code, user_msg, detail)` for pre-classified errors.
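
To make the wire format concrete, here is a minimal sketch of an error type whose `Display` emits the tab-separated form described above. It is not the code from `error_codes.rs`: the field types, the constructor signature, and the numeric code used in the usage note are assumptions. Implementing `std::error::Error` is what lets such a type convert into `anyhow::Error`.

```rust
use std::fmt;

/// Sketch of a pre-classified error. Field names follow the commit message;
/// the concrete types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ClassifiedError {
    pub code: u32,
    pub user_msg: String,
    pub detail: String,
}

impl fmt::Display for ClassifiedError {
    /// Emits "{code}\t{user_msg}\t{detail}" so the code travels inside
    /// a single error-message string.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}\t{}\t{}", self.code, self.user_msg, self.detail)
    }
}

// Any `std::error::Error + Send + Sync + 'static` type converts into `anyhow::Error`.
impl std::error::Error for ClassifiedError {}

/// Convenience constructor mirroring `classified_error(code, user_msg, detail)`.
/// The parameter types here are assumed, not taken from the PR.
pub fn classified_error(code: u32, user_msg: &str, detail: &str) -> ClassifiedError {
    ClassifiedError {
        code,
        user_msg: user_msg.to_string(),
        detail: detail.to_string(),
    }
}
```

For example, `classified_error(101, "Table not found", "lookup failed").to_string()` produces the three fields joined by tabs. Because the tab is the separator, this format only round-trips cleanly if `user_msg` contains no tab characters.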

Every async block that can fail now routes through `classify` /
`classify_iceberg`. Explicit null-pointer and resource-state errors are
constructed with `classified_error` and the matching constant.
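
The message-pattern fallback could look roughly like the sketch below. It covers only the fallback step, not the `iceberg::Error` downcast. The constant names `IO_NETWORK`, `IO_S3`, and `INTERNAL` appear in this PR, but their numeric values here, the `AUTH_DENIED` name, the function name `classify_by_message`, and the specific substrings matched are all assumptions made for illustration.

```rust
// Hypothetical numeric values; only the range prefixes come from the PR.
pub const AUTH_DENIED: u32 = 201; // hypothetical name
pub const IO_NETWORK: u32 = 601;
pub const IO_S3: u32 = 602;
pub const INTERNAL: u32 = 900;

/// Picks a code by scanning the error text for known fragments.
/// The order matters: auth fragments are checked before the generic S3 check,
/// so an S3 "access denied" is reported as an auth error.
pub fn classify_by_message(detail: &str) -> u32 {
    let lower = detail.to_lowercase();
    if lower.contains("access denied") || lower.contains("invalidaccesskeyid") {
        AUTH_DENIED
    } else if lower.contains("s3") {
        IO_S3
    } else if lower.contains("connection refused") || lower.contains("timed out") {
        IO_NETWORK
    } else {
        // Nothing recognized: report an internal error instead of guessing.
        INTERNAL
    }
}
```

String matching of this kind is inherently brittle, which is presumably why the commit tries `ErrorKind` matching first and uses patterns only as a fallback.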

## Julia (6 modified files)

`src/RustyIceberg.jl`:
- `@enum IcebergError::UInt32` — all 24 codes, exported.
- `IcebergException` now carries `code::IcebergError`, `msg::String`
  (short, user-displayable), `detail::String` (full Rust text for logs).
- `parse_and_throw` splits the `CODE\tUSER_MSG\tDETAIL` string, parses
  the code into `IcebergError`, and throws a typed exception. Errors that
  don't carry the classified prefix fall back to `INTERNAL`.
- `@throw_on_error` updated to delegate to `parse_and_throw`.
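
The split performed by `parse_and_throw` happens on the Julia side; the sketch below mirrors the same logic in Rust purely for illustration. The function name `parse_error_message`, the value of `INTERNAL`, and the generic fallback message are assumptions. Splitting into at most three parts keeps any tab characters inside `detail` intact.

```rust
pub const INTERNAL: u32 = 900; // hypothetical numeric value

/// Splits "CODE\tUSER_MSG\tDETAIL" into (code, msg, detail).
/// Input without a parsable code and both remaining fields falls back to
/// INTERNAL, with the raw text preserved as the detail.
pub fn parse_error_message(raw: &str) -> (u32, String, String) {
    // At most three parts, so tabs inside the detail are not split further.
    let mut parts = raw.splitn(3, '\t');
    let code = parts.next().and_then(|s| s.parse::<u32>().ok());
    let msg = parts.next();
    let detail = parts.next();
    match (code, msg, detail) {
        (Some(code), Some(msg), Some(detail)) => {
            (code, msg.to_string(), detail.to_string())
        }
        // Assumption: unclassified errors get a generic message.
        _ => (INTERNAL, "Internal error".to_string(), raw.to_string()),
    }
}
```

Keeping the raw string in the detail slot on fallback means nothing is lost for logs, while the user-visible message stays generic.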

All explicit `IcebergException("…")` calls in `src/` updated to the
three-field form with the appropriate semantic code.

Labels: dismiss-release-notes, build:benchmark

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
IO_NETWORK and IO_S3 user messages no longer embed a truncated slice of
the technical detail string (which can contain S3 bucket paths, file
paths, or endpoint URLs). The `msg` field of IcebergException now always
contains a short, generic description; the full context remains available
in `detail` for log files and bug reports.
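
One way to guarantee that the user-visible message never leaks paths is to derive it from the code alone and never from the detail text. The sketch below illustrates that design under stated assumptions: the numeric values, the message wording, and the function names `user_message` and `split_for_user` are hypothetical, and only the `IO_NETWORK` and `IO_S3` names come from this commit.

```rust
// Hypothetical numeric values for the two codes named in the commit message.
pub const IO_NETWORK: u32 = 601;
pub const IO_S3: u32 = 602;

/// Returns a fixed, generic message for a code. Because the result is a
/// static string, it cannot contain bucket names, paths, or endpoint URLs.
pub fn user_message(code: u32) -> &'static str {
    match code {
        IO_NETWORK => "Network error while accessing storage",
        IO_S3 => "Object storage request failed",
        _ => "Internal error",
    }
}

/// Builds the (msg, detail) pair: msg is generic, detail keeps full context
/// for log files and bug reports.
pub fn split_for_user(code: u32, detail: &str) -> (String, String) {
    (user_message(code).to_string(), detail.to_string())
}
```

Calling `split_for_user(IO_S3, "GET s3://private-bucket/data.parquet failed")` yields a message with no trace of the bucket, while the second element still carries the full text.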

Labels: dismiss-release-notes, build:benchmark

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>