<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->

# Making Storage a Trait in Iceberg-rust

## Background

### Existing Implementation
The existing code implements storage functionality through a concrete `Storage` enum that handles the different storage backends (S3, local filesystem, GCS, etc.). This implementation is tightly coupled to OpenDAL as the underlying storage layer. The `FileIO` struct wraps this `Storage` enum and provides a high-level API for file operations.

Original structure:

- **FileIO:** Main interface for file operations
- **Storage:** Enum with variants for different storage backends
- **InputFile / OutputFile:** Concrete structs for reading and writing files

All storage operations are implemented directly in these concrete types, making it hard to extend or customize the storage layer without modifying the core codebase.

### Problem Statement
The original design has several limitations:

- **Tight Coupling** – All storage logic depends on OpenDAL, limiting flexibility. Users cannot easily opt into other storage implementations such as `object_store`
- **Customization Barriers** – Users cannot easily add custom behaviors or optimizations

As discussed in Issue #1314, making Storage a trait would allow pluggable storage and better integration with existing systems.

## Design

### New Architecture

The new architecture uses trait-based abstractions with a registry pattern and separate storage crates:

```
┌─────────────────────────────────────────────────────┐
│ crates/iceberg/src/io/ │
│ ┌───────────────────────────────────────────────┐ │
│ │ Storage Trait & Registry │ │
│ │ - pub trait Storage │ │
│ │ - pub trait StorageBuilder │ │
│ │ - pub struct StorageBuilderRegistry │ │
│ │ - pub struct InputFile │ │
│ │ - pub struct OutputFile │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
┌───────────────┼───────────────┬───────────────┐
│ │ │ │
┌───────┴────────────┐ │ ┌────────────┴──────┐ ┌────┴──────────┐
│ crates/storage/ │ │ │ crates/storage/ │ │ Third-Party │
│ opendal/ │ │ │ object_store/ │ │ Crates │
│ ┌────────────────┐ │ │ │ ┌───────────────┐ │ │ ┌───────────┐ │
│ │ opendal-s3 │ │ │ │ │ objstore-s3 │ │ │ │ custom │ │
│ │ impl Storage │ │ │ │ │ impl Storage │ │ │ │ storage │ │
│ │ impl Builder │ │ │ │ │ impl Builder │ │ │ │impl traits│ │
│ └────────────────┘ │ │ │ └───────────────┘ │ │ └───────────┘ │
│ ┌────────────────┐ │ │ │ ┌───────────────┐ │ └───────────────┘
│ │ opendal-fs │ │ │ │ │ objstore-gcs │ │
│ │ impl Storage │ │ │ │ │ impl Storage │ │
│ │ impl Builder │ │ │ │ │ impl Builder │ │
│ └────────────────┘ │ │ │ └───────────────┘ │
│ ┌────────────────┐ │ │ │ ┌───────────────┐ │
│ │ opendal-gcs │ │ │ │ │ objstore-azure│ │
│ │ impl Storage │ │ │ │ │ impl Storage │ │
│ │ impl Builder │ │ │ │ │ impl Builder │ │
│ └────────────────┘ │ │ │ └───────────────┘ │
│ ... (oss, azure, │ │ └───────────────────┘
│ memory) │ │
└────────────────────┘ │
```

### Storage Trait
The Storage trait defines the interface for all storage operations. This implementation uses Option 2 from the initial design: Storage as a trait with concrete `InputFile` and `OutputFile` structs.

```rust
#[async_trait]
pub trait Storage: Debug + Send + Sync {
    // File existence and metadata
    async fn exists(&self, path: &str) -> Result<bool>;
    async fn metadata(&self, path: &str) -> Result<FileMetadata>;

    // Reading operations
    async fn read(&self, path: &str) -> Result<Bytes>;
    async fn reader(&self, path: &str) -> Result<Box<dyn FileRead>>;

    // Writing operations
    async fn write(&self, path: &str, bs: Bytes) -> Result<()>;
    async fn writer(&self, path: &str) -> Result<Box<dyn FileWrite>>;

    // Deletion operations
    async fn delete(&self, path: &str) -> Result<()>;
    async fn remove_dir_all(&self, path: &str) -> Result<()>;

    // File object creation
    fn new_input(&self, path: &str) -> Result<InputFile>;
    fn new_output(&self, path: &str) -> Result<OutputFile>;
}
```

### InputFile and OutputFile Structs

`InputFile` and `OutputFile` are concrete structs that contain a reference to the storage:

```rust
pub struct InputFile {
    pub storage: Arc<dyn Storage>,
    pub path: String,
}

pub struct OutputFile {
    pub storage: Arc<dyn Storage>,
    pub path: String,
}
```

Methods on `InputFile` and `OutputFile` delegate operations to the underlying `Storage`:

```rust
impl InputFile {
    pub async fn exists(&self) -> Result<bool> {
        self.storage.exists(&self.path).await
    }

    pub async fn read(&self) -> Result<Bytes> {
        self.storage.read(&self.path).await
    }

    // ... other methods
}
```

Benefits of this approach:

- Simpler and easier to maintain
- Less trait object overhead
- Clear delegation pattern
- Sufficient flexibility for most use cases

## StorageBuilder and Registry

### StorageBuilder Trait
The `StorageBuilder` trait defines how storage backends are constructed:

```rust
pub trait StorageBuilder: Debug + Send + Sync {
    fn build(
        &self,
        props: HashMap<String, String>,
        extensions: Extensions,
    ) -> Result<Arc<dyn Storage>>;
}
```

Key design decisions:

- Uses `&self` instead of `self` - builders are reusable
- No associated type - returns `Arc<dyn Storage>` directly
- `Send + Sync` for thread safety

### StorageBuilderRegistry
The registry manages storage builders and provides lookup by scheme:

```rust
#[derive(Debug, Clone)]
pub struct StorageBuilderRegistry {
    builders: HashMap<String, Arc<dyn StorageBuilder>>,
}

impl StorageBuilderRegistry {
    pub fn new() -> Self { /* ... */ }
    pub fn register(&mut self, scheme: impl Into<String>, builder: Arc<dyn StorageBuilder>);
    pub fn get_builder(&self, scheme: &str) -> Result<Arc<dyn StorageBuilder>>;
    pub fn supported_types(&self) -> Vec<String>;
}
```

Features:

- Automatic registration based on enabled cargo features
- Runtime registration of custom builders
- Case-insensitive scheme lookup
- Thread-safe and cloneable

## Example Usage

```rust
use iceberg::io::FileIOBuilder;

// Basic usage (same as the existing code)
let file_io = FileIOBuilder::new("s3")
    .with_prop("s3.region", "us-west-2")
    .build()?;

// Custom storage registration
use iceberg::io::{StorageBuilderRegistry, StorageBuilder};

let mut registry = StorageBuilderRegistry::new();
registry.register("custom", Arc::new(MyCustomStorageBuilder));

// Check supported types
println!("Supported: {:?}", registry.supported_types());
```

## Storage Implementations
Each storage backend has its own implementation:

| Storage | File | Struct | Builder | Schemes |
|---------|------|--------|---------|---------|
| S3 | `storage_s3.rs` | `OpenDALS3Storage` | `OpenDALS3StorageBuilder` | `s3`, `s3a` |
| GCS | `storage_gcs.rs` | `OpenDALGcsStorage` | `OpenDALGcsStorageBuilder` | `gs`, `gcs` |
| OSS | `storage_oss.rs` | `OpenDALOssStorage` | `OpenDALOssStorageBuilder` | `oss` |
| Azure | `storage_azdls.rs` | `OpenDALAzdlsStorage` | `OpenDALAzdlsStorageBuilder` | `abfs`, `abfss`, `wasb`, `wasbs` |
| Filesystem | `storage_fs.rs` | `OpenDALFsStorage` | `OpenDALFsStorageBuilder` | `file`, `""` |
| Memory | `storage_memory.rs` | `OpenDALMemoryStorage` | `OpenDALMemoryStorageBuilder` | `memory` |

### Implementation Pattern
Each storage follows this consistent pattern:

```rust
// 1. Storage struct with configuration
#[derive(Debug, Clone)]
pub struct OpenDALXxxStorage {
    config: Arc<XxxConfig>,
}

// 2. Helper method to create operator
impl OpenDALXxxStorage {
    fn create_operator<'a>(&self, path: &'a str) -> Result<(Operator, &'a str)> {
        // Create OpenDAL operator with retry layer
    }
}

// 3. Storage trait implementation
#[async_trait]
impl Storage for OpenDALXxxStorage {
    async fn exists(&self, path: &str) -> Result<bool> { /* ... */ }
    async fn metadata(&self, path: &str) -> Result<FileMetadata> { /* ... */ }

    // ... implement all 10 methods
}

// 4. Builder struct
#[derive(Debug)]
pub struct OpenDALXxxStorageBuilder;

// 5. Builder trait implementation
impl StorageBuilder for OpenDALXxxStorageBuilder {
    fn build(&self, props, extensions) -> Result<Arc<dyn Storage>> {
        // Parse configuration and create storage
    }
}
```

### Component Flow

```
User Code
    │
    ▼
FileIOBuilder::new("s3")
    │
    ▼
FileIOBuilder::build()
    │
    ▼
StorageBuilderRegistry::new()
    │
    ▼
registry.get_builder("s3")
    │
    ▼
OpenDALS3StorageBuilder
    │
    ▼
builder.build(props, extensions)
    │
    ▼
OpenDALS3Storage (implements Storage)
    │
    ▼
FileIO { inner: Arc<dyn Storage> }
```

## Custom Storage Implementation
To implement a custom storage backend:

```rust
use std::collections::HashMap;
use std::sync::Arc;

use async_trait::async_trait;
use bytes::Bytes;
use iceberg::io::{
    Storage, StorageBuilder, Extensions, FileMetadata,
    FileRead, FileWrite, InputFile, OutputFile,
};
use iceberg::Result;

// 1. Define your storage struct
#[derive(Debug, Clone)]
struct MyCustomStorage {
    // Your configuration
}

// 2. Implement the Storage trait
#[async_trait]
impl Storage for MyCustomStorage {
    async fn exists(&self, path: &str) -> Result<bool> {
        // Your implementation
    }
    // ... implement all 10 methods
}

// 3. Define your builder
#[derive(Debug)]
struct MyCustomStorageBuilder;

// 4. Implement StorageBuilder
impl StorageBuilder for MyCustomStorageBuilder {
    fn build(
        &self,
        props: HashMap<String, String>,
        extensions: Extensions,
    ) -> Result<Arc<dyn Storage>> {
        // Parse configuration from props
        Ok(Arc::new(MyCustomStorage::new(props)?))
    }
}

// 5. Register your builder
let mut registry = StorageBuilderRegistry::new();
registry.register("custom", Arc::new(MyCustomStorageBuilder));
```

## Implementation Plan

- **Phase 1 (Initial Implementation):**
  - Define the core `Storage` trait and the concrete `InputFile`/`OutputFile` structs
  - Implement `StorageBuilder` and `StorageBuilderRegistry`
  - Refactor the OpenDAL-based storage to implement the traits
- **Phase 2:** Split storage backends into separate crates (`crates/storage/opendal`)
- **Phase 3:** Replace `FileIO` with `StorageBuilderRegistry` in `Catalog` structs
- **Phase 4:** Add `object_store` and other backends