A lightweight xarray-like object for building dataset metadata specifications.
Define your dataset structure, metadata, and encoding before creating the actual data arrays. Perfect for planning datasets, generating templates, and ensuring CF compliance.
- 📋 Metadata-first design - Define structure before data
- 🔄 xarray compatibility - Convert to/from xarray.Dataset
- ✅ CF compliance - Community standards via cf_xarray integration
- 📥 ncdump import - Create from `ncdump -h` output
- 📂 Multi-file support - Track and query multiple NetCDF files
- 📊 Smart data generation - Populate with realistic random data
- 📝 History tracking - Record and replay all operations
- 💾 Multiple formats - Export to YAML, JSON, Zarr, NetCDF
- 🗂️ Intake catalogs - Export and import Intake catalog YAML files
- 🎯 Validation - Catch errors before expensive operations
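To make the metadata-first and validation ideas concrete, here is a minimal sketch in plain Python that does not depend on dummyxarray. The `Spec` class and its `validate` method are hypothetical names used only for illustration, not the library's API. The sketch only shows how a dataset structure can be checked for consistency before any array is allocated.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical stand-in for a metadata-only dataset description."""

    dims: dict = field(default_factory=dict)       # dimension name -> size
    variables: dict = field(default_factory=dict)  # variable name -> list of dimension names

    def validate(self):
        """Return a list of problems; an empty list means the spec is consistent."""
        problems = []
        for name, var_dims in self.variables.items():
            for dim in var_dims:
                if dim not in self.dims:
                    problems.append(f"{name}: unknown dimension '{dim}'")
        return problems


# "lon" is used by the variable but was never declared as a dimension.
spec = Spec(
    dims={"time": 12, "lat": 180},
    variables={"temperature": ["time", "lat", "lon"]},
)
problems = spec.validate()
print(problems)
```

Because the check runs on names and sizes alone, a mistake like the missing `lon` dimension surfaces immediately rather than after data has been generated or written.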
# Using pixi (recommended)
pixi install
# Using pip
pip install -r requirements.txt

from dummyxarray import DummyDataset
# Create dataset structure
ds = DummyDataset()
ds.assign_attrs(Conventions="CF-1.8", title="My Dataset")
# Add dimensions and coordinates
ds.add_dim("time", 12)
ds.add_dim("lat", 180)
ds.add_dim("lon", 360)
ds.add_coord("time", dims=["time"], attrs={"units": "days since 2000-01-01"})
ds.add_coord("lat", dims=["lat"], attrs={"units": "degrees_north"})
ds.add_coord("lon", dims=["lon"], attrs={"units": "degrees_east"})
# Add variable with encoding
ds.add_variable(
    "temperature",
    dims=["time", "lat", "lon"],
    attrs={"standard_name": "air_temperature", "units": "K"},
    encoding={"dtype": "float32", "chunks": (6, 32, 64)}
)
# Infer CF axis attributes
ds.infer_axis()
ds.set_axis_attributes()
# Validate CF compliance
result = ds.validate_cf()
# Populate with realistic data
ds.populate_with_random_data(seed=42)
# Convert to xarray or export
xr_ds = ds.to_xarray()
ds.to_zarr("output.zarr")
ds.save_yaml("template.yaml")
# Export to Intake catalog
catalog_yaml = ds.to_intake_catalog(
    name="my_dataset",
    description="My climate dataset",
    driver="zarr",
    data_path="data/my_dataset.zarr"
)
ds.save_intake_catalog("catalog.yaml", name="my_dataset")
# Import from Intake catalog
loaded_ds = DummyDataset.from_intake_catalog("catalog.yaml", "my_dataset")

- Dataset Planning - Define structure and metadata before generating data
- Template Generation - Create reusable dataset specifications
- CF Compliance - Ensure metadata follows CF conventions
- Testing - Generate realistic test datasets quickly
- Documentation - Export human-readable dataset specifications
- Data Cataloging - Create Intake catalogs for dataset discovery and access
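For the CF compliance use case, it can help to see how an axis might be inferred from a coordinate's `units` attribute. The sketch below is an assumption about one plausible approach, based on the unit spellings the CF conventions accept for latitude and longitude and on the "period since reference" form of time units. It is not dummyxarray's implementation of `infer_axis()`, and `guess_axis` is a hypothetical helper.

```python
import re

# Unit spellings the CF conventions accept for latitude and longitude.
LAT_UNITS = {"degrees_north", "degree_north", "degree_N", "degrees_N", "degreeN", "degreesN"}
LON_UNITS = {"degrees_east", "degree_east", "degree_E", "degrees_E", "degreeE", "degreesE"}
# Time units have the form "<period> since <reference date>".
TIME_UNITS = re.compile(r"^\s*\w+\s+since\s+\S+", re.IGNORECASE)


def guess_axis(attrs):
    """Return "X", "Y", "T", or None from the "units" attribute (hypothetical helper)."""
    units = attrs.get("units", "")
    if units in LAT_UNITS:
        return "Y"
    if units in LON_UNITS:
        return "X"
    if TIME_UNITS.match(units):
        return "T"
    return None


# Same coordinate attributes as in the quick start example.
coords = {
    "time": {"units": "days since 2000-01-01"},
    "lat": {"units": "degrees_north"},
    "lon": {"units": "degrees_east"},
}
axes = {name: guess_axis(attrs) for name, attrs in coords.items()}
print(axes)
```

A units-only heuristic like this cannot identify vertical axes or coordinates without units, which is why real CF tooling also looks at attributes such as `standard_name` and `positive`.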
Full documentation is available at https://siligam.github.io/dummyxarray/
- Getting Started - Quick start guide
- User Guide - Detailed usage examples
- API Reference - Complete API documentation
- Examples - Working code examples
- CF Compliance - CF convention support
# Run tests
pixi run test
# Format code
pixi run format
# Lint code
pixi run lint
# Run all checks
pixi run check

See CONTRIBUTING.md for development guidelines.
MIT License - see LICENSE for details.