diff --git a/docs/docker-model-runner.md b/docs/docker-model-runner.md new file mode 100644 index 0000000..26da622 --- /dev/null +++ b/docs/docker-model-runner.md @@ -0,0 +1,124 @@ +# Using Docker Model Runner with ModelPack + +This guide shows you how to use [Docker Model Runner](https://docs.docker.com/desktop/features/model-runner/) to pull and run AI models packaged using the ModelPack specification. Note that runtime compatibility depends on the model format and inference engine combinations that Docker Model Runner supports (currently GGUF models). + +## What is Docker Model Runner? + +Docker Model Runner is a built-in feature of Docker Desktop that enables pulling, managing, and running AI models directly from OCI registries. Since [v1.0.19](https://github.com/docker/model-runner/releases/tag/v1.0.19), it can detect and convert ModelPack-formatted OCI artifacts, allowing you to use ModelPack-packaged models that are in a supported format (e.g., GGUF) without any additional tools. + +## Prerequisites + +- [Docker Desktop](https://docs.docker.com/get-docker/) 4.40 or later with Model Runner enabled +- A ModelPack-compatible model pushed to an OCI registry (see [modctl](./modctl.md) or [AIKit](./aikit.md) for packaging) + +## Enable Docker Model Runner + +Docker Model Runner is available through Docker Desktop. Enable it in Docker Desktop settings: + +1. Open Docker Desktop +2. Go to **Settings** > **Features in development** +3. Enable **Docker Model Runner** + +You can verify it is enabled by running: + +```bash +docker model list +``` + +## Pull a ModelPack Model + +Docker Model Runner can pull models directly from OCI registries. When pulling a ModelPack-formatted artifact, Docker automatically detects the ModelPack config format and converts it for local use. + +```bash +# Pull a model from an OCI registry +docker model pull myregistry.com/mymodel:v1.0 +``` + +## Run a Model + +Once pulled, you can run inference using the model. 
Note that `docker model run` does not pull automatically, so pull the model first: + +```bash +# First pull the model (required before running) +docker model pull myregistry.com/mymodel:v1.0 + +# Run a model interactively +docker model run myregistry.com/mymodel:v1.0 + +# Send a prompt to the model +docker model run myregistry.com/mymodel:v1.0 "Explain cloud-native computing" +``` + +## List and Manage Models + +```bash +# List all downloaded models +docker model list + +# Remove a model +docker model rm myregistry.com/mymodel:v1.0 +``` + +## Use Models via the OpenAI-Compatible API + +Docker Model Runner exposes an OpenAI-compatible API endpoint, enabling integration with existing tools and libraries: + +```bash +curl http://localhost:12434/engines/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{ + "model": "myregistry.com/mymodel:v1.0", + "messages": [{"role": "user", "content": "Hello!"}] + }' +``` + +## How ModelPack Format Is Detected + +Docker Model Runner identifies a ModelPack artifact by checking the OCI config blob for any of the following fields: + +- `config.paramSize` — the model parameter size + +- `descriptor.createdAt` — the model creation timestamp + +- `modelfs` — the model filesystem descriptor + +If any of these fields are present, the artifact is recognized as a ModelPack-formatted model. + +## Field Mapping: ModelPack to Docker + +When Docker Model Runner pulls a ModelPack model, it converts the config to Docker's internal format. 
Some fields are remapped to different names or locations: + +**Fields that are renamed (breaking changes if modified):** + +| ModelPack Field | Docker Field | Notes | +|---|---|---| +| `descriptor.createdAt` | `created` (top-level) | Moved out of `descriptor` and renamed | +| `config.paramSize` | `parameters` (top-level) | Moved out of `config` and renamed | +| `modelfs` (top-level object) | `rootfs` (top-level object) | Renamed at top level | + +**Fields that are passed through unchanged (within their parent objects):** + +| ModelPack Field | Docker Field | Notes | +|---|---|---| +| `descriptor.name` | `descriptor.name` | Kept in `descriptor` object | +| `descriptor.family` | `descriptor.family` | Kept in `descriptor` object | +| `descriptor.description` | `descriptor.description` | Kept in `descriptor` object | +| `descriptor.licenses` | `descriptor.licenses` | Kept in `descriptor` object | +| `config.format` | `config.format` | Kept in `config` object | +| `config.quantization` | `config.quantization` | Kept in `config` object | +| `config.architecture` | `config.architecture` | Kept in `config` object | + +## Media Type Mapping + +ModelPack media types are converted to Docker's internal media types: + +| ModelPack Media Type | Docker Media Type | +|---|---| +| `application/vnd.cncf.model.weight.v1.raw` | Mapped based on file extension (e.g., `.gguf` → `application/vnd.docker.ai.gguf.v3`) | +| `application/vnd.cncf.model.weight.v1.tar+gzip` | `application/vnd.docker.ai.gguf.v3+gzip` | +| `application/vnd.cncf.model.weight.config.v1.raw` | `application/vnd.docker.ai.config` | +| `application/vnd.cncf.model.doc.v1.raw` | `application/vnd.docker.ai.doc` | + +## Next Steps + +- **Package models** using [modctl](./modctl.md) or [AIKit](./aikit.md) to create ModelPack artifacts +- **Learn about the [Model CSI Driver](https://github.com/modelpack/model-csi-driver)** for Kubernetes integration +- **Read the [full ModelPack specification](./spec.md)** for technical 
implementation details diff --git a/docs/getting-started.md b/docs/getting-started.md index b83ddfa..cf03fe5 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -35,6 +35,7 @@ This section lists the core infrastructure components that ModelPack is working - **[modctl](https://github.com/modelpack/modctl)**: CLI tool for building, pushing, pulling, and managing OCI model artifacts - **[KitOps](https://kitops.ml/)**: ModelKit packaging and deployment platform that supports the ModelPack specification - **[AIKit](https://kaito-project.github.io/aikit/docs/packaging)**: Package AI models as OCI artifacts from local, HTTP, or Hugging Face sources with extensible formats, including ModelPack specification +- **[Docker Model Runner](https://docs.docker.com/desktop/features/model-runner/)**: Pull and run ModelPack models directly from OCI registries using Docker Desktop ### Kubernetes Integration @@ -65,6 +66,7 @@ The ModelPack specification can be used with different tools depending on your n - **[modctl](./modctl.md)**: CLI tool for building, pushing, pulling, and managing OCI model artifacts. Great for command-line workflows and CI/CD pipelines. - **[AIKit](./aikit.md)**: Package AI models as OCI artifacts from local, HTTP, or Hugging Face sources with extensible formats. - **[KitOps](https://kitops.ml/)**: ModelKit packaging and deployment platform that supports the ModelPack specification. +- **[Docker Model Runner](./docker-model-runner.md)**: Pull and run ModelPack models directly from OCI registries using Docker Desktop. ### Install Model CSI Driver @@ -100,7 +102,7 @@ This example shows how to mount a model artifact directly into a Kubernetes pod ## Next Steps -1. **Get hands-on experience**: Follow the step-by-step guides for [modctl](./modctl.md) or [AIKit](./aikit.md) +1. **Get hands-on experience**: Follow the step-by-step guides for [modctl](./modctl.md), [AIKit](./aikit.md), or [Docker Model Runner](./docker-model-runner.md) 2. 
**Explore the [full ModelPack specification](./spec.md)** for technical implementation details 3. **Join the community** on [CNCF Slack #modelpack](https://cloud-native.slack.com/archives/C07T0V480LF) 4. **Contribute** to the ModelPack project - see our [contributing guidelines](../CONTRIBUTING.md) @@ -114,5 +116,6 @@ This example shows how to mount a model artifact directly into a Kubernetes pod - [KitOps](https://kitops.ml/) - [Hugging Face](https://huggingface.co/) - [AIKit](https://github.com/kaito-project/aikit) +- [Docker Model Runner](https://docs.docker.com/desktop/features/model-runner/) The ModelPack specification represents the next evolution in infrastructure standardization, bringing the benefits of containerization to AI model management. Start with the basics, explore the ecosystem, and join our growing community of contributors and users building the future of cloud-native AI. diff --git a/schema/compat_test.go b/schema/compat_test.go new file mode 100644 index 0000000..ef7501f --- /dev/null +++ b/schema/compat_test.go @@ -0,0 +1,344 @@ +/* + * Copyright 2025 The CNCF ModelPack Authors + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package schema_test + +import ( + "encoding/json" + "strings" + "testing" + "time" + + v1 "github.com/modelpack/model-spec/specs-go/v1" + digest "github.com/opencontainers/go-digest" +) + +// TestDownstreamDetectionFields verifies that a fully populated ModelPack config +// serializes to JSON containing the fields that downstream consumers (e.g., +// Docker Model Runner) use to detect ModelPack artifacts: +// - "config" containing "paramSize" +// - "descriptor" containing "createdAt" +// - "modelfs" as a top-level key +// +// If any of these field names change, downstream detection will break. +func TestDownstreamDetectionFields(t *testing.T) { + now := time.Now().UTC() + boolTrue := true + + model := v1.Model{ + Descriptor: v1.ModelDescriptor{ + CreatedAt: &now, + Name: "test-model", + Family: "llama3", + Version: "1.0", + }, + ModelFS: v1.ModelFS{ + Type: "layers", + DiffIDs: []digest.Digest{"sha256:abc123"}, + }, + Config: v1.ModelConfig{ + Architecture: "transformer", + Format: "gguf", + ParamSize: "8b", + Precision: "fp16", + Quantization: "q4_0", + Capabilities: &v1.ModelCapabilities{ + InputTypes: []v1.Modality{v1.TextModality}, + OutputTypes: []v1.Modality{v1.TextModality}, + Reasoning: &boolTrue, + }, + }, + } + + data, err := json.Marshal(model) + if err != nil { + t.Fatalf("failed to marshal Model: %v", err) + } + + var raw map[string]json.RawMessage + if err := json.Unmarshal(data, &raw); err != nil { + t.Fatalf("failed to unmarshal to map: %v", err) + } + + // Top-level keys that downstream consumers depend on. + for _, key := range []string{"descriptor", "modelfs", "config"} { + if _, ok := raw[key]; !ok { + t.Errorf("top-level key %q missing from serialized Model JSON — downstream detection will break", key) + } + } + + // Verify descriptor contains "createdAt". 
+ var desc map[string]json.RawMessage + if err := json.Unmarshal(raw["descriptor"], &desc); err != nil { + t.Fatalf("failed to unmarshal descriptor: %v", err) + } + if _, ok := desc["createdAt"]; !ok { + t.Error("descriptor missing \"createdAt\" field — downstream detection will break") + } + + // Verify config contains "paramSize". + var cfg map[string]json.RawMessage + if err := json.Unmarshal(raw["config"], &cfg); err != nil { + t.Fatalf("failed to unmarshal config: %v", err) + } + if _, ok := cfg["paramSize"]; !ok { + t.Error("config missing \"paramSize\" field — downstream detection will break") + } +} + +// TestDownstreamFieldMapping verifies that every config field used by +// downstream consumers maps to the expected JSON key name. These mappings +// MUST NOT change without coordinating with downstream projects. +func TestDownstreamFieldMapping(t *testing.T) { + now := time.Now().UTC() + boolTrue := true + + model := v1.Model{ + Descriptor: v1.ModelDescriptor{ + CreatedAt: &now, + Name: "test-model", + Family: "llama3", + Description: "A test model", + Licenses: []string{"Apache-2.0"}, + }, + ModelFS: v1.ModelFS{ + Type: "layers", + DiffIDs: []digest.Digest{"sha256:abc123"}, + }, + Config: v1.ModelConfig{ + ParamSize: "8b", + Format: "gguf", + Quantization: "q4_0", + Architecture: "transformer", + Capabilities: &v1.ModelCapabilities{ + InputTypes: []v1.Modality{v1.TextModality}, + OutputTypes: []v1.Modality{v1.TextModality}, + Reasoning: &boolTrue, + }, + }, + } + + data, err := json.Marshal(model) + if err != nil { + t.Fatalf("failed to marshal: %v", err) + } + + // Parse into nested maps to check field names at each level. + var full map[string]json.RawMessage + if err := json.Unmarshal(data, &full); err != nil { + t.Fatalf("failed to unmarshal: %v", err) + } + + // Descriptor field mappings used by downstream consumers. 
+ var desc map[string]json.RawMessage + if err := json.Unmarshal(full["descriptor"], &desc); err != nil { + t.Fatalf("failed to unmarshal descriptor: %v", err) + } + descFields := []string{"createdAt", "name", "family", "description", "licenses"} + for _, f := range descFields { + if _, ok := desc[f]; !ok { + t.Errorf("descriptor missing expected field %q", f) + } + } + + // Config field mappings used by downstream consumers. + var cfg map[string]json.RawMessage + if err := json.Unmarshal(full["config"], &cfg); err != nil { + t.Fatalf("failed to unmarshal config: %v", err) + } + cfgFields := []string{"paramSize", "format", "quantization", "architecture", "capabilities"} + for _, f := range cfgFields { + if _, ok := cfg[f]; !ok { + t.Errorf("config missing expected field %q", f) + } + } + + // ModelFS field mappings used by downstream consumers. + var mfs map[string]json.RawMessage + if err := json.Unmarshal(full["modelfs"], &mfs); err != nil { + t.Fatalf("failed to unmarshal modelfs: %v", err) + } + mfsFields := []string{"type", "diffIds"} + for _, f := range mfsFields { + if _, ok := mfs[f]; !ok { + t.Errorf("modelfs missing expected field %q", f) + } + } +} + +// TestDownstreamMediaTypePrefixes verifies that the ModelPack media type +// constants use the expected prefix that downstream consumers rely on for +// layer type detection. +func TestDownstreamMediaTypePrefixes(t *testing.T) { + prefix := "application/vnd.cncf.model." 
+ + mediaTypes := []struct { + name string + value string + }{ + {"MediaTypeModelConfig", v1.MediaTypeModelConfig}, + {"MediaTypeModelWeightRaw", v1.MediaTypeModelWeightRaw}, + {"MediaTypeModelWeight", v1.MediaTypeModelWeight}, + {"MediaTypeModelWeightGzip", v1.MediaTypeModelWeightGzip}, + {"MediaTypeModelWeightZstd", v1.MediaTypeModelWeightZstd}, + {"MediaTypeModelWeightConfigRaw", v1.MediaTypeModelWeightConfigRaw}, + {"MediaTypeModelWeightConfig", v1.MediaTypeModelWeightConfig}, + {"MediaTypeModelWeightConfigGzip", v1.MediaTypeModelWeightConfigGzip}, + {"MediaTypeModelWeightConfigZstd", v1.MediaTypeModelWeightConfigZstd}, + {"MediaTypeModelDocRaw", v1.MediaTypeModelDocRaw}, + {"MediaTypeModelDoc", v1.MediaTypeModelDoc}, + {"MediaTypeModelDocGzip", v1.MediaTypeModelDocGzip}, + {"MediaTypeModelDocZstd", v1.MediaTypeModelDocZstd}, + {"MediaTypeModelCodeRaw", v1.MediaTypeModelCodeRaw}, + {"MediaTypeModelCode", v1.MediaTypeModelCode}, + {"MediaTypeModelCodeGzip", v1.MediaTypeModelCodeGzip}, + {"MediaTypeModelCodeZstd", v1.MediaTypeModelCodeZstd}, + {"MediaTypeModelDatasetRaw", v1.MediaTypeModelDatasetRaw}, + {"MediaTypeModelDataset", v1.MediaTypeModelDataset}, + {"MediaTypeModelDatasetGzip", v1.MediaTypeModelDatasetGzip}, + {"MediaTypeModelDatasetZstd", v1.MediaTypeModelDatasetZstd}, + } + + for _, mt := range mediaTypes { + t.Run(mt.name, func(t *testing.T) { + if !strings.HasPrefix(mt.value, prefix) { + t.Errorf("%s = %q, does not have expected prefix %q", mt.name, mt.value, prefix) + } + }) + } +} + +// TestDownstreamWeightMediaTypes verifies the exact media type strings for +// model weight layers that downstream consumers use for format detection +// and conversion. 
+func TestDownstreamWeightMediaTypes(t *testing.T) { + tests := []struct { + name string + got string + expected string + }{ + { + name: "raw weight", + got: v1.MediaTypeModelWeightRaw, + expected: "application/vnd.cncf.model.weight.v1.raw", + }, + { + name: "tar weight", + got: v1.MediaTypeModelWeight, + expected: "application/vnd.cncf.model.weight.v1.tar", + }, + { + name: "gzip weight", + got: v1.MediaTypeModelWeightGzip, + expected: "application/vnd.cncf.model.weight.v1.tar+gzip", + }, + { + name: "zstd weight", + got: v1.MediaTypeModelWeightZstd, + expected: "application/vnd.cncf.model.weight.v1.tar+zstd", + }, + { + name: "raw weight config", + got: v1.MediaTypeModelWeightConfigRaw, + expected: "application/vnd.cncf.model.weight.config.v1.raw", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + if tt.got != tt.expected { + t.Errorf("got %q, want %q — changing this value will break downstream media type mapping", tt.got, tt.expected) + } + }) + } +} + +// TestDownstreamArtifactType verifies the artifact type constant used in OCI +// manifests. Downstream consumers match on this to identify ModelPack manifests. +func TestDownstreamArtifactType(t *testing.T) { + expected := "application/vnd.cncf.model.manifest.v1+json" + if v1.ArtifactTypeModelManifest != expected { + t.Errorf("ArtifactTypeModelManifest = %q, want %q — changing this will break downstream manifest detection", v1.ArtifactTypeModelManifest, expected) + } +} + +// TestDownstreamRoundTrip verifies that a ModelPack config can be marshalled +// and unmarshalled without losing any fields. This ensures that downstream +// consumers can reliably parse configs produced by ModelPack tooling. 
+func TestDownstreamRoundTrip(t *testing.T) { + now := time.Date(2025, 6, 15, 12, 0, 0, 0, time.UTC) + boolTrue := true + boolFalse := false + + original := v1.Model{ + Descriptor: v1.ModelDescriptor{ + CreatedAt: &now, + Authors: []string{"CNCF ModelPack Authors"}, + Family: "llama3", + Name: "llama3-8b-instruct", + DocURL: "https://example.com/docs", + SourceURL: "https://example.com/source", + DatasetsURL: []string{"https://example.com/dataset1"}, + Version: "3.1", + Revision: "abc123", + Vendor: "Example Corp", + Licenses: []string{"Apache-2.0"}, + Title: "Llama 3 8B Instruct", + Description: "An instruction-tuned language model", + }, + ModelFS: v1.ModelFS{ + Type: "layers", + DiffIDs: []digest.Digest{ + "sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef", + }, + }, + Config: v1.ModelConfig{ + Architecture: "transformer", + Format: "gguf", + ParamSize: "8b", + Precision: "fp16", + Quantization: "q4_0", + Capabilities: &v1.ModelCapabilities{ + InputTypes: []v1.Modality{v1.TextModality, v1.ImageModality}, + OutputTypes: []v1.Modality{v1.TextModality}, + Reasoning: &boolTrue, + ToolUsage: &boolFalse, + Reward: &boolFalse, + Languages: []string{"en", "fr", "zh"}, + }, + }, + } + + data, err := json.Marshal(original) + if err != nil { + t.Fatalf("marshal failed: %v", err) + } + + var restored v1.Model + if err := json.Unmarshal(data, &restored); err != nil { + t.Fatalf("unmarshal failed: %v", err) + } + + // Re-marshal to compare JSON output (avoids pointer comparison issues). + data2, err := json.Marshal(restored) + if err != nil { + t.Fatalf("re-marshal failed: %v", err) + } + + if string(data) != string(data2) { + t.Errorf("round-trip JSON mismatch:\n original: %s\n restored: %s", string(data), string(data2)) + } +}
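The detection rule guarded by `TestDownstreamDetectionFields` above (and described in the new doc) can be sketched as a small, stdlib-only Go program. This is an illustrative sketch, not Docker Model Runner's actual implementation: the helper name `isModelPackConfig` is hypothetical, and only the JSON field names (`config.paramSize`, `descriptor.createdAt`, top-level `modelfs`) are taken from the detection rule documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isModelPackConfig reports whether a raw OCI config blob looks like a
// ModelPack artifact: the presence of any of config.paramSize,
// descriptor.createdAt, or a top-level modelfs key marks it as
// ModelPack-formatted. Hypothetical sketch of the documented rule, not
// Docker Model Runner's real code.
func isModelPackConfig(raw []byte) bool {
	var doc struct {
		Config struct {
			ParamSize string `json:"paramSize"`
		} `json:"config"`
		Descriptor struct {
			CreatedAt *string `json:"createdAt"`
		} `json:"descriptor"`
		ModelFS json.RawMessage `json:"modelfs"`
	}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return false
	}
	return doc.Config.ParamSize != "" ||
		doc.Descriptor.CreatedAt != nil ||
		doc.ModelFS != nil
}

func main() {
	// A minimal ModelPack-style config versus a plain OCI image config.
	modelPack := []byte(`{"descriptor":{"createdAt":"2025-06-15T12:00:00Z","name":"test-model"},"modelfs":{"type":"layers"},"config":{"paramSize":"8b","format":"gguf"}}`)
	ociImage := []byte(`{"architecture":"amd64","os":"linux","rootfs":{"type":"layers"}}`)

	fmt.Println(isModelPackConfig(modelPack)) // true
	fmt.Println(isModelPackConfig(ociImage))  // false
}
```

A pointer is used for `createdAt` so that an absent timestamp is distinguishable from an empty string, mirroring how the tests treat presence of the key (rather than its value) as the detection signal.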