diff --git a/docs/pull_hf_models.md b/docs/pull_hf_models.md index d39653be6a..86860a909e 100644 --- a/docs/pull_hf_models.md +++ b/docs/pull_hf_models.md @@ -93,3 +93,58 @@ Check [parameters page](./parameters.md) for detailed descriptions of configurat In case you want to setup model and start server in one step, follow [instructions](./starting_server.md). > **Note:** When using pull mode you need both read and write access rights to models repository. + +## Resuming an interrupted pull + +Pulling Generative AI models from Hugging Face often involves transferring multi-gigabyte LFS files (e.g. `openvino_model.bin`). To make this robust against network errors and operator interventions, OVMS pull mode persists the in-progress download state on disk and resumes from where it stopped on the next `--pull` invocation. No extra flags are required — simply re-run the same `--pull` command against the same `--model_repository_path` and OVMS will continue any partially downloaded LFS files instead of starting from scratch. + +### What is persisted + +While a pull is in flight, OVMS / libgit2 keeps the following on disk under your `--model_repository_path`: + +| Artifact | Purpose | +|---|---| +| `.lfswip` (sibling of the repository directory) | Marker file indicating that an LFS download is work-in-progress. | +| `.lfs_part` (next to each LFS-tracked file) | Partially downloaded LFS object. The next `--pull` resumes the HTTP transfer from the existing byte offset. | +| LFS pointer file (in place of the final binary) | Standard `version https://git-lfs.github.com/spec/v1` pointer that allows libgit2 to identify which OID still needs to be fetched. | + +When the LFS transfer for a file completes successfully, the `.lfs_part` file is renamed to its final name and the pointer is replaced. Once **all** LFS files are present, the `.lfswip` marker is removed and the repository is considered clean. + +### Resume after Ctrl+C / SIGINT (graceful cancel) + +Pressing **Ctrl+C** (or sending `SIGINT` / `SIGTERM` on Linux, `CTRL_BREAK_EVENT` on Windows) while `ovms --pull` is running triggers a graceful cancellation: + +1. OVMS marks the server as shutting down. +2. libgit2 clone / LFS callbacks observe the cancellation request and abort the in-flight HTTP transfer cleanly. +3. The process exits with a non-zero status code. Partial `.lfs_part` files and the `.lfswip` marker are left on disk on purpose. +4. Re-running the **same** `--pull` command resumes each partial file using HTTP `Range` requests and finishes the remaining downloads. + +This is the recommended way to interrupt a pull — it avoids corrupted partial data and lets you resume without re-downloading completed files. + +### Resume after process termination (forced kill / power loss) + +If the OVMS process is killed forcibly (`SIGKILL`, OOM killer, container stop with no grace period, host crash, power loss), the on-disk state is the same as for a graceful cancel: any LFS files that were in flight remain as `.lfs_part` plus an LFS pointer file, and the `.lfswip` marker is still present. The next `--pull` invocation: + +1. Detects the `.lfswip` marker and the leftover LFS pointer files. +2. For each affected file, opens an HTTP `Range` request starting at the current size of the corresponding `.lfs_part` file and continues the transfer. +3. Cleans up the marker once every LFS file is fully present. + +If a forced termination corrupted an in-progress write, the resumed transfer will detect the size/hash mismatch on completion and the file will be re-downloaded on a subsequent attempt. User-edited or user-deleted files are **not** restored automatically — once a `--pull` has finished successfully, OVMS treats the local repository as authoritative and will not overwrite or re-fetch files that you have modified or removed. To force OVMS to re-download a model from scratch, pass `--overwrite_models` on the next `--pull` invocation; the existing model directory under `--model_repository_path` will be replaced with a fresh download. + +### Tuning resume behavior + +The number of resume attempts per LFS file and the interval between them can be tuned via environment variables read once on process start (defaults shown): + +| Environment variable | Default | Description | +|---|---|---| +| `GIT_LFS_RESUME_ATTEMPTS` | `5` | Maximum number of resume attempts for a single LFS file before giving up. `0` disables resume. | +| `GIT_LFS_RESUME_INTERVAL_SECONDS` | `10` | Delay between consecutive resume attempts. | + +On startup OVMS logs the resolved configuration, e.g.: + +```text +[INFO] LFS resume: attempts=5 interval=10 s +``` + +> **Note:** Resume relies on the remote server honoring HTTP `Range` requests. Hugging Face Hub supports this by default; private mirrors must allow ranged GETs for resume to work. +