Add per-LLM-node generation_config.json path override (#4233) #4330
Open
exzile wants to merge 1 commit into
Adds an optional generation_config_path field to LLMCalculatorOptions so several deployments backed by the same model weights can use different generation defaults without duplicating the model directory. Mirrors how graph_path already lets one model directory back several deployments.

When unset, generation_config.json from models_path is used as before. An explicit path may be absolute or relative to models_path; a relative path is resolved against the model directory (its parent when models_path points at a file, e.g. a GGUF). An explicit path that does not exist is a load error.

A shared resolveGenerationConfigPath helper is used by both the continuous batching and legacy initializers. Adds a unit test covering default, absolute, relative, and missing-path cases.

Implements openvinotoolkit#4233

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
79ec0ce to 44a52e6
Summary
Closes #4233.
Adds an optional generation_config_path field to LLMCalculatorOptions, so several deployments backed by the same model weights can use different generation defaults without duplicating the model directory. This mirrors how graph_path already lets one model directory back several deployments with different graphs.
Behavior
Unset: generation_config.json from models_path is used, exactly as before (no change).
Set: the path may be absolute or relative to models_path (resolved against the model directory, or its parent when models_path points at a file such as a GGUF).
Set but missing: an explicit path that does not exist is a load error.
A shared resolveGenerationConfigPath helper is used by both the continuous-batching and legacy initializers, so behavior is identical across pipeline types (and the VLM CB servable, which reuses the CB initializer).
Placement rationale
Per the discussion in #4233 (and the config taxonomy from #4221):
graph_path already lets one model directory back several deployments with different node options, so putting the override in the node options keeps the whole per-deployment configuration in graph.pbtxt.
Testing
Added LLMGenerationConfigPath.ResolveGenerationConfigPath covering default, absolute, relative, and missing-path cases. Built and ran locally on Windows (MSVC).
🤖 Generated with Claude Code