Propagate global --cache_dir to continuous batching pipeline (#4230) #4329
Open
exzile wants to merge 1 commit into
Author
End-to-end verification on GPU
Beyond the unit test, I verified the actual caching behavior on an Intel Arc Pro B70 dGPU (
Run 1 (empty cache):
Run 2 (restart, populated cache):
Without the change, the cache directory stays empty on the CB path and every restart recompiles the model. With it, blobs persist and are reused across restarts as intended.
The continuous batching servable initializer constructs the GenAI ContinuousBatchingPipeline directly and never applied the server-level --cache_dir (ServerSettings.cacheDir). Unlike the non-CB path, which applies it via ModelInstance::setCacheOptions, the CB path left model compilation caching disabled unless the user duplicated the value into the node's plugin_config as CACHE_DIR. As a result, .blob/.cl_cache artifacts were never persisted and every restart fully recompiled the model.

Inject the global cache_dir into the pipeline plugin config before constructing the pipeline. An explicit CACHE_DIR in the node's plugin_config remains authoritative.

Adds a regression test (LLMNodeOptionsCacheDirPropagation) covering both propagation of the global value and precedence of an explicit node value.

Fixes openvinotoolkit#4230

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
6fd48db to c45573c
Summary
Fixes #4230.
The continuous batching servable initializer constructs the GenAI `ContinuousBatchingPipeline` directly and never applied the server-level `--cache_dir` (`ServerSettings.cacheDir`). Unlike the non-CB path, which applies it via `ModelInstance::setCacheOptions` (`ieCore.set_property(ov::cache_dir(...))`), the CB path left model-compilation caching disabled unless the user manually duplicated the value into the node's `plugin_config` as `CACHE_DIR`. As a result, `.blob`/`.cl_cache` artifacts were never persisted and every server restart fully recompiled the model.

Fix
In `ContinuousBatchingServableInitializer::initialize`, inject the global `cache_dir` into the pipeline `pluginConfig` (keyed by `ov::cache_dir.name()` == `CACHE_DIR`) right after parsing the node `plugin_config` and before constructing the pipeline. An explicit `CACHE_DIR` in the node's `plugin_config` remains authoritative.

This also applies to the VLM continuous-batching servable, which reuses this initializer.
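For reviewers who want the precedence rule at a glance, here is a minimal, self-contained sketch of the injection step. It is not the code from this PR: `PluginConfig` is a plain `std::map` standing in for the real plugin config type, `applyGlobalCacheDir` is a hypothetical helper name, and skipping an empty global value is an assumption about how an unset `--cache_dir` is represented.

```cpp
#include <map>
#include <string>

// Stand-in for the pipeline plugin config (assumption: the real code
// uses an OpenVINO property map, not std::map<std::string, std::string>).
using PluginConfig = std::map<std::string, std::string>;

// Hypothetical helper illustrating the rule described above:
// copy the server-level cache dir into the plugin config only when the
// node's plugin_config did not already provide CACHE_DIR.
void applyGlobalCacheDir(PluginConfig& pluginConfig, const std::string& globalCacheDir) {
    // Assumption: an empty string means --cache_dir was not set.
    if (globalCacheDir.empty()) {
        return;
    }
    // An explicit node-level CACHE_DIR stays authoritative.
    if (pluginConfig.count("CACHE_DIR") == 0) {
        pluginConfig["CACHE_DIR"] = globalCacheDir;
    }
}
```

Calling the helper on an empty config with a non-empty global value adds `CACHE_DIR`; calling it on a config that already contains `CACHE_DIR` leaves the node value untouched. These are the two cases the regression test is described as covering.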
Testing
Added a regression test `LLMNodeOptionsCacheDirPropagation` (run for both the CB and VLM fixtures) covering:
- `--cache_dir` is propagated into `pluginConfig["CACHE_DIR"]` when the node does not set it.
- `CACHE_DIR` in the node `plugin_config` takes precedence over the global value.

Built and ran locally on Windows (MSVC) against a real `facebook/opt-125m` continuous-batching pipeline:

🤖 Generated with Claude Code