Description
When using `ModelBuilder` (SDK v3) with a pre-built DJL LMI container image and `source_code` (via `SourceCode`) to provide a custom `requirements.txt`, the model directory `/opt/ml/model/` becomes read-only at runtime. This prevents the DJL container from downloading models from the HuggingFace Hub, because the download tries to write cache files to `/opt/ml/model/`.
Additionally, `ModelBuilder` overrides a user-provided `HF_MODEL_ID` environment variable with the value from the `model=` parameter, making it impossible to point the container at the local model path (`/opt/ml/model`) when S3 model artifacts are also provided via `s3_model_data_url`.
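The override behaves as if the builder merged its derived environment over the user's. A minimal sketch of that merge order (hypothetical function name, not the SDK's actual code):

```python
def merge_env(user_env, derived_env):
    # Values derived from ModelBuilder arguments win the merge,
    # so a user-provided HF_MODEL_ID is silently clobbered.
    merged = dict(user_env)
    merged.update(derived_env)
    return merged

user_env = {"HF_MODEL_ID": "/opt/ml/model", "OPTION_TENSOR_PARALLEL_DEGREE": "4"}
derived_env = {"HF_MODEL_ID": "chromadb/context-1"}  # taken from model=

print(merge_env(user_env, derived_env)["HF_MODEL_ID"])  # prints chromadb/context-1
```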
How to Reproduce
```python
from sagemaker.serve import ModelBuilder, ModelServer
from sagemaker.serve.builder.schema_builder import SchemaBuilder
from sagemaker.serve.mode.function_pointers import Mode
from sagemaker.serve.model_builder import SourceCode

source_code = SourceCode(
    source_dir="./model_code",
    requirements="requirements.txt",  # e.g. transformers>=4.55.0
)

mb = ModelBuilder(
    model="chromadb/context-1",  # HF Hub model ID
    role_arn=ROLE,
    image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.36.0-lmi22.0.0-cu129",
    model_server=ModelServer.DJL_SERVING,
    schema_builder=SchemaBuilder(
        {"inputs": "Hello", "parameters": {"max_new_tokens": 64}},
        [{"generated_text": "Hi"}],
    ),
    source_code=source_code,
    env_vars={"OPTION_TENSOR_PARALLEL_DEGREE": "4", ...},
    instance_type="ml.g6e.12xlarge",
    mode=Mode.SAGEMAKER_ENDPOINT,
)

model = mb.build()
endpoint = mb.deploy(endpoint_name="test", wait=True)
# FAILS: OSError: [Errno 30] Read-only file system: /opt/ml/model/models--chromadb--context-1
```
Observed Behavior
- `ModelBuilder.build()` packages the `source_code` directory into a `model.tar.gz` and uploads it to S3
- At deploy time, SageMaker mounts this tar.gz at `/opt/ml/model/`, which becomes read-only
- `ModelBuilder` sets `HF_MODEL_ID=chromadb/context-1` (from `model=`), overriding any user-provided value
- The DJL LMI container sees `HF_MODEL_ID=chromadb/context-1` and tries to download from HF Hub
- The HF Hub download tries to write its cache to `/opt/ml/model/models--chromadb--context-1/`
- The write fails with `OSError: [Errno 30] Read-only file system`

CloudWatch logs confirm:

```
OSError: [Errno 30] Read-only file system: /opt/ml/model/models--chromadb--context-1
```
Expected Behavior
Users should be able to use `ModelBuilder` with:
- A pre-built container image (e.g. DJL LMI)
- `source_code` with a custom `requirements.txt` to install additional dependencies at container startup
- A HuggingFace Hub model ID that the container downloads at runtime

The `requirements.txt` installation should not make `/opt/ml/model/` read-only, or the HF Hub cache should be redirected to a writable location (e.g. `/tmp`).
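For the second option, the redirect could be as simple as pointing the standard `huggingface_hub` cache variables at a writable directory before any download starts. A sketch that assumes the container honors `HF_HOME` and `HF_HUB_CACHE`:

```python
import os
import tempfile

# Redirect the HuggingFace cache to a writable location. This must happen
# before huggingface_hub/transformers resolve their cache paths, since those
# are typically read once at import time.
cache_root = os.path.join(tempfile.gettempdir(), "hf_home")
os.makedirs(os.path.join(cache_root, "hub"), exist_ok=True)
os.environ["HF_HOME"] = cache_root
os.environ["HF_HUB_CACHE"] = os.path.join(cache_root, "hub")
```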
Workaround Attempted
Setting `HF_HOME=/tmp/hf_home` and `HUGGINGFACE_HUB_CACHE=/tmp/hf_home/hub` in `env_vars`: the variables appear in the container environment, but the DJL container still writes the model cache to `/opt/ml/model/`.
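A minimal probe (a sketch, not SDK code) that could be run inside the container to confirm whether a given directory is actually writable:

```python
import os

def dir_is_writable(path):
    """Try to create and delete a file in `path`; returns False on any
    OSError, including errno 30 (EROFS, the read-only error in the logs)."""
    probe = os.path.join(path, ".write_probe")
    try:
        with open(probe, "w") as f:
            f.write("ok")
        os.remove(probe)
        return True
    except OSError:
        return False

# Inside the failing container this would print False for /opt/ml/model/.
print(dir_is_writable("/tmp"))
```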
Use Case
This is a common pattern for deploying newer models (e.g. OpenAI GPT-OSS based models like `chromadb/context-1`) that require a newer `transformers` version than what is bundled in the DJL LMI container. The `source_code` with `requirements.txt` is the natural SDK v3 mechanism for this, but it is incompatible with HF Hub model downloads.
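For reference, the `requirements.txt` in question is just a version pin (the bound below is taken from the comment in the repro code; adjust to the model's needs):

```text
transformers>=4.55.0
```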
Environment
- SageMaker Python SDK: 3.6.0
- Container: `djl-inference:0.36.0-lmi22.0.0-cu129`
- Instance: `ml.g6e.12xlarge`
- Region: `us-east-1`