@@ -33,6 +33,7 @@ length of 512 tokens:
 - [Docker Images](#docker-images)
 - [API Documentation](#api-documentation)
 - [Using a private or gated model](#using-a-private-or-gated-model)
+- [Air gapped deployment](#air-gapped-deployment)
 - [Using Re-rankers models](#using-re-rankers-models)
 - [Using Sequence Classification models](#using-sequence-classification-models)
 - [Using SPLADE pooling](#using-splade-pooling)
@@ -100,11 +101,10 @@ Below are some examples of the currently supported models:
 ### Docker

 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Alibaba-NLP/gte-base-en-v1.5
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

-docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id $model --revision $revision
+docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id $model
 ```

 And then you can make requests like
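(Editor's note: the request example the hunk ends on is cut off at the hunk boundary. As a hedged sketch, a request against the container started above could target TEI's `/embed` route, assuming the server is up and listening on localhost port 8080:)

```shell
# Sketch: embed one string via the HTTP API
# (assumes the container above is running and mapped to port 8080)
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```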
@@ -347,6 +347,29 @@ token=<your cli READ token>
 docker run --gpus all -e HF_API_TOKEN=$token -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id $model
 ```

+### Air gapped deployment
+
+To deploy Text Embeddings Inference in an air-gapped environment, first download the weights and then mount them inside
+the container using a volume.
+
+For example:
+
+```shell
+# (Optional) create a `models` directory
+mkdir models
+cd models
+
+# Make sure you have git-lfs installed (https://git-lfs.com)
+git lfs install
+git clone https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5
+
+# Set the models directory as the volume path
+volume=$PWD
+
+# Mount the models directory inside the container with a volume and set the model ID
+docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4 --model-id /data/gte-base-en-v1.5
+```
+
 ### Using Re-rankers models

 `text-embeddings-inference` v0.4.0 added support for CamemBERT, RoBERTa and XLM-RoBERTa Sequence Classification models.
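(Editor's note: the re-ranker section the hunk ends on pairs naturally with a request sketch. Hedged example against TEI's `/rerank` route, assuming a re-ranker model is being served on localhost port 8080; the query and texts are illustrative placeholders:)

```shell
# Sketch: score candidate passages against a query
# (assumes a re-ranker model is being served on port 8080)
curl 127.0.0.1:8080/rerank \
    -X POST \
    -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json'
```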
@@ -428,11 +451,10 @@ found [here](https://github.com/huggingface/text-embeddings-inference/blob/main/
 You can use the gRPC API by adding the `-grpc` tag to any TEI Docker image. For example:

 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Alibaba-NLP/gte-base-en-v1.5
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

-docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4-grpc --model-id $model --revision $revision
+docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.4-grpc --model-id $model
 ```

 ```shell
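(Editor's note: the code block opened on the hunk's last line is truncated here. For reference, a gRPC call against the `-grpc` image can be sketched with `grpcurl`, assuming TEI's `tei.v1.Embed/Embed` method and the port mapping from the command above:)

```shell
# Sketch: call the Embed RPC with grpcurl
# (assumes the -grpc container above is running and mapped to port 8080)
grpcurl -d '{"inputs": "What is Deep Learning"}' -plaintext 0.0.0.0:8080 tei.v1.Embed/Embed
```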
@@ -463,10 +485,9 @@ cargo install --path router -F metal
 You can now launch Text Embeddings Inference on CPU with:

 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Alibaba-NLP/gte-base-en-v1.5

-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --port 8080
 ```

 **Note:** on some machines, you may also need the OpenSSL libraries and gcc. On Linux machines, run:
@@ -502,10 +523,9 @@ cargo install --path router -F candle-cuda -F http --no-default-features
 You can now launch Text Embeddings Inference on GPU with:

 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Alibaba-NLP/gte-base-en-v1.5

-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --port 8080
 ```

 ## Docker build