diff --git a/serverless-fleets/README.md b/serverless-fleets/README.md
index 703e1952..7f3fbc68 100644
--- a/serverless-fleets/README.md
+++ b/serverless-fleets/README.md
@@ -558,6 +558,43 @@ An IBM Cloud Logs instance is being setup and enabled by default during the auto
 
 ![](./images/prototype_logs.png)
 
+### How to customize fleet workers
+
+> **Note:** This is an experimental feature intended to unlock specific use cases. It may change or be deprecated.
+
+Fleet workers can be customized using startup hooks to prepare the environment before tasks are executed. These hooks are configured through special environment variables that you set when creating the fleet. This lets you install additional software, pull container images, or configure services that your tasks will use, all before your workload begins processing.
+
+#### Example 1: Running Ollama on Fleet Workers
+
+See `run_hook_ollama` for a complete example that demonstrates:
+- Running Ollama (local LLM runtime) on fleet workers
+- Automatic GPU detection and configuration
+- Preloading AI models during worker startup
+- Using the environment variable `__CE_INTERNAL_HOOK_AFTER_STARTUP` to execute setup scripts
+
+Key environment variables used:
+- `__CE_INTERNAL_HOOK_AFTER_STARTUP`: Script to run after worker startup
+- `__CE_INTERNAL_HOOK_AFTER_STARTUP_RETRY_LIMIT=3`: Retry attempts if the hook fails
+- `__CE_INTERNAL_HOOK_AFTER_STARTUP_MAX_EXECUTION_TIME=30m`: Maximum hook execution time
+
+#### Example 2: Running Podman-in-Podman
+
+See `run_hook_podman_in_podman` for a complete example that demonstrates:
+- Running Podman inside fleet workers for nested containerization
+- Preloading container images during startup
+- Using privileged containers and host path mounts
+
+Additional environment variables used:
+- `__CE_INTERNAL_PRIVILEGED_CONTAINER=true`: Enable privileged mode (required for nested containers)
+- `__CE_INTERNAL_HOSTPATH_MOUNTS=/var/lib/containers:/var/lib/containers`: Mount host paths
+
+**Available Hook Environment Variables:**
+- `__CE_INTERNAL_HOOK_AFTER_STARTUP`: The script to execute after worker startup
+- `__CE_INTERNAL_HOOK_AFTER_STARTUP_RETRY_LIMIT`: Number of retry attempts if the hook fails
+- `__CE_INTERNAL_HOOK_AFTER_STARTUP_MAX_EXECUTION_TIME`: Maximum time allowed for hook execution
+- `__CE_INTERNAL_PRIVILEGED_CONTAINER`: Enable privileged mode
+- `__CE_INTERNAL_HOSTPATH_MOUNTS`: Mount host paths into the container
+
 ### Cleanup the Environment
 
 To clean up all IBM Cloud resources, that have been created as part of the provided script, run:
diff --git a/serverless-fleets/run_hook_ollama b/serverless-fleets/run_hook_ollama
new file mode 100755
index 00000000..89cd0c96
--- /dev/null
+++ b/serverless-fleets/run_hook_ollama
@@ -0,0 +1,49 @@
+#!/bin/bash
+
+set -e
+
+uuid=$(uuidgen | tr '[:upper:]' '[:lower:]' | awk -F- '{print $1}')
+
+PREHOOK=$(cat <<'EOF'
+#!/bin/bash
+
+if nvidia-smi >/dev/null 2>&1; then
+  echo "NVIDIA GPU detected"
+  podman run -d --device nvidia.com/gpu=all -v ollama:/root/.ollama -p 11434:11434 --name ollama docker.io/ollama/ollama
+else
+  echo "No NVIDIA GPU detected"
+  podman run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama docker.io/ollama/ollama:latest
+fi
+
+# pull the model into ollama
+podman exec -it ollama ollama pull granite4:350m
+EOF
+)
+
+echo ibmcloud code-engine fleet create --name "fleet-${uuid}-1"
+echo " "--tasks-state-store fleet-task-store
+echo " "--subnetpool-name fleet-subnetpool
+echo " "--image registry.access.redhat.com/ubi10/ubi-minimal
+echo " "--max-scale 1
+echo " "--command="curl"
+echo " "--arg "http://host.containers.internal:11434/api/tags"
+echo " "--tasks 1
+echo " "--env __CE_INTERNAL_HOOK_AFTER_STARTUP="${PREHOOK}"
+echo " "--env __CE_INTERNAL_HOOK_AFTER_STARTUP_RETRY_LIMIT=3
+echo " "--env __CE_INTERNAL_HOOK_AFTER_STARTUP_MAX_EXECUTION_TIME=30m
+echo " "--cpu 2
+echo " "--memory 4G
+
+ibmcloud code-engine fleet create --name "fleet-${uuid}-1" \
+  --tasks-state-store fleet-task-store \
+  --subnetpool-name fleet-subnetpool \
+  --image registry.access.redhat.com/ubi10/ubi-minimal \
+  --max-scale 1 \
+  --command="curl" \
+  --arg "http://host.containers.internal:11434/api/tags" \
+  --tasks 1 \
+  --env __CE_INTERNAL_HOOK_AFTER_STARTUP="${PREHOOK}" \
+  --env __CE_INTERNAL_HOOK_AFTER_STARTUP_RETRY_LIMIT=3 \
+  --env __CE_INTERNAL_HOOK_AFTER_STARTUP_MAX_EXECUTION_TIME=30m \
+  --cpu 2 \
+  --memory 4G
\ No newline at end of file
diff --git a/serverless-fleets/run_hook_podman_in_podman b/serverless-fleets/run_hook_podman_in_podman
new file mode 100755
index 00000000..e6692594
--- /dev/null
+++ b/serverless-fleets/run_hook_podman_in_podman
@@ -0,0 +1,47 @@
+#!/bin/bash
+
+set -e
+
+
+uuid=$(uuidgen | tr '[:upper:]' '[:lower:]' | awk -F- '{print $1}')
+
+PREHOOK=$(cat <<'EOF'
+#!/bin/bash
+podman pull docker.io/library/hello-world:latest
+EOF
+)
+
+
+echo ibmcloud code-engine fleet create --name "fleet-${uuid}-1"
+echo " "--tasks-state-store fleet-task-store
+echo " "--subnetpool-name fleet-subnetpool
+echo " "--image quay.io/podman/stable:latest
+echo " "--max-scale 1
+echo " "--command="podman"
+echo " "--arg "run"
+echo " "--arg "hello-world"
+echo " "--tasks 1
+echo " "--env __CE_INTERNAL_HOOK_AFTER_STARTUP="${PREHOOK}"
+echo " "--env __CE_INTERNAL_HOOK_AFTER_STARTUP_RETRY_LIMIT=3
+echo " "--env __CE_INTERNAL_HOOK_AFTER_STARTUP_MAX_EXECUTION_TIME=10m
+echo " "--env __CE_INTERNAL_PRIVILEGED_CONTAINER=true
+echo " "--env __CE_INTERNAL_HOSTPATH_MOUNTS=/var/lib/containers:/var/lib/containers
+echo " "--cpu 2
+echo " "--memory 4G
+
+ibmcloud code-engine fleet create --name "fleet-${uuid}-1" \
+  --tasks-state-store fleet-task-store \
+  --subnetpool-name fleet-subnetpool \
+  --image quay.io/podman/stable:latest \
+  --max-scale 1 \
+  --command="podman" \
+  --arg "run" \
+  --arg "hello-world" \
+  --tasks 1 \
+  --env __CE_INTERNAL_HOOK_AFTER_STARTUP="${PREHOOK}" \
+  --env __CE_INTERNAL_HOOK_AFTER_STARTUP_RETRY_LIMIT=3 \
+  --env __CE_INTERNAL_HOOK_AFTER_STARTUP_MAX_EXECUTION_TIME=10m \
+  --env __CE_INTERNAL_PRIVILEGED_CONTAINER=true \
+  --env __CE_INTERNAL_HOSTPATH_MOUNTS=/var/lib/containers:/var/lib/containers \
+  --cpu 2 \
+  --memory 4G
\ No newline at end of file