3 changes: 3 additions & 0 deletions .gitignore
@@ -195,6 +195,9 @@ requirements.*.backup
# Local run files
local-run.yaml

# Deps image hash (auto-generated by make)
.llama-stack-deps.hash

# Sisyphus planning files
.sisyphus/
# Per-developer feature design overrides (see docs/contributing/feature-design.config)
56 changes: 49 additions & 7 deletions Makefile
@@ -13,11 +13,17 @@ LLAMA_STACK_CONFIG ?= run.yaml

# Container configuration
LLAMA_STACK_CONTAINER_NAME ?= lightspeed-llama-stack
LLAMA_STACK_DEPS_IMAGE ?= lightspeed-llama-stack-deps:local
LLAMA_STACK_IMAGE ?= lightspeed-llama-stack:local
LLAMA_STACK_PORT ?= 8321
CONTAINER_RUNTIME ?= $(shell command -v podman 2>/dev/null || command -v docker 2>/dev/null)

-.PHONY: run run-stack build-llama-stack-image remove-llama-stack-container stop-llama-stack-container start-llama-stack-container wait-for-llama-stack-health clean-llama-stack
# Dependency change detection
DEPS_HASH_FILE := .llama-stack-deps.hash
CURRENT_DEPS_HASH := $(shell cat pyproject.toml uv.lock providers/pyproject.toml providers/uv.lock 2>/dev/null | shasum -a 256 | cut -d' ' -f1)
STORED_DEPS_HASH := $(shell cat $(DEPS_HASH_FILE) 2>/dev/null)

.PHONY: run run-stack build-llama-stack-deps-image ensure-llama-stack-deps-image build-llama-stack-image remove-llama-stack-container stop-llama-stack-container start-llama-stack-container wait-for-llama-stack-health clean-llama-stack

run-stack: ## Run lightspeed-stack directly, without building dependent service/s
uv run src/lightspeed_stack.py -c $(CONFIG)
@@ -27,13 +33,44 @@ run: start-llama-stack-container ## Run the service locally with dependent servi
@trap 'echo ""; echo "Stopping services..."; $(MAKE) stop-llama-stack-container' EXIT INT TERM; \
$(MAKE) run-stack

-build-llama-stack-image: remove-llama-stack-container ## Build llama-stack container image
-@echo "Building llama-stack container image..."
build-llama-stack-deps-image: ## Force rebuild the deps base image
@echo "Building llama-stack deps image..."
@if [ -z "$(CONTAINER_RUNTIME)" ]; then \
echo "ERROR: No container runtime found. Install podman or docker."; \
exit 1; \
fi
-$(CONTAINER_RUNTIME) build -f deploy/llama-stack/test.containerfile -t $(LLAMA_STACK_IMAGE) .
@if $(CONTAINER_RUNTIME) image inspect $(LLAMA_STACK_DEPS_IMAGE) >/dev/null 2>&1; then \
echo "Removing existing deps image to avoid dangling images..."; \
$(CONTAINER_RUNTIME) rmi $(LLAMA_STACK_DEPS_IMAGE); \
fi
Comment on lines +42 to +45

@coderabbitai coderabbitai Bot Jun 10, 2026

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Both image removals share the same atomicity trade-off. Removing images before building (both deps at lines 42-45 and app at lines 67-70) means any build failure leaves no fallback image. This trades build resilience for avoiding dangling <none>:<none> images. During development with frequent build iterations, consider whether atomic replacement (remove only after successful build) or accepting dangling images with periodic cleanup would better serve reliability.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 42 - 45, Makefile currently removes
$(LLAMA_STACK_DEPS_IMAGE) (and similarly the app image at the other removal)
before building which sacrifices a fallback on build failure; change the flow so
the rmi of $(LLAMA_STACK_DEPS_IMAGE) (and $(LLAMA_STACK_IMAGE) at the other
site) happens only after a successful build—e.g., build the new image into a
temporary/tagged name, verify build success, then remove the old image with
$(CONTAINER_RUNTIME) rmi—alternatively add a dedicated cleanup target (invoked
manually or in CI) to accept dangling images during development; update the
removal commands referenced by the Makefile targets that currently use
$(CONTAINER_RUNTIME) image inspect ... rmi to follow this atomic replacement
pattern.

Author

Why do we need a fallback image if we're building a developer build for testing? 🤔 If the build fails, we fix the problem and build it again. Cleaning those dangling images is just too tedious and they are huge.

Contributor

That's a fair point, @Jazzcort. For a developer build workflow, the "fix and rebuild" loop is the natural recovery path — there's no meaningful benefit to keeping a potentially stale fallback image around. Removing it upfront to avoid accumulating large dangling images is the right trade-off here. I'll withdraw this concern.

✏️ Learnings added
Learnt from: Jazzcort
Repo: lightspeed-core/lightspeed-stack PR: 1895
File: Makefile:42-45
Timestamp: 2026-06-10T18:11:00.333Z
Learning: In the `lightspeed-stack` repository (`Makefile`), the `build-llama-stack-deps-image` and `build-llama-stack-image` targets intentionally remove existing images *before* building (rather than atomic replacement after a successful build). This is deliberate for the developer/test workflow: the expected recovery from a build failure is to fix the issue and rebuild, so a fallback image provides no value; removing first avoids accumulating large dangling `<none>:<none>` images.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
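
For comparison, the build-then-swap alternative raised at the start of this thread (and declined for the developer workflow) might look roughly like the sketch below. It is not part of this PR. `RUNTIME`, `rebuild_atomically`, and the `-staging` tag suffix are hypothetical names introduced for illustration. `RUNTIME` defaults to `echo`, so the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: build under a staging tag, then move the real tag.
# RUNTIME defaults to `echo` (dry run); set RUNTIME=podman or RUNTIME=docker to execute.
RUNTIME="${RUNTIME:-echo}"

# rebuild_atomically IMAGE CONTAINERFILE
rebuild_atomically() {
    image="$1"
    containerfile="$2"
    staging="${image}-staging"   # hypothetical temporary tag

    # 1. Build under the staging tag; the old image stays usable if this fails.
    $RUNTIME build -f "$containerfile" -t "$staging" . || return 1

    # 2. Point the real tag at the new image; the old image loses its tag.
    $RUNTIME tag "$staging" "$image" || return 1

    # 3. Drop the staging tag, then prune dangling images left behind.
    $RUNTIME rmi "$staging"
    $RUNTIME image prune -f
}

rebuild_atomically lightspeed-llama-stack:local deploy/llama-stack/test.containerfile
```

Note that `image prune -f` removes every dangling image on the host, not only the one that was just replaced, which is part of why the remove-first approach is simpler for local builds.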

Comment on lines +42 to +45

Contributor

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Do Docker and Podman allow removing a parent/base image tag via rmi when a tagged child image built FROM it still exists, or do they fail with a dependency conflict?

💡 Result:

Docker: No. Removing (via docker rmi) a base/parent image while a tagged child image built FROM it still exists causes a dependency conflict. Docker’s docker image rm removes images from the local host (and untags as needed), but if the image you’re removing has a dependent child image, the removal is canceled as a conflict (even with -f in the parent/child case). [1][2]

Podman: Also no. podman rmi fails when the target image has child images (or is used by a container). Podman’s rmi exit status documentation explicitly states that “one of the specified images has child images or is being used by a container” causes failure. [3][4]

Practical implication (both): you must remove the dependent child images first, then remove the parent/base image; otherwise rmi reports a conflict/dependency error. [1][3]

Citations:


Fix deps rebuild ordering to prevent docker/podman rmi dependency conflicts
Docker/Podman won’t remove a base/parent image if a child image still exists. The Makefile removes $(LLAMA_STACK_DEPS_IMAGE) in build-llama-stack-deps-image (lines 42-45) while build-llama-stack-image calls ensure-llama-stack-deps-image before removing the existing app image (lines 65-70), so a stale $(LLAMA_STACK_IMAGE) built FROM the deps image can make the deps rebuild fail/abort.

Suggested fix: remove $(LLAMA_STACK_IMAGE) (child) before removing $(LLAMA_STACK_DEPS_IMAGE) (parent)—either by moving the app cleanup earlier than ensure-llama-stack-deps-image, or by adding the app-image removal inside the deps build target while ensuring $(CONTAINER_RUNTIME) is checked for non-empty before it’s used.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 42 - 45, The deps rebuild can fail because
build-llama-stack-deps-image removes $(LLAMA_STACK_DEPS_IMAGE) while a child
$(LLAMA_STACK_IMAGE) still exists; update the Makefile so the child app image is
removed before the parent: either move the existing removal of
$(LLAMA_STACK_IMAGE) (the app image cleanup currently in build-llama-stack-image
/ ensure-llama-stack-deps-image flow) to run before build-llama-stack-deps-image
attempts to rmi $(LLAMA_STACK_DEPS_IMAGE), or add a step inside
build-llama-stack-deps-image to remove $(LLAMA_STACK_IMAGE) first, and ensure
you check CONTAINER_RUNTIME is non-empty before invoking it.
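
The ordering suggested above (child image first, then parent) can be sketched in plain shell as follows. This is an illustrative sketch rather than the change made in the PR: `RUNTIME`, `remove_if_present`, and `remove_child_then_parent` are hypothetical names, and `RUNTIME` defaults to `echo` so the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch only. RUNTIME defaults to `echo` (dry run);
# set RUNTIME=podman or RUNTIME=docker to execute.
RUNTIME="${RUNTIME:-echo}"

# remove_if_present IMAGE: remove IMAGE only when the runtime knows about it.
remove_if_present() {
    if $RUNTIME image inspect "$1" >/dev/null 2>&1; then
        $RUNTIME rmi "$1"
    fi
}

# remove_child_then_parent APP_IMAGE DEPS_IMAGE
# The app image is built FROM the deps image, so it goes first.
remove_child_then_parent() {
    app_image="$1"
    deps_image="$2"
    remove_if_present "$app_image" && remove_if_present "$deps_image"
}

remove_child_then_parent lightspeed-llama-stack:local lightspeed-llama-stack-deps:local
```

Because `remove_if_present` succeeds when the image is absent, the deps image is still removed on a machine that never built the app image.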

$(CONTAINER_RUNTIME) build -f deploy/llama-stack/test.containerfile --target deps-builder -t $(LLAMA_STACK_DEPS_IMAGE) .
@echo "$(CURRENT_DEPS_HASH)" > $(DEPS_HASH_FILE)
@echo "✓ Deps image built and hash saved"

ensure-llama-stack-deps-image: ## Build deps image only if missing or dependencies changed
@if [ -z "$(CONTAINER_RUNTIME)" ]; then \
echo "ERROR: No container runtime found. Install podman or docker."; \
exit 1; \
fi
@if ! $(CONTAINER_RUNTIME) image inspect $(LLAMA_STACK_DEPS_IMAGE) >/dev/null 2>&1; then \
echo "Deps image not found, building..."; \
$(MAKE) build-llama-stack-deps-image; \
elif [ "$(CURRENT_DEPS_HASH)" != "$(STORED_DEPS_HASH)" ]; then \
echo "Dependencies changed (pyproject.toml or uv.lock), rebuilding deps image..."; \
$(MAKE) build-llama-stack-deps-image; \
else \
echo "✓ Deps image is up-to-date (skipping rebuild)"; \
fi

build-llama-stack-image: ensure-llama-stack-deps-image ## Build llama-stack app image (source-only layer on top of deps)
@echo "Building llama-stack app image..."
@if $(CONTAINER_RUNTIME) image inspect $(LLAMA_STACK_IMAGE) >/dev/null 2>&1; then \
echo "Removing existing app image to avoid dangling images..."; \
$(CONTAINER_RUNTIME) rmi $(LLAMA_STACK_IMAGE); \
fi
$(CONTAINER_RUNTIME) build -f deploy/llama-stack/test.containerfile \
--build-arg DEPS_IMAGE=$(LLAMA_STACK_DEPS_IMAGE) \
-t $(LLAMA_STACK_IMAGE) .

stop-llama-stack-container: ## Gracefully stop llama-stack container
@if [ -n "$(CONTAINER_RUNTIME)" ] && $(CONTAINER_RUNTIME) inspect $(LLAMA_STACK_CONTAINER_NAME) >/dev/null 2>&1; then \
@@ -57,7 +94,7 @@ remove-llama-stack-container: ## Remove llama-stack container (saves logs first)
echo "✓ Container removed (logs saved to /tmp/llama-stack-last-run.log)"; \
fi

-start-llama-stack-container: build-llama-stack-image ## Start llama-stack container
start-llama-stack-container: remove-llama-stack-container build-llama-stack-image ## Start llama-stack container
@echo "Starting llama-stack container..."
$(CONTAINER_RUNTIME) run -d \
--name $(LLAMA_STACK_CONTAINER_NAME) \
@@ -122,11 +159,16 @@ wait-for-llama-stack-health: ## Wait for llama-stack container to be healthy
$(CONTAINER_RUNTIME) logs $(LLAMA_STACK_CONTAINER_NAME); \
exit 1

-clean-llama-stack: remove-llama-stack-container ## Remove container and image
clean-llama-stack: remove-llama-stack-container ## Remove containers, images, and deps hash
@if [ -n "$(CONTAINER_RUNTIME)" ] && $(CONTAINER_RUNTIME) images -q $(LLAMA_STACK_IMAGE) | grep -q .; then \
-echo "Removing llama-stack image..."; \
echo "Removing llama-stack app image..."; \
$(CONTAINER_RUNTIME) rmi $(LLAMA_STACK_IMAGE); \
fi
@if [ -n "$(CONTAINER_RUNTIME)" ] && $(CONTAINER_RUNTIME) images -q $(LLAMA_STACK_DEPS_IMAGE) | grep -q .; then \
echo "Removing llama-stack deps image..."; \
$(CONTAINER_RUNTIME) rmi $(LLAMA_STACK_DEPS_IMAGE); \
fi
@rm -f $(DEPS_HASH_FILE)

run-llama-stack: ## Start Llama Stack with enriched config (for local service mode)
uv run src/llama_stack_configuration.py -c $(CONFIG) -i $(LLAMA_STACK_CONFIG) -o $(LLAMA_STACK_CONFIG) && \
41 changes: 33 additions & 8 deletions deploy/llama-stack/test.containerfile
@@ -1,25 +1,37 @@
-# Upstream llama-stack built from Red Hat UBI Python 3.12 image
-FROM registry.access.redhat.com/ubi9/python-312
# DEPS_IMAGE selects the base layer.
# Default: build deps inline (slow but self-contained).
# Override with --build-arg DEPS_IMAGE=lightspeed-llama-stack-deps:local
# to use a pre-built deps image (fast rebuilds, used by `make build-llama-stack-image`).
ARG DEPS_IMAGE=deps-builder

# --- Stage 1: deps (skipped by BuildKit when DEPS_IMAGE is overridden) ---
FROM registry.access.redhat.com/ubi9/python-312 AS deps-builder

USER root

# Install additional build tools
RUN dnf install -y --nodocs --setopt=keepcache=0 --setopt=tsflags=nodocs \
git tar gcc gcc-c++ make && \
dnf clean all

# Install uv
ENV PATH="/root/.local/bin:${PATH}"
RUN curl -LsSf https://astral.sh/uv/install.sh | sh

-# Copy project files for dependency installation
# Copy only dependency-related files
WORKDIR /opt/app-root
COPY pyproject.toml uv.lock LICENSE README.md ./
-COPY src ./src
-COPY providers ./providers
COPY src/version.py ./src/version.py

-# Install dependencies using uv sync
# Copy submodule dependency files only (source copied in app stage)
COPY providers/pyproject.tom[l] providers/uv.loc[k] ./providers/

# Install dependencies (not the project itself)
RUN uv sync --locked --no-install-project --group llslibdev
RUN if [ -f providers/pyproject.toml ]; then \
cd providers && uv export --locked --no-hashes > /tmp/providers-reqs.txt \
&& uv pip install -r /tmp/providers-reqs.txt; \
fi

# Add virtual environment to PATH for llama command
# Add providers to PYTHONPATH so lightspeed_stack_providers modules can be imported
@@ -39,11 +51,24 @@ RUN mkdir -p /opt/app-root/src/.llama/storage \
chown -R 1001:0 /opt/app-root && \
chmod -R 775 /opt/app-root

USER 1001

# --- Stage 2: app (thin source-only layer) ---
FROM ${DEPS_IMAGE}

USER root

# Copy source code and providers submodule
COPY src ./src
COPY provider[s] ./providers

# Copy enrichment scripts for runtime config enrichment
COPY src/llama_stack_configuration.py /opt/app-root/llama_stack_configuration.py
COPY scripts/llama-stack-entrypoint.sh /opt/app-root/enrich-entrypoint.sh
RUN chmod +x /opt/app-root/enrich-entrypoint.sh && \
-chown 1001:0 /opt/app-root/enrich-entrypoint.sh /opt/app-root/llama_stack_configuration.py
chown 1001:0 /opt/app-root/enrich-entrypoint.sh /opt/app-root/llama_stack_configuration.py && \
chown -R 1001:0 /opt/app-root/src && \
chmod -R 775 /opt/app-root/src

# Switch back to the original user
USER 1001
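
The two build modes described in the containerfile header comments can be sketched as shell commands. The function names and the `RUNTIME` and `CONTAINERFILE` variables are hypothetical; `RUNTIME` defaults to `echo`, so the sketch only prints the commands. The flags mirror the ones used by the Makefile targets in this PR.

```shell
#!/bin/sh
# Sketch only. RUNTIME defaults to `echo` (dry run);
# set RUNTIME=podman or RUNTIME=docker to execute.
RUNTIME="${RUNTIME:-echo}"
CONTAINERFILE="deploy/llama-stack/test.containerfile"

# build_standalone APP_IMAGE
# DEPS_IMAGE keeps its default (the deps-builder stage), so one
# invocation builds dependencies and source together.
build_standalone() {
    $RUNTIME build -f "$CONTAINERFILE" -t "$1" .
}

# build_layered APP_IMAGE DEPS_IMAGE
# Step 1 builds only the deps-builder stage and tags it.
# Step 2 builds the app stage on top of that pre-built image.
build_layered() {
    app_image="$1"
    deps_image="$2"
    $RUNTIME build -f "$CONTAINERFILE" --target deps-builder -t "$deps_image" . &&
    $RUNTIME build -f "$CONTAINERFILE" --build-arg DEPS_IMAGE="$deps_image" -t "$app_image" .
}

build_standalone lightspeed-llama-stack:local
build_layered lightspeed-llama-stack:local lightspeed-llama-stack-deps:local
```

In the layered mode, step 1 only needs to be repeated when the dependency manifests change, which is what the `ensure-llama-stack-deps-image` target decides.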