LCORE-1873: Add container lifecycle integration tests by anik120 · Pull Request #1914 · lightspeed-core/lightspeed-stack

anik120 · 2026-06-11T20:56:25Z

Description

Add automated test suite for Llama Stack container build, startup, health monitoring, configuration, and teardown operations. Tests verify Makefile container orchestration targets work correctly across podman and docker.

Key features:

Class-scoped managed container fixture
Image ID-based idempotency verification (deterministic, container runtime cache-agnostic)
Host-side HTTP health checks
Parametrized mount point verification
Destructive test ordering to prevent dev environment impact
Proper cleanup of stale artifacts to prevent false positives

Test coverage:

Build: Image creation and idempotency via SHA256 comparison
Deployment: Container startup, health checks, port mapping, volume mounts
Configuration: Custom port handling
Teardown: Graceful stop, log persistence, full cleanup
Error handling: Double-start replacement behavior

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Related Issue #
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

Tests
- Added comprehensive integration test suite validating container lifecycle management end-to-end
- Verifies image building success and idempotency checks
- Validates deployment health status and host HTTP endpoint accessibility with retry logic
- Tests port mapping configuration and required internal file presence
- Confirms graceful container shutdown with log persistence
- Includes error scenario testing for container replacement behavior

coderabbitai · 2026-06-11T20:56:38Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a1a3eac-8b3c-45ac-abdf-296d2bba9224

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

Walkthrough

This PR introduces a comprehensive integration test suite for Llama Stack container lifecycle. It detects available container runtime (podman or docker), validates image build idempotency, verifies deployment health and port configuration, confirms required container filesystem mounts, and validates graceful stop, removal with log preservation, and error-recovery scenarios (replacement on double-start).

Changes

Container Lifecycle Integration Tests

Layer / File(s)	Summary
Test fixtures and infrastructure `tests/integration/container_lifecycle/test_container_lifecycle.py`	Imports, module docstring, and session/class-scoped fixtures detect available container runtime (podman or docker) and manage container setup/teardown: pre-class force removal, make target invocation with container name injection, startup assertion, and post-class cleanup.
Image build and idempotency validation `tests/integration/container_lifecycle/test_container_lifecycle.py`	TestContainerBuild verifies image build via `make build-llama-stack-image`, confirms built image exists in runtime output, and asserts rebuild idempotency by comparing image IDs across consecutive builds.
Deployment health checks and port configuration `tests/integration/container_lifecycle/test_container_lifecycle.py`	TestLlamaStackDeployment (using managed fixture) validates container running state, polls and confirms health status transition via runtime inspect, reaches HTTP health endpoint from host with retry logic, verifies default port mapping (8321), and confirms required filesystem mounts exist via in-container test commands. TestContainerCustomConfiguration starts a separate container with port override and validates custom port mapping.
Container teardown and error scenarios `tests/integration/container_lifecycle/test_container_lifecycle.py`	TestContainerTeardown validates graceful stop (container no longer running), removal with log preservation to /tmp/llama-stack-last-run.log, and destructive clean (make target removes both container and image). TestContainerErrorScenarios verifies double-start replacement: starting the same container twice results in different container IDs.

Sequence Diagram

sequenceDiagram
  participant Fixture as Fixture Setup
  participant Runtime as Container Runtime
  participant Make as Make Target
  participant Container as Running Container
  participant Host as Host Network
  participant FS as Container FS
  
  Fixture->>Runtime: Detect podman/docker
  Fixture->>Make: make start-llama-stack-container
  Make->>Runtime: Create container
  Make->>Container: Start container
  Container-->>Fixture: Container ready
  
  Fixture->>Runtime: Inspect health status
  Runtime->>Container: Query health
  Container-->>Runtime: Healthy
  
  Fixture->>Host: GET /v1/health
  Host->>Container: Health check
  Container-->>Host: 200 OK
  
  Fixture->>Runtime: Inspect port mappings
  Runtime-->>Fixture: Port 8321 verified
  
  Fixture->>Container: exec test -f /required/path
  Container->>FS: Check file exists
  FS-->>Container: File exists
  
  Fixture->>Make: make stop-llama-stack-container
  Make->>Container: Stop container
  
  Fixture->>Make: make remove-llama-stack-container
  Make->>Container: Remove container
  Make->>Host: Write logs to /tmp/llama-stack-last-run.log
  
  Fixture->>Runtime: Verify container removed
  Runtime-->>Fixture: Container absent

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

lightspeed-core/lightspeed-stack#1760: The main PR's new integration test suite directly exercises the container lifecycle behaviors introduced by PR #1760's Makefile targets (build/start/wait/health/logs/clean, including idempotent rebuild and container replacement).
lightspeed-core/lightspeed-stack#1802: The main PR's integration tests assert container teardown/removal behavior and verify log preservation to /tmp/llama-stack-last-run.log, which depends on the Makefile changes in PR #1802.

Suggested reviewers

tisnik
radofuchs

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding container lifecycle integration tests, with the ticket reference providing context.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/integration/container_lifecycle/test_container_lifecycle.py (1)

1-603: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Run Black on this module before merge.

CI is already red because Black would reformat this file.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around
lines 1 - 603, This file fails Black formatting — run the project Black
formatter on the test module and commit the reformatted file; specifically
reformat the module containing the managed_container fixture and tests such as
TestContainerBuild._get_image_id, test_build_llama_stack_image,
TestLlamaStackDeployment.test_container_is_running,
test_health_endpoint_responds_on_host,
TestContainerCustomConfiguration.test_custom_port_mapping, and
TestContainerTeardown.test_remove_container_saves_logs so all imports,
docstrings, and line-wrapping conform to Black's rules before merging.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/container_lifecycle/test_container_lifecycle.py`:
- Around line 440-446: The test currently asserts the saved container log
(variable target_log) is non-empty, which can be flaky for quiet containers;
remove the os.path.getsize(target_log) > 0 assertion and instead verify the file
was created and has a modification time (e.g., assert os.path.exists(target_log)
as already present and add an assertion like os.path.getmtime(target_log) is not
None or a truthy check on os.path.getmtime(target_log)) so the test only
requires the file to exist or have an mtime rather than non-zero size.
- Around line 222-245: The test test_health_endpoint_responds_on_host is calling
the versioned path /v1/health but health.router is registered at the unversioned
/health in src/app/routers.py (and the Makefile healthcheck expects the
unversioned route); update the URL variable in that test from
"http://localhost:8321/v1/health" to "http://localhost:8321/health" so the test
targets the actual registered health endpoint and aligns with the Makefile
healthcheck.
- Around line 201-215: The health check is using a substring match and will
treat "unhealthy" as healthy; update the loop's condition to compare the trimmed
health status exactly by using result.stdout.strip() == "healthy" (i.e., after
calling subprocess.run with container_runtime and managed_container, check the
exact equality of result.stdout.strip() to "healthy" instead of using "healthy"
in result.stdout).
- Around line 254-263: The test currently just searches for the substring "8321"
in result.stdout which can match the container-side port; change the assertion
to parse the port mapping line for the container port "8321/tcp" from the
subprocess result (the block that uses container_runtime, managed_container and
result), extract the published host endpoint (e.g. the right-hand side after
"->" such as "0.0.0.0:8321" or "127.0.0.1:8321") and assert that the host port
portion equals "8321" so the test verifies the published host port rather than
any occurrence of 8321.

---

Outside diff comments:
In `@tests/integration/container_lifecycle/test_container_lifecycle.py`:
- Around line 1-603: This file fails Black formatting — run the project Black
formatter on the test module and commit the reformatted file; specifically
reformat the module containing the managed_container fixture and tests such as
TestContainerBuild._get_image_id, test_build_llama_stack_image,
TestLlamaStackDeployment.test_container_is_running,
test_health_endpoint_responds_on_host,
TestContainerCustomConfiguration.test_custom_port_mapping, and
TestContainerTeardown.test_remove_container_saves_logs so all imports,
docstrings, and line-wrapping conform to Black's rules before merging.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2b3e495d-d713-4e0e-a7fa-151081e54a69

📥 Commits

Reviewing files that changed from the base of the PR and between 6116ef7 and 4c2a744.

📒 Files selected for processing (1)

tests/integration/container_lifecycle/test_container_lifecycle.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)

GitHub Check: Pylinter
GitHub Check: unit_tests (3.13)
GitHub Check: integration_tests (3.12)
GitHub Check: integration_tests (3.13)
GitHub Check: unit_tests (3.12)
GitHub Check: build-pr
GitHub Check: Pyright
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (1)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/integration/container_lifecycle/test_container_lifecycle.py

🪛 ast-grep (0.43.0)

tests/integration/container_lifecycle/test_container_lifecycle.py

[error] 28-30: Use of unsanitized data to create processes
Context: subprocess.run(
[runtime, "--version"], check=True, capture_output=True, timeout=5
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 52-56: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 59-68: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 74-78: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 96-102: Use of unsanitized data to create processes
Context: subprocess.run(
[runtime, "images", "-q", image_name],
capture_output=True,
text=True,
check=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 112-117: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"],
capture_output=True,
text=True,
timeout=600,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 125-130: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "images", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 144-146: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=600
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 151-153: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=120
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 175-187: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={managed_container}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 201-212: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"inspect",
"--format",
"{{.State.Health.Status}}",
managed_container,
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 253-258: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "port", managed_container],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 284-288: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "exec", managed_container, "test", "-f", file_path],
capture_output=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 308-318: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
f"LLAMA_STACK_PORT={custom_port}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 319-324: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "port", container_name],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 330-334: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 351-360: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 363-372: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"stop-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=15,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 376-388: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={container_name}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 394-398: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 416-425: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 428-437: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"remove-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=15,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 448-452: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 471-476: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"],
check=True,
capture_output=True,
timeout=600,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 479-488: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 491-500: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"clean-llama-stack",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=30,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 504-509: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "ps", "-a", "--filter", f"name={container_name}"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 515-520: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "images", "-q", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 538-547: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 550-561: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 565-574: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 577-588: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 597-601: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[info] 410-410: Do not hardcode temporary file or directory names
Context: "/tmp/llama-stack-last-run.log"
Note: [CWE-377].

(hardcoded-tmp-file)

[error] 28-30: Command coming from incoming request
Context: subprocess.run(
[runtime, "--version"], check=True, capture_output=True, timeout=5
)
Note: [CWE-20].

(subprocess-from-request)

[error] 52-56: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 59-68: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 74-78: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 96-102: Command coming from incoming request
Context: subprocess.run(
[runtime, "images", "-q", image_name],
capture_output=True,
text=True,
check=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 112-117: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"],
capture_output=True,
text=True,
timeout=600,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 125-130: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "images", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 144-146: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=600
)
Note: [CWE-20].

(subprocess-from-request)

[error] 151-153: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=120
)
Note: [CWE-20].

(subprocess-from-request)

[error] 175-187: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={managed_container}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 201-212: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"inspect",
"--format",
"{{.State.Health.Status}}",
managed_container,
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 253-258: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "port", managed_container],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 284-288: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "exec", managed_container, "test", "-f", file_path],
capture_output=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 308-318: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
f"LLAMA_STACK_PORT={custom_port}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 319-324: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "port", container_name],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 330-334: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 351-360: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 363-372: Command coming from incoming request
Context: subprocess.run(
[
"make",
"stop-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=15,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 376-388: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={container_name}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 394-398: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 416-425: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 428-437: Command coming from incoming request
Context: subprocess.run(
[
"make",
"remove-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=15,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 448-452: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 471-476: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"],
check=True,
capture_output=True,
timeout=600,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 479-488: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 491-500: Command coming from incoming request
Context: subprocess.run(
[
"make",
"clean-llama-stack",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=30,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 504-509: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "ps", "-a", "--filter", f"name={container_name}"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 515-520: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "images", "-q", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 538-547: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 550-561: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 565-574: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 577-588: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

[error] 597-601: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

🪛 GitHub Actions: Black / 0_black.txt

tests/integration/container_lifecycle/test_container_lifecycle.py

[error] 1-1: Black --check failed. Black would reformat this file; run 'uv tool run black src tests' (without --check) or apply the formatting changes.

🪛 GitHub Actions: Black / black

tests/integration/container_lifecycle/test_container_lifecycle.py

[error] 1-1: Black --check failed. The file would be reformatted: /home/runner/work/lightspeed-stack/lightspeed-stack/tests/integration/container_lifecycle/test_container_lifecycle.py. Re-run with black src tests (or black --write ...) to apply formatting.

🔇 Additional comments (1)

tests/integration/container_lifecycle/test_container_lifecycle.py (1)

455-457: Check that pytest.mark.order("last") is actually registered/enforced in CI
In tests/integration/container_lifecycle/test_container_lifecycle.py the test uses @pytest.mark.order("last") + @pytest.mark.destructive, but the repo’s searched pytest configuration files (pyproject.toml, pytest.ini, tox.ini, setup.cfg, and all conftest.py under tests/) contain no references to pytest-order/pytest_order, nor any marker registration for the order/destructive markers. Confirm CI installs the ordering plugin (or has equivalent marker registration/hook), otherwise this “runs last” safeguard may not work.

coderabbitai · 2026-06-11T21:01:50Z

+        for attempt in range(30):
+            result = subprocess.run(
+                [
+                    container_runtime,
+                    "inspect",
+                    "--format",
+                    "{{.State.Health.Status}}",
+                    managed_container,
+                ],
+                capture_output=True,
+                text=True,
+                timeout=5,
+            )
+            if "healthy" in result.stdout:
+                return


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Match the health status exactly.

Line 214 uses "healthy" in result.stdout, which is also true for "unhealthy". A broken container can therefore satisfy this poll and mask a failed rollout. Compare the trimmed status to "healthy" instead.

Suggested fix

- if "healthy" in result.stdout: + if result.stdout.strip() == "healthy": return

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for attempt in range(30):

result = subprocess.run(

[

container_runtime,

"inspect",

"--format",

"{{.State.Health.Status}}",

managed_container,

],

capture_output=True,

text=True,

timeout=5,

)

if "healthy" in result.stdout:

return

for attempt in range(30):

result = subprocess.run(

[

container_runtime,

"inspect",

"--format",

"{{.State.Health.Status}}",

managed_container,

],

capture_output=True,

text=True,

timeout=5,

)

if result.stdout.strip() == "healthy":

return

🧰 Tools

🪛 ast-grep (0.43.0)

[error] 201-212: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"inspect",
"--format",
"{{.State.Health.Status}}",
managed_container,
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 201-212: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"inspect",
"--format",
"{{.State.Health.Status}}",
managed_container,
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around lines 201 - 215, The health check is using a substring match and will treat "unhealthy" as healthy; update the loop's condition to compare the trimmed health status exactly by using result.stdout.strip() == "healthy" (i.e., after calling subprocess.run with container_runtime and managed_container, check the exact equality of result.stdout.strip() to "healthy" instead of using "healthy" in result.stdout).

coderabbitai · 2026-06-11T21:01:50Z

+    def test_health_endpoint_responds_on_host(self):
+        """Verify HTTP API accessibility from host without container-side curl."""
+        url = "http://localhost:8321/v1/health"
+
+        # Retry loop for network binding stabilization
+        for attempt in range(5):
+            try:
+                with urllib.request.urlopen(url, timeout=5) as response:
+                    body = response.read().decode("utf-8").lower()
+                    assert (
+                        response.status == 200
+                    ), f"Health endpoint returned status {response.status}"
+                    assert (
+                        "status" in body
+                    ), f"Health response missing 'status' field: {body}"
+                    return
+            except (urllib.error.URLError, ConnectionError) as e:
+                if attempt == 4:  # Last attempt
+                    pytest.fail(
+                        f"Could not reach /v1/health from host machine after "
+                        f"{attempt + 1} attempts. Last error: {e}"
+                    )
+                time.sleep(1)
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the unversioned health route here.

src/app/routers.py registers health.router outside the app-level /v1 prefix, but this test hard-codes http://localhost:8321/v1/health. That makes the suite validate a route the app does not appear to expose, and it also mismatches the Makefile healthcheck that this fixture depends on. Point both checks at the actual health URL instead of the versioned path.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around lines 222 - 245, The test test_health_endpoint_responds_on_host is calling the versioned path /v1/health but health.router is registered at the unversioned /health in src/app/routers.py (and the Makefile healthcheck expects the unversioned route); update the URL variable in that test from "http://localhost:8321/v1/health" to "http://localhost:8321/health" so the test targets the actual registered health endpoint and aligns with the Makefile healthcheck.

coderabbitai · 2026-06-11T21:01:50Z

+        result = subprocess.run(
+            [container_runtime, "port", managed_container],
+            capture_output=True,
+            text=True,
+            timeout=5,
+        )
+        assert result.returncode == 0, "Failed to query port mappings"
+        assert (
+            "8321" in result.stdout
+        ), f"Port 8321 not found in port mappings: {result.stdout}"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Assert the published host port, not just any 8321 substring.

podman/docker port output always includes the container-side port (8321/tcp), so this passes even when the host binding is wrong. Parse the mapping and assert that the published host port is 8321, which is the actual Makefile contract being tested.

Suggested fix

assert result.returncode == 0, "Failed to query port mappings" - assert ( - "8321" in result.stdout - ), f"Port 8321 not found in port mappings: {result.stdout}" + mapping = result.stdout.strip() + assert "->" in mapping, f"Unexpected port mapping format: {mapping}" + host_binding = mapping.split("->", 1)[1].strip() + assert host_binding.endswith(":8321") or host_binding == "8321", ( + f"Expected host port 8321, got: {mapping}" + )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around lines 254 - 263, The test currently just searches for the substring "8321" in result.stdout which can match the container-side port; change the assertion to parse the port mapping line for the container port "8321/tcp" from the subprocess result (the block that uses container_runtime, managed_container and result), extract the published host endpoint (e.g. the right-hand side after "->" such as "0.0.0.0:8321" or "127.0.0.1:8321") and assert that the host port portion equals "8321" so the test verifies the published host port rather than any occurrence of 8321.

coderabbitai · 2026-06-11T21:01:50Z

+            # Verify log file was created and is not empty
+            assert os.path.exists(
+                target_log
+            ), f"Container logs were not written to {target_log}"
+            assert (
+                os.path.getsize(target_log) > 0
+            ), "Log file was created but is empty"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't require the saved log file to be non-empty.

remove-llama-stack-container only guarantees that runtime logs are written to /tmp/llama-stack-last-run.log. A quiet container still satisfies that contract but can legitimately produce a zero-byte file, so this assertion can fail on healthy runs and make the suite flaky. Checking creation or mtime is enough here.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around lines 440 - 446, The test currently asserts the saved container log (variable target_log) is non-empty, which can be flaky for quiet containers; remove the os.path.getsize(target_log) > 0 assertion and instead verify the file was created and has a modification time (e.g., assert os.path.exists(target_log) as already present and add an assertion like os.path.getmtime(target_log) is not None or a truthy check on os.path.getmtime(target_log)) so the test only requires the file to exist or have an mtime rather than non-zero size.

Add automated test suite for Llama Stack container build, startup, health monitoring, configuration, and teardown operations. Tests verify Makefile container orchestration targets work correctly across podman and docker. Key features: - Class-scoped managed container fixture - Image ID-based idempotency verification (deterministic, container runtime cache-agnostic) - Host-side HTTP health checks - Parametrized mount point verification - Destructive test ordering to prevent dev environment impact - Proper cleanup of stale artifacts to prevent false positives Test coverage: - Build: Image creation and idempotency via SHA256 comparison - Deployment: Container startup, health checks, port mapping, volume mounts - Configuration: Custom port handling - Teardown: Graceful stop, log persistence, full cleanup - Error handling: Double-start replacement behavior Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

anik120 marked this pull request as draft June 12, 2026 12:12

anik120 force-pushed the container-test branch 3 times, most recently from e827a60 to fc1f53a Compare June 12, 2026 15:09

anik120 force-pushed the container-test branch from fc1f53a to dd7088c Compare June 12, 2026 15:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCORE-1873: Add container lifecycle integration tests#1914

LCORE-1873: Add container lifecycle integration tests#1914
anik120 wants to merge 1 commit into
lightspeed-core:mainfrom
anik120:container-test

anik120 commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

Review skipped

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

anik120 commented Jun 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

anik120 commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading