
LCORE-1873: Add container lifecycle integration tests#1914

Draft
anik120 wants to merge 1 commit into
lightspeed-core:mainfrom
anik120:container-test

Conversation

@anik120

@anik120 anik120 commented Jun 11, 2026

Contributor

Description

Add an automated test suite for Llama Stack container build, startup, health monitoring, configuration, and teardown operations. The tests verify that the Makefile container orchestration targets work correctly with both podman and docker.

Key features:

  • Class-scoped managed container fixture
  • Image ID-based idempotency verification (deterministic, container runtime cache-agnostic)
  • Host-side HTTP health checks
  • Parametrized mount point verification
  • Destructive test ordering to prevent dev environment impact
  • Proper cleanup of stale artifacts to prevent false positives

Test coverage:

  • Build: Image creation and idempotency via SHA256 comparison
  • Deployment: Container startup, health checks, port mapping, volume mounts
  • Configuration: Custom port handling
  • Teardown: Graceful stop, log persistence, full cleanup
  • Error handling: Double-start replacement behavior

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Tests
    • Added comprehensive integration test suite validating container lifecycle management end-to-end
    • Verifies image building success and idempotency checks
    • Validates deployment health status and host HTTP endpoint accessibility with retry logic
    • Tests port mapping configuration and required internal file presence
    • Confirms graceful container shutdown with log persistence
    • Includes error scenario testing for container replacement behavior
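The host-side HTTP check with retry logic mentioned in the summary above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `fetch_status`, `wait_for_http_ok`, their parameters, and the injectable `fetch` callable are hypothetical names chosen for this example.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def wait_for_http_ok(
    url: str,
    attempts: int = 30,
    delay: float = 1.0,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll url until it answers 200, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            # Connection refused, reset, or an HTTP error status:
            # the server is probably still starting, so try again.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A test could then assert `wait_for_http_ok(url)`, with `url` pointing at whichever health route the service registers on the published host port. Making `fetch` injectable keeps the retry loop testable without a running container.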

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Contributor

Review Change Stack

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a1a3eac-8b3c-45ac-abdf-296d2bba9224

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

This PR introduces a comprehensive integration test suite for the Llama Stack container lifecycle. It detects the available container runtime (podman or docker), validates image build idempotency, verifies deployment health and port configuration, confirms the required container filesystem mounts, and validates graceful stop, removal with log preservation, and error-recovery scenarios (replacement on double-start).

Changes

Container Lifecycle Integration Tests

| Layer | File(s) | Summary |
|---|---|---|
| Test fixtures and infrastructure | tests/integration/container_lifecycle/test_container_lifecycle.py | Imports, module docstring, and session/class-scoped fixtures detect available container runtime (podman or docker) and manage container setup/teardown: pre-class force removal, make target invocation with container name injection, startup assertion, and post-class cleanup. |
| Image build and idempotency validation | tests/integration/container_lifecycle/test_container_lifecycle.py | TestContainerBuild verifies image build via make build-llama-stack-image, confirms built image exists in runtime output, and asserts rebuild idempotency by comparing image IDs across consecutive builds. |
| Deployment health checks and port configuration | tests/integration/container_lifecycle/test_container_lifecycle.py | TestLlamaStackDeployment (using managed fixture) validates container running state, polls and confirms health status transition via runtime inspect, reaches HTTP health endpoint from host with retry logic, verifies default port mapping (8321), and confirms required filesystem mounts exist via in-container test commands. TestContainerCustomConfiguration starts a separate container with port override and validates custom port mapping. |
| Container teardown and error scenarios | tests/integration/container_lifecycle/test_container_lifecycle.py | TestContainerTeardown validates graceful stop (container no longer running), removal with log preservation to /tmp/llama-stack-last-run.log, and destructive clean (make target removes both container and image). TestContainerErrorScenarios verifies double-start replacement: starting the same container twice results in different container IDs. |
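The image-ID comparison that the build tests rely on can be sketched as below. This is a minimal illustration assuming the `images -q` query that the reviewed file already uses. The helper names (`run_and_capture`, `get_image_id`, `rebuild_is_idempotent`) and the injectable `run` and `build` callables are hypothetical and not taken from the PR.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_and_capture(cmd: Sequence[str]) -> str:
    """Run cmd and return its stdout; raises if the command fails."""
    result = subprocess.run(
        list(cmd), capture_output=True, text=True, check=True, timeout=30
    )
    return result.stdout


def get_image_id(runtime: str, image: str, run: Runner = run_and_capture) -> str:
    """Return the first ID printed by `<runtime> images -q <image>`, or ""."""
    ids = run([runtime, "images", "-q", image]).split()
    return ids[0] if ids else ""


def rebuild_is_idempotent(
    runtime: str,
    image: str,
    build: Callable[[], None],
    run: Runner = run_and_capture,
) -> bool:
    """Build twice and report whether the image ID stayed the same."""
    build()
    first = get_image_id(runtime, image, run)
    build()
    second = get_image_id(runtime, image, run)
    # An empty ID means the image is missing, which is not idempotency.
    return bool(first) and first == second
```

Comparing IDs rather than build logs avoids depending on how a particular runtime reports cache hits, which is the property the PR description calls cache-agnostic.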

Sequence Diagram

sequenceDiagram
  participant Fixture as Fixture Setup
  participant Runtime as Container Runtime
  participant Make as Make Target
  participant Container as Running Container
  participant Host as Host Network
  participant FS as Container FS
  
  Fixture->>Runtime: Detect podman/docker
  Fixture->>Make: make start-llama-stack-container
  Make->>Runtime: Create container
  Make->>Container: Start container
  Container-->>Fixture: Container ready
  
  Fixture->>Runtime: Inspect health status
  Runtime->>Container: Query health
  Container-->>Runtime: Healthy
  
  Fixture->>Host: GET /v1/health
  Host->>Container: Health check
  Container-->>Host: 200 OK
  
  Fixture->>Runtime: Inspect port mappings
  Runtime-->>Fixture: Port 8321 verified
  
  Fixture->>Container: exec test -f /required/path
  Container->>FS: Check file exists
  FS-->>Container: File exists
  
  Fixture->>Make: make stop-llama-stack-container
  Make->>Container: Stop container
  
  Fixture->>Make: make remove-llama-stack-container
  Make->>Container: Remove container
  Make->>Host: Write logs to /tmp/llama-stack-last-run.log
  
  Fixture->>Runtime: Verify container removed
  Runtime-->>Fixture: Container absent

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • lightspeed-core/lightspeed-stack#1760: The main PR's new integration test suite directly exercises the container lifecycle behaviors introduced by PR #1760's Makefile targets (build/start/wait/health/logs/clean, including idempotent rebuild and container replacement).
  • lightspeed-core/lightspeed-stack#1802: The main PR's integration tests assert container teardown/removal behavior and verify log preservation to /tmp/llama-stack-last-run.log, which depends on the Makefile changes in PR #1802.

Suggested reviewers

  • tisnik
  • radofuchs
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding container lifecycle integration tests, with the ticket reference providing context. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/integration/container_lifecycle/test_container_lifecycle.py (1)

1-603: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Run Black on this module before merge.

CI is already red because Black would reformat this file.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around
lines 1 - 603, This file fails Black formatting — run the project Black
formatter on the test module and commit the reformatted file; specifically
reformat the module containing the managed_container fixture and tests such as
TestContainerBuild._get_image_id, test_build_llama_stack_image,
TestLlamaStackDeployment.test_container_is_running,
test_health_endpoint_responds_on_host,
TestContainerCustomConfiguration.test_custom_port_mapping, and
TestContainerTeardown.test_remove_container_saves_logs so all imports,
docstrings, and line-wrapping conform to Black's rules before merging.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/container_lifecycle/test_container_lifecycle.py`:
- Around line 440-446: The test currently asserts the saved container log
(variable target_log) is non-empty, which can be flaky for quiet containers;
remove the os.path.getsize(target_log) > 0 assertion and instead verify the file
was created and has a modification time (e.g., assert os.path.exists(target_log)
as already present and add an assertion like os.path.getmtime(target_log) is not
None or a truthy check on os.path.getmtime(target_log)) so the test only
requires the file to exist or have an mtime rather than non-zero size.
- Around line 222-245: The test test_health_endpoint_responds_on_host is calling
the versioned path /v1/health but health.router is registered at the unversioned
/health in src/app/routers.py (and the Makefile healthcheck expects the
unversioned route); update the URL variable in that test from
"http://localhost:8321/v1/health" to "http://localhost:8321/health" so the test
targets the actual registered health endpoint and aligns with the Makefile
healthcheck.
- Around line 201-215: The health check is using a substring match and will
treat "unhealthy" as healthy; update the loop's condition to compare the trimmed
health status exactly by using result.stdout.strip() == "healthy" (i.e., after
calling subprocess.run with container_runtime and managed_container, check the
exact equality of result.stdout.strip() to "healthy" instead of using "healthy"
in result.stdout).
- Around line 254-263: The test currently just searches for the substring "8321"
in result.stdout which can match the container-side port; change the assertion
to parse the port mapping line for the container port "8321/tcp" from the
subprocess result (the block that uses container_runtime, managed_container and
result), extract the published host endpoint (e.g. the right-hand side after
"->" such as "0.0.0.0:8321" or "127.0.0.1:8321") and assert that the host port
portion equals "8321" so the test verifies the published host port rather than
any occurrence of 8321.

---

Outside diff comments:
In `@tests/integration/container_lifecycle/test_container_lifecycle.py`:
- Around line 1-603: This file fails Black formatting — run the project Black
formatter on the test module and commit the reformatted file; specifically
reformat the module containing the managed_container fixture and tests such as
TestContainerBuild._get_image_id, test_build_llama_stack_image,
TestLlamaStackDeployment.test_container_is_running,
test_health_endpoint_responds_on_host,
TestContainerCustomConfiguration.test_custom_port_mapping, and
TestContainerTeardown.test_remove_container_saves_logs so all imports,
docstrings, and line-wrapping conform to Black's rules before merging.
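
For the port-mapping item in the inline comments above, parsing the published host port could look roughly like this. It is a sketch that assumes the runtime's `port` subcommand prints lines of the form `8321/tcp -> 0.0.0.0:8321`. The function name `published_host_ports` and its parameters are hypothetical.

```python
def published_host_ports(port_output: str, container_port: str = "8321/tcp") -> set[str]:
    """Collect host ports published for container_port from `port` output."""
    ports: set[str] = set()
    for line in port_output.splitlines():
        left, arrow, right = line.partition("->")
        # Skip lines without a mapping or for a different container port.
        if not arrow or left.strip() != container_port:
            continue
        # rpartition keeps IPv6 bindings such as "[::]:8321" intact
        # and returns only the text after the last colon.
        _, _, host_port = right.strip().rpartition(":")
        if host_port:
            ports.add(host_port)
    return ports
```

A test could then assert `published_host_ports(result.stdout) == {"8321"}`, which fails if 8321 appears only on the container side of the mapping.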
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2b3e495d-d713-4e0e-a7fa-151081e54a69

📥 Commits

Reviewing files that changed from the base of the PR and between 6116ef7 and 4c2a744.

📒 Files selected for processing (1)
  • tests/integration/container_lifecycle/test_container_lifecycle.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: Pylinter
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: integration_tests (3.13)
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: build-pr
  • GitHub Check: Pyright
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (1)
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/integration/container_lifecycle/test_container_lifecycle.py
🪛 ast-grep (0.43.0)
tests/integration/container_lifecycle/test_container_lifecycle.py

[error] 28-30: Use of unsanitized data to create processes
Context: subprocess.run(
[runtime, "--version"], check=True, capture_output=True, timeout=5
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 52-56: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 59-68: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 74-78: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 96-102: Use of unsanitized data to create processes
Context: subprocess.run(
[runtime, "images", "-q", image_name],
capture_output=True,
text=True,
check=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 112-117: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"],
capture_output=True,
text=True,
timeout=600,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 125-130: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "images", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 144-146: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=600
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 151-153: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=120
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 175-187: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={managed_container}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 201-212: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"inspect",
"--format",
"{{.State.Health.Status}}",
managed_container,
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 253-258: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "port", managed_container],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 284-288: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "exec", managed_container, "test", "-f", file_path],
capture_output=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 308-318: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
f"LLAMA_STACK_PORT={custom_port}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 319-324: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "port", container_name],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 330-334: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 351-360: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 363-372: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"stop-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=15,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 376-388: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={container_name}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 394-398: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 416-425: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 428-437: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"remove-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=15,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 448-452: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 471-476: Use of unsanitized data to create processes
Context: subprocess.run(
["make", "build-llama-stack-image"],
check=True,
capture_output=True,
timeout=600,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 479-488: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 491-500: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"clean-llama-stack",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=30,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 504-509: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "ps", "-a", "--filter", f"name={container_name}"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 515-520: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "images", "-q", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 538-547: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 550-561: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 565-574: Use of unsanitized data to create processes
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 577-588: Use of unsanitized data to create processes
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 597-601: Use of unsanitized data to create processes
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[info] 410-410: Do not hardcode temporary file or directory names
Context: "/tmp/llama-stack-last-run.log"
Note: [CWE-377].

(hardcoded-tmp-file)


[error] 28-30: Command coming from incoming request
Context: subprocess.run(
[runtime, "--version"], check=True, capture_output=True, timeout=5
)
Note: [CWE-20].

(subprocess-from-request)


[error] 52-56: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 59-68: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 74-78: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 96-102: Command coming from incoming request
Context: subprocess.run(
[runtime, "images", "-q", image_name],
capture_output=True,
text=True,
check=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 112-117: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"],
capture_output=True,
text=True,
timeout=600,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 125-130: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "images", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 144-146: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=600
)
Note: [CWE-20].

(subprocess-from-request)


[error] 151-153: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"], check=True, timeout=120
)
Note: [CWE-20].

(subprocess-from-request)


[error] 175-187: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={managed_container}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 201-212: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"inspect",
"--format",
"{{.State.Health.Status}}",
managed_container,
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 253-258: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "port", managed_container],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 284-288: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "exec", managed_container, "test", "-f", file_path],
capture_output=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 308-318: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
f"LLAMA_STACK_PORT={custom_port}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 319-324: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "port", container_name],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 330-334: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 351-360: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 363-372: Command coming from incoming request
Context: subprocess.run(
[
"make",
"stop-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=15,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 376-388: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"--filter",
f"name={container_name}",
"--format",
"{{.Names}}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 394-398: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 416-425: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 428-437: Command coming from incoming request
Context: subprocess.run(
[
"make",
"remove-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=15,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 448-452: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 471-476: Command coming from incoming request
Context: subprocess.run(
["make", "build-llama-stack-image"],
check=True,
capture_output=True,
timeout=600,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 479-488: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 491-500: Command coming from incoming request
Context: subprocess.run(
[
"make",
"clean-llama-stack",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
capture_output=True,
text=True,
timeout=30,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 504-509: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "ps", "-a", "--filter", f"name={container_name}"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 515-520: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "images", "-q", "lightspeed-llama-stack:local"],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 538-547: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 550-561: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 565-574: Command coming from incoming request
Context: subprocess.run(
[
"make",
"start-llama-stack-container",
f"LLAMA_STACK_CONTAINER_NAME={container_name}",
],
check=True,
capture_output=True,
timeout=120,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 577-588: Command coming from incoming request
Context: subprocess.run(
[
container_runtime,
"ps",
"-q",
"--filter",
f"name={container_name}",
],
capture_output=True,
text=True,
timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)


[error] 597-601: Command coming from incoming request
Context: subprocess.run(
[container_runtime, "rm", "-f", container_name],
capture_output=True,
timeout=10,
)
Note: [CWE-20].

(subprocess-from-request)

🪛 GitHub Actions: Black / 0_black.txt
tests/integration/container_lifecycle/test_container_lifecycle.py

[error] 1-1: Black --check failed. Black would reformat this file; run 'uv tool run black src tests' (without --check) or apply the formatting changes.

🪛 GitHub Actions: Black / black
tests/integration/container_lifecycle/test_container_lifecycle.py

[error] 1-1: Black --check failed. The file would be reformatted: /home/runner/work/lightspeed-stack/lightspeed-stack/tests/integration/container_lifecycle/test_container_lifecycle.py. Re-run with black src tests (or black --write ...) to apply formatting.

🔇 Additional comments (1)
tests/integration/container_lifecycle/test_container_lifecycle.py (1)

455-457: Check that pytest.mark.order("last") is actually registered/enforced in CI
In tests/integration/container_lifecycle/test_container_lifecycle.py the test uses @pytest.mark.order("last") + @pytest.mark.destructive, but the repo’s searched pytest configuration files (pyproject.toml, pytest.ini, tox.ini, setup.cfg, and all conftest.py under tests/) contain no references to pytest-order/pytest_order, nor any marker registration for the order/destructive markers. Confirm CI installs the ordering plugin (or has equivalent marker registration/hook), otherwise this “runs last” safeguard may not work.
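
One way to handle the marker-registration part of this comment is a `pytest_configure` hook in a `conftest.py`. The sketch below assumes `destructive` is intended as a project-local marker, and the description string is invented for illustration. It does not enforce ordering: `order("last")` would still require the pytest-order plugin to be installed in CI.

```python
def pytest_configure(config) -> None:
    """Register the project-local marker so pytest does not flag it as unknown."""
    # The marker description below is illustrative, not taken from the repository.
    config.addinivalue_line(
        "markers",
        "destructive: test removes containers or images from the host",
    )
```

The `order` marker is deliberately not registered here. If the ordering plugin is missing, pytest's unknown-marker warning (or an error under `--strict-markers`) stays visible instead of being silenced.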

Comment on lines +201 to +215
for attempt in range(30):
    result = subprocess.run(
        [
            container_runtime,
            "inspect",
            "--format",
            "{{.State.Health.Status}}",
            managed_container,
        ],
        capture_output=True,
        text=True,
        timeout=5,
    )
    if "healthy" in result.stdout:
        return

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Match the health status exactly.

Line 214 uses "healthy" in result.stdout, which is also true for "unhealthy". A broken container can therefore satisfy this poll and mask a failed rollout. Compare the trimmed status to "healthy" instead.

Suggested fix
-            if "healthy" in result.stdout:
+            if result.stdout.strip() == "healthy":
                 return
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 for attempt in range(30):
     result = subprocess.run(
         [
             container_runtime,
             "inspect",
             "--format",
             "{{.State.Health.Status}}",
             managed_container,
         ],
         capture_output=True,
         text=True,
         timeout=5,
     )
-    if "healthy" in result.stdout:
+    if result.stdout.strip() == "healthy":
         return
🧰 Tools
🪛 ast-grep (0.43.0)

[error] 201-212: Use of unsanitized data to create processes
Context: subprocess.run(
    [
        container_runtime,
        "inspect",
        "--format",
        "{{.State.Health.Status}}",
        managed_container,
    ],
    capture_output=True,
    text=True,
    timeout=5,
)
Note: [CWE-78].

(os-system-unsanitized-data)


[error] 201-212: Command coming from incoming request
Context: subprocess.run(
    [
        container_runtime,
        "inspect",
        "--format",
        "{{.State.Health.Status}}",
        managed_container,
    ],
    capture_output=True,
    text=True,
    timeout=5,
)
Note: [CWE-20].

(subprocess-from-request)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around
lines 201 - 215, The health check is using a substring match and will treat
"unhealthy" as healthy; update the loop's condition to compare the trimmed
health status exactly by using result.stdout.strip() == "healthy" (i.e., after
calling subprocess.run with container_runtime and managed_container, check the
exact equality of result.stdout.strip() to "healthy" instead of using "healthy"
in result.stdout).
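The substring pitfall described in this comment can be reproduced without a container. The sketch below is a minimal illustration, not the PR's code: `is_healthy` and `wait_until_healthy` are hypothetical helper names, and `read_status` stands in for the `inspect` subprocess call so the polling logic can be exercised on its own.

```python
import time
from typing import Callable


def is_healthy(status_output: str) -> bool:
    """Exact match on the trimmed status; 'unhealthy' must not pass."""
    return status_output.strip() == "healthy"


def wait_until_healthy(
    read_status: Callable[[], str], attempts: int = 30, delay: float = 1.0
) -> bool:
    """Poll read_status() until it reports exactly 'healthy' or attempts run out."""
    for _ in range(attempts):
        if is_healthy(read_status()):
            return True
        time.sleep(delay)
    return False
```

With this shape, `"healthy" in "unhealthy\n"` evaluates to true while `is_healthy("unhealthy\n")` is false, which is the difference the suggested fix relies on.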

Comment on lines +222 to +245
def test_health_endpoint_responds_on_host(self):
    """Verify HTTP API accessibility from host without container-side curl."""
    url = "http://localhost:8321/v1/health"

    # Retry loop for network binding stabilization
    for attempt in range(5):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                body = response.read().decode("utf-8").lower()
                assert (
                    response.status == 200
                ), f"Health endpoint returned status {response.status}"
                assert (
                    "status" in body
                ), f"Health response missing 'status' field: {body}"
                return
        except (urllib.error.URLError, ConnectionError) as e:
            if attempt == 4:  # Last attempt
                pytest.fail(
                    f"Could not reach /v1/health from host machine after "
                    f"{attempt + 1} attempts. Last error: {e}"
                )
            time.sleep(1)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the unversioned health route here.

src/app/routers.py registers health.router outside the app-level /v1 prefix, but this test hard-codes http://localhost:8321/v1/health. That makes the suite validate a route the app does not appear to expose, and it also mismatches the Makefile healthcheck that this fixture depends on. Point both checks at the actual health URL instead of the versioned path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around
lines 222 - 245, The test test_health_endpoint_responds_on_host is calling the
versioned path /v1/health but health.router is registered at the unversioned
/health in src/app/routers.py (and the Makefile healthcheck expects the
unversioned route); update the URL variable in that test from
"http://localhost:8321/v1/health" to "http://localhost:8321/health" so the test
targets the actual registered health endpoint and aligns with the Makefile
healthcheck.
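Whichever route turns out to be the registered one, it may help to keep the path as a parameter so the test and the Makefile healthcheck can be pointed at the same value. The following is a rough sketch under that assumption: `probe_health` is a hypothetical helper, the default path `/health` simply follows this comment's reading of `src/app/routers.py`, and the explicit empty `ProxyHandler` is an added choice to keep a localhost probe from going through a configured proxy.

```python
import time
import urllib.error
import urllib.request


def probe_health(
    base_url: str, path: str = "/health", attempts: int = 5, delay: float = 1.0
) -> int:
    """Return the HTTP status for base_url + path, retrying on connection errors."""
    url = base_url.rstrip("/") + path
    # An empty ProxyHandler disables proxy lookup for this local probe.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    last_error = None
    for _ in range(attempts):
        try:
            with opener.open(url, timeout=5) as response:
                return response.status
        except urllib.error.HTTPError as exc:
            # The server answered, just not with 2xx (e.g. 404 for a wrong route).
            return exc.code
        except (urllib.error.URLError, ConnectionError) as exc:
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"Could not reach {url} after {attempts} attempts: {last_error}")
```

Returning the status code instead of asserting inside the helper lets a test distinguish "server not up yet" (retry, then `RuntimeError`) from "server up but route missing" (an immediate 404), which is the failure mode this comment is concerned about.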

Comment on lines +254 to +263
result = subprocess.run(
    [container_runtime, "port", managed_container],
    capture_output=True,
    text=True,
    timeout=5,
)
assert result.returncode == 0, "Failed to query port mappings"
assert (
    "8321" in result.stdout
), f"Port 8321 not found in port mappings: {result.stdout}"

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Assert the published host port, not just any 8321 substring.

podman/docker port output always includes the container-side port (8321/tcp), so this passes even when the host binding is wrong. Parse the mapping and assert that the published host port is 8321, which is the actual Makefile contract being tested.

Suggested fix
         assert result.returncode == 0, "Failed to query port mappings"
-        assert (
-            "8321" in result.stdout
-        ), f"Port 8321 not found in port mappings: {result.stdout}"
+        mapping = result.stdout.strip()
+        assert "->" in mapping, f"Unexpected port mapping format: {mapping}"
+        host_binding = mapping.split("->", 1)[1].strip()
+        assert host_binding.endswith(":8321") or host_binding == "8321", (
+            f"Expected host port 8321, got: {mapping}"
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around
lines 254 - 263, The test currently just searches for the substring "8321" in
result.stdout which can match the container-side port; change the assertion to
parse the port mapping line for the container port "8321/tcp" from the
subprocess result (the block that uses container_runtime, managed_container and
result), extract the published host endpoint (e.g. the right-hand side after
"->" such as "0.0.0.0:8321" or "127.0.0.1:8321") and assert that the host port
portion equals "8321" so the test verifies the published host port rather than
any occurrence of 8321.
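The parsing step described above can be sketched as a small pure function that handles multi-line output, for example when a runtime prints both an IPv4 and an IPv6 binding. This is an illustration rather than the PR's code: `published_host_ports` is a hypothetical name, and it assumes lines of the form `8321/tcp -> 0.0.0.0:8321`, matching the examples quoted in this comment.

```python
def published_host_ports(port_output: str, container_port: str = "8321/tcp") -> set[int]:
    """Collect host ports published for container_port from `port` output."""
    ports = set()
    for line in port_output.splitlines():
        left, sep, right = line.partition("->")
        if not sep or left.strip() != container_port:
            continue  # Skip blank lines and mappings for other container ports.
        # rpartition keeps IPv6 hosts such as "[::]:8321" intact.
        _, _, port_text = right.strip().rpartition(":")
        if port_text.isdigit():
            ports.add(int(port_text))
    return ports
```

A test could then assert `published_host_ports(result.stdout) == {8321}`, which fails when the host side is bound to a different port even though `8321/tcp` still appears on the left-hand side.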

Comment on lines +440 to +446
# Verify log file was created and is not empty
assert os.path.exists(
    target_log
), f"Container logs were not written to {target_log}"
assert (
    os.path.getsize(target_log) > 0
), "Log file was created but is empty"

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't require the saved log file to be non-empty.

remove-llama-stack-container only guarantees that runtime logs are written to /tmp/llama-stack-last-run.log. A quiet container still satisfies that contract but can legitimately produce a zero-byte file, so this assertion can fail on healthy runs and make the suite flaky. Checking creation or mtime is enough here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/container_lifecycle/test_container_lifecycle.py` around
lines 440 - 446, The test currently asserts the saved container log (variable
target_log) is non-empty, which can be flaky for quiet containers; remove the
os.path.getsize(target_log) > 0 assertion and keep the existing
os.path.exists(target_log) check. If a stronger guarantee is wanted, compare
os.path.getmtime(target_log) against a timestamp captured before teardown
rather than asserting that it is not None (which is always true for an existing
file), so the test only requires that the file was written during this run,
not that it has non-zero size.
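A creation-or-mtime check along these lines could look like the sketch below. The helper name `log_was_written` and the `tolerance` parameter are hypothetical; the tolerance is there because file timestamps can be coarser than `time.time()`, so a freshly written file may appear marginally older than a timestamp taken just before it.

```python
import os


def log_was_written(path: str, not_before: float, tolerance: float = 2.0) -> bool:
    """True if path exists and was modified at or after not_before (within tolerance).

    File size is deliberately ignored: a quiet container may leave an empty log.
    """
    if not os.path.exists(path):
        return False
    return os.path.getmtime(path) >= not_before - tolerance
```

In the test, `not_before` would be captured with `time.time()` immediately before invoking the teardown target, and the assertion becomes `assert log_was_written(target_log, not_before)`.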

@anik120 marked this pull request as draft June 12, 2026 12:12
@anik120 force-pushed the container-test branch 3 times, most recently from e827a60 to fc1f53a June 12, 2026 15:09
Add automated test suite for Llama Stack container build, startup, health
monitoring, configuration, and teardown operations. Tests verify Makefile
container orchestration targets work correctly across podman and docker.

Key features:
- Class-scoped managed container fixture
- Image ID-based idempotency verification (deterministic, container runtime cache-agnostic)
- Host-side HTTP health checks
- Parametrized mount point verification
- Destructive test ordering to prevent dev environment impact
- Proper cleanup of stale artifacts to prevent false positives

Test coverage:
- Build: Image creation and idempotency via SHA256 comparison
- Deployment: Container startup, health checks, port mapping, volume mounts
- Configuration: Custom port handling
- Teardown: Graceful stop, log persistence, full cleanup
- Error handling: Double-start replacement behavior

Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>