-
Notifications
You must be signed in to change notification settings - Fork 632
Add database upgrade and rollback compatibility tests #2084
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
a28a05e
273e39b
4cd602b
1065890
f449f5f
b5930fc
bf230a8
c686f0b
6e24c50
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| name: Upgrade Test Setup | ||
|
|
||
| description: >- | ||
| Shared prelude for the upgrade-tests and rolling-upgrade-tests jobs: resolve | ||
| the upgrade-from version (with the prev-stable == adjacent skip) and bring up | ||
| the build/cluster toolchain. Exposes the resolved version and skip flag so the | ||
| caller can gate its test step. The caller MUST run actions/checkout (with | ||
| fetch-depth: 0 + fetch-tags) before this action — a local action is loaded | ||
| from the checked-out workspace and the version resolvers need full history. | ||
|
|
||
| inputs: | ||
| upgrade-from: | ||
| description: 'Which release to upgrade from: "adjacent" or "prev-stable".' | ||
| required: true | ||
| buildx-builder-name: | ||
| description: Name for the docker/setup-buildx-action builder. | ||
| required: true | ||
| buildx-version: | ||
| description: Buildx version for docker/setup-buildx-action. | ||
| required: true | ||
|
|
||
| outputs: | ||
| skip: | ||
| description: '"true" when this leg is redundant (prev-stable == adjacent) and the caller should skip its test step.' | ||
| value: ${{ steps.resolve.outputs.skip }} | ||
| version: | ||
| description: The resolved upgrade-from version (empty when skip is true). | ||
| value: ${{ steps.resolve.outputs.version }} | ||
|
|
||
| runs: | ||
| using: "composite" | ||
| steps: | ||
| # The caller must run actions/checkout (fetch-depth: 0 + fetch-tags) before | ||
| # this action: a local action is loaded from the checked-out workspace, so | ||
| # its files don't exist on the runner until checkout has run. | ||
| - name: Resolve upgrade-from version | ||
| id: resolve | ||
| shell: bash | ||
| run: | | ||
| ADJ="$(./scripts/upgrade-from-version.sh)" | ||
| if [ "${{ inputs.upgrade-from }}" = "prev-stable" ]; then | ||
| V="$(./scripts/prev-stable-version.sh)" | ||
| if [ -z "$V" ]; then | ||
| echo "no stable line below the current line; skipping prev-stable leg." | ||
| echo "skip=true" >> "$GITHUB_OUTPUT" | ||
| exit 0 | ||
| fi | ||
| if [ "$V" = "$ADJ" ]; then | ||
| echo "prev-stable ($V) == adjacent; skipping (covered by the adjacent leg)." | ||
| echo "skip=true" >> "$GITHUB_OUTPUT" | ||
| exit 0 | ||
| fi | ||
| else | ||
| V="$ADJ" | ||
| fi | ||
| echo "Upgrade-from target: $V" | ||
| echo "version=$V" >> "$GITHUB_OUTPUT" | ||
| - name: Initialize Environment | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| uses: ./.github/actions/initialize-environment | ||
| - name: Allow unprivileged user namespaces | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| shell: bash | ||
| run: | | ||
| sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0 || true | ||
| - name: Set up QEMU | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| uses: docker/setup-qemu-action@v4 | ||
| with: | ||
| platforms: linux/amd64,linux/arm64 | ||
| - name: Set up Docker Buildx | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| uses: docker/setup-buildx-action@v4 | ||
| with: | ||
| name: ${{ inputs.buildx-builder-name }} | ||
| version: ${{ inputs.buildx-version }} | ||
| platforms: linux/amd64,linux/arm64 | ||
| use: "true" | ||
| driver-opts: network=host | ||
| - name: Set up Helm | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| uses: azure/setup-helm@v5.0.0 | ||
| with: | ||
| version: v3.18.0 | ||
| - name: Install Kind | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| uses: helm/kind-action@ef37e7f390d99f746eb8b610417061a60e82a6cc | ||
| with: | ||
| install_only: true | ||
| - name: Create Kind cluster | ||
| if: steps.resolve.outputs.skip != 'true' | ||
| shell: bash | ||
| run: | | ||
| make create-kind-cluster | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -494,6 +494,100 @@ helm-uninstall: ## Uninstall kagent and kagent-crds Helm releases from the kind | |
| helm uninstall kagent --namespace kagent --kube-context kind-$(KIND_CLUSTER_NAME) --wait | ||
| helm uninstall kagent-crds --namespace kagent --kube-context kind-$(KIND_CLUSTER_NAME) --wait | ||
|
|
||
| # Upgrade test targets install the previous released kagent chart from the public | ||
| # OCI registry, build the current images, then run the e2e assertions in | ||
| # go/core/test/e2e/upgrade. The Go test performs the actual upgrade to the current | ||
| # build by invoking `make helm-install-provider`. UPGRADE_FROM_VERSION defaults to | ||
| # the latest release reachable from HEAD (scripts/upgrade-from-version.sh); CI runs | ||
| # this against two targets via a matrix — that adjacent release and the previous | ||
| # stable line's latest patch (scripts/prev-stable-version.sh) — and you can pin | ||
| # either locally, e.g. `UPGRADE_FROM_VERSION=$$(./scripts/prev-stable-version.sh)`. | ||
| # The previous install pins the bundled Postgres image to whatever the | ||
| # upgrade-from release's own install target shipped (see PREV_DB_SET_FLAGS), so | ||
| # the baseline matches how that release actually runs rather than a hardcoded | ||
| # guess; the upgrade then exercises the real app/migration (and any DB image) | ||
| # change between that release and the current build. | ||
| # | ||
| # Prerequisite (provided by CI as a separate step; run it locally first): a kind | ||
| # cluster (make create-kind-cluster). agent-sandbox is not required — the | ||
| # controller tolerates the missing CRD and these tests create no SandboxAgents. | ||
| UPGRADE_FROM_VERSION ?= $(shell ./scripts/upgrade-from-version.sh) | ||
|
|
||
| # The bundled-Postgres image is selected by the install target's --set flags, not | ||
| # by the chart defaults (the chart ships a non-vector image). So the previous | ||
| # install must use the exact pins the upgrade-from release shipped — otherwise the | ||
| # baseline DB would differ from how that release actually runs, and the upgrade | ||
| # would conflate a DB swap with the migration change under test. Read those flags | ||
| # straight from that release's own helm-install-provider target (via its tagged | ||
| # Makefile) instead of hardcoding values that drift as the bundled image changes. | ||
| # Assumes the flags are literal (no make/env variables); the guard in | ||
| # install-previous-release fails loudly if they can't be read. | ||
| PREV_DB_SET_FLAGS = $(shell git show v$(UPGRADE_FROM_VERSION):Makefile 2>/dev/null | \ | ||
|
iplay88keys marked this conversation as resolved.
Comment on lines
+514
to
+525
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This means that these will be called every time |
||
| grep -oE '\-\-set[[:space:]]+database\.postgres\.[^[:space:]\\]+') | ||
|
|
||
| .PHONY: install-previous-release | ||
| install-previous-release: ## Install the previous released kagent + kagent-crds charts from the public OCI registry | ||
| test -n "$(UPGRADE_FROM_VERSION)" || { echo "UPGRADE_FROM_VERSION is empty; set it explicitly or ensure git tags are fetched." >&2; exit 1; } | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is |
||
| test -n "$(strip $(PREV_DB_SET_FLAGS))" || { echo "Could not read bundled-Postgres --set flags from v$(UPGRADE_FROM_VERSION):Makefile; the upgrade-from release's install target may have moved or renamed them." >&2; exit 1; } | ||
| case '$(PREV_DB_SET_FLAGS)' in *'$$'*|*'{'*|*'('*) echo "Bundled-Postgres --set flags from v$(UPGRADE_FROM_VERSION):Makefile contain an unexpanded variable and cannot be passed to helm verbatim: $(PREV_DB_SET_FLAGS)" >&2; exit 1;; esac | ||
| @echo "=== Installing previous release: $(UPGRADE_FROM_VERSION) ===" | ||
| @echo " bundled-Postgres flags (from v$(UPGRADE_FROM_VERSION) install target): $(PREV_DB_SET_FLAGS)" | ||
| helm upgrade --install kagent-crds $(HELM_REPO)/kagent/helm/kagent-crds \ | ||
| --version $(UPGRADE_FROM_VERSION) \ | ||
| --namespace kagent --create-namespace \ | ||
| --kube-context kind-$(KIND_CLUSTER_NAME) \ | ||
| --timeout 5m --wait | ||
| helm upgrade --install kagent $(HELM_REPO)/kagent/helm/kagent \ | ||
| --version $(UPGRADE_FROM_VERSION) \ | ||
| --namespace kagent --create-namespace \ | ||
| --kube-context kind-$(KIND_CLUSTER_NAME) \ | ||
| --timeout 5m --wait \ | ||
| --set ui.service.type=LoadBalancer \ | ||
| --set controller.service.type=LoadBalancer \ | ||
| --set providers.default=openAI \ | ||
| --set providers.openAI.apiKey="$${OPENAI_API_KEY:-test}" \ | ||
| $(PREV_DB_SET_FLAGS) \ | ||
| $(UPGRADE_PREV_EXTRA_ARGS) | ||
|
|
||
| # run-upgrade-tests installs the previous release, builds the current images, and | ||
| # runs the DB-layer upgrade scenario in TestUpgrade: seed -> upgrade -> controller | ||
| # rollout (no crash) -> data survival -> schema-equivalence (upgraded == clean | ||
| # install) -> reverse schema to target (down files) + data survival. | ||
| # Prerequisite (provided by CI as a separate step; run it locally first): a kind | ||
| # cluster (make create-kind-cluster). The controller tolerates the missing | ||
| # agent-sandbox CRD (the owned-resource watch is skipped), and these tests create | ||
| # no SandboxAgents, so agent-sandbox is not required. | ||
| .PHONY: run-upgrade-tests | ||
| run-upgrade-tests: ## Install the previous release, build current images, and run the upgrade test (migration round-trip) | ||
| test -n "$(UPGRADE_FROM_VERSION)" || { echo "UPGRADE_FROM_VERSION is empty; set it explicitly or ensure git tags are fetched." >&2; exit 1; } | ||
| $(MAKE) build | ||
| $(MAKE) install-previous-release | ||
|
Comment on lines
+563
to
+564
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why are these not dependency targets? calling make inside of make seems weird |
||
| @echo "=== Upgrade test: $(UPGRADE_FROM_VERSION) -> $(VERSION) (registry=$(DOCKER_REGISTRY)) ===" | ||
| cd go && \ | ||
| RUN_UPGRADE_TESTS=true \ | ||
| REPO_ROOT=$(CURDIR) \ | ||
| UPGRADE_FROM_VERSION=$(UPGRADE_FROM_VERSION) \ | ||
| VERSION=$(VERSION) \ | ||
| DOCKER_REGISTRY=$(DOCKER_REGISTRY) \ | ||
| KIND_CLUSTER_NAME=$(KIND_CLUSTER_NAME) \ | ||
| OPENAI_API_KEY="$${OPENAI_API_KEY:-test}" \ | ||
| go test ./core/test/e2e/upgrade -run TestUpgrade -count=1 -timeout=20m -v | ||
|
|
||
| .PHONY: run-rolling-upgrade-tests | ||
| run-rolling-upgrade-tests: ## Install the previous release with 2 controller replicas, build the current images, and run the rolling upgrade e2e test | ||
| $(MAKE) build | ||
| $(MAKE) install-previous-release UPGRADE_PREV_EXTRA_ARGS="--set controller.replicas=2" | ||
|
Comment on lines
+578
to
+579
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same comment about calling |
||
| @echo "=== Rolling upgrade test: $(UPGRADE_FROM_VERSION) -> $(VERSION) (registry=$(DOCKER_REGISTRY)) ===" | ||
| cd go && \ | ||
| RUN_ROLLING_UPGRADE_TESTS=true \ | ||
| REPO_ROOT=$(CURDIR) \ | ||
| UPGRADE_FROM_VERSION=$(UPGRADE_FROM_VERSION) \ | ||
| VERSION=$(VERSION) \ | ||
| DOCKER_REGISTRY=$(DOCKER_REGISTRY) \ | ||
| KIND_CLUSTER_NAME=$(KIND_CLUSTER_NAME) \ | ||
| OPENAI_API_KEY="$${OPENAI_API_KEY:-test}" \ | ||
| go test ./core/test/e2e/upgrade -run TestRollingUpgradeCompatibility -count=1 -timeout=20m -v | ||
|
|
||
| .PHONY: helm-publish | ||
| helm-publish: ## Package and push all Helm charts to the OCI registry | ||
| helm-publish: helm-version | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these required inputs? Can't they be static?