Merged
2 changes: 1 addition & 1 deletion .tool-versions
@@ -1,3 +1,3 @@
# SPDX-FileCopyrightText: 2025 INDUSTRIA DE DISEÑO TEXTIL S.A. (INDITEX S.A.)
# SPDX-License-Identifier: Apache-2.0
-golang 1.25.7
+golang 1.25.8
2 changes: 1 addition & 1 deletion Dockerfile
@@ -5,7 +5,7 @@
### Build stage

# Define the desired Golang version
-ARG GOLANG_VERSION=1.25.7
+ARG GOLANG_VERSION=1.25.8

# Use an official Golang image with a specific version based on Debian
FROM golang:${GOLANG_VERSION}-trixie AS builder
39 changes: 35 additions & 4 deletions Makefile
@@ -8,7 +8,7 @@ SHELL := /bin/bash
NAME := redkey-operator
VERSION := 0.1.0
ROBIN_VERSION := 0.1.0
-GOLANG_VERSION := 1.25.7
+GOLANG_VERSION := 1.25.8
DELVE_VERSION := 1.25

## Tool Versions
@@ -559,8 +559,8 @@ ginkgo:
$(GO) install github.com/onsi/ginkgo/v2/ginkgo


-TEST_PARALLEL_PROCESS ?= 4
-GOMAXPROCS ?= 4
+TEST_PARALLEL_PROCESS ?= 8
+GOMAXPROCS ?= 8
REDIS_IMAGE ?= redis:8.4.0
CHANGED_REDIS_IMAGE ?= redis:8.2.3

@@ -575,7 +575,7 @@ GINKGO_ENV ?= GOMAXPROCS=$(GOMAXPROCS) \

GINKGO_PACKAGES ?= ./test/e2e

-.PHONY: test-e2e
+.PHONY: install test-e2e
test-e2e: process-manifests-crd ginkgo ## Execute e2e application test
$(info $(M) running e2e tests...)
@mkdir -p $(dir $(TEST_E2E_OUTPUT))
@@ -588,3 +588,34 @@ test-e2e-cov: process-manifests-crd ginkgo ## Execute e2e application test with
$(GINKGO_ENV) ginkgo \
-cover -covermode=count -coverprofile=$(TEST_COVERAGE_PROFILE_OUTPUT_FILE) -output-dir=$(TEST_COVERAGE_PROFILE_OUTPUT) \
$(GINKGO_OPTS) $(GINKGO_PACKAGES)

##@ Chaos Testing

K6_IMG ?= localhost:5001/redkey-k6:dev
CHAOS_ITERATIONS ?= 10
CHAOS_SEED ?=
# With 10 iterations, 60m is not enough when TEST_PARALLEL_PROCESS=8 and GOMAXPROCS=8
CHAOS_TIMEOUT ?= 100m
CHAOS_PACKAGES ?= ./test/chaos
CHAOS_TEST_OUTPUT = .local/chaos-test.json
# CHAOS_KEEP_NAMESPACE_ON_FAILED=1 # when non-empty, the namespace of a failed spec is not deleted
.PHONY: k6-build
k6-build: ## Build k6 image with xk6-redis extension
$(info $(M) building k6 docker image with redis extension)
docker build --build-arg GOLANG_VERSION=$(GOLANG_VERSION) -t $(K6_IMG) -f test/chaos/k6.Dockerfile test/chaos

.PHONY: k6-push
k6-push: k6-build ## Push k6 image to local registry
$(info $(M) pushing k6 image)
docker push $(K6_IMG)

.PHONY: test-chaos
test-chaos: process-manifests-crd ginkgo ## Execute chaos tests
$(info $(M) running chaos tests...)
@mkdir -p $(dir $(CHAOS_TEST_OUTPUT))
$(GINKGO_ENV) K6_IMG=$(K6_IMG) CHAOS_ITERATIONS=$(CHAOS_ITERATIONS) \
$(if $(CHAOS_SEED),CHAOS_SEED=$(CHAOS_SEED),) \
ginkgo \
--timeout=$(CHAOS_TIMEOUT) \
--json-report=$(CHAOS_TEST_OUTPUT) \
$(GINKGO_OPTS) $(CHAOS_PACKAGES)
2 changes: 1 addition & 1 deletion README.md
@@ -111,7 +111,7 @@ Contributions are welcome! Please read our [contributing guidelines](./CONTRIBUT

## Versions

-- Go version (https://github.com/golang/go): v1.25.7
+- Go version (https://github.com/golang/go): v1.25.8
- Operator SDK version (https://github.com/operator-framework/operator-sdk): v1.42.0
- Kubernetes Controller Tools version (https://github.com/kubernetes-sigs/controller-tools): v0.18.0

1 change: 0 additions & 1 deletion config/crd/bases/redkey.inditex.dev_redkeyclusters.yaml
@@ -1,6 +1,5 @@
# SPDX-FileCopyrightText: 2025 INDUSTRIA DE DISEÑO TEXTIL S.A. (INDITEX S.A.)
# SPDX-License-Identifier: Apache-2.0

---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
2 changes: 1 addition & 1 deletion debug.Dockerfile
@@ -3,7 +3,7 @@
# SPDX-License-Identifier: Apache-2.0

# Define the desired Golang version
-ARG GOLANG_VERSION=1.25.7
+ARG GOLANG_VERSION=1.25.8

# Use an official Golang image with a specific version based on Debian
FROM golang:${GOLANG_VERSION}-trixie
59 changes: 59 additions & 0 deletions docs/developer-guide/development-guide.md
@@ -160,6 +160,65 @@ Or run only an E2E test:
make test-e2e GINKGO_EXTRA_OPTS='--focus="sets and clears custom labels"'
```

### Chaos tests

Chaos tests validate operator resilience under disruptive conditions: random pod
deletions, scaling, operator restarts, and topology corruption — all while the
cluster is under k6 write/read load. They live in `test/chaos/` and run via:

```shell
make test-chaos
```

Run a single scenario:

```shell
make test-chaos GINKGO_EXTRA_OPTS='--focus="survives continuous scaling"'
```

#### Chaos test environment variables

The chaos suite runs multiple Ginkgo processes in parallel, each creating its
own isolated namespace with an operator, Robin, and a Redis cluster. The
following variables control test behavior and should be set in your shell or
`.envrc` before running `make test-chaos`:

```shell
export TEST_PARALLEL_PROCESS=8
export GOMAXPROCS=8
export CHAOS_KEEP_NAMESPACE_ON_FAILED=true
export IMG_ROBIN=localhost:5001/redkey-robin:0.1.0
```

| Variable | Default | Description |
|---------------------------------|----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `TEST_PARALLEL_PROCESS` | `8` | Number of parallel Ginkgo processes (`-procs`). Each process runs a separate test spec in its own namespace. Higher values run more specs concurrently but require more cluster resources. This also applies to E2E tests. |
| `GOMAXPROCS` | `8` | Go runtime parallelism. Should match `TEST_PARALLEL_PROCESS` so each Ginkgo process has a dedicated OS thread. Setting this lower than `TEST_PARALLEL_PROCESS` causes goroutine contention; setting it higher wastes CPU without benefit. |
| `CHAOS_ITERATIONS` | `10` | Number of chaos loop iterations per test spec. Each iteration performs disruptive actions (scale, delete pods, etc.) and waits for recovery. More iterations increase coverage but extend the total run time proportionally. |
| `CHAOS_TIMEOUT` | `100m` | Maximum wall-clock time Ginkgo allows for the entire chaos suite (`--timeout`). Must be large enough to accommodate `CHAOS_ITERATIONS` x recovery time x number of specs / `TEST_PARALLEL_PROCESS`. With 10 iterations and 8 parallel processes, 100 minutes is typically sufficient. |
| `CHAOS_SEED` | *(auto: Ginkgo random seed)* | Fixed random seed for reproducibility. When a chaos run fails, the seed is printed in the output so you can replay the exact sequence of random actions. |
| `CHAOS_KEEP_NAMESPACE_ON_FAILED`| *(unset)* | When set to any non-empty value, failed test namespaces are preserved instead of deleted. This allows post-mortem inspection of pods, logs, and cluster state with `kubectl`. Remember to clean up namespaces manually afterwards. |
| `IMG_ROBIN` | `ghcr.io/inditextech/redkey-robin:$(ROBIN_VERSION)` | Robin sidecar image. For local development, point this to your local registry (e.g. `localhost:5001/redkey-robin:0.1.0`). Passed to tests as `ROBIN_IMAGE` via `GINKGO_ENV`. |
| `K6_IMG` | `localhost:5001/redkey-k6:dev` | k6 load generator image (built with xk6-redis extension). Build it with `make k6-build` before running chaos tests. |
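
Because `CHAOS_KEEP_NAMESPACE_ON_FAILED` is documented as enabled by any non-empty value, values such as `false` or `0` would also keep failed namespaces. The sketch below illustrates that rule in plain shell; `keep_namespace_on_failure` and `describe_cleanup` are hypothetical helpers written for this example and are not taken from the test suite, which may implement the check differently.

```shell
# Hypothetical helper: true when the variable is set to any non-empty value,
# mirroring the rule documented in the table above.
keep_namespace_on_failure() {
  [ -n "${CHAOS_KEEP_NAMESPACE_ON_FAILED:-}" ]
}

# Hypothetical helper: prints what would happen to a failed spec's namespace.
describe_cleanup() {
  if keep_namespace_on_failure; then
    echo "keep"
  else
    echo "delete"
  fi
}
```

Under this reading, the way to turn the behavior off is to unset the variable (`unset CHAOS_KEEP_NAMESPACE_ON_FAILED`) rather than to assign it a false-looking value.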

#### Relationship between parallelism and timeouts

`TEST_PARALLEL_PROCESS` controls how many test specs run concurrently. The chaos
suite has 8 specs (4 scenarios x 2 `purgeKeysOnRebalance` modes), so with
`TEST_PARALLEL_PROCESS=8` all specs run in parallel and the total wall-clock
time equals roughly the duration of the slowest single spec.

`CHAOS_TIMEOUT` must account for the worst case: if one spec takes longer than
expected (e.g. slow recovery after a scale-down), the timeout must be generous
enough to avoid killing a spec mid-recovery. As a rule of thumb:

- With `CHAOS_ITERATIONS=10` and `TEST_PARALLEL_PROCESS=8`: `CHAOS_TIMEOUT=100m`
- With `CHAOS_ITERATIONS=5` and `TEST_PARALLEL_PROCESS=4`: `CHAOS_TIMEOUT=60m`
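
For other combinations, a rough estimate can be derived from the formula in the table (`CHAOS_ITERATIONS` x recovery time x number of specs / `TEST_PARALLEL_PROCESS`). The following is a minimal sketch, not part of the Makefile: `estimate_chaos_minutes` is a hypothetical helper, the minutes-per-iteration values are placeholders to be replaced with figures measured on your own cluster, and rounding the number of spec "waves" up is an assumption added here.

```shell
# Rough wall-clock estimate in minutes (illustrative sketch only).
#   $1 iterations            (CHAOS_ITERATIONS)
#   $2 minutes per iteration (assumed value; measure on your cluster)
#   $3 number of specs       (8 in the chaos suite)
#   $4 parallel processes    (TEST_PARALLEL_PROCESS)
estimate_chaos_minutes() (
  iterations=$1
  per_iteration=$2
  specs=$3
  procs=$4
  # Specs run in waves of at most $procs; round the wave count up.
  waves=$(( (specs + procs - 1) / procs ))
  echo $(( iterations * per_iteration * waves ))
)

estimate_chaos_minutes 10 8 8 8   # prints 80
estimate_chaos_minutes 5 6 8 4    # prints 60
```

Treat the result as a lower bound and set `CHAOS_TIMEOUT` above it, so that a spec recovering slowly is not killed mid-recovery.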

`GOMAXPROCS` should always match `TEST_PARALLEL_PROCESS`. Each Ginkgo process
creates its own Kubernetes clients with independent rate limiters (QPS=5,
Burst=10), so they don't contend on API access — but they do share CPU.
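
A small pre-flight check can catch a mismatch between the two variables before a long run starts. This is a sketch only: `check_parallelism` is a hypothetical helper, and the fallback value of 8 is taken from the Makefile defaults shown in this change.

```shell
# Hypothetical pre-flight check: fail when the two settings differ.
# Unset variables fall back to 8, the Makefile default in this change.
check_parallelism() {
  if [ "${TEST_PARALLEL_PROCESS:-8}" -ne "${GOMAXPROCS:-8}" ]; then
    echo "mismatch: TEST_PARALLEL_PROCESS=${TEST_PARALLEL_PROCESS:-8} GOMAXPROCS=${GOMAXPROCS:-8}" >&2
    return 1
  fi
  echo "ok"
}

# Example: only start the suite when the settings agree.
# check_parallelism && make test-chaos
```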

## How to test the operator with CRC and operator-sdk locally (OLM deployment)

These commands deploy the Redkey Operator with OLM to an OC cluster in a local environment