jepsen(dynamodb): linearizable register tests for every attribute type #615
Conversation
The existing DynamoDB Jepsen workload only exercises the List type via list-append. Add a complementary workload that runs a Knossos linearizable-register check (PutItem + ConsistentRead GetItem) for each of the 10 attribute types elastickv supports: S, N, B, BOOL, NULL, SS, NS, BS, L, M. Each type writes to its own table and uses a per-type encoder/decoder that canonicalises the value (sorted sets, byte arrays as int vecs) so register equality is reliable. Wired into the Jepsen CI workflows and the local run-jepsen-local.sh script.
📝 Walkthrough

This pull request introduces a comprehensive Jepsen workload for validating DynamoDB linearizability across attribute types (string, number, binary, boolean, null, and their set/list/map variants). The implementation includes per-type table isolation, encode/decode canonicalization, read/write operation generation, and integration into existing test workflows and CI pipelines.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Gen as Jepsen Generator
    participant Client as DynamoDB Client
    participant DDB as DynamoDB
    participant Checker as Linearizability Checker
    rect rgba(100, 150, 200, 0.5)
        Note over Gen, Checker: Write Phase
        Gen->>Client: Generate PutItem(key, encoded_value)
        Client->>DDB: PutItem request
        DDB-->>Client: Success/Error response
        Client->>Checker: Record write operation & response
    end
    rect rgba(150, 150, 100, 0.5)
        Note over Gen, Checker: Read Phase
        Gen->>Client: Generate GetItem(key, ConsistentRead=true)
        Client->>DDB: GetItem request
        DDB-->>Client: Item value (if exists)
        Client->>Client: Decode value
        Client->>Checker: Record read operation & decoded result
    end
    rect rgba(100, 200, 100, 0.5)
        Note over Checker: Validation
        Checker->>Checker: Validate linearizability of read/write history
        Checker-->>Checker: Return valid/invalid result
    end
```
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 passed
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eafa5da3e8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```clojure
:bool {:table "jepsen_types_bool"
       :gen (fn [i] (odd? i))
       :encode (fn [v] {:BOOL v})
       :decode (fn [a] (boolean (:BOOL a)))}
```
Preserve nil when decoding BOOL attributes
Decoding with boolean collapses malformed/missing BOOL payloads to false, which is also a valid register value in this workload. That can mask server-side type/shape errors as successful reads and weaken the linearizability signal for the bool type. Returning (:BOOL a) (or explicitly validating type) preserves nil for invalid payloads instead of silently turning them into a legitimate value.
```yaml
timeout 240 ~/lein run -m elastickv.dynamodb-types-workload --local \
  --time-limit ${{ inputs.time-limit || '60' }} \
  --rate ${{ inputs.rate || '5' }} \
```
Align timeout wrapper with configurable runtime
This command wraps each per-type run in timeout 240 while also passing a user-configurable --time-limit. In this workflow, time-limit is exposed as an input (default 300 for workflow_dispatch), so runs that legitimately need more than 240 seconds will be terminated by timeout (exit 124) before the requested Jepsen duration completes. The wrapper and input limits should be made consistent to avoid false CI failures during longer stress runs.
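A minimal sketch of the alignment this comment asks for, assuming the workflow exports its time-limit input as an environment variable (`TYPE_TL` and the 180-second buffer are illustrative names/values, not the workflow's actual ones): derive the wrapper `timeout` from the requested run length plus a fixed buffer, so the two can never race.

```shell
# TYPE_TL is an assumed variable mirroring the workflow's time-limit input;
# the 180-second buffer (also an assumption) leaves room for JVM startup and
# Knossos analysis after the Jepsen run itself ends.
TYPE_TL="${TYPE_TL:-60}"
PER_TYPE_TIMEOUT=$(( TYPE_TL + 180 ))

echo "per-type timeout: ${PER_TYPE_TIMEOUT}s"
# timeout "${PER_TYPE_TIMEOUT}" ~/lein run -m elastickv.dynamodb-types-workload \
#   --local --time-limit "${TYPE_TL}" --rate 5
```

With this shape, raising the time-limit input automatically raises the outer guard, so `timeout` can only fire when the run has genuinely overrun its budget.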
Code Review
This pull request introduces a new Jepsen workload, dynamodb-types-workload, designed to verify the linearizability of all supported DynamoDB attribute types. The implementation includes a dedicated Jepsen client, comprehensive unit tests for data encoding and decoding, and updates to the local test script to automate testing across all types. Review feedback suggests enhancing error handling by explicitly recognizing more server-side exceptions and making the provisioned throughput for test tables configurable to prevent performance bottlenecks.
```clojure
(contains? #{"InternalServerError"} err-type)
(assoc op :type :info :error (str err-type))
```
The check for InternalServerError is quite narrow. DynamoDB-compatible APIs may return other 5xx error types such as ServiceUnavailableException, ThrottlingException, or ProvisionedThroughputExceededException. While the else branch at line 268 safely maps unknown errors to :info (indeterminate), explicitly including common retryable server-side errors improves the clarity of the test history.
Suggested change:

```clojure
;; before
(contains? #{"InternalServerError"} err-type)
(assoc op :type :info :error (str err-type))

;; after
(contains? #{"InternalServerError" "ServiceUnavailableException" "ThrottlingException" "ProvisionedThroughputExceededException"} err-type)
(assoc op :type :info :error (str err-type))
```
```clojure
{:TableName table
 :KeySchema [{:AttributeName pk-attr :KeyType "HASH"}]
 :AttributeDefinitions [{:AttributeName pk-attr :AttributeType "S"}]
 :ProvisionedThroughput {:ReadCapacityUnits 5 :WriteCapacityUnits 5}}
```
The ProvisionedThroughput values are hardcoded to 5/5. While this is likely sufficient for a smoke test, it may cause ProvisionedThroughputExceededException if the workload is run with high concurrency or a high rate. Consider making these values configurable via CLI options or increasing the defaults to avoid artificial bottlenecks during stress testing.
- BOOL decoder: return `(:BOOL a)` instead of `(boolean (:BOOL a))` so a nil/malformed payload no longer silently collapses to false and confuses the register checker (codex).
- Expand the retryable DynamoDB error set to include ServiceUnavailableException, ThrottlingException, and ProvisionedThroughputExceededException alongside InternalServerError (gemini).
- Make test-table ProvisionedThroughput configurable and raise the default to 100/100 so real-DynamoDB stress runs do not self-throttle (gemini).
- Scheduled CI: decouple the per-type sweep time-limit from the workflow_dispatch input and derive the inner `timeout` from it, so the outer guard never races a user-bumped runtime (codex).
Reviewed commit: 51e452dae7
```clojure
(when-not (= "ResourceInUseException" (:type (ex-data e)))
  (throw e))
```
Clear existing type tables before starting register history
By swallowing ResourceInUseException here, reruns against a persistent endpoint reuse old table contents without any cleanup. In contexts like local --no-cluster runs or shared DynamoDB endpoints, the generator can issue reads before this run has written a key, so stale values from prior runs appear in the current history and can produce spurious linearizability failures (or otherwise distort the experiment signal).
If the test table already exists (ResourceInUseException path) and holds items from a previous run, reads issued before this run's first write on a key could observe those stale items, distorting the register history and producing spurious linearizability failures. This is especially relevant for --no-cluster local runs or shared DynamoDB endpoints where data outlives a single invocation. setup! now issues DeleteItem for each key the run will touch after CreateTable. DeleteItem on a missing key is a no-op in DynamoDB, so this is safe on the first run too. Addresses PR #615 codex review feedback.
….0.1:42276/git/bootjp/elastickv into claude/dynamodb-jepsen-all-types-FWjD0
Reviewed commit: ce4688ab4d
```yaml
# timeout regardless of the workflow_dispatch time-limit input.
# The per-invocation `timeout` is derived from TYPE_TL + buffer
# so bumping TYPE_TL never races against the outer timeout.
timeout-minutes: 30
```
Scale step timeout to cover full per-type loop
The new scheduled per-type sweep can be terminated before finishing all 10 attribute types because the outer step budget is timeout-minutes: 30 while each iteration allows up to PER_TYPE_TIMEOUT = TYPE_TL + 180 (currently 240s), i.e. a worst-case 40 minutes for the loop. In slow or degraded runs where several type jobs approach their per-invocation timeout, GitHub Actions will kill the step early and produce partial coverage/false failures even though each individual invocation is still within its configured limit.
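The arithmetic behind this finding can be checked directly. A hedged sketch, taking the 180-second buffer and the 10-type count from the comment above (the variable names are assumptions, not the workflow's actual identifiers):

```shell
TYPE_TL="${TYPE_TL:-60}"        # assumed per-type Jepsen time limit, seconds
NUM_TYPES=10                    # S, N, B, BOOL, NULL, SS, NS, BS, L, M
PER_TYPE_TIMEOUT=$(( TYPE_TL + 180 ))

# Worst case: every type runs right up to its own timeout.
# Round up to whole minutes for the GitHub Actions timeout-minutes knob.
WORST_CASE_MIN=$(( (PER_TYPE_TIMEOUT * NUM_TYPES + 59) / 60 ))
echo "timeout-minutes should be >= ${WORST_CASE_MIN}"
```

At the default 60-second time limit this already yields a 40-minute worst case, which is why a fixed `timeout-minutes: 30` can kill the sweep mid-loop.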
🧹 Nitpick comments (3)

scripts/run-jepsen-local.sh (1)

90-131: LGTM — EXIT_CODE tracking is correct. The `set +e`/`set -e` pairing cleanly captures each stage's exit code without short-circuiting the sweep, and `EXIT_CODE` ends up non-zero whenever any stage failed (individual failures still get logged). The final exit-code check preserves useful signal for CI.

One minor note: the `TYPES=(...)` array is now duplicated in four places (this script plus two workflows plus `value-type-keys` in the workload). Not worth restructuring today, but worth keeping in mind if a new type is ever added — all four lists must be updated together.

jepsen/test/elastickv/dynamodb_types_workload_test.clj (1)

28-35: Close the AWS client to avoid resource leaks across test runs. `client/open!` calls `(make-ddb-client ...)`, which creates a cognitect aws-api client backed by a pooled HTTP client and async threads. Without `close!`, each run of this test leaks those resources (they won't be reaped until JVM exit), which compounds when running the test namespace repeatedly in a REPL or in a larger suite.

♻️ Proposed fix

```clojure
(deftest host-override-creates-client
  (let [test-map (workload/elastickv-dynamodb-types-test
                   {:value-type  :string
                    :dynamo-host "127.0.0.1"
                    :node->port  {"n1" 8000 "n2" 8001}})
        c      (:client test-map)
        opened (client/open! c test-map "n1")]
    (try
      (is (some? (:ddb opened)))
      (finally
        (client/close! opened test-map)))))
```

jepsen/src/elastickv/dynamodb_types_workload.clj (1)

211-223: `clear-keys!` runs per-node during setup — consider guarding against transient errors on real DynamoDB. In local/elastickv this is effectively instant, but when pointed at a real DynamoDB endpoint after `CreateTable` succeeds from another node with `ResourceInUseException`, the table may briefly be in `CREATING` state. A `DeleteItem` against a not-yet-`ACTIVE` table can surface `ResourceNotFoundException` and fail `setup!` for that node.

Not urgent (the retryable classification configured in `invoke!` already covers steady-state runs), but worth a small retry or a `describe-table` gate if you expect to point this workload at real DynamoDB later.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@jepsen/src/elastickv/dynamodb_types_workload.clj`:
- Around line 211-223: The clear-keys! function calls ddb-invoke! :DeleteItem
immediately during per-node setup and can fail if the table is still CREATING;
wrap each DeleteItem call with a short retry/backoff (or precede the loop with a
DescribeTable poll until Status == "ACTIVE") to tolerate transient
ResourceNotFoundException/ResourceInUseException; specifically modify
clear-keys! to either (a) call DescribeTable via ddb-invoke! and wait for
"ACTIVE" before the doseq, or (b) catch/retry ddb-invoke! :DeleteItem a few
times with small sleeps/backoff on retryable errors, so setup! won't fail when
pointing at real DynamoDB.
In `@jepsen/test/elastickv/dynamodb_types_workload_test.clj`:
- Around line 28-35: The test host-override-creates-client currently opens an
AWS DDB client (via client/open! -> make-ddb-client) but never closes it; wrap
the usage of opened (from client/open!) in a try/finally (or equivalent test
cleanup) and call the appropriate close function (e.g., client/close! or close!
on the :ddb entry) in the finally to ensure the cognitect aws-api client and its
pooled HTTP threads are shutdown after the test; update the
host-override-creates-client test to always invoke the close on (:ddb opened) so
resources are not leaked.
In `@scripts/run-jepsen-local.sh`:
- Around line 90-131: The TYPES array is duplicated across this script, two
workflows, and the workload's value-type-keys; to avoid drift when adding a new
DynamoDB type, centralize the list: extract the TYPES definition into a single
source (e.g., an environment variable or a small shared script that exports
TYPES) and update this script to source that shared symbol instead of declaring
its own TYPES, and update the other workflows/workload to read the same shared
symbol (reference: TYPES and value-type-keys).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b110f4ff-88fa-4eb0-9e83-a3e678a8d960
📒 Files selected for processing (6)
- .github/workflows/jepsen-test-scheduled.yml
- .github/workflows/jepsen-test.yml
- jepsen/src/elastickv/dynamodb_types_workload.clj
- jepsen/src/elastickv/jepsen_test.clj
- jepsen/test/elastickv/dynamodb_types_workload_test.clj
- scripts/run-jepsen-local.sh
The per-type Jepsen CI step used `set -e` inside the for-loop, so the first failing attribute type aborted the sweep with nothing but a "process completed with exit code 1" annotation — leaving no way to tell which type crashed. Changes:
- Local script and both CI workflows now run every type independently, tracking pass/fail per type and printing a final summary table. The step still fails at the end if any type failed, but the log now identifies each failing type by name.
- Upload jepsen/store as a CI artifact on failure so the analysis output (history.txt, results.edn, timeline.html) is inspectable.
- Harden `->byte-array` against unexpected return shapes from cognitect/aws-api for Binary attributes (accept byte[], ByteBuffer, or any Seqable of byte-coercible numbers).
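The run-every-type-then-summarize pattern described above can be sketched as follows. `run_type` is a placeholder for the real per-type invocation (e.g. `timeout ... lein run -m elastickv.dynamodb-types-workload ...`), and all names here are illustrative rather than the script's actual ones:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder for the real per-type Jepsen invocation; always succeeds here.
run_type() { true; }

TYPES=(string number binary bool null string-set number-set binary-set list map)
RESULTS=()
EXIT_CODE=0

for t in "${TYPES[@]}"; do
  # Run every type even if an earlier one failed; remember each outcome.
  # (A command tested by `if` does not trip `set -e`.)
  if run_type "$t"; then
    RESULTS+=("${t}: PASS")
  else
    RESULTS+=("${t}: FAIL")
    EXIT_CODE=1
  fi
done

printf '%s\n' "${RESULTS[@]}"   # final summary table, one line per type
```

The real script would finish with `exit "$EXIT_CODE"` so CI still fails when any type failed, but only after the full summary has been printed.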
Reviewed commit: 31d46a6939
```yaml
timeout 120 ~/lein run -m elastickv.dynamodb-workload --local --time-limit 5 --rate 5 --concurrency 5 --dynamo-ports 63801,63802,63803 --host 127.0.0.1
- name: Run DynamoDB per-type Jepsen workloads against elastickv
  working-directory: jepsen
  timeout-minutes: 10
```
Increase per-type step timeout to cover worst-case loop
In .github/workflows/jepsen-test.yml, this step allows each of the 10 type runs to consume up to timeout 120, but the enclosing step budget is only 10 minutes; if several invocations hang or run near their timeout, GitHub Actions will terminate the step before the loop finishes, so the promised per-type summary is incomplete and failures can be reported as a generic timeout instead of per-type results. Raising timeout-minutes (or reducing per-invocation timeout/count) keeps the workflow behavior aligned with the loop’s own retry/reporting logic.
```shell
# All DynamoDB attribute types currently supported by elastickv.
# Each runs as its own register/linearizable test.
TYPES=(string number binary bool null string-set number-set binary-set list map)
declare -A TYPE_RESULT
```
Avoid Bash 4-only associative array in local runner
scripts/run-jepsen-local.sh now uses declare -A, which is unsupported by the default Bash 3.2 shipped on macOS; in that environment the script exits at this line (set -euo pipefail), so local Jepsen runs fail before any per-type workload executes. Using an indexed array or plain string accumulation would preserve compatibility for developers running the documented local workflow on macOS.
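One portable alternative, sketched under the assumption that the script only needs to record and look up a pass/fail status per type: parallel indexed arrays work on macOS's Bash 3.2 as well as Bash 4+ (function and variable names here are illustrative, not the script's):

```shell
# Parallel indexed arrays instead of `declare -A`; compatible with Bash 3.2.
TYPE_NAMES=()
TYPE_STATUS=()

record_result() {   # record_result <type> <PASS|FAIL>
  TYPE_NAMES+=("$1")
  TYPE_STATUS+=("$2")
}

lookup_result() {   # lookup_result <type> -> prints recorded status
  local i
  for i in "${!TYPE_NAMES[@]}"; do
    if [ "${TYPE_NAMES[$i]}" = "$1" ]; then
      printf '%s\n' "${TYPE_STATUS[$i]}"
      return 0
    fi
  done
  return 1          # type not recorded
}

record_result bool PASS
record_result binary FAIL
```

The linear lookup is O(n), which is irrelevant for a ten-element type list and buys compatibility with the stock macOS shell.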
Root cause: the per-type sweep was dense enough (8 keys × 150 writes × 2 threads/key, ~2400 ops total) that Knossos's :linear graph-search could not finish inside its analysis budget for the binary type, and it returned :valid? :unknown. fail-on-invalid! treats :unknown as a hard failure, so the step exited 1. Fix:
- Checker now uses :competition, which races Knossos's :linear and :wgl algorithms in parallel and takes whichever proves a verdict first. :wgl (tree search) typically finishes on workloads where :linear times out.
- Scheduled CI: split the per-type sweep's concurrency/key-count/max-writes env vars away from the stress-run inputs, and cap them at 4 / 8 / 80 respectively so the per-key history stays comfortably inside Knossos's budget. The deep dynamodb-workload step still uses the full inputs.*.
- The per-PR jepsen-test.yml already uses a tiny 5-second sweep, so no change is needed there.
Reviewed commit: 2adabcbeb8
```clojure
(->> (gen/mix [(map (fn [i] {:f :write :value (gen-fn i)})
                    (range))
               (gen/repeat {:f :read})])
     (gen/limit max-writes))
```
Count writes, not total ops, in per-key generator
--max-writes-per-key is wired to max-writes, but here gen/limit wraps a mixed read/write generator, so it caps total operations rather than writes; with a read-heavy mix, a key can execute far fewer writes (even zero) before exhaustion. That weakens the intended register signal and makes CI knobs like TYPE_MAX_WRITES misleading, because runs may pass with mostly reads instead of exercising write/read interleavings.
gen/limit max-writes was wrapping a mixed read/write generator, so --max-writes-per-key actually capped total ops; with a 50/50 mix a key could finish with roughly half as many writes as advertised (or even zero on the unlucky tail), starving the register signal and making the CI knob misleading. Now both the write stream and the read stream are finite: the write generator emits exactly max-writes distinct writes (one per index), and reads are capped at max-writes too. gen/mix terminates when both are drained, so --max-writes-per-key is honest regardless of the random mix ratio. Addresses PR #615 codex review feedback.