[codex] Reduce idle dev SQS polling #1
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,50 @@ | ||
| # SQS Worker Runbook | ||
|
|
||
| GrantStack consumes `grantstack-<environment>-processing` through an AWS Lambda event source mapping. There is no application-side polling loop. Lambda owns the `ReceiveMessage` calls, long polling, scaling, and idle backoff behavior. | ||
|
|
||
| The processing queue uses a 20-second receive wait. The 960-second visibility timeout exceeds the processor Lambda's 900-second timeout. The mapping keeps a batch size of one because a single report can use most of that processing window; increasing the batch would risk messages becoming visible before the batch completes. Lambda deletes successfully processed messages and the handler reports individual failures through `ReportBatchItemFailures`. | ||
|
|
||
| ## Dev Worker Lifecycle | ||
|
|
||
| The event source mapping is disabled by default when `environment = "dev"`. This stops idle SQS receives without deleting the queue, DLQ, or pending messages. Staging and production remain enabled by default. | ||
|
|
||
| `enable_sqs_worker` is a deployment-time control because the Lambda service poller exists outside the function process. Automation can pass it as `TF_VAR_enable_sqs_worker`; a checked-in or CLI variable value takes normal Terraform precedence. | ||
|
|
||
| Enable dev processing intentionally: | ||
|
|
||
| ```sh | ||
| terraform -chdir=grantstack-backend/terraform plan \ | ||
| -var-file=env/dev.tfvars \ | ||
| -var='enable_sqs_worker=true' \ | ||
| -out=grantstack-dev-worker.tfplan | ||
| terraform -chdir=grantstack-backend/terraform apply grantstack-dev-worker.tfplan | ||
| ``` | ||
|
|
||
| After the dev work is complete, set `enable_sqs_worker = false` in the dev variable file and apply again. `terraform output -raw sqs_worker_enabled` reports the deployed intent. Processor cold invocations log the queue name, 20-second wait, Lambda-managed idle behavior, and received batch size. | ||
|
|
||
| Lambda event source mappings do not expose a configurable exponential idle-backoff setting. AWS manages their pollers and can keep multiple long polls active even when a queue is empty. Disabling the dev mapping is therefore the reliable zero-idle-request control; adding sleep or jitter inside the Lambda handler would not affect the separate AWS-managed pollers. | ||
|
|
||
| ## Verify Empty Receives | ||
|
|
||
| In CloudWatch Metrics, select `AWS/SQS`, `Queue Metrics`, `NumberOfEmptyReceives`, and the `QueueName` dimension. Use a one-day period, the `Sum` statistic, and the deployment date through the current date. The operations dashboard also includes this metric at one-minute resolution. | ||
|
|
||
| CLI example for UTC dates: | ||
|
|
||
| ```sh | ||
| aws cloudwatch get-metric-statistics \ | ||
| --namespace AWS/SQS \ | ||
| --metric-name NumberOfEmptyReceives \ | ||
| --dimensions Name=QueueName,Value=grantstack-dev-processing \ | ||
| --statistics Sum \ | ||
| --period 86400 \ | ||
| --start-time 2026-06-01T00:00:00Z \ | ||
| --end-time 2026-07-01T00:00:00Z | ||
| ``` | ||
|
|
||
| After disabling the mapping, confirm that daily sums fall to zero after any in-flight polls finish. Also verify the mapping state: | ||
|
|
||
| ```sh | ||
| aws lambda list-event-source-mappings \ | ||
| --function-name grantstack-dev-processor \ | ||
| --event-source-arn "$(terraform -chdir=grantstack-backend/terraform output -raw processing_queue_arn)" | ||
| ``` |
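
As a rough aid for the verification step in the runbook above, the sketch below parses the JSON printed by the `aws cloudwatch get-metric-statistics` call and lists any days on or after a cutoff that still report empty receives. The function names and the cutoff date are hypothetical and not part of this PR. It assumes the CLI's JSON output format, with a `Datapoints` list whose entries carry `Timestamp` and `Sum`.

```python
import json
from datetime import datetime, timezone


def daily_empty_receives(raw_json: str) -> list[tuple[datetime, float]]:
    """Return (timestamp, sum) pairs, sorted by time, from get-metric-statistics JSON."""
    payload = json.loads(raw_json)
    points = []
    for dp in payload.get("Datapoints", []):
        # Accept both "...Z" and "...+00:00" timestamp spellings.
        ts = datetime.fromisoformat(dp["Timestamp"].replace("Z", "+00:00"))
        points.append((ts, float(dp["Sum"])))
    # CloudWatch does not guarantee datapoint order, so sort explicitly.
    return sorted(points)


def nonzero_days_after(
    points: list[tuple[datetime, float]], cutoff: datetime
) -> list[tuple[datetime, float]]:
    """Return the days on or after `cutoff` that still show empty receives."""
    # Periods without a datapoint are simply absent from the CLI output;
    # this sketch only inspects the datapoints that are present.
    return [(ts, total) for ts, total in points if ts >= cutoff and total > 0]
```

A possible use is to save the CLI output to a file, pass its contents to `daily_empty_receives`, and call `nonzero_days_after` with the date the mapping was disabled. An empty result is consistent with the mapping no longer polling.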
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,64 @@ | ||
| mock_provider "aws" { | ||
| mock_data "aws_caller_identity" { | ||
| defaults = { | ||
| account_id = "123456789012" | ||
| } | ||
| } | ||
|
|
||
| mock_data "aws_iam_policy_document" { | ||
| defaults = { | ||
| json = "{\"Version\":\"2012-10-17\",\"Statement\":[]}" | ||
| } | ||
| } | ||
| } | ||
| mock_provider "archive" {} | ||
|
|
||
| run "dev_worker_is_disabled_by_default" { | ||
| command = plan | ||
|
|
||
| variables { | ||
| environment = "dev" | ||
| } | ||
|
|
||
| assert { | ||
| condition = aws_lambda_event_source_mapping.processor_sqs.enabled == false | ||
| error_message = "The dev SQS worker must not poll unless it is explicitly enabled." | ||
| } | ||
|
|
||
| assert { | ||
| condition = aws_sqs_queue.processing.receive_wait_time_seconds == 20 | ||
| error_message = "The processing queue must use 20-second long polling." | ||
| } | ||
|
|
||
| assert { | ||
| condition = aws_sqs_queue.processing_dlq.receive_wait_time_seconds == 20 | ||
| error_message = "The processing DLQ must use 20-second long polling." | ||
| } | ||
| } | ||
|
|
||
| run "dev_worker_can_be_enabled_explicitly" { | ||
| command = plan | ||
|
|
||
| variables { | ||
| environment = "dev" | ||
| enable_sqs_worker = true | ||
| } | ||
|
|
||
| assert { | ||
| condition = aws_lambda_event_source_mapping.processor_sqs.enabled == true | ||
| error_message = "The dev SQS worker should run when explicitly enabled." | ||
| } | ||
| } | ||
|
|
||
| run "production_worker_remains_enabled_by_default" { | ||
| command = plan | ||
|
|
||
| variables { | ||
| environment = "prod" | ||
| } | ||
|
|
||
| assert { | ||
| condition = aws_lambda_event_source_mapping.processor_sqs.enabled == true | ||
| error_message = "The production SQS worker must remain enabled by default." | ||
| } | ||
| } |
If `event` is `None`, calling `event.get("Records", [])` raises an `AttributeError`. If the `"Records"` key is explicitly set to `null` (which parses as `None` in Python), the call returns `None` rather than the default, which then causes a `TypeError` when `len(records)` is evaluated on line 69 or when iterating over `records` on line 72.

To make this more robust, use a guard that handles both the `None` event and the `None` records cases.
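
A minimal sketch of such a guard is shown below. The helper name `extract_records` is hypothetical and the handler code from this PR is not visible here, so treat this as one possible shape rather than the change that was merged.

```python
from typing import Any, Optional


def extract_records(event: Optional[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the SQS records from a Lambda event, or an empty list.

    Covers a None event, a missing "Records" key, and "Records" set to null.
    """
    if not isinstance(event, dict):
        return []
    records = event.get("Records")
    if not isinstance(records, list):
        return []
    return records
```

With this helper, `len(records)` and the iteration over `records` always operate on a list, including when the payload is empty or malformed.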