DSM Queue tab CloudWatch metrics empty: eventSourceARN used as queue identifier instead of short name #755

@jharrisonSV

Description

The DSM (Data Streams Monitoring) Queue tab shows empty CloudWatch metric graphs for SQS queues consumed by Lambda functions. The root cause is a mismatch between how datadog-lambda-python identifies queues in DSM checkpoints (full ARN) and how the AWS CloudWatch integration tags SQS metrics (short queue name).

Root Cause

In datadog_lambda/tracing.py, _dsm_set_checkpoint() passes the full eventSourceARN to set_consume_checkpoint():

# tracing.py line ~262
source_arn = first_record.get("eventSourceARN", "")
# ...
_dsm_set_checkpoint(context_json, event_type, source_arn)  # full ARN

This results in a DSM checkpoint with:

topic:arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo

The DSM Queue tab then uses this full ARN value to construct CloudWatch metric queries:

sum:aws.sqs.number_of_messages_received{queuename:arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo}

But the AWS CloudWatch integration tags SQS metrics with just the short queue name (from the CloudWatch QueueName dimension):

queuename:my-queue.fifo

Result: the query returns no data, and all Queue tab graphs are empty.
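The mismatch can be reproduced in isolation. The sketch below is illustrative only: `build_queue_query` is a hypothetical stand-in for whatever the Queue tab does internally (that code is not visible from this repo), and it assumes the topic value is interpolated verbatim into the `queuename` filter, as the query above suggests.

```python
# Hypothetical stand-in for the Queue tab's query construction (assumption:
# the DSM topic value is interpolated verbatim into the queuename filter).
def build_queue_query(topic: str) -> str:
    return f"sum:aws.sqs.number_of_messages_received{{queuename:{topic}}}"


# Value reported today by the Lambda path (full eventSourceARN).
checkpoint_topic = "arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo"
# Value the CloudWatch integration uses for the queuename tag.
cloudwatch_queuename = "my-queue.fifo"

query_from_checkpoint = build_queue_query(checkpoint_topic)
query_that_matches = build_queue_query(cloudwatch_queuename)

# The two filters differ, so the first query selects no tagged series.
print(query_from_checkpoint)
print(query_that_matches)
```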

Comparison with botocore SDK path

The botocore instrumentation in dd-trace-py correctly extracts the short name:

# ddtrace/internal/datastreams/botocore.py
def get_queue_name(params):
    queue_url = params["QueueUrl"]
    url = parse.urlparse(queue_url)
    return url.path.rsplit("/", 1)[-1]  # returns "my-queue.fifo"

Both handle_sqs_sns_produce() and handle_sqs_receive() use this short name as the DSM topic: tag. When SQS messages are consumed by a long-running process polling with sqs.receive_message(), the botocore instrumentation handles the DSM checkpoint and the Queue tab works correctly. The bug is specific to Lambda functions triggered by SQS event source mappings, where datadog-lambda-python handles the checkpoint instead.
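For reference, here is a self-contained version of the URL-based extraction quoted above, run against an example queue URL. The URL is an assumption made for illustration (built from the region, account ID and queue name used elsewhere in this issue), and `queue_name_from_url` is a local copy of the logic, not an import from dd-trace-py.

```python
from urllib import parse


def queue_name_from_url(queue_url: str) -> str:
    # Same approach as the quoted get_queue_name(): take the last path segment.
    url = parse.urlparse(queue_url)
    return url.path.rsplit("/", 1)[-1]


# Illustrative queue URL, assumed for this example.
example_url = "https://sqs.eu-west-2.amazonaws.com/123456789012/my-queue.fifo"
short_name = queue_name_from_url(example_url)
print(short_name)
```

The result is the same short name that the CloudWatch integration uses for the queuename tag, which is why the polling path lines up with the Queue tab queries.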

Expected Behavior

_dsm_set_checkpoint() should extract the short queue name from the ARN before passing it to set_consume_checkpoint(), e.g.:

queue_name = source_arn.rsplit(":", 1)[-1]  # "my-queue.fifo"
set_consume_checkpoint(event_type, queue_name, carrier_get, manual_checkpoint=False)

This would align the Lambda consumption path with the botocore SDK path, and the DSM Queue tab CloudWatch queries would match the actual queuename tag values.
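A slightly more defensive variant of the one-liner is sketched below, in case the same code path receives ARNs for services other than SQS, or an empty string (the default in the snippet from tracing.py). `sqs_queue_name` is a hypothetical helper name, and leaving non-SQS ARNs unchanged is an assumption made here, not a statement about what the correct topic is for those services.

```python
def sqs_queue_name(source_arn: str) -> str:
    """Return the short queue name for an SQS ARN; return other input unchanged."""
    parts = source_arn.split(":")
    # SQS ARNs have six colon-separated fields:
    # arn:<partition>:sqs:<region>:<account-id>:<queue-name>
    if len(parts) == 6 and parts[0] == "arn" and parts[2] == "sqs":
        return parts[5]
    # Anything else (empty string, other services) is passed through as-is.
    return source_arn


print(sqs_queue_name("arn:aws:sqs:eu-west-2:123456789012:my-queue.fifo"))
```

The guard on the service field keeps the change scoped to the SQS case described in this issue, so behavior for any other event source stays as it is today.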

Workaround

Users can manually change the metric filter on a graph from queuename to dd_resource_key (which contains the full ARN) to see data. However, this must be done for each graph individually, and the change does not persist.

Environment

  • datadog-lambda v8.123.0
  • dd-trace-py v4.6.0
  • Lambda Extension v92-next
  • Python 3.14
  • SQS FIFO queue consumed via Lambda event source mapping
  • AWS region: eu-west-2
