Broker throughput for case sensitive brokers doesn't handle two queues differing only in case #5439

@PhilBastian

Describe the bug

Description

Broker throughput detection handles name clashes using the PostfixGenerator and SanitizedName. However, this mechanism is:

  • not de-duplicating queue throughput correctly
  • subject to race conditions and incorrect throughput values
  • causing issues when writing the throughput report

Scenario:

  • A case-sensitive broker (e.g. RabbitMQ) with two queues that differ only in casing, e.g. "Sales" and "sales"

First time ServiceControl throughput detection runs after the queues are created

  • Queue names are fetched and run through the PostfixGenerator. "sales" is assigned a postfix of 1.
  • Adding the throughput information for each queue is started asynchronously, without an await.
  • Both queue name instances query Raven for an existing match. Both get null, since neither has reached the await for saving.
  • "Sales" saves its endpoint data to Raven with an Id of "Sales/Broker" and a SanitizedName of "Sales".
  • "sales" saves its endpoint data to Raven with an Id of "sales/Broker" and a SanitizedName of "sales1". Raven, being case insensitive on document Ids, overwrites "Sales/Broker" with the details of "sales".
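The first-run sequence above can be reproduced in a small simulation. This is only a sketch under stated assumptions: `CaseInsensitiveStore`, `assign_postfixes` and `record_queue` are hypothetical stand-ins, not ServiceControl or RavenDB code, and the store is assumed to keep the Id casing of the first write when a later write overwrites it (which is what the "Sales/Broker" record described in this issue suggests).

```python
import asyncio


class CaseInsensitiveStore:
    """Hypothetical stand-in for a store whose document Ids compare case-insensitively."""

    def __init__(self):
        self._docs = {}

    async def load(self, doc_id):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return self._docs.get(doc_id.lower())

    async def save(self, doc_id, doc):
        await asyncio.sleep(0)
        key = doc_id.lower()
        existing = self._docs.get(key)
        # Assumption: the Id casing of the first write survives an overwrite.
        kept_id = existing["Id"] if existing else doc_id
        self._docs[key] = {**doc, "Id": kept_id}

    def all(self):
        return list(self._docs.values())


def assign_postfixes(queue_names):
    """Hypothetical postfix step: later case-insensitive duplicates get a counter."""
    seen = {}
    result = {}
    for name in queue_names:
        count = seen.get(name.lower(), 0)
        result[name] = name if count == 0 else f"{name}{count}"
        seen[name.lower()] = count + 1
    return result


async def record_queue(store, queue_name, sanitized_name, reads):
    existing = await store.load(f"{queue_name}/Broker")
    reads.append((queue_name, existing))
    if existing is None:
        await store.save(f"{queue_name}/Broker", {"SanitizedName": sanitized_name})


async def first_run():
    store = CaseInsensitiveStore()
    reads = []
    names = assign_postfixes(["Sales", "sales"])
    # Both operations are in flight together, so both reads finish before either save.
    await asyncio.gather(
        *(record_queue(store, q, s, reads) for q, s in names.items())
    )
    return store.all(), reads


docs, reads = asyncio.run(first_run())
print(docs)
```

Under these assumptions a single record remains, carrying the Id of "Sales" and the SanitizedName of "sales".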

Subsequent throughput detection runs

  • Depending on the order in which queue names are fetched from RabbitMQ, either "Sales" or "sales" is processed first.
  • The endpoint read from Raven is the existing "Sales/Broker" record for both queues.
  • Whichever one hits the save first writes its throughput data.
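A minimal sketch of why the recorded value depends on fetch order. It assumes, as described above, that both names resolve to the same record and that only the first save for a given day takes effect; `run_detection` and the `"day-1"` key are hypothetical and only illustrate the ordering problem.

```python
def run_detection(existing_record, broker_queues):
    """Record one day's throughput per queue against a case-insensitive Id lookup."""
    store = {existing_record["Id"].lower(): dict(existing_record, Throughput={})}
    for queue_name, messages in broker_queues:
        # "Sales/Broker" and "sales/Broker" resolve to the same stored record.
        record = store[f"{queue_name}/Broker".lower()]
        # Assumption: the first save for the day wins, later ones are ignored.
        record["Throughput"].setdefault("day-1", messages)
    return list(store.values())


record = {"Id": "Sales/Broker", "SanitizedName": "sales1"}
sales_first = run_detection(record, [("Sales", 100), ("sales", 7)])
lower_first = run_detection(record, [("sales", 7), ("Sales", 100)])
print(sales_first[0]["Throughput"])  # {'day-1': 100}
print(lower_first[0]["Throughput"])  # {'day-1': 7}
```

Either way, the throughput of one of the two queues is lost.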

If the two queues were created at different times

  • If "Sales" is created first, SC broker throughput detection runs, and "sales" is created some time later, the behaviour is the same as in "Subsequent throughput detection runs" above, because the case-insensitive lookup by Id "sales/Broker" returns the existing "Sales/Broker" record. The difference is that the record in Raven has SanitizedName set to "Sales" rather than "sales1", so the throughput report issue below does not occur.

Generating throughput report

  • The queue name written to the report is from the Id, i.e. "Sales"
  • The queue name used to match to other throughput sources, i.e. Audit or Monitoring, is from SanitizedName, i.e. "sales1"
  • The throughput report ends up with two entries for "Sales", one from the Broker and one with Audit/Monitoring
  • Also note that if there were a genuine queue named "Sales1", its broker throughput would be recorded against the same throughput report record as "Sales", since their SanitizedName values match.
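The duplicate report entry can be illustrated with a simplified merge. This is a sketch, not the actual report generator: `build_report` and the record shapes are hypothetical, and it only models the mismatch between the name taken from the Id and the name used for matching.

```python
def build_report(broker_records, other_sources):
    """Merge broker throughput with other sources.

    Rows are named after the Id, but matched to other sources by SanitizedName.
    """
    rows = []
    matched = set()
    for rec in broker_records:
        display_name = rec["Id"].rsplit("/", 1)[0]  # "Sales/Broker" -> "Sales"
        row = {"Name": display_name, "Broker": rec["Throughput"]}
        for source, by_name in other_sources.items():
            if rec["SanitizedName"] in by_name:
                row[source] = by_name[rec["SanitizedName"]]
                matched.add((source, rec["SanitizedName"]))
        rows.append(row)
    # Anything from the other sources that found no broker match gets its own row.
    for source, by_name in other_sources.items():
        for name, value in by_name.items():
            if (source, name) not in matched:
                rows.append({"Name": name, source: value})
    return rows


broker = [{"Id": "Sales/Broker", "SanitizedName": "sales1", "Throughput": 100}]
others = {"Audit": {"Sales": 95}}
report = build_report(broker, others)
for row in report:
    print(row)
```

Because "sales1" never matches the Audit entry for "Sales", the report contains two rows named "Sales": one with only broker throughput and one with only Audit throughput.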

Expected behavior

Correctly deduplicated reporting of throughput against both "Sales" and "sales", with the correct matching to their Audit and Monitoring throughput values.

Actual behavior

As above

Versions

6.7.2+ (since the throughput report generator has existed in ServiceControl)

Steps to reproduce

Create two queues with the same name but different casing on the same broker, then let ServiceControl broker throughput detection run.
