Skip to content

Add AuTest for connect_attempts_max_retries_down_server#12922

Draft
masaori335 wants to merge 1 commit intoapache:masterfrom
masaori335:asf-master-hostdb-autest-down-server
Draft

Add AuTest for connect_attempts_max_retries_down_server#12922
masaori335 wants to merge 1 commit intoapache:masterfrom
masaori335:asf-master-hostdb-autest-down-server

Conversation

@masaori335
Copy link
Contributor

@masaori335 masaori335 commented Feb 27, 2026

This is an effort to verify connect_attempts_max_retries_down_server behavior.

The doc for proxy.config.http.connect_attempts_max_retries_down_server says below.

Maximum number of connection attempts Traffic Server can make while an origin is marked down per request.

Also doc of proxy.config.http.connect_attempts_max_retries says below

Once the maximum number of retries is reached, the origin is marked down. After this, the setting proxy.config.http.connect_attempts_max_retries_down_server is used to limit the number of retry attempts to the known down origin.

However, I don't see ATS makes connection attempts against down servers. From my understanding, HostDBRecord::select_best_http doesn't return HostDBInfo which is in down state. How HttpSM can try to down servers?

@masaori335 masaori335 self-assigned this Feb 27, 2026
@masaori335 masaori335 force-pushed the asf-master-hostdb-autest-down-server branch from 60af02a to 8e43f53 Compare February 27, 2026 06:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new AuTest (ATSReplay-based gold test) intended to exercise proxy.config.http.connect_attempts_max_retries_down_server, and extends the ATSReplay harness to allow DNS records to be specified directly in replay YAML.

Changes:

  • Added a new replay scenario connect_attempts_rr_down_server.replay.yaml under tests/gold_tests/dns/replay/.
  • Added a new gold test driver tests/gold_tests/dns/connect_attempts.test.py to run the replay.
  • Extended ATSReplayTest to accept autest.dns.records and apply them to the DNS server process via addRecords.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
tests/gold_tests/dns/replay/connect_attempts_rr_down_server.replay.yaml New replay scenario configuring connect-attempt retry records and expected proxy responses.
tests/gold_tests/dns/connect_attempts.test.py New gold test entrypoint invoking the replay via Test.ATSReplayTest.
tests/gold_tests/autest-site/ats_replay.test.ext ATSReplay harness enhancement: allow replay YAML to inject DNS records into the test DNS server.

# limitations under the License.

#
# This replay file assumes that caching is enabled.
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header comment says this replay assumes caching is enabled, but autest.ats.process_config.enable_cache is set to false. Please update/remove the comment to match the actual configuration (or enable cache if that was intended).

Suggested change
# This replay file assumes that caching is enabled.
# This replay file runs with caching disabled.

Copilot uses AI. Check for mistakes.
proxy-response:
status: 502

# request expected hit down_server cache and immidiately get 500
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: immidiately should be immediately.

Suggested change
# request expected hit down_server cache and immidiately get 500
# request expected hit down_server cache and immediately get 500

Copilot uses AI. Check for mistakes.
Comment on lines +145 to +146
proxy-response:
status: 500
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This transaction expects a 500 response, but ATS's "origin down" path returns 502 (see HttpTransact::OriginDown), and with proxy.config.http.connect_attempts_max_retries_down_server: 1 ATS will still attempt a connection to a marked-down origin rather than short-circuiting. The expected proxy-response.status here should be updated accordingly (likely 502), and the accompanying comment should be adjusted to reflect what behavior is being asserted.

Copilot uses AI. Check for mistakes.
@masaori335
Copy link
Contributor Author

masaori335 commented Mar 5, 2026

I think I got what I'm missing. To test this config, ATS needs to go this line in HttpTransact::handle_response_from_server, but current test is trying connect attempt against un-exist servers and ATS doesn't comes here.

max_connect_retries = s->txn_conf->connect_attempts_max_retries_down_server - 1;

@bryancall
Copy link
Contributor

[approve ci ubuntu autest 0]

@bryancall
Copy link
Contributor

AuTest 0of4 is a real failure in the new connect_attempts test — not flaky. It failed on both the original run and a retrigger (build 35759). The proxy-verifier client exits with code 1, meaning the response from ATS didn't match the replay expectations. Server, DNS, and ATS processes all pass fine.

FAILED TESTS:
  - connect_attempts

Process: Default: Failed
  Test : Checking that ReturnCode == 0 - Failed
     Reason: Returned Value 1 != 0

Ubuntu passed on retrigger — that one was flaky. Just this test remaining.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants