@sunryeokkim sunryeokkim commented Oct 29, 2025

What does this PR do?

Previously, when a ping command returned a non-zero exit code, the check raised a CheckException, causing the entire check run to fail. This behavior made the Agent mark the check as ERROR, even for normal network conditions such as a host being temporarily unreachable.

This PR improves the robustness and cross-platform behavior of the PingCheck integration by refining how ping failures are interpreted.

Specifically, it adds logic to distinguish between network-level unreachable errors and name resolution / invalid address errors on platforms where ping returns the same exit code (especially Windows).

Key changes include:
• Adds detection of name resolution and invalid address errors even when ping returns exit code 1.
• Raises a CheckException for these execution-level errors to correctly surface DNS and malformed address issues.
• Preserves exit-code-1 handling for genuine network unreachable cases by returning a structured "unreachable" status.
• Ensures consistent behavior across Linux, macOS, and Windows ping implementations.
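The classification described in the bullets above could be sketched roughly as follows. This is not the PR's actual code: `classify_ping`, `PingResult`, the local `CheckException` stand-in, and the message patterns are all hypothetical, and real ping output varies by OS, version, and locale. Treating exit codes other than 0 and 1 as execution errors is also an assumption.

```python
import re
from dataclasses import dataclass

# Assumed message fragments for name resolution / invalid address failures.
# These are illustrative only; actual wording differs across platforms and locales.
RESOLUTION_ERROR_PATTERNS = [
    r"could not find host",                   # wording commonly seen on Windows
    r"name or service not known",             # wording commonly seen on Linux
    r"temporary failure in name resolution",
    r"cannot resolve",                        # wording commonly seen on macOS
    r"unknown host",
]


class CheckException(Exception):
    """Local stand-in for the exception type named in the PR description."""


@dataclass
class PingResult:
    status: str  # "ok" or "unreachable"
    output: str


def classify_ping(exit_code: int, output: str) -> PingResult:
    """Map a ping exit code and its output to a result, or raise on execution errors."""
    if exit_code == 0:
        return PingResult("ok", output)

    lowered = output.lower()
    # Check the output first, because some platforms report DNS failures
    # with the same exit code (1) as an unreachable host.
    if any(re.search(pattern, lowered) for pattern in RESOLUTION_ERROR_PATTERNS):
        raise CheckException(f"name resolution or address error: {output.strip()}")

    if exit_code == 1:
        # Genuine network-level failure: report it without failing the check run.
        return PingResult("unreachable", output)

    # Assumption: any other exit code is an execution-level error.
    raise CheckException(f"ping failed with exit code {exit_code}: {output.strip()}")
```

With this shape, a timed-out host yields `PingResult("unreachable", ...)`, while a mistyped hostname raises `CheckException` even when the exit code is 1.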

@sunryeokkim sunryeokkim self-assigned this Oct 29, 2025
@sunryeokkim (Author)

@jstanton617 Could you please review this PR when you have a chance?
This update refines PingCheck error handling and has passed validation on Linux and Windows.

@sunryeokkim sunryeokkim requested a review from masafm November 11, 2025 05:45

@masafm masafm left a comment

Have you reviewed or updated the test cases to align with this change?

# Real execution error (e.g. missing binary, permission, DNS failure)
# The service check is marked CRITICAL, and the check itself raises an error.
self.log.info("%s check error (%s)", host, e)
self.service_check(self.SERVICE_CHECK_NAME, AgentCheck.CRITICAL, custom_tags, message=str(e))

It’s not exactly a correction, but wouldn’t AgentCheck.UNKNOWN instead of AgentCheck.CRITICAL be more appropriate?
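To make the question concrete, the suggested mapping could look something like the sketch below. It is illustrative only: `Status` is a local stand-in that mirrors the usual numeric values of the Agent's service check statuses, `status_for` and the outcome strings are hypothetical, and reporting an unreachable host as CRITICAL is an assumption, not something stated in this PR.

```python
from enum import IntEnum


class Status(IntEnum):
    """Local stand-in for the Agent's service check status constants."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def status_for(outcome: str) -> Status:
    """Map a hypothetical ping outcome label to a service check status."""
    if outcome == "ok":
        return Status.OK
    if outcome == "unreachable":
        # Assumption: the host was probed and did not answer.
        return Status.CRITICAL
    # Execution-level errors (missing binary, DNS failure) say nothing about
    # the host's state, which is the argument for UNKNOWN over CRITICAL.
    return Status.UNKNOWN
```

The design question is whether a failure to run the probe at all should look the same on a dashboard as a host that was probed and found down.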

@sunryeokkim sunryeokkim requested a review from masafm November 28, 2025 06:25
@github-actions

This pull request has not been updated for more than 21 days. If there are no updates to this PR within 7 days, it will be closed. If you'd like to re-open this PR after it's been closed, you can start from the latest master branch or pull the latest changes into your branch and create a new pull request.

@github-actions github-actions bot added the stale label Dec 19, 2025
@github-actions

This pull request was not updated after an additional 7 days of no activity. If you would like to continue work on this PR, please re-open this PR or create a fresh branch off of the latest master branch.

@github-actions github-actions bot closed this Dec 27, 2025