Skip to content

improve: log diagnostic for remote deployment junit extension#3306

Open
csviri wants to merge 4 commits intooperator-framework:mainfrom
csviri:tomcat-fix
Open

improve: log diagnostic for remote deployment junit extension#3306
csviri wants to merge 4 commits intooperator-framework:mainfrom
csviri:tomcat-fix

Conversation

@csviri
Copy link
Copy Markdown
Collaborator

@csviri csviri commented Apr 22, 2026

adds also option to not delete CRDs

Signed-off-by: Attila Mészáros a_meszaros@apple.com

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 22, 2026
@csviri csviri linked an issue Apr 22, 2026 that may be closed by this pull request
@csviri csviri changed the title improve: log diagnostic for remote deployment junit test improve: log diagnostic for remote deployment junit extension Apr 22, 2026
@csviri csviri force-pushed the tomcat-fix branch 2 times, most recently from ddc23d3 to 5bb595d Compare April 22, 2026 13:20
@csviri csviri marked this pull request as ready for review April 22, 2026 13:21
Copilot AI review requested due to automatic review settings April 22, 2026 13:21
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 22, 2026
@openshift-ci openshift-ci Bot requested review from metacosm and xstefank April 22, 2026 13:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds richer diagnostics when a cluster-deployed operator times out during deployment, to make remote E2E failures easier to debug.

Changes:

  • Catch KubernetesClientTimeoutException during waitUntilReady and emit diagnostics before rethrowing.
  • Add logDiagnosticInfo(...) to report deployments, pods, container statuses, related events, and recent pod logs on timeout.

Comment on lines +234 to +236
" Could not retrieve logs for pod '{}': {}",
pod.getMetadata().getName(),
logEx.getMessage());
Copy link

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When pod log retrieval fails, this only logs the exception message and drops the stack trace, which makes diagnosing client/auth/network issues harder. Consider logging the exception itself as the last argument (similar to the diagEx handling below) so the full cause is available when needed.

Suggested change
" Could not retrieve logs for pod '{}': {}",
pod.getMetadata().getName(),
logEx.getMessage());
" Could not retrieve logs for pod '{}'",
pod.getMetadata().getName(),
logEx);

Copilot uses AI. Check for mistakes.
@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 22, 2026

@metacosm @xstefank I restarted a couple times the tomcat E2E test, but it does not want to fail, however this might be useful in general to log issues such a way as in this PR. So we could merge this an see if it fails again.

What do you think?

Copy link
Copy Markdown
Collaborator

@xstefank xstefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK with me

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 23, 2026

Now it fails, this seems to be an issue that I also observed with the operations sample PR:
#3291

It seems that CRD is not applied in some cases, will investigate further. This might be actually because of junit extension issue - maybe with newer version of junit? - since we never observed this before.

Or maybe with your recent changes @xstefank ?

@xstefank
Copy link
Copy Markdown
Collaborator

@csviri there is only one test in TomcatOperatorE2ETest so I don't think deleting of the CRD is the issue.

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 23, 2026

Will mere

@csviri there is only one test in TomcatOperatorE2ETest so I don't think deleting of the CRD is the issue.

No, not the deletion, just thought it might ring some bell, nvm

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 23, 2026

@csviri there is only one test in TomcatOperatorE2ETest so I don't think deleting of the CRD is the issue.

Yes but there are two test modes local and remote, so that will be the problem. @xstefank

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 23, 2026

I will do a nicer opt-out from that feature for E2Es

Copy link
Copy Markdown
Collaborator

@xstefank xstefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@csviri all tests passed with the CRD deleting after the test class is executed in my PR. So I don't understand why you think that it caused failures in this PR? Local and Cluster tests run separatedly, so they should both restart the operator and redeploy the CRD. I don't think leaving some CRDs like it is proposed in this PR is good.

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 24, 2026

@csviri all tests passed with the CRD deleting after the test class is executed in my PR. So I don't understand why you think that it caused failures in this PR? Local and Cluster tests run separatedly, so they should both restart the operator and redeploy the CRD. I don't think leaving some CRDs like it is proposed in this PR is good.

@xstefank they are running against the same cluster, there is probably an issue with cluster test runner regarding the CRD, I will take a look on that later; for this fixes the issue. Also this api to turn off CRD deletion might help for others too. Will continue on this in this PR.

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 24, 2026

@xstefank btw, currently the deletion of CRD-s would great to handle uniformly also for ClusterDeployedOperatorExtension so implementing it in AbstractOperatorExtension, would you care to create a PR for that?

@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 24, 2026

@xstefank I addressed the CRD issue in separate PR what we discussed. Removed the deletion opt-out, but let there the option for the future / if some might need it.

@csviri csviri requested a review from xstefank April 24, 2026 12:38
csviri added 4 commits April 24, 2026 15:24
Signed-off-by: Attila Mészáros <a_meszaros@apple.com>
Signed-off-by: Attila Mészáros <a_meszaros@apple.com>
Signed-off-by: Attila Mészáros <a_meszaros@apple.com>
Signed-off-by: Attila Mészáros <a_meszaros@apple.com>
@csviri
Copy link
Copy Markdown
Collaborator Author

csviri commented Apr 24, 2026

Delete CRD flag I will move to a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky Sample TomcatOperator E2E test

4 participants