Document direct kubectl logs as fallback for /loki-query #1682
Merged
Loki is good at multi-pod sweeps, structured filters, and historical queries, but not every investigation fits that. Add a section to the /loki-query slash command covering when and how to read pod logs directly:

- After a pod crash/exit (--previous), e.g. when JmsFailoverWatchdog recycles a wedged pod and we need the prior container's stack trace.
- For real-time tail of a single pod, or windows that have aged out of Loki retention.
- For the broker pods (activemqint, activemqsim, artemismq): verified during the 2026-05-06 wedge investigation that these run supervisord as PID 1 and the actual broker logs to a file inside the pod (/var/log/activemq/activemq.log for ActiveMQ Classic). Neither `kubectl logs` nor Loki sees those events; the only access path is `kubectl exec ... cat`. Document this gap explicitly so future investigators don't waste time on a `kubectl logs` that returns ten supervisord lifecycle lines and call it a dead end.

Mirrors the Loki kubeconfig convention (LOKI_KUBECONFIG → ~/.kube/kubeconfig_vxrails.yaml), provides a discovery command for actual deployment/statefulset names, and notes the in-pod log rotation horizon (~14h on activemqint) as a real operational gap worth fixing by reconfiguring these brokers to log to stdout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
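The first two cases above (previous-container logs and a live tail) plus the workload discovery step could look roughly like the sketch below. It is an illustration, not the text added to the slash command: the helper names (`kc`, `previous_logs`, `tail_pod`, `list_workloads`), the `NAMESPACE` variable with its `default` fallback, and the `DRY_RUN` switch are all assumptions introduced here. Only the `LOKI_KUBECONFIG` fallback path comes from the commit message.

```shell
# Hypothetical helpers; names, NAMESPACE and DRY_RUN are assumptions for this sketch.

# Run kubectl with the Loki kubeconfig convention, or just print the command
# when DRY_RUN=1 so it can be reviewed before touching the cluster.
kc() {
  cfg="${LOKI_KUBECONFIG:-$HOME/.kube/kubeconfig_vxrails.yaml}"
  ns="${NAMESPACE:-default}"   # assumption: the real namespace is not given in the PR
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "kubectl --kubeconfig $cfg -n $ns $*"
  else
    kubectl --kubeconfig "$cfg" -n "$ns" "$@"
  fi
}

# Logs of the previous container instance, e.g. after a watchdog recycle.
previous_logs() { kc logs "$1" --previous; }

# Real-time tail of a single pod, starting from its last 100 lines.
tail_pod() { kc logs "$1" -f --tail=100; }

# Discover the actual deployment/statefulset names in the namespace.
list_workloads() { kc get deployments,statefulsets; }
```

For example, `DRY_RUN=1 previous_logs some-pod` prints the command without running it, where `some-pod` stands in for a real pod name.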
Summary
- Adds a section to `.claude/commands/loki-query.md` covering: investigations after a pod crash (`--previous`), real-time tail of a single pod, windows aged out of Loki retention, and the broker pods.
- `activemqint`, `activemqsim`, and `artemismq` all run supervisord as PID 1. The actual broker logs to a file inside the pod (`/var/log/activemq/activemq.log` for ActiveMQ Classic), so `kubectl logs deployment/activemqint` returns ~10 lines of supervisord lifecycle frozen at pod startup, and Loki's promtail isn't scraping these pods either. Documents the `kubectl exec ... cat`/`awk` workaround and notes the ~14h in-pod log rotation horizon as a gap worth fixing by reconfiguring the brokers to log to stdout.
- Mirrors the Loki kubeconfig convention (`LOKI_KUBECONFIG` → `~/.kube/kubeconfig_vxrails.yaml`), provides a discovery command for actual deployment/statefulset names, and adds a "Putting it together for a JMS-wedge incident" subsection cross-referencing client-side Loki with broker-side `kubectl exec`.

Test plan
- `kubectl exec deployment/activemqint -- tail -50 /var/log/activemq/activemq.log`, to confirm the path and command pattern match reality
- `/loki-query` reads its own definition, so any future incident-investigation session will pick up the new section automatically

🤖 Generated with Claude Code
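As a rough illustration of the `kubectl exec ... cat`/`awk` workaround named in the summary, the sketch below reads the in-pod broker log and narrows it to a time window. The function names `broker_cat` and `filter_window` are hypothetical. The filter assumes each log event starts with a `YYYY-MM-DD HH:MM` timestamp, which is worth confirming against the real file. Namespace flags are omitted for brevity.

```shell
# Hypothetical helpers; names and the timestamp-prefix assumption are not from the PR.

# Dump the in-pod broker log for a workload such as deployment/activemqint.
broker_cat() {
  kubectl --kubeconfig "${LOKI_KUBECONFIG:-$HOME/.kube/kubeconfig_vxrails.yaml}" \
    exec "$1" -- cat /var/log/activemq/activemq.log
}

# Keep events whose leading "YYYY-MM-DD HH:MM" timestamp lies in [from, to].
# Lines without a timestamp (e.g. stack-trace continuation lines) follow the
# decision made for the most recent timestamped line.
filter_window() {
  awk -v from="$1" -v to="$2" '
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
      ts = substr($0, 1, 16)              # "YYYY-MM-DD HH:MM"
      keep = (ts >= from && ts <= to)     # plain string comparison
    }
    keep { print }
  '
}
```

A possible invocation is `broker_cat deployment/activemqint | filter_window "2026-05-06 13:00" "2026-05-06 14:00"`. Given the ~14h rotation horizon noted above, the window has to fall inside what the pod still holds.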