document lifecycle of connectors and backend pod availability #297
pwright wants to merge 2 commits into
<a id="connector-lifecycle-kubernetes"></a>
## Observing connector lifecycle on Kubernetes sites
Suggest placing this with the rest of the connector documentation.
On Kubernetes sites, a connector uses a pod selector to discover backend pods dynamically. The Skupper controller watches for pod changes and updates the router configuration accordingly.
Each matching pod gets its own `tcpConnector` entry in the router, named `connector/<name>@<pod-IP>`.
Suggest not introducing the `tcpConnector` entry in the router, as it is more of an implementation detail.
**Procedure**
It might help to explain the meaning of the resource status and conditions, and the sequence of transitions that occur (e.g. a target pod exists, a listener exists, etc.). Client connection behaviors could then be tied back to the status/condition at a given point in time.
```
kubectl logs deploy/skupper-controller -f
```
With debug logging enabled, you will see:
Do we document how to change controller logging levels somewhere else? I could not see where.
<a id="tcp-client-errors"></a>
## Understanding TCP client errors when backends fail
Suggest describing client behaviors in terms of the connector status/conditions, since a backend being available or not available also covers the removal and error cases.
<a id="router-failures-kubernetes"></a>
## Detecting router failures on Kubernetes sites
Instead of "Detecting router failures", maybe change the title to "Observing router operation" (e.g. is it running, how many restarts, why it restarted, etc.).
Also, looking at the pod itself would be more direct than going via the site status.
For frequent restarts, the section could add details such as looking at the termination reason.
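To illustrate the kind of detail meant here, the following is a minimal sketch of reading restart counts and the last termination reason from a Pod object. It assumes the input has the shape of `kubectl get pod <pod> -o json` output; the pod name, container name, and sample values are hypothetical and not taken from a real Skupper site. Only the standard Kubernetes `status.containerStatuses` fields (`restartCount`, `ready`, `lastState.terminated`) are used.

```python
import json

# Hypothetical sample shaped like `kubectl get pod <router-pod> -o json` output.
# The pod name, container name, and values are illustrative assumptions.
SAMPLE_POD_JSON = """
{
  "metadata": {"name": "skupper-router-example"},
  "status": {
    "phase": "Running",
    "containerStatuses": [
      {
        "name": "router",
        "ready": true,
        "restartCount": 3,
        "lastState": {
          "terminated": {"reason": "OOMKilled", "exitCode": 137}
        }
      }
    ]
  }
}
"""


def summarize_restarts(pod):
    """Return one summary dict per container status in a Pod object."""
    summaries = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # lastState is empty when the container has never been terminated.
        terminated = cs.get("lastState", {}).get("terminated") or {}
        summaries.append({
            "container": cs["name"],
            "ready": cs.get("ready", False),
            "restarts": cs.get("restartCount", 0),
            "last_termination_reason": terminated.get("reason"),
            "last_exit_code": terminated.get("exitCode"),
        })
    return summaries


if __name__ == "__main__":
    for row in summarize_restarts(json.loads(SAMPLE_POD_JSON)):
        print(row)
```

In the docs themselves, the same fields could probably be shown more simply with `kubectl describe pod` or `kubectl get pod -o json`; the sketch only highlights which fields answer "is it running, how many restarts, and why".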
9b44588 to db49b75
addresses https://redhat.atlassian.net/browse/SKUPPER-2963