feat(gax): implement dynamic channel refreshing on 401 retries #13212

Draft
blakeli0 wants to merge 1 commit into googleapis:main from blakeli0:feat/gax-mwlid-channel-refresh
@blakeli0
Contributor

This PR implements dynamic channel refreshing on 401 Unauthenticated retries, gated on the isMwlidEnvironment environment variable being set to true. It adds compile-time type-safe refresh contracts to TransportChannel and ApiCallContext, and debounces refreshes in ChannelPool to prevent connection stampedes.

@blakeli0 force-pushed the feat/gax-mwlid-channel-refresh branch from 4f508a8 to 9e55d01 on May 15, 2026 21:51
@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a mechanism to automatically refresh transport channels when an UnauthenticatedException occurs, specifically within environments where the isMwlidEnvironment variable is set. Key changes include adding a refresh method to the TransportChannel interface and ChannelPool implementation, incorporating a 5-second debounce for refreshes, and updating the retry logic to trigger these refreshes. Review feedback highlights a potential bug in the debounce initialization, suggests using constants for magic numbers, recommends caching environment variable lookups to improve performance, and advises using imports instead of fully qualified names for better readability.

private ScheduledFuture<?> resizeFuture = null;

private final Object entryWriteLock = new Object();
private long lastRefreshTimeNanos = 0;

medium

Initializing lastRefreshTimeNanos to 0 can cause the first refresh to be skipped: System.nanoTime() has an arbitrary origin, so if it returns a value below the 5-second threshold (including a negative one), the debounce check treats the first call as too recent. Additionally, the 5-second debounce interval should be defined as a constant.

Suggested change
- private long lastRefreshTimeNanos = 0;
+ private static final long REFRESH_DEBOUNCE_THRESHOLD_NANOS = java.util.concurrent.TimeUnit.SECONDS.toNanos(5);
+ private long lastRefreshTimeNanos = System.nanoTime() - REFRESH_DEBOUNCE_THRESHOLD_NANOS;

// replaces the list)
synchronized (entryWriteLock) {
long now = System.nanoTime();
if (now - lastRefreshTimeNanos < TimeUnit.SECONDS.toNanos(5)) {

medium

Use the newly defined constant for the debounce threshold instead of a hardcoded magic number.

Suggested change
- if (now - lastRefreshTimeNanos < TimeUnit.SECONDS.toNanos(5)) {
+ if (now - lastRefreshTimeNanos < REFRESH_DEBOUNCE_THRESHOLD_NANOS) {
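
For illustration, the two debounce suggestions above can be combined into a small standalone sketch. This is not the actual ChannelPool code: the class name RefreshDebouncer, the method tryAcquireRefresh, and the injectable clock are hypothetical, introduced only so the behavior can be exercised in isolation.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the debounce logic under review; not the real ChannelPool. */
final class RefreshDebouncer {
  static final long REFRESH_DEBOUNCE_THRESHOLD_NANOS = TimeUnit.SECONDS.toNanos(5);

  private final Object lock = new Object();
  // Assumption: the clock is injectable so tests do not need to sleep.
  private final LongSupplier clock;
  private long lastRefreshTimeNanos;

  RefreshDebouncer(LongSupplier clock) {
    this.clock = clock;
    // Start one full threshold in the past so the first refresh is never debounced,
    // wherever the JVM places the nanoTime origin.
    this.lastRefreshTimeNanos = clock.getAsLong() - REFRESH_DEBOUNCE_THRESHOLD_NANOS;
  }

  /** Returns true if the caller should refresh now, false if a refresh ran too recently. */
  boolean tryAcquireRefresh() {
    synchronized (lock) {
      long now = clock.getAsLong();
      // Compare the difference, not the raw values: this stays correct for negative readings.
      if (now - lastRefreshTimeNanos < REFRESH_DEBOUNCE_THRESHOLD_NANOS) {
        return false;
      }
      lastRefreshTimeNanos = now;
      return true;
    }
  }

  public static void main(String[] args) {
    RefreshDebouncer debouncer = new RefreshDebouncer(System::nanoTime);
    System.out.println("first call allowed: " + debouncer.tryAcquireRefresh());
    System.out.println("immediate second call allowed: " + debouncer.tryAcquireRefresh());
  }
}
```

With the seeded initial value, the first call passes and calls within the next 5 seconds are rejected, which is the stampede protection the PR description refers to.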

*/
@Override
public ApiFuture<ResponseT> submit(RetryingFuture<ResponseT> retryingFuture) {
if ("true".equalsIgnoreCase(System.getenv("isMwlidEnvironment"))) {

medium

Accessing environment variables via System.getenv on every RPC attempt (inside submit) is inefficient. Consider caching this value in a static final boolean field to avoid repeated lookups and potential performance overhead.
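
One way to apply this suggestion is to read the variable once at class initialization. The sketch below is an assumption about how that might look, not code from the PR: the holder class MwlidEnvironment, the field ENABLED, and the parse helper are hypothetical names.

```java
/** Hypothetical holder for the cached flag; names are illustrative, not from the PR. */
final class MwlidEnvironment {
  // Read once when the class is initialized; later changes to the environment are not observed.
  static final boolean ENABLED = parse(System.getenv("isMwlidEnvironment"));

  private MwlidEnvironment() {}

  /** Mirrors the PR's check: only a case-insensitive "true" enables the behavior. */
  static boolean parse(String raw) {
    return "true".equalsIgnoreCase(raw);
  }

  public static void main(String[] args) {
    System.out.println("isMwlidEnvironment enabled: " + ENABLED);
  }
}
```

The trade-off is testability: a static final value cannot be toggled per test, so keeping the parsing in a separate method such as parse lets the logic be verified without modifying the process environment.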

Comment on lines +137 to +141
if (cause instanceof com.google.api.gax.rpc.UnauthenticatedException) {
RetryingContext context = retryingFuture.getRetryingContext();
if (context instanceof com.google.api.gax.rpc.ApiCallContext) {
com.google.api.gax.rpc.TransportChannel transportChannel =
((com.google.api.gax.rpc.ApiCallContext) context).getTransportChannel();

medium

The use of fully qualified names for UnauthenticatedException, ApiCallContext, and TransportChannel makes the code verbose and harder to read. It is recommended to use imports instead.

Comment on lines +41 to +44
if ("true".equalsIgnoreCase(System.getenv("isMwlidEnvironment"))
&& previousThrowable instanceof UnauthenticatedException) {
return true;
}

medium

The logic for checking the isMwlidEnvironment environment variable and the exception type is duplicated across both shouldRetry method overloads. Consider consolidating this into a private helper method and caching the environment variable result to improve maintainability and performance.
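
A rough sketch of the consolidation suggested here, under simplifying assumptions: the nested UnauthenticatedException is a local stand-in for com.google.api.gax.rpc.UnauthenticatedException, the context parameter is typed as Object instead of RetryingContext, and the class and helper names are hypothetical. Both overloads delegate to one private-style helper that uses a cached flag.

```java
/** Simplified sketch; types here are stand-ins so the example compiles without gax. */
final class MwlidRetryCheck {
  /** Stand-in for com.google.api.gax.rpc.UnauthenticatedException. */
  static final class UnauthenticatedException extends RuntimeException {}

  // Cached once, as suggested in the review, instead of calling System.getenv per attempt.
  private static final boolean IS_MWLID =
      "true".equalsIgnoreCase(System.getenv("isMwlidEnvironment"));

  /** Single place for the duplicated condition; the flag is a parameter to keep it testable. */
  static boolean isUnauthenticatedRetry(boolean mwlidEnabled, Throwable previousThrowable) {
    return mwlidEnabled && previousThrowable instanceof UnauthenticatedException;
  }

  /** Overload without a context. Returning false stands in for the existing retry logic. */
  boolean shouldRetry(Throwable previousThrowable, Object previousResponse) {
    return isUnauthenticatedRetry(IS_MWLID, previousThrowable);
  }

  /** Overload with a context (typed as Object here; an assumption for brevity). */
  boolean shouldRetry(Object context, Throwable previousThrowable, Object previousResponse) {
    return isUnauthenticatedRetry(IS_MWLID, previousThrowable);
  }

  public static void main(String[] args) {
    System.out.println(isUnauthenticatedRetry(true, new UnauthenticatedException()));
  }
}
```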

@blakeli0 force-pushed the feat/gax-mwlid-channel-refresh branch from 9e55d01 to 188158f on May 15, 2026 21:58