Conversation

@robobario (Member):

This proposal outlines a new feature for the Kroxylicious filter framework: a mechanism to resolve topic names from their corresponding topic IDs. This will enable filters to implement business logic based on human-readable topic names, rather than the underlying Kafka implementation detail of topic IDs.
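
For illustration, here is a sketch of how a filter might consume such a facility. `FetchRequestFilter`, `FilterContext`, and `forwardRequest` are existing Kroxylicious API; the `getTopicNames` method, and its placement on `FilterContext`, are assumptions of this sketch based on the proposal.

```java
import java.util.Set;
import java.util.concurrent.CompletionStage;
import java.util.stream.Collectors;

import org.apache.kafka.common.Uuid;
import org.apache.kafka.common.message.FetchRequestData;
import org.apache.kafka.common.message.RequestHeaderData;

import io.kroxylicious.proxy.filter.FetchRequestFilter;
import io.kroxylicious.proxy.filter.FilterContext;
import io.kroxylicious.proxy.filter.RequestFilterResult;

/**
 * Hypothetical filter sketch: Fetch requests carry topic IDs rather than
 * names (since FetchRequest v13, KIP-516), so a filter wanting name-based
 * business logic must resolve them first. Assumes the proposed
 * getTopicNames(Set<Uuid>) lands on FilterContext.
 */
public class TopicNameAwareFilter implements FetchRequestFilter {

    @Override
    public CompletionStage<RequestFilterResult> onFetchRequest(short apiVersion,
                                                               RequestHeaderData header,
                                                               FetchRequestData request,
                                                               FilterContext context) {
        Set<Uuid> topicIds = request.topics().stream()
                .map(FetchRequestData.FetchTopic::topicId)
                .collect(Collectors.toSet());
        // Hypothetical facility from this proposal: resolve ids to names,
        // then apply name-based logic before forwarding upstream.
        return context.getTopicNames(topicIds)
                .thenCompose(namesById -> {
                    namesById.forEach((id, name) -> System.out.printf("fetch touches topic %s (%s)%n", name, id));
                    return context.forwardRequest(header, request);
                });
    }
}
```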

robobario requested a review from a team as a code owner · August 15, 2025
@robobario (Member, Author):

Here's a rough prototype: kroxylicious/kroxylicious#2583


> The 'edge' aspect means that we cache what the upstream said the topic names are at the Proxy -> Upstream edge, but responses must still traverse the filter chain to maintain composability.
>
> We can implement a very long-lived cache invalidation strategy because Kafka topics cannot be renamed. Inconsistencies will arise if upstream topics are deleted, but we should be able to rely on the upstream failing if we forward any interactions with deleted topics.

Member:
Aside: KIP-516 imagines a world where topics can be renamed. Obviously, we don't need to think about that right now.

@tombentley (Member):
I, too, don't want to think about this. But we should.

Kafka already has a mechanism for deferred deletion. I don't recall if that's just soft-deleting the data, with the topic removed from the metadata immediately (such that you can recreate the topic as soon as the deletion response succeeds), or whether it imposes a delay before you can recreate the topic. If the latter, then knowing the timeout would at least give us an upper bound on how long to wait before revalidating caches.
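
As a concrete illustration of the long-lived cache described in the quoted text above, here is a minimal sketch (class and method names are invented): entries never go stale because topics cannot currently be renamed, so the cache only ever accumulates mappings, leaving deleted topics to fail upstream.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

import org.apache.kafka.common.Uuid;

// Minimal sketch (invented names): a cache of upstream-reported topic names,
// populated at the Proxy -> Upstream edge. Since Kafka topics cannot be
// renamed, a learned mapping never goes stale; deletion is handled by letting
// forwarded requests for deleted topics fail upstream.
public class TopicNameCache {
    private final ConcurrentHashMap<Uuid, String> namesById = new ConcurrentHashMap<>();

    /** Record what the upstream said the topic names are. */
    public void record(Map<Uuid, String> resolved) {
        namesById.putAll(resolved);
    }

    /** The subset of the given ids we still need to ask the upstream about. */
    public Set<Uuid> misses(Set<Uuid> topicIds) {
        return topicIds.stream()
                .filter(id -> !namesById.containsKey(id))
                .collect(Collectors.toSet());
    }

    /** The cached name, or null if this id has never been resolved. */
    public String name(Uuid topicId) {
        return namesById.get(topicId);
    }
}
```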

```java
 * Asynchronously resolves a set of topic UUIDs to their corresponding topic names.
 *
 * @param topicUuids A Set of topic UUIDs to resolve.
 * @return A CompletionStage that will complete with an unmodifiable Map<Uuid, String>,
 *         failing if any UUID cannot be resolved.
```
Member:
I question whether failing in the case where a Uuid is not found is the right behavior. The caller has no way to know which Uuid has been deleted, so I think you are really forcing them to query with singletons. I think in the case where a topic id no longer exists, returning a map with that key absent is better.
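
A sketch of the calling pattern this suggestion enables, assuming unresolved IDs are simply absent from the returned map (helper name invented):

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

import org.apache.kafka.common.Uuid;

// If unresolved ids are absent from the map rather than failing the whole
// stage, the caller can identify exactly which ids were not found without
// falling back to singleton queries. (Helper name is illustrative.)
static Set<Uuid> unresolved(Set<Uuid> requested, Map<Uuid, String> resolved) {
    return requested.stream()
            .filter(id -> !resolved.containsKey(id))
            .collect(Collectors.toSet());
}
```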

@tombentley (Member):
I'm also a bit cautious about this. "cannot be resolved" covers things like the user not being authorized to see the topic, as well as races wrt topic creation and deletion. We need to be sure that what the client observes is consistent with Kafka behaviour.

@robobario (Member, Author):
I've reworked it with a Result class that takes either the name or an exception, so we can have specific exception types for non-existence vs. authorization or other error codes from Kafka.
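
For illustration, such a Result type might look something like the sketch below (all names invented; the real class may differ):

```java
// Illustrative sketch of a per-topic result that carries either the resolved
// name or a specific exception, so callers can distinguish non-existence
// from authorization failures or other Kafka error codes. All names here
// are invented, not the actual proposal API.
public record TopicNameResult(String name, Throwable failure) {

    public static TopicNameResult success(String name) {
        return new TopicNameResult(name, null);
    }

    public static TopicNameResult failure(Throwable cause) {
        return new TopicNameResult(null, cause);
    }

    public boolean isSuccess() {
        return failure == null;
    }

    /** The resolved name, or rethrow this topic's recorded failure. */
    public String nameOrThrow() {
        if (failure != null) {
            throw new IllegalStateException("topic name could not be resolved", failure);
        }
        return name;
    }
}
```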

```java
CompletionStage<Map<Uuid, String>> getTopicNames(Set<Uuid> topicUuids);
```

> This method will internally send a `Metadata` request to the upstream Kafka cluster to fetch the required topic information.

@k-wall (Member), Aug 26, 2025:
So this must be sending a request for all topics rather than just the subset queried by the user? I am guessing too that includeClusterAuthorizedOperations etc. will be false, to minimize the amount of work the server has to do.

It is a pity one can't query for topics without pulling the partition information (which might be huge).
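
For illustration, a Metadata request restricted to the queried ids might be built roughly like this (a sketch against Kafka's generated MetadataRequestData; querying by topic id requires MetadataRequest v12+, per KIP-516):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

import org.apache.kafka.common.Uuid;
import org.apache.kafka.common.message.MetadataRequestData;

// Sketch: ask only for the topic ids the caller supplied (MetadataRequest
// v12+ supports lookup by topic id, per KIP-516), with auto-creation and
// the topic authorized-operations flag off to minimize server-side work.
static MetadataRequestData metadataRequestFor(Set<Uuid> topicUuids) {
    List<MetadataRequestData.MetadataRequestTopic> topics = topicUuids.stream()
            .map(id -> new MetadataRequestData.MetadataRequestTopic()
                    .setTopicId(id)
                    .setName(null)) // name is nullable (and unused) when querying by id
            .collect(Collectors.toList());
    return new MetadataRequestData()
            .setTopics(topics)
            .setAllowAutoTopicCreation(false)
            .setIncludeTopicAuthorizedOperations(false);
}
```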

@robobario (Member, Author):

I had imagined only querying the topic ids that we are immediately interested in. I guess doing them all at once would save incrementally building up the cache.

@k-wall (Member) commented Aug 26, 2025:

Thanks for the write-up @robobario. I left some comments, but I think the shape is broadly good.
I'd mention this proposal on Slack and see if anyone in the community has any opinions.

@tombentley (Member) left a comment:
Thanks @robobario

k-wall changed the title from "Add topic name lookup facility" to "008 - Add topic name lookup facility" · Sep 19, 2025
@tombentley (Member) left a comment:
Thanks @robobario. I had a few questions, but overall this looks good.

tombentley moved this to In Progress in 2025_Q3 · Oct 9, 2025
@tombentley (Member):

@robobario assuming #75 doesn't get merged first, I suppose you will need to renumber the filename.

Signed-off-by: Robert Young <robertyoungnz@gmail.com>
robobario changed the title from "008 - Add topic name lookup facility" to "007 - Add topic name lookup facility" · Oct 31, 2025
robobario force-pushed the name-lookup-facility branch from 3dd0c2a to 35b7351 · October 31, 2025
k-wall merged commit dd6e94b into kroxylicious:main · Oct 31, 2025 (1 check passed)
github-project-automation bot moved this from In Progress to Done in 2025_Q3 · Oct 31, 2025