Merged
Commits
44 commits
f28262f
improve: add license headers to source files (#2980)
csviri Oct 9, 2025
3ff81a8
chore: version to 5.3.0-SNAPSHOT
csviri Oct 14, 2025
e77318d
Annotation removal using locking (#3015)
shawkins Oct 30, 2025
0bd0844
improve: complete comparable resource version configs (#3027)
csviri Nov 13, 2025
23f15ed
improve: run pr-s checks for v5.3 (#3042)
csviri Nov 17, 2025
8c64470
fix: rebase on main after release
csviri Dec 1, 2025
2c45235
fix(javadoc): invalid method ref blocks snapshot release (#3076)
csviri Dec 1, 2025
dd68f15
feat: record desired state in Context (#3082)
metacosm Dec 3, 2025
8e036d7
improve: rename junit5 module to junit (#3081)
csviri Dec 4, 2025
ef43fc2
fix: delete empty files result of rebase on main (#3093)
csviri Dec 15, 2025
82a7b82
feat: ReconcileUtils for strongly consistent updates (#3106)
csviri Jan 15, 2026
1d61d88
Event filtering now records resource action and previous resource (#3…
csviri Jan 21, 2026
063e8e3
improve: facelift samples to use ReconcileUtils (#3135)
csviri Jan 27, 2026
9818d87
improve: move compare resource version methods to internal utils (#3137)
csviri Jan 28, 2026
26e51e6
feat: move ReconcileUtils methods to ResourceOperations accessible fr…
csviri Feb 2, 2026
9041832
improve: KubernetesDependentResource uses resource operations directl…
csviri Feb 2, 2026
d73e95e
feat: provide de-duplicated secondary resources stream on Context (#3…
metacosm Feb 3, 2026
23c68f2
refactor: avoid creating intermediate collections when unneeded (#3156)
metacosm Feb 5, 2026
010754d
improve: event filtering algorithm for multiple parallel updates (#3155)
csviri Feb 6, 2026
6766ba9
improve: prepare for removal of exitOnStopLeading from public API (#3…
metacosm Feb 6, 2026
6f057d0
fix: typo (#3173)
metacosm Feb 19, 2026
ee3b212
fix: incorrect logic by introducing createOrUpdate method (#3172)
metacosm Feb 19, 2026
e46fb0b
chore: set next version to 999-SNAPSHOT (#3180)
metacosm Feb 23, 2026
bd014e8
feat: allow to skip namespace deletion in junit extension (#3178)
csviri Feb 23, 2026
9140239
improve: logging for resource filter cache (#3167)
csviri Feb 23, 2026
a01f524
fix: unify how resource information is added, prevent NPEs (#3185)
metacosm Feb 25, 2026
4389865
feat: emit MDCUtils.NO_NAMESPACE value when namespace is null (#3186)
metacosm Feb 26, 2026
06fa48a
improve: do not close infra client if same as client (#3187)
csviri Feb 26, 2026
2b12af6
feat: add MDC to workflow execution (#3188)
csviri Feb 28, 2026
e6c2fa7
fix: concurrency issue with filtering and caching update (#3191)
csviri Mar 2, 2026
942afd1
improve: deprecate redundant ManagedInformerEventSource.getCachedValu…
csviri Mar 2, 2026
b8d1139
docs: read-cache-after-write consistency and event filtering (#3193)
csviri Mar 2, 2026
8536fdf
fix: logging for flaky MySQLSchema E2E test (#3198)
csviri Mar 2, 2026
440f8b7
feat: configuration adapters (#3177)
csviri Mar 6, 2026
a0d2297
feat: metricsV2 + oTel + prometheus sample and Grafana dashboard (#3154)
csviri Mar 6, 2026
d8e3992
fix: wait for pods to be ready before port-forwarding in MetricsHandl…
csviri Mar 9, 2026
543b917
improve: informer health check should not rely on isWatching (#3209)
csviri Mar 9, 2026
f2c9b7a
Obsolete resource handling for read-cache-after-write (#3207)
csviri Mar 12, 2026
9d5b0a7
improve: simplified MicrometerMetricsV2 builder api (#3220)
csviri Mar 12, 2026
944f5b1
fix: typo in AgregatePriorityListConfigProvider name (#3217)
csviri Mar 12, 2026
c69810a
improve: cleanup gh actions (#3223)
csviri Mar 13, 2026
3ec985d
improve: doc fixes for micrometer metrics (#3222)
csviri Mar 13, 2026
b491b6a
docs: blog about 5.3.0 release (#3205)
csviri Mar 13, 2026
84d7b89
docs: blog post on read-cache-after-write consistency (#3194)
csviri Mar 13, 2026
1 change: 1 addition & 0 deletions .github/workflows/e2e-test.yml
@@ -24,6 +24,7 @@ jobs:
- "sample-operators/tomcat-operator"
- "sample-operators/webpage"
- "sample-operators/leader-election"
+ - "sample-operators/metrics-processing"
runs-on: ubuntu-latest
steps:
- name: Checkout
1 change: 1 addition & 0 deletions .github/workflows/pr.yml
@@ -11,6 +11,7 @@ on:
paths-ignore:
- 'docs/**'
- 'adr/**'
+ - 'observability/**'
workflow_dispatch:
jobs:
check_format_and_unit_tests:
2 changes: 1 addition & 1 deletion bootstrapper-maven-plugin/pom.xml
@@ -22,7 +22,7 @@
<parent>
<groupId>io.javaoperatorsdk</groupId>
<artifactId>java-operator-sdk</artifactId>
- <version>5.2.4-SNAPSHOT</version>
+ <version>999-SNAPSHOT</version>
</parent>

<artifactId>bootstrapper</artifactId>
@@ -57,7 +57,7 @@
</dependency>
<dependency>
<groupId>io.javaoperatorsdk</groupId>
- <artifactId>operator-framework-junit-5</artifactId>
+ <artifactId>operator-framework-junit</artifactId>
<version>${josdk.version}</version>
<scope>test</scope>
</dependency>
4 changes: 2 additions & 2 deletions caffeine-bounded-cache-support/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>io.javaoperatorsdk</groupId>
<artifactId>java-operator-sdk</artifactId>
- <version>5.2.4-SNAPSHOT</version>
+ <version>999-SNAPSHOT</version>
</parent>

<artifactId>caffeine-bounded-cache-support</artifactId>
@@ -43,7 +43,7 @@
</dependency>
<dependency>
<groupId>io.javaoperatorsdk</groupId>
- <artifactId>operator-framework-junit-5</artifactId>
+ <artifactId>operator-framework-junit</artifactId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>
10 changes: 10 additions & 0 deletions docs/content/en/blog/news/primary-cache-for-next-recon.md
@@ -5,6 +5,16 @@ author: >-
[Attila Mészáros](https://github.com/csviri) and [Chris Laprun](https://github.com/metacosm)
---

{{% alert title="Deprecated" %}}

The read-cache-after-write consistency feature replaces this functionality (since version 5.3.0).

> It also provides this functionality for secondary resources, and optimistic locking
is no longer required. See the [docs](./../../docs/documentation/reconciler.md#read-cache-after-write-consistency-and-event-filtering) and
related [blog post](read-after-write-consistency.md) for details.

{{% /alert %}}

We recently released v5.1 of Java Operator SDK (JOSDK). One of the highlights of this release is related to a topic of
so-called
[allocated values](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#representing-allocated-values
282 changes: 282 additions & 0 deletions docs/content/en/blog/news/read-after-write-consistency.md
@@ -0,0 +1,282 @@
---
title: Welcome read-cache-after-write consistency!
date: 2026-03-13
author: >-
[Attila Mészáros](https://github.com/csviri)
---

**TL;DR:**
In version 5.3.0 we introduced strong consistency guarantees for updates with a new API.
You can now update resources (both your custom resource and managed resources)
and the framework will guarantee that these updates will be instantly visible
when accessing resources from caches,
and naturally also for subsequent reconciliations.

I briefly [talked about this](https://www.youtube.com/watch?v=HrwHh5Yh6AM&t=1387s) topic at KubeCon last year.

```java
public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) {

  ConfigMap managedConfigMap = prepareConfigMap(webPage);
  // apply the resource with new API
  context.resourceOperations().serverSideApply(managedConfigMap);

  // fresh resource instantly available from our update in the caches
  var upToDateResource = context.getSecondaryResource(ConfigMap.class);

  // from now on built-in update methods by default use this feature;
  // it is guaranteed that resource changes will be visible for next reconciliation
  return UpdateControl.patchStatus(alterStatusObject(webPage));
}
```

In addition to that, the framework will automatically filter events for your own updates,
so they don't trigger the reconciliation again.

{{% alert color=success %}}
**This should significantly simplify controller development, and will make reconciliation
much simpler to reason about!**
{{% /alert %}}

This post will deep dive into this topic, exploring the details and rationale behind it.

See the related umbrella [issue](https://github.com/operator-framework/java-operator-sdk/issues/2944) on GitHub.

## Informers and eventual consistency

First, we have to understand a fundamental building block of Kubernetes operators: Informers.
Since plenty of accessible material already covers this topic, here's a brief summary. Informers:

1. Watch Kubernetes resources: the K8S API sends events to the client through a websocket
whenever a resource changes. An event usually contains the whole resource. (There are some exceptions, see Bookmarks.)
See details about watch as a K8S API concept in the [official docs](https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-watch).
2. Cache the latest state of the resource.
3. If an informer receives an event in which the `metadata.resourceVersion` is different from the version
in the cached resource, it calls the event handler, thus in our case triggering the reconciliation.

A controller is usually composed of multiple informers: one tracking the primary resource, and
additional informers registered for each (secondary) resource we manage.
Informers are great since we don't have to poll the Kubernetes API — it is push-based. They also provide
a cache, so reconciliations are very fast since they work on top of cached resources.
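
Point 3 of the list above can be pictured with a minimal sketch. This is not the fabric8 or JOSDK implementation: `InformerSketch`, `onWatchEvent` and `cachedVersion` are hypothetical names, and a resource is reduced to its name and `metadata.resourceVersion`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, heavily simplified informer; not the fabric8 or JOSDK implementation.
final class InformerSketch {
  // resource name -> last seen metadata.resourceVersion (stands in for the cached resource)
  private final Map<String, String> cache = new HashMap<>();
  private final Consumer<String> eventHandler;

  InformerSketch(Consumer<String> eventHandler) {
    this.eventHandler = eventHandler;
  }

  // Called for every watch event received from the API server.
  void onWatchEvent(String name, String resourceVersion) {
    String previous = cache.put(name, resourceVersion);
    // Only equality is checked: an unchanged version does not notify the handler.
    if (!resourceVersion.equals(previous)) {
      eventHandler.accept(name);
    }
  }

  String cachedVersion(String name) {
    return cache.get(name);
  }
}
```

In this sketch the handler would be whatever enqueues the reconciliation for the affected resource.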

Now let's take a look at the flow when we update a resource:


```mermaid
graph LR
subgraph Controller
Informer:::informer
Cache[(Cache)]:::teal
Reconciler:::reconciler
Informer -->|stores| Cache
Reconciler -->|reads| Cache
end
K8S[⎈ Kubernetes API Server]:::k8s

Informer -->|watches| K8S
Reconciler -->|updates| K8S

classDef informer fill:#C0527A,stroke:#8C3057,color:#fff
classDef reconciler fill:#E8873A,stroke:#B05E1F,color:#fff
classDef teal fill:#3AAFA9,stroke:#2B807B,color:#fff
classDef k8s fill:#326CE5,stroke:#1A4AAF,color:#fff
```

It is easy to see that the cache of the informer is eventually consistent with the update we sent from the reconciler.
It usually takes only a very short time (a few milliseconds) to sync the caches and everything is fine. Well, sometimes
it isn't. The websocket can be disconnected (which actually happens on purpose sometimes), the API Server can be slow, etc.


## The problem(s) we try to solve

Let's consider an operator with the following requirements:
- we have a custom resource `PrefixedPod` where the spec contains only one field: `podNamePrefix`
- the goal of the operator is to create a Pod with a name that has the prefix and a random suffix
- it should never run two Pods at once; if the `podNamePrefix` changes, it should delete
the current Pod and then create a new one
- the status of the custom resource should contain the `generatedPodName`

Here is how the code would look in 5.2.x:

```java

public UpdateControl<PrefixedPod> reconcile(PrefixedPod primary, Context<PrefixedPod> context) {

  Optional<Pod> currentPod = context.getSecondaryResource(Pod.class);

  if (currentPod.isPresent()) {
    if (podNameHasPrefix(primary.getSpec().getPodNamePrefix(), currentPod.get())) {
      // all ok we can return
      return UpdateControl.noUpdate();
    } else {
      // deletes the current pod with different name pattern
      context.getClient().resource(currentPod.get()).delete();
      // return; pod delete event will trigger the reconciliation
      return UpdateControl.noUpdate();
    }
  } else {
    // creates new pod
    var newPod = context.getClient().resource(createPodWithOwnerReference(primary)).serverSideApply();
    return UpdateControl.patchStatus(setGeneratedPodNameToStatus(primary, newPod));
  }
}

@Override
public List<EventSource<?, PrefixedPod>> prepareEventSources(EventSourceContext<PrefixedPod> context) {
  // Code omitted for adding InformerEventSource for the Pod
}
```

That is quite simple: if a Pod with a different name prefix exists, we delete it; if no Pod exists, we create one
and update the status. The Pod is created with an owner reference, so any update on the Pod will trigger
the reconciliation.

Now consider the following sequence of events:

1. We create a `PrefixedPod` with `spec.podNamePrefix`: `first-pod-prefix`.
2. Concurrently:
- The reconciliation logic runs and creates a Pod with a generated name suffix: "first-pod-prefix-a3j3ka";
it also sets this in the status and updates the custom resource status.
- While the reconciliation is running, we update the custom resource to have the value
`second-pod-prefix`.
3. The update of the custom resource triggers the reconciliation.

When the spec change triggers the reconciliation in step 3, there is absolutely **no guarantee** that:
- the created Pod will already be visible — `currentPod` might simply be empty
- the `status.generatedPodName` will be visible

Since both are backed by an informer, and the caches of those informers are only eventually consistent with our updates,
the next reconciliation could create a second Pod, violating the requirement to never run two
Pods at the same time. In addition, the controller would overwrite the status. With a Kubernetes
resource we can still find the existing Pods later via owner references, but if we were managing a
non-Kubernetes (external) resource we would not notice that we had already created one.

So can we have stronger guarantees regarding caches? It turns out we can now...

## Achieving read-cache-after-write consistency

When we send an update (this also applies to various create and patch requests) to the Kubernetes API, in the response
we receive the up-to-date resource with the resource version that is the most recent at that point.
The idea is that we can cache this response in a cache on top of the Informer's cache.
We call this cache `TemporaryResourceCache` (TRC), and besides caching such responses, it also plays a role in event filtering
as we will see later.

Note that the challenge in the past was knowing when to evict this response from the TRC. Eventually,
we will receive an event in the informer and the informer cache will be populated with an up-to-date resource.
But it was not possible to reliably tell whether an event contained a resource that was the result
of an update before or after our own update. The reason is that the Kubernetes documentation stated that
`metadata.resourceVersion` should be treated as an opaque string and matched only with equality.
With optimistic locking we were able to overcome this issue, though; see [this blog post](primary-cache-for-next-recon.md).

{{% alert color=success %}}
This changed in the Kubernetes guidelines. Now, if we can parse the `resourceVersion` as an integer,
we can use numerical comparison. See the related [KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/5504-comparable-resource-version).
{{% /alert %}}
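
To illustrate why numerical comparison matters, here is a minimal sketch. `ResourceVersions.compare` is a hypothetical helper, not the JOSDK API, and it assumes both versions are plain decimal integers; `BigInteger` is used so that the length of the version string does not matter.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration; not part of the JOSDK API.
final class ResourceVersions {
  private ResourceVersions() {}

  /**
   * Compares two resourceVersion strings numerically.
   * Returns a negative number, zero, or a positive number, like a Comparator.
   * Throws NumberFormatException if a version cannot be parsed as an integer.
   */
  static int compare(String v1, String v2) {
    return new BigInteger(v1).compareTo(new BigInteger(v2));
  }
}
```

Comparing the raw strings would order `"9"` after `"10"`, whereas `ResourceVersions.compare("9", "10")` yields a negative result.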

From this point the idea of the algorithm is very simple:

1. After updating a Kubernetes resource, cache the response in the TRC.
2. When the informer propagates an event, check if its resource version is greater than or equal to
the one in the TRC. If yes, evict the resource from the TRC.
3. When the controller reads a resource from cache, it checks the TRC first, then falls back to the Informer's cache.


```mermaid
sequenceDiagram
box rgba(50,108,229,0.1)
participant K8S as ⎈ Kubernetes API Server
end
box rgba(232,135,58,0.1)
participant R as Reconciler
end
box rgba(58,175,169,0.1)
participant I as Informer
participant IC as Informer Cache
participant TRC as Temporary Resource Cache
end

R->>K8S: 1. Update resource
K8S-->>R: Updated resource (with new resourceVersion)
R->>TRC: 2. Cache updated resource in TRC

I-)K8S: 3. Watch event (resource updated)
I->>TRC: On event: event resourceVersion ≥ TRC version?
alt Yes: event is up-to-date
I-->>TRC: Evict resource from TRC
else No: stale event
Note over TRC: TRC entry retained
end

R->>TRC: 4. Read resource from cache
alt Resource found in TRC
TRC-->>R: Return cached resource
else Not in TRC
R->>IC: Read from Informer Cache
IC-->>R: Return resource
end
```
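
The three steps can be condensed into a minimal sketch. It is not the actual `TemporaryResourceCache` from JOSDK: the class name `TemporaryCacheSketch`, its methods and the `Resource` record are hypothetical, resources are keyed by name only, and coarse `synchronized` methods stand in for the real concurrency handling. As an extra assumption, a response is not cached when the informer has already seen the same or a newer version, since such an entry would never be evicted.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the idea; not the JOSDK TemporaryResourceCache.
final class TemporaryCacheSketch {
  record Resource(String name, String resourceVersion, String payload) {}

  private final Map<String, Resource> informerCache = new HashMap<>();
  private final Map<String, Resource> temporaryCache = new HashMap<>();

  // Step 1: cache the response of our own update.
  synchronized void onUpdateResponse(Resource response) {
    Resource known = informerCache.get(response.name());
    if (known == null || !isSameOrNewer(known, response)) {
      temporaryCache.put(response.name(), response);
    }
  }

  // Step 2: on an informer event, evict the temporary entry if the event caught up with it.
  synchronized void onInformerEvent(Resource fromEvent) {
    informerCache.put(fromEvent.name(), fromEvent);
    Resource cached = temporaryCache.get(fromEvent.name());
    if (cached != null && isSameOrNewer(fromEvent, cached)) {
      temporaryCache.remove(fromEvent.name());
    }
  }

  // Step 3: reads check the temporary cache first, then fall back to the informer cache.
  synchronized Optional<Resource> get(String name) {
    Resource temporary = temporaryCache.get(name);
    return Optional.ofNullable(temporary != null ? temporary : informerCache.get(name));
  }

  synchronized boolean isInTemporaryCache(String name) {
    return temporaryCache.containsKey(name);
  }

  private static boolean isSameOrNewer(Resource a, Resource b) {
    return new BigInteger(a.resourceVersion()).compareTo(new BigInteger(b.resourceVersion())) >= 0;
  }
}
```

A stale event (older than our update) leaves the temporary entry in place, so reads keep returning the state we wrote until the informer catches up.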

## Filtering events for our own updates

When we update a resource, eventually the informer will propagate an event that would trigger a reconciliation.
However, this is usually undesirable. Since we already have the up-to-date resource at that point,
we would like to be notified only if the resource is changed after our change.
Therefore, in addition to caching the resource, we also filter out events that contain a resource
version older than or equal to our cached resource version.

Note that the implementation of this is relatively complex, since while performing the update we want to record all the
events received in the meantime and decide whether to propagate them further once the update request is complete.

However, this way we significantly reduce the number of reconciliations, making the whole process much more efficient.
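
A rough sketch of the filtering idea for a single resource follows. It is a simplification under stated assumptions, not the JOSDK algorithm: `OwnUpdateEventFilter` and its methods are hypothetical names, only one update is in flight at a time, and versions are assumed to be parseable as integers.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-resource sketch; the real implementation covers many more edge cases.
final class OwnUpdateEventFilter {
  private final List<String> recordedDuringUpdate = new ArrayList<>();
  private boolean updateInFlight;
  private BigInteger ownVersion; // null until our first update completed

  synchronized void updateStarted() {
    updateInFlight = true;
  }

  // Returns true if the event should trigger a reconciliation right away.
  synchronized boolean onEvent(String resourceVersion) {
    if (updateInFlight) {
      // We cannot decide yet: record the event and hold it back.
      recordedDuringUpdate.add(resourceVersion);
      return false;
    }
    return isNewerThanOwnUpdate(resourceVersion);
  }

  // Called with the resourceVersion from the update response.
  // Returns true if a recorded event is newer than our update and must be propagated.
  synchronized boolean updateFinished(String responseVersion) {
    ownVersion = new BigInteger(responseVersion);
    updateInFlight = false;
    boolean propagate = recordedDuringUpdate.stream().anyMatch(this::isNewerThanOwnUpdate);
    recordedDuringUpdate.clear();
    return propagate;
  }

  private boolean isNewerThanOwnUpdate(String resourceVersion) {
    return ownVersion == null || new BigInteger(resourceVersion).compareTo(ownVersion) > 0;
  }
}
```

Events with a version older than or equal to our own update are dropped, while anything newer still triggers a reconciliation.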

### The case for instant reschedule

We realize that some of our users might rely on the fact that reconciliation is triggered by their own updates.
To support backwards compatibility, or rather a migration path, we now provide a way to instruct the framework
to queue an instant reconciliation:

```java
public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) {

  // omitted reconciliation logic

  return UpdateControl.<WebPage>noUpdate().reschedule();
}
```

## Additional considerations and alternatives

An alternative approach would be to not trigger the next reconciliation until the
target resource appears in the Informer's cache. The upside is that we don't have to maintain an
additional cache of the resource, just the target resource version; therefore this approach might have
a smaller memory footprint, but not necessarily. See the related [KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/5647-stale-controller-handling#proposal)
that takes this approach.

On the other hand, when we make a request, the response object is always deserialized regardless of whether we are going
to cache it or not. This object in most cases will be cached for a very short time and later garbage collected.
Therefore, the memory overhead should be minimal.

Having the TRC has an additional advantage: since we have the resource instantly in our caches, we can
elegantly continue the reconciliation in the same pass and reconcile resources that depend
on the latest state. More concretely, this also helps with our [Dependent Resources / Workflows](../../docs/documentation/dependent-resource-and-workflows/workflows.md#reconcile-sample)
which rely on up-to-date caches. In this sense, this approach is also better for throughput.

## Conclusion

I personally worked on a prototype of an operator that depended on an unreleased version of JOSDK already
implementing these features. The most obvious gain was how much simpler the reasoning became in some cases and how it reduced the corner
cases that we would otherwise have to solve with the [expectation pattern](https://ahmet.im/blog/controller-pitfalls/#expectations-pattern)
or other facilities.

## Special thanks

I would like to thank all the contributors who directly or indirectly contributed, including [metacosm](https://github.com/metacosm),
[manusa](https://github.com/manusa), and [xstefank](https://github.com/xstefank).

Last but certainly not least, special thanks to [Steven Hawkins](https://github.com/shawkins),
who maintains the Informer implementation in the [fabric8 Kubernetes client](https://github.com/fabric8io/kubernetes-client)
and implemented the first version of the algorithms. We then iterated on it together multiple times.
Covering all the edge cases was quite an effort.
Just as a highlight, I'll mention the [last one](https://github.com/operator-framework/java-operator-sdk/issues/3208).

Thank you!