pipeline/processors/tda.md
Lines changed: 21 additions & 18 deletions
@@ -1,9 +1,9 @@
1
1
# TDA
2
2
3
-
The `tda` processor applies **Topological Data Analysis (TDA)** – specifically, **persistent homology** – to Fluent Bit’s metrics stream and exports **Betti numbers** that summarize the shape of recent behavior in metric space.
4
-
5
-
This processor is intended for detecting **phase transitions**, **regime changes**, and **intermittent instabilities** that are hard to see from individual counters, gauges, or standard statistical aggregates. It can, for example, differentiate between a single, one-off failure and an extended period of intermittent failures where the system never settles into a stable regime.
3
+
The `tda` processor applies **Topological Data Analysis (TDA)**, specifically **persistent homology**, to the Fluent Bit metrics stream and exports **Betti numbers** that summarize the shape of recent behavior in metric space.
6
4
5
+
This processor is intended for detecting **phase transitions**, **regime changes**, and **intermittent instabilities** that are difficult to detect from individual counters, gauges, or standard statistical aggregates.
6
+
It can, for example, differentiate between a single, one-off failure and an extended period of intermittent failures where the system never settles into a stable regime.
7
7
Currently, `tda` works only in the **metrics pipeline** (`processors.metrics`).
8
8
9
9
---
@@ -48,7 +48,8 @@ On each metrics flush, `tda`:
48
48
To stabilize very different magnitudes and bursty traffic, each rate is mapped to
49
49
`norm = log1p(|rate|)`, and the sign of `rate` is reattached. This yields a vector that is roughly scale-invariant but still sensitive to relative changes in rates across groups.
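The signed `log1p` mapping described above can be sketched in a few lines of Python. This is only an illustration of the formula, not the plugin's C code; the function name `signed_log1p` is hypothetical.

```python
import math


def signed_log1p(rate: float) -> float:
    """Return sign(rate) * log1p(|rate|), the normalization described above."""
    # copysign reattaches the sign of `rate` to the compressed magnitude.
    return math.copysign(math.log1p(abs(rate)), rate)


# Rates that differ by orders of magnitude end up on a comparable scale.
rates = [0.0, 9.0, -9.0, 1_000_000.0]
normalized = [signed_log1p(r) for r in rates]
```

Because `log1p` grows slowly, a burst from 9 to 1,000,000 events per second moves the normalized value from about 2.3 to about 13.8 rather than by five orders of magnitude.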
50
50
51
-
The resulting normalized vector is written into a **ring buffer window** (`tda_window`), implemented via a lightweight circular buffer (`lwrb`) that stores timestamped samples. The window maintains at most `window_size` samples; older samples are dropped when the buffer is full.
51
+
The resulting normalized vector is written into a **ring buffer window** (`tda_window`), implemented through a lightweight circular buffer (`lwrb`) that stores timestamped samples.
52
+
The window maintains at most `window_size` samples; older samples are dropped when the buffer is full.
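The drop-oldest behavior of the window can be illustrated with Python's `collections.deque`. This is a stand-in for the `lwrb`-based buffer, not a port of it; the class name `SampleWindow` and its methods are hypothetical.

```python
import time
from collections import deque


class SampleWindow:
    """Bounded window of (timestamp, vector) samples; the oldest are dropped first."""

    def __init__(self, window_size: int):
        # deque(maxlen=...) discards the oldest entry once the buffer is full.
        self.samples = deque(maxlen=window_size)

    def push(self, vector, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        self.samples.append((ts, list(vector)))


window = SampleWindow(window_size=3)
for i in range(5):
    window.push([float(i)], timestamp=float(i))
# Only the three most recent samples (timestamps 2.0, 3.0, 4.0) remain.
```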
52
53
53
54
### 2. Sliding window and delay embedding
54
55
@@ -65,7 +66,7 @@ $$
65
66
66
67
where each `x_·` is the **D-dimensional normalized metrics vector** at that time. This yields embedded points in $\mathbb{R}^{mD}$.
67
68
68
-
Because we need all lags to be inside the window, the number of embedded points is:
69
+
Because all lags must be inside the window, the number of embedded points is:
69
70
70
71
$$
71
72
n_{\text{embed}} = n_{\text{raw}} - (m - 1)\tau
@@ -77,8 +78,8 @@ This embedding follows the idea of **Takens’ theorem**, which states that, und
77
78
78
79
Intuitively:
79
80
80
-
* `embed_dim = 1`: you see only the current “snapshot” geometry.
81
-
* `embed_dim > 1`: you expose **loops and recurrent trajectories** in the joint evolution of metrics, which later show up as **H₁ (Betti₁) features**.
81
+
* `embed_dim = 1`: only the current "snapshot" geometry is visible.
82
+
* `embed_dim > 1`: **loops and recurrent trajectories** in the joint evolution of metrics become visible, which later show up as **H₁ (Betti₁) features**.
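A minimal Python sketch of the delay embedding follows. The names `m` (embedding dimension) and `tau` (delay) follow the formula above. The order in which lagged vectors are concatenated is an assumption, and `delay_embed` is a hypothetical helper rather than a function from the plugin.

```python
def delay_embed(window, m, tau):
    """Build delay-embedded points from a list of D-dimensional vectors.

    Returns n_raw - (m - 1) * tau points, each of dimension m * D.
    Assumed ordering: current sample first, then progressively older lags.
    """
    n_raw = len(window)
    n_embed = n_raw - (m - 1) * tau
    if n_embed <= 0:
        return []  # not enough samples to fit every lag inside the window
    points = []
    for i in range(n_embed):
        t = i + (m - 1) * tau  # newest index that contributes to this point
        point = []
        for k in range(m):
            point.extend(window[t - k * tau])
        points.append(point)
    return points


# 10 raw samples with D = 2, m = 3, tau = 2 give 10 - (3 - 1) * 2 = 6 points in R^6.
raw = [[float(i), float(-i)] for i in range(10)]
embedded = delay_embed(raw, m=3, tau=2)
```

With `m = 1` the function returns the raw vectors unchanged, which corresponds to the "snapshot" case in the list above.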
* The off-diagonal distances are collected, sorted, and several quantiles are evaluated, e.g. `q ∈ {0.10, 0.20, …, 0.90}`.
100
+
* The off-diagonal distances are collected, sorted, and several quantiles are evaluated, for example `q ∈ {0.10, 0.20, …, 0.90}`.
100
101
* For each candidate quantile `q`, a threshold `r_q` is chosen and Betti numbers are computed using Ripser.
101
102
* The plugin prefers the scale where **Betti₁** (loops) is maximized; if all Betti₁ are zero, it falls back to Betti₀ as a secondary indicator.
102
103
103
104
2. **Fixed quantile mode** (`0 < threshold < 1`)
104
105
105
106
* `threshold` is interpreted as a single quantile `q`. The Rips radius is set at this quantile of all pairwise distances.
106
-
* The multi-quantile scan still runs internally for robustness, but reported diagnostics (e.g., debug logs) will reflect the user-selected quantile.
107
+
* The multi-quantile scan still runs internally for robustness, but reported diagnostics (for example, debug logs) will reflect the user-selected quantile.
107
108
108
109
Internally, quantile selection is handled by `tda_choose_threshold_from_dist`, which gathers all `i > j` entries of the distance matrix, sorts them, and picks the specified quantile index.
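The quantile pick can be sketched in Python as follows. This is not the source of `tda_choose_threshold_from_dist`. The Euclidean metric, the index rounding (`floor(q * (n - 1))`), and the helper names are all assumptions made for illustration.

```python
import math


def dense_distance_matrix(points):
    """Dense pairwise distance matrix (Euclidean distance assumed here)."""
    return [[math.dist(p, q) for q in points] for p in points]


def choose_threshold(dist, q):
    """Pick the q-quantile of the strictly lower-triangular (i > j) distances."""
    lower = sorted(dist[i][j] for i in range(len(dist)) for j in range(i))
    if not lower:
        return 0.0
    # Assumed rounding rule: floor of q * (n - 1), clamped to a valid index.
    idx = min(len(lower) - 1, max(0, int(q * (len(lower) - 1))))
    return lower[idx]


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dist = dense_distance_matrix(points)
radius = choose_threshold(dist, q=0.5)  # sorted distances are [5.0, 5.0, 10.0]
```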
109
110
110
-
### 5. Persistent homology via Ripser
111
+
### 5. Persistent Homology through Ripser
111
112
112
113
Once the compressed lower-triangular distance matrix is built, it is passed to a thin wrapper around **Ripser**, a well-known implementation of Vietoris–Rips persistent homology:
113
114
114
115
1. **Compression and C API**
115
116
116
-
* The dense `n_embed × n_embed` matrix is converted into Ripser’s `compressed_lower_distance_matrix`.
117
+
* The dense `n_embed × n_embed` matrix is converted into Ripser's `compressed_lower_distance_matrix`.
117
118
* The wrapper function `flb_ripser_compute_betti_from_dense_distance` runs Ripser up to `max_dim = 2` (H₀, H₁, H₂), using coefficients in $\mathbb{Z}/2\mathbb{Z}$, and accumulates persistence intervals into Betti numbers with a small persistence cutoff to ignore very short-lived noise features.
118
119
119
120
2. **Interval aggregation**
@@ -124,7 +125,7 @@ Once the compressed lower-triangular distance matrix is built, it is passed to a
124
125
3. **Multi-scale selection**
125
126
126
127
* For each candidate threshold, Betti numbers are computed.
127
-
* The “best” scale is chosen as the one with the largest Betti₁ (loops); if Betti₁ is zero across scales, the plugin picks the scale where Betti₀ is largest.
128
+
* The "best" scale is chosen as the one with the largest Betti₁ (loops); if Betti₁ is zero across scales, the plugin picks the scale where Betti₀ is largest.
128
129
* The corresponding Betti₀, Betti₁, and Betti₂ values are then exported as Fluent Bit gauges.
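The selection rule in the multi-scale step can be sketched in Python. The `ScaleResult` type and `select_scale` function are hypothetical, and tie-breaking (first candidate wins) is an assumption. The Betti numbers would come from Ripser and are simply given as inputs here.

```python
from typing import NamedTuple, Optional, Sequence


class ScaleResult(NamedTuple):
    threshold: float
    betti0: int
    betti1: int
    betti2: int


def select_scale(candidates: Sequence[ScaleResult]) -> Optional[ScaleResult]:
    """Prefer the scale with the most loops; fall back to the most components."""
    if not candidates:
        return None
    if any(c.betti1 > 0 for c in candidates):
        return max(candidates, key=lambda c: c.betti1)
    # Betti1 is zero at every scale, so Betti0 becomes the secondary indicator.
    return max(candidates, key=lambda c: c.betti0)


scan = [
    ScaleResult(threshold=0.4, betti0=12, betti1=0, betti2=0),
    ScaleResult(threshold=0.9, betti0=5, betti1=3, betti2=1),
    ScaleResult(threshold=1.6, betti0=2, betti1=1, betti2=0),
]
best = select_scale(scan)  # the 0.9 scale, which has the largest Betti1
```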
129
130
130
131
### 6. Exported metrics
@@ -133,9 +134,9 @@ Once the compressed lower-triangular distance matrix is built, it is passed to a
|`fluentbit_tda_betti0`| gauge | Approximate Betti₀ – number of connected components (clusters) in the embedded point cloud at the selected scale. Large values indicate fragmentation into many “micro-regimes”. |
137
-
|`fluentbit_tda_betti1`| gauge | Approximate Betti₁ – number of 1-dimensional loops / cycles in the Rips complex. Non-zero values often signal **recurrent, quasi-periodic, or cycling behavior**, typical of intermittent failure / recovery patterns and other regime switches. |
138
-
|`fluentbit_tda_betti2`| gauge | Approximate Betti₂ – number of 2-dimensional voids (higher-order structures). These can appear when the system explores different “surfaces” in state space, e.g., transitioning between distinct operating modes. |
137
+
|`fluentbit_tda_betti0`| gauge | Approximate Betti₀ - number of connected components (clusters) in the embedded point cloud at the selected scale. Large values indicate fragmentation into many "micro-regimes". |
138
+
|`fluentbit_tda_betti1`| gauge | Approximate Betti₁ - number of 1-dimensional loops / cycles in the Rips complex. Non-zero values often signal **recurrent, quasi-periodic, or cycling behavior**, typical of intermittent failure / recovery patterns and other regime switches. |
139
+
|`fluentbit_tda_betti2`| gauge | Approximate Betti₂ - number of 2-dimensional voids (higher-order structures). These can appear when the system explores different "surfaces" in state space, for example when transitioning between distinct operating modes. |
139
140
140
141
Each metric is timestamped with the current time at the moment of TDA computation and is exported via the same metrics context it received, so downstream metric outputs can scrape or forward them like any other Fluent Bit metric.
141
142
@@ -170,11 +171,13 @@ Some practical patterns:
170
171
171
172
3. **Intermittent failure / unstable regime**
172
173
173
-
* The system repeatedly bounces between “healthy” and “unhealthy” states (e.g., repeated `Connection refused` / `broken connection` errors interspersed with 200 responses).
174
+
* The system repeatedly bounces between "healthy" and "unhealthy" states (e.g., repeated `Connection refused` / `broken connection` errors interspersed with 200 responses).
174
175
* The trajectory in phase space forms **loops**: metrics move away from the healthy region and then return, many times.
175
176
* Betti₁ (and occasionally Betti₂) increases noticeably while this behavior persists, reflecting the emergence of non-trivial cycles in the metric dynamics.
176
177
177
-
In the sample output, as the HTTP output oscillates between success and various `Connection refused` / `broken connection` errors, `fluentbit_tda_betti1` and `fluentbit_tda_betti2` grow from small values to larger plateaus (e.g., Betti₁ around 10–13, Betti₂ around 1–2) while Betti₀ also increases. This is a direct signature of a **phase transition** from a stable regime to one with persistent, intermittent instability.
178
+
In the sample output, the HTTP output oscillates between success and various "Connection refused" and "broken connection" errors.
179
+
As this occurs, `fluentbit_tda_betti1` and `fluentbit_tda_betti2` grow from small values to larger plateaus (for example, Betti₁ around 10–13, Betti₂ around 1–2) while Betti₀ also increases.
180
+
This is a direct signature of a **phase transition** from a stable regime to one with persistent, intermittent instability.
178
181
179
182
These interpretations are consistent with results from condensed matter physics and dynamical systems, where persistent homology has been used to detect phase transitions and changes in underlying order purely from data (References 1 and 2).
180
183
@@ -184,7 +187,7 @@ These interpretations are consistent with results from condensed matter physics
184
187
185
188
### Basic setup with `fluentbit_metrics`
186
189
187
-
The following example computes TDA on Fluent Bit’s own internal metrics, using `metrics_selector` to remove a few high-cardinality or uninteresting metrics before feeding them into `tda`:
190
+
The following example computes TDA on Fluent Bit's own internal metrics, using `metrics_selector` to remove a few high-cardinality or uninteresting metrics before feeding them into `tda`: