
[Bug] (dynamic-partition) DynamicPartitionScheduler.runtimeInfos leaks entries on DROP TABLE, causing FE OOM #62883

@horus-leonardo

Description


Search before asking

  • I had searched in the issues and found no similar issues.

Version

4.0.5-rc01 (commit 59de8c4c524). The same code paths are present on branch-4.0 HEAD and master HEAD as of today.

What's Wrong?

DynamicPartitionScheduler.runtimeInfos (a Map<Long, Map<String, String>> keyed by tableId) accumulates entries indefinitely.

Entries are added by createOrUpdateRuntimeInfo() on every scheduler tick for tables with dynamic_partition.enable=true or partitionRetentionCount > 0.

The only place removeRuntimeInfo() is called in production code is ShowDynamicPartitionCommand.doRun() (fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowDynamicPartitionCommand.java:107). That cleanup is opportunistic. It only runs when:

  • a user explicitly issues SHOW DYNAMIC PARTITION against a database;
  • the table is still present in db.getTables() (so DROP TABLE never reaches it — the table is already gone from the catalog);
  • the table no longer satisfies olapTable.dynamicPartitionExists().

No catalog mutation path calls it. Three scenarios leave stale entries behind:

  1. InternalCatalog.unprotectDropTable() — table is gone from the catalog, the runtimeInfos entry stays.
  2. DynamicPartitionScheduler.executeDynamicPartition() when db == null — the scheduler removes the pair from dynamicPartitionTableInfo via iterator.remove() and leaves runtimeInfos untouched.
  3. Same method when olapTable is null, is an MTMV, or has lost both its dynamic_partition.enable flag and its partitionRetentionCount: the pair is removed from dynamicPartitionTableInfo, but the runtimeInfos entry again stays.

In a cluster where users don't run SHOW DYNAMIC PARTITION regularly (most automated ETL workloads), the map grows unbounded.
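The failure mode can be modeled in isolation. A minimal sketch of the pattern (hypothetical LeakModel class with simplified stand-ins for the catalog and the scheduler tick; not the actual Doris code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal model of the leak: runtime info is written on every scheduler
// tick, but nothing removes it when the table disappears.
class LeakModel {
    // keyed by tableId, like DynamicPartitionScheduler.runtimeInfos
    final Map<Long, Map<String, String>> runtimeInfos = new ConcurrentHashMap<>();
    final Map<Long, String> catalog = new HashMap<>(); // stand-in for the table catalog

    void createTable(long tableId) {
        catalog.put(tableId, "t_" + tableId);
    }

    void schedulerTick(long tableId) {
        // like createOrUpdateRuntimeInfo(): insert-if-absent, then update
        runtimeInfos.computeIfAbsent(tableId, k -> new HashMap<>())
                    .put("LastUpdateTime", "2026-04-27 00:00:00");
    }

    void dropTable(long tableId) {
        catalog.remove(tableId);
        // BUG: no runtimeInfos.remove(tableId) here, so the entry leaks
    }

    public static void main(String[] args) {
        LeakModel m = new LeakModel();
        for (long id = 0; id < 10_000; id++) {
            m.createTable(id);
            m.schedulerTick(id);
            m.dropTable(id);
        }
        // The catalog is empty, but runtimeInfos kept every dropped table.
        System.out.println(m.catalog.size());       // 0
        System.out.println(m.runtimeInfos.size());  // 10000
    }
}
```

Run long enough, the second map is the one that shows up in the heap dump.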

In our production cluster (4.0.5-rc01, ETL workload with frequent CREATE/DROP on dynamic_partition tables, ~24K DDL/hour), the FE OOMed after a few weeks of uptime. Heap dump:

  • runtimeInfos backed by a ConcurrentHashMap$Node[] of 2,097,152 buckets, roughly 1M–1.5M leaked entries.
  • 554 MB retained on DynamicPartitionScheduler (17% of live heap post-GC walk).
  • Dump file: 52 GiB; live heap 3.23 GB after the reachability walk; the rest is on the leak path, which holds old Database/Table graphs alive transitively through these stale entries.
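The bucket count is consistent with the estimated entry range. A ConcurrentHashMap table doubles to the next power of two when the default load factor (0.75) is exceeded, so a 2^21-bucket table brackets the live entry count. A quick sanity check (order-of-magnitude only; internal sizing details vary slightly by JDK):

```java
// Sanity-check the heap-dump numbers: a ConcurrentHashMap table of
// 2^21 buckets implies the entry count crossed 0.75 * 2^20 and is
// at most 0.75 * 2^21 (default load factor 0.75).
public class HeapDumpCheck {
    public static void main(String[] args) {
        int buckets = 1 << 21;                          // 2,097,152 from the dump
        int minEntries = (int) (0.75 * (buckets >> 1)); // 786,432
        int maxEntries = (int) (0.75 * buckets);        // 1,572,864
        System.out.println(minEntries + " .. " + maxEntries);

        // 554 MB retained over ~1.25M entries -> rough bytes per entry
        long retainedBytes = 554L * 1024 * 1024;
        long perEntry = retainedBytes / 1_250_000;
        System.out.println(perEntry + " bytes/entry"); // 464
    }
}
```

A few hundred bytes retained per entry is plausible once each stale entry pins its value map plus whatever object graph it transitively references.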

What You Expected?

runtimeInfos.remove(tableId) should run when:

  1. A table is dropped — inside InternalCatalog.unprotectDropTable(), alongside db.unregisterTable().
  2. The scheduler removes a table from its working set in executeDynamicPartition(), in each of the iterator.remove(); continue; branches.

How to Reproduce?

Run a CREATE/DROP loop against a table with dynamic_partition.enable=true for long enough and watch the FE heap grow. The leak is fastest under high-DDL-churn ETL workloads, but slow churn hits it eventually because nothing ever clears the map.

A self-contained repro:

CREATE DATABASE leak_repro;
USE leak_repro;
-- in a shell loop, repeat for N=10000 cycles:
CREATE TABLE t (
  k INT, dt DATE
) DUPLICATE KEY(k) PARTITION BY RANGE(dt) ()
DISTRIBUTED BY HASH(k) BUCKETS 1
PROPERTIES (
  "dynamic_partition.enable"="true",
  "dynamic_partition.time_unit"="DAY",
  "dynamic_partition.start"="-3",
  "dynamic_partition.end"="3",
  "dynamic_partition.prefix"="p",
  "replication_num"="1"
);
DROP TABLE t;

Take a heap dump, open it in Eclipse MAT, and look at the retained heap on DynamicPartitionScheduler. The bucket count of runtimeInfos will track the iteration count.

Anything Else?

Suggested patch (4 lines added across two files):

diff --git a/fe/fe-core/src/main/java/org/apache/doris/clone/DynamicPartitionScheduler.java b/fe/fe-core/src/main/java/org/apache/doris/clone/DynamicPartitionScheduler.java
@@ -671,6 +671,7 @@ public class DynamicPartitionScheduler extends MasterDaemon {
             Database db = Env.getCurrentInternalCatalog().getDbNullable(dbId);
             if (db == null) {
                 iterator.remove();
+                removeRuntimeInfo(tableId);
                 continue;
             }
@@ -688,6 +689,7 @@ public class DynamicPartitionScheduler extends MasterDaemon {
                             || !olapTable.getTableProperty().getDynamicPartitionProperty().getEnable())
                     && olapTable.getPartitionRetentionCount() <= 0) {
                 iterator.remove();
+                removeRuntimeInfo(tableId);
                 continue;
             } else if (olapTable.isBeingSynced()) {
diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
@@ -1027,6 +1027,8 @@ public class InternalCatalog implements CatalogIf<Database> {
         Env.getCurrentEnv().getQueryStats().clear(...);
         table.removeTableIdentifierFromPrimaryTable();
         db.unregisterTable(table.getId());
+        // Fix DynamicPartitionScheduler.runtimeInfos leak on DROP TABLE.
+        Env.getCurrentEnv().getDynamicPartitionScheduler().removeRuntimeInfo(table.getId());
         StopWatch watch = StopWatch.createStarted();
         Env.getCurrentRecycleBin().recycleTable(...);

A patched build has been running in our production cluster since 2026-04-27. The same workload that previously grew runtimeInfos past 500 MB now keeps it flat.

Two analogous leaks were fixed in the past by similar remove() calls on the cleanup path:

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

