From 6423af4b1eabc3e42ac748599efaa278482f246d Mon Sep 17 00:00:00 2001 From: Marko Budiselic Date: Wed, 13 May 2026 12:33:55 +0200 Subject: [PATCH 01/12] Add Memgraph v3.11.0 and Lab v3.11.0 release note titles Co-authored-by: Cursor --- pages/release-notes.mdx | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/pages/release-notes.mdx b/pages/release-notes.mdx index 7746ab549..cf86405d7 100644 --- a/pages/release-notes.mdx +++ b/pages/release-notes.mdx @@ -46,6 +46,14 @@ guide. ## 🚀 Latest release +### Memgraph v3.11.0 - June 17th, 2026 + +### Lab v3.11.0 - June 16th, 2026 + + + +## Previous releases + ### Memgraph v3.10.0 - May 13th, 2026 {

⚠️ Breaking changes

} @@ -342,8 +350,6 @@ guide. -## Previous releases - ### Memgraph v3.9.0 {

⚠️ Breaking changes

} From c5656715b23207e0985f25415f2809a4557eb4d2 Mon Sep 17 00:00:00 2001 From: Marko Budiselic Date: Wed, 13 May 2026 12:40:04 +0200 Subject: [PATCH 02/12] Update the release docs start skill --- skills/new-release-branch/SKILL.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/skills/new-release-branch/SKILL.md b/skills/new-release-branch/SKILL.md index 111de0317..78a76fa43 100644 --- a/skills/new-release-branch/SKILL.md +++ b/skills/new-release-branch/SKILL.md @@ -96,6 +96,11 @@ The result should look like: **Important:** If there are multiple Memgraph patch releases under Latest (e.g. v3.8.1 and v3.8.0), move *all* of them together with the Lab entry. +**Implementation tip:** Do both changes in one `StrReplace` call by matching +from `## 🚀 Latest release` through `## Previous releases` and rewriting the +entire block at once. This avoids error-prone multi-step edits where +`## Previous releases` can end up in the wrong place. + ## Step 4 — Commit and push ```bash From 4f24dbfa1eae2d800f18038f6b2ccdf051c8ca1a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ivan=20Milinovi=C4=87?= <44698587+imilinovic@users.noreply.github.com> Date: Sat, 23 May 2026 20:03:29 +0200 Subject: [PATCH 03/12] Add mgp_graph_get_start_timestamp / Graph.start_timestamp (#1640) Documents the transaction starting timestamp exposed to procedures. The value stays stable across USING PERIODIC COMMIT boundaries, making it usable as a per-query cache key. Pairs with memgraph/memgraph#4167. --- pages/custom-query-modules/c/c-api.mdx | 16 ++++++++++++++ .../python/python-api.mdx | 21 +++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/pages/custom-query-modules/c/c-api.mdx b/pages/custom-query-modules/c/c-api.mdx index 919708e53..6f314b097 100644 --- a/pages/custom-query-modules/c/c-api.mdx +++ b/pages/custom-query-modules/c/c-api.mdx @@ -200,6 +200,7 @@ Memgraph in order to use them. 
| enum [mgp_error](#variable-mgp-error) | **[mgp_edge_iter_properties](#function-mgp-edge-iter-properties)**(struct mgp_edge * e, struct mgp_memory * memory, struct mgp_properties_iterator ** result)
Start iterating over properties stored in the given edge. | | enum [mgp_error](#variable-mgp-error) | **[mgp_graph_get_vertex_by_id](#function-mgp-graph-get-vertex-by-id)**(struct mgp_graph * g, struct [mgp_vertex_id](#mgp_vertex_id) id, struct mgp_memory * memory, struct mgp_vertex ** result)
Get the vertex corresponding to given ID, or NULL if no such vertex exists. | | enum [mgp_error](#variable-mgp-error) | **[mgp_graph_is_transactional](#function-mgp-graph-is-transactional)**(struct mgp_graph * graph, int * result)
Result is non-zero if the graph is in transactional storage mode. | +| enum [mgp_error](#variable-mgp-error) | **[mgp_graph_get_start_timestamp](#function-mgp-graph-get-start-timestamp)**(struct mgp_graph * graph, int64_t * result)
Return the transaction's starting timestamp; stays stable across `USING PERIODIC COMMIT` batches. | | enum [mgp_error](#variable-mgp-error) | **[mgp_graph_has_text_index](#function-mgp-graph-has-text-index)**(struct mgp_graph * graph, const char * index_name, int * result)
(Experimental) Result is non-zero if there exists a text index with the given name. | | enum [mgp_error](#variable-mgp-error) | **[mgp_graph_search_text_index](#function-mgp-graph-search-text-index)**(struct mgp_graph * graph, const char * index_name, const char * search_query, enum text_search_mode search_mode, struct mgp_memory * memory, struct mgp_map ** result)
(Experimental) Search over the given text index. The result contains a list of all vertices matching the search query. | | enum [mgp_error](#variable-mgp-error) | **[mgp_graph_aggregate_over_text_index](#function-mgp-graph-aggregate-over-text-index)**(struct mgp_graph * graph, const char * index_name, const char * search_query, const char * aggregation_query, struct mgp_memory * memory, struct mgp_map ** result)
(Experimental) Aggregate over the results of a search over the named text index. | @@ -2494,6 +2495,19 @@ Result is non-zero if the graph can be modified. If a graph is immutable, then vertices cannot be created or deleted, and all of the returned vertices will be immutable also. The same applies for edges. Current implementation always returns without errors. +### mgp_graph_get_start_timestamp [#function-mgp-graph-get-start-timestamp] +```cpp +enum mgp_error mgp_graph_get_start_timestamp( + struct mgp_graph * graph, + int64_t * result +) +``` + +Return the transaction's starting timestamp. The value is assigned when the transaction begins and stays the same for the rest of the query, including when `USING PERIODIC COMMIT` rotates the underlying transaction between batches. + +Procedures can rely on it as a stable per-query identifier (for example, as a key in caches that span procedure calls). The current implementation always returns without errors. + + ### mgp_graph_has_text_index [#function-mgp-graph-has-text-index] ```cpp enum mgp_error mgp_graph_has_text_index( @@ -5043,6 +5057,8 @@ enum mgp_error mgp_graph_get_vertex_by_id(struct mgp_graph *g, struct mgp_vertex enum mgp_error mgp_graph_is_transactional(struct mgp_graph *graph, int *result); +enum mgp_error mgp_graph_get_start_timestamp(struct mgp_graph *graph, int64_t *result); + enum mgp_error mgp_graph_is_mutable(struct mgp_graph *graph, int *result); enum mgp_error mgp_graph_has_text_index(struct mgp_graph *graph, const char *index_name, int *result); diff --git a/pages/custom-query-modules/python/python-api.mdx b/pages/custom-query-modules/python/python-api.mdx index aea6c6c3e..a39e46c7b 100644 --- a/pages/custom-query-modules/python/python-api.mdx +++ b/pages/custom-query-modules/python/python-api.mdx @@ -1458,6 +1458,27 @@ Check if the graph is mutable. Thus it can be used to modify vertices and edges. 
```graph.is_mutable()``` +### start\_timestamp + +```python +@property +def start_timestamp() -> int +``` + +Return the transaction's starting timestamp. The value is assigned when the +transaction begins and stays the same for the rest of the query — including +when `USING PERIODIC COMMIT` rotates the underlying transaction between +batches. Procedures can rely on it as a stable per-query identifier (for +example, as a key in caches that span procedure calls). + +**Returns**: + + An `int` representing the transaction's starting timestamp. + +**Examples**: + + ```graph.start_timestamp``` + ### create\_vertex() ```python From 430f9ad123aa33553706ba2d792eee5d98aa93bc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ivan=20Milinovi=C4=87?= <44698587+imilinovic@users.noreply.github.com> Date: Wed, 27 May 2026 11:29:06 +0200 Subject: [PATCH 04/12] docs: document start_time and elapsed_ms columns in SHOW TRANSACTIONS (#1627) Add the two new columns to the schema table and example outputs, note that snapshot rows now populate them at the top level, and update the snapshot-progress callout to mention the brief window where start_time can read as null. --- pages/fundamentals/transactions.mdx | 47 ++++++++++++++++------------- 1 file changed, 26 insertions(+), 21 deletions(-) diff --git a/pages/fundamentals/transactions.mdx b/pages/fundamentals/transactions.mdx index b8e1e0ad9..572924547 100644 --- a/pages/fundamentals/transactions.mdx +++ b/pages/fundamentals/transactions.mdx @@ -92,7 +92,7 @@ SHOW TRANSACTIONS; ``` Each row in the result represents one transaction (or one in-progress snapshot -creation) and contains five columns: +creation) and contains seven columns: | Column | Type | Description | |---|---|---| @@ -101,15 +101,17 @@ creation) and contains five columns: | `query` | `List[String]` | Queries executed within the transaction so far. | | `status` | `String` | Lifecycle phase of the transaction: `running`, `committing`, or `aborting`. 
Snapshot rows always show `running`. | | `metadata` | `Map` | Metadata supplied by the client when the transaction was opened. For in-progress snapshots this contains progress details (see below). | +| `start_time` | `ZonedDateTime` | UTC time at which the transaction started. | +| `elapsed_ms` | `Integer` | How long the transaction has been running, in milliseconds. | ```copy=false memgraph> SHOW TRANSACTIONS; -+----------+------------------------+-----------------------------------------------+--------------+----------+ -| username | transaction_id | query | status | metadata | -+----------+------------------------+-----------------------------------------------+--------------+----------+ -| "" | "9223372036854794885" | ["UNWIND range(1,100) AS i CREATE(:L{p:i});"] | "committing" | {} | -| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | "running" | {} | -+----------+------------------------+-----------------------------------------------+--------------+----------+ ++----------+-----------------------+-----------------------------------------------+--------------+----------+-------------------------------+------------+ +| username | transaction_id | query | status | metadata | start_time | elapsed_ms | ++----------+-----------------------+-----------------------------------------------+--------------+----------+-------------------------------+------------+ +| "" | "9223372036854794885" | ["UNWIND range(1,100) AS i CREATE(:L{p:i});"] | "committing" | {} | 2026-05-12T14:32:18.412Z[UTC] | 47 | +| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | "running" | {} | 2026-05-12T14:32:18.451Z[UTC] | 8 | ++----------+-----------------------+-----------------------------------------------+--------------+----------+-------------------------------+------------+ ``` #### Filter by status @@ -140,16 +142,18 @@ rows contains: | `phase` | Current phase of snapshot creation: `EDGES`, `VERTICES`, `INDICES`, `CONSTRAINTS`, or `FINALIZING`. 
| | `items_done` | Number of objects serialized in the current phase so far. | | `items_total` | Total number of objects expected in the current phase. | -| `elapsed_ms` | Milliseconds elapsed since the snapshot started. | | `db_name` | Name of the database whose snapshot is being created. | +The top-level `start_time` and `elapsed_ms` columns are populated for snapshot +rows as well, reflecting when the snapshot started. + ```copy=false memgraph> SHOW TRANSACTIONS; -+----------+----------------+-----------------------------+-----------+------------------------------------------------------------------+ -| username | transaction_id | query | status | metadata | -+----------+----------------+-----------------------------+-----------+------------------------------------------------------------------+ -| "" | "snapshot" | ["CREATE SNAPSHOT"] | "running" | {phase: "VERTICES", items_done: 142000, items_total: 500000, ... | -+----------+----------------+-----------------------------+-----------+------------------------------------------------------------------+ ++----------+----------------+---------------------+-----------+------------------------------------------------------------------+-------------------------------+------------+ +| username | transaction_id | query | status | metadata | start_time | elapsed_ms | ++----------+----------------+---------------------+-----------+------------------------------------------------------------------+-------------------------------+------------+ +| "" | "snapshot" | ["CREATE SNAPSHOT"] | "running" | {phase: "VERTICES", items_done: 142000, items_total: 500000, ... | 2026-05-12T14:32:17.205Z[UTC] | 1247 | ++----------+----------------+---------------------+-----------+------------------------------------------------------------------+-------------------------------+------------+ ``` @@ -157,8 +161,9 @@ Snapshot progress values are read from independent atomic counters and are not captured as a single consistent snapshot. 
`items_done`, `items_total`, and `phase` may reflect slightly different points in time, so treat them as best-effort estimates rather than exact figures. In particular, `items_done` -may briefly read as `0` when the phase transitions, and `elapsed_ms` may be -absent if the snapshot started between the phase check and the time read. +may briefly read as `0` when the phase transitions, and `start_time` may be +`null` if the snapshot was observed in the brief window before it recorded +its start. @@ -289,12 +294,12 @@ currently being run as part of the transaction ID "9223372036854794885". ```copy=false memgraph> SHOW TRANSACTIONS; -+----------+------------------------+-------------------------------------------+-----------+----------+ -| username | transaction_id | query | status | metadata | -+----------+------------------------+-------------------------------------------+-----------+----------+ -| "" | "9223372036854794885" | ["CALL infinite.get() YIELD * RETURN *;"] | "running" | {} | -| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | "running" | {} | -+----------+------------------------+-------------------------------------------+-----------+----------+ ++----------+-----------------------+-------------------------------------------+-----------+----------+-------------------------------+------------+ +| username | transaction_id | query | status | metadata | start_time | elapsed_ms | ++----------+-----------------------+-------------------------------------------+-----------+----------+-------------------------------+------------+ +| "" | "9223372036854794885" | ["CALL infinite.get() YIELD * RETURN *;"] | "running" | {} | 2026-05-12T14:32:00.000Z[UTC] | 18230 | +| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | "running" | {} | 2026-05-12T14:32:18.230Z[UTC] | 0 | ++----------+-----------------------+-------------------------------------------+-----------+----------+-------------------------------+------------+ ``` To terminate the transaction, run the 
following query: From 87f273e0c7a363a955767d567089501cc45ea3fd Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Wed, 27 May 2026 11:35:48 +0200 Subject: [PATCH 05/12] feat: Ignore --query-modules-directory on coordinators (#1636) --- .../how-high-availability-works.mdx | 9 ++++++++- pages/database-management/configuration.mdx | 2 +- pages/release-notes.mdx | 15 +++++++++++++++ 3 files changed, 24 insertions(+), 2 deletions(-) diff --git a/pages/clustering/high-availability/how-high-availability-works.mdx b/pages/clustering/high-availability/how-high-availability-works.mdx index 682ff4540..454f92fc9 100644 --- a/pages/clustering/high-availability/how-high-availability-works.mdx +++ b/pages/clustering/high-availability/how-high-availability-works.mdx @@ -123,7 +123,14 @@ well as `SET DATABASE SETTING` and `RELOAD SSL`. Since coordinators do not store user data, the following restrictions apply: - **Snapshots are automatically disabled** on coordinators, even if - `--storage-snapshot-interval-sec` is set. + `--storage-snapshot-interval-sec` is set. The `storage.snapshot.interval` + setting is not registered on coordinators, so attempting to read or modify it + via `SHOW DATABASE SETTING` / `SET DATABASE SETTING` returns an unknown + setting error. +- **The `--query-modules-directory` flag is ignored** on coordinators. + Coordinators do not execute data queries, so query modules are never loaded + and the embedded Python runtime is not initialized. The flag is still accepted + (so packaged defaults do not need to be overridden) but has no effect. - **The `--init-file` and `--init-data-file` flags are not supported** on coordinators (and likewise not supported on data instances in HA mode). The instance will fail to start if either flag is provided. 
diff --git a/pages/database-management/configuration.mdx b/pages/database-management/configuration.mdx index 70a005dd3..7c49244f8 100644 --- a/pages/database-management/configuration.mdx +++ b/pages/database-management/configuration.mdx @@ -451,7 +451,7 @@ execution in Memgraph. | `--query-cost-planner=true` | Use the cost-estimating query planner. When enabled (`true`), Memgraph generates multiple query plans, selecting the one with the lowest cost. If disabled (`false`), it creates a single plan that is executed. | `[bool]` | | `--query-execution-timeout-sec=600` | Maximum allowed query execution time.
Queries exceeding this limit will be aborted. Value of 0 means no limit. | `[uint64]` | | `--query-max-plans=1000` | Maximum number of generated plans for a query. | `[uint64]` | -| `--query-modules-directory=/usr/lib/memgraph/query_modules` | Directory where modules with custom query procedures are stored. NOTE: Multiple comma-separated directories can be defined. | `[string]` | +| `--query-modules-directory=/usr/lib/memgraph/query_modules` | Directory where modules with custom query procedures are stored. NOTE: Multiple comma-separated directories can be defined. The flag is ignored on coordinator instances in a [high availability](/clustering/high-availability) setup. | `[string]` | | `--query-plan-cache-max-size=1000` | Maximum number of query plans to cache. | `[int32]` | | `--query-vertex-count-to-expand-existing=10` | Maximum count of indexed vertices which provoke indexed lookup and then expand to existing,
instead of a regular expand. Default is 10, to turn off use -1. | `[int64]` | | `--query-log-directory=/var/log/memgraph/session_trace` | Location to store log files for session tracing. | `[string]` | diff --git a/pages/release-notes.mdx b/pages/release-notes.mdx index cf86405d7..0e8021b15 100644 --- a/pages/release-notes.mdx +++ b/pages/release-notes.mdx @@ -48,6 +48,21 @@ guide. ### Memgraph v3.11.0 - June 17th, 2026 +{

⚠️ Breaking changes

} + +- `--query-modules-directory` is now ignored on coordinator instances and + query modules are no longer loaded there. The embedded Python runtime is also + skipped on coordinators, so any Python-based modules that previously loaded + (but were unusable) will no longer be initialized. The flag itself is still + accepted so packaged defaults do not need to be overridden when starting a + coordinator. Additionally, `storage.snapshot.interval` is no longer registered + as a runtime setting on coordinators — `SHOW DATABASE SETTING + 'storage.snapshot.interval'` and `SET DATABASE SETTING + 'storage.snapshot.interval'` now return `Unknown setting name` instead of + `Coordinators don't support snapshots`. Update any tooling that pattern-matched + on the previous error message. + [#4066](https://github.com/memgraph/memgraph/pull/4066) + ### Lab v3.11.0 - June 16th, 2026 From a4dee19f7f7afbbe3ef70a89236a3cf9afddffe3 Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Wed, 27 May 2026 11:36:32 +0200 Subject: [PATCH 06/12] feat: Document init container chown (#1637) --- .../setup-ha-cluster-k8s.mdx | 43 +++++++++++++++++-- 1 file changed, 39 insertions(+), 4 deletions(-) diff --git a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx index f641488cd..ec231a5bb 100644 --- a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx +++ b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx @@ -190,7 +190,8 @@ All Memgraph HA instances run as Kubernetes `StatefulSet` workloads, each with a single pod. Depending on configuration, the pod contains two or three containers: - **memgraph-coordinator** - runs the Memgraph binary. -- **Optional init container** - enabled when `sysctlInitContainer.enabled` is set. +- **Optional sysctl init container** - enabled when `sysctlInitContainer.enabled` is set. +- **Optional fix-ownership init container** - enabled when `fixOwnershipInitContainer.enabled` is set. 
See [Manual ownership fix](#manual-ownership-fix). Memgraph processes run as the non-root **memgraph** user with **no Linux capabilities and no privilege escalation**. @@ -399,6 +400,36 @@ high-memory workloads, such as increasing: - [`vm.max_map_count`](/database-management/system-configuration#increasing-memory-map-areas) +### Manual ownership fix + +Some storage drivers (notably `rancher.io/local-path`) do not honor pod-level +`fsGroup`, leaving the volume root owned by `root:root`. Because Memgraph runs +as a non-root user, its storage directory ownership assertion (process euid == +data directory owner uid) fails on startup. + +When `fixOwnershipInitContainer.enabled` is set to `true`, an init container +runs as root before Memgraph starts and `chown`s the lib, log, and core-dumps +mount points to `memgraphUserId:memgraphGroupId`. The container drops all Linux +capabilities except `CHOWN`, uses a read-only root filesystem, and disables +privilege escalation. + +To enable it: + +```yaml +fixOwnershipInitContainer: + enabled: true + image: + repository: docker.io/library/busybox + tag: 1.37.0 + pullPolicy: IfNotPresent +``` + +The container only chowns the mount paths that exist for the role — `/var/log/memgraph` +is included when `storage..createLogStorageClaim` is `true`, and +`storage..coreDumpsMountPath` is included when `storage..createCoreDumpsClaim` +is `true`. + + ### Authentication By default, Memgraph HA starts **without authentication** enabled. @@ -1008,7 +1039,7 @@ and their default values. | `storage.data.coreDumpsStorageSize` | Size of the core dumps PVC on data instances | `10Gi` | | `storage.data.coreDumpsMountPath` | Mount path for core dumps on data instances | `/var/core/memgraph` | | `storage.data.coreDumpsImage.repository` | Image repository for the data instance core-dumps init container. | `docker.io/library/busybox` | -| `storage.data.coreDumpsImage.tag` | Image tag for the data instance core-dumps init container. 
| `latest` | +| `storage.data.coreDumpsImage.tag` | Image tag for the data instance core-dumps init container. | `1.37.0` | | `storage.data.coreDumpsImage.pullPolicy` | Image pull policy for the data instance core-dumps init container. | `IfNotPresent` | | `storage.data.extraVolumes` | Additional volumes to add to data instance pods | `[]` | | `storage.data.extraVolumeMounts` | Additional volume mounts to add to data instance containers | `[]` | @@ -1024,7 +1055,7 @@ and their default values. | `storage.coordinators.coreDumpsStorageSize` | Size of the core dumps PVC on coordinators | `10Gi` | | `storage.coordinators.coreDumpsMountPath` | Mount path for core dumps on coordinators | `/var/core/memgraph` | | `storage.coordinators.coreDumpsImage.repository` | Image repository for the coordinator core-dumps init container. | `docker.io/library/busybox` | -| `storage.coordinators.coreDumpsImage.tag` | Image tag for the coordinator core-dumps init container. | `latest` | +| `storage.coordinators.coreDumpsImage.tag` | Image tag for the coordinator core-dumps init container. | `1.37.0` | | `storage.coordinators.coreDumpsImage.pullPolicy` | Image pull policy for the coordinator core-dumps init container. | `IfNotPresent` | | `storage.coordinators.extraVolumes` | Additional volumes to add to coordinator pods | `[]` | | `storage.coordinators.extraVolumeMounts` | Additional volume mounts to add to coordinator containers | `[]` | @@ -1078,8 +1109,12 @@ and their default values. 
| `sysctlInitContainer.enabled` | Enable the init container to set sysctl parameters | `true` | | `sysctlInitContainer.maxMapCount` | Value for `vm.max_map_count` to be set by the init container | `262144` | | `sysctlInitContainer.image.repository` | Image repository for the sysctl init container | `library/busybox` | -| `sysctlInitContainer.image.tag` | Image tag for the sysctl init container | `latest` | +| `sysctlInitContainer.image.tag` | Image tag for the sysctl init container | `1.37.0` | | `sysctlInitContainer.image.pullPolicy` | Image pull policy for the sysctl init container | `IfNotPresent` | +| `fixOwnershipInitContainer.enabled` | Enable the init container that `chown`s lib/log/core-dump mounts to `memgraphUserId:memgraphGroupId` before Memgraph starts. Use when the storage driver does not honor `fsGroup`. | `false` | +| `fixOwnershipInitContainer.image.repository` | Image repository for the fix-ownership init container. | `docker.io/library/busybox` | +| `fixOwnershipInitContainer.image.tag` | Image tag for the fix-ownership init container. | `1.37.0` | +| `fixOwnershipInitContainer.image.pullPolicy` | Image pull policy for the fix-ownership init container. | `IfNotPresent` | | `secrets.name` | Name of the Kubernetes Secret holding the Memgraph Enterprise license and organization name. Must exist before `helm install`. | `memgraph-secrets` | | `secrets.licenseKey` | Key in the Secret whose value is exposed as `MEMGRAPH_ENTERPRISE_LICENSE` to data and coordinator pods. | `MEMGRAPH_ENTERPRISE_LICENSE` | | `secrets.organizationKey` | Key in the Secret whose value is exposed as `MEMGRAPH_ORGANIZATION_NAME` to data and coordinator pods. 
| `MEMGRAPH_ORGANIZATION_NAME` | From ea08fb9283ea4610bc74bfa3be801914868d9007 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ivan=20Milinovi=C4=87?= <44698587+imilinovic@users.noreply.github.com> Date: Wed, 27 May 2026 11:39:12 +0200 Subject: [PATCH 07/12] Schema (node/rel)_type_properties docs (#1638) * docs(schema): document expanded output of schema type-properties procedures Update the Output, Usage, and Example sections of schema.node_type_properties and schema.rel_type_properties to describe the new sourceNodeLabels, targetNodeLabels, propertyObservations, and totalObservations columns and the (relType, sourceNodeLabels, targetNodeLabels) partitioning for the rel-properties procedure. Replace example result tables with output captured against the updated procedures. * docs(schema): fix Kept key value in schema.assert example The Kept row of the schema.assert example showed "id" for a unique constraint on a single property, but the procedure emits "[id]" (the stringified property list) for that action. --- pages/querying/schema.mdx | 72 +++++++++++++++++++++++---------------- 1 file changed, 42 insertions(+), 30 deletions(-) diff --git a/pages/querying/schema.mdx b/pages/querying/schema.mdx index 88a99cf15..5c0c30f53 100644 --- a/pages/querying/schema.mdx +++ b/pages/querying/schema.mdx @@ -376,9 +376,11 @@ This procedure is also exposed as `apoc.meta.nodeTypeProperties` and - `nodeType: string` ➡ Concatenated node labels separated by a `:`. - `nodeLabels: List[string]` ➡ A list of node labels. +- `propertyName: string` ➡ Property name. +- `propertyTypes: List[string]` ➡ Property types observed for this property on nodes of this type. - `mandatory: boolean` ➡ Returns `True` if every node with a given node type (defined by nodeType or nodeLabels) possesses the listed property (propertyName), and `False` otherwise. -- `propertyName: string` ➡ Property name. -- `propertyTypes: string` ➡ Property type. 
+- `propertyObservations: integer` ➡ Number of nodes of this type that carried this property. +- `totalObservations: integer` ➡ Total number of nodes of this type that were examined (bounded by the `sample` config option). {

Usage:

} @@ -386,14 +388,14 @@ To get the information about nodes and properties, run the following query: ```cypher CALL schema.node_type_properties() -YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes; +YIELD nodeType, nodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations; ``` To restrict the scan to a subset of labels, pass a `config` map: ```cypher CALL schema.node_type_properties({includeLabels: ["Dog"]}) -YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes; +YIELD nodeType, nodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations; ``` ### rel_type_properties() @@ -423,9 +425,15 @@ This procedure is also exposed as `apoc.meta.relTypeProperties` and {

Output:

} - `relType: string` ➡ The type of the relationship. -- `mandatory: boolean` ➡ Returns `True` if every relationship with a given relationship type (defined by relType) possesses the listed property (propertyName), and `False` otherwise. +- `sourceNodeLabels: List[string]` ➡ Labels on the start node of relationships in this partition. +- `targetNodeLabels: List[string]` ➡ Labels on the end node of relationships in this partition. - `propertyName: string` ➡ Property name. -- `propertyTypes: string` ➡ Property type. +- `propertyTypes: List[string]` ➡ Property types observed for this property on relationships in this partition. +- `mandatory: boolean` ➡ Returns `True` if every relationship in this partition possesses the listed property, and `False` otherwise. +- `propertyObservations: integer` ➡ Number of relationships in this partition that carried this property. +- `totalObservations: integer` ➡ Total number of relationships in this partition that were examined. + +One row is emitted per `(relType, sourceNodeLabels, targetNodeLabels, propertyName)` combination. The same relationship type connecting different label sets (for example `(:Dog)-[:LOVES]->(:Activity)` vs `(:Cat)-[:LOVES]->(:Place)`) produces separate rows, so each partition can be characterised independently. {

Usage:

} @@ -434,14 +442,14 @@ To get the information about relationships and properties, run the following que ```cypher CALL schema.rel_type_properties() -YIELD relType, mandatory, propertyName, propertyTypes; +YIELD relType, sourceNodeLabels, targetNodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations; ``` To restrict the scan to a subset of relationship types, pass a `config` map: ```cypher CALL schema.rel_type_properties({includeRels: ["LOVES"]}) -YIELD relType, mandatory, propertyName, propertyTypes; +YIELD relType, sourceNodeLabels, targetNodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations; ``` ### assert() @@ -517,7 +525,7 @@ Results: +-----------------------------------------------------------------------------+ | "Created" | "[name, surname]" | ["name", "surname"] | "Person" | true | +-----------------------------------------------------------------------------+ -| "Kept" | "id" | ["id"] | "Person" | true | +| "Kept" | "[id]" | ["id"] | "Person" | true | +-----------------------------------------------------------------------------+ ``` @@ -555,41 +563,45 @@ Call the procedure to get information about the nodes: ```cypher CALL schema.node_type_properties() -YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes; +YIELD nodeType, nodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations; ``` Result: ```plaintext -+--------------------+--------------------+--------------------+--------------------+--------------------+ -| nodeType | nodeLabels | mandatory | propertyName | propertyTypes | -+--------------------+--------------------+--------------------+--------------------+--------------------+ -| ":`Sky`" | ["Sky"] | false | "" | "" | -| ":`Park`" | ["Park"] | false | "" | "" | -| ":`Human`:`Owner`" | ["Human", "Owner"] | false | "age" | "Int" | -| ":`Human`:`Owner`" | ["Human", "Owner"] | false | "name" | "String" | -| ":`Bird`" | ["Bird"] | 
false | "" | "" | -| ":`Dog`" | ["Dog"] | false | "age" | "Int" | -| ":`Dog`" | ["Dog"] | false | "name" | "String" | -+--------------------+--------------------+--------------------+--------------------+--------------------+ ++--------------------+--------------------+--------------+----------------+-----------+----------------------+--------------------+ +| nodeType | nodeLabels | propertyName | propertyTypes | mandatory | propertyObservations | totalObservations | ++--------------------+--------------------+--------------+----------------+-----------+----------------------+--------------------+ +| ":`Bird`" | ["Bird"] | "" | [] | false | 0 | 1 | +| ":`Dog`" | ["Dog"] | "age" | ["Int"] | false | 1 | 2 | +| ":`Dog`" | ["Dog"] | "name" | ["String"] | false | 1 | 2 | +| ":`Human`:`Owner`" | ["Human", "Owner"] | "age" | ["Int"] | true | 1 | 1 | +| ":`Human`:`Owner`" | ["Human", "Owner"] | "name" | ["String"] | true | 1 | 1 | +| ":`Park`" | ["Park"] | "" | [] | false | 0 | 1 | +| ":`Sky`" | ["Sky"] | "" | [] | false | 0 | 1 | ++--------------------+--------------------+--------------+----------------+-----------+----------------------+--------------------+ ``` +Of the two `:Dog` nodes, only one carries `name` and `age` — hence `propertyObservations = 1`, `totalObservations = 2`, and `mandatory = false`. The single `:Human:Owner` node has both properties, so they are marked mandatory. 
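The relationship between the three columns in the node table above can be sketched in a few lines of Python. This is an illustration inferred from the column descriptions, not the procedure's implementation: `Row` and `is_mandatory` are hypothetical names, and the rule (a property is mandatory when every examined node carried it) is an assumption that happens to match the rows shown in the example output.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical stand-in for one row yielded by schema.node_type_properties()."""
    node_type: str
    property_name: str
    property_observations: int
    total_observations: int


def is_mandatory(row: Row) -> bool:
    # Assumption: mandatory means every examined node of this type carried the property.
    # Placeholder rows for property-less node types (empty property name) are never mandatory.
    return (
        row.property_name != ""
        and row.total_observations > 0
        and row.property_observations == row.total_observations
    )


# Values copied from the example result table.
rows = [
    Row(":`Dog`", "age", 1, 2),
    Row(":`Human`:`Owner`", "age", 1, 1),
    Row(":`Bird`", "", 0, 1),
]

for row in rows:
    print(row.node_type, repr(row.property_name), is_mandatory(row))
```

Because `totalObservations` is bounded by the `sample` config option, a check like this describes only the nodes that were examined, not necessarily the whole graph.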
+ Call the procedure to get information about the relationships: ```cypher CALL schema.rel_type_properties() -YIELD relType, mandatory, propertyName, propertyTypes; +YIELD relType, sourceNodeLabels, targetNodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations; ``` Results: ```plaintext -+------------------------+------------------------+------------------------+------------------------+ -| relType | mandatory | propertyName | propertyTypes | -+------------------------+------------------------+------------------------+------------------------+ -| ":`FLIES_TO`" | false | "" | "" | -| ":`RUNS_AND_PLAYS_IN`" | false | "duration" | "String" | -| ":`RUNS_AND_PLAYS_IN`" | false | "speed" | "Int" | -| ":`LOVES`" | true | "how_much" | "String" | -+------------------------+------------------------+------------------------+------------------------+ ++------------------------+------------------+--------------------+--------------+----------------+-----------+----------------------+--------------------+ +| relType | sourceNodeLabels | targetNodeLabels | propertyName | propertyTypes | mandatory | propertyObservations | totalObservations | ++------------------------+------------------+--------------------+--------------+----------------+-----------+----------------------+--------------------+ +| ":`FLIES_TO`" | ["Bird"] | ["Sky"] | "" | [] | false | 0 | 1 | +| ":`LOVES`" | ["Dog"] | ["Human", "Owner"] | "how_much" | ["String"] | true | 1 | 1 | +| ":`RUNS_AND_PLAYS_IN`" | ["Dog"] | ["Park"] | "duration" | ["String"] | true | 1 | 1 | +| ":`RUNS_AND_PLAYS_IN`" | ["Dog"] | ["Park"] | "speed" | ["Int"] | true | 1 | 1 | ++------------------------+------------------+--------------------+--------------+----------------+-----------+----------------------+--------------------+ ``` + +Each row is keyed by the relationship type *together with* the labels on the start and end nodes. 
If the dataset had another `:LOVES` relationship between different label sets (for example `(:Cat)-[:LOVES]->(:Activity)`), it would appear on its own row rather than being merged into the existing `:LOVES` partition. From 5bc2caa47cdf8fdad54c9838bfa87a3931f5012a Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Wed, 27 May 2026 11:40:04 +0200 Subject: [PATCH 08/12] feat: Add support for MCP authorization using OAuth (#1644) --- pages/ai-ecosystem/mcp.mdx | 108 +++++++++++++++++++++++++++++++++++++ 1 file changed, 108 insertions(+) diff --git a/pages/ai-ecosystem/mcp.mdx b/pages/ai-ecosystem/mcp.mdx index 440249a48..9a5ee1fb9 100644 --- a/pages/ai-ecosystem/mcp.mdx +++ b/pages/ai-ecosystem/mcp.mdx @@ -224,6 +224,114 @@ language. +### Multi-tenant authentication (OIDC / JWT) + +The MCP server can optionally enforce **OIDC / JWT authentication** on the +streamable-HTTP transport and route each authenticated session to a different +Memgraph logical database based on JWT claims. This is disabled by default — +when off, the server behaves exactly as documented in the sections above. + +Enable it when you want to: + +- Serve different users different Memgraph databases on the same MCP + deployment. +- Place MCP behind an OIDC provider (Keycloak, Auth0, Okta, Entra ID, …). +- Capture per-user audit trails on tool calls. + +

Auth-only tools

+ +When `MCP_AUTH_ENABLED=true`, the server exposes two additional tools: + +| Tool | Description | +| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | +| `list_databases()` | Returns the databases the calling user is authorized to access (the intersection of their JWT `tenants` claim and `MCP_TENANT_CATALOG`). Flags the currently-active database. | +| `use_database(name)` | Switches the active database for the current MCP session. `name` must be in the caller's allowed set — the tool cannot expand authorization beyond what the JWT grants. | + +
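The authorization rule behind these two tools can be modelled roughly as follows. This is an illustrative Python sketch, not the server's code: the `TenantSession` class, its fields, and the shape of the returned values are all assumptions made for the example.

```python
from dataclasses import dataclass


def allowed_databases(jwt_tenants, catalog):
    """Intersection of the JWT `tenants` claim and MCP_TENANT_CATALOG."""
    return frozenset(jwt_tenants) & frozenset(catalog)


@dataclass
class TenantSession:
    """Hypothetical per-session state; field names are assumptions."""
    allowed: frozenset  # databases the caller may access
    current: str        # currently active database

    def list_databases(self) -> list:
        # Report every allowed database and flag the active one.
        return [
            {"name": name, "active": name == self.current}
            for name in sorted(self.allowed)
        ]

    def use_database(self, name: str) -> str:
        # Switching can never widen access beyond the allowed set.
        if name not in self.allowed:
            raise PermissionError(f"database {name!r} is not in the caller's allowed set")
        self.current = name
        return self.current


# The JWT grants three tenants, but this deployment only serves two of them.
session = TenantSession(
    allowed=allowed_databases(["acme", "globex", "initech"], ["acme", "globex"]),
    current="acme",
)
```

The point of the sketch is that `use_database` only ever selects within the precomputed intersection, so a tenant named in the JWT but absent from the catalog stays unreachable.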

Environment variables

+ +All of these are no-ops when `MCP_AUTH_ENABLED=false` (the default). When auth +is enabled, the server **fails fast at startup** if any of the three required +variables are missing. + +| Variable | Default | Required | Purpose | +| --------------------------------- | ---------------------------------------------------- | -------- | -------------------------------------------------------------------------------------------------------------------------------- | +| `MCP_AUTH_ENABLED` | `false` | — | Master switch. | +| `MCP_AUTH_ISSUER` | — | ✓ | OIDC issuer URL, e.g. `https://auth.example.com/realms/memgraph`. | +| `MCP_AUTH_AUDIENCE` | — | ✓ | Expected `aud` claim on accepted JWTs. | +| `MCP_TENANT_CATALOG` | — | ✓ | Comma-separated tenants this MCP deployment serves. Names must match both JWT `tenants` claim values and Memgraph database names. | +| `MCP_AUTH_JWKS_URL` | derived: `/protocol/openid-connect/certs` | — | Override the JWKS endpoint (rarely needed). | +| `MCP_AUTH_TENANTS_CLAIM` | `tenants` | — | Claim holding the user's allowed tenant list (must be an array of strings). | +| `MCP_AUTH_DEFAULT_TENANT_CLAIM` | `default_tenant` | — | Optional claim selecting the user's preferred initial tenant; if absent, the server picks the alphabetically-first allowed one. | +| `MCP_AUTH_REQUIRED_SCOPE` | `mcp:tools` | — | Scope the JWT must carry. | +| `MCP_AUTH_STATIC_CLIENT_ID` | — | — | Opt-in DCR intercept — see [DCR intercept](#dcr-intercept-workaround-for-claude-code) below. | + +
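The fail-fast behaviour described above can be sketched in Python. This is a hedged illustration only: the `load_auth_config` function, the returned dictionary keys, and the way `MCP_AUTH_ENABLED` is parsed (case-insensitive `true`) are assumptions, not the server's actual startup code.

```python
REQUIRED_WHEN_ENABLED = ("MCP_AUTH_ISSUER", "MCP_AUTH_AUDIENCE", "MCP_TENANT_CATALOG")


def load_auth_config(env: dict) -> dict:
    """Hypothetical loader: returns {} when auth is off, raises when a
    required variable is unset while auth is on."""
    if env.get("MCP_AUTH_ENABLED", "false").lower() != "true":
        return {}  # every auth variable is a no-op when the master switch is off

    missing = [name for name in REQUIRED_WHEN_ENABLED if not env.get(name)]
    if missing:
        # Fail at startup instead of on the first authenticated request.
        raise RuntimeError("missing required variables: " + ", ".join(missing))

    return {
        "issuer": env["MCP_AUTH_ISSUER"],
        "audience": env["MCP_AUTH_AUDIENCE"],
        # Comma-separated list; surrounding whitespace is tolerated here (assumption).
        "catalog": [t.strip() for t in env["MCP_TENANT_CATALOG"].split(",") if t.strip()],
        "tenants_claim": env.get("MCP_AUTH_TENANTS_CLAIM", "tenants"),
        "default_tenant_claim": env.get("MCP_AUTH_DEFAULT_TENANT_CLAIM", "default_tenant"),
        "required_scope": env.get("MCP_AUTH_REQUIRED_SCOPE", "mcp:tools"),
    }
```

In a real process the function would be called with `dict(os.environ)`; taking the environment as a parameter keeps the sketch easy to exercise in isolation.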

How it works

+ +1. Every request to `/mcp` must carry `Authorization: Bearer `. +2. The middleware validates the JWT signature against the IdP's JWKS (cached + in-process and auto-refreshed when an unknown `kid` arrives). +3. It verifies `iss`, `aud`, `exp`, and the required scope. +4. It reads the `tenants` array claim, intersects it with `MCP_TENANT_CATALOG`, + and builds a per-session `SessionAuth` keyed by `Mcp-Session-Id`. +5. The session's `current_tenant` defaults to the JWT's `default_tenant` claim + (when present and allowed), otherwise the alphabetically first allowed tenant. +6. Each tool call routes to the Memgraph database with the same name as + `current_tenant`. + +Within a session, users can switch among their allowed databases with the +`use_database` tool, and discover them with `list_databases`. + +
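Steps 3 to 5 above can be sketched in Python on already-decoded claims. This is a minimal illustration under stated assumptions: signature verification (step 2) is left out, the function names are hypothetical, scopes are assumed to arrive as a space-separated `scope` string, and the fallback picks the alphabetically first allowed tenant as the environment-variable table describes.

```python
import time


def check_claims(claims: dict, issuer: str, audience: str, required_scope: str, now=None) -> None:
    """Step 3 on decoded claims; raises PermissionError on the first failed check."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise PermissionError("unexpected issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # `aud` may be a string or a list
    if audience not in audiences:
        raise PermissionError("unexpected audience")
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")
    # Assumption: scopes are carried as a space-separated `scope` string.
    if required_scope not in str(claims.get("scope", "")).split():
        raise PermissionError("missing required scope")


def initial_tenant(claims: dict, catalog: list) -> str:
    """Steps 4 and 5: intersect with the catalog, then pick the starting tenant."""
    allowed = sorted(set(claims.get("tenants", [])) & set(catalog))
    if not allowed:
        raise PermissionError("no allowed tenants")  # behaviour here is an assumption
    preferred = claims.get("default_tenant")
    # Honour the preference only when it is inside the allowed set.
    return preferred if preferred in allowed else allowed[0]
```

A `default_tenant` value that falls outside the intersection is ignored rather than rejected in this sketch; whether the real server treats that case as an error is not stated in the text.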

Discovery endpoints exposed when auth is enabled

+ +| Path | Purpose | +| ----------------------------------------------- | -------------------------------------------------------------------------------------------------------- | +| `GET /.well-known/oauth-protected-resource` | RFC 9728 PRM telling MCP clients which authorization server to use. | +| `GET /.well-known/oauth-authorization-server` | RFC 8414 AS metadata (proxied from the upstream IdP). | +| `GET /.well-known/openid-configuration` | OIDC discovery (proxied from the upstream IdP). | +| `POST /register` | DCR intercept — only present when `MCP_AUTH_STATIC_CLIENT_ID` is set. | + +The discovery document fetched from the upstream IdP is cached in-process and +re-fetched on the next request if the cache is empty (e.g., the IdP was down +on the first attempt). + +
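The caching behaviour described above (keep the document once fetched, retry on the next request while the cache is empty) can be sketched in Python. The `DiscoveryCache` class and its `fetch` callable are hypothetical, and returning `None` while the IdP is unreachable is an assumption; the real server may respond differently in that case.

```python
class DiscoveryCache:
    """Hypothetical in-process cache for the upstream discovery document."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable returning the document as a dict, or raising OSError
        self._document = None  # stays None until a fetch succeeds

    def get(self):
        if self._document is None:  # empty cache: first request, or an earlier fetch failed
            try:
                self._document = self._fetch()
            except OSError:
                return None  # leave the cache empty so the next request retries
        return self._document
```

Because a failed fetch never stores anything, a temporary IdP outage at startup does not poison the cache for the lifetime of the process.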

DCR intercept (workaround for Claude Code)

+ +Some MCP clients — notably current Claude Code (see +[anthropics/claude-code#26675](https://github.com/anthropics/claude-code/issues/26675)) +— force Dynamic Client Registration even when a pre-registered `clientId` is +configured. Setting `MCP_AUTH_STATIC_CLIENT_ID=` makes +the MCP server return the same pre-registered client ID for every DCR request, +sidestepping the bug. + +When that variable is set, the PRM document also advertises the MCP server +itself as the `authorization_server` so DCR requests come back to the MCP +server instead of going directly to the IdP. All other OAuth flows +(authorize, token, JWKS) still happen against the real IdP. + + + Leave `MCP_AUTH_STATIC_CLIENT_ID` unset for production deployments whose + clients respect pre-configured `clientId` values. + + +
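The intercept can be pictured with a short Python sketch. Everything here is illustrative: the function names and parameters are hypothetical, the metadata keys follow the field names of RFC 9728 (`resource`, `authorization_servers`) and RFC 7591 (`client_id`), and the real responses may carry additional fields.

```python
def protected_resource_metadata(resource_url, mcp_base_url, issuer, static_client_id=None):
    """Hypothetical PRM builder: which authorization server do clients see?"""
    # With the intercept on, registration requests are steered back to the MCP server.
    authorization_server = mcp_base_url if static_client_id else issuer
    return {"resource": resource_url, "authorization_servers": [authorization_server]}


def handle_register(request_body, static_client_id):
    """Hypothetical POST /register handler: every caller receives the same
    pre-registered client ID, whatever metadata it submitted."""
    return {"client_id": static_client_id}


prm_default = protected_resource_metadata(
    "https://mcp.example.com/mcp", "https://mcp.example.com", "https://auth.example.com/realms/memgraph"
)
prm_intercept = protected_resource_metadata(
    "https://mcp.example.com/mcp", "https://mcp.example.com", "https://auth.example.com/realms/memgraph",
    static_client_id="memgraph-mcp",
)
```

Only registration is redirected in this model; the authorize, token, and JWKS endpoints are still those of the upstream IdP, as the text above states.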

What you need on the IdP side

+ +In any OIDC provider, roughly: + +1. A public client with PKCE enabled, with redirect URI patterns matching the + IDEs you'll use (e.g., `http://localhost:*`, `vscode://*`, `cursor://*`, + `claude://*`). +2. A `tenants` claim mapper that emits a JSON-array claim of the user's tenant + memberships (in Keycloak: a Group Membership mapper; in Auth0/Okta: a + custom rule reading group or role attributes). +3. An audience claim mapper baking your `MCP_AUTH_AUDIENCE` value into issued + tokens. +4. A scope (default: `mcp:tools`) attached to the client. +5. For each tenant in `MCP_TENANT_CATALOG`, a corresponding Memgraph logical + database created via `CREATE DATABASE `. + +A complete Keycloak example (single-pod, dev-mode) is available in the +[`keycloak-k8s`](https://github.com/memgraph/keycloak-k8s) reference setup. + ### Run Memgraph MCP server on Kubernetes A dedicated [`memgraph-mcp` Helm From 3057a53f44293d056d3f472a5c444ef3c6c31b5b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ivan=20Milinovi=C4=87?= <44698587+imilinovic@users.noreply.github.com> Date: Wed, 27 May 2026 11:49:42 +0200 Subject: [PATCH 09/12] Document cross_database module (deprecates migrate in 3.11) (#1634) Add the cross_database reference page covering all procedures (bolt, neo4j, mysql, postgresql, sql_server, oracle_db, s3, arrow_flight, duckdb, servicenow), the Bolt type conversion rules (including the lossy 30-days-per-month Duration flattening), the same-parameters guard, and the migrate.* alias table. Reduce the migrate page to a redirect stub pointing at cross_database; the migrate.* names continue to work via Memgraph's callable-mapping aliases, but the canonical reference now lives at cross_database. 
--- .../available-algorithms/_meta.ts | 1 + .../available-algorithms/cross_database.mdx | 462 ++++++++++++++ .../available-algorithms/migrate.mdx | 567 +----------------- 3 files changed, 472 insertions(+), 558 deletions(-) create mode 100644 pages/advanced-algorithms/available-algorithms/cross_database.mdx diff --git a/pages/advanced-algorithms/available-algorithms/_meta.ts b/pages/advanced-algorithms/available-algorithms/_meta.ts index 6795c2595..885416252 100644 --- a/pages/advanced-algorithms/available-algorithms/_meta.ts +++ b/pages/advanced-algorithms/available-algorithms/_meta.ts @@ -13,6 +13,7 @@ export default { "convert_c": "convert_c", "do": "do", "create": "create", + "cross_database": "cross_database", "cugraph": "cugraph", "cycles": "cycles", "date": "date", diff --git a/pages/advanced-algorithms/available-algorithms/cross_database.mdx b/pages/advanced-algorithms/available-algorithms/cross_database.mdx new file mode 100644 index 000000000..dd60e8a42 --- /dev/null +++ b/pages/advanced-algorithms/available-algorithms/cross_database.mdx @@ -0,0 +1,462 @@ +--- +title: cross_database +description: Query other databases (Memgraph, Neo4j, PostgreSQL, MySQL, Oracle, SQL Server, S3, Arrow Flight, DuckDB, ServiceNow) directly from Memgraph and stream their rows into your graph. +--- + +import { Cards } from 'nextra/components' +import GitHub from '/components/icons/GitHub' +import { Callout } from 'nextra/components'; + +# cross_database + +The `cross_database` module lets you reach into another database from a running Cypher query and stream +its rows into Memgraph. Use it to migrate data, build hybrid OLTP/graph pipelines, or join graph data +with rows fetched on-demand from a relational/Bolt/object-store source. + + +**The `migrate` module is deprecated as of Memgraph 3.11** and has been replaced by `cross_database`. 
+Existing `migrate.*` calls keep working via the [aliases](#backwards-compatibility-with-migrate) shipped +with Memgraph, but new code should call `cross_database.*` directly. The previous `migrate.memgraph()` +procedure has been replaced by the more general [`cross_database.bolt()`](#bolt). + + + + } + title="Source code" + href="https://github.com/memgraph/memgraph/blob/master/mage/python/cross_database.py" + /> + + +| Trait | Value | +| ------------------- | ---------- | +| **Module type** | util | +| **Implementation** | Python | +| **Parallelism** | sequential | + + +When running multiple cross-database calls against the same source, avoid repeating the `config` map in every call. +Use [server-side parameters](/database-management/server-side-parameters) to store the connection config once +and reference it as `$config` across all your queries: + +```cypher +SET GLOBAL PARAMETER pg_config = {user: 'memgraph', password: 'password', host: 'localhost', database: 'demo_db'}; + +CALL cross_database.postgresql('users', $pg_config) YIELD row CREATE (u:User {id: row.id}); +CALL cross_database.postgresql('orders', $pg_config) YIELD row CREATE (o:Order {id: row.id}); +``` + + +## Backwards compatibility with `migrate` + +Every procedure listed below has a `migrate.*` alias preserved from earlier versions, so existing queries +keep working unchanged. The aliases are wired up through Memgraph's callable-mapping mechanism +(`/etc/memgraph/apoc_compatibility_mappings.json`, enabled by default), so no configuration is required. 
+ +| Pre-3.11 name | New name | +| -------------------------- | --------------------------------- | +| `migrate.memgraph` | `cross_database.bolt` | +| `migrate.neo4j` | `cross_database.neo4j` | +| `migrate.mysql` | `cross_database.mysql` | +| `migrate.postgresql` | `cross_database.postgresql` | +| `migrate.sql_server` | `cross_database.sql_server` | +| `migrate.oracle_db` | `cross_database.oracle_db` | +| `migrate.s3` | `cross_database.s3` | +| `migrate.arrow_flight` | `cross_database.arrow_flight` | +| `migrate.duckdb` | `cross_database.duckdb` | +| `migrate.servicenow` | `cross_database.servicenow` | + +If you mix old and new names within the same transaction, treat them as the same procedure — calling +`migrate.postgresql` and `cross_database.postgresql` with identical arguments will trigger the +[same-parameters guard](#same-parameters-guard). + +You can inspect the active mapping at runtime with: + +```cypher +SHOW QUERY CALLABLE MAPPINGS; +``` + +## Type conversion + +For Bolt-based sources (`bolt`, `neo4j`), primitives (`Boolean`, `Integer`, `Float`, `String`, `Null`), +lists, maps, the calendar-aware temporal types (`Date`, `LocalTime`, `LocalDateTime`, `DateTime`), and +spatial points (`Point2d`, `Point3d`, with their `srid` preserved) pass through cleanly. The cases worth +knowing about: + +| Source type | Result | Notes | +| ---------------------------- | ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `Duration` with `months > 0` | flattened | Months are coerced to 30 days each. **This differs from native Neo4j behavior**, where months are preserved and `date + P1M` stays calendar-aware. See the warning below. | +| `Time` (zoned) | *unsupported* | Memgraph has no zoned-time type. Convert to `LocalTime` on the source side, or carry the offset as a separate column. 
| +| Enum | *unsupported* | Memgraph enums can't cross the Bolt boundary — the call raises an error. | +| Node / Relationship / Path | *unsupported* | Streaming raw graph objects is not supported. Return their `properties()` / `labels()` instead. | + + +The 30-days-per-month flattening means a duration like `P1M` becomes `30 days` on the Memgraph side, and +`P1Y` becomes `360 days`. Native Neo4j keeps months separate from days, so `date + duration('P1M')` stays +calendar-aware there; after crossing the Bolt boundary into Memgraph that calendar information is gone. +If you need exact calendar arithmetic, fetch the months and days components as separate columns and +reconstruct them on the Memgraph side. + + +## Same-parameters guard + +`cross_database` deliberately rejects two concurrent calls with the same `(query, config, params)` inside +a single transaction. Doing so would race on the underlying connection and yield duplicate rows. The error +message is: + +``` +Cross database module with these parameters is already running. +Please wait for it to finish before starting a new one. +``` + +If you intentionally need two reads from the same source in one transaction, vary the query (for example +add a different `LIMIT`, alias, or comment) so the cache key differs. + +--- + +## Procedures + +### `bolt()` + +`cross_database.bolt()` queries any Bolt-compatible database — Memgraph itself, another Memgraph instance, +or Neo4j — and streams the rows back. This replaces the old `migrate.memgraph()` procedure and is the +recommended way to read from any Bolt source. + +{

Input:

} + +- `label_or_rel_or_query: str` ➡ A label name (`(:Label)`), a relationship type (`[:REL_TYPE]`), or a plain + Cypher query. When a label/relationship shorthand is used, `cross_database.bolt()` synthesises the + matching `MATCH … RETURN labels/properties …` query for you. +- `config: mgp.Map` ➡ Connection parameters. Notable keys: `host` (default `localhost`), `port` + (default `7687`), `username`, `password`, `database`, `uri_scheme` (default `bolt`). +- `config_path: str` (optional) ➡ Path to a JSON file containing connection parameters; values in the + file override values in `config`. +- `params: mgp.Nullable[mgp.Any]` (optional, default `None`) ➡ A `Map` of Cypher parameters passed to + the remote query (e.g. `{val: 42}` for a query using `$val`). + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + - When fetching with the `(:Label)` syntax, each row has `labels` and `properties`. + - When fetching with the `[:REL_TYPE]` syntax, each row has `from_labels`, `to_labels`, + `from_properties`, `to_properties`, and `edge_properties`. + - When passing a plain Cypher query, row keys match the columns the remote query returns. + +{

Usage:

} + +#### Retrieve nodes of a certain label and recreate them locally +```cypher +CALL cross_database.bolt('(:Person)', {host: 'localhost', port: 7687}) +YIELD row +WITH row.labels AS labels, row.properties AS props +CREATE (n:labels) SET n += props; +``` + +#### Pass query parameters +```cypher +CALL cross_database.bolt( + 'MATCH (u:User) WHERE u.id = $id RETURN u.name AS name', + {host: 'localhost', port: 7687}, + '', + {id: 42} +) +YIELD row +RETURN row.name AS name; +``` + +#### Connect to Neo4j with explicit credentials +```cypher +CALL cross_database.bolt( + 'MATCH (n) RETURN count(n) AS cnt', + {host: 'neo4j-host', port: 7687, username: 'neo4j', password: 'secret'} +) +YIELD row +RETURN row.cnt AS cnt; +``` + +--- + +### `neo4j()` + +`cross_database.neo4j()` is a thin convenience wrapper around [`bolt()`](#bolt) that defaults the +credentials to Neo4j's stock `neo4j` / `password`. Use it when you only need to override the host/port. + +{

Input:

} + +Same as [`bolt()`](#bolt). If `username` / `password` are missing from `config`, they default to +`neo4j` / `password`. + +{

Output:

} + +Same as [`bolt()`](#bolt). + +{

Usage:

} + +```cypher +CALL cross_database.neo4j('(:Person)', {host: 'neo4j-host', port: 7687}) +YIELD row +WITH row.labels AS labels, row.properties AS props +CREATE (n:labels) SET n += props; +``` + +--- + +### `mysql()` + +Query MySQL and stream rows back. The result table is converted into a stream that can be used to create +graph structures. + +{

Input:

} + +- `table_or_sql: str` ➡ A table name (a single word — automatically expanded to `SELECT * FROM `) + or a full SQL query. +- `config: mgp.Map` ➡ Connection parameters (as in `mysql.connector.connect`). +- `config_path: str` (optional) ➡ Path to a JSON file with connection parameters. +- `params: mgp.Nullable[mgp.Any]` (optional, default `None`) ➡ Query parameters. Accepts a `List` for + `%s`-style placeholders or a `Map` for `%(name)s`-style placeholders (both supported by + `mysql.connector`). + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + +{

Usage:

} + +#### Retrieve and inspect data +```cypher +CALL cross_database.mysql('example_table', {user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'demo_db'}) +YIELD row +RETURN row +LIMIT 5000; +``` + +#### Create nodes from migrated data +```cypher +CALL cross_database.mysql('SELECT id, name, age FROM users', {user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'demo_db'}) +YIELD row +CREATE (u:User {id: row.id, name: row.name, age: row.age}); +``` + +--- + +### `postgresql()` + +Query PostgreSQL and stream rows back. + +{

Input:

} + +- `table_or_sql: str` ➡ Table name or SQL query. +- `config: mgp.Map` ➡ Connection parameters (as in `psycopg2.connect`). +- `config_path: str` (optional) ➡ Path to a JSON file with connection parameters. +- `params: mgp.Nullable[mgp.Any]` (optional, default `None`) ➡ Query parameters as a `List` + (psycopg2 uses positional `%s` placeholders). Passing a `Map` raises a `TypeError`. + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + +{

Usage:

} + +#### Retrieve and inspect data +```cypher +CALL cross_database.postgresql('example_table', {user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'demo_db'}) +YIELD row +RETURN row +LIMIT 5000; +``` + +#### Establish relationships between orders and customers +```cypher +CALL cross_database.postgresql('SELECT order_id, customer_id FROM orders', {user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'retail_db'}) +YIELD row +MATCH (o:Order {id: row.order_id}), (c:Customer {id: row.customer_id}) +CREATE (c)-[:PLACED]->(o); +``` + +--- + +### `sql_server()` + +Query SQL Server and stream rows back. + +{

Input:

} + +- `table_or_sql: str` ➡ Table name or SQL query. +- `config: mgp.Map` ➡ Connection parameters (as in `pyodbc.connect`). +- `config_path: str` (optional) ➡ Path to a JSON file with connection parameters. +- `params: mgp.Nullable[mgp.Any]` (optional, default `None`) ➡ Query parameters as a `List`. + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + +{

Usage:

} + +```cypher +CALL cross_database.sql_server('SELECT id, name, role FROM employees', {user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'company_db'}) +YIELD row +CREATE (e:Employee {id: row.id, name: row.name, role: row.role}); +``` + +--- + +### `oracle_db()` + +Query Oracle DB and stream rows back. + +{

Input:

} + +- `table_or_sql: str` ➡ Table name or SQL query. +- `config: mgp.Map` ➡ Connection parameters (as in `oracledb.connect`). +- `config_path: str` (optional) ➡ Path to a JSON file with connection parameters. +- `params: mgp.Nullable[mgp.Any]` (optional, default `None`) ➡ Query parameters. Accepts a `List` for + positional placeholders or a `Map` for `:name`-style named placeholders (both supported by `oracledb`). + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + +{

Usage:

} + +```cypher +CALL cross_database.oracle_db('SELECT id, name FROM companies', {user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'business_db'}) +YIELD row +MERGE (c:Company {id: row.id}) +SET c.name = row.name; +``` + +--- + +### `s3()` + +Read a CSV file directly from AWS S3 and stream its rows into Memgraph. + +{

Input:

} + +- `file_path: str` ➡ S3 path in the form `s3://bucket-name/path/to/file.csv`. +- `config: mgp.Map` ➡ AWS credentials; all keys optional. Missing keys fall back to the corresponding + environment variables: + - `aws_access_key_id` (env: `AWS_ACCESS_KEY_ID`) + - `aws_secret_access_key` (env: `AWS_SECRET_ACCESS_KEY`) + - `region_name` (env: `AWS_REGION`) + - `aws_session_token` (env: `AWS_SESSION_TOKEN`) +- `config_path: str` (optional) ➡ Path to a JSON file with AWS credentials. + +{

Output:

} + +- `row: mgp.Map` ➡ Each CSV row as a `{column_name: value}` map. + +{

Usage:

} + +```cypher +CALL cross_database.s3('s3://my-bucket/employees.csv', {aws_access_key_id: 'your-key', + aws_secret_access_key: 'your-secret', + region_name: 'eu-central-1'}) +YIELD row +CREATE (e:Employee {id: row.id, name: row.name, position: row.position}); +``` + +--- + +### `arrow_flight()` + +Connect to any data source that speaks the [Arrow Flight RPC protocol](https://arrow.apache.org/docs/format/Flight.html) +(for example, [Dremio](https://www.dremio.com/)) and stream rows in. + +{

Input:

} + +- `query: str` ➡ Query against the data source. +- `config: mgp.Map` ➡ Connection parameters (as in `pyarrow.flight.connect`). Notable keys: `host`, + `port`, `username`, `password`. +- `config_path: str` (optional) ➡ Path to a JSON file with connection parameters. + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + +{

Usage:

} + +```cypher +CALL cross_database.arrow_flight('SELECT id, name, age FROM users', {username: 'memgraph', + password: 'password', + host: 'localhost', + port: '12345'}) +YIELD row +CREATE (u:User {id: row.id, name: row.name, age: row.age}); +``` + +--- + +### `duckdb()` + +Connect to DuckDB and use it as a proxy to query the [data sources DuckDB +supports](https://duckdb.org/docs/stable/data/data_sources.html). DuckDB is run in-memory with no +persistence — it's only used to proxy to underlying sources. + +{

Input:

} + +- `query: str` ➡ Table name or SQL query. +- `setup_queries: mgp.Nullable[List[str]]` (optional) ➡ Queries executed before `query`, used to attach + or configure the source DuckDB will proxy to (e.g. `INSTALL httpfs; LOAD httpfs;`). + +{

Output:

} + +- `row: mgp.Map` ➡ The result table as a stream of rows. + +{

Usage:

} + +```cypher +CALL cross_database.duckdb( + 'SELECT * FROM read_csv_auto(''s3://my-bucket/users.csv'')', + ['INSTALL httpfs;', 'LOAD httpfs;'] +) +YIELD row +CREATE (u:User {id: row.id, name: row.name}); +``` + +--- + +### `servicenow()` + +Pull data from the [ServiceNow REST API](https://developer.servicenow.com/dev.do#!/reference/api/xanadu/rest/). +The endpoint must return JSON of the form `{"results": [...]}`. + +{

Input:

} + +- `endpoint: str` ➡ The full ServiceNow URL, including any query parameters. +- `config: mgp.Map` ➡ Connection parameters. Notable keys: `username`, `password` (passed through to + `requests.get`). +- `config_path: str` (optional) ➡ Path to a JSON file with connection parameters. +- `params: mgp.Nullable[mgp.Any]` (optional, default `None`) ➡ Additional URL query parameters, + forwarded to `requests.get(endpoint, params=...)`. Unlike the SQL backends, these are HTTP query + string parameters, not SQL placeholders. + +{

Output:

} + +- `row: mgp.Map` ➡ Each element of `results` as a structured dictionary. + +{

Usage:

} + +```cypher +CALL cross_database.servicenow('http://my_endpoint/api/data', {}) +YIELD row +CREATE (e:Employee {id: row.id, name: row.name, position: row.position}); +``` diff --git a/pages/advanced-algorithms/available-algorithms/migrate.mdx b/pages/advanced-algorithms/available-algorithms/migrate.mdx index 884b57e6d..6cf31632f 100644 --- a/pages/advanced-algorithms/available-algorithms/migrate.mdx +++ b/pages/advanced-algorithms/available-algorithms/migrate.mdx @@ -1,568 +1,19 @@ --- title: migrate -description: Discover the migration capabilities of Memgraph for efficient transfer of graph data between instances. Access tutorials and comprehensive documentation for improved experience throughout the migration. +description: The migrate module was renamed to cross_database in Memgraph 3.11. The migrate.* procedure names continue to work as aliases. --- -import { Cards } from 'nextra/components' -import GitHub from '/components/icons/GitHub' -import { Steps } from 'nextra/components' import { Callout } from 'nextra/components'; # migrate -The `migrate` module provides an efficient way to transfer graph data from various relational databases -into Memgraph. This module allows you to retrieve data from various source systems, -transforming tabular data into graph structures. - -With Cypher, you can shape the migrated data dynamically, making it easy to create nodes, -establish relationships, and enrich your graph. Below are examples showing how to retrieve, -filter, and convert relational data into a graph format. - - - } - title="Source code" - href="https://github.com/memgraph/memgraph/blob/master/mage/python/migrate.py" - /> - - -| Trait | Value | -| ------------------- | ---------- | -| **Module type** | util | -| **Implementation** | Python | -| **Parallelism** | sequential | - - -When running multiple migrations against the same source, avoid repeating the `config` map in every call. 
-Use [server-side parameters](/database-management/server-side-parameters) to store the connection config once -and reference it as `$config` across all your queries: - -```cypher -SET GLOBAL PARAMETER pg_config = {user: 'memgraph', password: 'password', host: 'localhost', database: 'demo_db'}; - -CALL migrate.postgresql('users', $pg_config) YIELD row CREATE (u:User {id: row.id}); -CALL migrate.postgresql('orders', $pg_config) YIELD row CREATE (o:Order {id: row.id}); -``` + +**The `migrate` module was renamed to [`cross_database`](./cross_database) in Memgraph 3.11.** +The `migrate.*` procedure names continue to work via aliases shipped with Memgraph, but new code +should call `cross_database.*` directly. See the [`cross_database`](./cross_database) page for the +full reference, including the new [`bolt()`](./cross_database#bolt) procedure (which supersedes +`migrate.memgraph()`) and the type-conversion rules that apply when reading from Bolt sources. ---- - -## Procedures - -### `arrow_flight()` - -With the `arrow_flight()` procedure, users can access data sources which support the [Arrow Flight RPC protocol](https://arrow.apache.org/docs/format/Flight.html) for transfer -of large data records to achieve high performance. Underlying implementation is using the `pyarrow` Python library to stream rows to -Memgraph. [Dremio](https://www.dremio.com/) is a confirmed data source that works with the `arrow_flight()` procedure. Other sources may also be compatible, but Dremio is based on previous experience. - -{

Input:

} - -- `query: str` ➡ Query used to query the data source. -- `config: mgp.Map` ➡ Connection parameters (as in `pyarrow.flight.connect`). Useful parameters for connecting are `host`, `port`, `username` and `password`. -- `config_path` ➡ Path to a JSON file containing configuration parameters. - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - -#### Retrieve and inspect data -```cypher -CALL migrate.arrow_flight('SELECT * FROM users', {username: 'memgraph', - password: 'password', - host: 'localhost', - port: '12345'} ) -YIELD row -RETURN row -LIMIT 5000; -``` - -#### Filter specific data -```cypher -CALL migrate.arrow_flight('SELECT * FROM users', {username: 'memgraph', - password: 'password', - host: 'localhost', - port: '12345'} ) -YIELD row -WHERE row.age >= 30 -RETURN row; -``` - -#### Create nodes from migrated data -```cypher -CALL migrate.arrow_flight('SELECT id, name, age FROM users', {username: 'memgraph', - password: 'password', - host: 'localhost', - port: '12345'} ) -YIELD row -CREATE (u:User {id: row.id, name: row.name, age: row.age}); -``` - -#### Create relationships between users -```cypher -CALL migrate.arrow_flight('SELECT user1_id, user2_id FROM friendships', {username: 'memgraph', - password: 'password', - host: 'localhost', - port: '12345'} ) -YIELD row -MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id}) -CREATE (u1)-[:FRIENDS_WITH]->(u2); -``` - -### `duckdb()` -With the `migrate.duckdb()` procedure, users can connect to the ** DuckDB** database and query various data sources. -List of data sources that are supported by DuckDB can be found on their [official documentation page](https://duckdb.org/docs/stable/data/data_sources.html). -The underlying implementation streams results from DuckDB to Memgraph using the `duckdb` Python Library. DuckDB is started with the in-memory mode, without any -persistence and is used just to proxy to the underlying data sources. - -{

Input:

} - -- `query: str` ➡ Table name or an SQL query. -- `setup_queries: mgp.Nullable[List[str]]` ➡ List of queries that will be executed prior to the query provided as the initial argument. -Used for setting up the connection to additional data sources. - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - -{

Usage:

} - -#### Retrieve and inspect data -```cypher -CALL migrate.duckdb("SELECT * FROM 'test.parquet';") -YIELD row -RETURN row -LIMIT 5000; -``` - -#### Filter specific data -```cypher -CALL migrate.duckdb("SELECT * FROM 'test.parquet';") -YIELD row -WHERE row.age >= 30 -RETURN row; -``` - -#### Create nodes from migrated data -```cypher -CALL migrate.duckdb("SELECT * FROM 'test.parquet';") -YIELD row -CREATE (u:User {id: row.id, name: row.name, age: row.age}); -``` - -#### Create relationships between users -```cypher -CALL migrate.duckdb("SELECT * FROM 'test.parquet';") -YIELD row -MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id}) -CREATE (u1)-[:FRIENDS_WITH]->(u2); -``` - -#### Setup connection to query additional data sources -```cypher -CALL migrate.duckdb("SELECT * FROM 's3://your_bucket/your_file.parquet';", ["CREATE SECRET secret1 (TYPE s3, KEY_ID 'key', SECRET 'secret', REGION 'region');"]) -YIELD row -MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id}) -CREATE (u1)-[:FRIENDS_WITH]->(u2); -``` - ---- - -### `memgraph()` - -With the `migrate.memgraph()` procedure, you can access another Memgraph instance and migrate your data to a new Memgraph instance. -The resulting nodes and edges are converted into a stream of rows which can include labels, properties, and primitives. - - -Streaming of raw node and relationship objects is not supported and users are advised to migrate all the necessary identifiers in order to recreate the same graph in Memgraph. - - -{

Input:

} - -- `label_or_rel_or_query: str` ➡ Label name (written in format `(:Label)`), relationship name (written in format `[:rel_type]`) or a plain cypher query. -- `config: mgp.Map` ➡ Connection parameters (as in `gqlalchemy.Memgraph`). Notable parameters are `host[String]`, and `port[Integer]` -- `config_path` ➡ Path to a JSON file containing configuration parameters. -- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable). - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - - when retrieving nodes using the `(:Label)` syntax, row will have the following keys: `labels`, and `properties` - - when retrieving relationships using the `[:REL_TYPE]` syntax, row will have the following keys: `from_labels`, `to_labels`, `from_properties`, `to_properties`, and `edge_properties` - - when retrieving results using a plain Cypher query, row will have keys identical to the returned column names from the Cypher query - -{

Usage:

} - -#### Retrieve nodes of certain label and create them in a new Memgraph instance -```cypher -CALL migrate.memgraph('(:Person)', {host: 'localhost', port: 7687}) -YIELD row -WITH row.labels AS labels, row.properties as props -CREATE (n:labels) SET n += row.props -``` - -#### Retrieve relationships of certain type and create them in a new Memgraph instance -```cypher -CALL migrate.memgraph('[:KNOWS]', {host: 'localhost', port: 7687}) -YIELD row -WITH row.from_labels AS from_labels, - row.to_labels AS to_labels, - row.from_properties AS from_properties, - row.to_properties AS to_properties, - row.edge_properties AS edge_properties -MATCH (p1:Person {id: row.from_properties.id}) -MATCH (p2:Person {id: row.to_properties.id}) -CREATE (p1)-[r:KNOWS]->(p2) -SET r += edge_properties; -``` - -#### Retrieve information from Memgraph using an arbitrary Cypher query -```cypher -CALL migrate.memgraph('MATCH (n) RETURN count(n) as cnt', {host: 'localhost', port: 7687}) -YIELD row -RETURN row.cnt as cnt; -``` - ---- - -### `mysql()` - -With the `migrate.mysql()` procedure, you can access MySQL and migrate your data to Memgraph. -The result table is converted into a stream, and the returned rows can be used to create graph structures. - -{

Input:

} - -- `table_or_sql: str` ➡ Table name or an SQL query. -- `config: mgp.Map` ➡ Connection parameters (as in `mysql.connector.connect`). -- `config_path` ➡ Path to a JSON file containing configuration parameters. -- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable). - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - -{

Usage:

} - -#### Retrieve and inspect data -```cypher -CALL migrate.mysql('example_table', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -RETURN row -LIMIT 5000; -``` - -#### Filter specific data -```cypher -CALL migrate.mysql('SELECT * FROM users', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -WHERE row.age >= 30 -RETURN row; -``` - -#### Create nodes from migrated data -```cypher -CALL migrate.mysql('SELECT id, name, age FROM users', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -CREATE (u:User {id: row.id, name: row.name, age: row.age}); -``` - -#### Create relationships between users -```cypher -CALL migrate.mysql('SELECT user1_id, user2_id FROM friendships', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id}) -CREATE (u1)-[:FRIENDS_WITH]->(u2); -``` - ---- - -### `neo4j()` - -With the `migrate.neo4j()` procedure, you can access Neo4j and migrate your data to Memgraph. -The resulting nodes and edges are converted into a stream of rows which can include labels, properties, and primitives. -**Streaming of raw node and relationship objects is not supported**, and users are advised to migrate all the necessary identifiers -in order to recreate the same graph in Memgraph. - -{

Input:

} - -- `label_or_rel_or_query: str` ➡ Label name (written in format `(:Label)`), relationship name (written in format `[:rel_type]`) or a plain cypher query. -- `config: mgp.Map` ➡ Connection parameters (as in `gqlalchemy.Neo4j`). Notable parameters are `host[String]` and `port[Integer]`. -- `config_path` ➡ Path to a JSON file containing configuration parameters. -- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable). - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - - When retrieving nodes using the `(:Label)` syntax, row will have the following keys: `labels` and `properties`. - - When retrieving relationships using the `[:REL_TYPE]` syntax, row will have the following keys: `from_labels`, `to_labels`, `from_properties`, `to_properties` and `edge_properties`. - - When retrieving results using a plain Cypher query, row will have keys identical to the returned column names from the Cypher query. - -{

Usage:

} - -#### Retrieve nodes of certain label and create them in Memgraph -```cypher -CALL migrate.neo4j('(:Person)', {host: 'localhost', port: 7687}) -YIELD row -WITH row.labels AS labels, row.properties as props -CREATE (n:labels) SET n += row.props -``` - -#### Retrieve relationships of certain type and create them in Memgraph -```cypher -CALL migrate.neo4j('[:KNOWS]', {host: 'localhost', port: 7687}) -YIELD row -WITH row.from_labels AS from_labels, - row.to_labels AS to_labels, - row.from_properties AS from_properties, - row.to_properties AS to_properties, - row.edge_properties AS edge_properties -MATCH (p1:Person {id: row.from_properties.id}) -MATCH (p2:Person {id: row.to_properties.id}) -CREATE (p1)-[r:KNOWS]->(p2) -SET r += edge_properties; -``` - -#### Retrieve information from Neo4j using an arbitrary Cypher query -```cypher -CALL migrate.neo4j('MATCH (n) RETURN count(n) as cnt', {host: 'localhost', port: 7687}) -YIELD row -RETURN row.cnt as cnt; -``` - ---- - -### `oracle_db()` - -With the `migrate.oracle_db()` procedure, you can access Oracle DB and migrate your data to Memgraph. - -{

Input:

} - -- `table_or_sql: str` ➡ Table name or an SQL query. -- `config: mgp.Map` ➡ Connection parameters (as in `mysql.connector.connect`). -- `config_path` ➡ Path to a JSON file containing configuration parameters. -- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable). - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - -{

Usage:

} - -#### Retrieve and inspect data -```cypher -CALL migrate.oracle_db('example_table', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -RETURN row -LIMIT 5000; -``` - -#### Merge nodes to avoid duplicates -```cypher -CALL migrate.oracle_db('SELECT id, name FROM companies', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'business_db'} ) -YIELD row -MERGE (c:Company {id: row.id}) -SET c.name = row.name; -``` - ---- - -### `postgresql()` - -With the `migrate.postgresql()` procedure, you can access PostgreSQL and migrate your data to Memgraph. - -{

Input:

} - -- `table_or_sql: str` ➡ Table name or an SQL query. -- `config: mgp.Map` ➡ Connection parameters (as in `mysql.connector.connect`). -- `config_path` ➡ Path to a JSON file containing configuration parameters. -- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable). - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - -{

Usage:

} - -#### Retrieve and inspect data -```cypher -CALL migrate.postgresql('example_table', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -RETURN row -LIMIT 5000; -``` - -#### Create nodes for products -```cypher -CALL migrate.postgresql('SELECT product_id, name, price FROM products', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'retail_db'} ) -YIELD row -CREATE (p:Product {id: row.product_id, name: row.name, price: row.price}); -``` - -#### Establish relationships between orders and customers -```cypher -CALL migrate.postgresql('SELECT order_id, customer_id FROM orders', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'retail_db'} ) -YIELD row -MATCH (o:Order {id: row.order_id}), (c:Customer {id: row.customer_id}) -CREATE (c)-[:PLACED]->(o); -``` - ---- - -### `sql_server()` - -With the `migrate.sql_server()` procedure, you can access SQL Server and migrate your data to Memgraph. - -{

Input:

} - -- `table_or_sql: str` ➡ Table name or an SQL query. -- `config: mgp.Map` ➡ Connection parameters (as in `mysql.connector.connect`). -- `config_path` ➡ Path to a JSON file containing configuration parameters. -- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable). - -{

Output:

} - -- `row: mgp.Map` ➡ The result table as a stream of rows. - -{

Usage:

} - -#### Retrieve and inspect data -```cypher -CALL migrate.sql_server('example_table', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'demo_db'} ) -YIELD row -RETURN row -LIMIT 5000; -``` - -#### Convert SQL table rows into graph nodes -```cypher -CALL migrate.sql_server('SELECT id, name, role FROM employees', {user: 'memgraph', - password: 'password', - host: 'localhost', - database: 'company_db'} ) -YIELD row -CREATE (e:Employee {id: row.id, name: row.name, role: row.role}); -``` - ---- - -### `s3()` - -With the `migrate.s3()` procedure, you can **access a CSV file in AWS S3**, stream the data into Memgraph, -and transform it into a **graph representation** using Cypher. The migration is using the Python `boto3` client. - -{

Input:

} - -- `file_path: str` ➡ S3 file path in the format `'s3://bucket-name/path/to/file.csv'`. -- `config: mgp.Map` ➡ AWS connection parameters. All of them are optional. - - `aws_access_key_id` - if not provided, environment variable `AWS_ACCESS_KEY_ID` will be used - - `aws_secret_access_key` - if not provided, environment variable `AWS_SECRET_ACCESS_KEY` will be used - - `region_name` - if not provided, environment variable `AWS_REGION` will be used - - `aws_session_token` - if not provided, environment variable `AWS_SESSION_TOKEN` will be used -- `config_path: str` (optional) ➡ Path to a JSON file containing AWS credentials. - -{

Output:

} - -- `row: mgp.Map` ➡ Each row from the CSV file as a structured dictionary. - -{

Usage:

} - -#### Retrieve and inspect CSV data from S3 -```cypher -CALL migrate.s3('s3://my-bucket/data.csv', {aws_access_key_id: 'your-key', - aws_secret_access_key: 'your-secret', - region_name: 'us-east-1'} ) -YIELD row -RETURN row -LIMIT 100; -``` - -#### Filter specific rows from the CSV -```cypher -CALL migrate.s3('s3://my-bucket/customers.csv', {aws_access_key_id: 'your-key', - aws_secret_access_key: 'your-secret', - region_name: 'us-west-2'} ) -YIELD row -WHERE row.age >= 30 -RETURN row; -``` - -#### Create nodes dynamically from CSV data -```cypher -CALL migrate.s3('s3://my-bucket/employees.csv', {aws_access_key_id: 'your-key', - aws_secret_access_key: 'your-secret', - region_name: 'eu-central-1'} ) -YIELD row -CREATE (e:Employee {id: row.id, name: row.name, position: row.position}); -``` - ---- - -### `servicenow()` - -With the `migrate.servicenow()` procedure, you can access [ServiceNow REST API](https://developer.servicenow.com/dev.do#!/reference/api/xanadu/rest/) and transfer your data to Memgraph. -The underlying implementation is using the [`requests` Python library] to migrate results to Memgraph. The REST API from -ServiceNow must provide results in the format `{results: []}` in order for Memgraph to stream it into result rows. - -{

Input:

} - -- `endpoint: str` ➡ ServiceNow endpoint. Users can optionally include their own query parameters to filter results. -- `config: mgp.Map` ➡ Connection parameters. Notable connection parameters are `username` and `password`, per `requests.get()` method. -- `config_path: str` ➡ Path to a JSON file containing configuration parameters. - -{

Output:

} - -- `row: mgp.Map` ➡ Each row from the CSV file as a structured dictionary. - -{

Usage:

} - -#### Retrieve and inspect CSV data from ServiceNow -```cypher -CALL migrate.servicenow('http://my_endpoint/api/data', {}) -YIELD row -RETURN row -LIMIT 100; -``` - -#### Filter specific rows from the CSV -```cypher -CALL migrate.servicenow('http://my_endpoint/api/data', {}) -YIELD row -WHERE row.age >= 30 -RETURN row; -``` - -#### Create nodes dynamically from CSV data -```cypher -CALL migrate.servicenow('http://my_endpoint/api/data', {}) -YIELD row -CREATE (e:Employee {id: row.id, name: row.name, position: row.position}); -``` +This page is kept for backwards-compatible inbound links. All content has moved to +[`cross_database`](./cross_database). From ad72d7106c589e85077265f0f991a86c52bdb53c Mon Sep 17 00:00:00 2001 From: Andi Skrgat Date: Wed, 27 May 2026 11:50:39 +0200 Subject: [PATCH 10/12] docs: Bolt TLS (#1641) --- .../setup-ha-cluster-k8s.mdx | 143 ++++++++++++++++-- 1 file changed, 134 insertions(+), 9 deletions(-) diff --git a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx index ec231a5bb..d355e343e 100644 --- a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx +++ b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx @@ -454,6 +454,80 @@ Run the same statements on every data instance you want the user to exist on. Coordinators run without authentication and do not need user setup. +### Bolt SSL/TLS + +Each data instance and coordinator can independently terminate Bolt +connections over TLS. When enabled, the chart mounts a pre-existing +Kubernetes Secret containing the certificate and private key at +`/etc/memgraph/ssl/` and auto-appends `--bolt-cert-file=/etc/memgraph/ssl/tls.crt` +and `--bolt-key-file=/etc/memgraph/ssl/tls.key` to the instance's args. 
+
+
+**Breaking change in HA chart version with TLS config**: The previous way of
+enabling Bolt TLS (passing `--bolt-cert-file` / `--bolt-key-file` through
+`data[].args` / `coordinators[].args` and mounting the certificate Secret
+through `storage.{data,coordinators}.extraVolumes` / `extraVolumeMounts`) is
+no longer supported. Setting `--bolt-cert-file` or `--bolt-key-file` in `args`
+now causes `helm install` to fail with a template error. Migrate to the
+`tls.bolt` block on each instance instead.
+
+
+To enable Bolt TLS, first create a Kubernetes Secret holding the certificate
+and private key in the release namespace:
+
+```bash
+kubectl create secret tls bolt-tls-secret \
+  --cert=path/to/tls.crt \
+  --key=path/to/tls.key
+```
+
+Then enable `tls.bolt` on each instance that should terminate TLS:
+
+```yaml
+data:
+  - id: "0"
+    tls:
+      bolt:
+        enabled: true
+        secretName: bolt-tls-secret
+        certSecretPath: tls.crt
+        keySecretPath: tls.key
+  - id: "1"
+    tls:
+      bolt:
+        enabled: true
+        secretName: bolt-tls-secret
+        certSecretPath: tls.crt
+        keySecretPath: tls.key
+
+coordinators:
+  - id: "1"
+    tls:
+      bolt:
+        enabled: true
+        secretName: bolt-tls-secret
+  - id: "2"
+    tls:
+      bolt:
+        enabled: true
+        secretName: bolt-tls-secret
+  - id: "3"
+    tls:
+      bolt:
+        enabled: true
+        secretName: bolt-tls-secret
+```
+
+`certSecretPath` and `keySecretPath` are the keys inside the Secret holding
+the certificate and key respectively (default `tls.crt` and `tls.key`).
+The chart fails the install if `tls.bolt.enabled` is `true` but
+`tls.bolt.secretName` is empty.
+
+When a coordinator has `tls.bolt.enabled: true`, the cluster-setup job
+that registers coordinators and data instances automatically uses
+`--use-ssl` when connecting to coordinator 1.
+
+
 ## Setting up the cluster
 
 Although many configuration options exist, especially for networking, the workflow for creating a Memgraph HA cluster follows these steps:
@@ -830,8 +904,9 @@ prometheus:
     port: 9115
     pullFrequencySeconds: 5
     repository: memgraph/mg-exporter
-    tag: 0.2.1
+    tag: 0.2.3
   serviceMonitor:
+    enabled: true
     kubePrometheusStackReleaseName: kube-prometheus-stack
     interval: 15s
 ```
@@ -840,9 +915,51 @@ If you set `prometheus.enabled` to `false`, resources from
 `charts/memgraph-high-availability/templates/mg-exporter.yaml` will still be
 installed into the `monitoring` namespace.
 
+`prometheus.serviceMonitor.enabled` defaults to `false`; set it to `true` only
+when you have `kube-prometheus-stack` (or another Prometheus Operator) in the
+cluster to consume the `ServiceMonitor` resource.
+
 Refer to the configuration table later in the document for details on all
 parameters.
 
+#### mg-exporter TLS
+
+When any data instance or coordinator has `tls.bolt.enabled: true`, the
+chart automatically configures the mg-exporter to scrape that instance over
+`https://` instead of `http://`. Each instance entry in the exporter config
+also gets `skip_tls_verify` and (optionally) `ca_file` derived from
+`prometheus.memgraphExporter.tls`:
+
+```yaml
+prometheus:
+  memgraphExporter:
+    tls:
+      skipVerify: true
+      caSecretName: ""
+      caSecretKey: ca.crt
+```
+
+- `skipVerify`: when `true` (default), the exporter does not verify the
+  Memgraph server certificate. Convenient for self-signed certs but not
+  suitable for production.
+- `caSecretName`: name of a pre-created Secret holding the CA bundle that
+  signed Memgraph's certificate. When set and `skipVerify` is `false`, the
+  chart mounts the Secret at `/etc/mg-exporter/ssl` and passes
+  `ca_file=/etc/mg-exporter/ssl/` to the exporter.
+- `caSecretKey`: key inside the Secret holding the CA certificate
+  (default `ca.crt`).
+
+Example with strict CA verification:
+
+```yaml
+prometheus:
+  memgraphExporter:
+    tls:
+      skipVerify: false
+      caSecretName: bolt-ca-bundle
+      caSecretKey: ca.crt
+```
+
 ### Uninstall kube-prometheus-stack
 
 ```bash
@@ -947,10 +1064,10 @@ coordinators:
 The chart auto-appends `--bolt-port`, `--management-port`, `--coordinator-port`,
 `--coordinator-id`, `--coordinator-hostname`, `--data-directory`, `--log-level`,
-`--also-log-to-stderr` and `--log-file` from `ports.*` and
-`commonArgs.{data,coordinators}.logging.*`. Setting any of these in
-`data[].args` or `coordinators[].args` causes `helm install` to fail with a
-template error.
+`--also-log-to-stderr`, `--log-file`, `--bolt-cert-file` and `--bolt-key-file`
+from `ports.*`, `commonArgs.{data,coordinators}.logging.*` and the per-instance
+`tls.bolt.*` block. Setting any of these in `data[].args` or
+`coordinators[].args` causes `helm install` to fail with a template error.
 
 Create credentials secret in the namespace where vmagent runs (usually `monitoring`):
@@ -1125,10 +1242,13 @@ and their default values.
 | `prometheus.memgraphExporter.port` | The port on which Memgraph's Prometheus exporter is available. | `9115` |
 | `prometheus.memgraphExporter.pullFrequencySeconds` | How often will Memgraph's Prometheus exporter pull data from Memgraph instances. | `5` |
 | `prometheus.memgraphExporter.repository` | The repository where Memgraph's Prometheus exporter image is available. | `docker.io/memgraph/prometheus-exporter` |
-| `prometheus.memgraphExporter.tag` | The tag of Memgraph's Prometheus exporter image. | `0.2.1` |
+| `prometheus.memgraphExporter.tag` | The tag of Memgraph's Prometheus exporter image. | `0.2.3` |
+| `prometheus.memgraphExporter.tls.skipVerify` | When `true`, mg-exporter does not verify Memgraph's server certificate. Only applied when scraping instances with `tls.bolt.enabled=true`. | `true` |
+| `prometheus.memgraphExporter.tls.caSecretName` | Name of a pre-created Secret containing the CA bundle. When set (and `skipVerify=false`), the chart mounts it at `/etc/mg-exporter/ssl`. | `""` |
+| `prometheus.memgraphExporter.tls.caSecretKey` | Key inside the Secret holding the CA certificate. | `ca.crt` |
 | `prometheus.memgraphExporter.extraVolumes` | Additional volumes mounted on the `mg-exporter` Deployment (e.g. ConfigMaps with custom exporter configs). | `[]` |
 | `prometheus.memgraphExporter.extraVolumeMounts` | Additional volume mounts for the `mg-exporter` container. | `[]` |
-| `prometheus.serviceMonitor.enabled` | If enabled, a `ServiceMonitor` object will be deployed. | `true` |
+| `prometheus.serviceMonitor.enabled` | If enabled, a `ServiceMonitor` object will be deployed. | `false` |
 | `prometheus.serviceMonitor.kubePrometheusStackReleaseName` | The release name under which `kube-prometheus-stack` chart is installed. | `kube-prometheus-stack` |
 | `prometheus.serviceMonitor.interval` | How often will Prometheus pull data from Memgraph's Prometheus exporter. | `15s` |
 | `vmagentRemote.enabled` | Deploy a vmagent Deployment that scrapes mg-exporter and remote-writes to a Prometheus-compatible endpoint. | `false` |
@@ -1204,14 +1324,19 @@ following parameters:
 | `id` | ID of the instance | `0` for data, `1` for coordinators |
 | `internalAccessAnnotations` | Per-instance annotations for the internal ClusterIP Service. | `{}` |
 | `externalAccessAnnotations` | Per-instance annotations for the external access Service, merged with global annotations. | `{}` |
+| `tls.bolt.enabled` | Enable Bolt TLS termination on this instance. The chart auto-appends `--bolt-cert-file` / `--bolt-key-file` and mounts the certificate Secret at `/etc/memgraph/ssl`. | `false` |
+| `tls.bolt.secretName` | Name of a pre-existing Kubernetes Secret holding the Bolt TLS certificate and private key. Required when `tls.bolt.enabled=true`. | `bolt-tls-secret` |
+| `tls.bolt.certSecretPath` | Key inside the Secret holding the TLS certificate. | `tls.crt` |
+| `tls.bolt.keySecretPath` | Key inside the Secret holding the TLS private key. | `tls.key` |
 | `args` | Per-instance Memgraph CLI flags. Append-only; see the note below for flags the chart manages. | `["--storage-snapshot-on-exit=false"]` for data, `[]` for coordinators |
 
 The `args` field accepts any Memgraph CLI flag **except** the following, which
 the chart appends automatically and rejects when set per-instance:
 `--bolt-port`, `--management-port`, `--coordinator-port`, `--coordinator-id`,
 `--coordinator-hostname`, `--data-directory`, `--log-level`,
-`--also-log-to-stderr`, and `--log-file`. Configure those through `ports.*`
-and `commonArgs.{data,coordinators}.logging.*` instead.
+`--also-log-to-stderr`, `--log-file`, `--bolt-cert-file` and `--bolt-key-file`.
+Configure those through `ports.*`, `commonArgs.{data,coordinators}.logging.*`
+and the per-instance `tls.bolt.*` block instead.
 
 For all available database settings, refer to the [configuration settings
 docs](/database-management/configuration).

From 00155f3c2df973aa692da2429a8353b3cc852626 Mon Sep 17 00:00:00 2001
From: Marko Budiselic
Date: Wed, 27 May 2026 12:11:54 +0200
Subject: [PATCH 11/12] Apply the docs merging skill - pass #1

---
 .../advanced-algorithms/available-algorithms/cross_database.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pages/advanced-algorithms/available-algorithms/cross_database.mdx b/pages/advanced-algorithms/available-algorithms/cross_database.mdx
index dd60e8a42..1856d6704 100644
--- a/pages/advanced-algorithms/available-algorithms/cross_database.mdx
+++ b/pages/advanced-algorithms/available-algorithms/cross_database.mdx
@@ -125,7 +125,7 @@ recommended way to read from any Bolt source.
 {

Input:

}

 - `label_or_rel_or_query: str` ➡ A label name (`(:Label)`), a relationship type (`[:REL_TYPE]`), or a plain
-  Cypher query. When a label/relationship shorthand is used, `cross_database.bolt()` synthesises the
+  Cypher query. When a label/relationship shorthand is used, `cross_database.bolt()` synthesizes the
   matching `MATCH … RETURN labels/properties …` query for you.
 - `config: mgp.Map` ➡ Connection parameters. Notable keys: `host` (default `localhost`), `port` (default `7687`),
   `username`, `password`, `database`, `uri_scheme` (default `bolt`).

From 6c7bcf922f7d37370ce73fd973323e67b1169417 Mon Sep 17 00:00:00 2001
From: Marko Budiselic
Date: Wed, 27 May 2026 12:21:19 +0200
Subject: [PATCH 12/12] Add changelog items

---
 pages/release-notes.mdx | 81 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)

diff --git a/pages/release-notes.mdx b/pages/release-notes.mdx
index 11b3b857c..5bacc4b14 100644
--- a/pages/release-notes.mdx
+++ b/pages/release-notes.mdx
@@ -62,6 +62,87 @@ guide.
   `Coordinators don't support snapshots`. Update any tooling that pattern-matched
   on the previous error message.
   [#4066](https://github.com/memgraph/memgraph/pull/4066)
+- `schema.node_type_properties()` and `schema.rel_type_properties()` return
+  additional columns and changed types. `propertyTypes` is now `List`
+  (was untyped), and both procedures include `propertyObservations` and
+  `totalObservations` counts. `rel_type_properties()` now partitions results by
+  endpoint labels (`sourceNodeLabels`, `targetNodeLabels`), so the same edge
+  type connecting different label sets produces separate rows. Clients that
+  destructure results by column index or expect the previous column set must be
+  updated.
+  [#4153](https://github.com/memgraph/memgraph/pull/4153)
+- The JSON metrics endpoint histogram buckets have changed, so p50, p90, and
+  p99 latency values may report slightly different numbers compared to 3.10.
+  `SHOW METRICS INFO` now returns consistent per-database metrics for the
+  currently active database rather than mixing per-database vertex/edge counts
+  with aggregate metrics for everything else. Update any alerting thresholds
+  that rely on exact histogram values.
+  [#3911](https://github.com/memgraph/memgraph/pull/3911)
+- The `cross_database` module (and `migrate` aliases) now uses
+  `graph.start_timestamp` internally instead of `graph.transaction_id`. The
+  `transaction_id` parameter has been removed from the procedure call API. If
+  you wrote custom procedures that relied on the `transaction_id` parameter
+  passed by the `cross_database` module, switch to `graph.start_timestamp`.
+  [#4167](https://github.com/memgraph/memgraph/pull/4167)
+
+{ ✨ New features }
+
+- Added the `cross_database` module, replacing the deprecated `migrate` module.
+  Query other databases (Memgraph, Neo4j, PostgreSQL, MySQL, Oracle, SQL Server,
+  S3, Arrow Flight, DuckDB, ServiceNow) directly from Cypher and stream rows
+  into your graph. The old `migrate.*` procedure names continue to work as
+  aliases.
+  [#3832](https://github.com/memgraph/memgraph/pull/3832)
+- Added native OpenMetrics/Prometheus metrics export with per-database
+  granularity. Set `--metrics-format=OpenMetrics` to serve metrics at `/metrics`
+  in Prometheus scrape format. Each database gets its own query counters,
+  latency histograms, and storage gauges labeled by database name. The JSON
+  format remains the default but is deprecated.
+  [#3911](https://github.com/memgraph/memgraph/pull/3911)
+- Vector indexes now support multi-label and wildcard filters. Create indexes on
+  any combination of labels with `:L1|L2(prop)` (match any), `:L1&L2(prop)`
+  (match all), or `(prop)` (wildcard: every entity with the property). Applies
+  to both vertex and edge vector indexes.
+  [#3981](https://github.com/memgraph/memgraph/pull/3981)
+- `schema.node_type_properties()` and `schema.rel_type_properties()` now return
+  richer output including `propertyObservations` and `totalObservations` counts,
+  and `propertyTypes` is a proper `List`. `rel_type_properties()`
+  additionally partitions results by endpoint labels, so different label-set
+  combinations for the same edge type are reported separately.
+  [#4153](https://github.com/memgraph/memgraph/pull/4153)
+- Added intra-cluster TLS for high availability deployments. New flags
+  `--cluster-cert-file`, `--cluster-key-file`, and `--cluster-ca-file` secure
+  communication on the management, replication, and coordinator servers. Bolt
+  TLS remains independently configured.
+  [#4140](https://github.com/memgraph/memgraph/pull/4140)
+- `SHOW TRANSACTIONS` now includes `start_time` (UTC timestamp) and
+  `elapsed_ms` columns, making it easier to identify long-running transactions
+  without calculating elapsed time manually.
+  [#4136](https://github.com/memgraph/memgraph/pull/4136)
+- The OEM license tier is now split into `OEM` (enterprise-equivalent) and
+  `OEM_COMMUNITY` (community-tier tracking). Existing OEM license keys continue
+  to work and map to the enterprise-equivalent tier on upgrade.
+  [#4158](https://github.com/memgraph/memgraph/pull/4158)
+- Snapshot creation now logs per-phase progress (edges, vertices, each index
+  type, constraints, finalize), the trigger reason, thread count, and final
+  statistics. Stalled snapshots can now be diagnosed from the log without
+  guessing which phase they are stuck in.
+  [#4164](https://github.com/memgraph/memgraph/pull/4164)
+- The JSON metrics endpoint now has a formally defined contract validated in CI,
+  ensuring backward compatibility of metric names and structure across releases.
+  [#4143](https://github.com/memgraph/memgraph/pull/4143)
+
+{ 🐞 Bug fixes }
+
+- Fixed a bug where `MERGE` followed by `CREATE` in the same query could
+  produce unbounded node creation and out-of-memory errors. The underlying scan
+  iterator now correctly scopes itself to nodes that existed when the scan
+  started.
+  [#4080](https://github.com/memgraph/memgraph/pull/4080)
+- Fixed `cross_database` (and `migrate`) procedures failing with a `KeyError`
+  when used with `USING PERIODIC COMMIT`. The procedures now use a stable
+  per-query identifier that does not rotate between batches.
+  [#4167](https://github.com/memgraph/memgraph/pull/4167)

 ## Previous releases